
This article was originally published by informationweek.com and can be viewed in full here
Microsoft updated its R statistical modeling language product lineup, Yahoo released a massive machine learning data set to the academic community, Baidu released some of its machine learning developments around speech recognition to open source, and IBM acquired a real-time fraud detection and analytics company. We’ve got those stories and more in this week’s big data roundup.
Let’s start with Microsoft. It’s been a year since the company put a big stake in the ground by acquiring Revolution Analytics, a distributor of the open source R statistical modeling language. Back then, the move was viewed as a way for Microsoft to supplement its growing big data and analytics toolbox as well as to show that it understands the importance of open source. This week, the company announced the rebranding of its R servers and development tools under the Microsoft name, while continuing its commitment to offering many of those tools for free to the development community.
Meanwhile, another tech company showed that it cares about the development community, too. Yahoo released a massive machine learning data set to the academic research community. This data set includes the surfing and search habits of 20 million anonymous users.
The data set is intended for researchers working on context-aware learning, large-scale learning algorithms, user behavior modeling, and content enrichment. Yahoo said the information includes data about how users interacted with the Yahoo home page, Yahoo News, Yahoo Sports, Yahoo Finance, Yahoo Movies and Yahoo Real Estate. The data set is available as part of the Yahoo Labs Webscope data-sharing program, a reference library of datasets composed of anonymous user data for non-commercial use.
The research arm of Baidu, which has sometimes been described as the Google of China, has released some of its machine learning software, called Warp-CTC, under an open source Apache license and posted it on GitHub. Warp-CTC builds on previous algorithms and was developed as Baidu worked on its Deep Speech recognition system, which has been shown to work for English and Mandarin. The company said in a FAQ that it is releasing the development to open source because “We want to make end-to-end deep learning easier and faster so researchers can make more rapid progress. … We want to start contributing to the machine learning community by sharing an important piece of code that we created.” Baidu said that it expects to release additional open source AI tools in the future.
IBM announced Jan. 15 that it has acquired IRIS Analytics, a privately held company specializing in real-time analytics for combatting payment fraud. IRIS Analytics is focused on the problem of detecting fraud as it is attempted instead of after it has happened. IRIS provides a real-time fraud analytics engine that leverages machine learning to generate rapid anti-fraud models while supporting the creation and modification of ad-hoc models, IBM said. Financial terms of the deal were not disclosed.
Databricks, the company whose founders developed the widely popular big data platform Apache Spark, has announced a series of top management changes. Ion Stoica is leaving his job as CEO and will assume the role of executive chairman. Current VP of engineering and product Ali Ghodsi has been named as CEO. Patrick Wendell will move into the role of VP of engineering and Ron Gabrisko has joined the company as SVP of worldwide sales.
Databricks sells and services an implementation of Apache Spark, and these executive moves reflect the two-year-old company’s efforts to get serious about the commercial market and enterprise customers. “As the creators and drivers of the Spark engine, Databricks is at an inflection point where the pace of innovation coming from the community positions us for tremendous growth and opportunity in 2016,” Stoica said in a prepared statement. “Ali [Ghodsi] is positioned to enable both Databricks and Spark to seek widespread enterprise adoption, momentum, and customer acquisition.”
Data platform analytics company Looker this week announced it has closed a $48 million Series C funding round led by Kleiner Perkins Caufield & Byers, with participation from previous investors. The company said it will use the new capital to accelerate its growth through investments in sales, marketing, engineering, and international expansion.
Lastly, digital crowd-sourced encyclopedia Wikipedia is marking its 15th anniversary. To help celebrate the occasion, the folks over at FiveThirtyEight.com have collected the three most edited entries for each year since Wikipedia launched in 2001, which you can see in the article. Spoilers: many of the highly edited entries are related to big news events for each year, particularly if those events were in any way controversial. For instance, in 2008 the most edited entry was for then-US vice presidential candidate Sarah Palin. Wikipedians are also obsessed with tracking deaths, major weather events and systems, popular culture, politics and “the esoteric and arcane.”

