During Big Community’s stint in Singapore at the Strata-Hadoop Conference, we managed to squeeze some time out of the busy schedules of Doug Cutting and Mike Olson to get their views on the future of Big Data in the region.
We spoke to them about the latest technologies that could improve upon Hadoop, and about what lies beyond it.
Doug explains that the name Hadoop is used in two senses. “There are two senses in this case. The most precise sense is the Hadoop Apache open source project, and the more general sense, which people oftentimes mean, is all the projects that have been built upon Hadoop. In the more precise sense, the technologies in Apache Hadoop are already being replaced. We’ve got a storage system, HDFS, we’ve got a scheduler, YARN, and we’ve got an execution engine in MapReduce”.
Competitors already exist for each of these, and in some cases they are beating the original technologies. Spark, for instance, is a better solution than MapReduce in all but a few cases where MapReduce still excels. Newer storage systems such as Apache Kudu are better than HDFS in most but not all cases, so a general-purpose file system like HDFS can still come in useful at times.
“With YARN we see many competitors that aren’t an exact replacement but I think it’s a good thing, because it shows the success of this model of having a loosely connected set of open source projects that are independently run. It also makes Cloudera’s role clear as a curator of this ecosystem of open source projects, while not being tied to any one project specifically”, he added.
Mike Olson went on to say that the early design of the Hadoop project had three separate components. “It was modular from the beginning. When Cloudera started, we started shipping MapReduce and HDFS as the core project. Today we bundle 27 projects that include Apache Spark, Impala and many other ecosystem projects”.
Hadoop has expanded dramatically since its early days. Although some of those projects are not used as often as others, the ecosystem as a whole has become robust. The pace at which new projects have been created, and the range of problems they address, has been nothing short of impressive.
On machine learning and artificial intelligence, Doug says they do have a big impact, but that for the moment much of the hype is the press’s doing.
“We find a majority of the customers are finding value from simpler technologies. From just being able to count things which they couldn’t count previously, the customer already finds that incredibly valuable. Also figuring out what needs to be counted and what can’t be counted”.
Mike Olson took the opposite approach, looking at the effect Big Data has had on machine learning rather than the effect machine learning has had on Big Data.
“Finally we can collect these very large data sets and we can train the machine learning systems. We can build much more accurate models than we could have before”, he says.
This matters especially in financial services, where banks and credit card companies need to pay attention to fraud and monitor for money laundering. Using machine learning techniques to go over historical data and recognise fraudulent events in the timeline is crucial to locating fraudulent behaviour in the system. These companies also need to recognise such activity as it happens, in real time.
“Same is true in cyber security”, Mike added. “When the bad guys try to break into your network, it helps to know what your network looks like normally. Training your models on historical data lets you recognise different behaviour in the future”.
On the dangers of new technology falling into the wrong hands, Mike views any technology as a double-edged sword, and Big Data is no different. Ethical practices within the legal framework should be adhered to in order to protect people’s interests.
“There’s been a lot of great work done on data privacy”, Mike adds. “Around the world, we make sure that privacy is protected with strong encryption and good access controls; noticing log-ins, who touches the data and what they do with it. As with any technology, the companies that build these systems need to look at the law and ethical considerations. This isn’t a new problem. It’s really a new system running into an old problem.”
His experience with companies he deals with is that they work hard to be responsible and ethical. He believes everyone recognises the fact that public attention is looking at this very closely. No one wants bad press in these times of discovery so he believes that companies are taking the right approach in setting good standards to be adhered to.
Big Data is being used to understand consumer behaviour and to drive better engagement and real-time interaction with customers. It has also delivered real results for many corporations throughout the region, as many keynote speakers from the banking and service industries attested.
“The partner ecosystems, the solutions providers as well as the applications run on the platform are growing quite quickly in this region. We’re seeing big banks and big insurance companies, the big data consumers, adopt the technology much more than a few years ago”, Mike shares.
Doug agrees wholeheartedly, saying that the technology has really taken off in the last two years and that it’s showing in how phenomenally their business is growing in the region.