
Platforms used for big data are a bit of a conundrum. Big data and data science are two of the biggest business buzzwords, and the biggest companies around the world are hard at work to get ahead of the data curve. Normally, when it comes to big money opportunities, the resources behind them would be expected to carry a heavy price tag. Big data, however, has its roots and future in open source technologies. Companies big and small are sharing what they know, and that’s the way it’s going to stay.
FASTER DEVELOPMENT
The biggest names in data are open source. Many of them are even part of the same Apache family: Spark, Hadoop, Kafka, and Cassandra. RapidMiner and Orange are there for data mining, and open source databases are chipping away at Oracle. Though closed source databases are still incredibly popular, open source alternatives are growing rapidly. If that growth continues, closed source databases may not dominate for much longer. Solid IT co-founder Matthias Gelbmann describes several database management systems in one blog post, noting that, “we often see, that once Redis is installed for caching, and people experience its speed and reliability, they start moving more and more functionality there.” Redis, an open source database management system, has continued to grow even though the company behind it has, in its own words, small business resources and no “intentional” marketing.
There are several reasons for the growth of these open source systems, one of which is the way they allow people in different areas to work together effectively. When companies share their work and allow others to contribute, outside eyes find new holes and new possibilities. Deep learning technology owes a lot to big players like Google and Facebook, who actively give their data and resources back to the community. Technology appears to develop very quickly, but it is not an instantaneous process. If companies were to attempt to tackle big data software on their own, with no input or help from open-source software, it would be a painfully slow process. There is a serious need to keep up with the times, and big data is a rapidly growing field. One study from McKinsey highlights the shortage of talent in the data science field at length, noting that data science jobs in the United States will exceed 490,000 by 2018, but there will be fewer than 200,000 scientists to fill those jobs. This affects not only the smaller businesses looking to keep up with the times, but also major investors that could change the course of business at large. Companies are looking to rapidly expand their data science departments and their use of data, but the talent pool and technology are not yet there. Open sourcing that data and technology at least eases the burden and allows companies to move forward at an even pace.
Community also means that users have the chance to ask questions and get helpful answers. Instead of going into a tailspin when a problem arises, a user will likely find several others in the community who have the answer, or, more likely, know how to find it. Creative open-source users also tend to look for ways to work economically and save money. They are likely to find or tweak inexpensive hardware, whereas a major software company with a monopoly may push users to buy very specific and expensive gear.
SMALLER BUSINESSES
Once companies move to put their data to use, they often find themselves in a “data lake.” Without the proper resources or, for smaller companies, the funds to harness them, that data is useless. If a small company had to pay for every bit of software (and education) required to use data, there would be much less incentive to try to integrate big data into the workplace. Open source, however, has a “try before you buy” mentality. For companies like Talend, which offers products based on open source software, potential customers are often already familiar and comfortable with the open-source aspects of the products. Those who stick to open-source software can also experiment before buying into the entire big data scheme. New users can take a chance on data with little risk. Experts can move between different solutions with relative ease.
Talend’s CEO, Mike Tuchen, even told InfoWorld that “the entire next-generation data platform will be open source,” which means gains for open-source companies and those who build upon them. “It’s the new normal,” he says. Even education in the area supports the “open-source” community by very often remaining free. While university degrees will certainly prove useful, many businesses and programmers are simply looking for further education on big data topics to add to their arsenal. Free online courses in data are abundant, and programs from Udacity, Big Data University, and others are trying to fill the gap between data science wannabes and users. Even Google held a free course on how to use data.
PROOF OF FUNCTION
The incredible growth experienced by open source programs is the real proof that they are the future of data. Companies powered in part by Hadoop include Amazon, Facebook, and even IBM. The companies that are making great strides with data are also the ones pushing open source. This not only proves the effectiveness of open source tools, but also shows where investment in data is headed and where companies are placing their eggs. Further proof comes from none other than Russia, where companies are changing their minds about open source. Whereas data scientists once shunned full-scale use of open source big data technology, they are now turning in the other direction. According to Computer Weekly, smaller companies are now turning to open source solutions, and larger companies, including Russia’s homegrown search company Yandex, are paying close attention to Hadoop solutions and developments to make sure they don’t get left behind.
The past, present, and future of big data are strongly rooted in open source tech, and that will be one of its greatest strengths. With the shortage of data scientists and skilled workers, it will be paramount that companies and individuals have easy access to powerful and up-to-date solutions without fear of paying every last penny to stay in the game. As companies like Google and Facebook continue to share their knowledge, the future of data will only get better and more powerful.
This article was originally published on www.dataconomy.com and can be viewed in full

