Nicolas Spyratos, Professor Emeritus of Computer Science at the University of Paris-South, France (Learning and Optimisation Group, Laboratory for Research in Informatics, LRI) and Affiliated Scientist at the Institute of Computer Science (FORTH-ICS), Greece, took time out of his busy schedule while he was in Malaysia to talk to Big Community and answer a few in-depth questions about Big Data Analytics.
We started by asking whether any technology has improved on what Hadoop first started. He says that although Hadoop is much more popular, Spark is able to perform tasks better than Hadoop. “Tasks that are repetitive are easily performed in Spark, whereas not so easily if you were using Hadoop.”
When asked about the relevance of structured, SQL-based databases, he said that SQL will remain very relevant to Big Data because of its history. People are used to asking questions in SQL, and that is highly unlikely to change.
On ETL (Extract, Transform, Load) technology, Prof says it is a very important two-part component of Big Data. Without quality data, all the information collected will have little value. The two parts are collecting the data and throwing out the data that isn’t useful.
“When we look at the quality of data, we look at selecting data that can be used, and then clearing data that can’t be used,” he said.
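The two-part idea he describes, selecting usable data and clearing out the rest, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not something Prof Spyratos specified: the readings, the `wind_speed_kmh` field name and the plausible range are all hypothetical.

```python
# Hypothetical sensor readings; the field names and values are illustrative only.
readings = [
    {"site": "A", "wind_speed_kmh": 12.5},
    {"site": "B", "wind_speed_kmh": None},   # missing measurement
    {"site": "C", "wind_speed_kmh": -3},     # physically impossible value
    {"site": "D", "wind_speed_kmh": 30},
]


def is_usable(reading):
    """Return True if the reading has a numeric wind speed in an assumed plausible range."""
    speed = reading.get("wind_speed_kmh")
    if isinstance(speed, bool) or not isinstance(speed, (int, float)):
        return False
    return 0 <= speed <= 400  # assumed bounds; a domain expert would set the real ones


def split_by_quality(items):
    """Part one: keep the usable readings. Part two: set aside the ones to clear."""
    kept, discarded = [], []
    for item in items:
        (kept if is_usable(item) else discarded).append(item)
    return kept, discarded


kept, discarded = split_by_quality(readings)
print("kept:", [r["site"] for r in kept])
print("discarded:", [r["site"] for r in discarded])
```

In practice the rule inside `is_usable` is exactly the part that would come from a domain expert rather than from the tooling.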
So what would constitute ‘Good Data’?
That question can only be answered by the domain experts, he quips. The domain experts are the ones who will look at the whole life cycle of the data, from how it is collected to its final destruction, and decide which procedures are best for finding the right types of data.
For instance, if data were needed to determine whether there has been soil erosion on beaches in areas where dredging is being conducted, then an expert in environmental science would need to decide what type of data to collect. Will it be wind speeds? The tides? Water temperature? The expert would then need to use sensors to collect that data, and from it decide what will be used for their findings.
Understanding this process is a key ingredient in making Big Data Analytics a powerful ally in our everyday lives. Once the right data has been obtained, processing that data becomes easy. Insights from that data can help people and governments find solutions to problems like soil erosion and other environmental issues.
He tells Big Community that before today’s tools were available, data such as that needed to study soil erosion would take years to tabulate before any insights could be gained. Hundreds of man-hours would be spent over those years collecting millions of data points. Today that processing can be done in mere seconds, and data from years past can be projected into the future to make predictions.
He adds that he is very excited to see how Big Data Analytics will shape the world in the next 5 to 10 years.
Prof had just spoken at the Data Matters Series event ‘Big Data Analytics with R’, which was promoted by the Malaysian R User Group (MyRUG) and the Malaysia Digital Economy Corporation (MDEC). He was also scheduled to speak at a forum on ‘Big Data Science And Applications’ at the International University of Malaya-Wales the following day, where we caught up with him.