Covering Disruptive Technology Powering Business in The Digital Age

Professor Nicolas Spyratos talks to Big Community


Nicolas Spyratos, Professor Emeritus of Computer Science at the University of Paris-South, France, member of the Learning and Optimisation Group at the Laboratory for Research in Informatics (LRI), and Affiliated Scientist at the Institute of Computer Science (FORTH-ICS), Greece, took some time off his busy schedule while he was in Malaysia to talk to Big Community and answer a few in-depth questions about Big Data Analytics.

We started the discussion by asking whether any technology has improved on what Hadoop first started. He says that although Hadoop is much more popular, Spark is able to perform certain tasks better. “Tasks that are repetitive are easily performed in Spark, whereas not so easily if you were using Hadoop.” (Spark can keep intermediate results in memory across repeated passes over the data, whereas Hadoop MapReduce writes them to disk between jobs.)

When asked about the relevance of structured databases based on SQL, he said that SQL will remain a very relevant part of Big Data because of its history: people are used to asking questions in SQL, and it is highly unlikely that that will change.
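As a small illustration of what “asking questions in SQL” looks like in practice, here is a self-contained sketch using Python’s built-in sqlite3 module; the table and the readings in it are hypothetical, not from the interview:

```python
import sqlite3

# In-memory database with a hypothetical table of beach sensor readings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (beach TEXT, wind_speed REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("North Beach", 12.5), ("North Beach", 14.0), ("South Beach", 8.2)],
)

# The question, phrased in SQL: what is the average wind speed per beach?
rows = conn.execute(
    "SELECT beach, AVG(wind_speed) FROM readings GROUP BY beach ORDER BY beach"
).fetchall()
print(rows)  # [('North Beach', 13.25), ('South Beach', 8.2)]
```

The same declarative question-asking style carries over to Big Data SQL engines such as Hive or Spark SQL, which is part of why SQL has remained so durable.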

With regards to ETL (Extract, Transform, Load) technology, Prof. Spyratos says this is a very important two-part component of Big Data: one part is the collection of the data, and the other is the discarding of data that isn’t useful. Without quality data, all the information collected will have little value.


“When we look at the quality of data, we look at selecting data that can be used, and then clearing data that can’t be used”, he said.

So what would constitute ‘Good Data’?

That question can only be answered by the domain experts, he quips. The domain experts are the ones who will look at the whole process of data collection, from how the data is collected to its final destruction, and decide on the best procedures to use when looking for the right types of data.

For instance, if data were needed to determine whether there has been soil erosion on beaches in areas where dredging is being conducted, then an expert in the field of environmental science would need to decide what type of data should be collected. Will it be wind speeds? The tides? Water temperature? The expert will then need to use sensing equipment to collect that data, and from that data decide what will be used for his or her findings.
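The selection-and-cleaning step he describes can be sketched in a few lines of Python; the field names, readings, and validity rule here are hypothetical examples, not part of any real system:

```python
# Hypothetical raw sensor readings: some are unusable (missing values,
# physically impossible numbers) and must be thrown out before analysis.
raw_readings = [
    {"sensor": "wind", "value": 12.5},
    {"sensor": "tide", "value": None},   # missing measurement
    {"sensor": "wind", "value": -3.0},   # impossible negative speed
    {"sensor": "tide", "value": 1.8},
]

def is_usable(reading):
    """Keep a reading only if it has a non-negative numeric value."""
    value = reading["value"]
    return value is not None and value >= 0

# Select the data that can be used; everything else is discarded.
clean_readings = [r for r in raw_readings if is_usable(r)]
print(clean_readings)
# [{'sensor': 'wind', 'value': 12.5}, {'sensor': 'tide', 'value': 1.8}]
```

In a real pipeline the validity rule would come from the domain expert, exactly as Prof. Spyratos suggests, rather than being hard-coded by the engineer.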

Understanding this process is a key ingredient in making Big Data Analytics a powerful ally in our everyday lives. Once the right data has been obtained, processing it becomes easy. Insights from that data can help people and governments find solutions to problems like soil erosion and other environmental issues.

He tells Big Community that before today’s tools for processing data were available, data such as that needed to study soil erosion would take years to tabulate before any insights could be gained. Hundreds of hours of manual work would be spent over those years collecting millions of data points. Today that process can be done in mere seconds, and the results can be combined with years of historical data and projected into the future to make predictions.

He adds that he is very excited to see how Big Data Analytics will shape the world in the next 5 to 10 years.

Prof. Spyratos had just spoken at the Data Matters Series talk ‘Big Data Analytics with R’, which was promoted by the Malaysian R User Group (MyRUG) and the Malaysian Digital Economy Corporation (MDEC). He was also scheduled to speak at a forum on ‘Big Data Science and Applications’ at the International University of Malaya-Wales the following day, which is where we caught up with him.
