Covering Disruptive Technology Powering Business in The Digital Age

Livy To Make Spark Consumption Easy
June 29, 2016 News

Microsoft, together with Cloudera, recently announced an open source project to reduce the burden on developers working with Spark. The project, Livy, runs as a REST service for submitting, running, and managing Spark jobs and contexts.
Daniel Ng, Senior Director at Cloudera, said in an interview with DSA that Livy will enhance the developer experience for almost any data science use case. “By being able to submit code and jobs remotely, developers will have a greater ease of putting their code into production. More specifically, use cases like advanced analytics, machine learning, data wrangling/munging, and ad-hoc data exploration will benefit from having an easy-to-access and highly available REST API-based interface.”
Does Livy have the potential to replace Data Scientists and Data Engineers altogether?
Not entirely. Data scientists are still needed to build, test, and help deploy models. Livy allows data scientists and data engineers to focus on delivering an API endpoint for application developers as opposed to being involved in every application and initiative.
The obvious gap in tools and experience for developers is what drove the two companies to start this initiative. They realized there was a major issue with how engineers had to pull models down to re-execute code, which meant taking resources offline. In addition, when data scientists built their models, they often had to embed the distributed systems logic into the app. This is where Livy comes in: data scientists can now simply point their app developers to an API, which is easier for the developer to integrate.
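To make the "point developers to an API" idea concrete, here is a minimal sketch of what a client application might send to a Livy server. It assumes the session and statement endpoints described in the Livy REST documentation (`POST /sessions`, `POST /sessions/{id}/statements`, `GET /sessions/{id}/statements/{id}`) and a server on `localhost:8998`; since the project is in alpha, these details may change. The helper names (`build_request`, `create_session_request`, and so on) are hypothetical and only illustrate the flow.

```python
import json
from urllib import request

# Assumption: a Livy server reachable locally on port 8998.
LIVY_URL = "http://localhost:8998"


def build_request(path, payload=None):
    """Build (but do not send) a JSON request; POST if a payload is given, else GET."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return request.Request(
        LIVY_URL + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST" if payload is not None else "GET",
    )


def create_session_request(kind="pyspark"):
    """Ask Livy to start an interactive Spark session of the given kind."""
    return build_request("/sessions", {"kind": kind})


def run_statement_request(session_id, code):
    """Submit a snippet of code to run inside an existing session."""
    return build_request(f"/sessions/{session_id}/statements", {"code": code})


def statement_result_request(session_id, statement_id):
    """Poll for the state and output of a previously submitted statement."""
    return build_request(f"/sessions/{session_id}/statements/{statement_id}")


def send(req):
    """Send a prepared request and decode the JSON response (needs a live server)."""
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a server running, an application would call `send(create_session_request())`, read the session `id` from the response, submit code with `run_statement_request`, and poll `statement_result_request` until the statement finishes. The application itself never needs Spark libraries or cluster logic, only HTTP and JSON.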
“Livy is targeted more for internal apps, such as operational applications, for example, when a developer has built a great way to analyze stock information and wants to embed that logic in an end client app or an app built for a sector of the business,” he said, adding that Livy is currently in alpha development and is available for free from its GitHub page and website (Livy.io).
Senior product manager Anand Iyer has high hopes for Livy's use by developers. “Spark gives you fast big data processing with a general purpose flexible API. We see a natural tendency among our customers and partners to want to leverage Spark’s capabilities from client applications that can easily interface with Spark, and Livy makes that possible. Livy will open Spark to new use cases, and we are hoping it attracts a community of developers that will not only build applications on top of Livy, but also contribute to it, help shape its API and enhance its functionality. It is still a very nascent project, and hence any contribution will have tremendous impact.”
Asked whether Livy will be usable with or relevant to Hadoop, Ng says yes. “Cloudera has made the claim through our One Platform Initiative that Spark will replace MapReduce as the de facto data processing technology. Livy is the key to how data scientists and engineers can leverage and interface with Spark in a secure and easy way. Livy is still in its early alpha stage, but it promises to simplify application architecture while providing a service that allows users to use Spark as an application back-end.”

