Covering Disruptive Technology Powering Business in The Digital Age

IMS Health Big Data Factory Gets Support From Cloudera
August 3, 2016 News

IMS Health recently announced that it has selected Cloudera Enterprise to support its Big Data Factory, a cloud-based platform that transforms data into intelligence for life sciences and healthcare clients as they navigate dynamic markets, drive operational excellence and demonstrate the value of medicines. The collaboration with Cloudera – a leader in enterprise analytics data management powered by Apache Hadoop and the latest open source technologies – will accelerate and enhance data acquisition, processing and warehousing in the factory.

“As IMS Health continues to innovate with Big Data, Cloudera will advance our efforts to build and provision breakthrough data processing and data warehousing solutions,” said Karl Guenault, chief information officer and vice president of Operations, IMS Health. “We’re excited to apply Hadoop and other open source technologies for integrating, processing and interpreting more extensive and granular data – helping our customers drive healthcare performance.”

 

The combined solution will support:

  • Next-generation data processing. Cloudera’s latest open source technology platforms support machine learning – which uses cognitive systems to automate quality control, data bridging and modeling, and other repetitive tasks – to drive evidence-based decisions for clients. Using the Hadoop cluster technology, IMS Health can extract, store and analyze vast quantities of unstructured data for generating new, more comprehensive insights for customers.

  • Accelerated time to insight. The IMS Health Big Data Factory will harness Hadoop’s parallel processing technology to streamline data acquisition and management, delivering powerful insights with higher efficiency.

  • Scalable data management. The Hadoop distributed file system (HDFS) eliminates data transfer bottlenecks, storage capacity delays and time spent adding new servers to augment traditional data clusters – enabling IMS Health to improve the speed and scalability of Big Data management.

 

“The insight that IMS Health will gain through the platform can ultimately be measured in better patient outcomes, and Cloudera is thrilled to work with them on this endeavor,” said Shawn Dolley, Cloudera Industry Leader for Health and Life Sciences. “As the leading provider of open source Hadoop solutions, Cloudera and IMS Health are uniquely positioned to support the growing needs of healthcare organizations looking to optimize their data utilization and provide better care.”

Big Community interviewed Daniel Ng, Senior Director, APAC, Cloudera, about his views on the project. Here are the questions we posed to him.

If patients’ data is required for the collaboration between Cloudera and IMS, how will doctor-patient confidentiality be affected, and what steps will be taken to mitigate other privacy issues?

Healthcare data arrives at IMS in de-identified and anonymous formats. While Cloudera professional services have helped, trained and supported IMS, it is IMS’ trained information technology staff whose vision, skills and expertise with healthcare data and leading-edge technologies have made their next-generation data warehouse a reality. As a result, Cloudera is not exposed to the anonymous and de-identified data, and doctor-patient confidentiality issues are made moot by the data providers (to IMS), even before IMS is involved.

 

What specific insights are you looking to retrieve from the data collected?

IMS Health can generate hundreds of insights in minutes through use of Cloudera, something that used to take days. These include building custom cohorts of patients, where authors can select previous and current medications from electronic health record values, previous procedures, patient characteristics, genotypes and many more. Insights may relate to drug interactions, non-compliance, adverse events, efficacy of drugs and causal relationships in outcomes, all at a scale of up to millions of patients drawn from a database of hundreds of millions of patients.
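
As a rough illustration of the cohort selection described in this answer, the sketch below filters de-identified records by current medication, previous medication and a patient characteristic. It is plain Python, not IMS Health's or Cloudera's implementation. The record fields, the `build_cohort` helper and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientRecord:
    """Hypothetical de-identified record; field names are assumptions."""
    patient_key: str          # anonymous key, not a real identifier
    current_meds: frozenset
    previous_meds: frozenset
    age: int


def build_cohort(records, current_med, previous_med, min_age):
    """Return records matching all three selection criteria."""
    return [
        r for r in records
        if current_med in r.current_meds
        and previous_med in r.previous_meds
        and r.age >= min_age
    ]


# Invented sample data for illustration only.
records = [
    PatientRecord("k1", frozenset({"drug_a"}), frozenset({"drug_b"}), 61),
    PatientRecord("k2", frozenset({"drug_a"}), frozenset(), 70),
    PatientRecord("k3", frozenset({"drug_a"}), frozenset({"drug_b"}), 34),
]

cohort = build_cohort(records, "drug_a", "drug_b", min_age=50)
cohort_keys = [r.patient_key for r in cohort]
```

At the scale mentioned in the answer, the same predicate would run as a distributed query over the cluster rather than as an in-memory list comprehension.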

 

As the medical fraternity holds data in multiple silos, and retrieval and processing have been a major issue in the past, will Cloudera be looking to address this issue, and how?

Industry practitioners are beginning to believe in merging these silos into a large data lake. This makes sense, assuming the lake can be secured, loaded without enforcing any data model, and then accessed in virtually limitless ways. This ‘schema-on-read’ approach, which does not require advance understanding of the data and analytic requirements of physicians, nurses, health economists, researchers or other users, is the key to quickly driving data from sources into a hub. At IMS, reducing the number of custom data warehouses or marts built for each end user, or question, saves days, and the statistical calculations within a silo often take longer to run than a massively parallel analysis on a data lake (especially when high-scale questions are involved). Cloudera solves this by supporting the most open source, open standard Apache projects, certifying the most analytic data access tools available on the market, innovating in Search and Spark, developing and acquiring security technology, and thus delivering open source support to more healthcare companies than any other big data company in the world.
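
The ‘schema-on-read’ idea in this answer can be sketched in a few lines of plain Python: raw records are stored as they arrive, and each reader applies its own schema only at query time. This is a minimal illustration, not Cloudera's implementation. The record layout, the `read_with_schema` helper and the two example schemas are assumptions.

```python
import json

# Hypothetical raw records, stored as-is with no data model enforced on load.
RAW_LINES = [
    '{"patient_id": "p1", "drug": "A", "dose_mg": "50"}',
    '{"patient_id": "p2", "drug": "B", "visit": {"type": "ER"}}',
    '{"patient_id": "p3", "dose_mg": 20, "drug": "A"}',
]


def read_with_schema(lines, schema):
    """Apply a reader-defined schema (field name -> converter) at read time."""
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, convert in schema.items():
            value = record.get(field)
            # Fields missing from a raw record become None instead of failing.
            row[field] = convert(value) if value is not None else None
        yield row


# Two readers with different questions use different schemas on the same data.
dosing_schema = {"patient_id": str, "drug": str, "dose_mg": float}
dosing_rows = list(read_with_schema(RAW_LINES, dosing_schema))

visit_schema = {"patient_id": str, "visit": dict}
visit_rows = list(read_with_schema(RAW_LINES, visit_schema))
```

Neither reader had to agree on a model before the data was loaded, which is the property the answer credits with moving data from sources into a hub quickly.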
