
NOTE FROM BIG COMMUNITY SENIOR JOURNALIST
The big question about the big ‘C’: Can it be predicted, and can it be cured?
The article succinctly discusses both the pros and cons of what Big Data analytics can do for the health industry as a whole.
The article begins with the moment data analytics first made its mark, in the 17th century through John Graunt, and runs right up to today, when Big Data crunches millions of data entries such as queries, logs and postings to find correlations and improve the detection of possible cases. In the study it describes, a model predicted between 5 and 15 percent of eventual diagnoses from earlier search activity.
Yet Google’s flu prediction effort, while well meaning, missed the mark by 140 percent. That’s not a small amount on any scale. However, that hasn’t derailed efforts to find a solution. Although Big Data is a big step, it’s not the only step.
“Big data” is a very 21st-century kind of buzzword, which ambiguously invokes the idea of using large sets of data to draw computer-assisted conclusions about trends, patterns and correlations, often about people and their behavior.
But if you wanted to trace the origin of using big data for health research, you’d have to go back — way back, to 17th-century England.
There, you’ll find a haberdasher by the name of John Graunt, who undertook a peculiar project. He began to study so-called bills of mortality, death records kept during the plague-riddled times, and compiled death details into tables, noting age, gender, cause, location and time.
This vital statistics research later turned into a 1662 tome. It marked a seminal moment in demography, the statistical study of populations, but also in epidemiology, the study of what causes diseases and how they spread among different groups of people.
“It was totally groundbreaking for its time. It was a much larger scale of looking at trends in disease than anyone had looked at previously,” says Stephen Mooney, an epidemiologist at Columbia University’s Mailman School of Public Health.
“At some point you have to think about what it means to put together a table and look at patterns in year-over-year,” he says. “For the time, that was big data.”
Of course, the groundbreaking big data of today is a far cry from hand-crafted tables. It allows researchers to use super-fast computers to query billions of digital records we leave in our wake on social media, on our wearable devices, in our search history — our “digital exhaust,” as Boston Children’s Hospital Chief Innovation Officer John Brownstein puts it.
And isn’t that a good thing?
The promise of big data for modern health is much extolled. This week came the latest feat. Scientists at Microsoft published a study showing that Web search queries (on Microsoft’s Bing search engine) may hold clues to a future diagnosis of pancreatic cancer, one of the fastest-progressing and most fatal cancers.
In essence what Microsoft researchers did was this: They studied millions of anonymized searches on Microsoft’s Bing to find queries suggestive of a user’s recent diagnosis, such as “Why did I get cancer in pancreas” or “Just diagnosed with pancreatic cancer.” They then backtracked the digital footprints left by the same computer to locate searches for earlier symptoms of the disease, and to create a statistical model that they say could predict 5 percent to 15 percent of the ultimate diagnoses based on earlier search activity, with pretty low false positives.
“My take is that it’s exciting but preliminary,” says Mooney, who has studied the use of big data in public health. “The potential benefit is huge,” he says, but “it would be easy to naively assume we know more about this than we do.” It’s one thing to detect early digital clues to a diagnosis, but another to actually prevent or delay a death.
The Microsoft scientists themselves acknowledge this in the study. “Clinical trials are necessary to understand whether our learned model has practical utility, including in combination with other screening methods,” they write.
Therein lies the crux of this big data future: It’s a logical progression for the modern hyper-connected world, but one that will continue to require the solid grounding of a traditional health professional, to steer data toward usefulness, to avoid unwarranted anxiety or even unnecessary testing, and to zero in on actual causes, not just correlations within particular health trends.
“That’s why I think, if you talk to a lot of epidemiologists, they may be suspicious of some of these big data-type approaches,” says Mooney, “because they’d be concerned that there’s a loss of attention to causation.”
The most high-profile lesson in failed causation was Google Flu Trends.
In 2008, Google researchers decided to measure flu activity, in real time, based on users’ Web searches. It was a headline-grabbing project and worked well — for a while. Academic researchers who later did a postmortem on the project, David Lazer and Ryan Kennedy, wrote in Wired magazine:
“GFT failed — and failed spectacularly — missing at the peak of the 2013 flu season by 140 percent. …
“While Google’s efforts in projecting the flu were well meaning, they were remarkably opaque in terms of method and data — making it dangerous to rely on Google Flu Trends for any decision-making.
“For example, Google’s algorithm was quite vulnerable to overfitting to seasonal terms unrelated to the flu, like ‘high school basketball.’ … There were bound to be searches that were strongly correlated by pure chance, and these terms were unlikely to be driven by actual flu cases or predictive of future trends.”
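The point about chance correlations can be illustrated with a small simulation. The sketch below is not Google's method or data; it generates a made-up "flu" series and a large pool of unrelated random "search term" series, then reports the strongest correlation found among them. The function names `pearson` and `best_chance_match`, the 20-week window and the 2,000 candidate terms are all assumptions chosen for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def best_chance_match(n_weeks=20, n_terms=2000, seed=0):
    """Strongest absolute correlation between a random 'flu' series
    and n_terms independent random 'search term' series."""
    rng = random.Random(seed)
    # Hypothetical weekly flu activity: pure noise, no real signal.
    flu = [rng.gauss(0, 1) for _ in range(n_weeks)]
    best = 0.0
    for _ in range(n_terms):
        # Each candidate term is generated independently of the flu series.
        term = [rng.gauss(0, 1) for _ in range(n_weeks)]
        best = max(best, abs(pearson(flu, term)))
    return best


if __name__ == "__main__":
    print(f"strongest chance correlation: {best_chance_match():.2f}")
```

Because every candidate series is independent of the "flu" series by construction, any strong match the search turns up is a coincidence, and it would not be expected to hold on new weeks of data. Searching more terms against fewer weeks makes such coincidences more likely.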
The project’s failure, however, does not negate the promise of big data in health. Beyond analyses of large-scale trends, capturing passively created data on people’s sentiments, mental ups and downs, things you may not ever think to bring up with your physician can be “very powerful,” Brownstein says. (Of course, with proper privacy and security protections in mind.)
“It’s not data that can be used in a silo, it’s one gear in the system,” he says, “so it’s not like this holy grail. It’s just data that can be used, that can be harnessed, in conjunction with other types of information strains.”
To Mooney, Google Flu Trends was a case of a hype cycle, “this concept that technologies get overhyped and then are disappointing, but sometime after the disappointment, can often return to a sort of plateau of usefulness.”
And in that is a lesson on big data in health: It deserves both enthusiasm and caution.
“Ideally, I’d like people to embrace both of them,” says Mooney, “to recognize that it’s exciting and concerning at the same time. Because the world is messy and it’s possible to be exciting and concerning at the same time.”
This article was originally published on www.npr.com and can be viewed in full

