Tag Archives: Big Data

The Dying Algorithm

CREDIT: NYT Article on the Dying Algorithm

This Cat Sensed Death. What if Computers Could, Too?
By Siddhartha Mukherjee
Jan. 3, 2018

Of the many small humiliations heaped on a young oncologist in his final year of fellowship, perhaps this one carried the oddest bite: A 2-year-old black-and-white cat named Oscar was apparently better than most doctors at predicting when a terminally ill patient was about to die. The story appeared, astonishingly, in The New England Journal of Medicine in the summer of 2007. Adopted as a kitten by the medical staff, Oscar reigned over one floor of the Steere House nursing home in Rhode Island. When the cat would sniff the air, crane his neck and curl up next to a man or woman, it was a sure sign of impending demise. The doctors would call the families to come in for their last visit. Over the course of several years, the cat had curled up next to 50 patients. Every one of them died shortly thereafter.
No one knows how the cat acquired his formidable death-sniffing skills. Perhaps Oscar’s nose learned to detect some unique whiff of death — chemicals released by dying cells, say. Perhaps there were other inscrutable signs. I didn’t quite believe it at first, but Oscar’s acumen was corroborated by other physicians who witnessed the prophetic cat in action. As the author of the article wrote: “No one dies on the third floor unless Oscar pays a visit and stays awhile.”
The story carried a particular resonance for me that summer, for I had been treating S., a 32-year-old plumber with esophageal cancer. He had responded well to chemotherapy and radiation, and we had surgically resected his esophagus, leaving no detectable trace of malignancy in his body. One afternoon, a few weeks after his treatment had been completed, I cautiously broached the topic of end-of-life care. We were going for a cure, of course, I told S., but there was always the small possibility of a relapse. He had a young wife and two children, and a mother who had brought him weekly to the chemo suite. Perhaps, I suggested, he might have a frank conversation with his family about his goals?

But S. demurred. He was regaining strength week by week. The conversation was bound to be “a bummah,” as he put it in his distinct Boston accent. His spirits were up. The cancer was out. Why rain on his celebration? I agreed reluctantly; it was unlikely that the cancer would return.

When the relapse appeared, it was a full-on deluge. Two months after he left the hospital, S. returned to see me with sprays of metastasis in his liver, his lungs and, unusually, in his bones. The pain from these lesions was so terrifying that only the highest doses of painkilling drugs would treat it, and S. spent the last weeks of his life in a state bordering on coma, unable to register the presence of his family around his bed. His mother pleaded with me at first to give him more chemo, then accused me of misleading the family about S.’s prognosis. I held my tongue in shame: Doctors, I knew, have an abysmal track record of predicting which of our patients are going to die. Death is our ultimate black box.

In a survey led by researchers at University College London of over 12,000 prognoses of the life span of terminally ill patients, the hits and misses were wide-ranging. Some doctors predicted deaths accurately. Others underestimated death by nearly three months; yet others overestimated it by an equal magnitude. Even within oncology, there were subcultures of the worst offenders: In one story, likely apocryphal, a leukemia doctor was found instilling chemotherapy into the veins of a man whose I.C.U. monitor said that his heart had long since stopped.

But what if an algorithm could predict death? In late 2016 a graduate student named Anand Avati at Stanford’s computer-science department, along with a small team from the medical school, tried to “teach” an algorithm to identify patients who were very likely to die within a defined time window. “The palliative-care team at the hospital had a challenge,” Avati told me. “How could we find patients who are within three to 12 months of dying?” This window was “the sweet spot of palliative care.” A lead time longer than 12 months can strain limited resources unnecessarily, providing too much, too soon; in contrast, if death came less than three months after the prediction, there would be no real preparatory time for dying — too little, too late. Identifying patients in the narrow, optimal time period, Avati knew, would allow doctors to use medical interventions more appropriately and more humanely. And if the algorithm worked, palliative-care teams would be relieved from having to manually scour charts, hunting for those most likely to benefit.

Avati and his team identified about 200,000 patients who could be studied. The patients had all sorts of illnesses — cancer, neurological diseases, heart and kidney failure. The team’s key insight was to use the hospital’s medical records as a proxy time machine. Say a man died in January 2017. What if you scrolled time back to the “sweet spot of palliative care” — the window between January and October 2016 when care would have been most effective? But to find that spot for a given patient, Avati knew, you’d presumably need to collect and analyze medical information before that window. Could you gather information about this man during this prewindow period that would enable a doctor to predict a demise in that three-to-12-month section of time? And what kinds of inputs might teach such an algorithm to make predictions?
Avati drew on medical information that had already been coded by doctors in the hospital: a patient’s diagnosis, the number of scans ordered, the number of days spent in the hospital, the kinds of procedures done, the medical prescriptions written. The information was admittedly limited — no questionnaires, no conversations, no sniffing of chemicals — but it was objective, and standardized across patients.
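The "time machine" labeling described above can be sketched in a few lines. This is a minimal illustration, not the Stanford team's code: the function names, the 90-day and 365-day cutoffs standing in for three and 12 months, and the (date, code) event format are all assumptions made for the example.

```python
from datetime import date

# Assumed stand-ins for "three to 12 months"; illustrative values only.
WINDOW_START_DAYS = 90
WINDOW_END_DAYS = 365

def label_for_prediction_date(prediction_date, death_date):
    """Return 1 if death falls inside the window after prediction_date, else 0."""
    if death_date is None:  # no recorded death
        return 0
    days_to_death = (death_date - prediction_date).days
    return int(WINDOW_START_DAYS <= days_to_death <= WINDOW_END_DAYS)

def features_before(events, prediction_date):
    """Count coded events (date, code) recorded strictly before prediction_date."""
    counts = {}
    for event_date, code in events:
        if event_date < prediction_date:
            counts[code] = counts.get(code, 0) + 1
    return counts

# Hypothetical patient who died in January 2017.
death = date(2017, 1, 15)
events = [
    (date(2016, 3, 1), "scan"),
    (date(2016, 4, 1), "scan"),
    (date(2016, 7, 1), "admission"),
]
in_window = label_for_prediction_date(date(2016, 6, 1), death)
too_late = label_for_prediction_date(date(2016, 12, 1), death)
inputs = features_before(events, date(2016, 6, 1))
```

Only events dated before the prediction date are counted, so the inputs never include information from the window the label is asking about.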

These inputs were fed into a so-called deep neural network — a kind of software architecture thus named because it’s thought to loosely mimic the way the brain’s neurons are organized. The task of the algorithm was to adjust the weights and strengths of each piece of information in order to generate a probability score that a given patient would die within three to 12 months.
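To make "adjusting the weights" concrete, here is a toy sketch. It is deliberately not a deep neural network: it is a single logistic unit, the simplest building block of one, trained by gradient descent on made-up numbers. Every name, feature and value in it is hypothetical.

```python
import math

def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, features):
    """Weighted sum of the inputs, turned into a probability score."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def train(rows, labels, epochs=2000, lr=0.1):
    """Nudge each weight after every example to shrink the prediction error."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = predict(weights, bias, x) - y  # log-loss gradient w.r.t. z
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias

# Made-up toy data: [scans / 10, hospital days / 30]; label 1 = died in window.
rows = [[0.1, 0.0], [0.2, 0.1], [1.5, 1.0], [2.1, 2.0]]
labels = [0, 0, 1, 1]
weights, bias = train(rows, labels)

high_risk = predict(weights, bias, [2.1, 2.0])
low_risk = predict(weights, bias, [0.1, 0.0])
```

A deep network stacks many such units in layers, which is part of why its reasoning is harder to read off than the two weights learned here.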

The “dying algorithm,” as we might call it, digested and absorbed information from nearly 160,000 patients to train itself. Once it had ingested all the data, Avati’s team tested it on the remaining 40,000 patients. The algorithm performed surprisingly well. The false-alarm rate was low: Nine out of 10 patients predicted to die within three to 12 months did die within that window. And 95 percent of patients assigned low probabilities by the program survived longer than 12 months. (The data used by this algorithm can be vastly refined in the future. Lab values, scan results, a doctor’s note or a patient’s own assessment can be added to the mix, enhancing the predictive power.)
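The arithmetic behind these figures is simple enough to check. The sketch below uses the counts quoted above; the ten individual outcomes are invented to illustrate what "nine out of 10" means, and the function names are hypothetical.

```python
def split_counts(total, train_parts, total_parts):
    """Split `total` records into train/test counts using integer arithmetic."""
    train = total * train_parts // total_parts
    return train, total - train

def share_correct(outcomes):
    """Fraction of flagged patients for whom the prediction held (1 = held)."""
    return sum(outcomes) / len(outcomes)

# Figures quoted in the article: about 200,000 patients, 160,000 for training.
train_n, test_n = split_counts(200_000, 4, 5)

# Hypothetical outcomes for 10 patients flagged as likely to die in the
# window: 1 means the patient did die within it.
flagged_outcomes = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
ppv = share_correct(flagged_outcomes)
```

The share computed here is what statisticians call positive predictive value; the 95 percent figure for low-probability patients is the corresponding measure for negative predictions.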

So what, exactly, did the algorithm “learn” about the process of dying? And what, in turn, can it teach oncologists? Here is the strange rub of such a deep learning system: It learns, but it cannot tell us why it has learned; it assigns probabilities, but it cannot easily express the reasoning behind the assignment. Like a child who learns to ride a bicycle by trial and error and, asked to articulate the rules that enable bicycle riding, simply shrugs her shoulders and sails away, the algorithm looks vacantly at us when we ask, “Why?” It is, like death, another black box.

Still, when you pry the box open to look at individual cases, you see expected and unexpected patterns. One man assigned a score of 0.946 died within a few months, as predicted. He had had bladder and prostate cancer, had undergone 21 scans, had been hospitalized for 60 days — all of which had been picked up by the algorithm as signs of impending death. But a surprising amount of weight was seemingly put on the fact that scans were made of his spine and that a catheter had been used in his spinal cord — features that I and my colleagues might not have recognized as predictors of dying (an M.R.I. of the spinal cord, I later realized, was most likely signaling cancer in the nervous system — a deadly site for metastasis).
It’s hard for me to read about the “dying algorithm” without thinking about my patient S. If a more sophisticated version of such an algorithm had been available, would I have used it in his case? Absolutely. Might that have enabled the end-of-life conversation S. never had with his family? Yes. But I cannot shake some inherent discomfort with the thought that an algorithm might understand patterns of mortality better than most humans. And why, I kept asking myself, would such a program seem so much more acceptable if it had come wrapped in a black-and-white fur box that, rather than emitting probabilistic outputs, curled up next to us with retracted claws?

Siddhartha Mukherjee is the author of “The Emperor of All Maladies: A Biography of Cancer” and, more recently, “The Gene: An Intimate History.”

Precision Wellness at Mt Sinai

Mt Sinai announcement

Mount Sinai to Establish Precision Wellness Center to Advance Personalized Healthcare

Mount Sinai Health System Launches Telehealth Initiatives

Joshua Harris, co-founder of Apollo Global Management, and his wife, Marjorie, have made a $5 million gift to the Icahn School of Medicine at Mount Sinai to establish the Harris Center for Precision Wellness. As part of the Icahn Institute for Genomics and Multiscale Biology, the new center will leverage innovative approaches to health monitoring and wellness management by integrating emerging technologies in digital health, data science, and genomics to enable people’s health to be treated in precise, highly individualized ways.

A first-of-its-kind at a major U.S. academic medical institution, the precision wellness research programs will be closely tied to clinical initiatives across the Mount Sinai Health System. The Harris Center’s immediate efforts will focus on digital health, molecular profiling, and data science. The Center is evaluating the usability of wearable devices to see how effectively they can measure activity, stress, sleep, cognitive functioning, mood, and environmental exposures, and is using sequencing technology to bring DNA, microbiome, and immune system profiles into predictive models of wellness.

Additionally, the new Center will apply state-of-the-science analytics and machine learning to the wealth of individualized metrics to produce actionable, data-driven insights into key aspects of wellness, and to help lead the way to a nextgen healthcare that is scalable and far superior to anything now available.

Joel Dudley, PhD, a highly regarded genomics and bioinformatics expert at the Icahn Institute, and Gregory Stock, PhD, an accomplished life-science entrepreneur and technology-innovation expert, will serve as the Harris Center directors.

“We are deeply grateful to Mr. Harris for his generosity, vision, and passion,” said Dr. Dudley. “His gift will help realize the promise we see in new digital health technologies such as wearable sensors and mobile applications. By drawing upon the core competencies in genomics, multiscale biology, bioinformatics, data science, population health, and clinical trial design at the Icahn Institute, the Harris Center initiatives will further enhance Mount Sinai’s reputation as one of the world’s premier innovators in personalized healthcare. It is exciting to have an opportunity to integrate and apply these emerging technologies in a meaningful and scientific way in the pursuit of optimal wellness, vitality, and preventive care.”

IBM’s Watson

Refer to: April 2015 Blog Post on Apple, J&J, and Watson

IBM says each person generates one million gigabytes of health-related data across his or her lifetime, the equivalent of more than 300 million books. IBM launched the new Watson Health business unit to help patients, physicians, researchers and insurers use data to achieve better health and wellness for all.

Rhee on IBM’s Watson

IEEE on IBM’s Watson

IBM’s Dr. Watson Will See You…Someday

The game-show-winning AI struggles to find the answers in health care

By Brandon Keim
Posted 29 May 2015 | 8:00 GMT

Four years ago, Neil Mehta was among the 15 million people who watched Ken Jennings and Brad Rutter—the world’s greatest “Jeopardy!” players—lose to an IBM-designed collection of circuits and algorithms called Watson.

“Jeopardy!” is a television game show in which the host challenges contestants with answers for which they must then supply the questions—a task that involves some seriously complicated cognition. Artificial-intelligence experts described Watson’s triumph as even more extraordinary than IBM supercomputer Deep Blue’s history-making 1997 defeat of chess grandmaster Garry Kasparov.

To an AI aficionado, Watson was a tour de force of language analysis and machine reasoning. To Mehta, a physician and professor at the world-renowned Cleveland Clinic, Watson was a question unto itself: What might be possible were Watson’s powers turned to medicine? “I love technology, and I was rooting for Watson,” says Mehta. “I knew that the world was changing. And if not Watson, then something like it, with artificial intelligence, was needed to help us.”

Mehta wasn’t the first doctor to dream of a computer coming to his rescue. There’s a rich history of medical AIs, from Internist-1—a 1970s-era program that encoded the expertise of internal-medicine guru Jack Myers and gave rise to the popular Quick Medical Reference program—to contemporary software like Isabel and DXplain, which can outperform human doctors in making diagnoses. Even taken-for-granted ubiquities like PubMed literature searches and automated patient-alert systems demonstrate forms of intelligence.

Powerful as those programs may be, though, they’re not always considered smart. Watson, with its ability to process natural language, make inferences, and learn from mistakes, embodied something much more sophisticated. It also arrived at an opportune time: Health care, particularly in the United States, was finally experiencing a digital overhaul.

These days, clinical findings, research databases, and journal articles are all available in machine-readable form, making them that much easier to feed to a computer. And federal mandates have made electronic medical records nearly universal. Therefore, software is more tightly integrated than ever into medicine, and there’s a sense that making health care more effective and less expensive requires improved programming.

So it’s no wonder that shortly after Watson’s “Jeopardy!” triumph, IBM announced that it would make Watson available for medical applications. The tech press buzzed in anticipation of “Dr. Watson.” What was medicine, after all, but a series of logical inferences based on data? Four years later, however, the predicted revolution has yet to occur. “They are making some headway,” says Robert Wachter, a specialist in hospital medicine at the University of California, San Francisco, and author of The Digital Doctor: Hope, Hype, and Harm at the Dawn of Medicine’s Computer Age (McGraw-Hill, 2015). “But in terms of a transformative technology that is changing the world, I don’t think anyone would say Watson is doing that today.”

Where’s the delay? It’s in our own minds, mostly. IBM’s extraordinary AI has matured in powerful ways, and the appearance that things are going slowly reflects mostly on our own unrealistic expectations of instant disruption in a world of Uber and Airbnb. Improving health care represents a profound challenge, and “going to medical school,” as so many headlines have quipped of Watson, takes time.

Impressive as that original “Jeopardy!”-blitzing Watson was, in medical contexts such an automaton is not really useful. After all, that version of Watson was fine-tuned specifically for one trivia game. It couldn’t play The Settlers of Catan, much less make useful recommendations about a 68-year-old man with diabetes and heart palpitations. “ ‘Watson, given my medical record, which is hundreds of pages long, what is wrong with me?’ That’s a question,” says Watson software engineer Mike Barborak. “But it wasn’t a good question for Watson to answer.”

Watson’s engine was powerful, but it needed to be adapted for medicine and, within that broad field, to specific disciplines and tasks. Watson is not a singular program; rather, in the words of Watson research director Eric Brown, it’s a “collection of cognitive-computing technologies that are combined and instantiated in different ways for each of the solutions.”

So there are many different Watsons now being applied to medicine. Some of the first could be found at the Cleveland Clinic, Memorial Sloan Kettering Cancer Center, MD Anderson Cancer Center, and insurance company WellPoint (now called Anthem), each of which started working with IBM to develop its own health-care-adapted version of Watson about three years ago. Two years later, as the hardware shrank from room-size to small enough for a server rack, another round of companies signed on to collaborate with IBM. Among these are Welltok, makers of personal health advisory software; @Point of Care, which is trying to customize treatments for multiple sclerosis; and Modernizing Medicine, which uses Watson to analyze hundreds of thousands of patient records and build treatment models so doctors can see how similar cases have been handled.

Watson’s training is an arduous process, bringing together computer scientists and clinicians to assemble a reference database, enter case studies, and ask thousands of questions. When the program makes mistakes, it self-adjusts. This is what’s known as machine learning, although Watson doesn’t learn alone. Researchers also evaluate the answers and manually tweak Watson’s underlying algorithms to generate better output.

Here there’s a gulf between medicine as something that can be extrapolated in a straightforward manner from textbooks, journal articles, and clinical guidelines, and the much more complicated challenge of also codifying how a good doctor thinks. To some extent those thought processes—weighing evidence, sifting through thousands of potentially important pieces of data and alighting on a few, handling uncertainty, employing the elusive but essential quality of insight—are amenable to machine learning, but much handcrafting is also involved.

It’s slow going, especially as each iteration of Watson needs to be tested with new questions. There’s a certain irony: While modern AI researchers tend to look down on earlier medical AIs like Internist-1 as primitive rules-based attempts to codify expertise, today’s medical Watsons are trying to do something similar, albeit in a far more sophisticated way.

Expectations have also been altered in another respect. Watson’s text-processing powers (its “Jeopardy!” database contained some 200 million pages of text) seemed to make it an ideal tool for handling the rapid growth of medical literature, which doubles in size every five years or so. But a big pool of information isn’t always better. Sometimes, as when modeling the decisions of top lung-cancer specialists at Memorial Sloan Kettering, there aren’t yet journal articles or clinical guidelines with the right answers. The way to arrive at them is by watching doctors practice. And even when relevant data do exist, they are often most useful when presented in smaller expert-curated sets.
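For a sense of scale, the "doubles in size every five years or so" figure can be converted into a yearly growth rate. This is a back-of-the-envelope sketch that assumes smooth exponential growth; the function names are hypothetical.

```python
def annual_growth_rate(doubling_years):
    """Yearly growth rate implied by a given doubling time."""
    return 2 ** (1 / doubling_years) - 1

def growth_factor(years, doubling_years):
    """How many times larger the literature is after `years`."""
    return 2 ** (years / doubling_years)

rate = annual_growth_rate(5)        # roughly 15 percent a year
two_decades = growth_factor(20, 5)  # four doublings
```

Four doublings in 20 years means a physician who trained two decades ago faces about 16 times as much literature today.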

Another issue is data quality. WatsonPaths, which Mehta has been developing at the Cleveland Clinic, is the closest thing yet to that archetypal Dr. Watson, but it can work only if the AI can make sense of a patient’s records. As of now, electronic medical records are often an arcane collection of error-riddled data originally structured with hospital administration in mind rather than patient care.

Mehta’s other project, then, is the Watson Electronic Medical Records Assistant, in which the computer is trained to distill those records into something doctors and the program itself might actually use. “That has been a challenge,” says Mehta. “We are not there yet.”

The issues with electronic records underscore the fact that each Watson, whatever its theoretical potential, is deployed in the all-too-human—and often all-too-inhuman—reality of modern health care. Watson can’t make up for the shortage in primary-care physicians or restore the crucial doctor-patient bond lost in an era of 5-minute office visits.

Most fundamentally, Watson alone can’t change the fee-for-service reimbursement structure, common in the United States, which makes the quantity of care—the number of tests, treatments, and specialist visits—more profitable than bottom-line quality. “It’s not just a technology problem,” says Wachter. “It’s a social, clinical, policy, ethical, and institutional problem.”

Watson can’t address all those issues, but it might, perhaps, ameliorate some of them. Better medical-record processing could make for extra minutes with patients instead of extra screens. And helping doctors to analyze hospital and research data could make it easier for them to practice effective evidence-based medicine.

While its “Jeopardy!” triumph was “a great shot in the arm” for the field, says Mark Musen, a professor of medical informatics at Stanford, IBM is just one of many companies and institutions in the medical-AI space. Indeed, mention of Watson sometimes raises hackles within that community. It’s a response rooted partly in competitiveness, but also in a sense that attention to Watson has obscured the accomplishments of others.

Take the AI that Massachusetts General Hospital developed called QPID (Queriable Patient Inference Dossier), which analyzes medical records and was used in more than 3.5 million patient encounters last year. Diagnostic programs like DXplain and Isabel are already endorsed by the American Medical Association, and startup company Enlitic is working on its own diagnostics. The American Society of Clinical Oncology built its big-data-informed CancerLinQ program as a demonstration of what the Institute of Medicine, part of the U.S. National Academies, called a “learning health system.” Former Watson developer Marty Kohn is now at Sentrian, designing programs to analyze data generated from home-based health-monitoring apps.

Meanwhile, IBM is making its own improvements. In addition to refinements in learning techniques, Watson’s programmers have recently added speech recognition and visual-pattern analysis to their toolbox. Future versions might, like the fictional HAL 9000 of sci-fi fame, see and hear. They might also collaborate: Innovations in individual deployments could eventually be shared across the platform, turning the multiplicity of Watsons into a giant laboratory for developing new tools.

How will all this shake out? When will AI transform medicine, or at least help improve it in significant ways? It’s too soon to say. Medical AI is about where personal computers were in the 1970s, when IBM was just beginning to work on desktop computers, Bill Gates was writing Altair BASIC, and a couple of guys named Steve were messing around in a California garage. The application of artificial intelligence to health care will, similarly, take years to mature. But it could blossom into something big.

This article originally appeared in print as “Dr. Watson Will See You… Someday.”

An abridged version of this article appeared in the June 2015 issue of IEEE Spectrum.


The human brain’s wiring diagram – with 100 trillion connections between neurons – is called the “connectome”. The idea has been around since the 1960s, but there is a new explosion in understanding.

The last time there was this much excitement was in 1986, when Sydney Brenner, Nobel Laureate, was given an entire issue of The Philosophical Transactions of the Royal Society of London (this was Isaac Newton’s venue). Brenner published “the Book”, in which he documented the mapping of a transparent worm and its 302 neurons.

Even so, “The race to map the connectome has hardly left the starting line”.

“If the cells and fiber in one human brain were all stretched out end to end, they would certainly reach to the moon and back. Yet the fact that they are not arranged end to end enabled man to go there himself. The astonishing tangle within our heads makes us what we are.”
–Colin Blakemore, a physiologist from the UK

Since 2005, Sebastian Seung at MIT has been trying to map this incredible phenomenon, and NYT Magazine wrote about his work this Sunday, January 11, 2015 (“Mind Games” by Gareth Cook). He left MIT in 2013 and has now joined his mentor David Tank at the Princeton Bezos Center for Neural Circuit Dynamics.

He started in 2006 by studying, in Germany with two graduate students, the high-resolution brain-imaging work of Winfried Denk, a scientist who had built a device that, according to David Tank, imaged brain tissue with enough resolution to make out the connections between individual neurons. The problem was that the images were very blurry, and tracing and mapping each connection was a “herculean effort”. So the big question was: could this task be automated?

Obviously, this relates to the phenomenal leaps ahead in mapping made possible by computer analysis. Another example of this is the Human Genome Project, which mapped the DNA that provides every cell’s genetic instructions. That breakthrough was followed by work on the proteome (proteins) and the foldome (the folding of proteins). Note that the U.S. Government has “The Brain Initiative”, a 12-year, $4.5 billion brain-mapping project.

So the “connectome” is the brain’s physical structure, which must be mapped. At the same time, a major effort is underway that is separate – namely, to map the areas of the brain that “light up” and therefore are related to certain cognitive functions.

This reference to “physical structure” is meaningful, because people tend to think of the brain in terms of movement: a dynamic “flow” like a river, not a physical structure like a river bed.

Haim Sompolinsky studied this structure to understand “aha” moments in learning. This relates to an ancient idea – from Plato and Aristotle – that meaning emerges from the links between things. And in the 21st century, it appears that there is physical terrain that embodies this ancient concept: the links between neurons (note that William James described mental processes as associations).

A typical human neuron has thousands of connections. A neuron can be as narrow as one ten-thousandth of a millimeter and yet it can stretch from one side of the head to the other!


Wikipedia on Connectome

NYT Article on Connectome

Santiago Ramon y Cajal – illustrations of neurons and neural networks

Nobel Laureate Sydney Brenner mapped the 302 neurons of a transparent worm in the seventies. He wanted to understand how behavior emerges from a biological system.

EyeWire – online game that challenges the public to trace neuronal wiring in the retina of a mouse’s eye (has 165,000 players in 164 countries).

connectionism – a cross-disciplinary idea that simple units, connected in the right ways, can lead to surprising abilities (memory, recognition, reasoning).

Harald Hess – a genius in creating scientific tools. “MERLIN” – new brain imaging system. Janelia Research Campus. They believe they will have mapped a fruit fly’s neural network within two years.

Google: announced in September 2014 at the White House that they had launched their own connectome project. Tom Dean is a Google research scientist who also works with the Allen Institute. He wants the “Google Earth of the Brain”!!!!
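The connectionism note above can be made concrete with a classic textbook toy. No single threshold unit can compute "exclusive or" (fire when exactly one input is on), but three of them wired together can. This is an illustrative sketch, not anything taken from the connectome research itself; the names, weights and thresholds are chosen by hand for the example.

```python
def unit(inputs, weights, threshold):
    """A simple unit: fires (1) when the weighted sum of its inputs reaches the threshold."""
    return int(sum(i * w for i, w in zip(inputs, weights)) >= threshold)

def xor(a, b):
    """Three simple units wired together compute exclusive-or."""
    either = unit([a, b], [1, 1], 1)          # fires if at least one input is on
    both = unit([a, b], [1, 1], 2)            # fires only if both inputs are on
    return unit([either, both], [1, -1], 1)   # "either" excited, "both" inhibits

truth_table = [xor(a, b) for a in (0, 1) for b in (0, 1)]
```

The ability lives in the wiring: each unit only adds and compares, and changing the connections changes what the network computes.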

Weather: Big Data & Big Forecast Progress

Weather Channel Article

Behind the Scenes: How We Make Billions of Forecasts 96 Times a Day

David Kenny
Chairman & CEO at The Weather Company

In the past several months, we have improved the accuracy of precipitation forecasts by another 15% and the accuracy of temperature forecasts by 2 degrees Fahrenheit. This is entirely because we are now forecasting for every location on Earth (at least 3 billion named locations) instead of mapping people to the approximately 2MM locations where forecasts have traditionally been done.

….mobile access is always current. So this means that our current day forecasts are now updated 96 times per day, and days 2-15 are updated at least 24 times per day.

….Our new forecasting platform is the largest application in the Amazon Web Services cloud. We are currently scaled to deliver 115,000 unique forecasts per second, or more than 10 Billion forecasts per day.
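A quick sanity check on these rates, assuming the per-second figure is sustained around the clock (an assumption; the article does not say whether 115,000 is a peak or an average). The function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_DAY = 24 * 60

def forecasts_per_day(per_second):
    """Daily volume if the per-second rate held for a full day."""
    return per_second * SECONDS_PER_DAY

def minutes_between_updates(updates_per_day):
    """Spacing between evenly timed updates."""
    return MINUTES_PER_DAY / updates_per_day

daily = forecasts_per_day(115_000)
current_day_gap = minutes_between_updates(96)   # current-day forecasts
extended_gap = minutes_between_updates(24)      # days 2-15
```

At a steady 115,000 per second the daily total comes to about 9.9 billion, in line with the roughly 10 billion quoted, and 96 updates a day works out to a refresh every 15 minutes.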

….We are voracious consumers of data to make this happen. We build on the incredibly skillful global models of government agencies, such as NOAA and ECMWF, as well as models built by some of the world’s great universities and research organizations. We add the data we collect from airplanes (which provide that data as part of our aviation weather service contracts with most airlines and private operators). We also work with tens of thousands of individuals who buy and build personal weather stations, connect those stations to our Weather Underground network and send us their own weather data every 2.5 seconds. And we are exploring new potential observations from home sensors, windshield wipers, smartphone air pressure gauges, and literally billions of additional sources that can sustain this faster rate of innovation and improved forecasting skill.

….Two years ago, we began creating this new forecasting system. The bulk of the work was building the statistical optimization to calibrate and blend the world’s best global weather models. And then we supported these algorithms with more and faster computing power, as well as more observations, to enable constant machine learning. At the same time, machine learning and algorithmic forecasts sometimes miss enough color for people to truly know what the forecast means, and how to act. So we invented an exception process when our human meteorologists and social scientists need to interpret and verbalize certain forecasts, especially when life and property-threatening storm systems are predicted.
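The article does not describe how the "statistical optimization to calibrate and blend" works, so the following is only a sketch of one simple way such blending could be done, not The Weather Company's method: choose the weight on each of two models that minimizes squared error against past observations. The temperatures and names are invented.

```python
def best_blend_weight(model_a, model_b, observed):
    """Weight w on model_a (1 - w on model_b) minimizing squared error.

    Closed-form least-squares solution for a two-model blend.
    """
    numerator = sum((a - b) * (o - b) for a, b, o in zip(model_a, model_b, observed))
    denominator = sum((a - b) ** 2 for a, b in zip(model_a, model_b))
    if denominator == 0:  # the models agree everywhere; any weight works
        return 0.5
    return numerator / denominator

def blend(model_a, model_b, w):
    """Combine two forecasts using weight w on model_a."""
    return [w * a + (1 - w) * b for a, b in zip(model_a, model_b)]

# Invented past temperatures: one model runs warm, the other runs cold.
observed = [10.0, 12.0, 15.0]
model_a = [11.0, 13.0, 16.0]
model_b = [9.0, 11.0, 14.0]

w = best_blend_weight(model_a, model_b, observed)
blended = blend(model_a, model_b, w)
```

In this contrived case the two biases cancel exactly at equal weights; with real data the fitted weight would shift toward whichever model has been more accurate for a given place and lead time.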

Weather Channel Dashboard

History of Computing


See also 2001 and 2007 posts (this is a more extensive, more current update):