Computational Techniques, AI and Machine Learning: Will Big Data Save Psychiatry?
Computational techniques illuminate such illnesses as depression and schizophrenia, forging new paradigms for classifying and treating the thorniest psychiatric disorders.
Each year, before their annual meeting on machine learning, a group of electrical engineers who work with huge sets of data dream up a problem for their peers to solve. Most of the themes have dealt with arcane engineering topics, but three years ago organizers wanted to tackle something a bit more practical.
Could artificial intelligence be used to solve problems in medicine?
One of the competition's organizers, Rogers Silva, a postdoctoral researcher at the Mind Research Network in Albuquerque, had been experimenting with data from brain scans. He wanted to know how to extract information from brain-imaging data sets "to make more accurate inferences about diseases, causes, and what features would be good for a diagnostic," he says. A skilled physician can determine a lot from a brain scan, but Silva was looking for something more—subtle patterns invisible to the naked eye.
Silva posed this question to his colleagues: Can experts in big data look at brain scans from patients with schizophrenia and healthy controls and determine which is which?
The brains of people with schizophrenia are different in countless subtle ways from those of people without the disease. The differences are likely due to multiple genetic variants in the brain, many of which might not be harmful by themselves.
For the competition, Silva collected the brain scans of 144 people, about half of whom had schizophrenia. The data from the scans were given to some 245 competing teams, which produced 2,000 entries (teams were allowed to submit more than one).
The patient scans were divided into two groups. The competing teams were told which patients in the first group did and did not have schizophrenia. Using that labeled data, the teams looked for differences between the scans of those with schizophrenia and those without. They then applied what they had learned to predict which patients in the second cohort, for whom no diagnostic information was shared, had schizophrenia. Silva and his colleagues were impressed with the results. The best submissions distinguished between patients and controls with only about a 10 percent rate of error, far better than chance alone.
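The setup of the competition, learning from a labeled cohort and then being scored on a cohort whose labels are withheld, can be sketched in Python. This is only an illustration under stated assumptions: the features are synthetic random numbers rather than real scan data, the 86/58 split is invented, and the nearest-centroid classifier is a simple stand-in, not the method any team actually used.

```python
import random

def centroid(rows):
    """Mean feature vector of a list of equal-length feature lists."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit(train_x, train_y):
    """Learn one centroid per label from the labeled cohort."""
    return {lab: centroid([x for x, y in zip(train_x, train_y) if y == lab])
            for lab in sorted(set(train_y))}

def predict(model, x):
    """Assign the label whose centroid is closest to x."""
    return min(model, key=lambda lab: sq_dist(model[lab], x))

def make_subject(rng, label, n_features=20, shift=0.8):
    # Synthetic stand-in for scan-derived features (assumption):
    # patients (label 1) are shifted slightly on every feature.
    return [rng.gauss(shift * label, 1.0) for _ in range(n_features)]

rng = random.Random(0)
labels = [i % 2 for i in range(144)]            # about half are patients
subjects = [make_subject(rng, lab) for lab in labels]

# First cohort: labels shared with the teams. Second cohort: labels withheld.
train_x, train_y = subjects[:86], labels[:86]   # split sizes are hypothetical
test_x, test_y = subjects[86:], labels[86:]

model = fit(train_x, train_y)
predictions = [predict(model, x) for x in test_x]
error_rate = sum(p != y for p, y in zip(predictions, test_y)) / len(test_y)
print(f"held-out error rate: {error_rate:.2f}")
```

The important design choice is that the error rate is computed only on subjects the model never saw during training, which is what makes a figure like "10 percent error" meaningful.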
Could these techniques be used to diagnose patients routinely? "That's conceivable," Silva says. "The work we did is a first step." The goal is not to replace psychiatrists with circuits, but to help psychiatrists do what they are already doing. Silva expects that to happen in less than five years.
In this form of applied artificial intelligence, "Machines have the ability to guide their own learning, if you will," says Hugh Garavan, an associate professor of psychiatry at the University of Vermont. Machine learning is the term for allowing a computer program to wander through data and attempt to identify what's important, without any instructions about where to look or what is likely to be crucial. It's a way of finding unexpected connections.
Garavan is using machine learning in a study he is conducting on teens and binge drinking. There is a lot of data on dopamine and the brain's reward system, and it suggests an obvious hypothesis. "That system might be blunted in kids," he says, "and if alcohol activates that system, these kids are more likely to try it again."
One could design a study pursuing that idea, Garavan says, but it would have shortcomings, as "you are essentially confirming what you already know." With machine learning, there is no presumption that the dopamine system, or anything else, is involved in binge drinking. "We can tell the algorithm: Sample the brain. Look at 1,000 brain regions. See which region at a certain age correlates best with binge drinking at a later age." With a second pass, the first associations can be confirmed. "Then you can repeat this multiple times." The algorithm the machine uses is designed by the machine, not by humans, who couldn't possibly grapple with as much data as machine learning can. Some of that information is useful and relevant to understanding disease; some is not.
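The two-pass idea Garavan describes, screening many regions for the strongest correlation and then checking that association on data the screen never saw, can be sketched as follows. Everything here is an assumption made for illustration: the data are simulated, one hypothetical region (`TRUE_REGION`) is wired to the outcome, and a plain Pearson correlation stands in for whatever the study's algorithm actually computes.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(1)
n_subjects, n_regions = 400, 1000
TRUE_REGION = 123  # hypothetical region linked to the outcome in this toy data

# regions[r][s]: measurement of region r in subject s at the earlier age
regions = [[rng.gauss(0, 1) for _ in range(n_subjects)]
           for _ in range(n_regions)]
# Later drinking score: driven by one region plus noise (simulated)
outcome = [0.6 * regions[TRUE_REGION][s] + rng.gauss(0, 1)
           for s in range(n_subjects)]

half = n_subjects // 2

def corr_in(region, lo, hi):
    """Correlation between one region and the outcome for subjects lo..hi."""
    return pearson(regions[region][lo:hi], outcome[lo:hi])

# First pass: screen every region using only the first half of the subjects.
best = max(range(n_regions), key=lambda r: abs(corr_in(r, 0, half)))

# Second pass: confirm the winning region on the held-out second half.
confirm = corr_in(best, half, n_subjects)
print(f"best region: {best}, held-out correlation: {confirm:.2f}")
```

Screening 1,000 regions will always turn up some strong-looking correlation by chance alone. A region that holds up in the second pass is far less likely to be one of those accidents.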
The approach is about a decade old. In 2006, Read Montague, a computational neuroscientist at the Virginia Tech Carilion Research Institute, was among the first to start a computational psychiatry unit. He and a colleague, Peter Dayan, of University College London, worked on computer models that could mimic the action of the neurotransmitter dopamine when looking at diseases and disorders such as addiction, Parkinson's, and some forms of psychosis.
They can now use their models to ask what happens when dopamine systems are overactive or get changed, to understand components of addiction or obsessive-compulsive disorder, for instance. This is just one way of using computation in psychiatry. The schizophrenia challenge and the binge-drinking study were data-driven. That is, they were intended to draw useful correlations from data. But Montague and Dayan's work starts with a theory and tests it. Their computer models can show how dopamine systems might work, and researchers can then determine to what extent the theoretical models apply to humans.
Computational psychiatry was slow to take off, Montague says, partly because of the success of biological approaches. Prozac and similar drugs that followed it, known collectively as SSRIs, have been quite effective at treating depression. They "can take people from suicidal to functional," he says, "but that's not an explanation" of what's wrong with the individual. Developing models of dopamine systems and other systems in the brain can help connect the dots between molecules and mental illness, as can the modeling done in machine learning.
One obvious difficulty is that living people do not welcome having someone poke around in their brains. Montague got around this problem by working with neurosurgeons. While the surgeons were operating, he asked conscious patients to play certain games, such as betting games, so that he could monitor dopamine release. "The model tells us what to look for in the brain," he says.
Some patients were asked to play Go, the complex Asian strategy game. "We thought Go was profoundly deep and complicated," he says. But by tracking brain activity and making computer models of what was observed, "we could discover things that the best players didn't see." It was yet another example of how computational psychiatry can be used to identify brain activity that couldn't be spotted by even the most expert psychiatrist.
The same idea, extracting important information from noisy data, can be used to study Parkinson's disease, a disorder of movement and balance caused by a shortage of dopamine-producing neurons. Parkinson's disease is unusual because it can take many different forms. Montague can look at the various systems affected by dopamine and determine which are affected in a given patient. Once again, the new techniques can find clarity in complicated data, and they are not limited to diagnosable conditions. "You're going to be able to collect people with traditional psychopathologies and subclassify them using these approaches," he says.
Machine Learning to Study Adolescent Drinking Behavior
Robert Whelan, a psychologist at the Trinity Institute of Neurosciences in Dublin, is using machine learning to study adolescent drinking behavior. He wanted to answer an important question about adolescents: Could information from 14-year-olds enable him to predict who would become a binge drinker at age 16? Machine learning was well suited to this, because Whelan didn't know what the answers were likely to be. He had a lot of data to work with: the brain scans of 2,500 adolescents at age 14. "We asked the machine to look through all of the data at 14 to pick out which are important, which could indicate who would become a binge drinker at age 16."
He and his colleagues ran a number of tests. "We looked at the brain structure, did cognitive tests, behavioral tests, life histories, asked whether their parents were divorced." After all the computational work was complete, Whelan reported partial success. The prediction was 70 to 75 percent accurate. "We found that it was possible to some extent to predict who was going to become a drinker," he says. "The accuracy wasn't great, but it was probably real."
But the experiment did more than show limited success at predicting binge drinking. It also revealed things that researchers hadn't expected—a virtue of machine learning. One was that "people who were more extroverted at age 14 had a higher probability of becoming binge drinkers," Whelan says. But the best predictor was that people who were more conscientious about their jobs and considerate of others were also more likely to binge drink at 16. So were adolescents who entered puberty early. These findings were slightly mysterious and unexpected, which is partly why they are so important.
Whelan worked on the binge-drinking study as a postdoctoral fellow with Hugh Garavan at the University of Vermont. One interesting finding, he says, was that drinking was more likely in adolescents who had bigger brains, which is not the kind of hypothesis that researchers would likely develop on their own. But once they discovered this, it made sense. As the brain develops during adolescence, it tends to shrink. Neurons and the connections between them get smaller as unnecessary synapses are pruned. Kids with bigger brains are less mature than those whose brains are further along in the pruning process, and kids with immature brains are more likely to drink.
One of the things that made the study possible was the large set of brain scans made available to Whelan and Garavan. To arrive at meaningful findings, researchers need huge data sets. Until recently, they didn't exist.
In any given study, Garavan says, "you might have 20 kids," far fewer than the thousands needed. There are two problems with that, he says. One is that even the best computational work cannot uncover small abnormalities without a big collection of data. The other is that without a lot of data, spurious connections can seem real: If the 20 kids in a pilot study just happen to be taller than usual, one could conclude that the taller an adolescent is, the more likely she is to be depressed, let's say. In a random group of 2,000 students, it is much less likely that the average height will differ noticeably from that of the population overall.
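Garavan's height example can be checked with a short simulation. The sketch below repeatedly draws groups of 20 and of 2,000 from the same population and measures how far each group's average height lands from the true average. The population mean and spread are illustrative assumptions, not figures from any study.

```python
import random
import statistics

rng = random.Random(2)
POP_MEAN, POP_SD = 165.0, 8.0  # assumed height distribution in cm (illustrative)

def typical_gap(sample_size, trials=200):
    """Average distance between a sample's mean height and the population mean."""
    gaps = []
    for _ in range(trials):
        sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(sample_size)]
        gaps.append(abs(statistics.fmean(sample) - POP_MEAN))
    return statistics.fmean(gaps)

gap_small = typical_gap(20)     # pilot-study scale
gap_large = typical_gap(2000)   # large-cohort scale
print(f"typical gap with 20 kids:    {gap_small:.2f} cm")
print(f"typical gap with 2,000 kids: {gap_large:.2f} cm")
```

The uncertainty in a sample average shrinks with the square root of the sample size, so a group 100 times larger strays from the population average by roughly a tenth as much.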
Big Data Collection
The National Institutes of Health is spending $300 million to collect data on 11,000 kids, starting at age 9—including brain scans, genetic tests, hormone analysis, and psychological assessments, according to Garavan. The plan is to follow up with the group periodically until they reach age 20.
For an upcoming project, Whelan is trying to use the same techniques to predict which patients will respond best to a given treatment. "When a person visits a psychiatrist, he's given treatment, then is sent away for eight weeks to see if the treatment will work." Whelan is trying to devise a way to increase the chances that patients will get the treatments that work best for them from the outset. Such an indicator of effectiveness would save money and time, and get patients feeling better much more quickly.
He's excited about the future of these studies. The strategy of the binge-drinking study is now being adopted by other researchers to look for more correlations. And that highlights an important caveat of these studies. "Our research is always correlational," he says. "We don't know what causes kids' drinking or is a consequence of their drinking—or whether they're related somehow." But the research does give experts strong leads in their search for causes.
While the work is exciting, what does it mean for doctors who want to apply the findings, or for patients who might benefit from them? Are there specific brain regions that make you more likely to commit suicide? Or to become depressed? "If we find a brain marker that is predictive, we might be able to come up with some sort of pencil-and-paper test to take advantage of that," Whelan says.
The results of the binge-drinking study, for example, could conceivably generate a 10-item questionnaire that might identify kids likely to become binge drinkers, giving schools and parents a chance to intervene. It's not magic, but it's rational and scientific—unlike a high-school vice principal making guesses about kids who might be at risk.
Computational techniques are also providing insights in areas that have already attracted a lot of attention in conventional studies. Arjun Krishnan is a computational biologist at Michigan State University doing pioneering work in the study of autism. While he studied genetics and majored in biology, he also had an interest in computer science. As he continued his studies, he realized he could bridge the two fields.
"Biologists used to say to me, 'You don't know enough biology,' and computer scientists would say, 'You don't know enough computer science.' " But that turned out to be ideal. "I knew more computer science than biologists, and I know more biology than computer scientists. The two camps couldn't talk to each other, but both could talk to me."
Krishnan has used his cross-domain straddle to examine a fundamental problem in genetics: How do genes specialize and communicate with each other? "Cells in our body can do very different things even though they have the same set of genes," he says. "A brain cell has the same genes as a heart cell." But brain cells can't contract like heart muscles, and heart cells can't make your skin crawl when Dr. Frankenstein shouts, "It's alive! It's alive!"
This has immediate practical implications, because "every disease is associated with a very specific cell type in our bodies," Krishnan says. "If genes 'break' and cause autism, the effect is through brain-related tissues."
But genes don't work in isolation. Genes are social creatures, and they participate in social networks in the body. Likewise, Facebook uses computational models to decide who to recommend to you as friends—people with whom you should connect because they're in your social network. Krishnan has used a similar idea to study biological networks—looking through the networks of known disease-related genes to find hitherto unknown ones also related to disease.
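The friend-recommendation analogy can be made concrete with a toy example. In the sketch below, each candidate gene is scored by the fraction of its network neighbors already known to be disease-related, and candidates are ranked by that score. The gene names and connections are made up, and this "guilt by association" score is a deliberately simple stand-in: the article does not spell out Krishnan's actual method, which is certainly more sophisticated.

```python
# Toy gene network: each gene maps to the set of genes it interacts with.
# Names and edges are hypothetical, chosen only for illustration.
network = {
    "G1": {"G2", "G3", "G4"},
    "G2": {"G1", "G3"},
    "G3": {"G1", "G2", "G5"},
    "G4": {"G1", "G6"},
    "G5": {"G3"},
    "G6": {"G4", "G7"},
    "G7": {"G6"},
}
known_disease_genes = {"G1", "G2"}

def association_score(gene):
    """Fraction of a gene's neighbors that are known disease genes."""
    neighbors = network[gene]
    return len(neighbors & known_disease_genes) / len(neighbors)

# Rank every gene not already on the known list, highest score first;
# ties are broken alphabetically so the order is reproducible.
candidates = [g for g in network if g not in known_disease_genes]
ranking = sorted(candidates, key=lambda g: (-association_score(g), g))

for gene in ranking:
    print(gene, round(association_score(gene), 2))
```

Genes that sit close to many known disease genes rise to the top of the list, much as a stranger who shares many of your friends is recommended to you first.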
When Krishnan turned his focus to autism, only a few culprit genes had been identified. The big data he is using comes from the Simons Foundation Autism Research Initiative. It's a collection of genetic samples from 2,600 so-called "simplex" families, meaning that each family has only one individual with an autism spectrum disorder—siblings and parents are unaffected. Krishnan uses his computational methods to look for mutations that are present only in the child with autism.
When he began his work five years ago, only 15 or 16 genes had been linked to autism. That figure has grown to about 65, Krishnan says. Researchers thought there had to be more; 10 years ago they were predicting that 600 to 1,000 genes might ultimately be found.
"We thought we'd take known autism genes and see if they interact as part of networks in the brain," he says. The human genome contains 25,000 to 30,000 genes. Krishnan and his colleagues decided to give every one of those genes an autism score, ranked from the most highly predictive to the least predictive. "Our ranking contains many genes at the top that have never been studied before." In August 2016, he and his colleagues published their research in Nature Neuroscience, raising the number of candidate autism-linked genes from about 65 to 2,500. It's a potentially significant advance in understanding the genetics of autism. The question now is whether that can help clinicians discover the cause, or causes, of autism, and then treat it.
The key is that Krishnan doesn't need to understand every gene. "That would be like saying I need to know about every individual buyer to form a marketing campaign," he says. Genes form networks, just like grocery-store shoppers, and if he learns what some of them do, he can draw conclusions about the others.
The discovery also underscores the idea that autism "is not a single disorder at all. It's one of the most diverse disorders," he says. Autism can have a variety of symptoms: problems with sleep, digestive difficulties, issues with sensory perception, and others. Yet, too often, all get put in one box. Krishnan adds, "We don't yet understand which sets of genes contribute to which parts of autism." It's the same with other conditions, such as obesity and heart disease, which can also have varying symptoms and multiple causes. He plans to turn his attention next to Alzheimer's disease and heart disease, using the same computational techniques.
The problem with cutting-edge computational research is that the tale usually ends with a promise of an exciting future and a reminder that more research is needed before any of the work will help patients. While that is true of this research, some findings are already reaching patients with depression. Adam Chekroud, a doctoral student at Yale, founded a company called Spring to take advantage of his findings on depression. He notes that the care Americans get for depression is generally less than perfect. "Physicians probably detect only half of those who have a depressive episode. Of those who are identified, 30 percent don't return for treatment. Of those who do come back, 70 percent don't recover. Only a handful of people get optimal care all the way through."
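Chekroud's figures can be chained together to see how small that handful is. The short calculation below multiplies the three steps, on the assumption that each percentage applies to the people remaining after the previous step, as his wording suggests, and that recovery is a fair proxy for receiving optimal care. The variable names are illustrative.

```python
detected = 0.50        # share of depressive episodes a physician detects
returned = 1 - 0.30    # share of detected patients who return for treatment
recovered = 1 - 0.70   # share of returning patients who recover

# Each step applies only to the people left after the one before it.
fraction_recovering = detected * returned * recovered
print(f"{fraction_recovering:.1%} of people with a depressive episode")
```

By this reckoning, only about one person in ten who has a depressive episode is detected, comes back for treatment, and recovers.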
Chekroud has employed computational methods to brighten that picture. He used his data analyses to develop a questionnaire for patients that can not only help diagnose them but also predict what is likely the best treatment.
In a paper earlier this year in The Lancet, Chekroud reported that some 30 to 40 percent of patients who recover from depression after treatment will relapse. About 30 percent of those who relapse recover with one of the SSRI antidepressants. But he was able to identify the 60 percent of that group likely to respond specifically to the SSRI citalopram, or Celexa. That could spare patients hit-or-miss trials of drugs and get them directly to the one that can help. "We showed [the prediction] was also statistically reliable when we tested it in a completely independent clinical trial," he says.
People believe that childhood trauma is one factor associated with a higher risk of depression, Chekroud says. So is being female. "Machine learning combines all of these small effects and looks at overall symptoms."
Chekroud, eager to expand the research, created Spring with two Yale computer science graduates. The company is now working with a clinic in the Bronx, in New York, to evaluate Spring's diagnostic questionnaire. Patients who walk into the clinic, most of them Spanish-speaking and on Medicare or Medicaid, are handed an iPad with the questionnaire. It takes about two minutes. When they are called, they take the iPad with them and give it to the doctor. The diagnostics on the iPad tell the doctor which treatments might work best, provide a list of options, and warn about potential side effects.
Having started a few months ago, Chekroud and his colleagues are revising the protocol at least once a week in response to feedback from doctors and nurses at the clinic. The system can now diagnose depression with about 70 percent accuracy—before the patient even sees the doctor.
The information-technology director at the clinic is pleased with the results of the pilot project. "We're about halfway through the pilot, and we've seen acceptance rates rise dramatically." That is, patients are welcoming the questionnaire. They are eager to reach for the iPad when they come in. "It's definitely an incremental step, but I think it's leading toward a revolution in health care. Big data is something we can't get away from and we won't be able to do without."
Chekroud sees a bright future, and so does Microsoft, which sponsors his research and gives him free access to its cloud computing resources. He plans to incorporate more treatments into the protocol, including such things as exercise and psychotherapy.
Read Montague bubbles with enthusiasm when asked about the future of computational research in psychiatry and neuroscience. He's excited about it and so are the students he's training. "It's going to be the big application for computer science," he says. "I'm making a bet, but I think it's a good bet—mental illness and neurological disease have touched nearly everyone on the planet." With further use of computational techniques, depression will no longer be simply depression. People with traditional psychopathologies can be subclassified into different categories, and that can lead to much more careful treatment—and ease the symptoms of mental illness more rapidly, and more effectively.