Making Sense of Information Overload

An Interview With Atul Butte

Editor's Note:
In this episode, Atul Butte, MD, PhD, distinguished professor and director of the Institute for Computational Health Sciences, University of California, San Francisco, is interviewed about leveraging the reams of electronic healthcare data generated to facilitate precision medicine.

Robert A. Harrington, MD: Hi. This is Bob Harrington on Medscape Cardiology. Over the course of the past few years, we've had discussions with people doing work in the area of cardiovascular medicine, [and] healthcare generally, that's touching on topical issues.

We've had some discussion in the past on this show about big data, and what that might mean for how to think about genomics in cardiovascular health. Today, what I'd like to do is explore a bit the intersection of big data and precision medicine. These are two phrases we hear a lot about; we hear them in the lay press, we see them in the medical literature.

What I want to approach today is three components. One is definitional, and [I'll] get our expert guest to talk about how he thinks about the concepts of precision medicine and big data. Second, [I'll ask him] how he sees computational methods and analytics playing a major role today in clinical research and clinical medicine, and then finally get some insight from his own work about what we might expect in these areas in the future.

Today, I'm really privileged and honored to have a friend and colleague as our guest; he is Atul Butte. Atul is the director of the Institute for Computational Health Sciences and a professor of pediatrics at the University of California, San Francisco (UCSF). Atul, thanks for joining us here today on Medscape Cardiology.

Atul Butte, MD, PhD: Thanks for having me. It's a real, real pleasure to be here.

Dr Harrington: Atul, you continue to amaze me as I follow you through the literature and where I see you most, which is in the Twittersphere—being somebody who is really at the cutting edge of what I'll call "precision medicine," and the use of computational methods in investigation.

Understanding that we have a clinical audience largely with us on Medscape, why don't you spend a few minutes and define for us what we should be thinking about when we hear the phrase "precision medicine"?

Dr Butte: Obviously, a lot of people are using that phrase, all the way up to the president of the United States. And not everyone has the same definition, let's put it that way.

To me, precision medicine means the customization of healthcare for an individual, on the basis of the measurements that we get from that individual. But it's also using the learning, and the data, and the experience that we've gotten from the rest of the population.

It's not just about you, or me, my molecules, my measurements, or my DNA, for example. It's also what we have learned from everyone else whom we've ever taken care of before: all the measurements [and] the molecules from everyone else. In some ways, there's a subtle difference between personalized medicine, which people think is just about you, and precision medicine, which means learning from everyone and applying it to you.

It's a broad definition, but obviously, you need a lot of data collected on everyone else—what are the drugs we're ordering, how do they work, what are the molecules that we're able to measure—and then bring it all to bear with useful tools in that one exam room, when the doctor is taking care of the patient.

Dr Harrington: You know, what I love about your definition is that you've included two things that I think about a lot. One is [that] there has been this, I'll call it a dichotomy, created in that you're either practicing population-level medicine or you're practicing personalized medicine. I love your definition because you're bringing together both of those pieces, and maybe we can get you to comment on the population side.

On the other side, what you're saying is that we really need to create a set of tools to help the clinician who is interacting with that patient at the point of care. I'm also guessing your thinking is that you want to develop a set of tools for our patients, for individuals. Maybe take the population health question first, and then morph into the tools question.

Dr Butte: I also agree, there is this dichotomy—it could be a false dichotomy—where precision medicine is seen as maybe the opposite of public health. I think precision medicine can only be truly practiced if we actually have organized our population or public health.

But there is this dichotomy, especially around the costs of this type of delivery, right? People seem to think that precision medicine is going to be expensive, perhaps more expensive [than healthcare now]. What we're trying to get to, though, is a story—a narrative about how precision medicine will save money in the end, because it's not just about adding things to medicine; it's also maybe removing things from medical practice that we've learned from the data really don't seem to work. These two concepts of precision medicine, and population or public health really do come hand in hand.

You also asked about the apps. Data by themselves do nothing, right? We have to have data in motion; that's really how we generate voltage in medicine. When the data are in motion—whether they're talking to us through apps that are delivered to the doctor before the patients come into the exam room, during that encounter, or maybe after, but certainly perhaps via apps delivered to the patient—[then patients] are able to take care of themselves better between visits with physicians.

Dr Harrington: One of the things I've loved that you guys are doing up at UCSF is the Health eHeart program. [It's] part of that attempt to reach the patients, the people in our communities, [and] to be able to extract information from them directly and put it to use in both an aggregated population way, but also allowing you to dig down deep about what's specific about that individual.

A lot of the conversation we're having is around this notion of big data and what you've described at the beginning—what we know about your molecules, your genes, maybe your social media presence, your neighborhood, [and] the environment to which you're exposed. How are we going to make sense of all of this? Give the clinician who's practicing every day some sense of what people like you are doing in your research group to harness the data. "Data in motion"—I love that phrase; I'm going to steal it from you.

Dr Butte: Absolutely. I've used data as a moving thing, [and] data as a frozen thing; you've got to thaw out the data to release knowledge. There are a lot of analogies. But what are big data? Obviously, we're talking about gigabytes and terabytes, so that's the storage aspect of data. In medicine, we could talk about our data being big—but you and I, being here in Silicon Valley, know that our neighboring companies have much more data than we do in our healthcare institutions. Facebook, Google, Apple, and all the rest will always trump us in some ways, in terms of size on a hard disk.

But our data are complicated, and they've traditionally not been looked at before. We record a lot of measurements from our patients, and the clinical exams and the lab tests. We record what we're doing to our patients, what we've ordered, what's been dispensed. But there's more to it than that. There's imaging data from pathology and from radiology, and of course all the data that patients collect on themselves.

You make a restaurant reservation using OpenTable—well, what kind of restaurant are you going to? A healthy one, or a not-so-healthy one? Maybe that's relevant to your health. Wearables, if you have them—such as your Fitbits, Jawbones, and Apple watches that are tracking steps—those are useful. Increasingly, patients are going to be getting more advice through these types of devices than they're getting from us in an encounter-based approach.

One example I talk about a lot is that I myself have been using a Fitbit for almost 3 years. I started because I saw my weight exploding. I reached 247 lb, and I had to fix that before I hit 250 lb (that was a weird threshold for me). I weigh myself every morning now; I know we don't always tell patients to do that, but that's what I'm doing. I record everything I eat, and I've lost 50 lb. What's funny about this story is [that] my own doctor's healthcare system (I'm not going to say who my doctor is or which health system they're in) literally spent a billion dollars on their electronic health record system (which also will be nameless, but you can guess which [one it is]), but their billion-dollar system doesn't talk to my $40 gadget. They're taking one kind of big data, and my personal data are living in another kind of data ocean (or a puddle), and these two don't talk to each other.

Data have this characteristic of getting stuck in silos, and if we don't put them in motion to the people who can give you advice about them, we have this threat of patients cutting us out of the loop. But big data are about all these types of streams of data.

Let's bring it back to the clinician for a second. A lot of clinicians get worried: "How am I going to deal with, let's say, 6 billion DNA base pairs in a 15-minute encounter?" I'm technically trained as an endocrinologist. How was I able to deal with glucometer data in the exam room (those are painful enough to download)?

There is one kind of doctor that is able to look at a gigabyte of data in 5 minutes and actually render a diagnosis, today. And that's radiologists, right? A radiologist looking at a spiral CT or cardiac MRI—that's about a gigabyte of data. But they don't look at the data with ones and zeros; they have images, they have visualization and 3D.

That's what it's going to be like for most clinicians 5 or 10 years from now. These complex streams of data will be seen in some kind of visualization, some kind of tool that you're going to use during the encounter. Today's electronic health record systems, most doctors use them after the encounter, to document what they've done. We're going to be changing our mindsets to use them during the encounter, not just to order the meds, but even maybe to get advice or to synthesize what's going on with that patient in front of you. That's a long answer, but I think that's where I see it moving with big data.

Dr Harrington: It's a great answer. The visualizing aspect is going to be both important and intriguing for clinicians, and your radiology example is spot on. I had a conversation earlier this week with a young pediatric cardiac intensivist who is interested in the visualization of physiologic data. I asked the question: How much physiologic monitoring data does the typical pediatric intensive care unit (ICU) patient produce?

He said, oh, you know, in the average stay of about 5-7 days, in the pediatric ICU, they generate a couple of terabytes of data. My God, that's extraordinary, but he jumped right to this notion that you can't think about it like you think about the current electronic health record. You have to do just what you said, which is to get your mind around how we might approach visualizing that physiologic data. And in a way, that's aggregated with other data, so that you can begin to understand the patterns that might be particularly important to that particular patient.

Dr Butte: Exactly. No one physician is going to figure this all out manually. There will have to be more computational tools involved. Even the monitor traces now have alerts in the ICU setting.

The downside to a lot of this automated type of analysis, where we build these tools, [is that] we think they're going to do well, but they're alarming too frequently. So people ignore the alarms; they ignore computational tools that we think are going to help. We have to be a lot more intelligent about building the next generation of decision support [so that] it sounds an alarm not for legal reasons but to actually be helpful to physicians.

Dr Harrington: Absolutely. A few months ago, I had a conversation on this show with your colleague Bob Wachter about his book The Digital Doctor, and one of the best anecdotes that he tells is talking to an intensive care nurse. He asked her, how do you learn to decipher the monitors? And she said, the thing I pay attention to is when they're not going off. When there's an absence of beeping, then I really get worried.

It is this notion that we're going to have to train providers about different ways of hearing the data, seeing the data, visualizing the patterns in the data. I've been fascinated by the work that you're doing. Why don't you talk a little bit about what you do in your lab, and then maybe tie that into what you're doing in the University of California (UC) system? Because you really have a big role not just at UCSF, but thinking about data across the UC system, which is an enormous healthcare system.

Dr Butte: Sure; I'll answer it in two ways. I'll first talk about my lab—I launched my lab at Stanford, and then we moved to UCSF 1 year ago. I'll talk about one type of particular data first, and that's public, or open data.

I'm a big fan of big open data. What does that mean? Well, many scientists study particular conditions, let's say heart failure or diabetes. If you use these kinds of molecular tools, [then] it's amazing: The top-tier journals, the National Institutes of Health (NIH), the Wellcome Trust, these funding agencies make these scientists share their data on the Internet, for transparency and reproducibility.

It's amazing how much molecular data we have in our human samples (deidentified, of course). You can do an amazing amount of new research using publicly available data. We've done a couple of projects with type 2 diabetes, planning new therapeutic targets just by putting more than 100 of these data sets together, [and] we started to realize that there's the same receptor showing up again and again. Maybe we can design a drug against those receptors; we've done that kind of work.
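Editor's Note: As a rough illustration of the pooling idea Dr Butte describes, the short Python sketch below counts how often each gene is flagged across several data sets and keeps the ones that recur. It is not Dr Butte's actual method; the function name, the 50% cutoff, and the toy study and gene labels are all hypothetical, and a real analysis would rest on proper statistics rather than simple counting.

```python
from collections import Counter


def recurring_genes(datasets, min_fraction=0.5):
    """Return genes flagged in at least `min_fraction` of the data sets.

    `datasets` maps a data-set name to the list of genes flagged in it.
    The result is sorted by how often a gene recurs, then alphabetically.
    """
    counts = Counter()
    for flagged in datasets.values():
        counts.update(set(flagged))  # count each gene once per data set
    threshold = min_fraction * len(datasets)
    return sorted(
        (gene for gene, n in counts.items() if n >= threshold),
        key=lambda gene: (-counts[gene], gene),
    )


# Hypothetical toy input: placeholder labels, not real study results.
toy = {
    "study_a": ["GENE1", "GENE2", "GENE3"],
    "study_b": ["GENE2", "GENE4"],
    "study_c": ["GENE2", "GENE3", "GENE5"],
    "study_d": ["GENE2", "GENE6"],
}

print(recurring_genes(toy))
```

With this toy input, GENE2 appears in all four data sets and GENE3 in two, so both clear the assumed 50% cutoff; raising `min_fraction` to 1.0 would keep only GENE2.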

A newer type of data that is becoming public or open is clinical trials data, because that same mandate and sharing system we've had in the molecular world is now moving toward the clinical world. We've seen a lot of push-back. There's this whole meme about research parasitism. Some clinical researchers don't want to share their data; they call people like me "research parasites."

That's fine; I'm happy to be a research parasite, if that's what they call me. But in terms of transparency and reproducibility, we're going to get more clinical trial data released to the public. We're learning how to do the analysis—meta-analyses and things like that—around this type of data.

I happen to run a website where we give out raw clinical trial data to the public, so we're getting some practice there. The new fun thing I'm doing at UC (and it's actually beyond UCSF) [is that] they gave me an appointment at something called the UC Office of the President.

For listeners who don't know, UC has 10 campuses (five medical, five nonmedical), and then we have the Office of the President. All five of our UC health systems (UCLA, UCSF, Irvine, Davis, and San Diego) actually fall under one umbrella, and we call them "UC Health." We have cooperation across these five [systems]; we have biweekly calls from the clinical and translational science award programs, [and] we have the consistency between institutional review boards (IRBs). So if one IRB approves a proposal, the others approve it rapidly on contracting.

The new fun thing I get to do is build up the common clinical data warehouse for all the electronic medical record data [that] we're collecting on every patient in the UC [system]. At last count, that's about 14 million patients who have received some care in UC, and we have one database with all of the clinical data on these 14 million. Of course, it's under lock and key.

The idea is to learn from all five of these centers. Maybe three of the centers are doing something better than the other two; let's learn from that and teach the other two, and just keep rotating. This is what they call the "learning healthcare system," so we're going to use these data to help build out this learning healthcare system to UC.

Dr Harrington: That's really extraordinary, and the other great thing (as you know, I'm still a relative newcomer to California, at almost 4 years) is [that] the diversity of the population is extraordinary. I suspect that's going to be one of the advantages of the UC system: that the diversity of the California data really does mirror the diversity of the United States.

Dr Butte: Perhaps [the data] mirror the diversity of the entire world. There are a lot of people who are not from California in California, and I think that we capture the world's population in some way. A good fraction of them get their care within our systems.

Bringing it full circle back to precision medicine—we're going to need to learn about the whole world's population so that we can help take care of every patient individually.

Dr Harrington: That's exactly right, and that notion of having population-level data [from] 14 million individuals within a common data set allows you to really hone in on what might be precision medicine.

Atul, you also brought up a couple of topics that we're not going to cover today, but I absolutely would love to have you back to discuss one [of them], something that both patients and providers worry about, which is security of the data.

Dr Butte: I'm happy to join you anytime.

Dr Harrington: Thanks, Atul. This has been a really terrific discussion, covering a wide range of topics around the general category of precision medicine and big data.

My guest today has been Atul Butte, the director of the Institute for Computational Health Sciences and a professor of pediatrics at UCSF. Atul, as always I appreciate your time, and thanks for joining us here on Medscape Cardiology.

Dr Butte: Thank you again.

Disclosures: Atul Butte, MD, PhD, has disclosed the following relevant financial relationships:
Serve(d) as a director, officer, partner, employee, advisor, consultant, or trustee for: NuMedii; Personalis, Inc.; Carmenta Bioscience; Nuna; Assay Depot; Geisinger Health System; Samsung Advanced Institute of Technology; GNS Healthcare; Eli Lilly & Co.; F. Hoffmann-La Roche Ltd.; Wilson Sonsini Goodrich & Rosati; Verinata Health Inc; Pathway Genomics; Covance Inc; Regeneron Pharmaceuticals, Inc.; Gerson Lehrman Group; Guardant Health, Inc.; GNS Healthcare; Medgenics
Received research grant from: National Institutes of Health; PhRMA Foundation; Howard Hughes Medical Institute
Have a 5% or greater equity interest in: Personalis; Carmenta Bioscience; NuMedii
Received income in an amount equal to or greater than $250 from: NuMedii; Personalis, Inc.; Carmenta Bioscience; Nuna; Assay Depot; Geisinger Health System; Samsung Advanced Institute of Technology; GNS Healthcare; Eli Lilly & Co.; F. Hoffmann-La Roche Ltd.; Wilson Sonsini Goodrich & Rosati; Verinata Health Inc; Pathway Genomics; Covance Inc; Regeneron Pharmaceuticals, Inc.; Gerson Lehrman Group; Guardant Health, Inc.; GNS Healthcare; Medgenics; National Institutes of Health; PhRMA Foundation; Howard Hughes Medical Institute; Ansh Labs

