Google's Head of AI Talks About the Future of the EHR and Technology in Medicine

Eric J. Topol, MD; Abraham Verghese, MD; Jeffrey Dean, PhD


August 20, 2021

This transcript has been edited for clarity.

Eric J. Topol, MD: Hello. This is Eric Topol with Medicine and the Machine, with my co-host, Abraham Verghese. This is a special edition for us, to speak with one of the leading lights of artificial intelligence (AI) in the world, Jeff Dean, who heads up Google AI. So, Jeff, welcome to our podcast.

Jeff Dean, PhD: Thank you for having me.

Topol: You have now been at Google for 22 years. In a recent book by Cade Metz (a New York Times tech journalist) called Genius Makers, you are one of the protagonists.

I didn't know this about you, but you grew up across the globe. Your parents took you from Hawaii, where you were born, to Somalia, where you helped run a refugee camp during your middle school years. As a high school senior in Georgia where your father worked at the CDC, you built a software tool for them that helped researchers collect disease data, and nearly four decades later it remains a staple of epidemiology across the developing world. I'm going to stop there because I didn't know this, and there's this thing called a pandemic. Can you help us?

Dean: My father was an epidemiologist, so we traveled around, and my mom studied medical anthropology. That combination of careers, plus a bit of wanderlust, led to going to new places, and I was an only child so I went along for the ride. We ended up living in lots of different places. I did a high school internship at the CDC, writing some software for outbreak investigations during epidemics. Epidemiologists use some very specialized statistics that most statistical packages don't support very well, and it's important to provide software that can be run all around the world on the relatively low-end computers at that time. So I started writing software that would provide the right kind of data collection and analysis tools that could be run in a fairly lightweight way by epidemiologists all around the world.

This was before the internet, so to distribute it, people would show up with floppy disks and I'd put the software on it. I would make copies and teach people how to use it, and it propagated from there. And it's still my seventh or eighth most-cited work.

Pandemics, Then and Now

Abraham Verghese, MD: Jeff, it's such a pleasure to get to talk to you. There's a lot of cross-pollination between Google and Stanford, and some of our folks have trained with you. The epidemiology software you're talking about, is that the same thing that you worked on for HIV/AIDS for the World Health Organization? Was that an evolution of that software? That brings me to the question of what you think of the current pandemic. I know your active mind must have turned to that as well.

Dean: The way the software evolved was that I started it in high school, and every summer I would do a new version of it. The first couple of summers I did it at the CDC as sort of follow-on internships. Then one of my colleagues, who was working at the CDC, moved to the WHO in Geneva and said, "Do you want to come to Geneva for the summer instead of Atlanta?" and I thought That sounds new and exciting — why don't I do that? So the next few versions of the epidemiologic software were done there, and I also started working on some software to help predict the future trajectory of HIV infections and AIDS cases around the world, based on early assessments of seroprevalence rates in different communities, working with the Global Programme on AIDS there. It was an evolution of the work I was already doing and some follow-on additional work to help do more forecasting specific to the HIV pandemic.

The current pandemic is obviously tragic in a lot of ways and has dramatically affected the world. HIV is very different from COVID in some ways, but it's also reflective of the public health community needing to come together and tell people what to do to keep themselves safe. As we learn more, the guidance changes. But COVID, being an airborne disease, is much more transmissible in many ways than HIV. There are some commonalities but also some differences. The reaction of the public health community has been pretty good here; the deployment of vaccines in such rapid fashion has been quite remarkable and a credit to all the scientists involved in that, because it helped the world move from what could have been even worse to a better situation. Obviously, we are not completely out of it yet.

Those Google Cats

Topol: The deep learning era, in which you played a major role, seemed to get legs with the cat video story. Can you tell us a little bit about that?

Dean: I had been introduced to neural networks in 1990, when there was an initial wave of excitement about neural networks and what they could do. At that time, they could show really interesting results on teeny, tiny problems, but they couldn't scale to anything of significance. It seemed like an interesting way of approaching and solving some kinds of learning problems, so there was a big wave of excitement but then a trough of disillusionment.

But I was a senior in college during that first wave of excitement, and I thought, Oh, these are great! What if we could just make bigger ones? So I did a senior thesis on parallel training (using 32 processors to train neural networks instead of just one) because I thought maybe we needed just a bit more compute and then we could make them do amazing things.

I was completely wrong. It turned out that we needed about a million times as much compute, not 32 times. But starting around 2006 to 2008, the world began to have that much computing power, thanks to Moore's Law. Researchers at a few universities were starting to see good results using neural networks for a broad set of problems, such as speech and early computer vision problems. I heard about that, along with others at Google, and we decided to start a project to train very large neural networks using the computers in our data centers. We put together some software that enabled us to train using thousands of computers.

We decided to train a model using an unsupervised neural network. So we took 10 million randomly selected frames from YouTube videos, and we trained a model using an unsupervised learning algorithm. The system learned to recognize a whole bunch of different objects. Of course, it learned to recognize cats because YouTube is full of cats. The really interesting thing about that is that it developed a neuron with the ability to recognize whether a cat face was in the frame without ever being told what a cat was. So just from patterns and data and building higher and higher levels of abstraction through this unsupervised learning algorithm, the system developed a neuron that could tell whether there was a cat face there or not. That was pretty remarkable.

You could then train it with some supervised data, saying this is a car, or a truck, or whatever, and get better results than if you just did the unsupervised training.

It was eye-opening for us. It showed that scaling these kinds of models to larger sizes would produce good results, and it started a whole decade of successful uses, at Google and elsewhere, of neural networks for a wide variety of things — speech recognition, images, language understanding, and so on — as well as some applications in healthcare and medical diagnostics.

EHR of the Future

Verghese: One of the things in healthcare that greatly interests our audience is the electronic health record (EHR), which as you know has been both a major boon and a major source of frustration with the amount of time we spend on these systems. They have not really lived up to their initial promise, but what's very exciting is some of the work that you've done on natural language processing.

Where do you see the EHR evolving? Give us a sense of hope, if you will, for the future.

Dean: Doctors have a bit of a love-hate relationship with the EHR and the way they have to interact with it because it takes away time they could be spending with patients. But it's also a repository of really valuable information about decisions they've made about patients and the outcomes.

In collaboration with other organizations, we've done some work on using de-identified data in models that are similar to how we train natural language models. What you want to be able to do in natural language is to take a prefix of a piece of text and then predict the next word or sequence of words that is going to occur. If you are typing an email message, that can help you by suggesting how you might complete the sentence to save typing.

It turns out that same approach can be used to give clinicians suggestions about what might happen next in the medical record for a particular patient. If you think about the medical record as a whole sequence of events, and if you have de-identified medical records, you can take a prefix of a medical record and try to predict either the individual events or maybe some high-level attributes about subsequent events, like, "Will this patient develop diabetes within the next 12 months?"
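The prefix-then-predict idea described here can be illustrated with a deliberately tiny sketch. It is not the model Dean's group built: it assumes a simple count-based (bigram) predictor over made-up event names, where a real system would use a far richer sequence model trained on de-identified records. The event names, `fit_bigram_counts`, and `suggest_next` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical, made-up event sequences standing in for de-identified records.
RECORDS = [
    ["visit", "lab_glucose_high", "dx_prediabetes", "rx_metformin"],
    ["visit", "lab_glucose_high", "dx_prediabetes", "diet_counseling"],
    ["visit", "lab_glucose_high", "dx_prediabetes", "rx_metformin"],
    ["visit", "lab_normal", "discharge"],
]


def fit_bigram_counts(records):
    """Count how often each event directly follows each other event."""
    counts = defaultdict(Counter)
    for events in records:
        for prev, nxt in zip(events, events[1:]):
            counts[prev][nxt] += 1
    return counts


def suggest_next(counts, prefix, k=2):
    """Return up to k (event, probability) pairs, conditioned only on
    the last event of the prefix (a strong simplifying assumption)."""
    followers = counts.get(prefix[-1])
    if not followers:
        return []
    total = sum(followers.values())
    return [(event, n / total) for event, n in followers.most_common(k)]


counts = fit_bigram_counts(RECORDS)
suggestions = suggest_next(counts, ["visit", "lab_glucose_high", "dx_prediabetes"])
print(suggestions)
```

The same shape applies to text autocomplete: replace events with words and the record with the sentence typed so far.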

You can become pretty good at predicting a lot of things that clinicians might care about in thinking about how to treat a particular patient or a particular condition of a patient they are seeing. You can, for example, suggest five diagnoses that might make sense, given the patient's current symptoms plus their past medical history, based on learning from other de-identified medical records.

Clinicians go to medical school, get trained, and see about 20,000 patients during their careers. That's extremely useful, but it also means they might have fairly limited experience, especially with rare things they might never see. An aspirational goal, which is really hard to achieve for a whole bunch of reasons, is being able to use every past medical decision to help inform every future medical decision. That would be great, because we would be learning from the collective wisdom of what worked and what didn't work on billions of people, in order to provide better care for everyone in the future. That's complicated, but it's a good north star about what we might be able to achieve if we put our minds to it.

Topol: I do think it's achievable — the whole idea of a digital infrastructure with twinning and nearest-neighbor analysis — if we could get the data. Kai-Fu Lee and I wrote about this kind of big thinking in "It Takes a Planet." Someday that would be extraordinary.

Health and the Human Eye

Topol: You and I were at a conference (and by the way, it's great that there are so many publications from Google on AI in healthcare), but one area that was striking to cement the power of training neural networks was the retina. And you'll recall — not because this is the way we want to determine whether a person is a male or female — but by putting a photo of the retina through a neural network, instead of 50% accuracy from noted retina expert Pearse Keane, you could get 97% accuracy on whether it was a male or female. That suggests that human eyes are not as good as trained machine eyes. Can we say, firmly right now, that the combination may be even better?

Dean: The interesting thing about that work is that a neural network that had been trained on a bunch of retinal images could learn to predict biological sex from the retinal image alone. And it could also predict other things relevant to cardiovascular risk that ophthalmologists and even trained retinal specialists couldn't necessarily pick up on. That tells me that lurking in health information are subtle signs that, with the right framing of the problem and machine learning, might allow us to pick up things that are complementary to what human clinicians already pick up on. The combination of these computing systems plus trained professionals will result in better outcomes than either can achieve alone.

Topol: It's really extraordinary to see what has already been built out, not just for the potential of tracking diabetes, but glucose regulation, blood pressure regulation, hepatobiliary disease, and the calcium score of the coronary arteries, and even Alzheimer's disease, potentially using the retina as a window to neurodegenerative diseases. The list keeps getting longer.

Google has done a lot of work on diabetic retinopathy, for which half of all those with diabetes are never screened. You've worked with India's Aravind Eye Hospital. Where does that work stand? Is that the future of screening for diabetic retinopathy?

Dean: Diabetic retinopathy is one of the earliest problems in medical imaging diagnostics that we started looking at, because there's a huge need for additional screening capacity around the world. In many parts of the world, there just aren't enough clinicians, who need a fair amount of special training to assess a retinal image for signs of diabetic retinopathy on a five-point scale. If you catch it in time, you can prevent blindness or partial loss of vision. It's actually very treatable, so the ability to screen more people will actually prevent blindness.

We've followed a fairly detailed progression from early research projects to this possibility — whether we can actually train a machine learning model to assess whether a retinal image shows diabetic retinopathy. The earliest study we did showed that it was on par or perhaps slightly better than board-certified ophthalmologists in the United States. With some additional refinement of how we labeled the training data and how we trained the model, we're able to get it to be on par with retinal specialists who have additional training in this, which is the gold standard of care in this area. And then we've been working with our partners in a number of places, including India, Thailand, Germany, and France, to deploy this in real clinical settings and do screening. We've just reached a milestone of 50,000 patients screened with this approach, and the numbers are continuing to go up. It's nice to see things that are not just research papers but are also actually deployed in the world.

We've been looking at other modalities of medical imaging. After we saw the early success with diabetic retinopathy, we realized that this was a generally repeatable pattern with other medical imaging modalities and we could probably get good results. There is careful attention to lots of things, like collecting the right kind of training data, training the right kind of model, evaluating it appropriately, and so on.

Now That Computers Can See

Verghese: One of the things that your group has done so well is to create open-source programs or ways of consolidating the process of performing AI. Reading about your work, it seems to me that there are endless applications. You could look at almost every sector of life and every industry. How do you prioritize things? And in healthcare, where do you see the most important challenges that remain that you will focus on?

Dean: Our group realized that there were a lot of potential applications of machine learning in the world, in virtually every sector of human endeavor. Part of the reason is that between 2010 and 2013-2014, computers effectively developed the ability to see. That's pretty transformative. If you think back to evolutionary times when animals developed eyes, I suspect that was a big deal. We reached that same point in computing a few years ago. That has meant that there's a broad set of uses for the newfound capabilities of being able to see.

We wanted to make software packages and tools available to those who were trying to apply machine learning to different settings. That's why we developed a system called TensorFlow and released it as an open-source package for people to use in whatever way they wanted. We had an Apache 2 free software license. I'm not a lawyer, but it means you can do whatever you want with it. That enables people to use it in lots of different settings, and it has been downloaded 100 million times, I think, which is kind of remarkable for a fairly obscure program or specific tool.

In the medical setting, people are seeing the potential of machine learning, both for medical record data and for medical imaging modalities. So we want to provide tools that make it easy to deploy these systems. In the medical setting, there are a lot of complicated regulatory and privacy issues that, for good reason, impede the rollout of these systems in ways that don't exist in less regulated settings. But it's also good to realize that there is real potential here, and we need to get these things out in ways that are positive for the world and the medical outcomes that we think can be improved.

What's This on My Skin, Google?

Topol: Another body of your work is on skin lesions and problems. You developed Derm Assist, which I understand is being released in Europe, having gotten regulatory approval (but not in the United States, which has different regulatory hurdles). This will help people figure out whatever skin issue they are concerned about, by taking a photo and getting an automated preliminary diagnosis. Can you tell us about Derm Assist? Because that's one of the most common reasons why a person goes to see a doctor.

Dean: Dermatologists are one of the hardest specialists to get an appointment with because they're oversubscribed, and so sometimes a general clinician will try to make a diagnosis and perhaps refer to a dermatologist. You don't need specialized equipment necessarily to capture the right information to be able to look and assess it. We followed a similar path to the one we used in our diabetic retinopathy screening work, where we first wanted to gather data on a wide variety of different skin tones and skin conditions, to see if we had something that was of high-enough quality to put through a regulatory process. We've now done several iterations to improve the set of data that we have to make it more representative, so we can roll it out for use in assessing skin conditions. If you have a rash on your arm, you can take a picture of it and it will sort of provide a few alternatives of what the rash might be. Then you can decide what to do given that preliminary assessment of what it might be.

Verghese: One of the pushbacks on AI early on was that it could sometimes magnify the inequities in society, and no one would be aware that it was actually doing that unless you took a careful look. Have we gotten better? Have we built in more safeguards? Is this an issue that we can worry about a little less or is it a continuing issue?

Dean: Any time you're thinking about deploying a machine learning model or an AI system in the world, you need to be conscious and aware of many different aspects of that process and how they can introduce bias or fairness issues or ways that the system interacts with people. You want interpretable models so that people can understand that if the model is saying something, why is it saying that? That's throughout the whole process — looking at the dataset, at the kinds of algorithms you're doing and how you evaluate it. You want to be careful to not just evaluate it and report a single number. You want to look at different subsets of the population that might be affected and at how it performs on — for example, with dermatology-related work — different ethnicities or skin tones.
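The point about not reporting a single number can be made concrete with a small sketch. The rows below are invented for illustration, and the subgroup labels (`tone_A`, `tone_B`) and the helper `accuracy_by_group` are hypothetical; this is one simple way to slice an evaluation, not Google's evaluation procedure.

```python
from collections import defaultdict

# Hypothetical evaluation rows: (subgroup, true_label, predicted_label).
ROWS = [
    ("tone_A", 1, 1), ("tone_A", 0, 0), ("tone_A", 1, 1), ("tone_A", 0, 0),
    ("tone_B", 1, 0), ("tone_B", 0, 0), ("tone_B", 1, 1), ("tone_B", 1, 0),
]


def accuracy_by_group(rows):
    """Compute accuracy separately for each subgroup."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, pred in rows:
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


# The single headline number hides the gap between subgroups.
overall = sum(truth == pred for _, truth, pred in ROWS) / len(ROWS)
per_group = accuracy_by_group(ROWS)

print("overall:", overall)
for group, acc in sorted(per_group.items()):
    print(group, acc)
```

In this toy data the overall accuracy looks acceptable while one subgroup performs markedly worse, which is the kind of gap a single aggregate metric would conceal.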

These are things we pay attention to, and what we think everyone deploying machine learning models — particularly those making consequential decisions about people's lives — should be paying attention to. In 2018, we put out a set of principles for thinking, in a structured way, about every use of machine learning at Google, what we should be thinking about. These models should avoid creating unfair bias, they should be interpretable by humans, and so on.

These are important issues and they will continue to require everyone's attention to make sure that we don't do what you suggested: take data from the world as it is, and then automate some process so that we can take something that is not really the world as it is and magnify or accelerate the effects of improper decisions. We need to make sure that we take the world as we would like it to be, and make sure that our machine learning models are as close to that goal as we can get.

Topol: Going back just for a second to Abraham's point about the EHRs. As you know, Jeff, there are burgeoning efforts for keyboard liberation, for taking the conversation and making a synthetic note based on that, which could be edited by both the patient and machine that has been trained by the doctor or clinician's previous notes and edits. Do you see that as being imminent? Getting rid of the keyboard could be the favorite AI advance in the history of medicine.

Dean: Our group has done a little bit of work using an audio recording of a patient-doctor conversation, to create a draft of a medical note that a clinician can just edit a little bit as opposed to having to type up a lengthy note. We all know that often clinicians copy and paste the most recent note and don't really edit it appropriately. That's partly because it's very cumbersome and unwieldy to interact with some of these systems, and speech and voice are a more natural way of creating notes. Creating summarized notes from conversations is also perhaps a good assistive tool that can reduce the clinical burden on clinicians but also perhaps create higher-quality information in the medical record itself, which would be fantastic.

Verghese: Do you worry, as you make open-source machine learning software available to everyone, about people putting it to nefarious use? And conversely, do you worry about excess government regulation coming in to monitor it to a point where it begins to snuff out the capability to do research and advance the cause?

Dean: Like a lot of technologies, machine learning can be put to both positive and negative uses in the world. In and of itself, it is kind of a neutral thing, but when you apply it to something, that's where you're making conscious decisions about what the system is going to do and what and who it will impact: this person, this populace, this community, or the world. Computer vision has amazing uses in medical diagnostics, but it also could be used to create autonomous weapons. Society, governments, and other involved constituencies need to make decisions about what kinds of uses we want to allow and what things we don't want to allow.

In terms of regulation, it's much more about particular uses that are important to look at. In many areas, including medical devices or pharmaceuticals, there are existing regulatory frameworks for approval, so with some adaptation to what machine learning is able to do in that space, you already have a fairly well evolved regulatory framework that is pretty serviceable. For other kinds of things, such as autonomous vehicles, there is less of a regulatory framework, and that's an area where you'll want governments to weigh in on what is appropriate and what's not in a new regulatory framework that has to be essentially created from nothing.

Microchips in Minutes

Topol: Speaking of things that are a little bit removed from medicine, like autonomous vehicles, you and your colleagues at Google were involved with what I would consider pretty remarkable breakthroughs in AI very recently. Both were published in Nature, which is kind of an interesting place to have these papers. I'd like to get your comment on these.

One is the work that was done for AI to design microchips so that what would normally take months, if not years, to be created instead took minutes or hours. And the other is the work on protein structure, such that any sequence of amino acids could predict in 3-D all the folding for essentially a vast majority of proteins in the human proteome, and then eventually perhaps for all proteins. Obviously, that has implications for drug discovery and understanding the biology in man and elsewhere. The medical community may not be aware of this yet, though.

Dean: The chip design work is a bit further afield from the medical community, but it's interesting. One of the stages in designing a computer chip is that you have the actual logic and transistors that you want to put on the surface of the chip, but you don't know where to put them. So you often have human physical design engineers sit there and play this complicated game of "where should we put this thing" in order to minimize quite a number of different constraints: the length of wire between different pieces, and the amount of power and the area that the chip will consume. You want to make it as small as possible and consume as little power as possible and minimize wire length.

We used a reinforcement learning algorithm, where you take a bunch of actions and then at the end you get a reward. This is similar to how our DeepMind colleagues mastered the game of Go, where you can essentially play a bunch of moves and go. At the end you get a reward signal: Did you win or lose? Here you can get a reward signal that is a bit more multidimensional, like the actual size of the chip and the power that it consumes and so on. But you can essentially play the game of placing the components of the chip and see how well you did. Then you can try again, given the feedback you got the previous time. It turns out that this is a much higher-dimensional problem space than Go, but you can learn to do this successfully in an automated way instead of it taking a team of 10 people many weeks or months to go through this process. That's kind of exciting. And it results in smaller chips created more quickly and rolled out into the world, which is nice.
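The "place components, get a reward, try again" loop can be sketched on a toy problem. To be clear about the substitution: this is simple hill climbing with random moves, not the reinforcement learning method described above, and the reward uses wire length only (the real signal also covers power and area). The netlist, grid size, and function names are hypothetical.

```python
import random

# Hypothetical netlist: pairs of component ids that must be wired together.
NETS = [(0, 1), (1, 2), (2, 3), (0, 3), (1, 3)]
GRID = 4  # components occupy distinct cells on a GRID x GRID board


def wire_length(placement):
    """Total Manhattan distance over all connected pairs (lower is better)."""
    return sum(
        abs(placement[a][0] - placement[b][0])
        + abs(placement[a][1] - placement[b][1])
        for a, b in NETS
    )


def reward(placement):
    """Toy reward: shorter total wiring earns a higher reward."""
    return -wire_length(placement)


def improve(placement, steps=2000, seed=0):
    """Hill climbing: move one component at random, keep the move only
    if the reward improves. A stand-in for a learned placement policy."""
    rng = random.Random(seed)
    best = list(placement)
    for _ in range(steps):
        candidate = list(best)
        i = rng.randrange(len(candidate))
        cell = (rng.randrange(GRID), rng.randrange(GRID))
        if cell in candidate:
            continue  # cells must stay distinct
        candidate[i] = cell
        if reward(candidate) > reward(best):
            best = candidate
    return best


start = [(0, 0), (3, 3), (0, 3), (3, 0)]
final = improve(start)
print(wire_length(start), wire_length(final))
```

A reinforcement learning approach differs in that it learns a policy from many such episodes and can transfer that experience to new chips, rather than searching from scratch each time.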

In terms of the protein folding, that's great work from our colleagues at DeepMind. The implications of that are still not known because the work was just released in the past few weeks. But it is really a testament to the fact that we can now tackle some of these basic problems in science with machine learning and develop new capabilities that then enable us to do things we couldn't do before and will affect lots of downstream fields and uses as well.

The potential of machine learning in medicine is in its infancy, and the ability for us to help give everyone in the world better medical care, better information about their own medical condition, and give clinicians better advice about the things they should be thinking about or decisions they make is remarkable, but still largely not rolled out in the world. How do we move from where we are to where we could be?

Topol: Well, full circle from your work on pandemic preparedness 40 years ago to your work now. I will be following you closely. Thanks so much for joining us.

Eric J. Topol, MD, is one of the top 10 most cited researchers in medicine and frequently writes about technology in healthcare, including in his latest book, Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again.

Abraham Verghese, MD, is a critically acclaimed best-selling author and a physician with an international reputation for his focus on healing in an era when technology often overwhelms the human side of medicine.

Jeff Dean, PhD, has lived on four different continents (North America, Africa, Asia, and Europe). He says one of his personal goals is to play soccer and basketball on all seven but acknowledges that Antarctica might be tough.
