On 13 June 2024, at The Washington Post's Futurist Summit, Bina Venkataraman, a columnist at The Washington Post, hosted a discussion with Regina Barzilay, faculty lead for AI, MIT Jameel Clinic, and Renee Wegrzyn, director of the Advanced Research Projects Agency for Health (ARPA-H).
Bina Venkataraman: Hello, everyone. Thanks for joining us. I’m delighted to be here today. I’m Bina Venkataraman. I’m a columnist at The Post. I write about the future. And I am welcoming today, Dr Renee Wegrzyn, who is the director of ARPA-H; and Dr Regina Barzilay, who is a professor of AI and health at the School of Engineering at MIT. Welcome to The Washington Post.
Dr Wegrzyn, let's start with you. Your agency is named after an agency of lore, or kind of takes its inspiration from DARPA, the Defense Advanced Research Projects Agency of the Pentagon, which brought us the internet, GPS, drones. No pressure.
Renee Wegrzyn: Right.
Bina Venkataraman: But can you tell us what is the biggest swing that you're taking? What's going to be ARPA-H’s internet?
Renee Wegrzyn: I love this question, because I always counter back I bet nobody asked President Eisenhower when is he going to start the internet when he launched DARPA.
But for ARPA-H, I think it's really the promise of what does the future of health look like that's completely different. How do we accelerate not just cool technologies, but actually outcomes? How do we keep people from becoming patients in the first place? And what are those investments, so the transactions, the internet equivalent, the GPS equivalent that are going to get us there? And so just about two years ago, we launched the agency to start to make those investments to get us there.
Bina Venkataraman: Okay, so any glimpse of something that you think is particularly promising?
Renee Wegrzyn: Yeah, so important to know about us is we're disease agnostic and technology agnostic. So, we tackle some pretty big problems in health. But in this space, since we're talking about AI, some of the things to really think about is just an example of a recent programme we launched called UPGRADE, which is looking at autonomous AI systems to help patch vulnerabilities to prevent, like, ransomware attacks from taking down hospitals, right?
So, these are big things that have big implications that we don't have technologies yet to address. So those are some of the investments that we're going to be making to do just that.
Bina Venkataraman: Fascinating. So, AI will both make it easier to attack systems like hospitals, and maybe easier to defend it.
Renee Wegrzyn: Right.
Bina Venkataraman: Okay, maybe we'll return to that.
Regina Barzilay, thanks so much for being here. You were doing some fascinating things at MIT that I've gotten just a glimpse of with respect to cancer detection. But I'm wondering if you can start by telling us about how your own path as someone who has survived cancer led to how you're shaping your agenda at MIT?
Regina Barzilay: So, I actually started my time at MIT as a professor that worked on natural language processing that developed tools like eventually became ChatGPT and many others. And in 2014, I was diagnosed with breast cancer. And one of the things that I discovered when I was treated at MGH, that there is no technology there, that in terms of, you know, even basic information technologies that you have, we’re really in stone age. And after I finished my own treatment, I started asking the questions what we can do to change it. And at the time, AI was not such a popular thing. Not everybody were trying to do AI. So, I had to go from office to office and propose my services for free, hoping to find a doctor. And I did find several who were interested to address different problems in cancer, and one of them the problem of predicting risk of cancer, as we all know that we know how to treat the disease early on, we don't know how to treat the advanced disease.
So the best thing that we can do is to be able to say who are the patients who are likely to develop the disease in the near future, not only to detect what we already have--this is what doctors do--but really kind of to take a glimpse of where are we going with it and being able to say looking in the mammogram or some scan how the patient or, you know, even before they become a patient what their future, you know, holds for them. And we did it both for breast and for lung, and it does much better than human radiologists.
Bina Venkataraman: Okay, so say more about that. You did this in a large-scale study, as I understand it, and looked at the predictive possibility of this tool, AI-powered tool to look at scans and predict whether someone down the road is going to get cancer, not whether they have a lump now. So this is different than looking at tumors.
How much better is it than humans, and where are we--where are we in the chain of getting this actually deployed in hospitals?
Regina Barzilay: So, we developed it in two areas, in breast cancer using mammograms, because a majority of women in this country are scanned with a mammogram. So, it can just work--whenever you do your scan, you can make assessment. And we do it also for low dosage CT scan for lung cancer.
It does--so it's very hard to compare it with a human, because humans are not even trying to predict it. But let me compare it with something that is federally mandated by America. So women in the audience who do do mammograms, you know that we are getting always this letter that tells you if you have dense breasts that you have--that you’re at increased risk. It’s again federally mandated by America.
So, if you look at the predictive capacity of this biomarker, it’s close to random, so of around one. So, this model, if you're looking at percentage of woman which identify as high risk, close to 20 percent of them are likely to develop cancer in five years, whereas the risk of cancer in the population is closer to 7 percent.
With lung cancer, you can do actually much better. The accuracy there in the two years, it's a high 80s. Even if you're looking at seven years, or six years, you're close to like 79 percent. So, you really have this capacity to see the future. And once we know what's to come, we can change the screening, and we can think about developing drugs that can actually prevent the disease.
Bina Venkataraman: Ah, okay. So where are we with that? Where are we in the chain of [unclear]?
Regina Barzilay: So we--one of the challenges of these technologies, they're trying to do the task. AI does a task that human cannot do. No radiologists, whenever the machine gives them a prediction, can say whether the machine did the right reasoning. If you're just predicting cancer, you can just look at it and say, yeah, they don’t know it. But here you're predicting something human cannot validate.
So, for this reason, we had to go to many, many countries and to many hospitals in the United States, and we broadly validated the breast cancer tool. And it's open-source tool. It has been validated. And there are a number of hospitals which are using it in different capacities.
For instance, at MGH, during the pandemic, when the access to mammogram was greatly reduced, this tool was used to prioritise women who need to have a mammogram despite the reduced capacity, and there are various clinical trials, prospective clinical trials, that are kind of studying what is the best way to utilise this tool. Because it is not enough to say to the patient, you are high risk; you actually need to do something. And the question, which is not an AI question, is what do you do next?
And there are various trials now that are going for breast, and we're starting, actually, hopefully jointly with VA to do it in lung cancer space.
Bina Venkataraman: Okay, so maybe coming to a clinic near you.
Dr Wegrzyn, Dr Barzilay just mentioned, you know, the possibility to be able to drug against something preventatively. How are you working on AI as a tool for drug development and for advancing that area? Because that's a hugely complex and different side of this than diagnosis.
Renee Wegrzyn: Yeah, I think there's some pretty interesting tools that we can talk about our current investments. So, looking at predictive AI, we do have a programme that we launched, called MATRIX, which is taking all 3,000 FDA-approved drugs, and using some of the tools of AI machine learning to look across every possible disease indication in an agnostic way.
So those drugs that have been approved for a certain indication--all of them have. But what is the data telling us for other diseases that they can be useful for? So, in some cases, there may be cures sitting on the shelf; we yet haven't discovered that. And so using this programme, we want to make predictions of what are some of the, let's say, top 20-25 best hits that we can then go into the laboratory to see if we can validate can these models start to predict where these drugs might be used elsewhere.
Some of the challenges are the data that's available now, of course, is the data that's been submitted because FDA does initial drug applications. So, getting access to some of that proprietary data that the drug companies might have, or you know, other data sources is going to be really what is going to drive the quality of those models. So those are sort of on the shelf things we can do today, predictive AI.
With generative AI, we're now looking at novel vaccine antigen production. So, we have a programme called APECx, which is saying, okay, we do have some vaccine development that we all know about. What about other totally new families of viruses that we don't have vaccines for yet? How can we start to generate new antigens that are not only efficacious but also we can learn about? Are they manufacturable? Are they thermostable? So all of the things that could be a hindrance potentially for vaccine development, we can build that into the design of those antigens.
Bina Venkataraman: So how does a large language model accelerate that process? What's the role?
Renee Wegrzyn: So, in the former case, it's really there's a lot of things that it can do. So, if you have an FDA-approved drug, you could be looking at, for example, publications to see are there any clues inside publications that might tell us that, you know, one of the drug targets this drug is targeting is--could also be relevant in this disease as one example from the literature.
Of course, you want it to be looking at the electronic health records to understand, you know, who are even the patient populations that we want to be looking at here.
But part of the effort is really to, in some ways, answer your question. So, we don't know the limits of this technology and if it will even work for the task at hand that I described. So, it's--a lot of what ARPAs do is demonstrate the art of what's possible, and derisks that. And so it really is a series of hypotheses that we'll be testing in these programmes.
Two, these are, you know, big moonshot questions. We might not hit the moon on these. But if we hit low Earth orbit, maybe there will be some learnings that really advance the state of the art.
Bina Venkataraman: And can you talk specifically about the application of that to rare diseases, which afflict some 30 million Americans, hundreds of millions of people around the world where there's just been an intractable problem of how do we make progress in diseases that individually affect so few people, but collectively, so many?
Renee Wegrzyn: Yeah. So, for in the case of MATRIX, it really is, you know, leveraging those off the shelf solutions, where there might not be incentives for drug companies to pursue a disease that only has 100 patients, right? So, we might be able to identify those targets.
But since you write about the future, maybe we can take a little peek to what it could look like in the future. So, if you think about failures in a lot of drug development, maybe rare in particular, there's failure sometimes in the toxicity of those drugs, in the pharmacokinetics of how these drugs are turned over. Even the animal models don't exist in some of those cases. And sometimes, people seem to be surprised that humans aren't mice. So, when the--when the clinical trials fail, you know, we really don't have great models.
And then the clinical trials themselves may fail, or may never even be started, because they’re so costly. To do it for a hundred or a thousand patients is just a non-starter for a company.
So, what if we could start to use some of these AI tools to predict toxicity, to predict the ability of the pharmacokinetics so you can start to simplify dosing? And then what if we can completely replace animal models with models that actually look like a human and behave like a human in those studies?
And then, of course, you mentioned using these tools to triage the patients that need it the most. How do we triage patients for a clinical trial? If you stack all of those innovations on top of the other, you can take drug discovery and bringing that forward to patients from something that takes years to something that may take just months and really before you even start an experiment be able to predict your success in a much better way.
So, in ARPA, what we would do is break that down into the projects and the transactions that we need to invest in to make that true.
Bina Venkataraman: Dr Barzilay, one of the challenges with the application of AI to medicine and healthcare, even at the research stage, but particularly once we go and think about the clinic and treating patients, is that the access to technologies and medicine has historically been not evenly distributed, and we see even in the predictive models being used in healthcare today biases in terms of how those models are used and what they reflect of different populations. How in your work--do you see solutions to that in your work, both to the uneven access to the technologies and to the bias that we've seen thus far in a lot of tools like this?
Regina Barzilay: So this is--I would actually want to start with the first part of this question, on access to the technology. For years, you know, I would be interviewed somewhere, and I will say there is no AI in healthcare. Think when did you last time went to the doctor and you see any AI there? You know, I haven't. My colleagues haven't. And people say, no, it's there. You just don't know.
So finally, my colleague at Stanford, James Zou, who wrote a paper when he looked at the billing of the--you know, all the American insurers from 2018 to 2023. And the question that he asked, how much of it goes to their AI tool? So he looked at 500 FDA-approved tools and ask how many of them actually billed.
So, the results of the study is that from – I think – 9 billion cost, less than 100,000 went to all the AI tool collectively. There were only four tools that were actually billed, and the one that was billed the most, I think it’s some cardiology tool, had 67,000 billings. So, this is really a scary piece, that all these great technologies that we are developing, that is created in the United States, is actually not being translated into the united health--you know, into the healthcare system, and other countries are actually way ahead of the United States in this area.
And, you know, there are a lot of studies that demonstrate that part of it has to do with the billing, with how codes are designed. But today, we don't really have a great solution of translating this great technology into concrete outcomes. So, if you would ask me what is more important, to worry about the bias or to worry about translation, I would say, let's just start translating because the comparison is not the perfection. The comparison is all the patients who are under diagnosed, who get wrong treatments, for not having access to care. So, I think we should really focus on the first part of this equation.
Bina Venkataraman: Do you see a relationship between those two problems, though? Like if there's a reaction to a tool because it is biased, that its uptake might be affected by that?
Regina Barzilay: I think that we have as I mentioned earlier, a tool, we have measures today which are not AI measures, which are shown to be biased, like, you know, breast cancer assessment or lung cancer assessment, which are racially biased, which don't work for different classes of patients. And they are there, and they are reimbursable, and so on.
So, I think that this is a really important question. And we're thinking about it a lot, and I see there are a lot of technological solutions that can help. But first, let's bring technology in.
But to answer your question, it is indeed, you know, a serious question what happens when these models are making predictions that humans cannot validate, and they're systematically biased. And unfortunately, for years, some of the datasets that were collected by NCI were not representing, you know, the full American population, like NLST, a very big trial for lung cancer doesn't have almost African American in the whole very big set of images. So, this is indeed an issue.
But I think given the current worryness [phonetic] on one hand, people are much more sensitive to what is in the data and whether it is representative of the population. On the other hand, there is a lot of work on algorithms for detecting biases, for teaching the models when to say I don't know when they are uncertain. Seeing the right developments in this field, but we first have to bring the technology into hospitals.
Bina Venkataraman: Okay, two rapid fire questions, because you're so--both so fascinating that we've gone way off script.
So one is--one is a question from the audience. So Agnieszka from Maryland asks, what is the most urgent question that is not being asked yet in the current public discussion about artificial intelligence? I would say, if we can answer from the health perspective, that'd be great.
Dr Wegrzyn, you first.
Renee Wegrzyn: Maybe pulling on that last thread, it's how do we just get it out there and start testing it. So, because of some of the biases of the models that have been made, they will degrade; performance will degrade when we get out into the real world. And so I think that's--how are we getting out there? How are we making it more accessible?
Really importantly, in an aspirational type of note, how are we using this to augment healthcare providers today to allow them to be top of license, to do what they went to school for to be medical doctors or community healthcare workers? How are we leveling up their skills, so you don't always have to go back into a central care setting. And so that's, you know, assisted AI, assisting task guidance, et cetera. These are the questions that I would be really excited to start adopting in that healthcare environment.
Bina Venkataraman: Dr Barzilay, the most urgent question we're not--
Regina Barzilay: I think how to make these technologies, their translation, really fast. When we're thinking today how many years it takes to bring new technology, sometimes it's decades if we’re thinking about drugs, and very, very slow. So, with AI technologies, you've seen how fast the technology that you're using today is changing.
Bina Venkataraman: And that’s about regulation or that’s about people just welcoming it?
Regina Barzilay: It’s about how do we design clinical trials. How do you bring if there was improvement in the technology? We're not testing now in a big clinical trial obsolete technology? How do you update it? And of course, how do we change FDA regulations that truly, truly can benefit for significant redesign in the AI space?
Bina Venkataraman: Okay. And here's the last one that's closing every conversation today at the summit. Dr Barzilay, then Dr Wegrzyn, who is the smartest person you have heard or read on the future of AI, aside from each other, of course.
If you already have yours, feel free to jump in.
[Pause]
Bina Venkataraman: Wow, the silence is deafening.
[Laughter]
Bina Venkataraman: Are we all--are we all not smart enough yet?
Regina Barzilay: I just think that a lot of what I read about AI, especially within for general audience, if my students would submit it to me as part of their homework in machine learning class, they will not get a pass in the class. So this sea of misinformation, it's really hard to find, you know, pearls of wisdom for me. But there is a lot of amazing technical work that is coming out, very inspirational work. But oftentimes, maybe I'm not reading the right authors, but what I read, I can't really pick one. Sorry.
Renee Wegrzyn: Poignant. I've been excited--actually, I won't name anybody, but I think the group of scientists that really sees nucleic acid as a language. And so, you know, there's a beginning, there's a middle, there's an end to every gene. So natural language processing, a lot of these tools should be working with genetics as a language. And so whoever unlocks that, I think it's going to be incredibly powerful for the design of new drugs, for the understanding of our own genetics. And really unlocking that future of genome editing is going to be a really, really powerful tool. And I don't think there's any one person, but I'm really excited to see that field move forward.
Bina Venkataraman: Okay, well, here's a call for more intelligent voices like the two people in this room talking about this topic. Thanks for this illuminating conversation, and thanks to everyone for being here. And the programme will continue.
This transcript originally appeared in The Washington Post's article 'Transcript: The Futurist Summit: The age of AI: The new frontiers of medicine'.