Noël Brown
Hello and welcome to Pathfinders, a podcast series from RBC Capital Markets where we uncover key trends and catalysts shaping the fast-moving world of biotech and pharma. I'm your host, Noël Brown, Head of US Biotechnology Investment Banking at RBC Capital Markets. In this week's episode, we'll be looking at what is turning into an inflection point for the sector, and that is the integration of AI and machine learning in drug discovery. And in order to do that, we're joined by two biotechnology executives who are pioneering breakthroughs at this intersection of technology and research. Laksh Aithani is a scientist, CEO and co-founder at Charm Therapeutics, a 3D deep learning research company developing small molecule therapeutics against previously hard-to-drug targets in cancer and other areas. Prior to founding Charm, he was a core contributor at Exscientia, a leading AI drug discovery company. Clarissa Desjardins is a serial life sciences entrepreneur, a co-founder of three successful biotechnology companies and the CEO of Congruence Therapeutics, a company working at the interface of computational and experimental drug discovery to design novel small molecules for diseases caused by protein misfolding. So let's get started, and let me first kick it over to you, Laksh. Can we get a bit of the history of the use of AI or machine learning in drug discovery?
Laksh Aithani
Yeah, sure. So I guess maybe I'll focus my answer on the discovery side, and perhaps I'll focus on small molecules. So I guess in small molecules, the whole goal is to identify a molecule that has all the right properties, the potency, the selectivity and the right PK, ADME and toxicology properties, in one molecule. So people have been building statistical models like random forests and even simple regression models for quite a while now, probably for the last several decades, and they've been using things like fingerprints to describe a molecule for a very long time. Then, I would say, probably around the early 2010s, you started to have some companies crop up to try and industrialize this a bit more. So for example, one of them is Exscientia, which was probably one of the first companies. And they also do a lot in the generative chemistry space, so actually using AI and machine learning to generate molecules. I actually used to work there, and just recently they have merged with Recursion, which is kind of cool. And then you had a bunch of other companies crop up as well around a similar time. Then I would say, more recently, let's say in 2020 and after, you have what I would call the generative AI wave. And in the broader AI landscape, this is probably characterized best by technologies like ChatGPT, where you can just ask an AI a question and it'll answer it pretty well. But for biology, you could point to things such as AlphaFold, which can predict the 3D structure of a protein with incredible accuracy. And, you know, it sort of uses generative AI. It's similar to how ChatGPT works: it uses this architecture called a transformer. So broadly speaking, that's how I see the evolution of these technologies.
So we started off with statistical machine learning models, then we moved to more advanced machine learning and deep learning models, but still ultimately predicting, let's say, molecular properties. And then nowadays we're actually going into the structural space where we can predict how a protein folds, or how a protein and ligand fold together. So that's how I characterize it, broadly.
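As a rough illustration of the fingerprint idea Laksh mentions, the sketch below turns a SMILES string into a fixed-length set of bits and compares two molecules with Tanimoto similarity. It is a toy under stated assumptions: character n-grams of the SMILES text stand in for real substructures, and the function names, the 2048-bit length and the example molecules are illustrative choices, not anything described in the episode. Cheminformatics toolkits derive fingerprints from the molecular graph instead.

```python
import zlib

N_BITS = 2048  # assumed fingerprint length, chosen only for illustration


def toy_fingerprint(smiles: str, max_n: int = 3) -> frozenset:
    """Hash every substring of length 1..max_n into one of N_BITS positions."""
    bits = set()
    for n in range(1, max_n + 1):
        for i in range(len(smiles) - n + 1):
            fragment = smiles[i:i + n]
            # crc32 is deterministic across runs, unlike Python's built-in hash()
            bits.add(zlib.crc32(fragment.encode("utf-8")) % N_BITS)
    return frozenset(bits)


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Shared on-bits divided by all distinct on-bits in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


ethanol = toy_fingerprint("CCO")
propanol = toy_fingerprint("CCCO")
benzene = toy_fingerprint("c1ccccc1")

print("ethanol vs ethanol: ", tanimoto(ethanol, ethanol))
print("ethanol vs propanol:", tanimoto(ethanol, propanol))
print("ethanol vs benzene: ", tanimoto(ethanol, benzene))
```

A bit vector like this is the kind of input a random forest or regression model would take when predicting a molecular property.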
Noël Brown
That's helpful. I appreciate it. So that gives us sort of a starting framework. I think the question that still remains in my mind, and Clarissa, maybe you can help fill this in, is if you could paint the dream for us with AI. Because sometimes I think it is a tool that would just accelerate what we are already doing, sort of like Excel was a tool that helped us advance in financial modeling. And then other times I think about what I hear and read about just in the ether, and it's more of an almost autonomous tool that comes up with its own ideas of what drugs should be. So what should we be viewing as the dream?
Clarissa Desjardins
Sure. I mean, it is a tool, Noël, but it's a tool like the steam engine was a tool to move things around, right? It's a revolutionary tool. And so for me, when I think about ultimately where machine learning and AI are going to take us in this early drug discovery phase, I believe that someday we'll be able to design, in silico, a small molecule for any protein. Any protein. So, you know, this is amazing. I'm talking just about hits right now. If we could just design hits in silico to manipulate proteins, we could then better understand them, right? Then, if you can easily design hits, you can easily optimize them for drug-like properties. So to me, the first goal is to know with confidence that you can design a molecule in silico, without testing it, and that it will bind to that protein. So that opens up so many avenues, not only for new knowledge to understand proteins better, because we really don't understand a whole proportion of the proteome, but it's also going to completely shorten what is one of the most difficult stages in drug discovery. And so the second ultimate dream, one that I think a lot of us talk about, or that is represented in movies, is this idea that we're going to understand human biology. We're going to understand all of our organs, all of our tissues. We're going to understand consciousness and feelings. And that is another dream that… to me, that dream means that all diseases are cured, you know, and that we live an optimal life up to 130 years old, and, you know, die peacefully in our sleep. So again, revolutionary ideas, which I think we're very far from, because we don't know the list of the parts of the body. I haven't seen a really good in silico digitization of a cell, let alone an organ or a human being. But it is, how to put it, it is an achievable goal.
I don't think it's unrealistic to expect that it will happen, because the list of parts of the human body is finite, and someday we will understand how they're moving in space and time, and we will be able to digitize this, but I think that's a long way in the future. For now, I'd be happy if, in my lifetime, we could go from choosing a protein to finding a molecule that hits it, and it seems like every week and every month there's progress being made on that front. The challenge here is that not only are the proteins moving around like the little molecular machines that they are, the small molecule is also moving around and spinning in every direction. So to find that small molecule that meets that pocket, where the forces are holding it together more than pulling it apart, is a huge mystery. We really don't understand what those forces are. We can simulate some of them, not all of them, and that's where we need to go, and that's where AI and machine learning are going to help.
Noël Brown
And so now we're building and asking machines to essentially help us simulate what's going on in an environment that we don't fully understand. And so are we going to advance without having that knowledge, or are we going to advance in filling that knowledge gap?
Clarissa Desjardins
Part of me wants to say there's no way we can simulate something that we don't know the parts to, right? So no. On the other hand, the machine learning tools are helping to generate that knowledge. So it is, you know, a virtuous circle, if you will, of generating the new information that you need to then build the models to better simulate. It's a great point. There's a lot left to discover in terms of just what the component parts of biology are. That's why we focus at Congruence on the protein. There, we know we have atomic-level information on the protein. We can simulate pockets. We can simulate small molecules that bind to those pockets to accelerate drug discovery.
Noël Brown
Got it. And so we've talked about where AI has been historically; Laksh filled us in on that. And then clearly we talked about where it's going to get to. Where would you position us on that spectrum? You know, where are we today?
Laksh Aithani
I would say that once we have some data on a particular small molecule drug discovery project, machine learning works quite well at predicting the properties of future molecules, and also other properties such as ADME data, permeability, metabolism, et cetera. I think also, with things such as AlphaFold, and now AlphaFold 3, and of course what we're developing with DragonFold, we can now much better predict the structural basis of protein-ligand binding. So if you have an arbitrary small molecule and an arbitrary protein, how do they bind together? And I think that works much better than docking, which is, I guess, the traditional method. So I would say that is where we are at the moment. I would say the main challenge in this field, and probably in many other AI fields as well, is generalization, right? So, you know, I'll give you an example. Let's say the AI model has data on kinases, right? So it knows how kinases fold. It knows how small molecules bind to kinases. If you then give it a new challenge, right, predict a transcription factor: how does that fold into its structure, and also, how does that bind small molecules? That's a very difficult challenge if it's not seen any data similar to what you're trying to get it to predict. So that's called the generalization problem, which is a problem across machine learning. And I think that's really where we need to go, and there are really two ways we solve it. One is by building better, more powerful machine learning models that can learn more from less data. And the next thing to do is just to generate more data, and that's probably going to have to be an industry-wide effort.
Clarissa Desjardins
So where are we today? I personally feel that if you take a human life as an analogy, we have just been conceived in terms of the use of machine learning and AI in drug discovery. And I'll give you an example. So Laksh mentioned generative AI for small molecules, where we generate, de novo, a small molecule with the features that we want. There's a recent article from Alex Zhavoronkov at Insilico Medicine. It's called The Hitchhiker's Guide to Deep Learning Driven Generative Chemistry, and it's a paper from last year in which he did a review of generative AI. There were only 55 papers in total, and only a smaller subset of those actually used wet lab data to validate the small molecules that they were generating in silico. Right? So this is nothing, right? This is a drop in the ocean. We are just at the beginning of this exercise, if you just take that particular endeavor as an example.
Noël Brown
I'm shocked. It's almost hard to believe that there are so few.
Clarissa Desjardins
Yes, it's a very particular type of generative chemistry, and there's probably a lot of work ongoing that is not published, but it's just one illustration of how we are completely embryonic in our use of machine learning and AI in drug discovery. And, I mean, look at a typical pharma drug discovery pipeline, right? What are they doing? High-throughput screening. I mean, it's all traditional wet lab. They're doing drug discovery the same way we've been doing it for the last 50 years. Now Laksh and I are in companies that are early adopters, and he's creating some of the software that'll ultimately be used by everyone, but we're really just at the beginning. So there's so much to learn, and there's so much to develop across the entire industry.
Noël Brown
So investors in the space, you know, all the people associated with financing and investing in AI companies, have struggled with: how do we value these companies? How do we compare them? How do we measure them? But what's starting to resonate from what you both have been saying is that at the end of the day, you're still biotech companies, right? Ultimately, how you should be measured is by the drugs you've either developed or contributed to the development of, period.
Laksh Aithani
I mean, I think right now, whether you like it or not, at least if you're going to raise from biotech specialist investors, you're going to be classified as a biotech company. I think right now platform companies are finding it pretty difficult, right, just given the interest rates. That being said, I do think there is a way to value your platform, but you're right, it's very, very hard, and it's oftentimes a guess at best, whereas with assets it's a lot more standard. You can value them much more accurately.
Clarissa Desjardins
We were told by investors, your platform sounds very exciting, super intriguing. We have no way of evaluating it, and so we're really just going to look at the efficiencies that you're gaining by using these tools in the development of your own pipeline. And we completely understood that, and took that strategy from day one. So use our tools to show how we can gain efficiencies to develop new development candidates. However, when you think about it, it is kind of selling our technology short, right? So we're not investing the majority of our R&D dollars to protect this platform. We're not investing to sell software to others, you know. So I'm arguing the counter to the strategy that we've chosen. But there are, you know, there are pros and cons to each strategy. We've definitely taken the strategy of building our pipeline, and maybe someday we can go back when we've earned the right to then fully invest in all of the potential of the platform.
Noël Brown
That might be a good point to talk about what each of your companies are doing, so we can get your individual perspectives, but so our audience can also start to appreciate where you fit in the broader kind of milieu.
Laksh Aithani
So what we focus on specifically at Charm is using AI to predict how a protein and a small molecule co-fold into their respective 3D structures. So in many ways, it's doing a similar thing to docking, which is where you use physics plus a crystal structure of a protein to figure out how a small molecule binds its protein. But that doesn't account for, for example, protein movement. We're able to do that with deep learning, which is a form of AI: you can learn patterns of how a protein and a small molecule bind together. So that's really what we do. We are a preclinical company, but we hope to file our first IND next year.
Clarissa Desjardins
Congruence is also a three-year-old company, and what we're doing is developing pharmacological correctors. These are small molecules that bind and restore wild-type features to mutated proteins. So we look at targets where there's a mutation that's causal to a disease. Our targets now are MC4R for genetic obesity, GCase for GBA-driven Parkinson's disease, and A1AT, and each one of those has severe mutations that cause, you know, horrible diseases which have no treatment as of today. So we're going after those. And what we do is we start with an X-ray crystal structure of the protein, and then we build the mutant. We model that mutant, and we generate conformational landscapes for both. This is using machine learning and AI; these are our proprietary tools. And so we see all the pockets that form across this conformational landscape. We compare the two, the wild type and the mutant, and we're able to identify the biophysical features that we believe are responsible for the pathogenicity of the mutated protein. That's what we want to correct with a small molecule. And then we generate pharmacophores on unique pockets that we identify. We do virtual screening, which is our main way of identifying small molecules as of today. And then we do virtual docking, where we measure the effect of the small molecule on the mutated conformational landscape. So we choose the small molecules that are best able to correct that defect that we identified and shift the landscape back to wild type. So far, this gives us a hit rate. Let's say we synthesize 200 molecules predicted from this platform, which we call Revenir, meaning to come back in French: come back to wild type, come back to health. We have hit rates between 6% and 26%, so this is starting to shorten the cycle.
Now, these are hits. We then have to transform them into drug-like molecules, and this is where we're using some novel machine learning and AI tools in the multi-parameter optimization space. But really the core of our technology is about looking at this full protein conformational landscape and how we can affect it with small molecules.
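To put the quoted hit rates in concrete terms, the short sketch below converts the 6% and 26% figures into a count of hits for a batch of 200 synthesized molecules. The function name is an illustrative assumption; only the batch size and the two rates come from the conversation.

```python
def expected_hits(n_synthesized: int, hit_rate: float) -> int:
    """Number of hits implied by a hit rate, rounded to a whole molecule."""
    return round(n_synthesized * hit_rate)


batch_size = 200               # molecules synthesized, as in the example above
low_rate, high_rate = 0.06, 0.26  # hit rates quoted in the conversation

low = expected_hits(batch_size, low_rate)
high = expected_hits(batch_size, high_rate)
print(f"{low} to {high} hits out of {batch_size} molecules")
```

So a batch of 200 molecules would yield somewhere between 12 and 52 hits at those rates.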
Noël Brown
AI/ML relies on training models with large data sets. So what has been your experience with accessing high quality data and scaling your AI models?
Laksh Aithani
We're focused on predicting protein-ligand structure, so we need protein-ligand structures to train the models. A lot of that data is in the public domain. There's a database called the Protein Data Bank, or the PDB, that researchers have been depositing structures into over the last 50 or so years. I think there are now over 200,000 structures in the PDB, and a lot of them have small molecule drugs bound to the protein. So that's obviously a very useful data set. But what we've realized is that, for the projects that we work on, it can be very helpful to generate our own data on those protein targets that we decide to go after, and so that's why we've built an in-house crystallography capability to generate this protein-ligand structure data. We've already generated, I think, over 200 of these proprietary structures, which sounds like a small amount, but when it's really focused data generation on a single protein or just a few proteins, it makes a huge impact on the accuracy of the model.
Clarissa Desjardins
Yeah. We also work on protein-ligand binding, but we also work on predicting various descriptors of the protein and how it binds. So one of the things we focus on is the free energy of folding, the delta G, the so-called thermodynamics of the protein. And here there's a variety of public databases, but what we've found when we're looking for these protein descriptor databases is that one day they're there and the next day they're gone. A lot of these were generated in academic labs, and they're not kept up for whatever reason: the professor retires or they change interests. So we have come to download immediately any useful database that we see, for fear that it's not going to be maintained or that it's going to disappear. And that happened to us, actually, with the ThermoMutDB database. We downloaded it, I guess, two years ago when we first started using it, and now we've gone deep into the database, and we find that there are several errors. As they've tried to parse the database to study subsets of proteins, there have been some mistakes that have repeated themselves, with the same proteins having different free energies of folding. So one has to be super careful about the publicly available databases, which are absolutely necessary. And so I completely agree with Laksh that we still need to generate our own. For someone trying to predict small molecule binding to a protein, that has to be atomic-level information about how small molecules are binding, and particularly to a specific binding pocket, and so access to novel X-ray crystal structures, or cryo-EM or NMR, is absolutely key. And the databases are relatively small. I mean, Laksh mentioned the total size of the PDB. But we, for example, were looking for proteins that were crystallized both with a ligand and without, because we're trying to predict how this pocket, you know, forms itself, and there were fewer than 1,000 of those examples.
So you can see how we have a long way to go until we have really robust databases that we can train sophisticated models on. But for an individual protein, yes, we can generate our own data and be, you know, quite comprehensive in that small database generation.
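The kind of error Clarissa describes, the same protein appearing with different free energies of folding, is something a simple consistency check can surface. The sketch below is a minimal illustration, assuming a downloaded table has been reduced to (protein, mutation, value) rows; the records, the identifiers, the function name and the 0.5 kcal/mol tolerance are all hypothetical and are not taken from any real database.

```python
from collections import defaultdict

# Hypothetical rows: (protein id, mutation, free energy of folding in kcal/mol).
# All identifiers and values are made up for illustration.
records = [
    ("protein_A", "A45G", -1.2),
    ("protein_A", "A45G", -1.2),  # exact duplicate, harmless
    ("protein_B", "L12P", 0.8),
    ("protein_B", "L12P", 2.9),   # same entry, conflicting value
    ("protein_C", "R77K", 0.1),
]


def find_conflicts(rows, tolerance=0.5):
    """Return entries whose reported values differ by more than `tolerance`."""
    grouped = defaultdict(list)
    for protein, mutation, value in rows:
        grouped[(protein, mutation)].append(value)
    return {
        key: values
        for key, values in grouped.items()
        if max(values) - min(values) > tolerance
    }


conflicts = find_conflicts(records)
for (protein, mutation), values in conflicts.items():
    print(f"{protein} {mutation}: conflicting values {values}")
```

Flagged entries would still need a person to go back to the original publication and decide which value, if any, to keep.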
Noël Brown
So, going in a different direction, we often hear about the market for AI drug development being a $50 billion market in the next decade. How do we think about those kinds of numbers? How do you interpret that? I mean, does it represent revenue that's coming? I'm assuming it's not revenue directly generated by the AI platform, per se, but more revenue generated from drugs that have come from these platforms?
Clarissa Desjardins
All companies doing drug development are going to adopt these tools for better efficiency. And if you think that the global pharma R&D budget today is about 250 billion, then yes, that entire budget is going to be AI/ML driven, and I believe that it's going to create a lot more than $50 billion in value by efficiently driving towards new drugs, you know, faster, cheaper. But I don't think it's going to be in software sales. No.
Laksh Aithani
I mean, I actually have a different opinion. I think there's a reason why software sales for, not quite AI technologies, but let's say computational technologies for drug discovery have been perhaps less than expected, with Schrödinger being the biggest player, at, I don't know, maybe 100 million dollars in annual recurring revenue. I think it's probably because the technology, while Schrödinger's is really good, still makes mistakes. And so I think people still need to ultimately do that lab testing, which ends up being the most expensive part of discovery, right, the synthesis and the profiling of molecules. So I think as soon as the AI becomes good enough to actually start massively cutting down on the number of molecules you need to make and test, that's when you're going to start getting real spend on software. And I think that might be the time when a company or multiple companies form to actually start separating the work, in the sense that they make the software, and then people buy from them. I mean, in many ways, it's like CROs these days in drug discovery, right? A lot of people now don't do their own synthetic chemistry. They just use WuXi, or Pharmaron, or some other CRO, and I think you might get a similar thing with AI for designing and predicting molecular properties at some point.
Noël Brown
You know the potential that can still come from your platforms, but we're in a world of, that's all well and good, but what drugs? What stage of development? What's the TAM? So, A) how do you balance those needs, and B) how are we going to slowly help investors expand their thinking and appreciate that, again, there's a lot of value in investing in these platforms?
Laksh Aithani
The first thing is probably, to whatever extent it's possible, trying to get investors that have some experience in investing in platform companies and are long-term oriented. That probably helps, I guess. The other thing is having demonstrable case studies, either from the past or within this particular company, for how the AI is enabling and speeding up drug development in many different areas.
Clarissa Desjardins
The beauty here is that our platforms are relatively cheap, at least in the case of Congruence, so we can focus 80% of our R&D dollars on traditional small molecule synthesis and wet lab testing, and only, quote unquote, 20% on the platform. But that turns out to be quite sufficient, at least for our pipeline needs, because it's relatively cheap. So I think that's a great benefit: we can honestly tell our shareholders that we're investing the majority to build the pipeline without neglecting the future impact that our platforms are going to deliver.
Noël Brown
Should pharma be investing as heavily in AI, or is this something that should be left to innovators?
Laksh Aithani
I mean, in many ways, they are investing through collaborations, right? So there's been a lot of collaborations in the AI space, probably now, for the last five or so years. So that is a way they're investing. They haven't bought any companies yet, although platform companies tend to not get bought that often. And of course, they do have their own investments as well, but I think, you know, a lot of it is through collaborations.
Clarissa Desjardins
Yes, I think they're doing what they can to build expertise in house, right? So people graduating with machine learning AI backgrounds, which is relatively recent, and some healthcare or drug discovery knowledge, will be in super high demand, even just to be able to evaluate, you know, from a big pharma perspective, what's going on in the industry and collaborations as well. So I think they're doing a mixture of everything, trying to build expertise in house, you know, do their own machine learning, AI model building and testing in house, as well as collaborating with external companies.
Noël Brown
What are the long term goals for each of your companies, and where do you see the space over the next five to 10 years or so?
Laksh Aithani
We want to get the clinical data for our lead program, hopefully in late 2026. That's certainly a goal. But then another important goal is to continue to develop the platform and improve its predictive power, right? So we want to increase the accuracy with which we're able to predict these protein-ligand structures, and also, in particular, improve the ability of the model to generalize, right? So to be able to predict completely novel proteins that it's never seen in the training data before. In terms of where the field is going over the next 5 to 10 years, and again, this is just my opinion, I'm going to keep it focused on small molecule drug discovery. I think the first thing that gets solved in this industry is, how do a protein and a small molecule fold up into their structure? And then the next thing that gets solved is, how tightly do they bind? So what's the binding affinity, or the Kd? And once you can do that, then you can essentially identify a hit against any target. So I think that'll get solved within the next 5 to 10 years. Then the next part of the puzzle, which is potentially much harder, is, how do you optimize that hit into a lead and then also, finally, into a drug, right? So for that, you need to improve the potency, selectivity and also the ADME properties. I think the ADME properties are very difficult because there's not that much data, and also the data that does exist is often not very clean, so we might need to generate a lot more data over there. And I guess the challenge with drug discovery is, if your molecule is missing even just one property or a few properties, it's potentially not going to be a drug. And so in order to have the biggest impact on small molecule drug discovery, you need to solve all of those problems. However, just finding the hits gets you a long way there, right?
So if you think about some targets, such as RAS, you know, we've been working on RAS for the last 30 to 40 years, probably, and many would argue that we still don't have good RAS-targeting drugs, just because it's been so hard to find hits against RAS and then develop them into drugs.
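For readers who want to connect the Kd Laksh mentions with the delta G Clarissa mentions earlier, the two are linked by the standard thermodynamic relation dG = RT ln(Kd), with Kd expressed in molar units against a 1 M standard state. The sketch below is only a worked illustration of that textbook formula; the function name, the temperature of 298.15 K and the example Kd values are assumptions chosen for the example, not figures from the conversation.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard binding free energy in kcal/mol from a dissociation constant.

    Uses dG = R * T * ln(Kd), with Kd in molar units (1 M standard state).
    Tighter binding (smaller Kd) gives a more negative value.
    """
    return R_KCAL * temperature_k * math.log(kd_molar)


# Example Kd values chosen for illustration: a weak hit and a tight binder.
for label, kd in [("1 micromolar", 1e-6), ("1 nanomolar", 1e-9)]:
    print(f"Kd = {label}: dG = {binding_free_energy(kd):.2f} kcal/mol")
```

Each thousandfold improvement in Kd corresponds to roughly 4 kcal/mol of additional binding free energy at room temperature, which gives a sense of how precise an affinity prediction has to be to rank hits reliably.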
Clarissa Desjardins
Now at Congruence, in five to 10 years, if all goes well, we should have developed at least three drugs against these rare, intractable diseases, where we have generated proof-of-concept data in a phase two. For the most advanced, we would be further along. But I think that would be a tremendous accomplishment for our little company, to have shown three novel drugs on difficult targets, showing efficacy in the clinic, and there'll be many more after that, but that's just the simple version. In terms of the industry moving forward, I agree with Laksh. We're going to see increasingly better predictions from either, you know, virtual screening of billions of compounds, or generative design, or multiple different technologies, leading to this ability to generate hits, and not just hits but hits that are drug-like molecules. But there'll be tipping points in a variety of fields. I don't believe that I can predict where they're going to occur now, but there'll be breakthroughs such as ChatGPT, and we'll be surprised at our ability to predict x, y, z that we need to know to develop drugs. But I think it's an exciting place to be. It's a laudable goal to try to address the suffering caused by all these diseases. And I, for one, am super excited to be in the field, and to be where we are at this time with the application of these new technologies.
Noël Brown
So I'm incredibly grateful to both of you. It's been so incredibly informative, and you've helped me, and I think the audience, better understand this landscape of AI drug development a little more, and it is very much appreciated.
That's it for our conversation today, folks. Thanks again for listening to Pathfinders in Biopharma, brought to you by RBC Capital Markets. Please remember to subscribe to get more great content and be alerted about future episodes. This episode was recorded on August 19, 2024. If you'd like to learn more or continue the conversation, please visit rbccm.com/biopharma. See you next time.