The new wave of AI tools is about far more than text and images
Imagine a world in which we can develop new drugs to treat diseases in a matter of months, rather than years. Where we can slow down or even reverse conditions once believed to be untreatable. Because of new developments in AI, this might soon be possible. That’s according to Anima Anandkumar, senior director of AI research at NVIDIA and the Bren Professor of Computing at California Institute of Technology.
“This is an inflection point for the use of AI in life sciences,” she says. “Gartner predicted that by 2025 more than 30 percent of new drugs and materials will be systematically discovered using generative AI techniques, up from zero today. I believe we will meet or even exceed this prediction.”
The field of AI has been enjoying a boom phase for a decade thanks to advancements in machine learning. Most of those developments have been related to making sense of existing data, identifying patterns, and extrapolating insights. Recently, however, there have been breakthroughs in generative AI models. These produce new content altogether. Instead of simply recognising a face in a crowd, for instance, generative AI can fabricate a new one.
Since December 2022, the technology news cycle has been dominated by gen-AI applications such as ChatGPT, a chatbot that quickly became the fastest-growing consumer application in history. But tools that can produce text and images are only part of a larger wave of gen-AI use cases.
Computer scientists are particularly excited about what gen-AI means for medicine. On the one hand, it could help tackle future pandemics. As easily as you can teach these models English and ask them to generate plausible sentences, you can train them on the genome data of known viruses and ask them to conjure variants of concern before they emerge. “We can use this information to prepare vaccines and other countermeasures before we even encounter them in the wild,” says Anandkumar.
It’s on the treatment side that gen-AI holds especially transformative potential – and not just for pandemics. Drug design is an industry projected to hit USD161.76 billion by 2030, according to Precedence Research. The traditional methodology is convoluted and painstaking. It begins with identifying a biological target, usually a protein that is driving a disease process. Next, you must develop a compound that produces the desired cellular effect on that target, either altering how it works or shutting it down. For a compound to be effective, it must also bind with the target – but this is harder than it sounds, because the target receptor’s structure morphs when it binds.
Traditionally, chemists begin by sifting through libraries of existing compounds to synthesise and test in a laboratory. They’ll modify each promising compound, adding and removing atoms based on its effectiveness. It can take thousands of iterations, and years of work, before a candidate can even be tested on humans, and many of them still fail because of how they interact with the body as a whole. “There are so many steps to the process,” says Anandkumar. “With every one it becomes more expensive.”
AI offers a welcome shortcut. By crunching masses of data, such as the structure of pathogens and the efficacy of existing drugs, gen-AI models can produce drug molecules that are fit for purpose and may never have been seen before. These can then be synthesised to spec. In other words, rather than looking for the needle in the haystack, you simply create the needle.
Anandkumar herself is exploring this space with her colleagues at Caltech and NVIDIA, and in collaboration with public research centres such as Argonne National Lab and the pharmaceutical company Entos. The latter is one of a raft of startups and incumbents in the medical sector that are putting gen-AI tools front and centre.
Over the next decade, she expects to see these models producing drugs that are personally tailored to the patient. “Once the drug design cycle gets vastly sped up and cheaper,” she says, “it may become economically feasible to discover new and better drug molecules based on specific targets in a person, according to their genetic analysis.”
The major hurdle, though, is data. These algorithms are only as useful as the data they’re trained on. Gen-AI has been successful in applications such as text and image generation because a wealth of valid historical data is available. But in fields such as the study of molecular properties, says Anandkumar, the data is more limited or even unavailable.
She believes that the solution may lie in physics-informed learning, where you embed physical laws and boundaries in the model’s processes. “Training models to follow those rules makes them significantly more powerful and accurate,” she says. This could unlock their industrial potential not only in drug design but well beyond. “We need a foundation AI model for science and engineering that understands complex phenomena beyond just text or natural images,” she continues, “and I believe we will see such developments in the near future.”
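The core idea behind physics-informed learning can be sketched in a few lines of code. In this hypothetical toy example (not Anandkumar’s actual method), a model is fitted to a handful of observations while a second penalty term pushes it to obey a known physical law – here, simple exponential decay, du/dt = -k·u. All names and numbers below are illustrative assumptions.

```python
import numpy as np

# Toy sketch of physics-informed learning: fit u(t) = a * exp(b * t)
# to a few data points AND penalise violations of the physical law
# du/dt = -k * u (with k = 1.0), evaluated at extra "collocation" points.

k = 1.0
t_obs = np.array([0.0, 0.5, 1.0])
u_obs = np.exp(-k * t_obs)            # observations lying on the true curve
t_phys = np.linspace(0.0, 1.0, 20)    # collocation points for the physics term

def loss(params):
    a, b = params
    # data term: how well the model matches the observations
    data_loss = np.mean((a * np.exp(b * t_obs) - u_obs) ** 2)
    # physics term: du/dt + k*u should be zero everywhere if the law holds
    du_dt = a * b * np.exp(b * t_phys)
    residual = du_dt + k * a * np.exp(b * t_phys)
    physics_loss = np.mean(residual ** 2)
    return data_loss + physics_loss

# a crude grid search stands in for gradient-based training
best = min(
    ((a, b) for a in np.linspace(0.5, 1.5, 21)
            for b in np.linspace(-2.0, 0.0, 21)),
    key=loss,
)
print(best)  # lands on the true parameters (a = 1.0, b = -1.0)
```

The physics penalty is what makes the approach data-efficient: even with only three observations, the model is constrained at twenty additional points where no data exists, which is exactly the regime Anandkumar describes for molecular properties.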