Can ChatGPT be used to advance drug discovery?

ChatGPT, a large language model (LLM) that belongs to the family of artificial intelligence (AI) known as generative AI, has been a revelation for many people around the world. Since its launch, professionals in many industries have used it to assist with work-related tasks such as editing and writing code. But can it also be used in the life sciences industry to advance drug discovery?

It’s no secret that AI has become a major tool for advancing drug discovery and development within the life sciences. This is especially true since the COVID-19 pandemic showed that AI can help find treatments and vaccines with greater speed and precision.

In fact, in a major breakthrough, it was recently reported that AI had helped to quickly discover a new antibiotic called abaucin, which can combat multidrug-resistant bacteria.

Although ChatGPT itself is not specifically designed for drug discovery, some companies are using it as a tool to assist researchers. The process as a whole can be very complicated, so anything that helps speed it up is a welcome asset.

Assisting in the drug discovery process

When we asked ChatGPT whether it could be used to advance drug discovery, it responded by saying that, although it can be used as a tool in the drug discovery process, it ‘has limitations and is not a substitute for specialized software or expertise in the field.’

It did, however, say that it can assist with drug discovery through data analysis, literature reviews, virtual screening, predictive modeling, and decision support.  

Leonard Wossnig, CTO of LabGenius, also commented that ChatGPT is useful for querying data. 

“Natural language processing (NLP) has been useful for querying data that is otherwise locked in scientific literature. An emerging solution is to query large language models (LLMs) including both generic models, like ChatGPT, and domain specific models, like BioMedLM,” Wossnig said. 

“For example, querying ChatGPT ‘What proteins could be targeted for the treatment of triple negative breast cancer by antibody therapy?’ yields the suggestions of EGFR, VEGF, PD-L1, PARP, and IGF-1R. While none of these are revolutionary proposals, more domain trained LLMs are likely to aid in the acceleration of target identification in the near future.”

GPT-4 in particular seems to be very useful in aiding the drug discovery process. OpenAI even described a number of ways in which it could assist drug discovery in its full Technical Report, published after GPT-4’s release.

Here, OpenAI stated that GPT-4 could help to find compounds similar to those that researchers are studying, propose ways to re-engineer compounds, identify mutations that alter pathogenicity, and determine whether compounds are patented.

Customizing ChatGPT for drug discovery

Furthermore, ChatGPT can be customized so that researchers can work with other forms of AI more easily than through their standard interfaces.

For example, the AI drug discovery company Insilico Medicine has integrated ‘ChatPandaGPT’ into its PandaOmics platform. This enables researchers to have natural language conversations with the platform and to navigate and analyze large data sets efficiently, which in turn facilitates the discovery of potential therapeutic targets and biomarkers.

ChatPandaGPT draws from a specialized knowledge base that allows it to provide accurate and detailed information related to molecular biology, therapeutic target discovery, and pharmaceutical development. 

By using both natural language processing and machine learning algorithms, it can provide more personalized and relevant responses for researchers using the platform. 

Using ChatGPT to develop biological experiments

Synthace, the creator of the world’s first digital experiment platform for life sciences R&D, also recently announced the integration of its platform with ChatGPT to design protocols for biology experiments and automate lab work.

Experiments are difficult to design, plan, and automate in a lab, and the whole process can take up an enormous amount of time. But Synthace’s ChatGPT prototype helps to speed up the process, allowing scientists to complete an experiment in hours rather than weeks or longer.

“With this prototype, Synthace uses ChatGPT to help a scientist define their experiment through natural language prompts. When the scientist is ready, Synthace converts the experiment into instructions for lab robots. ChatGPT has been trained on scientific literature, so it can interpret and create experiment designs, while Synthace is built to automate lab equipment,” explained Markus Gershater, co-founder and chief scientific officer (CSO) of Synthace. 

The prototype can help companies speed up drug discovery, too.

“In our prototype, ChatGPT can be used to develop the experiments that drug discovery companies need to optimize and use in their labs. Even the companies that use AI to discover potential drugs still need to take those candidates into the lab to run experiments on them. This is where Synthace comes in,” said Gershater. 

Key limitations of using ChatGPT for drug discovery

As much as it may be possible for ChatGPT to assist in drug discovery, it definitely has its limitations. Stef van Grieken, co-founder and CEO of Cradle, said he would not currently recommend using it in drug discovery, for three reasons.

The first, he said, is that ChatGPT is often not truthful and will convincingly state inaccurate information. Indeed, ChatGPT is known for generating ‘hallucinations’, whereby the information it gives you sounds plausible but is either factually incorrect or completely unrelated to the given context.

The second reason, van Grieken continued, is that ChatGPT finds it difficult to explain how it arrived at a certain answer or conclusion.

And, thirdly: “ChatGPT has very limited access to relevant data and literature for drug discovery. Many scientific publishers don’t allow access to their papers that would be required to train these models, and lots of relevant experimental datasets and patents are likely missing.” 

But van Grieken added that he does think that, in the near term, companies will develop similar LLMs that are experts at understanding drug development literature, datasets, and patents, and that will be able to help scientists by directly answering questions, finding relevant data, and summarizing literature.

Meanwhile, for Wossnig, the limitations of ChatGPT for drug discovery also stem from the fact that the data it provides is limited to what it can find on the internet.

“…I think the most exciting scientific discoveries will stem from novel proprietary data sets. This is because most scientific breakthroughs are often centered around novel drug modalities, for which there is no available data. However, thanks to recent advances in automation and disease modeling, companies can now create their own high-throughput, clinically-relevant data,” he said.

Although ChatGPT has its drawbacks, it does appear to have a role to play in assisting drug discovery, even if that role is currently limited. And with OpenAI bringing out GPT-4 so soon after introducing the world to the GPT-3.5 series, who knows what role ChatGPT will play in drug discovery if OpenAI decides to bring out an even more advanced version?