Intelligent innovation

Markus Gershater reveals how AI will transform biology and the pharmaceutical industry.

Artificial intelligence (AI) will transform biology. As far as anything can be predicted, I know this statement is true. But it is also a statement that risks being useless or even counterproductive, for two reasons.

The first reason is that AI is massive and can be applied in a myriad of ways across every aspect of the value chain. Saying “we need to use AI” is like saying “we need to use electricity”: obvious and useless unless you talk specifics. Much more meaningful is “we need to apply large language models to improve the user interfaces for our complex equipment and methodologies,” or “we should use active learning to optimise the development of assays for early discovery.”

The second reason it’s useless is that it’s an empty call to arms, with no acknowledgment of all the change that will be needed to make the touted revolution come about. In the second industrial revolution, electricity by itself was insufficient to increase productivity. People first needed to realise that it offered a way to change how they worked. Factories no longer had to be arranged around massive drive-shafts powered by steam engines. Instead, they could be arranged into production lines. It was the combination of new technology (electrification) and new ways of working (production lines and division of labour) that enabled the step-change in productivity.

My underlying belief is that AI and biological research don’t fit together properly yet. AI is a technology that fundamentally demands change from the people who want to use it. So for AI to have a fundamental impact on biology, we must change the way we approach the process of science in the first place. Organisations and teams will have to adopt new mindsets, which demand new scientific processes and must be supported by an updated ecosystem of tooling.

In effect, the biggest way AI will change our industry and the study of biology is the change it prompts in how we work. A conversation about AI is, by this point, really a conversation about data. So what data would an AI system need in order to untangle biological complexity?

Quality and context of data are as important as volume

Biology’s complexity emerges from the interactions of its simpler components, giving rise to unique properties and behaviours. These emergent features can’t be reliably predicted from individual components, necessitating a comprehensive and interconnected dataset for a deeper understanding of biological systems.

Much of the big data produced in biology comes from multi-omic studies: highly detailed molecular snapshots of a system. But apart from genomic data, all of these readouts are highly dynamic: they change over time and in response to a multitude of stimuli. To truly understand a biological system, we must understand its dynamics as any number of factors change. We can’t just measure a lot of things; we have to measure them in the context of this multifactorial landscape, systematically running experiments that map the space and allow AI to “see” what is going on.
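To make “mapping the space” concrete, here is a minimal sketch of a full-factorial design. The factor names and levels are purely illustrative assumptions, not a real protocol:

```python
from itertools import product

# Hypothetical factors and levels for a cell-culture experiment;
# the names and values are illustrative, not a real protocol.
factors = {
    "temperature_C": [30, 34, 37],
    "glucose_mM": [5, 10, 25],
    "inducer_uM": [0, 1, 10],
    "timepoint_h": [4, 24, 48],
}

# A full-factorial design: every combination of factor levels.
design = [dict(zip(factors, levels)) for levels in product(*factors.values())]

print(len(design), "conditions needed to map this small space")  # 81
print(design[0])
```

Even four factors at three levels each give 81 conditions; real systems have many more factors, which is exactly why exhaustive enumeration gives way to automation and smarter, AI-guided sampling of the space.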

Just sequencing something isn’t enough; we must also look at how it works, interacts, and reacts to different stimuli. If we are to comprehend the intricacies of biological processes, one-dimensional data alone won’t take us far.

When recording experimental data, we often lose vital context about how it was produced. A thorough grasp of the experimental inputs and how they were systematically varied is crucial for making sense of the complexity of the responses. Yet the information we could capture about our experiments – why we selected them, our methods, lab conditions, the liquid classes used with automated pipettors – is extensive but challenging to document effectively with current processes and tools.

Research we recently conducted revealed that a concerning 43% of R&D leaders lack confidence in the quality of their experiment data. This underscores the need not only to improve how we record data but also to design experiments that yield higher-quality data from the outset. A deep understanding of this data requires comprehensive insight into how it was created. Metadata about experimentation should be a pivotal component of any future AI strategy, and capturing it will prompt a shift in how we work.
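What might that metadata look like in practice? Here is a minimal sketch of a structured experiment record; the field names are assumptions chosen for illustration, not a standard schema:

```python
from dataclasses import dataclass, field
from datetime import datetime

# A minimal, hypothetical sketch of structured experiment metadata.
@dataclass
class ExperimentRecord:
    hypothesis: str        # why this experiment was selected
    protocol_id: str       # the method that was actually executed
    factors: dict          # the inputs varied and their levels
    instrument: str        # e.g. which automated pipettor was used
    liquid_class: str      # liquid-handling parameters applied
    lab_conditions: dict   # ambient temperature, humidity, ...
    run_at: datetime = field(default_factory=datetime.utcnow)

record = ExperimentRecord(
    hypothesis="Inducer concentration limits expression at low temperature",
    protocol_id="assay-dev-v3",
    factors={"inducer_uM": 1.0, "temperature_C": 30},
    instrument="pipettor-02",
    liquid_class="viscous-low-volume",
    lab_conditions={"ambient_C": 21.5, "humidity_pct": 40},
)
print(record)
```

The specifics matter less than the principle: the reasoning behind an experiment and the conditions of its execution are captured alongside the results, rather than lost in notebooks and lab folklore.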

When AI can help a biologist identify the best possible experiment, run it, help analyse the full breadth and depth of experimental data and metadata, and then use that data to decide on the next experiment, then the transformation of our industry will truly have happened.
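To picture what that closed loop could look like, here is a minimal active-learning sketch: a surrogate model is fitted to the results so far and proposes the next condition to test. Everything in it is an assumption made for illustration – the simulated assay, the factor ranges, the upper-confidence-bound rule – rather than a description of any particular platform:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

# Hypothetical stand-in for running a real assay: returns a noisy "yield"
# for a given (concentration, temperature) condition. Illustrative only.
def run_experiment(condition):
    conc, temp = condition
    signal = np.exp(-((conc - 3.0) ** 2 + (temp - 37.0) ** 2) / 20.0)
    return signal + rng.normal(scale=0.02)

# Candidate conditions: the multifactorial space we want to map.
candidates = np.array([[c, t] for c in np.linspace(0, 10, 21)
                              for t in np.linspace(25, 45, 21)])

# Start with a handful of randomly chosen experiments.
idx = rng.choice(len(candidates), size=5, replace=False)
X = candidates[idx]
y = np.array([run_experiment(x) for x in X])

# Closed loop: model the data so far, pick the next experiment, run it.
for _ in range(10):
    model = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    mean, std = model.predict(candidates, return_std=True)
    next_condition = candidates[np.argmax(mean + std)]  # upper-confidence bound
    X = np.vstack([X, next_condition])
    y = np.append(y, run_experiment(next_condition))

print("Best condition found:", X[np.argmax(y)], "yield:", y.max())
```

The important point isn’t the particular model; it’s that each new experiment is chosen in the light of all the data, and metadata, that came before it.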

AI will only change our industry if we change with it

Some companies today exemplify a future-oriented approach to biological data, especially with regard to AI. Consider companies such as Recursion and Insitro, with comprehensive automated platforms designed for systematic, fully digitised exploration of biological systems. They offer a glimpse into the future: routine generation of high-quality, multidimensional data with rich metadata. This data forms the foundation for AI, revolutionising our understanding of, and interaction with, biological systems.

Beyond data quantity, it’s the depth of context that truly matters. Some companies already lead by building, from the ground up, for the nuanced, rich data that discovery demands. As this approach evolves, it holds promise for the industry and for biology more widely.

Markus Gershater is chief science officer at Synthace.
