Artificial Intelligence, Machine Learning & Drug Discovery in The “Post-Watson Era”

“People should look at machine learning more like electricity. I think every single company is either going to be using machine learning or out of business the same way that every single company would have had to use electricity at some point or go out of business.”

— Simon Smith, Chief Growth Officer, BenchSci

Increasing computational power and a more realistic set of expectations are helping to bring the use of AI in drug discovery out of theory and into practice. 

Computing power has grown enormously since companies first floated the idea of using artificial intelligence (AI) to help discover drugs, and that growth has helped move the field toward practical use.

But it’s really the change in expectations that has turned AI in drug discovery from a nice theory, exciting to those working in what-ifs, to a true tool that can aid those working in practicalities. 

Initially there was an idea that technologies such as IBM’s Watson had the potential to completely change the way work was done in every area of human endeavour. 

“IBM Watson had just won Jeopardy. There was a misunderstanding at that time that if you just had a lot of data, a general AI would be able to go through it and answer any questions that you might have,” said Simon Smith, the Chief Growth Officer at the startup BenchSci.

When IBM Watson won on Jeopardy, where the questions span different categories, it gave the illusion that one general-purpose technology could answer all kinds of questions just by having enough data.

“That was the early phase of all of this and IBM was doing everything in its power to demonstrate that they were able to apply this technology to healthcare through Watson for Oncology and that helped to build up this idea that the technology was much more general purpose than it actually was,” he added. 

In other words, do not expect a single, general AI to begin producing miracle drugs anytime soon. Today, a more nuanced picture has emerged, one in which specialized machine learning and AI programs can be tailored to improve multiple specific use cases across every stage of the drug discovery process.

Smith has identified 13 areas of the drug discovery process where tech startups are already applying machine learning and AI: 


  1. Aggregating and Synthesizing Information

  2. Understanding Mechanisms of Disease

  3. Generating Data and Models

  4. Repurposing Existing Drugs

  5. Generating Novel Drug Candidates

  6. Validating Drug Candidates

  7. Drug Design

  8. Designing Preclinical Experiments

  9. Running Preclinical Experiments

  10. Designing Clinical Trials

  11. Recruiting for Clinical Trials

  12. Optimizing Clinical Trials

  13. Publishing Data

“The traditional trial-and-error approach to drug discovery is too costly and time-consuming, with a low overall success rate,” said Dr. Niven R. Narain, Co-Founder, President and CEO of the Boston-based biopharma company BERG. “By identifying and targeting biomarkers, we can eliminate the ‘guess and check’ approach, improving how treatment pathways are identified and targeted in addition to better personalizing treatments for patients.”

Developments in the use of AI and machine learning across the drug discovery process have not gone unnoticed by the world’s largest technology companies. In 2017 DeepMind, a London-based AI company owned by Google’s parent Alphabet, applied its powerful software toward folding proteins for drug discovery.


In July, the Google Brain team announced it had successfully built a computer vision model for protein crystallization. Verily Life Sciences, also owned by Alphabet, has entered into a three-year, $90 million partnership with Gilead Sciences to identify how certain patients respond to existing drugs and to discover clues from the data that could lead to new medications.

This September, Amazon Web Services (AWS) announced a partnership with Merck and Accenture to launch a cloud-based informatics research platform with the intention of helping life science companies improve and streamline their drug discovery initiatives. Merck will be the first to use the platform, which according to their press release is meant to facilitate innovation by “creating open, industry-standard application programming interfaces for core research functions, allowing researchers to rapidly adopt new capabilities.”

Venture Capital and The Long Term Picture


As Big Pharma has been scrambling to partner with up-and-coming machine learning and AI startups, venture capitalists have for the past few years steadily been increasing their investments in the field. 


In 2015, Silicon Valley VC Andreessen Horowitz launched its first fund focused on biology and computer science with $200 million. Last year it launched its second bio-focused fund for $450 million.


Chinese-linked venture capitalists have been very active in this space as well. At the beginning of the year, Sequoia Capital China led a $15 million Series B funding round for XtalPi Inc, a computation-driven pharmaceutical technology company.


These examples are just the tip of the iceberg. VC investment has shot up dramatically and as yet shows no signs of slowing down. Simon Smith at BenchSci, itself a startup in this field, has identified over 100 startups working in one niche or another of the drug discovery process.


“People should look at machine learning more like electricity,” said Smith. “I think every single company is either going to be using machine learning or out of business the same way that every single company would have had to use electricity at some point or go out of business.


“I don’t think people are going to be walking around saying our technology is powered by machine learning. That would be like saying our technology is powered by electricity. It’s obvious that it’s powered by electricity and I think it is going to become obvious that you are using some kind of cognitive technology in your business.”


“As soon as we move to a world where the only competitive differentiation between companies is going to be the data that they own,” Smith continues. “As soon as that comes to the world of pharma and life science, that is where everyone is going to have to focus. 


“GSK did a big deal with 23andMe, $300 million dollars to access their data. Why? Because they have a unique database of genomes. Now when you make drugs, people who take those drugs and have their 23andMe genome sequenced can upload the efficacy of their drugs and then they can analyze that and make a better next generation of drugs by tailoring to people’s genomes and the virtuous cycle continues. 


“That is where people need to focus. I don’t run a pharma company, but if I did I would be scouring the world just as much for unique data sources and unique data partnerships as I was for unique compounds to

Partnerships Instead of In-House


Traditional Big Pharma companies have aggressively sought out partnerships to stay at the cutting edge of these technologies. For example, AstraZeneca and Sanofi Pasteur have partnered with Niven Narain’s company BERG.

“We all know that traditional drug discovery takes too long, and we’ve seen the industry increasingly embrace these technologies,” said Narain. “I think many big pharmas are still looking for better ways to inject this technology and expertise into their processes, but they need a quality partner. Fortunately for BERG, AstraZeneca and Sanofi Pasteur recognized the potential of our proprietary technology and unique approach, with strategic partnerships announced to identify new therapeutic targets for neurological disorders, such as Parkinson’s, as well as to assess biomarkers of flu vaccine performance.”

“Takeda is very thoughtful about what it can do on its own and what it can do more effectively with partners,” said David J. Weitz, a senior vice president at Takeda California in San Diego.

“Our partnerships with Numerate and Schrodinger are good examples of where our partners are effectively combining the software they developed with the availability of larger compound libraries, protein crystal structures and traditional drug discovery skills to discover new medicines with novel chemistry. 

“Other examples are consortia like Open Targets and the UK Biobank. In both cases we are looking to utilize the consortia’s access to large quantities of otherwise unmined human data and take advantage of software tools for interrogating the data that are being generated by the consortia.”

Earlier this year Boehringer Ingelheim entered into a drug discovery agreement with the machine learning company Bactevo to identify novel small molecule lead compounds. Meanwhile GlaxoSmithKline (GSK) partnered with artificial intelligence-driven drug design and development company Bactevo. 

To give a couple more recent examples, last year GSK also entered into a $42.7 million deal with the UK-based Exscientia, which uses an artificial intelligence-based platform to automate drug design, while Sanofi entered into an even larger deal with the Scottish company, offering research funding and milestone payments of up to $250 million.

“Pharma, as the ultimate consumers of advances in machine learning and artificial intelligence, understand its promise as well as its limitations and the amount of work that goes beyond the answers these technologies can provide,” said Weitz. “There are some effective applications for the technology but the solutions provided are of focused utility. Fundamental issues such as noisy biological data and poor data harmonization still exist.”  

For example, human genetic data plus machine learning tools may correlate a location/target with a disease phenotype but that correlation still needs to be validated experimentally to determine the best target and execution pathways to succeed as a medication, he added.  

And that needs to be kept in mind when discussing AI in drug discovery. “What has not changed is a hype cycle that may lead some to believe that rather complete automated answers (or drugs) can be spit out of a computer. A great deal of in-depth experimental follow-up is still required,” he said.

Computational Acceleration


Fortunately for advocates of AI in drug discovery, exponential increases in computational power are making their job easier. In the 1980s, concepts such as Computer Aided Drug Discovery (CADD) were enthusiastically embraced by companies such as Merck. But the computational power of the time was not enough, and CADD did not live up to its potential.

“Two things that are different now are the amount of data available and computational power,” said Weitz. “These two changes enable us to detect patterns from models that previously may not have been detected.”

“The technological capabilities exist today, and computing power is only continuing to accelerate,” said Narain. “The hurdles we must continue to overcome are more specific to research and patient engagement.”

For example, access to biobanks or other clinical databases from which to identify, secure and analyze biological samples continues to be a challenge. On the patient side, patient identification and clinical trial recruitment take time to put in place.

It is another area where AI could play a role. “The modelling and technology are available and scalable now, but I’m confident in the next few years we’ll begin to see more progress in connecting patients to trial sites, as well as connecting the necessary research dots to streamline development processes and study protocols accordingly,” Narain added. 

If you found this content valuable, please consider subscribing to Rx Data News to receive monthly updates on the state of data analytics and machine learning in the pharmaceutical industry.

© 2019 PNG Publishing
