
Artificial Intelligence Helps Drug Development


On October 22, 2021, the research group of Academician Hualiang Jiang and Mingyue Zheng at the Shanghai Institute of Materia Medica, Chinese Academy of Sciences, published the article "Drug target inference by mining transcriptional data using a novel graph convolutional network framework" in Protein & Cell. The researchers applied a Siamese spectral-based graph convolutional network (SSGCN) to develop a novel drug target prediction algorithm based on transcriptome data, and verified the model's predictions in wet-lab experiments. The results show that the SSGCN model can significantly improve the accuracy of drug target prediction and provides a powerful tool for research on drug mechanisms of action and for target validation [1].

At Medicilon, chemistry and biology are ingrained in every project we undertake. Our medicinal chemistry team flexibly applies computational chemistry to assist the compound design process. We also apply advanced drug discovery technologies, including proteolysis-targeting chimeras (PROTAC), DNA-encoded chemical libraries (DEL), and antibody-drug conjugates (ADC). We are proud of our rich experience in innovative design and patent strategies that complement our technologies. In addition, our responsive project management and effective communication help optimize project delivery.

At present, most drugs exert their therapeutic effects by interacting with specific targets in the body. Clarifying a drug's target and mechanism of action is therefore very important for drug research and development and for its use after marketing. For active compounds or natural-product active ingredients obtained by phenotypic screening, discovering and confirming the target is the key difficulty in further research and development. Besides speeding up early drug discovery, identifying potential drug targets can deepen our understanding of a drug's mechanism of action, metabolism, adverse reactions, and drug resistance, point the way to new indications, and significantly reduce the cost of drug development.

Drug Development

Drug targets can be identified through biochemical experiments (such as proteomics mass spectrometry). However, because of limits on experimental scale, accuracy, and cost, such experiments are difficult to apply widely. Computational target prediction methods are low-cost and high-throughput, so they have long attracted attention. These methods are of great significance for developing targeted drugs, for elucidating the mechanism of action of the active ingredients of natural products, and for chemical biology research. Classical drug target prediction algorithms generally fall into ligand-based methods and receptor-structure-based methods. The former mainly use small-molecule structures or physicochemical properties, such as molecular fingerprints, shapes, and pharmacophores, to predict drug-target interactions. The latter usually rely on molecular docking to reveal potential interactions between small molecules and proteins. In earlier work, Academician Hualiang Jiang of the Shanghai Institute of Materia Medica and the team of Honglin Li at East China University of Science and Technology carried out in-depth research on target prediction based on ligand and protein receptor structures [3-5].
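As a rough illustration of the ligand-based idea, the sketch below scores a query compound against known ligands of each target using the Tanimoto coefficient on molecular fingerprints. It assumes fingerprints are already available as sets of on-bit indices; the function names, target names, and fingerprints are hypothetical and chosen only for this example, and real tools derive fingerprints with cheminformatics software.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def rank_targets(query_fp, known_ligands):
    """Score each target by its most similar known ligand, best first."""
    scores = {
        target: max(tanimoto(query_fp, fp) for fp in fps)
        for target, fps in known_ligands.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: target names and fingerprints are made up for illustration.
known_ligands = {
    "target_A": [{1, 2, 3, 4}, {2, 3, 5}],
    "target_B": [{7, 8, 9}],
}
query = {1, 2, 3, 6}
ranking = rank_targets(query, known_ligands)
print(ranking)
```

Scoring a target by its single nearest ligand is one simple choice; averaging over several nearest ligands is a common alternative.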

In recent years, the rapid accumulation of multi-omics data and the development of artificial intelligence have laid the foundation for more accurate drug target inference algorithms. Drug-induced gene expression profiles characterize a drug's action at the gene and cell level, so they are an important reference for drug target prediction. For example, the Connectivity Map (CMap) method developed by the Broad Institute is based on similarity analysis between signatures of differentially expressed genes, and it provides important clues for drug repositioning, drug targets, and mechanisms of action [6]. There are also methods based on dynamic network analysis and machine learning, but shallow analyses of biological networks generally struggle to find deep correlations between compound-induced transcriptional profiles and those induced by gene perturbation. In general, the gene expression changes that a drug causes at the cellular level provide a wealth of information for target prediction, but how to identify the drug's true physical target from high-dimensional, highly redundant, and noisy gene expression data remains an important unsolved challenge. Open questions include how to systematically consider the relationships between genes in biological regulatory networks; how to account for complex factors such as intracellular noise, cell-to-cell differences, compound concentration, and duration of action on expression profiles; and how to expand the range of drug targets that can be inferred.
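The signature-similarity idea can be sketched as follows. This is not the CMap connectivity score itself; as a simpler stand-in it uses the Pearson correlation between a compound-induced expression signature and each gene-perturbation signature, then ranks the perturbed genes as candidate targets. The gene names and values are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant signature carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def rank_by_signature(compound_sig, perturbation_sigs):
    """Order perturbed genes by similarity to the compound signature."""
    scores = {g: pearson(compound_sig, s) for g, s in perturbation_sigs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy differential-expression values over five genes (made up).
compound_sig = [1.2, -0.8, 0.1, 2.0, -1.5]
perturbation_sigs = {
    "GENE_X": [1.0, -1.0, 0.0, 1.8, -1.2],
    "GENE_Y": [-1.1, 0.9, 0.2, -2.1, 1.4],
    "GENE_Z": [0.1, 0.0, -0.1, 0.2, 0.1],
}
ranked = rank_by_signature(compound_sig, perturbation_sigs)
print(ranked)
```

A measure like this treats every gene independently, which is one reason noise and network context are hard for it to handle.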

To address these challenges, the researchers designed a Siamese network architecture based on the ideas of contrastive learning and metric learning, and used it to predict targets from transcriptome data. The model uses two parallel graph convolutional networks to extract features from the differential gene expression profiles induced by compound perturbation and by gene perturbation, which effectively reduces the influence of noise in the expression profiles on the prediction of drug-target interactions. On the benchmark data set, the top-100 accuracy of this method reaches 0.53, significantly higher than that of existing target prediction algorithms such as the CMap method [3]. By using deep learning to mine transcriptome data and protein regulatory networks, the SSGCN method introduces fewer assumptions and can learn the deep correlation between compound perturbation profiles and gene perturbation profiles. The research team found that traditional bioinformatics measures (such as the Pearson correlation coefficient and the Tanimoto coefficient on KEGG features) have difficulty capturing this deep correlation, which helps explain the model's significant performance improvement. In addition, the method integrates heterogeneous experimental condition information (cell type, duration, and compound dose). This not only allows more training data to be used to improve model performance, but also accounts for the effects of cell-line background, dose, and time dependence on differential gene expression, so that the influence of complex network perturbations on drug target inference is better considered.
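A minimal sketch of the Siamese idea, under stated assumptions, is shown below. It is not the published SSGCN implementation: it uses a single graph-convolution layer with the common first-order normalization D^-1/2 (A + I) D^-1/2, mean pooling, fixed untrained weights, and cosine similarity as the comparison metric. The toy gene network, signatures, and all function names are hypothetical. The key point it illustrates is that both branches share the same weights and the same gene network.

```python
import math


def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a symmetric 0/1 adjacency matrix."""
    n = len(adj)
    a = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a]  # at least 1 because of the self-loop
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_embed(signature, a_hat, weights):
    """One graph-convolution layer with ReLU, then mean pooling over genes."""
    n = len(signature)
    # Mix each gene's expression change with that of its network neighbours.
    smoothed = [sum(a_hat[i][j] * signature[j] for j in range(n)) for i in range(n)]
    # Project the scalar node feature onto len(weights) hidden channels.
    hidden = [[max(0.0, s * w) for w in weights] for s in smoothed]
    return [sum(row[k] for row in hidden) / n for k in range(len(weights))]


def cosine(u, v):
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def siamese_score(compound_sig, gene_sig, a_hat, weights):
    """Embed both signatures with the SAME weights and compare them."""
    return cosine(gcn_embed(compound_sig, a_hat, weights),
                  gcn_embed(gene_sig, a_hat, weights))


# Toy three-gene chain network and made-up signatures.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
a_hat = normalized_adjacency(adjacency)
weights = [0.5, -0.5]  # untrained; a real model would learn these
compound_sig = [1.0, -0.5, 0.2]
same_score = siamese_score(compound_sig, compound_sig, a_hat, weights)
opposite_score = siamese_score(compound_sig, [-1.0, 0.5, -0.2], a_hat, weights)
print(same_score, opposite_score)
```

In a trained model the shared weights would be fitted with a contrastive or metric-learning loss so that true compound-target pairs score higher than mismatched pairs.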

Figure 1. Target prediction using the SSGCN model

To further validate the method, the research team also applied it in wet-lab experiments. In the first application scenario, the researchers established a compound-centric target inference workflow to predict potential host targets of nelfinavir (NFV). The experiments verified that cyclophilin A (CYPA) is a target of NFV and suggested a possible mechanism for NFV's anti-coronavirus activity [7]. In the second application scenario, the research team established a target-centric prediction workflow to screen for inhibitors of ectonucleotide pyrophosphatase/phosphodiesterase 1 (ENPP1). The experiments discovered and confirmed that the established drug methotrexate (MTX) is an ENPP1 inhibitor with a novel scaffold.
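The difference between the two workflows can be sketched with a table of predicted compound-target scores: the compound-centric view ranks targets within one compound's row, and the target-centric view ranks compounds within one target's column. The scores, names, and helper functions below are placeholders for illustration, not results from the study.

```python
def compound_centric(scores, compound, top_k=3):
    """Rank candidate targets for one compound (one row of the score table)."""
    row = scores[compound]
    return sorted(row, key=row.get, reverse=True)[:top_k]


def target_centric(scores, target, top_k=3):
    """Rank candidate compounds for one target (one column of the score table)."""
    column = {c: row[target] for c, row in scores.items() if target in row}
    return sorted(column, key=column.get, reverse=True)[:top_k]


# Placeholder scores; a real table would come from a trained model.
scores = {
    "compound_1": {"target_A": 0.91, "target_B": 0.12, "target_C": 0.40},
    "compound_2": {"target_A": 0.30, "target_B": 0.75, "target_C": 0.22},
    "compound_3": {"target_A": 0.55, "target_B": 0.81, "target_C": 0.10},
}
targets_for_compound = compound_centric(scores, "compound_1", top_k=2)
compounds_for_target = target_centric(scores, "target_B", top_k=2)
print(targets_for_compound, compounds_for_target)
```

In either view, the top-ranked candidates are the ones that would be carried forward to experimental confirmation.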

At present, artificial intelligence has achieved great success in protein structure prediction, drug molecule generation, and reaction route planning. This work predicts drug targets from transcriptome data, characterizes drug modes of action in the cellular environment, and explores potential drug targets from the perspectives of cell transcriptomics and RNA biology. It is a positive contribution of artificial intelligence to drug development.


[1] Zhong FS, Wu XL, Yang RR, et al. Drug target inference by mining transcriptional data using a novel graph convolutional network framework [J]. Protein & Cell, 2021 (in press). doi:10.1007/s13238-021-00885-0
[2] Keiser MJ, Setola V, Irwin JJ, et al. Predicting new molecular targets for known drugs [J]. Nature, 2009, 462(7270): 175-181.
[3] Li H, Gao Z, Kang L, et al. TarFisDock: a web server for identifying drug targets with docking approach [J]. Nucleic Acids Research, 2006, 34(suppl_2): W219-W224.
[4] Liu X, Gao Y, Peng J, et al. TarPred: a web application for predicting therapeutic and side effect targets of chemical compounds [J]. Bioinformatics, 2015, 31(12): 2049-2051.
[5] Wang X, Shen Y, Wang S, et al. PharmMapper 2017 update: a web server for potential drug target identification with a comprehensive target pharmacophore database [J]. Nucleic Acids Research, 2017, 45(W1): W356-W360.
[6] Subramanian A, Narayan R, Corsello SM, et al. A next generation connectivity map: L1000 platform and the first 1,000,000 profiles [J]. Cell, 2017, 171: 1437-1452.
[7] Xu Z, Yao H, Shen J, et al. Nelfinavir is active against SARS-CoV-2 in Vero E6 cells [J]. Preprint, 2020.
