Curated
Name
1
InterPred
2
RetroGNN
3
GraphNVP
4
CGVAE
5
HyFactor
6
primroseLightning
7
MRlogP
8
ChemBO
9
Retroformer
10
DeepTox
11
DNN-DTI
12
DeepDTI
13
ADRTarget
14
DeepPurpose
15
DeepScreen
16
MIC Pred
17
DDI
18
LigandNet
19
DeepCE
20
HGraph2Graph
21
DeepSiba
22
egfr-att
23
bayesian-druglikeness
24
HybridTox2D
25
drug-class
26
DeepHit
27
ADR-OpenTG
28
FragmentRetrosynthesis
29
Activity coefficient
30
IUPAC2Struct
31
EnzymaticTransformer
32
DeepMolecularOptimization
33
ShapMetabolic
34
CypInhibitors
35
hergspred
36
Graph-based Genetic Algorithm and Generative Model/Monte Carlo Tree Search for the Exploration of Chemical Space
37
HDAC3i-Finder
38
MolTrans
39
rxnmapper
40
SumGNN
41
DeepDrug
42
Terpenes: The chemical space of Terpenes
43
hERG blocker
44
ChemicalX
45
MoLFormer
46
DRKG_COVID19
47
ATC_CNN
48
MLforCOE
49
SSI–DDI
50
TxGNN
51
GNN-MTB
52
COATI
53
BioGPT Embeddings
54
BioBERT Embeddings
55
GlyLES
56
NPBert-Malaria
57
Lipophilicity predictor
58
hERG
59
Atom-in-SMILES
60
QuantitativeTox
61
smilesX
62
MolecularTransformerEmbeddings
63
AntibioticsAI
Publication
Source Code
Description
Checkpoints Available
License
Priority
Difficulty
Type
Code
Comments
Year
https://www.nature.com/articles/s41598-022-14229-3?utm_source=dlvr.it&utm_medium=twitter#data-availability
https://gitlab.com/mongolicious/interpretable-ml-for-mechanism-of-action
Predicts the bioactivity and mechanism of action of a query molecule
Yes
None
High
High
Chemistry
Python
2022
https://pubs.acs.org/doi/abs/10.1021/acs.jcim.1c01476
https://github.com/pchliu/RetroGNN/
Use of a GNN to estimate synthesizability of compounds
Yes
Non-Commercial only
High
Low
Chemistry
Python
2022
https://arxiv.org/abs/1905.11600
https://github.com/pfnet-research/graph-nvp
Uses an invertible flow-based model to generate unique molecules with desired properties. The model generates molecular graphs.
Yes
MIT
Medium
High
Chemistry
Python
2019
https://arxiv.org/abs/1805.09076?context=cs
https://github.com/microsoft/constrained-graph-variational-autoencoder
Applies semantic constraints to variational autoencoders. Compared to grammar VAE and character VAE, these generate more valid and novel molecules.
Yes
MIT
Low
Low
Chemistry
Python
2019
https://pubs.acs.org/doi/full/10.1021/acs.jcim.2c00744?casa_token=wAW_PTZU94oAAAAA%3AoAtvGihZ6jGLb_lZ2Dg9LHtjihhlVeqw84EMbGxDQtPYmrEfdPV7TgiixN087krRNxdlZuPvPDctn05R
https://github.com/Laboratoire-de-Chemoinformatique/hyfactor
Novel hydrogen-count-based graph architecture for generating novel molecules, tested on the ChEMBL, MOSES, and ZINC 250K datasets
Yes
LGPL
Low
High
Chemistry
Python
2022
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-022-00583-x
https://github.com/csbarak/primroseLightning
Contains several models (LASSO regression, decision tree, and CNN-SMILES) to predict binding to an RNA hairpin target against tuberculosis
Yes
MIT
High
Low
Chemistry
Python
2022
https://www.mdpi.com/2227-9717/9/11/2029/htm
https://github.com/JustinYKC/MRlogP
NN-based predictor of druglike small molecule lipophilicity, using transfer learning
No
MIT
High
Low
Chemistry
Python
2021
https://arxiv.org/abs/1908.01425
https://github.com/ks-korovina/chembo
Bayesian optimization of small molecules with synthesis planning
No
MIT
Medium
High
Chemistry
Python
Needs to be retrained
2019
https://arxiv.org/abs/2201.12475v1
https://github.com/yuewan2/Retroformer
Template-free retrosynthesis planner utilizing a transformer-based architecture
Yes
MIT
Low
High
Chemistry
Python
2022
https://www.frontiersin.org/articles/10.3389/fenvs.2015.00080/full
http://bioinf.jku.at/research/DeepTox/tox21.html
Prediction of the toxicity features across the Tox21 dataset using Binet (https://github.com/bioinf-jku/binet)
Yes
GPLv3
High
Low
Chemistry
Python
2016
https://ieeexplore.ieee.org/abstract/document/8217693
https://github.com/JohnnyY8/DNN-DTI
Drug target interaction
No
None
High
Low
Chemistry
Python
2021
https://pubs.acs.org/doi/full/10.1021/acs.jproteome.6b00618
https://github.com/Bjoux2/DeepDTIs_DBN
Drug target interaction
No
GPLv3
High
Low
Chemistry
Python
2017
https://www.thelancet.com/pdfs/journals/ebiom/PIIS2352-3964(20)30212-7.pdf
https://github.com/samanfrm/ADRtarget
Prediction of ADRs
Yes
GPLv3
High
Low
Chemistry
R
2020
https://arxiv.org/abs/2004.08919
https://github.com/kexinhuang12345/DeepPurpose
Drug Target Interaction
Yes
BSD-3
High
Low
Chemistry
Python
2020
https://pubs.rsc.org/en/content/articlehtml/2020/sc/c9sc03414e
https://github.com/cansyl/DEEPScreen
Drug Target Interaction
Yes
None
High
High
Chemistry
Python
Needs training for each target protein
2020
https://journals.asm.org/doi/full/10.1128/JCM.01260-18
https://github.com/PATRIC3/mic_prediction
Predicting MIC for Klebsiella from genomic data
Yes
None
Medium
High
Genomics
Python
Needs to be updated to PY3
2020
https://arxiv.org/abs/1908.01288
https://github.com/rezacsedu/Drug-Drug-Interaction-Prediction
Drug Drug interaction prediction
Yes
None
Medium
High
Chemistry
Python
2019
https://github.com/sirimullalab/LigandNet
protein-specific ligand activity prediction
Yes
MIT
High
High
Chemistry
Python
We can implement the search from SMILES, but not the other way around, from proteins
2021
https://www.biorxiv.org/content/10.1101/2020.07.19.211235v1
https://github.com/pth1993/DeepCE
Predicts gene expression profiles given a chemical compound
Yes
None
Medium
High
Chemistry
Genomics
Python
2020
https://arxiv.org/pdf/2002.03230.pdf
https://github.com/wengong-jin/hgraph2graph/
Generative model
Yes
MIT
Medium
Low
Chemistry
Python
2020
https://arxiv.org/ftp/arxiv/papers/2004/2004.01028.pdf
https://github.com/BioSysLab/deepSIBA
Chemical structure-based inference of biological alterations
Yes
None
Medium
Low
Chemistry
Python
2021
https://arxiv.org/pdf/1906.05168v3.pdf
https://github.com/lehgtrung/egfr-att
Activity prediction for EGFR inhibitors
Yes
None
Medium
Low
Chemistry
Python
2019
https://github.com/Nanotekton/drugability/tree/v0.1
https://www.nature.com/articles/s42256-020-0209-y
Drug-likeness measure
No
Non-Commercial only
Medium
Low
Chemistry
Python
2020
https://pubs.acs.org/doi/10.1021/acsomega.8b03173
https://github.com/Abdulk084/HybridTox2D
Toxicological classification of chemical compounds
Yes
MIT
High
Low
Chemistry
Python
2019
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819987/
https://github.com/jgmeyerucsd/drug-class
Prediction of drug functions MeSH
Yes
GPLv3
Medium
Low
Chemistry
Python
2019
https://academic.oup.com/bioinformatics/article/36/10/3049/5727757
https://bitbucket.org/krictai/deephit/src/master/
Prediction of hERG toxicity
Yes
None
Medium
Low
Chemistry
Python
Developed in Python 2.7; it would be good to test whether it works in Python 3
2020
https://www.frontiersin.org/articles/10.3389/fddsv.2021.768792/full
https://github.com/attayeb/adr
Prediction of ADR using gene expression profiles
Yes
None
Low
High
Chemistry
Python
Uses Genomic data as input
2021
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7802345/
https://github.com/knu-chem-lcbc/fragment_based_retrosynthesis
Retrosynthetic pathway prediction using MACCS keys
Yes
Non-Commercial only
Medium
Low
Chemistry
Python
2021
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9721150
https://github.com/Bene94/SMILES2PropertiesTransformer/tree/main/Models
Predicts the activity coefficient (a measure of how far a mixture of chemical substances deviates from ideal behaviour)
Yes
None
Low
Low
Chemistry
Python
2022
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8292511/
https://github.com/sergsb/IUPAC2Struct
Conversion from IUPAC to Structure
Yes
MIT
Low
Low
Chemistry
Python
Not interesting, as the authors do not provide Struct2IUPAC, which is the more difficult direction. STOUT does provide both capabilities
2021
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8246114/
https://github.com/reymond-group/OpenNMT-py
Predicts the enzymatic reaction of a small molecule
Yes
MIT
Low
High
Chemistry
Python
2021
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7980633/
https://github.com/MolecularAI/deep-molecular-optimization
Transformer model trained to optimize molecules based on chemist intuition
Yes
Apache
Medium
Low
Chemistry
Python
The prediction models for activity, ADME etc are not provided openly
2021
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00542-y#availability-of-data-and-materials
https://github.com/gmum/metstab-shap
A model that uses SHAP to explain metabolic stability
Yes
MIT
Medium
High
Chemistry
Python
No pretrained models
2021
https://www.sciencedirect.com/science/article/pii/S2001037022004536
https://github.com/gmum/cyp-inhibitors
Generation of CYP inhibitors
Yes
None
Medium
Low
Chemistry
Python
2022
https://pubs.acs.org/doi/10.1021/acs.jcim.2c00256
http://www.icdrug.com/ICDrug/T
Prediction of hERG Cardiotoxicity
No
None
Medium
Low
Chemistry
Python
Neither checkpoints nor the dataset are provided; we can only redirect to their online server
2022
https://pubs.rsc.org/en/content/articlelanding/2019/sc/c8sc05372c
https://github.com/jensengroup/GB_GA
This paper explores the performance of non-deep-learning techniques for chemical space exploration. It presents two algorithms, a graph-based genetic algorithm and a graph-based generative model with Monte Carlo tree search, and positions them as strong competitors to previous RNN-based approaches with respect to algorithm runtime.
MIT
Chemistry
Python
2019
https://onlinelibrary.wiley.com/doi/10.1002/minf.202000105
https://github.com/jwxia2014/HDAC3i-Finder
The trained HDAC3i-Finder model helps screen compounds to identify histone deacetylase 3 (HDAC3) inhibitors. HDAC3 is a prospective drug target for the treatment of human diseases such as cancer.
Yes
GPLv3
Medium
Low
Chemistry
Python
2020
https://academic.oup.com/bioinformatics/article/37/6/830/5929692?login=false
https://github.com/kexinhuang12345/moltrans
Molecular Interaction Transformer (MolTrans) is a classification model that determines whether a drug and a target protein will interact. It combines a pattern mining algorithm with an interaction modeling module that uses sub-structural patterns for more accurate and interpretable DTI prediction. It also incorporates an augmented transformer encoder to capture the semantic relations among substructures extracted from massive unlabeled biomedical data, whereas existing methods focus on limited labeled data and ignore massive unlabeled molecular data. Evaluated on real-world data, MolTrans showed improved DTI prediction performance compared to state-of-the-art baselines.
No
BSD-3
Medium
High
Chemistry
Python
2020
https://doi.org/10.1126/sciadv.abe4166
https://github.com/rxn4chemistry/rxnmapper
Atom mapping on valid reaction SMILES. Given reaction SMILES, the model returns the mapped reactions and confidence scores.
Yes
MIT
Medium
High
Chemistry
Python
2021
https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btab207/6189090
https://github.com/yueyu1030/SumGNN
SumGNN is a trainer for drug-drug interaction (DDI) prediction that incorporates a knowledge summarization graph neural network, an improvement over the traditional knowledge graph (KG) networks used in existing baseline trainers. Models produced by SumGNN improve the pharmacological effect prediction score over other trainers by 5.57%. The pharmacological effect prediction score measures the adversity of the interaction between two or more drugs. DDI prediction is critical in determining the side effects of drugs in people with pre-existing conditions. The DDI model can be used to rule out some medicines for patients or to suggest changes in a drug's chemical composition to suit a patient. This could accelerate drug discovery by offering faster and more accurate reaction feedback.
Yes
None
Medium
High
Chemistry
Python
2021
https://www.biorxiv.org/content/10.1101/2020.11.09.375626v2
https://github.com/wanwenzeng/deepdrug
DeepDrug is a deep learning framework that uses residual graph convolutional networks (RGCNs) and convolutional networks (CNNs) to learn comprehensive structural and sequential representations of drugs and proteins, in order to boost the prediction accuracy of drug-drug interactions (DDIs) and drug-target interactions (DTIs).
No
None
Medium
High
Chemistry
Python
2022
https://arxiv.org/abs/2110.15047
https://github.com/smortezah/napr
Terpenes are a wide-ranging family of naturally occurring substances with diverse chemical and biological properties. Many of these molecules have already found use in pharmaceuticals. Characterising such a wide range of molecules with classical approaches has proved to be a daunting task. This model provides more insight into identifying types of terpenes by using a natural product database, COCONUT, to extract information about 60,000 terpenes. For clustering, PCA, FastICA, Kernel PCA, t-SNE and UMAP were used as benchmarks. For classification, light gradient boosting machine, k-nearest neighbors, random forests, Gaussian naive Bayes and multilayer perceptron were used. The best-performing algorithms yielded accuracy, F1 score, precision and other metrics all over 0.9. Input: terpene features. Output: chemical subclass. Programming language: Python.
Yes
MIT
High
High
Chemistry
Python
More information on the model is provided in the PDF: 2110.15047.pdf
2021
https://www.sciencedirect.com/science/article/abs/pii/S0010482522011994
https://github.com/WeilabMSU/hERG-prediction
Prediction of hERG blockers
Yes
MIT
Medium
Low
Chemistry
Python
2022
https://arxiv.org/abs/2111.02916
https://github.com/AstraZeneca/chemicalx
Drug pair scoring is a machine learning task that involves a set of drugs and the task of predicting the behavior of drug pairs. It is used for predicting the effectiveness and safety of drug combinations.
No
Apache
Medium
High
Chemistry
Python
Multimodal, needs to be broken into pieces
2021
https://arxiv.org/abs/2106.09553
https://github.com/IBM/molformer
The paper discusses the use of machine learning models to predict molecular properties accurately and quickly in drug discovery and material design. However, the vast chemical space and the limited availability of property labels make supervised learning challenging. To address this, the authors present MoLFormer, a model trained on small molecules represented as SMILES strings. The architecture uses an efficient linear attention mechanism and relative positional embeddings, with the goal of learning a meaningful and compressed representation of chemical molecules.
Yes
Apache
Medium
Low
Chemistry
Python
2022
https://arxiv.org/abs/2007.10261v1
https://github.com/gnn4dr/DRKG
Drug-Repurposing for COVID-19
Yes
Apache
High
Low
Chemistry
Python
2020
https://academic.oup.com/bib/article/23/5/bbac346/6677124
https://github.com/lookwei/ATC_CNN
Anatomical Therapeutic Chemical (ATC) classification of compounds and drugs plays an important role in drug development and basic research. However, previous methods depend on interactions extracted from the STITCH dataset, which may make them dependent on lab experiments. ATC_CNN presents a pilot study exploring the possibility of conducting ATC prediction based solely on molecular structures. The motivation is to eliminate the reliance on costly lab experiments, so that the characteristics of a drug can be pre-assessed for better decision-making and effort-saving before actual development.
Yes
Non-Commercial only
Medium
Low
Chemistry
Python
2022
https://arxiv.org/abs/2105.10236
https://github.com/PV-Lab/MLforCOE
Framework for establishing antibiotic property predictions. It consists of four components: (1) molecular representation, (2) feature down-selection, (3) ML algorithm selection, and (4) molecular descriptor importance analysis
No
BSD-2
Medium
Low
Chemistry
Python
2021
SSI–DDI: Substructure–Substructure Interactions for Drug-Drug Interaction Prediction
https://github.com/kanz76/ssi-ddi
The paper explores interactions between different drugs, known as drug-drug interactions (DDIs). The model takes a DDI tuple (Gx, Gy, r) as input and predicts the probability that a pair of drugs (Gx, Gy) has an interaction r.
No
None
Medium
High
Chemistry
Python
Needs retraining
2021
https://www.medrxiv.org/content/10.1101/2023.03.19.23287458v1
https://github.com/mims-harvard/TxGNN
Zero-shot prediction of therapeutic use with geometric deep learning and clinician centered design
No
MIT
High
High
Chemistry
Python
2023
https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/abstract/abstract5544.shtml
https://github.com/gu-yaowen/GNN-MTB
This paper implements and builds an anti-TB inhibitor prediction model
Yes
None
Medium
Low
Chemistry
Python
Very simple model, data not available
2023
https://chemrxiv.org/engage/chemrxiv/article-details/64e8137fdd1a73847f73f7aa
https://github.com/terraytherapeutics/COATI/tree/main
COATI: multi-modal contrastive pre-training for representing and traversing chemical space
Yes
Apache
Medium
Low
Chemistry
Python
2023
https://academic.oup.com/bib/article/23/6/bbac409/6713511?guestAccessKey=a66d9b5d-4f83-4017-bb52-405815c907b9&login=false
https://github.com/microsoft/BioGPT
BioGPT embeddings from biomedical text
Yes
MIT
Medium
Low
Biomedical Text
Python
This model is useful for the GRADIENT project
2022
https://arxiv.org/abs/1901.08746
https://pypi.org/project/biobert-embedding/
BioBERT embeddings from biomedical text
Yes
MIT
Low
Low
Biomedical Text
Python
2019
https://github.com/kalininalab/GlyLES/
https://github.com/kalininalab/GlyLES/
GlyLES: Grammar-based Parsing of Glycans from IUPAC-condensed to SMILES
Yes
MIT
Low
Low
Chemistry
Python
https://github.com/mldlproject/2021-NPBERT-Antimalaria
https://pubs.acs.org/doi/10.1021/acs.jcim.1c00584
Antimalarial prediction based on BERT
Yes
None
Medium
High
Chemistry
Python
2022
https://github.com/VEK239/StructGNN-lipophilicity
https://ml4molecules.github.io/papers2020/ML4Molecules_2020_paper_48.pdf
Lipophilicity
No
None
Low
High
Chemistry
Python
https://github.com/WeilabMSU/hERG-prediction#virtual-screening-of-drugbank-database-for-herg-blockers-using-topological-laplacian-assisted-ai-models
https://www.nature.com/articles/s41598-019-47536-3
hERG inhibition
Yes
None
Medium
Low
Chemistry
Python
https://github.com/snu-lcbc/atom-in-SMILES
https://doi.org/10.1186/s13321-023-00725-9
Yes
Creative Commons
Medium
Low
Chemistry
Python
https://github.com/Abdulk084/QuantitativeTox/tree/master
https://pubs.acs.org/doi/10.1021/acsomega.1c01247
Toxicity endpoints
https://github.com/Lambard-ML-Team/SMILES-X
https://iopscience.iop.org/article/10.1088/2632-2153/ab57f3
https://pubs.acs.org/doi/epdf/10.1021/acs.jcim.9b01212
https://github.com/mpcrlab/MolecularTransformerEmbeddings
Transformer-based translation of SMILES text into embedding
Yes
MIT
High
High
Chemistry
Python
https://www.nature.com/articles/s41586-023-06887-8#code-availability
https://github.com/felixjwong/antibioticsai
Yes
MIT
High
Low
Chemistry
Python
63 records
