Airtable - Curated

InterPred

RetroGNN

GraphNVP

CGVAE

HyFactor

primroseLightning

MRlogP

ChemBO

Retroformer

DeepTox

DNN-DTI

DeepDTI

ADRTarget

DeepPurpose

DeepScreen

MIC Pred

DDI

LigandNet

DeepCE

HGraph2Graph

DeepSiba

egfr-att

bayesian-druglikeness

HybridTox2D

drug-class

DeepHit

ADR-OpenTG

FragmentRetrosynthesis

Activity coefficient

IUPAC2Struct

EnzymaticTransformer

DeepMolecularOptimization

ShapMetabolic

CypInhibitors

hergspred

Graph-based Genetic Algorithm and Generative Model/Monte Carlo Tree Search for the Exploration of Chemical Space

HDAC3i-Finder

MolTrans

rxnmapper

SumGNN

DeepDrug

Terpenes: The chemical space of Terpenes

hERG blocker

ChemicalX

MoLFormer

DRKG_COVID19

ATC_CNN

MLforCOE

SSI–DDI

TxGNN

GNN-MTB

COATI

BioGPT Embeddings

BioBERT Embeddings

GlyLES

NPBert-Malaria

Lipophilicity predictor

hERG

Atom-in-SMILES

QuantitativeTox

smilesX

MolecularTransformerEmbeddings

AntibioticsAI

Selected

https://www.nature.com/articles/s41598-022-14229-3?utm_source=dlvr.it&utm_medium=twitter#data-availability

https://gitlab.com/mongolicious/interpretable-ml-for-mechanism-of-action

Predicts the bioactivity and mechanism of action of a query molecule

Yes

None

High

Chemistry

Python

2022

https://pubs.acs.org/doi/abs/10.1021/acs.jcim.1c01476

https://github.com/pchliu/RetroGNN/

Use of a GNN to estimate synthesizability of compounds

Yes

Non-Commercial only

High

Low

Chemistry

Python

2022

https://arxiv.org/abs/1905.11600

https://github.com/pfnet-research/graph-nvp

Uses an invertible flow-based model to generate unique molecules, as molecular graphs, with desired properties.

Yes

MIT

Medium

High

Chemistry

Python

2019

https://arxiv.org/abs/1805.09076?context=cs

https://github.com/microsoft/constrained-graph-variational-autoencoder

Applies semantic constraints to variational autoencoders. Compared to grammar VAE and character VAE, these generate more valid and novel molecules.

Yes

MIT

Low

Chemistry

Python

2019

https://pubs.acs.org/doi/full/10.1021/acs.jcim.2c00744?casa_token=wAW_PTZU94oAAAAA%3AoAtvGihZ6jGLb_lZ2Dg9LHtjihhlVeqw84EMbGxDQtPYmrEfdPV7TgiixN087krRNxdlZuPvPDctn05R

https://github.com/Laboratoire-de-Chemoinformatique/hyfactor

Graph-based architecture built on hydrogen counts for generating novel molecules, tested on the ChEMBL, MOSES, and ZINC 250K datasets

Yes

LGPL

Low

High

Chemistry

Python

2022

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-022-00583-x

https://github.com/csbarak/primroseLightning

Contains several models (LASSO regression, decision tree, and CNN-SMILES) to predict binding to an RNA hairpin target against tuberculosis

Yes

MIT

High

Low

Chemistry

Python

2022

https://www.mdpi.com/2227-9717/9/11/2029/htm

https://github.com/JustinYKC/MRlogP

NN-based predictor of druglike small molecule lipophilicity, using transfer learning

MIT

High

Low

Chemistry

Python

2021

https://arxiv.org/abs/1908.01425

https://github.com/ks-korovina/chembo

Bayesian optimization of small molecules with synthesis planning

MIT

Medium

High

Chemistry

Python

Needs to be retrained

2019

https://arxiv.org/abs/2201.12475v1

https://github.com/yuewan2/Retroformer

Template-free retrosynthesis planner utilizing a transformer-based architecture

Yes

MIT

Low

High

Chemistry

Python

2022

https://www.frontiersin.org/articles/10.3389/fenvs.2015.00080/full

http://bioinf.jku.at/research/DeepTox/tox21.html

Prediction of the toxicity features across the Tox21 dataset using Binet (https://github.com/bioinf-jku/binet)

Yes

GPLv3

High

Low

Chemistry

Python

2016

https://ieeexplore.ieee.org/abstract/document/8217693

https://github.com/JohnnyY8/DNN-DTI

Drug target interaction

None

High

Low

Chemistry

Python

2021

https://pubs.acs.org/doi/full/10.1021/acs.jproteome.6b00618

https://github.com/Bjoux2/DeepDTIs_DBN

Drug target interaction

GPLv3

High

Low

Chemistry

Python

2017

https://www.thelancet.com/pdfs/journals/ebiom/PIIS2352-3964(20)30212-7.pdf

https://github.com/samanfrm/ADRtarget

Prediction of ADRs

Yes

GPLv3

High

Low

Chemistry

2020

https://arxiv.org/abs/2004.08919

https://github.com/kexinhuang12345/DeepPurpose

Drug Target Interaction

Yes

BSD-3

High

Low

Chemistry

Python

2020

https://pubs.rsc.org/en/content/articlehtml/2020/sc/c9sc03414e

https://github.com/cansyl/DEEPScreen

Drug Target Interaction

Yes

None

High

Chemistry

Python

Needs training for each target protein

2020

https://journals.asm.org/doi/full/10.1128/JCM.01260-18

https://github.com/PATRIC3/mic_prediction

Predicting MIC for Klebsiella from genomic data

Yes

None

Medium

High

Genomics

Python

Needs to be updated to Python 3

2020

https://arxiv.org/abs/1908.01288

https://github.com/rezacsedu/Drug-Drug-Interaction-Prediction

Drug Drug interaction prediction

Yes

None

Medium

High

Chemistry

Python

2019

https://github.com/sirimullalab/LigandNet

protein-specific ligand activity prediction

Yes

MIT

High

Chemistry

Python

We can implement the search from SMILES, but not the other way around, from proteins

2021

https://www.biorxiv.org/content/10.1101/2020.07.19.211235v1

https://github.com/pth1993/DeepCE

Predicts gene expression profiles given a chemical compound

Yes

None

Medium

High

Chemistry

Genomics

Python

2020

https://arxiv.org/pdf/2002.03230.pdf

https://github.com/wengong-jin/hgraph2graph/

Generative model

Yes

MIT

Medium

Low

Chemistry

Python

2020

https://arxiv.org/ftp/arxiv/papers/2004/2004.01028.pdf

https://github.com/BioSysLab/deepSIBA

Chemical structure-based inference of biological alterations

Yes

None

Medium

Low

Chemistry

Python

2021

https://arxiv.org/pdf/1906.05168v3.pdf

https://github.com/lehgtrung/egfr-att

Activity prediction EGFR Inhibitors

Yes

None

Medium

Low

Chemistry

Python

2019

https://github.com/Nanotekton/drugability/tree/v0.1

https://www.nature.com/articles/s42256-020-0209-y

Drug-likeness measure

Non-Commercial only

Medium

Low

Chemistry

Python

2020

https://pubs.acs.org/doi/10.1021/acsomega.8b03173

https://github.com/Abdulk084/HybridTox2D

Toxicological classification of chemical compounds

Yes

MIT

High

Low

Chemistry

Python

2019

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819987/

https://github.com/jgmeyerucsd/drug-class

Prediction of MeSH drug functions

Yes

GPLv3

Medium

Low

Chemistry

Python

2019

https://academic.oup.com/bioinformatics/article/36/10/3049/5727757

https://bitbucket.org/krictai/deephit/src/master/

Prediction of hERG toxicity

Yes

None

Medium

Low

Chemistry

Python

Developed in Python 2.7; it would be good to test whether it works in Python 3

2020

https://www.frontiersin.org/articles/10.3389/fddsv.2021.768792/full

https://github.com/attayeb/adr

Prediction of ADR using gene expression profiles

Yes

None

Low

High

Chemistry

Python

Uses Genomic data as input

2021

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7802345/

https://github.com/knu-chem-lcbc/fragment_based_retrosynthesis

Retrosynthetic pathway prediction using MACCS keys

Yes

Non-Commercial only

Medium

Low

Chemistry

Python

2021

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9721150

https://github.com/Bene94/SMILES2PropertiesTransformer/tree/main/Models

Predicts the activity coefficient (measures the deviation of a mix of chemical substances from its ideal behaviour)

Yes

None

Low

Chemistry

Python

2022
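
As a minimal sketch of what an activity coefficient expresses, the snippet below applies modified Raoult's law, p_i = x_i * gamma_i * p_sat_i, where gamma_i = 1 corresponds to an ideal mixture. The function name and the numbers are illustrative assumptions; nothing here is taken from the SMILES2PropertiesTransformer code or paper.

```python
# Modified Raoult's law: p_i = x_i * gamma_i * p_sat_i.
# gamma_i = 1 recovers ideal-mixture behaviour (plain Raoult's law).


def partial_pressure(x: float, gamma: float, p_sat: float) -> float:
    """Partial pressure of one component above a liquid mixture.

    x      mole fraction of the component in the liquid (0..1)
    gamma  activity coefficient (dimensionless)
    p_sat  saturation pressure of the pure component (any pressure unit)
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("mole fraction must lie in [0, 1]")
    return x * gamma * p_sat


# Illustrative numbers only (hypothetical, not from the paper).
ideal = partial_pressure(x=0.2, gamma=1.0, p_sat=10.0)
non_ideal = partial_pressure(x=0.2, gamma=2.5, p_sat=10.0)

# The ratio of the two pressures is the activity coefficient itself.
deviation = non_ideal / ideal
```

A predicted gamma above 1 therefore means the component is more volatile in the mixture than ideal behaviour would suggest, and a value below 1 means it is less volatile.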

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8292511/

https://github.com/sergsb/IUPAC2Struct

Conversion from IUPAC to Structure

Yes

MIT

Low

Chemistry

Python

Of limited interest: the authors do not provide Struct2IUPAC, which is the more difficult direction. STOUT offers both capabilities.

2021

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8246114/

https://github.com/reymond-group/OpenNMT-py

Predicts the enzymatic reaction of a small molecule

Yes

MIT

Low

High

Chemistry

Python

2021

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7980633/

https://github.com/MolecularAI/deep-molecular-optimization

Transformer model trained to optimize molecules in line with chemists' intuition

Yes

Apache

Medium

Low

Chemistry

Python

The prediction models for activity, ADME etc are not provided openly

2021

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00542-y#availability-of-data-and-materials

https://github.com/gmum/metstab-shap

A model that uses SHAP to explain metabolic stability

Yes

MIT

Medium

High

Chemistry

Python

No pretrained models

2021

https://www.sciencedirect.com/science/article/pii/S2001037022004536

https://github.com/gmum/cyp-inhibitors

Generation of CYP inhibitors

Yes

None

Medium

Low

Chemistry

Python

2022

https://pubs.acs.org/doi/10.1021/acs.jcim.2c00256

http://www.icdrug.com/ICDrug/T

Prediction of hERG Cardiotoxicity

None

Medium

Low

Chemistry

Python

Neither checkpoints nor the dataset are provided; we can only redirect to their online server

2022

https://pubs.rsc.org/en/content/articlelanding/2019/sc/c8sc05372c

https://github.com/jensengroup/GB_GA

This paper explores the performance of non-deep-learning techniques for chemical space exploration. It presents two algorithms, a graph-based genetic algorithm and a graph-based generative model with Monte Carlo tree search, and positions them as strong competitors to previous RNN-based approaches with respect to algorithm runtime.

MIT

Chemistry

Python

2019

https://onlinelibrary.wiley.com/doi/10.1002/minf.202000105

https://github.com/jwxia2014/HDAC3i-Finder

HDAC3i-Finder is a trained model that helps screen compounds for histone deacetylase 3 (HDAC3) inhibitors. HDAC3 is a prospective drug target for the treatment of human diseases such as cancer.

Yes

GPLv3

Medium

Low

Chemistry

Python

2020

https://academic.oup.com/bioinformatics/article/37/6/830/5929692?login=false

https://github.com/kexinhuang12345/moltrans

Molecular Interaction Transformer (MolTrans) is a classification model that determines whether a drug and a target protein will interact. First, it combines a pattern-mining algorithm with an interaction modeling module that uses substructural patterns for more accurate and interpretable DTI prediction. Second, it uses an augmented transformer encoder to capture the semantic relations among substructures extracted from massive unlabeled biomedical data, whereas existing methods focus on limited labeled data and ignore unlabeled molecular data. Evaluated on real-world data, MolTrans showed improved DTI prediction performance compared to state-of-the-art baselines.

BSD-3

Medium

High

Chemistry

Python

2020

https://doi.org/10.1126/sciadv.abe4166

https://github.com/rxn4chemistry/rxnmapper

Atom mapping on valid reaction SMILES. Given a SMILES, the model returns the mapped reactions and confidence scores.

Yes

MIT

Medium

High

Chemistry

Python

2021
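
To illustrate what a mapped reaction looks like, the sketch below reads the atom-map numbers out of an atom-mapped reaction SMILES and checks that both sides carry the same labels. It does not call the rxnmapper package: the mapped string is hand-written for illustration, and the helper name `atom_maps` is hypothetical.

```python
import re
from typing import Set, Tuple

# Atom-map numbers sit inside bracket atoms as ":<int>", e.g. "[CH3:1]".
ATOM_MAP = re.compile(r"\[[^\]]*?:(\d+)\]")


def atom_maps(mapped_rxn: str) -> Tuple[Set[int], Set[int]]:
    """Return the atom-map numbers found on the reactant and product sides."""
    # Reaction SMILES has the form "reactants>agents>products";
    # the agents part is empty in the common "A>>B" notation.
    reactants, _agents, products = mapped_rxn.split(">")
    left = {int(n) for n in ATOM_MAP.findall(reactants)}
    right = {int(n) for n in ATOM_MAP.findall(products)}
    return left, right


# Hand-written example (acetic acid + methanol -> methyl acetate + water).
example = (
    "[CH3:1][C:2](=[O:3])[OH:4].[CH3:5][OH:6]"
    ">>[CH3:1][C:2](=[O:3])[O:6][CH3:5].[OH2:4]"
)
left, right = atom_maps(example)

# Every mapped reactant atom should reappear in the products.
balanced = left == right
```

Matching numbers on both sides identify the same atom before and after the reaction, which is the information an atom-mapping model adds to a plain reaction SMILES.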

https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btab207/6189090

https://github.com/yueyu1030/SumGNN

SumGNN is a trainer for drug-drug interaction (DDI) prediction. It incorporates a knowledge summarization graph neural network, an improvement over the traditional knowledge graph (KG) networks used in existing baseline trainers. Models produced by SumGNN improve the pharmacological effect prediction score over other trainers by 5.57%. This score measures the adversity of the interaction between two or more drugs. DDI prediction is critical in determining the side effects of drugs in people with pre-existing conditions. The DDI model can be used to rule out some medicines for patients or to suggest changes in a drug's chemical composition to suit a patient. This could accelerate drug discovery by offering faster and more accurate reaction feedback.

Yes

None

Medium

High

Chemistry

Python

2021

https://www.biorxiv.org/content/10.1101/2020.11.09.375626v2

https://github.com/wanwenzeng/deepdrug

DeepDrug is a deep learning framework that uses residual graph convolutional networks (RGCNs) and convolutional networks (CNNs) to learn comprehensive structural and sequential representations of drugs and proteins, in order to boost prediction accuracy for drug-drug interactions (DDIs) and drug-target interactions (DTIs).

None

Medium

High

Chemistry

Python

2022

https://arxiv.org/abs/2110.15047

https://github.com/smortezah/napr

Terpenes are a large family of naturally occurring substances with diverse chemical and biological properties. Many of these molecules have already found use in pharmaceuticals. Characterising such a wide range of molecules with classical approaches has proved to be a daunting task. This model helps identify types of terpenes by using a natural product database, COCONUT, to extract information about 60,000 terpenes. For clustering, PCA, FastICA, Kernel PCA, t-SNE and UMAP were used as benchmarks. For classification, light gradient boosting machine, k-nearest neighbors, random forests, Gaussian naive Bayes and multilayer perceptron were used. The best-performing algorithms yielded accuracy, F1 score, precision and other metrics all over 0.9. Input: terpene features. Output: chemical subclass. Programming language: Python.

Yes

MIT

High

Chemistry

Python

More information about the model is provided in the PDF: 2110.15047.pdf

2021
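
For reference, the classification metrics named in the Terpenes entry can be computed from confusion-matrix counts as in the sketch below. It is simplified to a single class treated as positive versus the rest, whereas the actual task has several subclasses. The counts are made up for illustration and are not results from the paper.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for one terpene subclass versus all others.
metrics = binary_metrics(tp=90, fp=5, fn=10, tn=95)
all_above_threshold = all(value > 0.9 for value in metrics.values())
```

In a multi-class setting these per-class values are typically averaged across classes; which averaging the authors used is not stated in this entry.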

https://www.sciencedirect.com/science/article/abs/pii/S0010482522011994

https://github.com/WeilabMSU/hERG-prediction

Prediction of hERG Blocker

Yes

MIT

Medium

Low

Chemistry

Python

2022

https://arxiv.org/abs/2111.02916

https://github.com/AstraZeneca/chemicalx

Drug pair scoring is a machine learning task that involves a set of drugs and the task of predicting the behavior of drug pairs. It is used for predicting the effectiveness and safety of drug combinations.

Apache

Medium

High

Chemistry

Python

Multimodal, needs to be broken into pieces

2021

https://arxiv.org/abs/2106.09553

https://github.com/IBM/molformer

The paper discusses the use of machine learning models to predict molecular properties accurately and quickly in drug discovery and material design. However, the vast chemical space and limited availability of property labels make supervised learning challenging. To address this, the authors present MoLFormer, a model trained on small molecules represented as SMILES strings. The model architecture uses an efficient linear attention mechanism and relative positional embeddings, with the goal of learning a meaningful and compressed representation of chemical molecules.

Yes

Apache

Medium

Low

Chemistry

Python

2022

https://arxiv.org/abs/2007.10261v1

https://github.com/gnn4dr/DRKG

Drug-Repurposing for COVID-19

Yes

Apache

High

Low

Chemistry

Python

2020

https://academic.oup.com/bib/article/23/5/bbac346/6677124

https://github.com/lookwei/ATC_CNN

Anatomical Therapeutic Chemical (ATC) classification for compounds/drugs plays an important role in drug development and basic research. However, previous methods depend on interactions extracted from the STITCH dataset, which may tie them to lab experiments. ATC_CNN presents a pilot study to explore the possibility of conducting ATC prediction based solely on molecular structures. The motivation is to eliminate the reliance on costly lab experiments so that the characteristics of a drug can be pre-assessed, for better decision-making and less wasted effort before actual development.

Yes

Non-Commercial only

Medium

Low

Chemistry

Python

2022

https://arxiv.org/abs/2105.10236

https://github.com/PV-Lab/MLforCOE

Framework for establishing antibiotic property predictions. It consists of four components: (1) molecular representation, (2) feature down-selection, (3) ML algorithm selection, and (4) molecular descriptor importance analysis

BSD-2

Medium

Low

Chemistry

Python

2021

SSI–DDI: Substructure–Substructure Interactions for Drug-Drug Interaction Prediction

https://github.com/kanz76/ssi-ddi

The paper explores interactions between drugs, known as drug-drug interactions (DDIs). The model takes a DDI tuple (Gx, Gy, r) as input and predicts the probability that the drug pair (Gx, Gy) has an interaction r.

None

Medium

High

Chemistry

Python

Needs retraining

2021

https://www.medrxiv.org/content/10.1101/2023.03.19.23287458v1

https://github.com/mims-harvard/TxGNN

Zero-shot prediction of therapeutic use with geometric deep learning and clinician centered design

MIT

High

Chemistry

Python

2023

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/abstract/abstract5544.shtml

https://github.com/gu-yaowen/GNN-MTB

This paper implements and builds an anti-TB inhibitor prediction model

Yes

None

Medium

Low

Chemistry

Python

Very simple model, data not available

2023

https://chemrxiv.org/engage/chemrxiv/article-details/64e8137fdd1a73847f73f7aa

https://github.com/terraytherapeutics/COATI/tree/main

COATI: multi-modal contrastive pre-training for representing and traversing chemical space

Yes

Apache

Medium

Low

Chemistry

Python

2023

https://academic.oup.com/bib/article/23/6/bbac409/6713511?guestAccessKey=a66d9b5d-4f83-4017-bb52-405815c907b9&login=false

https://github.com/microsoft/BioGPT

BioGPT embeddings from biomedical text

Yes

MIT

Medium

Low

Biomedical Text

Python

This model is useful for the GRADIENT project

2022

https://arxiv.org/abs/1901.08746

https://pypi.org/project/biobert-embedding/

BioBERT embeddings from biomedical text

Yes

MIT

Low

Biomedical Text

Python

2019

https://github.com/kalininalab/GlyLES/

GlyLES: Grammar-based Parsing of Glycans from IUPAC-condensed to SMILES

Yes

MIT

Low

Chemistry

Python

https://github.com/mldlproject/2021-NPBERT-Antimalaria

https://pubs.acs.org/doi/10.1021/acs.jcim.1c00584

Antimalarial prediction based on BERT

Yes

None

Medium

High

Chemistry

Python

2022

https://github.com/VEK239/StructGNN-lipophilicity

https://ml4molecules.github.io/papers2020/ML4Molecules_2020_paper_48.pdf

Lipophilicity

None

Low

High

Chemistry

Python

https://github.com/WeilabMSU/hERG-prediction#virtual-screening-of-drugbank-database-for-herg-blockers-using-topological-laplacian-assisted-ai-models

https://www.nature.com/articles/s41598-019-47536-3

hERG inhibition

Yes

None

Medium

Low

Chemistry

Python

https://github.com/snu-lcbc/atom-in-SMILES

https://doi.org/10.1186/s13321-023-00725-9

Yes

Creative Commons

Medium

Low

Chemistry

Python

https://github.com/Abdulk084/QuantitativeTox/tree/master

https://pubs.acs.org/doi/10.1021/acsomega.1c01247

Toxicity endpoints

https://github.com/Lambard-ML-Team/SMILES-X

https://iopscience.iop.org/article/10.1088/2632-2153/ab57f3

https://pubs.acs.org/doi/epdf/10.1021/acs.jcim.9b01212

https://github.com/mpcrlab/MolecularTransformerEmbeddings

Transformer-based translation of SMILES text into embedding

Yes

MIT

High

Chemistry

Python

https://www.nature.com/articles/s41586-023-06887-8#code-availability

https://github.com/felixjwong/antibioticsai

Yes

MIT

High

Low

Chemistry

Python

63 records
