Airtable - Public


eos74km

eos8ub5

eos2db3

eos9gg2

eos3mk2

eos9p4a

eos39co

eos3wzy

eos3nn9

eos1pu1

eos39dp

eos6ru3

eos6ost

eos8aox

eos57bx

eos5guo

eos24ur

eos2401

eos5gge

eos7d58

eos694w

eos42ez

eos21q7

eos18ie

eos8bhe

eos5cl7

eos1n4b

eos9ym3

eos30f3

eos5xng

eos69e6

eos4wt0

eos4x30

eos1ut3

eos9ivc

eos9zw0

eos633t

eos3kcw

eos1d7r

eos9ueu

eos4f95

eos2zmb

eos1noy

eos3le9

eos4rta

eos2l0q

eos3804

eos2hzy

eos8fma

eos1mxi

eos7yti

eos4qda

eos80ch

eos3ev6

eos7nno

eos5jz9

eos59rr

eos7kpb

eos2gw4

eos3cf4

eos3zur

eos9tyg

eos44zp

eos24jm

eos6aun

eos31ve

eos2fy6

eos2lqb

eos8fth

eos8lok

eos9yy1

eos22io

eos74bo

eos81ew

eos93h2

eos7qga

eos4avb

eos4cxk

eos8c0o

eos6hy3

eos5505

eos4se9

eos24ci

eos935d

eos4q1a

eos9taz

eos2rd8

eos9sa2

eos8a5g

eos238c

eos2v11

eos1579

eos6m4j

eos4zfy

eos2a9n

eos9c7k

eos7jlv

eos4b8j

eos3ae7

eos9be7

eos4tcc

eos5qfo

eos2mrz

eos2re5

eos30gr

eos526j

eos6pbf

eos2b6f

eos3xip

eos6o0z

eos85a3

eos8451

eos157v

eos481p

eos2mhp

eos6fza

eos5smc

eos9ei3

eos46ev

eos69p9

eos7a45

eos65rt

eos2hbd

eos97yu

eos6tg8

eos2gth

eos7pw8

eos8ioa

eos9yui

eos2r5a

eos6oli

eos6ao8

eos43at

eos1af5

eos2ta5

eos96ia

eos8d8a

eos7asg

eos2lm8

eos78ao

eos7jio

eos2thm

eos8a4x

eos8h6g

eos3b5e

eos5axz

eos3ae6

eos7w6n

eos4u6p

eos7a04

eos77w8

eos1amr

eos1vms

eos92sw

eos9f6t

eos4e40

Slug

Task

antimicrobial-kg-ml

Ready

GitHub

Antimicrobial class specificity prediction

Prediction of antimicrobial class specificity using simple machine learning methods applied to an antimicrobial knowledge graph. The knowledge graph is built on ChEMBL, Co-ADD and SPARK. Endpoints are broad terms such as activity against gram-positive or gram-negative bacteria. The best model according to the authors is a Random Forest with MHFP6 fingerprints.

Compound

Single

Score

List

Float

Pretrained

Antimicrobial activity

https://github.com/ersilia-os/eos74km

https://www.biorxiv.org/content/10.1101/2024.12.02.626313v1.full

https://github.com/IMI-COMBINE/broad_spectrum_prediction

MIT

Class probabilities for each antimicrobial class

Annotation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos74km.zip

Local

17/12/2024

2024

chemical-space-projections-coconut

Ready

GitHub

Projections against Coconut

This tool performs PCA, UMAP and tSNE projections taking the Coconut natural products database as a chemical space of reference. The Ersilia Compound Embeddings are used as descriptors. Four PCA components, two UMAP components and two tSNE components are returned.

Compound

Single

Value

List

Float

In-house

Embedding

https://github.com/ersilia-os/eos8ub5

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00478-9

https://github.com/ersilia-os/compound-embedding

GPL-3.0-or-later

Coordinates of 2D projections, namely PCA, UMAP and tSNE.

Representation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos8ub5

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8ub5.zip

Local

10/11/2024

2024

chemical-space-projections-chemdiv

Ready

GitHub

Chemical space 2D projections against ChemDiv

This tool performs PCA, UMAP and tSNE projections taking a 100k ChemDiv diversity set as a chemical space of reference. The Ersilia Compound Embeddings are used as descriptors. Four PCA components, two UMAP components and two tSNE components are returned.

Compound

Single

Value

List

Float

In-house

Embedding

https://github.com/ersilia-os/eos2db3

https://www.chemdiv.com/catalog/diversity-libraries/representative-diversity-libraries-out-of-1-6m-stock/

https://github.com/ersilia-os/compound-embedding

GPL-3.0-or-later

Coordinates of 2D projections, namely PCA, UMAP and tSNE.

Representation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos2db3

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2db3.zip

Local

9/11/2024

2024

chemical-space-projections-drugbank

Ready

GitHub

Chemical space 2D projections against DrugBank

This tool performs PCA, UMAP and tSNE projections taking the DrugBank chemical space as a reference. The Ersilia Compound Embeddings are used as descriptors. Four PCA components, two UMAP components and two tSNE components are returned.

Compound

Single

Value

List

Float

In-house

Embedding

https://github.com/ersilia-os/eos9gg2

https://academic.oup.com/nar/article/52/D1/D1265/7416367

https://github.com/ersilia-os/compound-embedding

GPL-3.0-or-later

Coordinates of 2D projections, namely PCA, UMAP and tSNE.

Representation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos9gg2

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9gg2.zip

Local

9/11/2024

2024

bbbp-marine-kinase-inhibitors

Ready

GitHub

BBBP model tested on marine-derived kinase inhibitors

A set of three binary classifiers (random forest, gradient boosting classifier, and logistic regression) to predict the Blood-Brain Barrier (BBB) permeability of small organic compounds. The best models were applied to natural products of marine origin, able to inhibit kinases associated with neurodegenerative disorders. The training set size was around 300 compounds.

Compound

Single

Score

List

Float

Retrained

Drug-likeness

Permeability

https://github.com/ersilia-os/eos3mk2

https://pubmed.ncbi.nlm.nih.gov/30699889/

https://github.com/plissonf/BBB-Models

MIT

Classification score over three classifiers, namely random forest (rfc), gradient boosting classifier (gbc), and logistic regression (logreg).

Annotation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos3mk2

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3mk2.zip

Local

23/10/2024

2024

deep-dl

Ready

GitHub

Drug-likeness scoring based on unsupervised learning

This model evaluates drug-likeness using an unsupervised learning approach, eliminating the need for labeled data and avoiding biases from incomplete negative sets. It extracts features directly from known drug molecules, identifying common characteristics through a recurrent neural network (RNN) language model. By representing molecules as SMILES strings, the model learns the probability distribution of known drugs and assesses new molecules based on their likelihood of appearing in this space.

Compound

Single

Score

Single

Float

Pretrained

Drug-likeness

https://github.com/ersilia-os/eos9p4a

https://pubs.rsc.org/en/content/articlehtml/2022/sc/d1sc05248a

https://github.com/SeonghwanSeo/DeepDL

GPL-3.0-or-later

Higher score indicates higher drug likeness

Annotation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://eos9p4a-izpny.ondigitalocean.app/

https://hub.docker.com/r/ersiliaos/eos9p4a

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9p4a.zip

https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos9p4a

Online

4/9/2024

2024

unimol-representation

Ready

GitHub

Uni-Mol molecular representation

Uni-Mol offers a simple and effective SE(3) equivariant transformer architecture for pre-training molecular representations that capture 3D information. The model is trained on >200M conformations. The current model outputs a representation embedding.

Compound

Single

Value

List

Float

Pretrained

Fingerprint

https://github.com/ersilia-os/eos39co

https://openreview.net/forum?id=6K2RM6wVqKu

https://github.com/deepmodeling/Uni-Mol

GPL-3.0-only

Uni-Mol representation embedding

Representation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos39co

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos39co.zip

Local

22/7/2024

2024

qupkake

Ready

GitHub

Predict micro-pKa of organic molecules

QupKake is an innovative approach that combines graph neural network (GNN) models with semiempirical quantum mechanical (QM) features to forecast the micro-pKa values of organic molecules. QM has a significant role in both identifying reaction sites and predicting micro-pKa values. Precisely predicting micro-pKa values is vital for comprehending and adjusting the acidity and basicity of organic compounds. This has significant applications in drug discovery, materials science, and environmental c

Compound

Single

Value

List

Float

Pretrained

pKa

https://github.com/ersilia-os/eos3wzy

https://doi.org/10.1021/acs.jctc.4c00328

https://github.com/hutchisonlab/QupKake

BSD-3-Clause

Up to 10 pKa values for the molecule

Annotation

LauraGomezjurado

https://github.com/LauraGomezjurado

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3wzy.zip

Local

17/7/2024

2024

mpro-covid19

Ready

GitHub

Predict bioactivity against Main Protease of SARS-CoV-2

MProPred predicts the efficacy of compounds against the main protease of SARS-CoV-2, which is a promising drug target since it processes polyproteins of SARS-CoV-2. This model uses PaDEL-Descriptor to calculate molecular descriptors of compounds. It is based on a dataset of 758 compounds that have inhibition efficacy against the Main Protease, as published in peer-reviewed journals between January, 2020 and August, 2021. Input compounds are compared to compounds in the dataset to measure molecul

Compound

Single

Value

Single

Float

Pretrained

COVID19

https://github.com/ersilia-os/eos3nn9

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10289339/

https://github.com/Nadimfrds/Mpropred

MIT

Gives the pIC50 values for each compound to compare their bioactivity against the main protease

Annotation

HarmonySosa

https://github.com/HarmonySosa

https://hub.docker.com/r/ersiliaos/eos3nn9

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3nn9.zip

Local

1/7/2024

2024
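The mpro-covid19 entry above reports pIC50 values. As a reminder of what that scale means, here is a minimal sketch of the standard conversion between an IC50 concentration and pIC50 (the negative base-10 logarithm of the IC50 in mol/L). The function names and the choice of nanomolar input are illustrative assumptions, not part of the model's code.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_nm * 1e-9)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 in nanomolar."""
    return 10 ** (-pic50) * 1e9


if __name__ == "__main__":
    # A 1 micromolar (1000 nM) inhibitor sits at roughly pIC50 = 6.
    print(round(ic50_nm_to_pic50(1000.0), 3))
```

Higher pIC50 means a more potent compound: each additional unit corresponds to a tenfold lower IC50.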

cardiotox-dictrank

Ready

GitHub

Cardiotoxicity Classifier

Prediction of drug-induced cardiotoxicity as a binary classification of cardiotoxicity risk. The probability score depicts risk of the compound being cardiotoxic. Classification is based on the chemical data such as SMILES representations of compounds and a variety of descriptors such as Morgan fingerprints and Mordred physicochemical descriptors that describe the molecular structure of the drug interactions. Biological data is also used including gene expression and cellular paintings after dru

Compound

Single

Score

Single

Float

Retrained

Cardiotoxicity

DrugBank

https://github.com/ersilia-os/eos1pu1

https://doi.org/10.1021/acs.jcim.3c01834

https://github.com/srijitseal/DICTrank

None

The model provides a probability score indicating the likelihood of a compound being cardiotoxic

Annotation

kurysauce

https://github.com/kurysauce

https://hub.docker.com/r/ersiliaos/eos1pu1

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1pu1.zip

Local

29/6/2024

2024

phakinpro

Ready

GitHub

Pharmacokinetics Profiler (PhaKinPro)

Pharmacokinetics Profiler (PhaKinPro) predicts the pharmacokinetic (PK) properties of drug candidates. It has been built using a manually curated database of 10,000 compounds with information for 12 PK endpoints. Each model provides a multi-classifier output for a single endpoint, along with a confidence estimate of the prediction and whether the query molecule is within the applicability domain of the model.

Compound

Single

Score

List

String

Pretrained

Microsomal stability

ADME

Metabolism

Half-life

Permeability

https://github.com/ersilia-os/eos39dp

https://pubs.acs.org/doi/10.1021/acs.jmedchem.3c02446

https://github.com/molecularmodelinglab/PhaKinPro

MIT

A list of several ADME predictions

Annotation

sucksido

https://github.com/sucksido

https://hub.docker.com/r/ersiliaos/eos39dp

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos39dp.zip

Local

3/5/2024

2024

whales-qmug

Ready

GitHub

WHALES similarity search on 600k molecules from Q-Mug

Search Q-Mug based on WHALES descriptors. Q-Mug is a subset of 600k bioactive molecules from ChEMBL. Three conformers are given for each molecule. WHALES is a simple descriptor useful for scaffold hopping.

Compound

Single

Compound

List

String

Pretrained

Similarity

https://github.com/ersilia-os/eos6ru3

https://link.springer.com/protocol/10.1007/978-1-0716-1209-5_2

https://github.com/ETHmodlab/scaffold_hopping_whales

GPL-3.0

The top 100 most similar molecules are returned, based on WHALES descriptors. 3D conformer generation is done internally.

Sampling

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos6ru3

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6ru3.zip

Local

22/4/2024

2024
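The whales-qmug entry above returns the top 100 most similar molecules by descriptor. The sketch below shows the general shape of such a top-k search over a small in-memory library. It assumes Euclidean distance between descriptor vectors, which the entry does not state, and the names `top_k_similar` and `library` are hypothetical.

```python
import heapq
import math


def top_k_similar(query, library, k=100):
    """Return up to k (name, distance) pairs closest to the query vector.

    `library` maps a molecule name to its descriptor vector. Euclidean
    distance is an assumption made for this sketch.
    """
    scored = ((math.dist(query, vector), name) for name, vector in library.items())
    return [(name, distance) for distance, name in heapq.nsmallest(k, scored)]


if __name__ == "__main__":
    toy_library = {"mol_a": [0.0, 0.0], "mol_b": [1.0, 1.0], "mol_c": [5.0, 5.0]}
    for name, distance in top_k_similar([0.0, 0.0], toy_library, k=2):
        print(name, round(distance, 3))
```

Using `heapq.nsmallest` avoids sorting the whole library when k is much smaller than the library size.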

reinvent4-libinvent

Ready

GitHub

REINVENT 4 LibInvent

REINVENT 4 LibInvent creates new molecules by appending R groups to a given input. If the input SMILES string contains specified attachment points, it is directly processed by LibInvent to generate new molecules. If no attachment points are given, the model tries to find potential attachment points and iterates through different combinations of these points. It passes each combination to LibInvent to generate new molecules.

Compound

Single

Compound

List

String

Pretrained

Similarity

https://github.com/ersilia-os/eos6ost

https://chemrxiv.org/engage/chemrxiv/article-details/65463cafc573f893f1cae33a

https://github.com/MolecularAI/REINVENT4

Apache-2.0

Model generates up to 1000 similar molecules per input molecule.

Sampling

ankitskvmdam

https://github.com/ankitskvmdam

https://hub.docker.com/r/ersiliaos/eos6ost

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6ost.zip

Local

18/4/2024

2024

cc-signaturizer-3d

Ready

GitHub

Chemical Checker Signaturizer 3D

Building on the Chemical Checker bioactivity signatures (available as eos4u6p), the authors use the relation between stereoisomers and bioactivity of over 1M compounds to train stereochemically-aware signaturizers that better describe small molecule bioactivity properties. In this implementation we provide the A1, A2, A3, B1, B4 and C3 signatures.

Compound

Single

Value

List

Float

Pretrained

Descriptor

Bioactivity profile

Embedding

https://github.com/ersilia-os/eos8aox

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-024-00867-4

https://gitlabsbnb.irbbarcelona.org/packages/signaturizer3d

MIT

2D projection of bioactivity signatures

Representation

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos8aox

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8aox.zip

Local

19/3/2024

2024

reinvent4-mol2mol-scaffold

Ready

GitHub

REINVENT 4 Mol2MolScaffold

Mol2MolScaffold uses REINVENT4's mol2mol scaffold prior and mol2mol scaffold generic prior to generate around 500 new molecules similar to the provided molecules. The generated molecules will be relatively similar to the input molecules.

Compound

Single

Compound

List

String

Pretrained

Similarity

https://github.com/ersilia-os/eos57bx

https://chemrxiv.org/engage/chemrxiv/article-details/65463cafc573f893f1cae33a

https://github.com/MolecularAI/REINVENT4

Apache-2.0

Model generates up to 500 similar molecules per input molecule.

Sampling

ankitskvmdam

https://github.com/ankitskvmdam

https://hub.docker.com/r/ersiliaos/eos57bx

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos57bx.zip

Local

8/3/2024

2024

erg-fingerprints

Ready

GitHub

ErG 2D Descriptors

The Extended Reduced Graph (ErG) approach uses the description of pharmacophore nodes to encode molecular properties, with the goal of correctly describing pharmacophoric properties, size and shape of molecules. It was benchmarked against Daylight fingerprints and outperformed them in 10 out of 11 cases. ErG descriptors are well suited for scaffold hopping approaches.

Compound

Single

Value

List

Float

Pretrained

Descriptor

Fingerprint

https://github.com/ersilia-os/eos5guo

https://pubs.acs.org/doi/10.1021/ci050457y

https://www.rdkit.org/docs/source/rdkit.Chem.rdReducedGraphs.html

BSD-3.0

Vector representing ErG fingerprint values

Representation

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos5guo

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5guo.zip

Local

6/3/2024

2024

whales-scaled

Ready

GitHub

WHALES scaled

Scaled version of the WHALES descriptors (see eos3ae6). WHALES are holistic molecular descriptors useful for scaffold hopping, based on 3D structure to facilitate natural product featurization. The scaling uses sklearn's RobustScaler trained on a random set of 100K molecules from ChEMBL.

Compound

Single

Value

List

Float

Pretrained

Natural product

Descriptor

https://github.com/ersilia-os/eos24ur

https://www.nature.com/articles/s42004-018-0043-x

https://github.com/grisoniFr/scaffold_hopping_whales

MIT

Scaled vector representation of a molecule

Representation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos24ur

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos24ur.zip

Local

5/3/2024

2024
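The whales-scaled entry above mentions robust scaling. As a rough illustration of the idea, the sketch below centres one descriptor column on its median and divides by its interquartile range, which is what a robust scaler does with default settings. It uses only the standard library rather than scikit-learn, and the function names are hypothetical.

```python
import statistics


def fit_robust_scaler(column):
    """Return (median, interquartile range) for one descriptor column."""
    q1, median, q3 = statistics.quantiles(column, n=4, method="inclusive")
    iqr = q3 - q1
    # Guard against constant columns so that scaling never divides by zero.
    return median, (iqr if iqr != 0 else 1.0)


def robust_scale(value, median, iqr):
    """Scale a single value using previously fitted statistics."""
    return (value - median) / iqr


if __name__ == "__main__":
    reference = [1, 2, 3, 4, 5]
    median, iqr = fit_robust_scaler(reference)
    print(robust_scale(5, median, iqr), robust_scale(100, median, iqr))
```

Because the median and interquartile range ignore extreme values, a few outlier molecules in the reference set barely shift the scaling, unlike mean and standard deviation.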

scaffold-decoration

Ready

GitHub

Scaffold decoration

Sequential Attachment-based Fragment Embedding (SAFE) is a novel notation system that improves upon traditional molecular string representations like SMILES. SAFE reframes SMILES strings as an unordered sequence of interconnected fragment blocks while maintaining compatibility with existing SMILES parsers. This streamlines complex molecular design tasks by facilitating autoregressive generation under various constraints. The effectiveness of SAFE is demonstrated by trai

Compound

Single

Compound

List

String

Pretrained

Compound generation

https://github.com/ersilia-os/eos2401

https://arxiv.org/pdf/2310.10773.pdf

https://github.com/datamol-io/safe/tree/main

Model generates up to 1000 new molecules from input molecule by replacing side chains of the scaffold

Sampling

Inyrkz

https://github.com/Inyrkz

https://hub.docker.com/r/ersiliaos/eos2401

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2401.zip

Local

20/2/2024

2024

dili-predictor

Ready

GitHub

Early prediction of Drug-Induced Liver Injury

The DILI-Predictor predicts 10 features related to DILI toxicity, including in-vivo, in-vitro and physicochemical parameters. It has been developed by the Broad Institute using the DILIst dataset (1020 compounds) from the FDA and achieved a balanced accuracy of 70% on a test set of 255 compounds held out from the same dataset. The authors show how the model can correctly predict compounds that are not toxic in humans despite being toxic in mice.

Compound

Single

Score

List

Float

Pretrained

Toxicity

Metabolism

https://github.com/ersilia-os/eos5gge

https://pubs.acs.org/doi/10.1021/acs.chemrestox.4c00015

https://github.com/Manas02/dili-pip

None

Prediction of 10 DILI-related endpoints. The most important is the first, DILI. Threshold for DILI active is set at 0.16 by the authors.

Annotation

Zainab-ik

https://github.com/Zainab-ik

https://hub.docker.com/r/ersiliaos/eos5gge

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5gge.zip

Local

19/2/2024

2024
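The dili-predictor entry above quotes a threshold of 0.16 for calling a compound DILI active. A minimal sketch of applying that cut-off to a list of predicted scores follows. Treating a score exactly equal to the threshold as active is an assumption, and `flag_dili` is a hypothetical helper, not part of the model.

```python
DILI_THRESHOLD = 0.16  # cut-off quoted in the dili-predictor entry


def flag_dili(scores, threshold=DILI_THRESHOLD):
    """Return True for each score at or above the threshold (assumed inclusive)."""
    return [score >= threshold for score in scores]


if __name__ == "__main__":
    predicted = [0.05, 0.16, 0.40]
    print(flag_dili(predicted))
```

A threshold well below 0.5 reflects a preference for catching potential liver toxicity early, at the cost of more false positives.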

admet-ai-prediction

Ready

GitHub

ADMET properties prediction

ADMET AI is a framework for carrying out fast batch predictions for ADMET properties. It is based on an ensemble of five Chemprop-RDKit models and has been trained on 41 tasks from the ADMET group in Therapeutics Data Commons (v0.4.1). Out of these 41 tasks, there are 31 classification tasks and 10 regression tasks. In addition, the output also contains 8 physicochemical properties, namely, molecular weight, logP, hydrogen bond acceptors, hydrogen bond donors, Lipinski's Rule of 5, QED, stereo c

Compound

Single

Score

Value

List

Float

Pretrained

ADME

Toxicity

https://github.com/ersilia-os/eos7d58

https://academic.oup.com/bioinformatics/article/40/7/btae416/7698030

https://github.com/swansonk14/admet_ai

MIT

ADMET outcomes, including physicochemical properties and classification tasks, as well as percentile normalizations based on the DrugBank chemical space.

Annotation

DhanshreeA

https://github.com/DhanshreeA

https://eos7d58-awe6b.ondigitalocean.app/

https://hub.docker.com/r/ersiliaos/eos7d58

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7d58.zip

https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos7d58

Local

Yes

7/2/2024

2024
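The admet-ai-prediction entry above mentions percentile normalizations based on the DrugBank chemical space. One simple way to read such a number is as the share of reference compounds whose value is at or below the query value. The sketch below implements that reading. The exact definition used by the model is not given here, so this is an assumption, and the reference values are made up for illustration.

```python
from bisect import bisect_right


def percentile_in_reference(value, sorted_reference):
    """Percentage of reference values that are less than or equal to `value`.

    `sorted_reference` must be sorted in ascending order.
    """
    if not sorted_reference:
        raise ValueError("reference set must not be empty")
    rank = bisect_right(sorted_reference, value)
    return 100.0 * rank / len(sorted_reference)


if __name__ == "__main__":
    # Hypothetical property values for a tiny reference set.
    reference = sorted([1.0, 2.0, 3.0, 4.0])
    print(percentile_in_reference(2.5, reference))
```

Expressing a prediction as a percentile makes it easier to judge whether a value is typical or unusual relative to known drugs.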

reinvent4-mol2mol-medium-similarity

Ready

GitHub

REINVENT 4 Mol2MolMediumSimilarity

The Mol2MolMediumSimilarity leverages REINVENT4's mol2mol medium similarity prior to generate up to 100 unique molecules. The generated molecules will be relatively similar to the input molecule.

Compound

Single

Compound

List

String

Pretrained

Similarity

https://github.com/ersilia-os/eos694w

https://chemrxiv.org/engage/chemrxiv/article-details/65463cafc573f893f1cae33a

https://github.com/MolecularAI/REINVENT4

Apache-2.0

Model generates up to 100 similar molecules per input molecule.

Sampling

ankitskvmdam

https://github.com/ankitskvmdam

https://hub.docker.com/r/ersiliaos/eos694w

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos694w.zip

Local

7/2/2024

2024

antibiotics-ai-cytotox

Ready

GitHub

Human cytotoxicity endpoints

The authors tested the dataset of 39312 compounds used to train the antibiotics-ai model (eos18ie) against several cytotoxicity endpoints: human liver carcinoma cells (HepG2), human primary skeletal muscle cells (HSkMCs) and human lung fibroblast cells (IMR-90). Cellular viability was measured after 2-3 days of treatment with each compound at 10 μM and activities were binarized using a 90% cell viability cut-off. 341 (8.5%), 490 (3.8%) and 447 (8.8%) compounds classified as cytotoxic for HepG2

Compound

Single

Score

List

Float

Pretrained

Cytotoxicity

https://github.com/ersilia-os/eos42ez

https://www.nature.com/articles/s41586-023-06887-8

https://github.com/felixjwong/antibioticsai

MIT

Predicting cytotoxicity in human liver carcinoma cells (HepG2), human primary skeletal muscle cells (HSkMCs) and human lung fibroblast cells (IMR-90)

Annotation

Richiio

https://github.com/Richiio

https://hub.docker.com/r/ersiliaos/eos42ez

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos42ez.zip

Local

Yes

5/2/2024

2024

inter-dili

Ready

GitHub

InterDILI: drug-induced injury prediction

This model has been trained on a publicly available collection of 5 datasets manually curated for drug-induced liver injury (DILI). The DILI outcome has been binarised, and ECFP descriptors, together with physicochemical properties, have been used to train a random forest classifier which achieves AUROC > 0.9.

Compound

Single

Score

Single

Float

Retrained

Toxicity

Human

Metabolism

https://github.com/ersilia-os/eos21q7

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-023-00796-8

https://github.com/bmil-jnu/InterDILI

None

Probability of Drug-Induced Liver Injury (DILI); a higher score indicates higher risk

Annotation

leilayesufu

https://github.com/leilayesufu

https://hub.docker.com/r/ersiliaos/eos21q7

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos21q7.zip

Local

30/1/2024

2024

antibiotics-ai-saureus

Ready

GitHub

Antibiotic activity prediction against Staphylococcus aureus

The authors use a mid-size dataset (more than 30k compounds) to train an explainable graph-based model to identify potential antibiotics with low cytotoxicity. The model uses a substructure-based approach to explore the chemical space. Using this method, they were able to screen 283 compounds and identify a candidate active against methicillin-resistant S. aureus (MRSA) and vancomycin-resistant enterococci.

Compound

Single

Score

Single

Float

Pretrained

Antimicrobial activity

ESKAPE

https://github.com/ersilia-os/eos18ie

https://www.nature.com/articles/s41586-023-06887-8

https://github.com/felixjwong/antibioticsai

MIT

Probability of growth inhibition (80% cut off at 50uM)

Annotation

Richiio

https://github.com/Richiio

https://hub.docker.com/r/ersiliaos/eos18ie

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos18ie.zip

Local

Yes

26/1/2024

2024

scaffold-morphing

Ready

GitHub

Scaffold morphing

Compound

Single

Compound

List

String

Pretrained

Compound generation

https://github.com/ersilia-os/eos8bhe

https://arxiv.org/pdf/2310.10773.pdf

https://github.com/datamol-io/safe/tree/main

Model generates new molecules from input molecule by replacing core structures of input molecule.

Sampling

Inyrkz

https://github.com/Inyrkz

https://hub.docker.com/r/ersiliaos/eos8bhe

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8bhe.zip

Local

12/1/2024

2024

ngonorrhoeae-inhibition

Ready

GitHub

Growth Inhibitors of Neisseria gonorrhoeae

The authors curated a dataset of 282 compounds from ChEMBL, of which 160 (56.7%) were labeled as active N. gonorrhoeae inhibitor compounds. They used this dataset to build a naïve Bayesian model and used it to screen a commercial library. With this method, they identified and validated two hits. We have used the dataset to build a model using LazyQSAR with Ersilia Compound Embeddings as molecular descriptors. LazyQSAR is an AutoML Ersilia-developed library.

Compound

Single

Score

Single

Float

Retrained

Antimicrobial activity

ChEMBL

N.gonorrhoeae

https://github.com/ersilia-os/eos5cl7

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8274436/

https://github.com/ersilia-os/lazy-qsar

GPL-3.0

Probability of activity for the inhibition of the pathogen N. gonorrhoeae

Annotation

Richiio

https://github.com/Richiio

https://hub.docker.com/r/ersiliaos/eos5cl7

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5cl7.zip

Local

Yes

3/1/2024

2024

hdac3-inhibition

Ready

GitHub

Identifying HDAC3 inhibitors

The model predicts the inhibitory potential of small molecules against Histone deacetylase 3 (HDAC3), a relevant human target for cancer, inflammation, neurodegenerative diseases and diabetes. The authors have used a dataset of 1098 compounds from ChEMBL and validated the model using the benchmark MUBD-HDAC3.

Compound

Single

Score

Single

Float

Pretrained

Cancer

ChEMBL

https://github.com/ersilia-os/eos1n4b

https://onlinelibrary.wiley.com/doi/10.1002/minf.202000105

https://github.com/jwxia2014/HDAC3i-Finder

GPL-3.0

Probability that the molecule is a HDAC3 inhibitor

Annotation

Richiio

https://github.com/Richiio

https://hub.docker.com/r/ersiliaos/eos1n4b

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1n4b.zip

Local

14/12/2023

2023

mrlogp

Ready

GitHub

MRlogP: neural network-based logP prediction for druglike small molecules

The authors use a two-step approach to build a model that accurately predicts the lipophilicity (LogP) of small molecules. First, they train the model on a large amount of low accuracy predicted LogP values and then they fine-tune the network using a small, accurate dataset of 244 druglike compounds. The model achieves an average root mean squared error of 0.988 and 0.715 against druglike molecules from Reaxys and PHYSPROP.

Compound

Single

Value

Single

Float

Pretrained

Lipophilicity

LogP

https://github.com/ersilia-os/eos9ym3

https://www.mdpi.com/2227-9717/9/11/2029/htm

https://github.com/JustinYKC/MRlogP

MIT

Predicted LogP of small molecules

Annotation

leilayesufu

https://github.com/leilayesufu

https://hub.docker.com/r/ersiliaos/eos9ym3

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9ym3.zip

Local

12/12/2023

2023
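The mrlogp entry above reports performance as root mean squared error (RMSE). For readers who want to compare predicted and measured LogP values of their own, here is a minimal sketch of that metric. The function name and the example numbers are illustrative only and are not taken from the paper.

```python
import math


def rmse(predicted, measured):
    """Root mean squared error between two equally long sequences."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Hypothetical predicted vs. measured LogP values.
    print(round(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]), 4))
```

RMSE is in the same units as LogP, so a value near 1 means predictions are typically off by about one log unit.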

dmpnn-herg

Ready

GitHub

Prediction of hERG channel blockers with directed message passing neural networks

This model leverages the ChemProp network (D-MPNN) to build a predictor of hERG-mediated cardiotoxicity. The model has been trained using a published dataset which contains 7889 molecules with several cut-offs for hERG blocking activity. The authors select a 10 uM cut-off. This implementation of the model does not use any specific featurizer, though the authors suggest the moe206 descriptors (closed-source) improve performance even further.

Compound

Single

Score

Single

Float

Pretrained

Cardiotoxicity

hERG

Toxicity

Descriptor

https://github.com/ersilia-os/eos30f3

https://pubs.rsc.org/en/content/articlehtml/2022/ra/d1ra07956e

https://github.com/AI-amateur/DMPNN-hERG

None

Probability of blocking hERG (cut-off: 10uM)

Annotation

leilayesufu

https://github.com/leilayesufu

https://hub.docker.com/r/ersiliaos/eos30f3

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos30f3.zip

Local

4/12/2023

2023

chemprop-burkholderia

Ready

GitHub

Burkholderia cenocepacia inhibition

Prediction of antimicrobial potential using a dataset of 29537 compounds screened against the antibiotic resistant pathogen Burkholderia cenocepacia. The model uses the Chemprop Directed Message Passing Neural Network (D-MPNN) and has an AUC score of 0.823 for the test set. It has been used to virtually screen the FDA approved drugs as well as a natural product collection (>200k compounds) with hit rates of 26% and 12% respectively.

Compound

Single

Score

Single

Float

Pretrained

Antimicrobial activity

https://github.com/ersilia-os/eos5xng

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9624395/

https://github.com/cardonalab/Prediction-of-ATB-Activity

GPL-3.0

Probability that a compound inhibits the drug resistant bacteria Burkholderia cenocepacia. Scores range from 0 to 1, with 1 indicating the highest probability for growth inhibitory activity.

Annotation

Richioo

https://github.com/Richioo

https://hub.docker.com/r/ersiliaos/eos5xng

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5xng.zip

Local

Yes

3/12/2023

2023

pgmg-pharmacophore

Ready

GitHub

Pharmacophore-guided molecular generation

Based on a molecule's pharmacophore, this model generates new molecules de-novo to match the pharmacophore. Internally, pharmacophore hypotheses are generated for a given ligand. A graph neural network encodes spatially distributed chemical features and a transformer decoder generates molecules.

Compound

Single

Compound

List

String

Pretrained

Chemical graph model

Compound generation

https://github.com/ersilia-os/eos69e6

https://www.nature.com/articles/s41467-023-41454-9

https://github.com/CSUBioGroup/PGMG

MIT

Model generates new molecules from input molecule by first creating pharmacophore hypotheses and then constraining generation.

Sampling

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos69e6

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos69e6.zip

Local

1/12/2023

2023

morgan-binary-fps

Ready

GitHub

Morgan fingerprints in binary form (radius 3, 2048 dimensions)

The Morgan Fingerprints are one of the most widely used molecular representations. They are circular representations (starting from an atom, the surrounding atoms within a radius n are searched) and can have thousands of features. This implementation uses the RDKit package and is done with radius 3 and 2048 dimensions, providing a binary vector as output. For Morgan counts, see model eos5axz.

Compound

Single

Value

List

Integer

Pretrained

Descriptor

Fingerprint

https://github.com/ersilia-os/eos4wt0

https://pubmed.ncbi.nlm.nih.gov/20426451/

https://www.rdkit.org/docs

BSD-3.0

Binary vector representing the SMILES

Representation

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos4wt0

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4wt0.zip

Local

1/12/2023

2023

pmapper-3d

Ready

GitHub

3D pharmacophore descriptor

The pharmacophore mapper (pmapper) identifies common 3D pharmacophores of active compounds against a specific target and uniquely encodes them with hashes suitable for fast identification of identical pharmacophores. The obtained signatures are amenable for downstream ML tasks.

Compound

Single

Value

List

Integer

Pretrained

Descriptor

Fingerprint

https://github.com/ersilia-os/eos4x30

https://www.mdpi.com/1422-0067/20/23/5834

https://github.com/DrrDom/pmapper

BSD-3.0

Vector representation of pharmacophores

Representation

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos4x30

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4x30.zip

Local

28/11/2023

2023

molfeat-usrcat

Ready

GitHub

USR descriptors with pharmacophoric constraints

USRCAT is a real-time ultrafast molecular shape recognition with pharmacophoric constraints. It integrates atom type to the traditional Ultrafast Shape Recognition (USR) descriptor to improve the performance of shape-based virtual screening, being able to discriminate between compounds with similar shape but distinct pharmacophoric features.

Compound

Single

Value

List

Float

Pretrained

Descriptor

Embedding

https://github.com/ersilia-os/eos1ut3

https://jcheminf.biomedcentral.com/articles/10.1186/1758-2946-4-27

https://molfeat.datamol.io/featurizers/usrcat

Apache-2.0

60 features based on USRCAT

Representation

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos1ut3

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1ut3.zip

Local

28/11/2023

2023

antitb-seattle

Ready

GitHub

Antituberculosis activity prediction

Prediction of the activity of small molecules against Mycobacterium tuberculosis. This model has been developed by Ersilia thanks to the data provided by Seattle Children's (Dr. Tanya Parish research group). In vitro activity against M. tuberculosis was measured in a single-point inhibition assay (10000 molecules) and selected compounds (259) were assayed in MIC50 and MIC90 assays. Cut-offs have been determined according to the researcher's guidance.

Compound

Single

Compound

List

Float

In-house

M.tuberculosis

Antimicrobial activity

MIC90

Tuberculosis

https://github.com/ersilia-os/eos9ivc

https://pubmed.ncbi.nlm.nih.gov/30650074/

https://github.com/ersilia-os/lazy-qsar

GPL-3.0

Probability of inhibition of M.tb in vitro in the MIC50, MIC90 and whole cell assays at cut-offs 10 and 20 uM and 50%, respectively

Classification

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos9ivc

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9ivc.zip

Local

24/11/2023

2023

molpmofit

Ready

GitHub

Molecular Prediction Model Fine-Tuning (MolPMoFiT) encodings

Using self-supervised learning, the authors pre-trained a large model using one million unlabelled molecules from ChEMBL. This model can subsequently be fine-tuned for various QSAR tasks. Here, we provide the encodings for the molecular structures using the pre-trained model, not the fine-tuned QSAR models.

Compound

Single

Value

List

Float

Pretrained

Descriptor

Embedding

https://github.com/ersilia-os/eos9zw0

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00430-x

https://github.com/XinhaoLi74/MolPMoFiT

An embedding is obtained for each SMILES and represented as a matrix, where each row is the embedding vector of one SMILES character, with a dimension of 400. The pretrained model is loaded using the fastai library.

Representation

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos9zw0

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9zw0.zip

Local

6/11/2023

2023

moler-enamine-blocks

Ready

GitHub

Extending molecular scaffolds with building blocks

MoLeR is a graph-based generative model that combines fragment-based and atom-by-atom generation of new molecules with scaffold-constrained optimization. It does not depend on generation history and therefore MoLeR is able to complete arbitrary scaffolds. The model has been trained on the GuacaMol dataset. Here we sample the 300k building blocks library from Enamine.

Compound

Single

Compound

List

String

Pretrained

Chemical graph model

Compound generation

https://github.com/ersilia-os/eos633t

https://arxiv.org/abs/2103.03864

https://github.com/microsoft/molecule-generation

MIT

1000 new molecules are sampled for each input molecule, preserving its scaffold.

Sampling

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos633t

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos633t.zip

Local

3/11/2023

2023

small-world-wuxi

Ready

GitHub

Small World Wuxi search

Small World is an index of chemical space containing more than 230B molecular substructures. Here we use the Small World API to post a query to the SmallWorld server. We sample 100 molecules within a distance of 10 specifically for the Wuxi map, not the entire SmallWorld domain. Please check other small-world models available in our hub.

Compound

Single

Compound

List

String

Online

Similarity

https://github.com/ersilia-os/eos3kcw

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3606195/

https://pypi.org/project/smallworld-api/

MIT

List of 100 nearest neighbors

Sampling

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos3kcw

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3kcw.zip

Local

2/11/2023

2023

small-world-zinc

Ready

GitHub

Small World Zinc search

Small World is an index of chemical space containing more than 230B molecular substructures. Here we use the Small World API to post a query to the SmallWorld server. We sample 100 molecules within a distance of 10 specifically for the ZINC map, not the entire SmallWorld domain. Please check other small-world models available in our hub.

Compound

Single

Compound

List

String

Online

Similarity

https://github.com/ersilia-os/eos1d7r

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3606195/

https://pypi.org/project/smallworld-api/

MIT

List of 100 nearest neighbors

Sampling

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos1d7r

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1d7r.zip

Local

2/11/2023

2023

small-world-enamine-real

Ready

GitHub

Small World Enamine REAL search

Small World is an index of chemical space containing more than 230B molecular substructures. Here we use the Small World API to post a query to the SmallWorld server. We sample 100 molecules within a distance of 10 specifically for the Enamine REAL map, not the entire SmallWorld domain. Please check other small-world models available in our hub.

Compound

Single

Compound

List

String

Online

Similarity

https://github.com/ersilia-os/eos9ueu

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3606195/

https://pypi.org/project/smallworld-api/

MIT

List of 100 nearest neighbors

Sampling

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos9ueu

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9ueu.zip

Local

1/11/2023

2023

mycetos

Ready

GitHub

Inhibition of Eumycetoma from MycetOS

This model predicts the growth of the fungus M. mycetomatis, the causal agent of mycetoma, in the presence of small molecule drugs. It has been developed using the data from MycetOS, an open source initiative aiming at finding new patent-free drugs. The model has been trained using the LazyQSAR package (MorganBinaryClassifier) from Ersilia.

Compound

Single

Score

Single

Float

In-house

Mycetoma

Antifungal activity

https://github.com/ersilia-os/eos4f95

https://www.ijidonline.com/article/S1201-9712(20)31735-5/fulltext

https://github.com/ersilia-os/lazy-qsar

GPL-3.0

Probability of inhibition of M. mycetomatis (growth assay, cut-off at 20% growth)

Annotation

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos4f95

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4f95.zip

Local

27/9/2023

2023

hdac1-inhibition

Ready

GitHub

Inhibition of HDAC1

Prediction of the inhibition of the Human Histone Deacetylase 1 to revert HIV latency. The dataset is composed of all available pIC50 values from ChEMBL target 325, and the model has been developed using Ersilia's LazyQsar package (MorganBinaryClassifier)

Compound

Single

Score

List

Float

In-house

HIV

Human

HDAC1

https://github.com/ersilia-os/eos2zmb

https://www.ebi.ac.uk/chembl/target_report_card/CHEMBL325/

https://github.com/ersilia-os/lazy-qsar

GPL-3.0

Probability of inhibition of HDAC1 at cut-offs pIC50 7 (0.1uM) and 8 (10nM)

Annotation

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos2zmb

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2zmb.zip

Local

27/9/2023

2023

chembl-sampler

Ready

GitHub

ChEMBL Molecular Sampler

A simple sampler of the ChEMBL database using their API. It looks for similar molecules to the input molecule and returns a list of 100 molecules by default. This model has been developed by Ersilia. It posts queries to an online server.

Compound

Single

Compound

List

String

Pretrained

Similarity

https://github.com/ersilia-os/eos1noy

https://academic.oup.com/nar/article/40/D1/D1100/2903401

https://github.com/ersilia-os/chem-sampler/blob/main/chemsampler/samplers/chembl/sampler.py

GPL-3.0

100 nearest molecules in ChEMBL

Sampling

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos1noy

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1noy.zip

Local

4/9/2023

2023

hepg2-mmv

Ready

GitHub

HepG2 Toxicity - MMV

This model predicts the toxicity of small molecules in HepG2 cells. It has been developed by Ersilia thanks to data provided by MMV. We have used two cut-offs to define activity (5 and 10 uM respectively) with a dataset of 1335 molecules. 5-fold crossvalidation showed an AUROC of 0.8 and 0.77 respectively

Compound

Single

Probability

List

Float

In-house

Toxicity

Human

https://github.com/ersilia-os/eos3le9

https://ersilia.io

https://github.com/ersilia-os/lazy-qsar

GPL-3.0

Probability of toxicity in HepG2 cells. Cut-offs: 5 and 10 uM

Classification

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos3le9

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3le9.zip

Local

24/8/2023

2023

malaria-mmv

Ready

GitHub

Antimalarial activity (MMV)

Prediction of the in vitro antimalarial potential of small molecules. This model has been developed by Ersilia thanks to experimental data provided by MMV. The model provides the probability of inhibition of the malaria parasite (NF54) measured both as percentage of inhibition (with luminescence and LDH) and IC50. 5-fold crossvalidation of the models shows AUROC>0.75 in all models.

Compound

Single

Probability

Single

Float

In-house

Malaria

P.falciparum

IC50

https://github.com/ersilia-os/eos4rta

https://ersilia.io

https://github.com/ersilia-os/lazy-qsar

GPL-3.0

Probability of inhibiting the malaria parasite (strain NF54) in IC50 (threshold 1uM) and percentage of inhibition (50%, measured by LDH and Lum)

Classification

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos4rta

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4rta.zip

Local

24/8/2023

2023

schisto-swisstph

Ready

GitHub

Anti-schistosomiasis activity

Prediction of the activity of small molecules against the schistosoma parasite. This model has been developed by Ersilia thanks to the data provided by the Swiss TPH. In vitro activity against newly transformed schistosoma (nts) and adult worms was measured (% of inhibition of activity and IC50, respectively)

Compound

Single

Probability

List

Float

In-house

Neglected tropical disease

Schistosomiasis

IC50

https://github.com/ersilia-os/eos2l0q

https://pubmed.ncbi.nlm.nih.gov/30398059

https://github.com/ersilia-os/lazy-qsar

GPL-3.0

The probabilities of the molecule being active against schistosoma in the NTS stage (in a % of inhibition assay at cut-offs of 70 and 90% inhibition at 10uM) and the adult stage (in an IC50 assay at cut-offs 5 and 10uM)

Classification

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos2l0q

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2l0q.zip

Local

24/8/2023

2023

chemprop-abaumannii

Ready

GitHub

Inhibition of Acinetobacter baumannii growth

This model is a Chemprop neural network trained with a growth inhibition dataset. Authors screened ~7,500 molecules for those that inhibited the growth of A. baumannii in vitro. They discovered abaucin, an antibacterial compound with narrow-spectrum activity against A. baumannii.

Compound

Single

Score

Single

Float

Pretrained

A.baumannii

Antimicrobial activity

https://github.com/ersilia-os/eos3804

https://www.nature.com/articles/s41589-023-01349-8

https://github.com/GaryLiu152/chemprop_abaucin

None

Probability of growth inhibition of the bacterium A. baumannii (threshold > 80%)

Annotation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://eos3894-gz5nz.ondigitalocean.app/

https://hub.docker.com/r/ersiliaos/eos3804

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3804.zip

https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos3804

Online

Yes

23/8/2023

2023

pubchem-sampler

Ready

GitHub

PubChem Molecular Sampler

A simple sampler of the PubChem database using their API. It looks for similar molecules to the input molecule and returns a list of 100 molecules by default. This model has been developed by Ersilia and posts queries to an online server.

Compound

Single

Compound

List

String

Pretrained

Similarity

https://github.com/ersilia-os/eos2hzy

https://academic.oup.com/nar/article/51/D1/D1373/6777787

https://github.com/ersilia-os/chem-sampler/blob/main/chemsampler/samplers/pubchem/sampler.py

GPL-3.0

100 nearest molecules in PubChem

Similarity

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos2hzy

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2hzy.zip

Local

10/8/2023

2023

stoned-sampler

Ready

GitHub

Stoned Sampler

The STONED sampler uses small modifications to molecules represented as SELFIES to perform a search of the chemical space and generate new molecules. The use of string modifications in the SELFIES molecular representation bypasses the need for large amounts of data while maintaining a performance comparable to deep generative models.

Compound

Single

Compound

List

String

Pretrained

Compound generation

https://github.com/ersilia-os/eos8fma

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8153210/

https://github.com/aspuru-guzik-group/stoned-selfies

Apache-2.0

Up to 1000 derivatives of the input molecule

Generative

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos8fma

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8fma.zip

Local

8/8/2023

2023

smiles-pe

Ready

SmilesPE: tokenizer algorithm for SMILES, DeepSMILES, and SELFIES

The SMILES Pair Encoding method generates SMILES substring tokens based on high-frequency token pairs from large chemical datasets. This method is well suited to both QSAR tasks and generative models. The model provided here has been pretrained using ChEMBL.

Compound

Single

Compound

Flexible List

String

Pretrained

Chemical language model

Chemical notation

ChEMBL

https://github.com/ersilia-os/eos1mxi

https://pubs.acs.org/doi/abs/10.1021/acs.jcim.0c01127

https://github.com/XinhaoLi74/SmilesPE

Apache-2.0

A data-driven tokenization method for SMILES-based deep learning models in cheminformatics, demonstrating high performance in molecular generation and QSAR prediction tasks compared to atom-level tokenization

Generative

Richiio

https://github.com/Richiio

https://hub.docker.com/r/ersiliaos/eos1mxi

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1mxi.zip

Local

2/8/2023

2023

osm-series4

Ready

Antimalarial activity from OSM

This model predicts the antimalarial potential of small molecules in vitro. We have collected the data available from the Open Source Malaria Series 4 molecules and used two cut-offs to define activity, 1 uM and 2.5 uM. The training has been done with the LazyQSAR package (Morgan Binary Classifier) and shows an AUROC >0.8 in a 5-fold cross-validation on 20% of the data held out as test. These models have been used to generate new series 4 candidates by Ersilia.

Compound

Single

Probability

List

Float

Pretrained

Malaria

P.falciparum

IC50

https://github.com/ersilia-os/eos7yti

https://pubs.acs.org/doi/10.1021/acscentsci.6b00086

https://github.com/ersilia-os/lazy-qsar

GPL-3.0

Probability of killing P.falciparum in vitro (IC50 < 1uM and 2.5uM, respectively)

Classification

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos7yti

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7yti.zip

Local

2/8/2023

2023

fasmifra

Ready

FasmiFra molecule generator

FasmiFra is a molecular generator based on (deep)SMILES fragments. The authors use Deep SMILES to ensure the generated molecules are syntactically valid, and by working on string operations they are able to obtain high performance (>340,000 molecule/s). Here, we use 100k compounds from ChEMBL to sample fragments. Only assembled molecules containing one of the fragments of the input molecule are retained.

Compound

Single

Compound

List

String

Pretrained

Compound generation

https://github.com/ersilia-os/eos4qda

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00566-4

https://github.com/UnixJunkie/FASMIFRA

GPL-3.0

1000 generated molecules per each input

Generative

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos4qda

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4qda.zip

Local

1/8/2023

2023

malaria-mam

Ready

Antimalarial activity for sexual stage and asexual blood stage (ABS)

Prediction of the antimalarial potential of small molecules using data from various chemical libraries that were screened against the asexual and sexual (gametocyte) stages of the parasite. Several compounds' molecular fingerprints were used to train machine learning models to recognize stage-specific active and inactive compounds.

Compound

Single

Score

List

Float

Pretrained

Malaria

P.falciparum

https://github.com/ersilia-os/eos80ch

https://pubs.acs.org/doi/10.1021/acsomega.3c05664

https://github.com/M2PL

GPL-3.0

Probability of inhibition of the malaria parasite growth

Annotation

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos80ch

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos80ch.zip

Local

Yes

10/7/2023

2023

ncats-cyp3a4

Ready

CYP3A4 metabolism

Analysis of metabolic stability, determining the inhibition of CYP3A4 activity and whether the compounds are a substrate for the CYP3A4 enzyme. The data used to build these models is publicly available at PubChem (AID1645840, AID1645841, AID1645842) from ADME@NCATS.

Compound

Single

Probability

List

Float

Pretrained

CYP450

ADME

Metabolism

https://github.com/ersilia-os/eos3ev6

https://dmd.aspetjournals.org/content/49/9/822

https://github.com/ncats/ncats-adme

None

Probability of inhibiting the enzyme and probability of being a substrate of the enzyme. Activity in both indicates the compound is a ligand of the enzyme.

Classification

ZakiaYahya

https://github.com/ZakiaYahya

https://hub.docker.com/r/ersiliaos/eos3ev6

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3ev6.zip

Local

Yes

6/7/2023

2023

ncats-cyp2d6

Ready

CYP2D6 metabolism

Analysis of metabolic stability, determining the inhibition of CYP2D6 activity and whether the compounds are a substrate for the CYP2D6 enzyme. The data used to build these models is publicly available at PubChem (AID1645840, AID1645841, AID1645842) from ADME@NCATS.

Compound

Single

Probability

List

Float

Pretrained

CYP450

ADME

Metabolism

https://github.com/ersilia-os/eos7nno

https://dmd.aspetjournals.org/content/49/9/822

https://github.com/ncats/ncats-adme

None

Probability of inhibiting the enzyme and probability of being a substrate of the enzyme. Activity in both indicates the compound is a ligand of the enzyme.

Classification

ZakiaYahya

https://github.com/ZakiaYahya

https://hub.docker.com/r/ersiliaos/eos7nno

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7nno.zip

Local

Yes

6/7/2023

2023

ncats-cyp2c9

Ready

CYP2C9 metabolism

Analysis of metabolic stability, determining the inhibition of CYP2C9 activity and whether the compounds are a substrate for the CYP2C9 enzyme. The data used to build these models is publicly available at PubChem (AID1645840, AID1645841, AID1645842) from ADME@NCATS.

Compound

Single

Probability

List

Float

Pretrained

CYP450

ADME

Metabolism

https://github.com/ersilia-os/eos5jz9

https://dmd.aspetjournals.org/content/49/9/822

https://github.com/ncats/ncats-adme

None

Probability of inhibiting the enzyme and probability of being a substrate of the enzyme. Activity in both indicates the compound is a ligand of the enzyme.

Classification

ZakiaYahya

https://github.com/ZakiaYahya

https://hub.docker.com/r/ersiliaos/eos5jz9

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5jz9.zip

Local

Yes

5/7/2023

2023

bidd-molmap-fingerprint

Ready

Molecular fingerprint maps based on broadly learned knowledge-based representations

Molecular representation of small molecules via fingerprint-based molecular maps (images). Typically, the goal is to use these images as inputs for an image-based deep learning model such as a convolutional neural network. The authors have demonstrated high performance of MolMap out-of-the-box with a broad range of tasks from MoleculeNet.

Compound

Single

Image

Descriptor

List

Float

Pretrained

Fingerprint

https://github.com/ersilia-os/eos59rr

https://www.nature.com/articles/s42256-021-00301-6

https://github.com/shenwanxiang/bidd-molmap

GPL-3.0

Image representation of a molecule. Each pixel represents a molecular feature (37 rows, 36 columns, flattened with reshape)

Representation

samuelmaina

https://github.com/samuelmaina

https://hub.docker.com/r/ersiliaos/eos59rr

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos59rr.zip

Local

3/7/2023

2023

h3d-virtual-screening-cascade-light

Ready

H3D virtual screening cascade light

This panel of models provides predictions for the H3D virtual screening cascade. It leverages the Ersilia Compound Embedding and FLAML. The H3D virtual screening cascade contains models for Mycobacterium tuberculosis and Plasmodium falciparum IC50 predictions, as well as ADME, cytotoxicity and solubility assays

Compound

Single

Probability

List

Float

In-house

Malaria

P.falciparum

Tuberculosis

M.tuberculosis

ADME

Cytotoxicity

Solubility

https://github.com/ersilia-os/eos7kpb

https://www.nature.com/articles/s41467-023-41512-2

https://github.com/ersilia-os/h3d-screening-cascade-models

GPL-3.0

The raw scores are the ones emerging from the FLAML model. The ones with the suffix _perc represent the percentile, on a 0-1 scale, over a ChEMBL dataset of 200k compounds.

Classification

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos7kpb

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7kpb.zip

Local

Yes

9/5/2023

2023

ersilia-compound-embedding

Ready

Ersilia Compound Embeddings

Bioactivity-aware chemical embeddings for small molecules. Using transfer learning, we have created a fast network that produces embeddings of 1024 features condensing physicochemical as well as bioactivity information. The network has been trained using the FS-Mol and ChEMBL datasets, and Grover, Mordred and ECFP descriptors.

Compound

Single

Descriptor

List

Float

In-house

Descriptor

Embedding

https://github.com/ersilia-os/eos2gw4

https://www.nature.com/articles/s41467-023-41512-2

https://github.com/ersilia-os/compound-embedding

GPL-3.0

Embedding of 1024 features representing a compound

Representation

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos2gw4

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2gw4.zip

Local

13/4/2023

2023

molfeat-chemgpt

Ready

ChemGPT-4.7

ChemGPT (4.7M params) is a language-based transformer model for generative molecular modeling, which was pretrained on the PubChem10M dataset. Pre-trained ChemGPT models are also robust, self-supervised representation learners that generalize to previously unseen regions of chemical space and enable embedding-based nearest-neighbor search.

Compound

Single

Descriptor

List

Float

Pretrained

Descriptor

Chemical language model

Chemical graph model

Embedding

https://github.com/ersilia-os/eos3cf4

https://chemrxiv.org/engage/chemrxiv/article-details/627bddd544bdd532395fb4b5

https://molfeat.datamol.io/featurizers/ChemGPT-4.7M

Apache-2.0

128 features based on a chemical language model

Representation

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos3cf4

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3cf4.zip

Local

11/4/2023

2023

molfeat-estate

Ready

Estate Molecular Descriptors

Electrotopological state (Estate) indices are numerical values computed for each atom in a molecule, and which encode information about both the topological environment of that atom and the electronic interactions due to all other atoms in the molecule

Compound

Single

Descriptor

List

Float

Pretrained

Fingerprint

Descriptor

https://github.com/ersilia-os/eos3zur

https://link.springer.com/article/10.1023/A:1015952613760

https://molfeat.datamol.io/featurizers/estate

Apache-2.0

79 Electrotopological features

Representation

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos3zur

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3zur.zip

Local

11/4/2023

2023

ncats-pampa74

Ready

Parallel Artificial Membrane Permeability Assay (PAMPA) 7

Parallel Artificial Membrane Permeability is an in vitro surrogate to determine the permeability of drugs across cellular membranes. PAMPA at pH 7.4 was experimentally determined in a dataset of 5,473 unique compounds by the NIH-NCATS. 50% of the dataset was used to train a classifier (SVM) to predict the permeability of new compounds, and validated on the remaining 50% of the data, rendering an AUC = 0.88. The Peff was converted to logarithmic, log Peff value lower than 2.0 were considered to h

Compound

Single

Probability

Single

Float

Pretrained

ADME

Permeability

LogP

https://github.com/ersilia-os/eos9tyg

https://slas-discovery.org/article/S2472-5552(22)06765-X/fulltext

https://github.com/ncats/ncats-adme

None

Probability of a compound being poorly permeable (logPeff < 1)

Classification

pauline-banye

https://github.com/pauline-banye

https://hub.docker.com/r/ersiliaos/eos9tyg

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9tyg.zip

Local

Yes

7/4/2023

2023

ncats-cyp450

Ready

CYP450 metabolism

Analysis of metabolic stability, determining the inhibition of CYP450 activity and whether the compounds are a substrate for the CYP450 enzymes. The data to build these models is publicly available at PubChem, AID1645840, AID1645841, AID1645842. The tested cyps include CYP2C9, CYP2D6 and CYP3A4.

Compound

Single

Probability

List

Float

Pretrained

CYP450

ADME

Metabolism

https://github.com/ersilia-os/eos44zp

https://dmd.aspetjournals.org/content/49/9/822

https://github.com/ncats/ncats-adme

None

Probability of inhibiting the enzyme and probability of being a substrate of the enzyme. Activity in both indicates the compound is a ligand of the enzyme.

Classification

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos44zp

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos44zp.zip

Local

Yes

6/4/2023

2023

qcrb-tb

Ready

QcrB Inhibition (M. tuberculosis)

The cytochrome bcc complex (QcrB) is a subunit of the mycobacterial cyt-bcc-aa3 oxidoreductase in the electron transport chain (ETC), and it has been suggested as a good M.tb target due to the bacteria's dependence on oxidative phosphorylation for its growth. The authors use a dataset of 352 molecules, of which 277 are classified as active (QIM < 1 uM), 58 as moderately active (1 < QIM < 20 uM) and 78 as inactive (QIM > 20 uM). QIM refers to quantification of intracellular mycobacteria.

Compound

Single

Other value

Single

Integer

Pretrained

M.tuberculosis

Antimicrobial activity

https://github.com/ersilia-os/eos24jm

https://pubs.acs.org/doi/full/10.1021/acsomega.2c01613

https://github.com/CoutinhoLab/Q-TB/

Class 1: active (QIM < 1uM), Class 2: moderately active (1 < QIM < 20uM), Class 3: inactive (QIM > 20uM)

Classification

GemmaTuron

https://github.com/GemmaTuron

https://hub.docker.com/r/ersiliaos/eos24jm

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos24jm.zip

Local

Yes

6/4/2023

2023

rxn-fingerprint

Ready

RXNFP - chemical reaction fingerprints

RXNFP uses a pre-trained BERT Language Model to transform a reaction represented as smiles into a fingerprint amenable for downstream applications. The authors show how the RXN-fps can be used to identify nearest neighbors on reaction datasets, or map the reaction space without knowing the reaction centers.

Compound

Single

Descriptor

Matrix

Float

Pretrained

Fingerprint

Embedding

Chemical synthesis

https://github.com/ersilia-os/eos6aun

https://www.nature.com/articles/s42256-020-00284-w

https://github.com/rxn4chemistry/rxnfp/tree/master/

MIT

Fingerprint of the reaction.

Representation

samuelmaina

https://github.com/samuelmaina

https://hub.docker.com/r/ersiliaos/eos6aun

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6aun.zip

Local

28/3/2023

2023

ncats-hlm

Ready

Human Liver Microsomal Stability

The Human Liver Microsomal assay takes into account the liver-mediated drug metabolism to assess the stability of a compound in the human body. The NIH-NCATS group took a proprietary dataset of 4300 compounds with its associated HLM (in vitro half-life; unstable ≤ 30 min, stable >30 min) and used it to train a classifier.

Compound

Single

Probability

Single

Float

Pretrained

Metabolism

ADME

Human

Microsomal stability

Half-life

https://github.com/ersilia-os/eos31ve

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00426-7

https://github.com/ncats/ncats-adme/tree/master

None

Probability of a compound being unstable in a HLM assay (half-life ≤ 30min)

Classification

pauline-banye

https://github.com/pauline-banye

https://hub.docker.com/r/ersiliaos/eos31ve

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos31ve.zip

Local

Yes

27/3/2023

2023

s2dv-hepg2-toxicity

Ready

S2DV HepG2 toxicity

The model uses Word2Vec, a natural language processing technique, to represent SMILES strings. The model was trained on <2000 small molecules with associated experimental HepG2 cytotoxicity data (IC50) to classify compounds as HepG2 toxic (IC50 <= 30 uM) or non-toxic. Data was gathered from the public repository ChEMBL.

Compound

Single

Experimental value

Single

Float

Pretrained

ChEMBL

IC50

Toxicity

https://github.com/ersilia-os/eos2fy6

https://pubmed.ncbi.nlm.nih.gov/35062019/

https://github.com/NTU-MedAI/S2DV

Apache-2.0

Probability of HepG2 Toxicity (IC50 < 30 uM)

Classification

emmakodes

https://github.com/emmakodes

https://hub.docker.com/r/ersiliaos/eos2fy6

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2fy6.zip

Local

27/3/2023

2023

hob-pre

Ready

Human oral bioavailability prediction

HobPre predicts the oral bioavailability of small molecules in humans. It has been trained using public data on ~1200 molecules (Falcón-Cano et al, 2020, complemented with other literature and ChEMBL compounds). The molecules were labeled according to two cut-offs: HOB > 20% and HOB > 50%, due to ongoing discussions as to which would be a more appropriate cut-off.

Compound

Single

Probability

List

Float

Pretrained

ADME

Solubility

Human

https://github.com/ersilia-os/eos2lqb

https://doi.org/10.1186/s13321-021-00580-6

https://github.com/whymin/HOB

None

Probability of a compound having high oral bioavailability (HOB >20% and HOB >50%)

Classification

HellenNamulinda

https://github.com/HellenNamulinda

https://hub.docker.com/r/ersiliaos/eos2lqb

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2lqb.zip

Local

Yes

27/3/2023

2023

redial-2020

Ready

SARS-CoV-2 antiviral prediction: REDIAL-2020

Predictor of several endpoints related to SARS-CoV-2. It provides predictions for Live Virus Infectivity, Viral Entry, Viral Replication, In Vitro Infectivity and Human Cell Toxicity using a combination of three models. Consensus results are obtained by averaging the predictions of the three models for each activity and toxicity endpoint. The models have been built using NCATS COVID19 data. Further details on result interpretations can be found here: https://drugcentral.org/Redial

Compound

Single

Probability

Single

Float

Pretrained

Sars-CoV-2

COVID19

Antiviral activity

https://github.com/ersilia-os/eos8fth

https://www.nature.com/articles/s42256-021-00335-w#Sec9

https://github.com/sirimullalab/redial-2020/tree/v1.0

MIT

The model returns the probability of 1 (active) in each assay. Good drugs are active in CPE, 3CL and are inactive in cytotox, hCYTOX and ACE2 and/or are active in at least one of the following: AlphaLISA, CoV-PPE, MERS-PPE, while inactive in the counter screen, respectively: TruHit, CoV-PPE_cs, MERS-PPE_cs.

Classification

Pradnya2203

https://github.com/Pradnya2203

https://hub.docker.com/r/ersiliaos/eos8fth

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8fth.zip

Local

Yes

27/3/2023

2023

s2dv-hbv

Ready

Inhibition of Hepatitis B virus

The model uses Word2Vec, a natural language processing technique, to represent SMILES strings. The model was trained on <4000 small molecules with associated experimental HBV inhibition data (IC50) to classify compounds as HBV inhibitors (IC50 <= 1 uM) or non-inhibitors. Data was gathered from the public repository ChEMBL.

Compound

Single

Experimental value

Single

Float

Pretrained

Antiviral activity

IC50

HBV

ChEMBL

https://github.com/ersilia-os/eos8lok

https://pubmed.ncbi.nlm.nih.gov/35062019/

https://github.com/NTU-MedAI/S2DV

Apache-2.0

Probability of inhibition of HBV (IC50 < 1uM)

Classification

emmakodes

https://github.com/emmakodes

https://hub.docker.com/r/ersiliaos/eos8lok

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8lok.zip

Local

Yes

24/3/2023

2023

ncats-hlcs

Ready

Human Liver Cytosolic Stability

The human liver cytosol stability model is used for predicting the stability of a drug in the cytosol of human liver cells, which is beneficial for identifying potential drug candidates early during the drug discovery process. If a drug compound is quickly absorbed, it may not reach the intended target in the body or become toxic. On the other hand, if a drug compound is too stable, it could accumulate and cause detrimental effects. The authors use an NCATS dataset of 1450 compounds screened in

Compound

Single

Probability

Single

Float

Pretrained

ADME

Metabolism

Human

Half-life

https://github.com/ersilia-os/eos9yy1

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00426-7

https://github.com/ncats/ncats-adme

None

Probability of a compound being unstable (half-life ≤ 30min) due to liver cells metabolism

Classification

pauline-banye

https://github.com/pauline-banye

https://hub.docker.com/r/ersiliaos/eos9yy1

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9yy1.zip

Local

Yes

1/3/2023

2023

idl-ppbopt

Ready

Human Plasma Protein Binding (PPB) of Compounds

IDL-PPB aims to obtain the plasma protein binding (PPB) values of a compound. Based on an interpretable deep learning model and using the algorithm fingerprinting (AFP) this model predicts the binding affinity of the plasma protein with the compound.

Compound

Single

Experimental value

Single

Float

Pretrained

Fraction bound

ADME

https://github.com/ersilia-os/eos22io

https://pubs.acs.org/doi/10.1021/acs.jcim.2c00297

https://github.com/Louchaofeng/IDL-PPBopt

GPL-3.0

This model receives SMILES as input and returns the fraction PPB, which measures the affinity of the compound's binding to plasma proteins. In their analysis of results, the authors define high affinity (fraction PPB > 80%), medium affinity (40% <= fraction PPB <= 80%) and low affinity (fraction PPB < 40%). Note: inorganics and salts are outside the applicability domain of the model, so for these compounds the output is Null.

Regression

carcablop

https://github.com/carcablop

https://hub.docker.com/r/ersiliaos/eos22io

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos22io.zip

Local

3/2/2023

2023

ncats-solubility

Ready

Aqueous Kinetic Solubility

Kinetic aqueous solubility (μg/mL) was experimentally determined using the same SOP in over 200 NCATS drug discovery projects. A final dataset of 11780 non-redundant molecules and their associated solubility was used to train an SVM classifier. Approximately half of the dataset has poor solubility (< 10 μg/mL), and two-thirds of these poorly soluble molecules report values of < 1 μg/mL. A subset of the data used is available at PubChem (AID 1645848).

Compound

Single

Probability

Single

Float

Pretrained

ADME

Solubility

https://github.com/ersilia-os/eos74bo

https://slas-discovery.org/article/S2472-5552(22)06765-X/fulltext

https://github.com/ncats/ncats-adme

None

Probability of a compound having poor solubility (< 10 μg/mL)

Classification

pauline-banye

https://github.com/pauline-banye

https://hub.docker.com/r/ersiliaos/eos74bo

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos74bo.zip

Local

Yes

31/1/2023

2023

ncats-pampa5

Ready

Parallel Artificial Membrane Permeability Assay 5

The Parallel Artificial Membrane Permeability Assay (PAMPA) is an in vitro surrogate to determine the permeability of drugs across cellular membranes. PAMPA at pH 5 was experimentally determined for a dataset of 5,473 unique compounds by NIH-NCATS. Half of the dataset was used to train a classifier (SVM) to predict the permeability of new compounds, and the model was validated on the remaining half, yielding an AUC of 0.88. Peff was converted to a logarithmic scale; log Peff values lower than 2.0 were considered to hav

Compound

Single

Probability

Single

Float

Pretrained

ADME

Permeability

LogP

https://github.com/ersilia-os/eos81ew

https://www.sciencedirect.com/science/article/pii/S0968089621005964

https://github.com/ncats/ncats-adme

None

Probability of a compound being poorly permeable (logPeff < 1)

Classification

pauline-banye

https://github.com/pauline-banye

https://hub.docker.com/r/ersiliaos/eos81ew

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos81ew.zip

Local

Yes

29/1/2023

2023

image-mol-gpcr

Ready

imagemol-gpcr

ImageMol is a Representation Learning Framework that utilizes molecule images for encoding molecular inputs as machine readable vectors for downstream tasks such as bio-activity prediction, drug metabolism analysis, or drug toxicity prediction. The approach utilizes transfer learning, that is, pre-training the model on massive unlabeled datasets to help it in generalizing feature extraction and then fine tuning on specific tasks. This model is fine tuned on 10 GPCR assays with the largest number

Compound

Single

Score

Single

Float

Pretrained

Target identification

GPCR

https://github.com/ersilia-os/eos93h2

https://www.nature.com/articles/s42256-022-00557-6

https://github.com/HongxinXiang/ImageMol

MIT

Binding activity prediction (as a regression task) for the following GPCR assays: 5HT1A, 5HT2A, AA1R, AA2AR, AA3R, CNR2, DRD2, DRD3, HRH3, OPRM

Regression

DhanshreeA

https://github.com/DhanshreeA

https://hub.docker.com/r/ersiliaos/eos93h2

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos93h2.zip

Local

25/1/2023

2023

datamol-smiles2canonical

Ready

Converter of SMILES to canonical SMILES, SELFIES, InChI and InChIKey forms

Using the Datamol package, the model receives a SMILES string as input, then sanitizes and standardizes the molecule to generate four outputs: canonical SMILES, SELFIES, InChI and InChIKey.

Compound

Single

Compound

Matrix

String

Pretrained

Chemical notation

https://github.com/ersilia-os/eos7qga

https://doc.datamol.io/stable/tutorials/Preprocessing.html

https://github.com/datamol-org/datamol

Apache-2.0

Compound represented in its canonical SMILES, SELFIES, InChI and InChIKey forms

Representation

carcablop

https://github.com/carcablop

https://hub.docker.com/r/ersiliaos/eos7qga

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7qga.zip

Local

25/1/2023

2023

image-mol-embeddings

Ready

Molecular representation learning

Representation Learning Framework that utilizes molecule images for encoding molecular inputs as machine readable vectors for downstream tasks such as bio-activity prediction, drug metabolism analysis, or drug toxicity prediction. The approach utilizes transfer learning, that is, pre-training the model on massive unlabeled datasets to help it in generalizing feature extraction and then fine tuning on specific tasks.

Compound

Single

Descriptor

Matrix

Float

Pretrained

Embedding

https://github.com/ersilia-os/eos4avb

https://www.nature.com/articles/s42256-022-00557-6

https://github.com/HongxinXiang/ImageMol

MIT

ImageMol embeddings of shape [1512], reshaped as a NumPy 1D array before serializing. These embeddings can be used as the input features of a fully connected classification or regression layer in a neural network.

Representation

DhanshreeA

https://github.com/DhanshreeA

https://hub.docker.com/r/ersiliaos/eos4avb

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4avb.zip

Local

25/1/2023

2023

sars-cov-2-antiviral-screen

Ready

SARS-CoV-2 antiviral screening

Compound

Single

Boolean

List

Integer

Pretrained

Sars-CoV-2

Antiviral activity

COVID19

https://github.com/ersilia-os/eos4cxk

https://www.nature.com/articles/s42256-022-00557-6

https://github.com/HongxinXiang/ImageMol

MIT

The output comprises binary classifications across thirteen assays: 3C-like enzymatic activity (3CL), ACE2 enzymatic activity (ACE2), Human Embryonic Kidney 293 cell line toxicity (HEK293), Human fibroblast toxicity (Human), MERS pseudotyped particle entry (MERS_PPE), MERS pseudotyped particle entry counterscreen (MERS_PPE_cs), SARS-CoV pseudotyped particle entry (Cov_PPE), SARS-CoV pseudotyped particle entry counterscreen (Cov_PPE_cs), SARS-CoV-2 cytopathic effect (COV2_CPE), SARS-CoV-2 cytopathic effect counterscreen (COV2_Cytotox), Spike-ACE2 protein-protein interaction (AlphaLISA), Spike-ACE2 protein-protein interaction counterscreen (TruHit), Transmembrane protease serine 2 enzymatic activity (TMPRSS2)

Classification

DhanshreeA

https://github.com/DhanshreeA

https://hub.docker.com/r/ersiliaos/eos4cxk

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4cxk.zip

Local

Yes

25/1/2023

2023

image-mol-bace

Ready

ImageMol human beta-secretase-1 (BACE-1) inhibition

This model has been developed using ImageMol, a deep learning model pretrained on 10 million unlabelled small molecules and fine-tuned in a second step to predict the binding of inhibitors to the human beta-secretase 1 (BACE-1) protein. The BACE-1 dataset from MoleculeNet contains 1522 compounds with their associated pIC50. A compound with pIC50 >= 7 is considered a BACE-1 inhibitor.

Compound

Single

Probability

Single

Float

Pretrained

BACE

Chemical graph model

MoleculeNet

https://github.com/ersilia-os/eos8c0o

https://www.nature.com/articles/s42256-022-00557-6

https://github.com/ChengF-Lab/ImageMol

MIT

Probability of BACE-1 inhibition (>0.5: Inhibitor). Compounds with pIC50 >= 7 are considered BACE-1 inhibitors

Classification

DhanshreeA

https://github.com/DhanshreeA

https://hub.docker.com/r/ersiliaos/eos8c0o

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8c0o.zip

Local

17/1/2023

2023

image-mol-hiv

Ready

ImageMol HIV growth inhibition

This model has been developed using ImageMol, a deep learning model pretrained on 10 million unlabelled small molecules and fine-tuned in a second step to predict the inhibition of the human immunodeficiency virus (HIV). The HIV dataset is from MoleculeNet and contains 43850 small molecules and their in vitro activity against HIV (CA - Confirmed active, CM - Confirmed moderately active, CI - Confirmed inactive). The classification was based on EC50 values and expert knowledge.

Compound

Single

Probability

Single

Float

Pretrained

HIV

Antiviral activity

MoleculeNet

https://github.com/ersilia-os/eos6hy3

https://www.nature.com/articles/s42256-022-00557-6

https://github.com/ChengF-Lab/ImageMol

MIT

Probability of HIV inhibition. Active compounds are considered those classified as CA/CM.

Classification

DhanshreeA

https://github.com/DhanshreeA

https://hub.docker.com/r/ersiliaos/eos6hy3

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6hy3.zip

Local

Yes

17/1/2023

2023

ncats-rlm

Ready

Rat liver microsomal stability

Hepatic metabolic stability is key to ensuring that a drug attains the desired concentration in the body. Rat Liver Microsomal (RLM) stability is a good approximation of a compound’s stability in the human body. NCATS has collected a proprietary dataset of 20216 compounds with their associated RLM stability (in vitro half-life; unstable ≤ 30 min, stable > 30 min) and used it to train a classifier based on an ensemble of several ML approaches (random forest, deep neural networks, graph convolutional neural

Compound

Single

Probability

Single

Float

Pretrained

Microsomal stability

Rat

ADME

Metabolism

Half-life

https://github.com/ersilia-os/eos5505

https://slas-discovery.org/article/S2472-5552(22)06765-X/fulltext

https://github.com/ncats/ncats-adme

None

Probability of a compound being unstable in RLM assay (half-life ≤ 30min)

Classification

pauline-banye

https://github.com/pauline-banye

https://hub.docker.com/r/ersiliaos/eos5505

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5505.zip

Local

Yes

12/1/2023

2023

smiles2iupac

Ready

STOUT: SMILES to IUPAC name translator

Small molecules are represented by a variety of machine-readable strings (SMILES, InChI, SMARTS, among others). In contrast, IUPAC (International Union of Pure and Applied Chemistry) names are devised for human readers. The authors trained a language translator model treating SMILES and IUPAC names as two different languages. 81 million SMILES were downloaded from PubChem and converted to SELFIES for model training. The corresponding IUPAC names for the 81 million SMILES were obtained with Che

Compound

Single

Text

Single

String

Pretrained

Chemical notation

Chemical language model

https://github.com/ersilia-os/eos4se9

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00512-4

https://github.com/Kohulan/Smiles-TO-iUpac-Translator

MIT

IUPAC name of a specific SMILES

Representation

carcablop

https://github.com/carcablop

https://hub.docker.com/r/ersiliaos/eos4se9

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4se9.zip

Local

9/1/2023

2023

drugtax

Ready

DrugTax: Drug taxonomy

DrugTax takes SMILES inputs and classifies molecules according to their taxonomy (organic or inorganic kingdom and their subclasses), using a 0/1 binary classification for each class. It generates a vector of 163 features including the taxonomy classification and other key information such as the number of carbons, nitrogens… These vectors can be used for subsequent molecular representation in chemoinformatic pipelines.

Compound

Single

Descriptor

List

Integer

Pretrained

Fingerprint

Descriptor

https://github.com/ersilia-os/eos24ci

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-022-00649-w

https://github.com/MoreiraLAB/DrugTax

GPL-3.0

A vector of 163 points, each one corresponding to a particular taxonomic or structural molecular feature

Representation

Femme-js

https://github.com/Femme-js

https://hub.docker.com/r/ersiliaos/eos24ci

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos24ci.zip

Local

3/1/2023

2023

meta-trans

Ready

MetaTrans: human drug metabolites

Small molecules are metabolized by the liver in what are known as phase I and phase II reactions. These can lead to reduced drug efficacy and the generation of toxic metabolites, causing serious side effects. This model predicts the human metabolites of small molecules using a molecular transformer pre-trained on general chemical reactions and fine-tuned to human metabolism. It provides up to 10 metabolites for each input molecule.

Compound

Single

Compound

List

String

Pretrained

Metabolism

https://github.com/ersilia-os/eos935d

https://pubs.rsc.org/en/content/articlelanding/2020/sc/d0sc02639e#fn1

https://github.com/KavrakiLab/MetaTrans

BSD-3.0

A maximum of 10 human metabolites generated from the input molecule

Generative

carcablop

https://github.com/carcablop

https://hub.docker.com/r/ersiliaos/eos935d

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos935d.zip

Local

20/12/2022

2022

crem-structure-generation

Ready

CReM fragment based structure generation

CReM (chemically reasonable mutations) is a fragment-based generative model that takes as input a small molecule, breaks it down into fragments and iteratively replaces them with other fragments from a database. It has three implementations: MUTATE (arbitrarily replaces one fragment with another one), GROW (arbitrarily replaces a hydrogen with another fragment) and LINK (replaces hydrogen atoms in two molecules to link them with a fragment). Here, we use a MUTATE and GROW approach, which prov

Compound

Single

Compound

List

String

Pretrained

Compound generation

https://github.com/ersilia-os/eos4q1a

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00431-w

https://github.com/DrrDom/crem

BSD-3.0

Up to 100 newly generated molecules

Generative

DhanshreeA

https://github.com/DhanshreeA

https://hub.docker.com/r/ersiliaos/eos4q1a

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4q1a.zip

Local

20/12/2022

2022

moler-enamine-fragments

Ready

Extending molecular scaffolds with fragments

Compound

Single

Compound

List

String

Pretrained

Chemical graph model

Compound generation

https://github.com/ersilia-os/eos9taz

https://arxiv.org/abs/2103.03864

https://github.com/microsoft/molecule-generation

MIT

1000 new molecules are sampled for each input molecule, preserving its scaffold.

Generative

anamika-yadav99

https://github.com/anamika-yadav99

https://hub.docker.com/r/ersiliaos/eos9taz

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9taz.zip

Local

16/11/2022

2022

molt5-smiles-to-caption

Ready

MolT5: translation between molecules and natural language

MolT5 (Molecular T5) is a self-supervised learning framework pretrained on unlabeled natural language text and molecule strings with two end goals: molecular captioning (given a molecule, generate its description) and text-based de novo molecular generation (given a description, propose a molecule that matches it). This implementation is focused on molecular captioning.

Compound

Single

Text

Single

String

Pretrained

Chemical language model

Chemical notation

https://github.com/ersilia-os/eos2rd8

https://arxiv.org/abs/2204.11817

https://github.com/blender-nlp/MolT5

None

Description of a molecule

Representation

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos2rd8

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2rd8.zip

Local

14/11/2022

2022

bayesian-drug-likeness

Ready

Drug-likeness prediction with Bayesian neural networks

To define drug-likeness, a set of 2136 approved drugs from DrugBank was taken as drug-like, and three negative datasets were selected from ZINC15 (19M), the Network of Organic Chemistry (6M) and ligands from the Protein Data Bank (13k), respectively. The drug dataset was combined with an equal subsampling of the negative dataset for each experiment, using five different molecular representations (Mold2, RDKit, MCS, EXFP4, Mol2Vec). We have re-trained it following the authors’ specifications.

Compound

Single

Probability

Single

Float

Retrained

Drug-likeness

https://github.com/ersilia-os/eos9sa2

https://www.nature.com/articles/s42256-020-0209-y

https://github.com/Nanotekton/drugability/tree/v0.1

Non-commercial

Drug-likeness probability

Classification

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos9sa2

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9sa2.zip

Local

9/11/2022

2022

molbloom

Ready

MolBloom: molecule purchasability in ZINC20

This model uses a Bloom filter to query the ZINC20 database to identify whether a molecule is purchasable. A Bloom filter is a space-efficient probabilistic data structure used to test whether an element is in a given set. Due to the nature of Bloom filters, false negatives are not possible (i.e., if the model returns False, the molecule is not purchasable). As stated by the author, if the model returns True the molecule is purchasable with an error rate of 0.0003 (according to the ZINC20 catalog).

Compound

Single

Boolean

Single

String

Pretrained

ZINC

Compound generation

https://github.com/ersilia-os/eos8a5g

https://github.com/whitead/molbloom/blob/main/CITATION.cff

https://github.com/whitead/molbloom

MIT

It returns a boolean (True/False) suggesting whether the molecule is commercially available or not.

Classification

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos8a5g

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8a5g.zip

Local

2/11/2022

2022

mesh-therapeutic-use

Ready

MeSH therapeutic use based on chemical structure

Drug function, defined as Medical Subject Headings (MeSH) “therapeutic use” is predicted based on the chemical structure. 6955 non-redundant molecules, pertaining to one of the twelve therapeutic use classes selected, were downloaded from PubChem and used to train a binary classifier. The model provides the probability that a molecule has one of the following therapeutic uses: antineoplastic, cardiovascular, central nervous system (CNS), anti-infective, gastrointestinal, anti-inflammatory, derma

Compound

Single

Probability

List

Float

In-house

Therapeutic indication

https://github.com/ersilia-os/eos238c

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819987/

https://github.com/jgmeyerucsd/drug-class

GPL-3.0

Probability that the molecule belongs to each therapeutic use specified.

Classification

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos238c

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos238c.zip

Local

17/10/2022

2022

admetlab-2

Ready

ADMETlab-2

ADMETLab2 is the improved version of ADMETLab, a suite of models for systematic evaluation of ADMET properties. ADMETLab2 provides predictions on 17 physicochemical properties, 13 medicinal chemistry properties, 23 ADME properties, 27 toxicity endpoints and 8 toxicophore rules. The code and training data are not released; using this model sends queries to the ADMETLab2 online server. The Ersilia Model Hub also offers ADMETLab (v1) as a downloadable package for IP-sensitive queries.

Compound

Single

Experimental value

Probability

List

Float

Online

Toxicity

ADME

Lipophilicity

Solubility

Permeability

https://github.com/ersilia-os/eos2v11

https://academic.oup.com/nar/article/49/W1/W5/6249611?login=false

https://admetmesh.scbdd.com/

Proprietary

Predicted relevant ADMET properties, Tox21 outcomes, physicochemical properties and drug-likeness. Outputs are of mixed type, including classification (labels) and continuous values.

Regression

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos2v11

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2v11.zip

Local

16/9/2022

2022

metabokiller

Ready

Carcinogenic potential of metabolites and small molecules

Carcinogenicity is a result of several potential effects on cells. This model predicts the carcinogenic potential of a small molecule based on its potential to induce cellular proliferation, genomic instability, oxidative stress, anti-apoptotic responses and epigenetic alterations. Metabokiller uses the Chemical Checker signaturizer to featurize the molecules, and the Lime package to provide interpretable results. Using Metabokiller, the authors screened a panel of human metabolites and exper

Compound

Single

Probability

List

Float

Pretrained

Toxicity

Cancer

Metabolism

https://github.com/ersilia-os/eos1579

https://doi.org/10.1038/s41589-022-01110-7

https://github.com/the-ahuja-lab/Metabokiller

Non-commercial

Probability that the molecule has each of the specified carcinogenic properties

Classification

brosular

https://github.com/brosular

https://hub.docker.com/r/ersiliaos/eos1579

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1579.zip

Local

30/8/2022

2022

bidd-molmap-desc

Ready

Molecular maps based on broadly learned knowledge-based representations

Molecular representation of small molecules via descriptor-based molecular maps (images). The fingerprint-based molecular maps are available at eos59rr. These images can be used as inputs for an image-based deep learning model such as a convolutional neural network. The authors have demonstrated high performance of MolMap out-of-the-box with a broad range of tasks from MoleculeNet.

Compound

Single

Image

Descriptor

Matrix

Float

Pretrained

Descriptor

https://github.com/ersilia-os/eos6m4j

https://www.nature.com/articles/s42256-021-00301-6

https://github.com/shenwanxiang/bidd-molmap

GPL-3.0

Image representation of a molecule. Each pixel represents a molecular feature

Generative

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos6m4j

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6m4j.zip

Local

25/8/2022

2022

maip-malaria

Ready

MAIP: antimalarial activity prediction

Prediction of the antimalarial potential of small molecules. This model is an ensemble of smaller QSAR models trained on proprietary data from various sources, up to a total of >7M compounds. The training sets belong to Evotec, Johns Hopkins, MRCT, MMV - St. Jude, AZ, GSK, and St. Jude Vendor Library. The code and training data are not released; using this model sends queries to the MAIP online server. The Ersilia Model Hub also offers MAIP-surrogate as a downloadable package for IP-sensitiv

Compound

Single

Score

Single

Float

Online

P.falciparum

Malaria

Antimicrobial activity

https://github.com/ersilia-os/eos4zfy

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00487-2

https://www.ebi.ac.uk/chembl/maip/

None

Higher score indicates higher antimalarial potential

Classification

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos4zfy

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4zfy.zip

Local

Yes

18/8/2022

2022

chembl-similarity

Ready

Similarity search in ChEMBL

Given a molecule, this model looks for its 100 nearest neighbors in the ChEMBL database, according to ECFP4 Tanimoto similarity. Due to size constraints, the model redirects queries to the ChEMBL server, so input molecules are sent online when using this model.

Compound

Single

Compound

List

String

Online

ChEMBL

Similarity

https://github.com/ersilia-os/eos2a9n

https://www.frontiersin.org/articles/10.3389/fchem.2020.00046/full

http://130.92.106.217:8080/chemblMuti.v1/

None

List of 100 nearest neighbors

Similarity

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos2a9n

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2a9n.zip

Local

18/8/2022

2022

medchem17-similarity

Ready

Similarity search in ChEMBL, DrugBank and UNPD

Given a molecule, this model looks for its 100 nearest neighbors, according to ECFP4 Tanimoto similarity, in the medicinal chemistry database ChEMBL17_DrugBank17_UNPD17. This combined database contains all the compounds from the three collections (DrugBank, ChEMBL22 and Universal Natural Product Directory (UNPD)) with up to 17 heavy atoms. It features a total of 128k compounds. The whole ChEMBL17_DrugBank17_UNPD17 database is not downloaded with the model; by using it you post queries to an online ser

Compound

Single

Compound

List

String

Online

Similarity

ChEMBL

DrugBank

https://github.com/ersilia-os/eos9c7k

https://onlinelibrary.wiley.com/doi/abs/10.1002/minf.201900031

https://gdb-medchem-simsearch.gdb.tools/

None

List of 100 nearest neighbors

Similarity

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos9c7k

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9c7k.zip

Local

18/8/2022

2022

gdbmedchem-similarity

Ready

GDBMedChem similarity search

The model looks for the 100 nearest neighbors of a given molecule, according to ECFP4 Tanimoto similarity, in the GDBMedChem database. GDBMedChem is a 10M molecule sample from GDB17, a database containing all the enumerated molecules of up to 17 heavy atoms (166.4B molecules). GDBMedChem compounds have reduced complexity and better synthetic accessibility than GDB17 but retain a high sp3 carbon fraction and natural product likeness, providing a database of diverse molecules for drug design. Th

Compound

Single

Compound

List

String

Online

Similarity

ChEMBL

https://github.com/ersilia-os/eos7jlv

https://onlinelibrary.wiley.com/doi/abs/10.1002/minf.201900031

https://gdb-medchem-simsearch.gdb.tools/

None

List of 100 nearest neighbors

Similarity

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos7jlv

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7jlv.zip

Local

18/8/2022

2022

gdbchembl-similarity

Ready

GDBChEMBL similarity search

The model looks for the 100 nearest neighbors of a given molecule, according to ECFP4 Tanimoto similarity, in the GDBChEMBL database. GDBChEMBL is a 10M-molecule sample of GDB17, a database containing all the enumerated molecules of up to 17 heavy atoms (166.4B molecules). GDBChEMBL compounds were selected using a ChEMBL-likeness score, with the objective of having a collection with higher synthetic accessibility and high bioactivity while maintaining continuous coverage of the GDB17 chemi

Compound

Single

Compound

List

String

Online

Similarity

ChEMBL

https://github.com/ersilia-os/eos4b8j

https://www.frontiersin.org/articles/10.3389/fchem.2020.00046/full

https://gdb-chembl-simsearch.gdb.tools/

None

List of 100 nearest neighbors

Similarity

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos4b8j

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4b8j.zip

Local

15/8/2022

2022
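The three similarity-search entries above (medchem17-similarity, gdbmedchem-similarity, gdbchembl-similarity) all rank database compounds by Tanimoto similarity to the query. The sketch below illustrates that ranking only. It assumes each fingerprint is given as a set of on-bit indices; the names `tanimoto`, `nearest_neighbors` and the toy library are hypothetical, and it does not reproduce the online servers these models query.

```python
from heapq import nlargest


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def nearest_neighbors(query: frozenset, library: dict, k: int = 100) -> list:
    """Return the k (similarity, name) pairs most similar to the query, best first."""
    scored = ((tanimoto(query, fp), name) for name, fp in library.items())
    return nlargest(k, scored)


# Hypothetical toy fingerprints standing in for ECFP4 bit sets.
library = {
    "mol_a": frozenset({1, 2, 3, 4}),
    "mol_b": frozenset({3, 4, 5}),
    "mol_c": frozenset({7, 8}),
}
query = frozenset({1, 2, 3})
hits = nearest_neighbors(query, library, k=2)
```

With real molecules, the bit sets would come from a cheminformatics toolkit, and `k=100` would match the list length these models return.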

chemical-vae

Ready

Variational autoencoder for small molecule generation

This variational autoencoder (VAE) for chemistry uses an encoder-decoder-predictor framework to predict new small molecules. The input SMILES molecule is converted into a continuous vector, and the decoder converts this molecular representation back to a discrete SMILES. These continuous molecular representations allow for simple operations to generate new chemical matter. The decoder is constrained to produce valid molecules. In addition, a predictor estimates the chemical properties of the mol

Compound

Single

Compound

List

String

Pretrained

Compound generation

https://github.com/ersilia-os/eos3ae7

https://pubs.acs.org/doi/10.1021/acscentsci.7b00572

https://github.com/aspuru-guzik-group/chemical_vae

Apache-2.0

Compounds generated based on the input molecule

Generative

brosular

https://github.com/brosular

https://hub.docker.com/r/ersiliaos/eos3ae7

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3ae7.zip

Local

13/8/2022

2022

chemnet-distance

Ready

FCD: Fréchet ChemNet Distance to evaluate generative models

The Fréchet ChemNet distance is a metric to evaluate generative models. It unifies, in a single score, whether the generated molecules are valid according to chemical and biological properties as well as their diversity from the training set. The score measures the Fréchet Inception Distance between molecules represented by ChemNet, a deep neural network trained to predict biological and chemical properties of small molecules.

Compound

Pair of Lists

Distance

Single

Float

Pretrained

Similarity

Bioactivity profile

Compound generation

https://github.com/ersilia-os/eos9be7

https://pubs.acs.org/doi/10.1021/acs.jcim.8b00234

https://github.com/bioinf-jku/FCD

LGPL-3.0

Fréchet ChemNet Distance (FCD). Higher FCD indicates higher difference to the training set

Similarity

brosular

https://github.com/brosular

https://hub.docker.com/r/ersiliaos/eos9be7

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9be7.zip

Local

12/8/2022

2022
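For reference, the Fréchet distance behind the chemnet-distance entry above can be written in its usual closed form, assuming the ChemNet activations of each molecule set are summarized by a Gaussian. The symbols are notation chosen here: $\mu_g, C_g$ are the mean and covariance of the activations for the generated molecules, and $\mu_r, C_r$ are those for the reference set.

```latex
d^2\big((\mu_g, C_g), (\mu_r, C_r)\big)
  = \lVert \mu_g - \mu_r \rVert_2^2
  + \operatorname{Tr}\!\big(C_g + C_r - 2\,(C_g C_r)^{1/2}\big)
```

The distance is zero when both sets have identical means and covariances, and it grows as the two distributions move apart.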

bayesherg

Ready

BayeshERG: hERG channel blockade

BayeshERG is a predictor of small molecule-induced blockade of the hERG ion channel. To increase its predictive power, the authors pretrained a Bayesian graph neural network with 300,000 molecules as a transfer learning exercise. The pretraining set was obtained from Du et al., 2015, and the fine-tuning dataset is a collection of 14,322 molecules from public databases (8488 positives and 5834 negatives). The model was validated on external datasets and experimentally, from 12 selected compounds (

Compound

Single

Probability

Single

Float

Pretrained

hERG

Toxicity

Cardiotoxicity

https://github.com/ersilia-os/eos4tcc

https://academic.oup.com/bib/article-abstract/23/4/bbac211/6609519

https://github.com/GIST-CSBL/BayeshERG

GPL-3.0

Probability of hERG channel blockade. The cut-off used in the training set to define hERG blockade was IC50 <= 10 μM

Classification

azycn

https://github.com/azycn

https://hub.docker.com/r/ersiliaos/eos4tcc

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4tcc.zip

Local

10/8/2022

2022

rexgen

Ready

Organic reaction outcome prediction

Utilizes a Weisfeiler-Lehman network (attentive mechanism) to predict the products of an organic reaction given the reactants. The model identifies the reaction centers (set of atoms/bonds that change from reactant to product) and obtains the products directly from a graph-based neural network.

Compound

List

Compound

Flexible List

String

Pretrained

Chemical synthesis

https://github.com/ersilia-os/eos5qfo

https://arxiv.org/pdf/1709.04555v3.pdf

https://github.com/connorcoley/rexgen_direct

GPL-3.0

Products of an organic reaction

Generative

svolk19-stanford

https://github.com/svolk19-stanford

https://hub.docker.com/r/ersiliaos/eos5qfo

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5qfo.zip

Local

8/8/2022

2022

deepsmiles

Ready

DeepSMILES, an alternate SMILES representation for deep learning

DeepSMILES converts a SMILES string to a more accurate syntax for molecule representation, taking into account both the branches (closed parenthesis in the SMILES strings) and rings (using a single symbol at ring closure that also indicates ring size). This syntax is particularly suitable in generative models, when the output is a SMILES string. With DeepSMILES, scientists can train a network using this new syntax, generate new molecules represented as DeepSMILES and then decode them back to nor

Compound

Single

Compound

Single

String

Pretrained

Chemical language model

Chemical notation

https://github.com/ersilia-os/eos2mrz

https://chemrxiv.org/engage/api-gateway/chemrxiv/assets/orp/resource/item/60c73ed6567dfe7e5fec388d/original/deep-smiles-an-adaptation-of-smiles-for-use-in-machine-learning-of-chemical-structures.pdf

https://github.com/baoilleach/deepsmiles

MIT

String representing a DeepSMILES

Representation

brosular

https://github.com/brosular

https://hub.docker.com/r/ersiliaos/eos2mrz

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2mrz.zip

Local

28/7/2022

2022

admetlab

Ready

ADMETlab models for evaluation of drug candidates

A series of models for the systematic ADMET evaluation of drug candidate molecules. Models include blood-brain barrier penetration; inhibition and substrate affinity for CYP1A2, CYP2C9, CYP2C19, CYP2D6, CYP3A4, and pgp; F 20% and F 30% bioavailability; human intestinal absorption; Ames mutagenicity; skin sensitization; plasma protein binding; volume distribution; LD50 of acute toxicity; human hepatotoxicity; hERG blocking; clearance; half-life; Papp (caco-2 permeability); LogD distribution coeff

Compound

Single

Experimental value

List

Float

Pretrained

ADME

Toxicity

Lipophilicity

Solubility

Permeability

https://github.com/ersilia-os/eos2re5

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-018-0283-x

https://github.com/ifyoungnet/ADMETlab

GPL-3.0

Regression models provide a numerical result (LogS (log mol/L), LogP (distribution coefficient), Papp (Caco-2 permeability in cm/s), PPB (%)). Classifications provide the probability of activity according to ADMETlab thresholds.

Classification

svolk19-stanford

https://github.com/svolk19-stanford

https://hub.docker.com/r/ersiliaos/eos2re5

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2re5.zip

Local

28/7/2022

2022

deepherg

Ready

Classification of hERG blockers and nonblockers

This model uses a multitask deep neural network (DNN) to predict the probability that a molecule is a hERG blocker. It was trained using 7889 compounds with experimental data available (IC50). The checkpoints of the pretrained model were not available, therefore we re-trained the model using the same method but without mol2vec featurization. Molecule featurization was instead done with Morgan fingerprints. Six models were tested, with several thresholds for negative decoys (10, 20, 40, 60, 80 and

Compound

Single

Probability

Single

Float

Retrained

Toxicity

hERG

Cardiotoxicity

https://github.com/ersilia-os/eos30gr

https://pubs.acs.org/doi/full/10.1021/acs.jcim.8b00769

https://github.com/ChengF-Lab/deephERG

None

Probability of hERG blockade. Actives are defined as IC50<10, inactives are defined as IC50>80

Classification

azycn

https://github.com/azycn

https://hub.docker.com/r/ersiliaos/eos30gr

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos30gr.zip

Local

22/7/2022

2022

aizynthfinder

Ready

Retrosynthesis planning

A tool for planning retrosynthesis of a target molecule based on template reactions and a stock of precursors. The algorithm breaks down the input molecule into purchasable blocks until it has been completely solved.

Compound

Single

Score

Flexible List

String

Float

Pretrained

Synthetic accessibility

Chemical synthesis

https://github.com/ersilia-os/eos526j

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00472-1

https://github.com/MolecularAI/aizynthfinder

MIT

The fraction of solved precursors and the number of reactions required for synthesis. Close to 1.0 for a solved compound, less than 0.8 for unsolved.

Generative

svolk19-stanford

https://github.com/svolk19-stanford

https://hub.docker.com/r/ersiliaos/eos526j

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos526j.zip

Local

19/7/2022

2022

selfies

Ready

SELF-referencIng Embedded Strings

String representation of small molecules that is more robust than SMILES, since, by design, all SELFIES strings are valid molecules. It is particularly helpful when applied in generative models, as all the SELFIES proposed are valid molecules. The authors also found that, in generative models, SELFIES produces more diverse molecules than SMILES.

Compound

Single

Compound

Single

String

Pretrained

Chemical notation

Chemical language model

Compound generation

https://github.com/ersilia-os/eos6pbf

https://arxiv.org/pdf/1905.13741

https://github.com/aspuru-guzik-group/selfies

Apache-2.0

String representation of a molecule (SELFIES)

Representation

brosular

https://github.com/brosular

https://hub.docker.com/r/ersiliaos/eos6pbf

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6pbf.zip

Local

14/7/2022

2022

pkasolver

Ready

Microstate pKa values

This model employs transfer learning with graph neural networks in order to predict micro-state pKa values of small molecules. The model enumerates the molecule's protonation states and predicts its pKa values. It was trained in two phases, first, using a large ChEMBL dataset and then fine-tuning the model for a small training set of molecules with available pKa values. The model in this repository is the pkasolver-light, which does not require an Epik license and is limited to monoprotic molecu

Compound

Single

Experimental value

Single

Float

Pretrained

pKa

ADME

https://github.com/ersilia-os/eos2b6f

https://www.biorxiv.org/content/10.1101/2022.01.20.476787v1

https://github.com/mayrf/pkasolver

MIT

Acidity of a molecule (lower pKa indicates stronger acid)

Regression

svolk19-stanford

https://github.com/svolk19-stanford

https://hub.docker.com/r/ersiliaos/eos2b6f

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2b6f.zip

Local

13/7/2022

2022

grover-qm8

Ready

Electronic spectra and excited state energy

Prediction of the electronic spectra and excited state energy of small molecules. The training set is the QM8 dataset from MoleculeNet, where the electronic properties have been calculated by multiple quantum mechanical methods. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Compound

Single

Other value

List

Float

Pretrained

MoleculeNet

Chemical graph model

Quantum properties

https://github.com/ersilia-os/eos3xip

https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html

https://github.com/tencent-ailab/grover

MIT

Predicted electronic spectra and excited state energy

Regression

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos3xip

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3xip.zip

Local

Yes

13/7/2022

2022

grover-qm7

Ready

Atomization energy of small molecules

The model predicts the atomization energy of a molecule. It has been trained using the QM7 dataset from MoleculeNet, a subset of GDB13 containing all molecules up to 23 atoms (7 heavy atoms + C, S, O, N). This dataset contains the computed atomization energy of 7165 molecules. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Compound

Single

Other value

Single

Float

Pretrained

MoleculeNet

Chemical graph model

Quantum properties

https://github.com/ersilia-os/eos6o0z

https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html

https://github.com/tencent-ailab/grover

MIT

Atomization energy of the molecule

Regression

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos6o0z

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6o0z.zip

Local

Yes

13/7/2022

2022

grover-lipo

Ready

Octanol/water distribution coefficient

Prediction of the octanol/water distribution coefficient (logD at pH 7.4), trained using the Lipophilicity dataset from MoleculeNet. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Compound

Single

Experimental value

Single

Float

Pretrained

MoleculeNet

Lipophilicity

ADME

LogD

Chemical graph model

https://github.com/ersilia-os/eos85a3

https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html

https://github.com/tencent-ailab/grover

MIT

Predicted logD at pH 7.4

Regression

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos85a3

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos85a3.zip

Local

Yes

13/7/2022

2022

grover-esol

Ready

Water solubility

Prediction of water solubility (log solubility in mols per litre) for common organic small molecules, trained using the ESOL dataset from MoleculeNet.

Compound

Single

Experimental value

Single

Float

Pretrained

Solubility

MoleculeNet

ADME

LogS

Chemical graph model

https://github.com/ersilia-os/eos8451

https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html

https://github.com/tencent-ailab/grover

MIT

Log Solubility (Mols/Litre)

Regression

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos8451

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8451.zip

Local

Yes

13/7/2022

2022

grover-freesolv

Ready

Hydration free energy of small molecules in water

Model based on experimental and calculated hydration free energy of small molecules in water, the FreeSolv dataset from MoleculeNet. Hydration free energies are relevant to understand the binding interaction between a molecule (in solution) and its binding site. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Compound

Single

Other value

Single

Float

Pretrained

MoleculeNet

Chemical graph model

Quantum properties

https://github.com/ersilia-os/eos157v

https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html

https://github.com/tencent-ailab/grover

MIT

Calculated Hydration Free energy in kcal/mol

Regression

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos157v

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos157v.zip

Local

Yes

13/7/2022

2022

grover-toxcast

Ready

ToxCast toxicity panel

Prediction across the ToxCast toxicity panel, containing hundreds of toxicity outcomes, as part of the MoleculeNet benchmark. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Compound

Single

Probability

List

Float

Pretrained

Toxicity

ToxCast

Chemical graph model

https://github.com/ersilia-os/eos481p

https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html

https://github.com/tencent-ailab/grover

MIT

Probability of toxicity against 617 biological targets

Classification

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos481p

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos481p.zip

Local

Yes

13/7/2022

2022

grover-bace

Ready

BACE-1 inhibition

Prediction of Beta-secretase 1 (BACE-1) inhibition. BACE-1 is expressed mainly in neurons and has been implicated in the development of Alzheimer's disease. This model has been trained on the BACE dataset from MoleculeNet using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Compound

Single

Probability

Single

Float

Pretrained

Alzheimer

BACE

MoleculeNet

Chemical graph model

https://github.com/ersilia-os/eos2mhp

https://arxiv.org/abs/2007.02835

https://github.com/tencent-ailab/grover

MIT

Probability that the molecule is a BACE-1 inhibitor (using a 0.1 uM cut-off)

Classification

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos2mhp

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2mhp.zip

Local

Yes

13/7/2022

2022

grover-clintox

Ready

Toxicity at clinical trial stage

Using the ClinTox dataset from MoleculeNet, the authors trained a classification model to predict the likelihood of failure in clinical trials due to toxicity. The dataset has been built using FDA-approved drugs (non-toxic) and a set of drugs that have failed at advanced clinical trial stages. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Compound

Single

Probability

List

Float

Pretrained

Toxicity

MoleculeNet

Chemical graph model

Side effects

https://github.com/ersilia-os/eos6fza

https://arxiv.org/abs/2007.02835

https://github.com/tencent-ailab/grover

MIT

Probability that a molecule is approved by the FDA and probability that a molecule shows toxicity in clinical trials

Classification

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos6fza

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6fza.zip

Local

Yes

13/7/2022

2022

grover-tox21

Ready

Predicts activity of compounds across the Tox21 panel

Predicts activity of compounds in the Tox21 toxicity panel, comprising 12 toxicity pathways, as part of the MoleculeNet benchmark datasets. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Compound

Single

Probability

List

Float

Pretrained

Tox21

Toxicity

Chemical graph model

https://github.com/ersilia-os/eos5smc

https://papers.nips.cc/paper/2020/file/94aef38441efa3380a3bed3faf1f9d5d-Paper.pdf

https://github.com/tencent-ailab/grover

MIT

Toxicity measurements against 12 biological targets

Classification

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos5smc

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5smc.zip

Local

Yes

12/7/2022

2022

sa-score

Ready

Synthetic accessibility score

Estimation of synthetic accessibility score (SAScore) of drug-like molecules based on molecular complexity and fragment contributions. The fragment contributions are based on a 1M sample from PubChem and the molecular complexity is based on the presence/absence of non-standard structural features. It has been validated comparing the SAScore and the estimates of medicinal chemist experts for 40 molecules (r2 = 0.89). The SAScore has been contributed to the RDKit Package.

Compound

Single

Score

Single

Float

Pretrained

Synthetic accessibility

Chemical synthesis

https://github.com/ersilia-os/eos9ei3

https://jcheminf.biomedcentral.com/articles/10.1186/1758-2946-1-8

https://github.com/rdkit/rdkit/tree/master/Contrib/SA_Score

BSD-3.0

Low scores indicate higher synthetic accessibility

Regression

miquelduranfrigola

https://github.com/miquelduranfrigola

https://eos9ei3-tkreo.ondigitalocean.app/

https://hub.docker.com/r/ersiliaos/eos9ei3

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9ei3.zip

https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos9ei3

Online

10/7/2022

2022

chemtb

Ready

Mycobacterium tuberculosis inhibitor prediction

Identification of active molecules against Mycobacterium tuberculosis using an ensemble of data from ChEMBL25 (Target IDs 360, 2111188 and 2366634). The final model is a stacking model integrating four algorithms, including support vector machine, random forest, extreme gradient boosting and deep neural networks.

Compound

Single

Probability

Single

Float

Pretrained

M.tuberculosis

IC50

Tuberculosis

Antimicrobial activity

https://github.com/ersilia-os/eos46ev

https://academic.oup.com/bib/article-abstract/22/5/bbab068/6209685

http://cadd.zju.edu.cn/chemtb/

None

Probability of M.tb inhibition (measured as IC50 at cut-off 5 uM)

Classification

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos46ev

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos46ev.zip

Local

Yes

28/6/2022

2022

ssl-gcn-tox21

Ready

Toxicity prediction across the Tox21 panel with semi-supervised learning

Toxicity prediction across the Tox21 panel from MoleculeNet, comprising 12 toxicity pathways. The model uses the Mean Teacher Semi-Supervised Learning (MT-SSL) approach to overcome the low number of data points experimentally annotated for toxicity tasks. For the MT-SSL, Tox21 (831 compounds and 12 different endpoints) was used as labeled data and a selection of 50K compounds from other MoleculeNet datasets was used as unlabeled data.

Compound

Single

Probability

List

Float

Pretrained

Tox21

Toxicity

MoleculeNet

https://github.com/ersilia-os/eos69p9

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00570-8

https://github.com/chen709847237/SSL-GCN

None

Probability of toxicity across 12 tasks defined in Tox21

Classification

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos69p9

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos69p9.zip

Local

16/6/2022

2022
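In the Mean Teacher approach named in the ssl-gcn-tox21 entry above, the teacher network's weights are an exponential moving average of the student's weights. The sketch below shows only that update rule. The function name `ema_update`, the flat list of weights and the `alpha` value are illustrative assumptions, not the authors' implementation.

```python
def ema_update(teacher: list, student: list, alpha: float = 0.99) -> list:
    """Move each teacher weight toward the matching student weight.

    alpha close to 1 makes the teacher change slowly; the value is an assumption.
    """
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


# Hypothetical toy weights; a real model would hold tensors per layer.
teacher = [0.0, 1.0]
student = [1.0, 1.0]
for _ in range(3):
    # In training, the student would be updated by gradient descent before each call.
    teacher = ema_update(teacher, student, alpha=0.5)
```

During training, the student is additionally penalized when its predictions on unlabeled compounds disagree with the teacher's, which is how the unlabeled data contribute.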

coprinet-molecule-price

Ready

Small molecule price prediction

CoPriNet has been trained on 2D graph representations of small molecules with their associated price in the Mcule catalog. The predicted price provides a better overview of the compound availability than standard synthetic accessibility scores or retrosynthesis tools. The Mcule catalog is proprietary but the trained model as well as the test dataset (100K) are publicly available.

Compound

Single

Other value

Single

Float

Pretrained

Price

Compound generation

Chemical synthesis

https://github.com/ersilia-os/eos7a45

https://pubs.rsc.org/en/content/articlelanding/2023/dd/d2dd00071g

https://github.com/oxpig/CoPriNet

MIT

Price value prediction

Regression

anamika-yadav99

https://github.com/anamika-yadav99

https://hub.docker.com/r/ersiliaos/eos7a45

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7a45.zip

Local

28/3/2022

2022

deepfl-logp

Ready

Membrane permeability of fluorescent probes

A deep neural network was trained to predict the LogP value of small molecules and fluorescent probes using an experimentally annotated dataset of >13k molecules (OPERA). This dataset was complemented with fluorescent probes to improve the model accuracy in this space. Probes predicted impermeant to cell membranes consistently showed experimental LogP <1.

Compound

Single

Experimental value

Single

Float

Pretrained

Permeability

ADME

LogP

https://github.com/ersilia-os/eos65rt

https://www.nature.com/articles/s41598-021-86460-3.epdf?sharing_token=zmYZd6qpwnDwc8tCOYGGf9RgN0jAjWel9jnR3ZoTv0OXuXXr_ZS6VuKQMyMJiA3PeIcqAJZTcpcNZJHblyChkQ2eTpzGXq23YsIcFlG8ayuEptKCJ1DeyIRGrh9O2d5JvvGGB9qG8cXgAuy_k-e1ncAMkAzpTegmR0XUbnftjv0%3D

https://github.com/k-soliman/DeepFl-LogP

GPL-3.0

LogP values of > 1 indicate membrane permeability

Regression

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos65rt

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos65rt.zip

Local

10/11/2021

2021

passive-permeability

Ready

Passive permeability based on simulations

Using Coarse Grained (CG) models, where several atoms are aggregated into a single bead, the authors obtain a set of 500,000 compounds with their simulated permeability across a single-component DOPC lipid bilayer. With this approach, the authors are able to cover a large and representative portion of the chemical space. We have used the data generated in this publication to train a simple regression model to predict compound permeability.

Compound

Single

Experimental value

Single

Float

In-house

Permeability

ADME

Papp

https://github.com/ersilia-os/eos2hbd

https://pubs.acs.org/doi/full/10.1021/acscentsci.8b00718?ref=recommended

None

Permeability coefficient (P). Cut-off: 6

Regression

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos2hbd

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2hbd.zip

Local

Yes

10/11/2021

2021

pampa-permeability

Ready

PAMPA effective permeability

The authors provide a dataset of 200 small molecules and their experimentally measured permeability in a PAMPA assay. Using this data, we have trained a model that predicts the logarithm of the effective permeability coefficient.

Compound

Single

Experimental value

Single

Float

In-house

Permeability

ADME

LogP

https://github.com/ersilia-os/eos97yu

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6651837/

None

logPe

Regression

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos97yu

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos97yu.zip

Local

Yes

10/11/2021

2021

natural-product-fingerprint

Ready

Natural product fingerprint

The model uses a combination of two multilayer perceptron networks (baseline and auxiliary) and an autoencoder-like network to extract natural product-specific fingerprints that outperform traditional methods for molecular representation. The training sets correspond to the COCONUT database (NP) and the ZINC database (synthetic).

Compound

Single

Descriptor

List

String

Pretrained

Natural product

Fingerprint

Descriptor

https://github.com/ersilia-os/eos6tg8

https://www.sciencedirect.com/science/article/pii/S2001037021003226?via%3Dihub#f0010

https://github.com/kochgroup/neural_npfp

None

Descriptor of a molecule

Representation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos6tg8

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6tg8.zip

Local

3/11/2021

2021

maip-malaria-surrogate

Ready

MAIP distillation: antimalarial potential prediction

Prediction of the antimalarial potential of small molecules. This model was originally trained on proprietary data from various sources, up to a total of >7M compounds. The training sets belong to Evotec, Johns Hopkins, MRCT, MMV - St. Jude, AZ, GSK, and St. Jude Vendor Library. In this implementation, we have used a teacher-student approach to train a surrogate model based on ChEMBL data (2M molecules) to provide a lite downloadable version of the original MAIP

Compound

Single

Score

Single

Float

Retrained

P.falciparum

Malaria

Antimicrobial activity

https://github.com/ersilia-os/eos2gth

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00487-2

https://www.ebi.ac.uk/chembl/maip/

None

Higher score indicates higher antimalarial potential

Classification

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos2gth

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2gth.zip

Local

2/11/2021

2021

syba-synthetic-accessibility

Ready

Bayesian prediction of synthetic accessibility

SYBA uses a fragment-based approach to classify whether a molecule is easy or hard to synthesize, and it can also be used to analyze the contribution of individual fragments to the total synthetic accessibility. The easy-to-synthesize dataset is an extract of the ZINC purchasable compounds, and the hard-to-synthesize dataset is generated using a Nonpher approach (introducing small molecular perturbations to transform molecules into more complex compounds). The fragments are calculated with ECFP8

Compound

Single

Score

Single

Float

Pretrained

Synthetic accessibility

Chemical synthesis

https://github.com/ersilia-os/eos7pw8

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00439-2

https://github.com/lich-uct/syba

GPL-3.0

Higher score indicates higher confidence that the molecule is synthetically accessible

Regression

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos7pw8

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7pw8.zip

Local

25/10/2021

2021

natural-product-score

Ready

Natural product score

A simple score to distinguish between natural products (or natural product-like compounds) and synthetic compounds. The score was calculated using an analysis of the structural features that distinguish natural products (NP) from synthetic molecules. NP structures were obtained from the CRC Dictionary of Natural Products and synthetic molecules belong to an in-house collection. This method has been contributed to the RDKit package; Ersilia simply implements the RDKit NP_Score.

Compound

Single

Score

List

Float

Pretrained

Natural product

Drug-likeness

https://github.com/ersilia-os/eos8ioa

http://pubs.acs.org/doi/abs/10.1021/ci700286x

https://github.com/rdkit/rdkit/tree/master/Contrib/NP_Score

BSD-3.0

Higher score indicates higher natural product likeness

Regression

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos8ioa

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8ioa.zip

Local

19/10/2021

2021

natural-product-likeness

Ready

Natural product likeness score

The model is a derivation of the natural product fingerprint (eos6tg8). In addition to generating specific natural product fingerprints, the activation value of the neuron that predicts if a molecule is a natural product or not can be used as a NP-likeness score. The method outperforms the NP_Score implemented in RDKit.

Compound

Single

Score

Single

Float

Pretrained

Natural product

Drug-likeness

https://github.com/ersilia-os/eos9yui

https://www.sciencedirect.com/science/article/pii/S2001037021003226?

https://github.com/kochgroup/neural_npfp

None

Higher score indicates higher natural product likeness

Regression

miquelduranfrigola

https://github.com/miquelduranfrigola

https://eos9yui-7xpw3.ondigitalocean.app/

https://hub.docker.com/r/ersiliaos/eos9yui

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9yui.zip

https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos9yui

Online

19/10/2021

2021

retrosynthetic-accessibility

Ready

Retrosynthetic accessibility score

Retrosynthetic accessibility score based on the computer-aided synthesis planning tool AiZynthFinder. The authors selected a ChEMBL subset of 200,000 molecules and checked whether AiZynthFinder could identify a synthetic route for each one. These data were used to train a classifier that computes 4,500 times faster than the underlying AiZynthFinder. For molecules outside the applicability domain, such as those in the GDB database, the model needs to be fine-tuned to the use case.

Compound

Single

Score

Single

Float

Pretrained

Synthetic accessibility

Chemical synthesis

https://github.com/ersilia-os/eos2r5a

https://pubs.rsc.org/en/content/articlelanding/2021/sc/d0sc05401a

https://github.com/reymond-group/RAscore

MIT

Higher score indicates easier retrosynthetic accessibility

Regression

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos2r5a

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2r5a.zip

Local

19/10/2021

2021

soltrannet-aqueous-solubility

Ready

Aqueous solubility prediction

Fast aqueous solubility prediction based on the Molecule Attention Transformer (MAT). The authors used AqSolDB to fine-tune the MAT network to solubility prediction, achieving competitive scores in the Second Challenge to Predict Aqueous Solubility (SC2).

Compound

Single

Experimental value

Single

Float

Pretrained

Solubility

ADME

LogS

https://github.com/ersilia-os/eos6oli

https://pubs.acs.org/doi/10.1021/acs.jcim.1c00331

https://github.com/gnina/SolTranNet

Apache-2.0

Predicted LogS (log of the solubility)

Regression

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos6oli

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6oli.zip

Local

Yes

19/10/2021

2021

molgrad-ppb

Ready

Coloring molecules for plasma protein binding prediction

By combining a Message-Passing Graph Neural Network (MPGNN) and a Forward fully connected Neural Network (FNN) with an integrated gradients explainable artificial intelligence (XAI) method, the authors developed MolGrad and tested it on a number of ADME predictive tasks. MolGrad incorporates explainable features to facilitate interpretation of the predictions. In this model, they train MolGrad with data from a Plasma-protein binding assay (PPB) to predict the fraction bound in plasma of small mo

Compound

Single

Experimental value

Single

Float

Pretrained

ADME

Fraction bound

Chemical graph model

https://github.com/ersilia-os/eos6ao8

https://pubs.acs.org/doi/10.1021/acs.jcim.0c01344

https://github.com/josejimenezluna/molgrad/

AGPL-3.0

Fraction (%) bound in plasma

Regression

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos6ao8

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6ao8.zip

Local

Yes

19/10/2021

2021

molgrad-herg

Ready

Coloring molecules for hERG blockade

By combining a Message-Passing Graph Neural Network (MPGNN) and a Forward fully connected Neural Network (FNN) with an integrated gradients explainable artificial intelligence (XAI) method, the authors developed MolGrad and tested it on a number of ADME predictive tasks. MolGrad incorporates explainable features to facilitate interpretation of the predictions. In this model, they train MolGrad with a dataset of hERG channel blockers/non-blockers to predict the cardiotoxicity of small molecules (I

Compound

Single

Experimental value

Single

Float

Pretrained

hERG

Toxicity

Cardiotoxicity

Chemical graph model

https://github.com/ersilia-os/eos43at

https://pubs.acs.org/doi/10.1021/acs.jcim.0c01344

https://github.com/josejimenezluna/molgrad/

AGPL-3.0

pIC50 of hERG inhibition

Regression

miquelduranfrigola

https://github.com/miquelduranfrigola

https://eos43at-zqx9x.ondigitalocean.app/

https://hub.docker.com/r/ersiliaos/eos43at

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos43at.zip

https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos43at

Online

Yes

19/10/2021

2021

molgrad-caco2

Ready

Coloring molecules for Caco-2 cell permeability

By combining a Message-Passing Graph Neural Network (MPGNN) and a Forward fully connected Neural Network (FNN) with an integrated gradients explainable artificial intelligence (XAI) method, the authors developed MolGrad and tested it on a number of ADME predictive tasks. MolGrad incorporates explainable features to facilitate interpretation of the predictions. This model has been trained using experimental data on the permeability of molecules across Caco-2 cell membranes (Papp, cm s-1).

Compound

Single

Experimental value

Single

Float

Pretrained

Permeability

ADME

Papp

Chemical graph model

https://github.com/ersilia-os/eos1af5

https://pubs.acs.org/doi/10.1021/acs.jcim.0c01344

https://github.com/josejimenezluna/molgrad/

AGPL-3.0

Log 10 of the Passive permeability in cm s-1

Regression

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos1af5

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1af5.zip

Local

Yes

19/10/2021

2021

cardiotoxnet-herg

Ready

Ligand-based prediction of hERG blockade

A robust predictor for hERG channel blockade based on an ensemble of five deep learning models. The authors have collected a dataset from public sources, such as BindingDB and ChEMBL on hERG blockers and non-blockers. The cut-off for hERG blockade was set at IC50 < 10 uM for the classifier.

Compound

Single

Probability

Single

Float

Pretrained

hERG

Toxicity

Cardiotoxicity

https://github.com/ersilia-os/eos2ta5

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00541-z

https://github.com/Abdulk084/CardioTox

None

Probability that the compound inhibits hERG (IC50 < 10 uM)

Classification

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos2ta5

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2ta5.zip

Local

18/10/2021

2021
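
The 10 uM cut-off in the cardiotoxnet-herg entry above can also be read on the pIC50 scale reported by the molgrad-herg entry. The following is a minimal sketch, assuming IC50 values are supplied in micromolar; the function names are hypothetical and are not part of CardioTox or Ersilia.

```python
import math

# Cut-off stated in the entry: IC50 < 10 uM counts as a hERG blocker.
HERG_CUTOFF_UM = 10.0


def pic50_from_ic50_um(ic50_um: float) -> float:
    """Convert an IC50 in micromolar to pIC50 (-log10 of the molar IC50)."""
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_um * 1e-6)


def is_herg_blocker(ic50_um: float) -> bool:
    """Label a compound as a blocker when its IC50 is below the cut-off."""
    return ic50_um < HERG_CUTOFF_UM
```

Under this convention, the 10 uM cut-off corresponds to a pIC50 of 5, and more potent blockers have higher pIC50 values.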

molgrad-cyp3a4

Ready

Coloring molecules for interaction with CYP3A4

By combining a Message-Passing Graph Neural Network (MPGNN) and a Forward fully connected Neural Network (FNN) with an integrated gradients explainable artificial intelligence (XAI) method, the authors developed MolGrad and tested it on a number of ADME predictive tasks. MolGrad incorporates explainable features to facilitate interpretation of the predictions. This model has been trained using a ChEMBL dataset of CYP450 3A4 inhibitors (0) and non-inhibitors (1).

Compound

Single

Probability

Single

Float

Pretrained

CYP450

ADME

Chemical graph model

https://github.com/ersilia-os/eos96ia

https://pubs.acs.org/doi/10.1021/acs.jcim.0c01344

https://github.com/josejimenezluna/molgrad/

GPL-3.0

Probability that the molecule is metabolized by CYP3A4 (cut-off: 10 uM)

Classification

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos96ia

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos96ia.zip

Local

Yes

18/10/2021

2021

mycpermcheck

Ready

Membrane permeability in Mycobacterium tuberculosis

MycPermCheck predicts potential to permeate the Mycobacterium tuberculosis cell membrane based on physicochemical properties.

Compound

Single

Probability

Single

Float

Pretrained

Permeability

M.tuberculosis

ADME

Tuberculosis

https://github.com/ersilia-os/eos8d8a

https://academic.oup.com/bioinformatics/article/29/1/62/272745

https://www.mycpermcheck.aksotriffer.pharmazie.uni-wuerzburg.de/index.html

MIT

Probability of permeability across the M.tb cell wall

Classification

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos8d8a

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8d8a.zip

Local

Yes

14/10/2021

2021

padel

Ready

PaDEL small molecule descriptors

PaDEL is a commonly used molecular descriptor calculator. It computes 1875 molecular descriptors (1444 1D and 2D descriptors, 431 3D descriptors) and 12 types of fingerprints for small molecule representation. Originally developed in Java, here we provide PaDELPy, its Python implementation.

Compound

Single

Descriptor

List

Float

Pretrained

Descriptor

https://github.com/ersilia-os/eos7asg

https://onlinelibrary.wiley.com/doi/10.1002/jcc.21707

https://github.com/ecrl/padelpy

MIT

Vector representation of a molecule

Representation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos7asg

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7asg.zip

Local

27/9/2021

2021

smiles-transformer

Ready

SMILES transformer descriptor

Molecular embedding based on natural language processing. It converts SMILES into fingerprints using an unsupervised model pre-trained on a very large SMILES dataset from ChEMBL. The transformer is particularly well-suited for low-data drug discovery.

Compound

Single

Descriptor

List

Float

Pretrained

Chemical language model

Descriptor

Embedding

https://github.com/ersilia-os/eos2lm8

https://arxiv.org/abs/1911.04738

https://github.com/DSPsleeporg/smiles-transformer

MIT

Vector representation of small molecules

Representation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos2lm8

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2lm8.zip

Local

22/9/2021

2021

mordred

Ready

Mordred chemical descriptors

A set of ca 1,800 chemical descriptors, including both RDKit and original modules. It is comparable to the well known PaDEL-Descriptors (see eos7asg), but has shorter calculation times and can process larger molecules.

Compound

Single

Descriptor

List

Float

Pretrained

Descriptor

https://github.com/ersilia-os/eos78ao

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-018-0258-y

https://github.com/mordred-descriptor/mordred

BSD-3.0

Vector representation of a molecule

Representation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos78ao

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos78ao.zip

Local

17/9/2021

2021

rdkit-fingerprint

Ready

Path-based fingerprint

Path-based fingerprints calculated with the RDKit function Chem.RDKFingerprint. It is inspired by the Daylight fingerprint. As explained in the RDKit Book, the fingerprinting algorithm identifies all subgraphs in the molecule within a particular range of sizes, hashes each subgraph to generate a raw bit ID, mods that raw bit ID to fit in the assigned fingerprint size, and then sets the corresponding bit.

Compound

Single

Descriptor

List

Float

Pretrained

Fingerprint

Descriptor

https://github.com/ersilia-os/eos7jio

https://www.rdkit.org/docs/RDKit_Book.html

https://github.com/rdkit/rdkit

BSD-3.0

Vector representation of small molecules

Representation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos7jio

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7jio.zip

Local

17/9/2021

2021
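
The hash-and-fold steps described in the rdkit-fingerprint entry above can be illustrated with a toy example. This is only a sketch under stated assumptions: it enumerates linear atom paths rather than all subgraphs, ignores bond types, uses CRC32 as a stand-in hash, and works on a hand-written graph for ethanol. It is not the RDKit implementation, and all names in it are hypothetical.

```python
import zlib

# Toy molecular graph for ethanol (C-C-O), hydrogens omitted.
# ATOMS maps atom index to element; BONDS is an undirected adjacency list.
ATOMS = {0: "C", 1: "C", 2: "O"}
BONDS = {0: [1], 1: [0, 2], 2: [1]}


def enumerate_paths(atoms, bonds, min_len=1, max_len=3):
    """Return sorted labels of all simple paths with min_len..max_len atoms."""
    found = set()

    def walk(path):
        if min_len <= len(path) <= max_len:
            forward = "-".join(atoms[i] for i in path)
            backward = "-".join(atoms[i] for i in reversed(path))
            # A path and its reverse are the same subgraph; keep one spelling.
            found.add(min(forward, backward))
        if len(path) == max_len:
            return
        for nxt in bonds[path[-1]]:
            if nxt not in path:
                walk(path + [nxt])

    for start in atoms:
        walk([start])
    return sorted(found)


def fold_fingerprint(labels, n_bits=64):
    """Hash each label to a raw bit ID, mod it into n_bits, and set that bit."""
    bits = [0] * n_bits
    for label in labels:
        raw_id = zlib.crc32(label.encode("utf-8"))  # deterministic stand-in hash
        bits[raw_id % n_bits] = 1
    return bits


fingerprint = fold_fingerprint(enumerate_paths(ATOMS, BONDS))
```

Because the raw bit IDs are folded with a modulo, two different paths can land on the same bit; a larger fingerprint size makes such collisions less likely.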

molbert

Ready

MolBERT chemical language transformer

Molecular representation using the BERT language Transformer. The model has been pre-trained on the GuacaMol dataset (~1.6M molecules from ChEMBL), and can be fine-tuned to the desired QSAR tasks. It has been benchmarked in MoleculeNet.

Compound

Single

Descriptor

List

Float

Pretrained

Chemical language model

Embedding

Descriptor

https://github.com/ersilia-os/eos2thm

https://arxiv.org/abs/2011.13230

https://github.com/BenevolentAI/MolBERT

MIT

Embedding representation of a molecule

Representation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos2thm

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2thm.zip

Local

17/9/2021

2021

rdkit-descriptors

Ready

Physicochemical descriptors available from RDKit

A set of 200 physicochemical descriptors available from RDKit, including molecular weight, solubility and druggability parameters. We have used the DescriptaStorus selection of RDKit descriptors for simplicity.

Compound

Single

Descriptor

List

Float

Pretrained

Descriptor

https://github.com/ersilia-os/eos8a4x

https://www.rdkit.org/docs/RDKit_Book.html

https://github.com/bp-kelley/descriptastorus

Proprietary

Vector representation of small molecules

Representation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos8a4x

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8a4x.zip

Local

17/9/2021

2021

avalon

Ready

Avalon fingerprint

Avalon is a path-based substructure key fingerprint (1024 bits), developed for substructure screen-out when searching. It is part of the Avalon Chemoinformatics Toolkit and has also been implemented as an external RDKit tool.

Compound

Single

Descriptor

List

Integer

Pretrained

Fingerprint

https://github.com/ersilia-os/eos8h6g

https://pubs.acs.org/doi/full/10.1021/ci050413p

https://github.com/rdkit/rdkit/tree/master/External/AvalonTools

BSD-3.0

Bitvector representation of a molecule

Representation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos8h6g

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8h6g.zip

Local

14/9/2021

2021

molecular-weight

Ready

Molecular weight

The model is simply an implementation of the function Descriptors.MolWt of the chemoinformatics package RDKit. It takes as input a small molecule (SMILES) and calculates its molecular weight in g/mol.

Compound

Single

Other value

Single

Float

Pretrained

Molecular weight

https://github.com/ersilia-os/eos3b5e

https://www.rdkit.org/docs/RDKit_Book.html

https://github.com/rdkit/rdkit

BSD-3.0

Calculated molecular weight (g/mol)

Regression

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos3b5e

AMD64

CPU

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3b5e.zip

Local

13/9/2021

2021

morgan-counts

Ready

Morgan counts fingerprints

The Morgan fingerprints, or extended connectivity fingerprints (ECFP4), are one of the most widely used molecular representations. They are circular representations (starting from each atom, the surrounding atoms within a radius n are collected) and can have thousands of features. This implementation uses the RDKit package and is done with radius 3 and 2048 dimensions.

Compound

Single

Descriptor

List

Integer

Pretrained

Fingerprint

Descriptor

https://github.com/ersilia-os/eos5axz

https://www.rdkit.org/docs/RDKit_Book.html

https://github.com/rdkit/rdkit

BSD-3.0

Vector representation of a molecule

Representation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos5axz

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5axz.zip

Local

30/8/2021

2021

whales-descriptor

Ready

Holistic molecular descriptors for scaffold hopping

Weighted Holistic Atom Localization and Entity Shape (WHALES) is a descriptor based on 3D structure that facilitates natural product featurization. It is aimed at scaffold hopping exercises from natural products to synthetic compounds.

Compound

Single

Descriptor

List

Float

Pretrained

Natural product

Descriptor

https://github.com/ersilia-os/eos3ae6

https://www.nature.com/articles/s42004-018-0043-x

https://github.com/ETHmodlab/scaffold_hopping_whales

MIT

Vector representation of a molecule

Representation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos3ae6

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3ae6.zip

Local

15/7/2021

2021

grover-embedding

Ready

Large-scale graph transformer

GROVER is a self-supervised Graph Neural Network for molecular representation, pretrained with 10 million unlabelled molecules from ChEMBL and ZINC15. The model provided here is GROVERlarge. GROVER has then been fine-tuned to predict several activities from the MoleculeNet benchmark, consistently outperforming other state-of-the-art methods on several benchmark datasets.

Compound

Single

Descriptor

List

Float

Pretrained

Chemical graph model

Embedding

Descriptor

https://github.com/ersilia-os/eos7w6n

https://papers.nips.cc/paper/2020/file/94aef38441efa3380a3bed3faf1f9d5d-Paper.pdf

https://github.com/tencent-ailab/grover

MIT

Embedding representation of a molecule

Representation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos7w6n

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7w6n.zip

Local

Yes

2/7/2021

2021

cc-signaturizer

Ready

Chemical Checker signaturizer

A set of 25 Chemical Checker bioactivity signatures (including 2D & 3D fingerprints, scaffold, binding, crystals, side effects, cell bioassays, etc) to capture properties of compounds beyond their structures. Each signature has a length of 128 dimensions. In total, there are 3200 dimensions. The signaturizer is periodically updated. We use the 2020-02 version of the signaturizer.

Compound

Single

Descriptor

List

Float

Pretrained

Descriptor

Bioactivity profile

Embedding

https://github.com/ersilia-os/eos4u6p

https://www.nature.com/articles/s41467-021-24150-4

http://gitlabsbnb.irbbarcelona.org/packages/signaturizer

MIT

2D projection of bioactivity signatures

Representation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos4u6p

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4u6p.zip

Local

1/7/2021

2021
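
The dimension count in the cc-signaturizer entry above (25 signatures of 128 dimensions, 3200 in total) suggests how a flat output vector could be split back into per-signature blocks. The sketch below assumes the signatures are concatenated in the order A1, A2, ..., E5; that ordering, and the helper name, are assumptions for illustration and should be checked against the model's actual output.

```python
N_SIGNATURES = 25
DIMS_PER_SIGNATURE = 128
TOTAL_DIMS = N_SIGNATURES * DIMS_PER_SIGNATURE  # 3200, as stated in the entry

# Assumed labels: five levels (A-E) times five sublevels (1-5), in this order.
SIGNATURE_NAMES = [f"{level}{sub}" for level in "ABCDE" for sub in range(1, 6)]


def split_signatures(vector):
    """Split a flat 3200-value list into a dict of 25 lists of 128 values."""
    if len(vector) != TOTAL_DIMS:
        raise ValueError(f"expected {TOTAL_DIMS} values, got {len(vector)}")
    return {
        name: vector[i * DIMS_PER_SIGNATURE:(i + 1) * DIMS_PER_SIGNATURE]
        for i, name in enumerate(SIGNATURE_NAMES)
    }
```

Splitting the vector this way makes it possible to use a single signature as a descriptor instead of the full 3200 dimensions.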

cdd-descriptor

Ready

Continuous and data-driven descriptors

Low-dimensional continuous descriptor based on a neural machine translation model. The model has been trained to translate an IUPAC molecular representation into its SMILES. The intermediate continuous vector representation encoded when reading the IUPAC name is a representation of the molecule, containing all the information needed to generate the output sequence (SMILES). This model has been pretrained on a large dataset combining ChEMBL and ZINC.

Compound

Single

Descriptor

List

Float

Pretrained

Descriptor

Chemical language model

https://github.com/ersilia-os/eos7a04

https://pubs.rsc.org/en/content/articlelanding/2019/sc/c8sc04175j

https://github.com/jrwnter/cddd

MIT

Embedding representation of a molecule

Representation

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos7a04

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7a04.zip

Local

1/7/2021

2021

grover-sider

Ready

Adverse Drug Reactions

The model predicts the putative adverse drug reactions (ADR) of a molecule, using the SIDER database (MoleculeNet), which contains pairs of marketed drugs and their described ADRs. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details on the molecular featurization step with GROVER).

Compound

Single

Probability

List

Float

Pretrained

Toxicity

MoleculeNet

Side effects

https://github.com/ersilia-os/eos77w8

https://arxiv.org/abs/2007.02835

https://github.com/tencent-ailab/grover

MIT

Predicted ADRs classified in 27 groups

Classification

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos77w8

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos77w8.zip

Local

Yes

4/6/2021

2021

grover-bbbp

Ready

Blood-brain barrier penetration

This model predicts the Blood-Brain Barrier (BBB) penetration potential of small molecules, using as training data the curated MoleculeNet benchmark containing 2000 experimental data points. It has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details on the molecular featurization step with GROVER).

Compound

Single

Probability

Single

Float

Pretrained

Permeability

MoleculeNet

Chemical graph model

Alzheimer

https://github.com/ersilia-os/eos1amr

https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html

https://github.com/tencent-ailab/grover

MIT

Probability that a molecule crosses the blood brain barrier

Classification

Amna-28

https://github.com/Amna-28

https://hub.docker.com/r/ersiliaos/eos1amr

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1amr.zip

Local

Yes

4/6/2021

2021

chembl-multitask-descriptor

Ready

Multi-target prediction based on ChEMBL data

This is a ligand-based target prediction model developed by the ChEMBL team. They trained the model using pairs of small molecules and their protein targets, and produced a multitask predictor. The activity thresholds were determined by protein family (kinases: <= 30nM, GPCRs: <= 100nM, Nuclear Receptors: <= 100nM, Ion Channels: <= 10μM, Non-IDG Family Targets: <= 1μM). Here we provide the model trained on ChEMBL_28, which showed an accuracy of 85%.

Compound

Single

Probability

List

Float

Pretrained

Bioactivity profile

Target identification

ChEMBL

https://github.com/ersilia-os/eos1vms

http://chembl.blogspot.com/2019/05/multi-task-neural-network-on-chembl.html

https://github.com/chembl/chembl_multitask_model/

None

Probability of having the protein (identified by ChEMBL ID), as target

Classification

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos1vms

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1vms.zip

Local

4/6/2021

2021
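
The family-specific activity thresholds listed in the chembl-multitask-descriptor entry above can be written as a small lookup. This is a minimal sketch, assuming potencies are expressed in nanomolar; the family keys, the function name, and the choice to treat unknown families as non-IDG targets are assumptions made for illustration, not part of the ChEMBL code.

```python
# Activity thresholds in nanomolar, taken from the entry's description.
THRESHOLDS_NM = {
    "kinase": 30.0,
    "gpcr": 100.0,
    "nuclear_receptor": 100.0,
    "ion_channel": 10_000.0,  # 10 uM
    "other": 1_000.0,         # non-IDG family targets, 1 uM
}


def is_active(potency_nm: float, family: str) -> bool:
    """Return True when the potency is at or below the family threshold."""
    # Assumption: families not listed above fall back to the "other" threshold.
    threshold = THRESHOLDS_NM.get(family, THRESHOLDS_NM["other"])
    return potency_nm <= threshold
```

For example, a compound measured at 5 μM would count as active against an ion channel but inactive against a kinase.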

etoxpred

Ready

Toxicity and synthetic accessibility prediction

The eToxPred tool has been developed to predict, on the one hand, the synthetic accessibility (SA) score, or how easy it is to make the molecule in the laboratory, and, on the other hand, the toxicity (Tox) score, or the probability that the molecule is toxic to humans. The authors trained and cross-validated both predictors on a large number of datasets, and demonstrated the method's usefulness in building custom virtual libraries.

Compound

Single

Score

Single

Float

Pretrained

Toxicity

Synthetic accessibility

https://github.com/ersilia-os/eos92sw

https://bmcpharmacoltoxicol.biomedcentral.com/articles/10.1186/s40360-018-0282-6

https://github.com/pulimeng/eToxPred

GPL-3.0

Higher scores indicate easier synthetic accessibility and higher toxicity, respectively

Regression

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos92sw

AMD64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos92sw.zip

Local

4/6/2021

2021

chemprop-sars-cov-inhibition

Ready

SARS-CoV inhibition

This model was developed to support the early efforts in the identification of novel drugs against SARS-CoV2. It predicts the probability that a small molecule inhibits SARS-3CLpro-mediated peptide cleavage. It was developed using a high-throughput screening against the 3CL protease of SARS-CoV1, as no data was yet available for the new virus (SARS-CoV2) causing the COVID-19 pandemic. It uses the ChemProp model.

Compound

Single

Probability

Single

Float

Pretrained

COVID19

Antiviral activity

Sars-CoV-2

Chemical graph model

https://github.com/ersilia-os/eos9f6t

https://www.sciencedirect.com/science/article/pii/S0092867420301021

http://chemprop.csail.mit.edu/checkpoints

MIT

Probability of 3CL protease inhibition (%). The classifier was trained using a threshold of 12% inhibition.

Classification

miquelduranfrigola

https://github.com/miquelduranfrigola

https://hub.docker.com/r/ersiliaos/eos9f6t

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9f6t.zip

Local

Yes

3/6/2021

2021

chemprop-antibiotic

Ready

Broad spectrum antibiotic activity

Based on a simple E. coli growth inhibition assay, the authors trained a model capable of identifying antibiotic potential in compounds structurally divergent from conventional antibiotic drugs. One of the predicted active molecules, Halicin (SU3327), was experimentally validated in vitro and in vivo. Halicin was originally developed as a treatment for diabetes.

Compound

Single

Probability

Single

Float

Pretrained

E.coli

IC50

Antimicrobial activity

Chemical graph model

https://github.com/ersilia-os/eos4e40

https://pubmed.ncbi.nlm.nih.gov/32084340/

http://chemprop.csail.mit.edu/checkpoints

MIT

Probability that a compound inhibits E.coli growth. The inhibition threshold was set at 80% growth inhibition in the training set.

Classification

miquelduranfrigola

https://github.com/miquelduranfrigola

https://eos4e40-rovva.ondigitalocean.app/

https://hub.docker.com/r/ersiliaos/eos4e40

AMD64

ARM64

https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4e40.zip

https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos4e40

Local

Yes

6/6/2018

2018
