Identifier
1
eos7e3s
2
eos74km
3
eos8ub5
4
eos2db3
5
eos9gg2
6
eos3mk2
7
eos9p4a
8
eos39co
9
eos3wzy
10
eos3nn9
11
eos1pu1
12
eos39dp
13
eos6ru3
14
eos6ost
15
eos8aox
16
eos57bx
17
eos5guo
18
eos24ur
19
eos2401
20
eos5gge
21
eos7d58
22
eos694w
23
eos42ez
24
eos21q7
25
eos18ie
26
eos8bhe
27
eos5cl7
28
eos8aa5
29
eos3dq3
30
eos4djh
31
eos35g4
32
eos3ujl
33
eos9uqy
34
eos4f8y
35
eos30d7
36
eos69mr
37
eos8vud
38
eos9aqt
39
eos2fg2
40
eos5iy5
41
eos3nl8
42
eos1xje
43
eos1n4b
44
eos9ym3
45
eos30f3
46
eos5xng
47
eos69e6
48
eos4wt0
49
eos4x30
50
eos1ut3
51
eos9ivc
52
eos9zw0
53
eos633t
54
eos3kcw
55
eos1d7r
56
eos9ueu
57
eos4f95
58
eos2zmb
59
eos1noy
60
eos3le9
61
eos4rta
62
eos2l0q
63
eos3804
64
eos2hzy
65
eos8fma
66
eos1mxi
67
eos7yti
68
eos4qda
69
eos80ch
70
eos3ev6
71
eos7nno
72
eos5jz9
73
eos59rr
74
eos7kpb
75
eos2gw4
76
eos3cf4
77
eos3zur
78
eos9tyg
79
eos44zp
80
eos24jm
81
eos6aun
82
eos31ve
83
eos2fy6
84
eos2lqb
85
eos8fth
86
eos8lok
87
eos9yy1
88
eos22io
89
eos74bo
90
eos81ew
91
eos93h2
92
eos7qga
93
eos4avb
94
eos4cxk
95
eos8c0o
96
eos6hy3
97
eos5505
98
eos4se9
99
eos24ci
100
eos1086
101
eos5ecc
102
eos935d
103
eos4q1a
104
eos9taz
105
eos2rd8
106
eos9sa2
107
eos8a5g
108
eos238c
109
eos2v11
110
eos1579
111
eos6m4j
112
eos4zfy
113
eos2a9n
114
eos9c7k
115
eos7jlv
116
eos4b8j
117
eos3ae7
118
eos9be7
119
eos4tcc
120
eos5qfo
121
eos2mrz
122
eos2re5
123
eos30gr
124
eos526j
125
eos6pbf
126
eos2b6f
127
eos3xip
128
eos6o0z
129
eos85a3
130
eos8451
131
eos157v
132
eos481p
133
eos2mhp
134
eos6fza
135
eos5smc
136
eos9ei3
137
eos46ev
138
eos69p9
139
eos7a45
140
eos0t05
141
eos0t04
142
eos0t00
143
eos65rt
144
eos2hbd
145
eos97yu
146
eos6tg8
147
eos2gth
148
eos7pw8
149
eos8ioa
150
eos9yui
151
eos2r5a
152
eos6oli
153
eos6ao8
154
eos43at
155
eos1af5
156
eos2ta5
157
eos96ia
158
eos8d8a
159
eos7asg
160
eos2lm8
161
eos78ao
162
eos7jio
163
eos2thm
164
eos8a4x
165
eos8h6g
166
eos3b5e
167
eos5axz
168
eos3ae6
169
eos7w6n
170
eos4u6p
171
eos7a04
172
eos77w8
173
eos1amr
174
eos1vms
175
eos92sw
176
eos9f6t
177
eos0t03
178
eos0t02
179
eos0t01
180
eos4e40
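The identifiers in this list recur in the GitHub, DockerHub and S3 links of the records that follow (for example `https://github.com/ersilia-os/eos7e3s`). As a minimal sketch, assuming those three URL patterns hold for any model whose corresponding cell is filled, the links can be derived from an identifier. The function name `model_links` and the identifier pattern are illustrative assumptions inferred from this table, not part of any Ersilia API.

```python
import re

# Pattern inferred from the identifiers in this table (an assumption):
# "eos", one digit, then three lowercase letters or digits.
EOS_ID = re.compile(r"^eos[0-9][a-z0-9]{3}$")


def model_links(identifier: str) -> dict:
    """Build the GitHub, DockerHub and S3 links seen in the table for one model."""
    if not EOS_ID.match(identifier):
        raise ValueError(f"not a recognised model identifier: {identifier!r}")
    return {
        "github": f"https://github.com/ersilia-os/{identifier}",
        "dockerhub": f"https://hub.docker.com/r/ersiliaos/{identifier}",
        "s3": (
            "https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/"
            f"{identifier}.zip"
        ),
    }


if __name__ == "__main__":
    for name, url in model_links("eos7e3s").items():
        print(name, url)
```

A link built this way is only a candidate: several records in the table have no DockerHub or S3 entry, so the URL may not resolve for every model.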
Slug
Status
Repository
Title
Description
Mode
Task
Input
Input Shape
Output
Output Type
Output Shape
Interpretation
Tag
GitHub
Publication
Source Code
License
Host URL
Contributor
Incorporation Date
Contributor Profile
DockerHub
Docker Architecture
S3
DO Deployment
Biomodel Annotation
Runtime
Secrets
Deployment
Incorporation Quarter
Incorporation Year
Docker Pack Method
dili-pred
Archived
Drug-induced liver injury prediction

Prediction of clinically relevant drug-induced liver injury (DILI) based solely on drug structure, using binary classification methods. The authors collected a public dataset of 475 molecules with associated DILI outcomes and built a model with an accuracy of 0.89. The model checkpoints have not been provided, so Ersilia has used the provided data to retrain the model.

Retrained
Classification
Compound
Single
Probability
Float
Single
Probability that a drug causes DILI
Metabolism
Toxicity
https://github.com/ersilia-os/eos7e3s
https://pubmed.ncbi.nlm.nih.gov/30325042/
https://github.com/cptbern/QSAR_DILI_2019
None
leilayesufu
2/1/2024
https://github.com/leilayesufu
https://hub.docker.com/r/ersiliaos/eos7e3s
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7e3s.zip
Q1
2024
antimicrobial-kg-ml
Ready
Antimicrobial class specificity prediction

Prediction of antimicrobial class specificity using simple machine learning methods applied to an antimicrobial knowledge graph. The knowledge graph is built on ChEMBL, Co-ADD and SPARK. Endpoints are broad terms such as activity against gram-positive or gram-negative bacteria. The best model according to the authors is a Random Forest with MHFP6 fingerprints.

Pretrained
Annotation
Compound
Single
Score
Float
List
Class probabilities for each antimicrobial class
Antimicrobial activity
https://github.com/ersilia-os/eos74km
https://www.biorxiv.org/content/10.1101/2024.12.02.626313v1.full
https://github.com/IMI-COMBINE/broad_spectrum_prediction
MIT
miquelduranfrigola
17/12/2024
https://github.com/miquelduranfrigola
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos74km.zip
Local
Q4
2024
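The Incorporation Quarter and Incorporation Year columns appear to be derived from the Incorporation Date (here `17/12/2024` gives `Q4` and `2024`). A minimal sketch of that derivation, assuming dates are written day/month/year as values such as `17/12/2024` and `23/10/2024` suggest; the function name `incorporation_period` is hypothetical.

```python
from datetime import datetime


def incorporation_period(date_text: str) -> tuple:
    """Return (quarter label, year) for a D/M/YYYY date string."""
    # Assumption: day comes first, as in "17/12/2024".
    parsed = datetime.strptime(date_text, "%d/%m/%Y")
    quarter = (parsed.month - 1) // 3 + 1  # months 1-3 -> 1, 4-6 -> 2, ...
    return f"Q{quarter}", parsed.year


if __name__ == "__main__":
    print(incorporation_period("17/12/2024"))
```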
chemical-space-projections-coconut
Ready
Projections against Coconut

This tool performs PCA, UMAP and tSNE projections taking the Coconut natural products database as a chemical space of reference. The Ersilia Compound Embeddings are used as descriptors. Four PCA components and two UMAP and tSNE components are returned.

In-house
Representation
Compound
Single
Value
Float
List
Coordinates of 2D projections, namely PCA, UMAP and tSNE.
Embedding
https://github.com/ersilia-os/eos8ub5
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00478-9
https://github.com/ersilia-os/compound-embedding
GPL-3.0-or-later
miquelduranfrigola
10/11/2024
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos8ub5
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8ub5.zip
Local
Q4
2024
chemical-space-projections-chemdiv
Ready
Chemical space 2D projections against ChemDiv

This tool performs PCA, UMAP and tSNE projections taking a 100k ChemDiv diversity set as a chemical space of reference. The Ersilia Compound Embeddings are used as descriptors. Four PCA components and two UMAP and tSNE components are returned.

In-house
Representation
Compound
Single
Value
Float
List
Coordinates of 2D projections, namely PCA, UMAP and tSNE.
Embedding
https://github.com/ersilia-os/eos2db3
https://www.chemdiv.com/catalog/diversity-libraries/representative-diversity-libraries-out-of-1-6m-stock/
https://github.com/ersilia-os/compound-embedding
GPL-3.0-or-later
miquelduranfrigola
9/11/2024
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos2db3
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2db3.zip
Local
Q4
2024
chemical-space-projections-drugbank
Ready
Chemical space 2D projections against DrugBank

This tool performs PCA, UMAP and tSNE projections taking the DrugBank chemical space as a reference. The Ersilia Compound Embeddings are used as descriptors. Four PCA components and two UMAP and tSNE components are returned.

In-house
Representation
Compound
Single
Value
Float
List
Coordinates of 2D projections, namely PCA, UMAP and tSNE.
Embedding
https://github.com/ersilia-os/eos9gg2
https://academic.oup.com/nar/article/52/D1/D1265/7416367
https://github.com/ersilia-os/compound-embedding
GPL-3.0-or-later
miquelduranfrigola
9/11/2024
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos9gg2
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9gg2.zip
Local
Q4
2024
bbbp-marine-kinase-inhibitors
Ready
BBBP model tested on marine-derived kinase inhibitors

A set of three binary classifiers (random forest, gradient boosting classifier, and logistic regression) to predict the Blood-Brain Barrier (BBB) permeability of small organic compounds. The best models were applied to natural products of marine origin, able to inhibit kinases associated with neurodegenerative disorders. The training set size was around 300 compounds.

Retrained
Annotation
Compound
Single
Score
Float
List
Classification score over three classifiers, namely random forest (rfc), gradient boosting classifier (gbc), and logistic regression (logreg).
Drug-likeness
Permeability
https://github.com/ersilia-os/eos3mk2
https://pubmed.ncbi.nlm.nih.gov/30699889/
https://github.com/plissonf/BBB-Models
MIT
miquelduranfrigola
23/10/2024
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos3mk2
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3mk2.zip
Local
Q4
2024
deep-dl
Ready
Drug-likeness scoring based on unsupervised learning

This model evaluates drug-likeness using an unsupervised learning approach, eliminating the need for labeled data and avoiding biases from incomplete negative sets. It extracts features directly from known drug molecules, identifying common characteristics through a recurrent neural network (RNN) language model. By representing molecules as SMILES strings, the model learns the probability distribution of known drugs and assesses new molecules based on their likelihood of appearing in this space.

Pretrained
Annotation
Compound
Single
Score
Float
Single
Higher score indicates higher drug likeness
Drug-likeness
https://github.com/ersilia-os/eos9p4a
https://pubs.rsc.org/en/content/articlehtml/2022/sc/d1sc05248a
https://github.com/SeonghwanSeo/DeepDL
GPL-3.0-or-later
https://eos9p4a-izpny.ondigitalocean.app/
miquelduranfrigola
4/9/2024
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos9p4a
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9p4a.zip
https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos9p4a
Online
Q3
2024
unimol-representation
Ready
Uni-Mol molecular representation

Uni-Mol offers a simple and effective SE(3) equivariant transformer architecture for pre-training molecular representations that capture 3D information. The model is trained on >200M conformations. The current model outputs a representation embedding.

Pretrained
Representation
Compound
Single
Value
Float
List
Uni-Mol representation embedding
Fingerprint
https://github.com/ersilia-os/eos39co
https://openreview.net/forum?id=6K2RM6wVqKu
https://github.com/deepmodeling/Uni-Mol
GPL-3.0-only
miquelduranfrigola
22/7/2024
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos39co
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos39co.zip
Local
Q3
2024
qupkake
Ready
Predict micro-pKa of organic molecules

QupKake is an innovative approach that combines graph neural network (GNN) models with semiempirical quantum mechanical (QM) features to forecast the micro-pKa values of organic molecules. QM plays a significant role in both identifying reaction sites and predicting micro-pKa values. Precisely predicting micro-pKa values is vital for comprehending and adjusting the acidity and basicity of organic compounds. This has significant applications in drug discovery, materials science, and environmental c

Pretrained
Annotation
Compound
Single
Value
Float
List
Up to 10 pKa values for the molecule
pKa
https://github.com/ersilia-os/eos3wzy
https://doi.org/10.1021/acs.jctc.4c00328
https://github.com/hutchisonlab/QupKake
BSD-3-Clause
LauraGomezjurado
17/7/2024
https://github.com/LauraGomezjurado
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3wzy.zip
Local
Q3
2024
mpro-covid19
Ready
Predict bioactivity against Main Protease of SARS-CoV-2

MProPred predicts the efficacy of compounds against the main protease of SARS-CoV-2, which is a promising drug target since it processes polyproteins of SARS-CoV-2. This model uses PaDEL-Descriptor to calculate molecular descriptors of compounds. It is based on a dataset of 758 compounds that have inhibition efficacy against the Main Protease, as published in peer-reviewed journals between January, 2020 and August, 2021. Input compounds are compared to compounds in the dataset to measure molecul

Pretrained
Annotation
Compound
Single
Value
Float
Single
Gives the pIC50 values for each compound to compare their bioactivity against the main protease
COVID19
https://github.com/ersilia-os/eos3nn9
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10289339/
https://github.com/Nadimfrds/Mpropred
MIT
HarmonySosa
1/7/2024
https://github.com/HarmonySosa
https://hub.docker.com/r/ersiliaos/eos3nn9
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3nn9.zip
Local
Q3
2024
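The mpro-covid19 record reports its output as pIC50 values. By the standard definition, pIC50 is the negative base-10 logarithm of the IC50 expressed in molar units, so higher values mean more potent compounds. The sketch below converts in both directions; it assumes IC50 values are given in nanomolar, and the function names are illustrative rather than taken from the MProPred code.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 (-log10 of IC50 in molar)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # -log10(ic50_nm * 1e-9) rewritten as 9 - log10(ic50_nm)
    return 9.0 - math.log10(ic50_nm)


def ic50_nm_from_pic50(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 in nanomolar."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    print(pic50_from_ic50_nm(1000.0))
```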
cardiotox-dictrank
Ready
Cardiotoxicity Classifier

Prediction of drug-induced cardiotoxicity as a binary classification of cardiotoxicity risk. The probability score depicts risk of the compound being cardiotoxic. Classification is based on the chemical data such as SMILES representations of compounds and a variety of descriptors such as Morgan fingerprints and Mordred physicochemical descriptors that describe the molecular structure of the drug interactions. Biological data is also used including gene expression and cellular paintings after dru

Retrained
Annotation
Compound
Single
Score
Float
Single
The model provides a probability score indicating the likelihood of a compound being cardiotoxic
Cardiotoxicity
DrugBank
https://github.com/ersilia-os/eos1pu1
https://doi.org/10.1021/acs.jcim.3c01834
https://github.com/srijitseal/DICTrank
None
kurysauce
29/6/2024
https://github.com/kurysauce
https://hub.docker.com/r/ersiliaos/eos1pu1
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1pu1.zip
Local
Q2
2024
phakinpro
Ready
Pharmacokinetics Profiler (PhaKinPro)

Pharmacokinetics Profiler (PhaKinPro) predicts the pharmacokinetic (PK) properties of drug candidates. It has been built using a manually curated database of 10,000 compounds with information for 12 PK endpoints. Each model provides a multi-classifier output for a single endpoint, along with a confidence estimate of the prediction and whether the query molecule is within the applicability domain of the model.

Pretrained
Annotation
Compound
Single
Score
String
List
A list of several ADME predictions
Microsomal stability
ADME
Metabolism
Half-life
Permeability
https://github.com/ersilia-os/eos39dp
https://pubs.acs.org/doi/10.1021/acs.jmedchem.3c02446
https://github.com/molecularmodelinglab/PhaKinPro
MIT
sucksido
3/5/2024
https://github.com/sucksido
https://hub.docker.com/r/ersiliaos/eos39dp
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos39dp.zip
Local
Q2
2024
whales-qmug
Ready
WHALES similarity search on 600k molecules from Q-Mug

Search Q-Mug based on WHALES descriptors. Q-Mug is a subset of 600k bioactive molecules from ChEMBL. Three conformers are given for each molecule. WHALES is a simple descriptor useful for scaffold hopping.

Pretrained
Sampling
Compound
Single
Compound
String
List
The top 100 most similar molecules are returned, based on WHALES descriptors. 3D conformer generation is done internally.
Similarity
https://github.com/ersilia-os/eos6ru3
https://link.springer.com/protocol/10.1007/978-1-0716-1209-5_2
https://github.com/ETHmodlab/scaffold_hopping_whales
GPL-3.0
miquelduranfrigola
22/4/2024
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos6ru3
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6ru3.zip
Local
Q2
2024
reinvent4-libinvent
Ready
REINVENT 4 LibInvent

REINVENT 4 LibInvent creates new molecules by appending R groups to a given input. If the input SMILES string contains specified attachment points, it is directly processed by LibInvent to generate new molecules. If no attachment points are given, the model tries to find potential attachment points and iterates through different combinations of these points. It passes each combination to LibInvent to generate new molecules.

Pretrained
Sampling
Compound
Single
Compound
String
List
Model generates up to 1000 similar molecules per input molecule.
Similarity
https://github.com/ersilia-os/eos6ost
https://chemrxiv.org/engage/chemrxiv/article-details/65463cafc573f893f1cae33a
https://github.com/MolecularAI/REINVENT4
Apache-2.0
ankitskvmdam
18/4/2024
https://github.com/ankitskvmdam
https://hub.docker.com/r/ersiliaos/eos6ost
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6ost.zip
Local
Q2
2024
cc-signaturizer-3d
Ready
Chemical Checker Signaturizer 3D

Building on the Chemical Checker bioactivity signatures (available as eos4u6p), the authors use the relation between stereoisomers and bioactivity of over 1M compounds to train stereochemically-aware signaturizers that better describe small molecule bioactivity properties. In this implementation, we provide the A1, A2, A3, B1, B4 and C3 signatures.

Pretrained
Representation
Compound
Single
Value
Float
List
2D projection of bioactivity signatures
Descriptor
Bioactivity profile
Embedding
https://github.com/ersilia-os/eos8aox
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-024-00867-4
https://gitlabsbnb.irbbarcelona.org/packages/signaturizer3d
MIT
GemmaTuron
19/3/2024
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos8aox
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8aox.zip
Local
Q1
2024
reinvent4-mol2mol-scaffold
Ready
REINVENT 4 Mol2MolScaffold

Mol2MolScaffold uses REINVENT4's mol2mol scaffold prior and mol2mol scaffold generic prior to generate around 500 new molecules similar to the provided molecules. The generated molecules will be relatively similar to the input molecules.

Pretrained
Sampling
Compound
Single
Compound
String
List
Model generates up to 500 similar molecules per input molecule.
Similarity
https://github.com/ersilia-os/eos57bx
https://chemrxiv.org/engage/chemrxiv/article-details/65463cafc573f893f1cae33a
https://github.com/MolecularAI/REINVENT4
Apache-2.0
ankitskvmdam
8/3/2024
https://github.com/ankitskvmdam
https://hub.docker.com/r/ersiliaos/eos57bx
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos57bx.zip
Local
Q1
2024
erg-fingerprints
Ready
ErG 2D Descriptors

The Extended Reduced Graph (ErG) approach uses the description of pharmacophore nodes to encode molecular properties, with the goal of correctly describing pharmacophoric properties, size and shape of molecules. It was benchmarked against Daylight fingerprints and outperformed them in 10 out of 11 cases. ErG descriptors are well suited for scaffold hopping approaches.

Pretrained
Representation
Compound
Single
Value
Float
List
Vector representing ErG fingerprint values
Descriptor
Fingerprint
https://github.com/ersilia-os/eos5guo
https://pubs.acs.org/doi/10.1021/ci050457y
https://www.rdkit.org/docs/source/rdkit.Chem.rdReducedGraphs.html
BSD-3.0
GemmaTuron
6/3/2024
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos5guo
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5guo.zip
Local
Q1
2024
whales-scaled
Ready
WHALES scaled

Scaled version of the WHALES descriptors (see eos3ae6). WHALES are holistic molecular descriptors useful for scaffold hopping, based on 3D structure to facilitate natural product featurization. The scaling uses sklearn's Robust Scaler trained on a random set of 100K molecules from ChEMBL.

Pretrained
Representation
Compound
Single
Value
Float
List
Scaled vector representation of a molecule
Natural product
Descriptor
https://github.com/ersilia-os/eos24ur
https://www.nature.com/articles/s42004-018-0043-x
https://github.com/grisoniFr/scaffold_hopping_whales
MIT
miquelduranfrigola
5/3/2024
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos24ur
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos24ur.zip
Local
Q1
2024
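The whales-scaled description says the descriptors are scaled with sklearn's Robust Scaler. As a rough illustration of what robust scaling does to one descriptor column, the sketch below centres on the median and divides by the interquartile range, using only the standard library. It assumes the scaler's default settings (median centring, 25th to 75th percentile range); the record does not state which settings the model uses, and the function names are hypothetical.

```python
import statistics


def fit_robust_scaler(column):
    """Return (center, scale) for one descriptor column: median and IQR."""
    # "inclusive" uses linear interpolation between the sorted data points.
    q1, median, q3 = statistics.quantiles(column, n=4, method="inclusive")
    iqr = q3 - q1
    # Guard against constant columns to avoid division by zero.
    return median, (iqr if iqr != 0 else 1.0)


def robust_transform(column, center, scale):
    """Apply (value - center) / scale to every entry of the column."""
    return [(value - center) / scale for value in column]


if __name__ == "__main__":
    reference = [1.0, 2.0, 3.0, 4.0, 5.0]  # stand-in for the reference set
    center, scale = fit_robust_scaler(reference)
    print(robust_transform([3.0, 10.0], center, scale))
```

Because the centre and scale come from the median and quartiles rather than the mean and variance, a few extreme descriptor values in the reference set have little effect on how new molecules are scaled.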
scaffold-decoration
Ready
Scaffold decoration

Sequential Attachment-based Fragment Embedding (SAFE) is a novel notation system that improves upon traditional molecular string representations like SMILES. SAFE reframes SMILES strings as an unordered sequence of interconnected fragment blocks while maintaining compatibility with existing SMILES parsers. This streamlines complex molecular design tasks by facilitating autoregressive generation under various constraints. The effectiveness of SAFE is demonstrated by trai

Pretrained
Sampling
Compound
Single
Compound
String
List
Model generates up to 1000 new molecules from input molecule by replacing side chains of the scaffold
Compound generation
https://github.com/ersilia-os/eos2401
https://arxiv.org/pdf/2310.10773.pdf
https://github.com/datamol-io/safe/tree/main
CC
Inyrkz
20/2/2024
https://github.com/Inyrkz
https://hub.docker.com/r/ersiliaos/eos2401
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2401.zip
Local
Q1
2024
dili-predictor
Ready
Early prediction of Drug-Induced Liver Injury

The DILI-Predictor predicts 10 features related to DILI toxicity, including in-vivo, in-vitro and physicochemical parameters. It has been developed by the Broad Institute using the DILIst dataset (1020 compounds) from the FDA and achieved a balanced accuracy of 70% on a test set of 255 compounds held out from the same dataset. The authors show how the model can correctly predict compounds that are not toxic in humans despite being toxic in mice.

Pretrained
Annotation
Compound
Single
Score
Float
List
Prediction of 10 DILI-related endpoints. The most important is the first, DILI. Threshold for DILI active is set at 0.16 by the authors.
Toxicity
Metabolism
https://github.com/ersilia-os/eos5gge
https://pubs.acs.org/doi/10.1021/acs.chemrestox.4c00015
https://github.com/Manas02/dili-pip
None
Zainab-ik
19/2/2024
https://github.com/Zainab-ik
https://hub.docker.com/r/ersiliaos/eos5gge
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5gge.zip
Local
Q1
2024
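The dili-predictor record states that the first of the 10 outputs is the DILI endpoint and that the authors set the activity threshold at 0.16. A minimal sketch of applying that cut-off to one prediction row follows; the function name is hypothetical, and treating a score exactly equal to the threshold as active is an assumption, since the record does not say whether the comparison is inclusive.

```python
DILI_THRESHOLD = 0.16  # value quoted in the record's Interpretation field


def is_dili_active(endpoint_scores, threshold=DILI_THRESHOLD):
    """Flag a compound as DILI active from its list of endpoint scores."""
    if not endpoint_scores:
        raise ValueError("expected at least one endpoint score")
    dili_score = endpoint_scores[0]  # the first endpoint is DILI
    return dili_score >= threshold   # inclusive comparison is an assumption


if __name__ == "__main__":
    print(is_dili_active([0.21, 0.05, 0.40]))
```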
admet-ai-prediction
Ready
ADMET properties prediction

ADMET AI is a framework for carrying out fast batch predictions for ADMET properties. It is based on an ensemble of five Chemprop-RDKit models and has been trained on 41 tasks from the ADMET group in Therapeutics Data Commons (v0.4.1). Out of these 41 tasks, there are 31 classification tasks and 10 regression tasks. In addition, the output contains 8 physicochemical properties, namely, molecular weight, logP, hydrogen bond acceptors, hydrogen bond donors, Lipinski's Rule of 5, QED, stereo c

Pretrained
Annotation
Compound
Single
Score
Value
Float
List
ADMET outcomes, including physicochemical properties and classification tasks, as well as percentile normalizations based on the DrugBank chemical space.
ADME
Toxicity
https://github.com/ersilia-os/eos7d58
https://academic.oup.com/bioinformatics/article/40/7/btae416/7698030
https://github.com/swansonk14/admet_ai
MIT
https://eos7d58-awe6b.ondigitalocean.app/
DhanshreeA
7/2/2024
https://github.com/DhanshreeA
https://hub.docker.com/r/ersiliaos/eos7d58
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7d58.zip
https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos7d58
Yes
Local
Q1
2024
reinvent4-mol2mol-medium-similarity
Ready
REINVENT 4 Mol2MolMediumSimilarity

The Mol2MolMediumSimilarity leverages REINVENT4's mol2mol medium similarity prior to generate up to 100 unique molecules. The generated molecules will be relatively similar to the input molecule.

Pretrained
Sampling
Compound
Single
Compound
String
List
Model generates up to 100 similar molecules per input molecule.
Similarity
https://github.com/ersilia-os/eos694w
https://chemrxiv.org/engage/chemrxiv/article-details/65463cafc573f893f1cae33a
https://github.com/MolecularAI/REINVENT4
Apache-2.0
ankitskvmdam
7/2/2024
https://github.com/ankitskvmdam
https://hub.docker.com/r/ersiliaos/eos694w
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos694w.zip
Local
Q1
2024
antibiotics-ai-cytotox
Ready
Human cytotoxicity endpoints

The authors tested the dataset of 39312 compounds used to train the antibiotics-ai model (eos18ie) against several cytotoxicity endpoints: human liver carcinoma cells (HepG2), human primary skeletal muscle cells (HSkMCs) and human lung fibroblast cells (IMR-90). Cellular viability was measured after 2-3 days of treatment with each compound at 10 μM and activities were binarized using a 90% cell viability cut-off. 341 (8.5%), 490 (3.8%) and 447 (8.8%) compounds classified as cytotoxic for HepG2

Pretrained
Annotation
Compound
Single
Score
Float
List
Predicting cytotoxicity in human liver carcinoma cells (HepG2), human primary skeletal muscle cells (HSkMCs) and human lung fibroblast cells (IMR-90)
Cytotoxicity
https://github.com/ersilia-os/eos42ez
https://www.nature.com/articles/s41586-023-06887-8
https://github.com/felixjwong/antibioticsai
MIT
Richiio
5/2/2024
https://github.com/Richiio
https://hub.docker.com/r/ersiliaos/eos42ez
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos42ez.zip
Yes
Local
Q1
2024
inter-dili
Ready
InterDILI: drug-induced injury prediction

This model has been trained on a publicly available collection of 5 datasets manually curated for drug-induced liver injury (DILI). The DILI outcome has been binarised, and ECFP descriptors, together with physicochemical properties, have been used to train a random forest classifier which achieves AUROC > 0.9.

Retrained
Annotation
Compound
Single
Score
Float
Single
Probability of Drug-Induced Liver Injury (DILI); a higher score indicates higher risk
Toxicity
Human
Metabolism
https://github.com/ersilia-os/eos21q7
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-023-00796-8
https://github.com/bmil-jnu/InterDILI
None
leilayesufu
30/1/2024
https://github.com/leilayesufu
https://hub.docker.com/r/ersiliaos/eos21q7
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos21q7.zip
Local
Q1
2024
antibiotics-ai-saureus
Ready
Antibiotic activity prediction against Staphylococcus aureus

The authors use a mid-size dataset (more than 30k compounds) to train an explainable graph-based model to identify potential antibiotics with low cytotoxicity. The model uses a substructure-based approach to explore the chemical space. Using this method, they were able to screen 283 compounds and identify a candidate active against methicillin-resistant S. aureus (MRSA) and vancomycin-resistant enterococci.

Pretrained
Annotation
Compound
Single
Score
Float
Single
Probability of growth inhibition (80% cut-off at 50 μM)
Antimicrobial activity
ESKAPE
https://github.com/ersilia-os/eos18ie
https://www.nature.com/articles/s41586-023-06887-8
https://github.com/felixjwong/antibioticsai
MIT
Richiio
26/1/2024
https://github.com/Richiio
https://hub.docker.com/r/ersiliaos/eos18ie
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos18ie.zip
Yes
Local
Q1
2024
scaffold-morphing
Ready
Scaffold morphing

Sequential Attachment-based Fragment Embedding (SAFE) is a novel notation system that improves upon traditional molecular string representations like SMILES. SAFE reframes SMILES strings as an unordered sequence of interconnected fragment blocks while maintaining compatibility with existing SMILES parsers. This streamlines complex molecular design tasks by facilitating autoregressive generation under various constraints. The effectiveness of SAFE is demonstrated by trai

Pretrained
Sampling
Compound
Single
Compound
String
List
Model generates new molecules from input molecule by replacing core structures of input molecule.
Compound generation
https://github.com/ersilia-os/eos8bhe
https://arxiv.org/pdf/2310.10773.pdf
https://github.com/datamol-io/safe/tree/main
CC
Inyrkz
12/1/2024
https://github.com/Inyrkz
https://hub.docker.com/r/ersiliaos/eos8bhe
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8bhe.zip
Local
Q1
2024
ngonorrhoeae-inhibition
Ready
Growth Inhibitors of Neisseria gonorrhoeae

The authors curated a dataset of 282 compounds from ChEMBL, of which 160 (56.7%) were labeled as active N. gonorrhoeae inhibitor compounds. They used this dataset to build a naïve Bayesian model and used it to screen a commercial library. With this method, they identified and validated two hits. We have used the dataset to build a model using LazyQSAR with Ersilia Compound Embeddings as molecular descriptors. LazyQSAR is an AutoML Ersilia-developed library.

Retrained
Annotation
Compound
Single
Score
Float
Single
Probability of activity for the inhibition of the pathogen N. gonorrhoeae
Antimicrobial activity
ChEMBL
N.gonorrhoeae
https://github.com/ersilia-os/eos5cl7
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8274436/
https://github.com/ersilia-os/lazy-qsar
GPL-3.0
Richiio
3/1/2024
https://github.com/Richiio
https://hub.docker.com/r/ersiliaos/eos5cl7
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5cl7.zip
Yes
Local
Q1
2024
kgpgt-embedding
In progress
Knowledge-guided pre-trained graph transformer

Neural fingerprints (embeddings) based on a knowledge-guided graph transformer. This model represents a novel self-supervised learning framework for the representation learning of molecular graphs, consisting of a novel graph transformer architecture, LiGhT, and a knowledge-guided pre-training strategy.

Pretrained
Representation
Compound
Single
Value
Float
List
Knowledge-driven embedding
Descriptor
https://github.com/ersilia-os/eos8aa5
https://www.nature.com/articles/s41467-023-43214-1
https://github.com/lihan97/KPGT
Apache-2.0
miquelduranfrigola
17/12/2024
https://github.com/miquelduranfrigola
Local
Q4
2024
mole-embeddings
In progress
MolE molecular embeddings
Representation
Compound
Single
https://github.com/ersilia-os/eos3dq3
https://www.nature.com/articles/s41467-024-53751-y
https://github.com/recursionpharma/mole_public
miquelduranfrigola
18/11/2024
https://github.com/miquelduranfrigola
Local
Q4
2024
datamol-basic-descriptors
In progress
Basic molecular descriptors from Datamol

Basic molecular descriptors calculated with the Datamol package, including molecular weight, lipophilicity (cLogP), hydrogen bond donors, hydrogen bond acceptors, etc. These descriptors are generally useful to annotate small molecule libraries. They are not recommended for QSAR modeling since they are probably too simple for most scenarios.

Pretrained
Representation
Compound
Single
Value
Float
List
Basic molecular descriptors. Some descriptors are floats and some are counts.
Descriptor
https://github.com/ersilia-os/eos4djh
https://github.com/datamol-io/datamol
https://docs.datamol.io/0.7.4/api/datamol.descriptors.html
Apache-2.0
miquelduranfrigola
9/11/2024
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos4djh
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4djh.zip
Local
Q4
2024
drug-metabolites
In progress
Drug metabolites prediction
https://github.com/ersilia-os/eos35g4
https://pubs.rsc.org/en/content/articlelanding/2020/sc/d0sc02639e
https://github.com/KavrakiLab/MetaTrans
miquelduranfrigola
24/10/2024
https://github.com/miquelduranfrigola
Local
Q4
2024
mtb-permeability
In progress
Mtb cell wall permeability

This model predicts the probability of a compound passing the Mycobacterium tuberculosis cell wall membrane. The classifier (permeable vs not permeable) was trained on a dataset of 5368 molecules. It is a simple classifier (SVC) using Mordred descriptors.

Pretrained
Annotation
Compound
Single
Score
Float
Single
Probability score of a compound passing the Mtb cell wall membrane
M.tuberculosis
Permeability
https://github.com/ersilia-os/eos3ujl
https://link.springer.com/article/10.1007/s11030-024-10952-3
https://github.com/PGlab-NIPER/MTB_Permeability
GPL-3.0-or-later
miquelduranfrigola
16/10/2024
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos3ujl
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3ujl.zip
Local
Q4
2024
cheese-sampler
In progress
CHEESE similarity search with multiple similarity measures and against various databases

CHEESE is a chemical embeddings search engine based on approximate nearest neighbors. It supports multiple similarity measures and can search against various databases, including ENAMINE REAL, ZINC, and others. Among the similarity measures, CHEESE supports the classical Morgan fingerprints as well as 3D shape and electrostatics similarities. The search engine is available online. This model from the Ersilia Model Hub is intended to be used as a sampler for the CHEESE search engine, where the user

Online
Sampling
Compound
Single
Compound
String
List
A list of up to 100 similar compounds to the input compound.
Similarity
https://github.com/ersilia-os/eos9uqy
https://chemrxiv.org/engage/chemrxiv/article-details/67250915f9980725cfcd1f6f
https://cheese.deepmedchem.com/
GPL-3.0-or-later
miquelduranfrigola
19/8/2024
https://github.com/miquelduranfrigola
Local
Q3
2024
one-molecule-mollib
In progress
One-molecule MolLib

MolLib is a low-resource generative model trained on ChEMBL data. It is able to generate drug-like and natural-product-like compounds. In this implementation, given an initial molecule, we first sample similar compounds and then train the generator.

Pretrained
Sampling
Compound
Single
Compound
String
List
Compounds generated by mollib around the chemical space of the input compound
Similarity
https://github.com/ersilia-os/eos4f8y
https://www.nature.com/articles/s42256-020-0160-y
https://github.com/ETHmodlab/virtual_libraries
GPL-3.0-only
miquelduranfrigola
23/7/2024
https://github.com/miquelduranfrigola
Local
Q3
2024
unit-testing-compounds
In progress
Unit Testing Compounds Ersilia Pack
https://github.com/ersilia-os/eos30d7
https://ersilia.io
https://github.com/ersilia-os/ersilia
DhanshreeA
15/7/2024
https://github.com/DhanshreeA
Local
Q3
2024
reinvent4-linkinvent
In progress
REINVENT 4 LinkInvent
https://github.com/ersilia-os/eos69mr
https://chemrxiv.org/engage/chemrxiv/article-details/65463cafc573f893f1cae33a
https://github.com/MolecularAI/REINVENT4
ankitskvmdam
19/5/2024
https://github.com/ankitskvmdam
Local
Q2
2024
squid
In progress
SQUID 3D shape generation

Equivariant shape-conditioned generation of 3D molecules for ligand-based drug design. SQUID can generate chemically diverse molecules for arbitrary molecular shapes. Shape is defined by the input molecule.

Pretrained
Sampling
Compound
Single
Compound
String
List
Molecules matching the 3D shape of the input compound are suggested
Compound generation
https://github.com/ersilia-os/eos8vud
https://arxiv.org/abs/2210.04893
https://github.com/keiradams/SQUID
MIT
miquelduranfrigola
1/5/2024
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos8vud
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8vud.zip
Local
Q2
2024
delfta-qm
In progress
DelFTa quantum mechanical properties prediction
https://github.com/ersilia-os/eos9aqt
https://pubs.rsc.org/en/content/articlehtml/2022/cp/d2cp00834c
https://github.com/josejimenezluna/delfta
miquelduranfrigola
24/4/2024
https://github.com/miquelduranfrigola
Local
Q2
2024
opt-admet
To do
ADMET Properties Optimization
https://github.com/ersilia-os/eos2fg2
https://www.nature.com/articles/s41596-023-00942-4#code-availability
https://github.com/antwiser/OptADMET
Zainab-ik
13/2/2024
https://github.com/Zainab-ik
Local
Q1
2024
unit-test-compounds
Test
Unit test model for compounds

Given a SMILES string, the model counts the number of characters and other string metrics. This model is just for unit testing; it is not intended to be used in a real-world scenario. The Ersilia codebase will rely heavily on this model repository to test various functionalities of the CLI, such as fetching from GitHub and DockerHub, and the repository will function as the model fixture for Ersilia's integration tests.

In-house
Regression
Compound
Single
Value
Integer
List
Simple count of characters in a SMILES string
Fingerprint
https://github.com/ersilia-os/eos5iy5
https://ersilia.io
https://github.com/ersilia-os/ersilia
GPL-3.0
miquelduranfrigola
3/7/2024
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos5iy5
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5iy5.zip
Local
Q3
2024
covid-19-drug-repurposing
Archived
DRKG_COVID19
https://github.com/ersilia-os/eos3nl8
https://arxiv.org/abs/2007.10261v1
https://github.com/gnn4dr/DRKG
Inyrkz
5/12/2023
https://github.com/Inyrkz
Q4
2023
biogpt-embeddings
Archived
BioGPT embeddings

BioGPT is a pre-trained transformer for biomedical text. This domain-specific model has been trained on large-scale biomedical literature. In this implementation, we use BioGPT to generate numerical embeddings for bioassay and other biomedical texts.

Pretrained
Representation
Text
Single
Descriptor
Float
List
Biomedical text embedding
Embedding
Biomedical text
https://github.com/ersilia-os/eos1xje
https://academic.oup.com/bib/article/23/6/bbac409/6713511?guestAccessKey=a66d9b5d-4f83-4017-bb52-405815c907b9&login=false
https://github.com/microsoft/biogpt
MIT
miquelduranfrigola
30/8/2023
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos1xje
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1xje.zip
Local
Q3
2023
hdac3-inhibition
Ready
Identifying HDAC3 inhibitors

The model predicts the inhibitory potential of small molecules against Histone deacetylase 3 (HDAC3), a relevant human target for cancer, inflammation, neurodegenerative diseases and diabetes. The authors have used a dataset of 1098 compounds from ChEMBL and validated the model using the benchmark MUBD-HDAC3.

Pretrained
Annotation
Compound
Single
Score
Float
Single
Probability that the molecule is an HDAC3 inhibitor
Cancer
ChEMBL
https://github.com/ersilia-os/eos1n4b
https://onlinelibrary.wiley.com/doi/10.1002/minf.202000105
https://github.com/jwxia2014/HDAC3i-Finder
GPL-3.0
Richiio
14/12/2023
https://github.com/Richiio
https://hub.docker.com/r/ersiliaos/eos1n4b
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1n4b.zip
Local
Q4
2023
mrlogp
Ready
MRlogP: neural network-based logP prediction for druglike small molecules

The authors use a two-step approach to build a model that accurately predicts the lipophilicity (LogP) of small molecules. First, they train the model on a large amount of low-accuracy predicted LogP values, and then they fine-tune the network using a small, accurate dataset of 244 druglike compounds. The model achieves an average root mean squared error of 0.988 and 0.715 against druglike molecules from Reaxys and PHYSPROP, respectively.

Pretrained
Annotation
Compound
Single
Value
Float
Single
Predicted LogP of small molecules
Lipophilicity
LogP
https://github.com/ersilia-os/eos9ym3
https://www.mdpi.com/2227-9717/9/11/2029/htm
https://github.com/JustinYKC/MRlogP
MIT
leilayesufu
12/12/2023
https://github.com/leilayesufu
https://hub.docker.com/r/ersiliaos/eos9ym3
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9ym3.zip
Local
Q4
2023
dmpnn-herg
Ready
Prediction of hERG channel blockers with directed message passing neural networks

This model leverages the ChemProp network (D-MPNN) to build a predictor of hERG-mediated cardiotoxicity. The model has been trained using a published dataset which contains 7889 molecules with several cut-offs for hERG blocking activity. The authors select a 10 uM cut-off. This implementation of the model does not use any specific featurizer, though the authors suggest the moe206 descriptors (closed-source) improve performance even further.

Pretrained
Annotation
Compound
Single
Score
Float
Single
Probability of blocking hERG (cut-off: 10uM)
Cardiotoxicity
hERG
Toxicity
Descriptor
https://github.com/ersilia-os/eos30f3
https://pubs.rsc.org/en/content/articlehtml/2022/ra/d1ra07956e
https://github.com/AI-amateur/DMPNN-hERG
None
leilayesufu
4/12/2023
https://github.com/leilayesufu
https://hub.docker.com/r/ersiliaos/eos30f3
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos30f3.zip
Local
Q4
2023
chemprop-burkholderia
Ready
Burkholderia cenocepacia inhibition

Prediction of antimicrobial potential using a dataset of 29537 compounds screened against the antibiotic-resistant pathogen Burkholderia cenocepacia. The model uses the Chemprop Directed Message Passing Neural Network (D-MPNN) and has an AUC score of 0.823 for the test set. It has been used to virtually screen the FDA-approved drugs as well as a collection of natural products (>200k compounds), with hit rates of 26% and 12%, respectively.

Pretrained
Annotation
Compound
Single
Score
Float
Single
Probability that a compound inhibits the drug-resistant bacterium Burkholderia cenocepacia. Scores range from 0 to 1, with 1 indicating the highest probability of growth inhibitory activity.
Antimicrobial activity
https://github.com/ersilia-os/eos5xng
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9624395/
https://github.com/cardonalab/Prediction-of-ATB-Activity
GPL-3.0
Richioo
3/12/2023
https://github.com/Richioo
https://hub.docker.com/r/ersiliaos/eos5xng
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5xng.zip
Yes
Local
Q4
2023
pgmg-pharmacophore
Ready
Pharmacophore-guided molecular generation

Based on a molecule's pharmacophore, this model generates new molecules de novo to match the pharmacophore. Internally, pharmacophore hypotheses are generated for a given ligand. A graph neural network encodes spatially distributed chemical features and a transformer decoder generates molecules.

Pretrained
Sampling
Compound
Single
Compound
String
List
Model generates new molecules from input molecule by first creating pharmacophore hypotheses and then constraining generation.
Chemical graph model
Compound generation
https://github.com/ersilia-os/eos69e6
https://www.nature.com/articles/s41467-023-41454-9
https://github.com/CSUBioGroup/PGMG
MIT
miquelduranfrigola
1/12/2023
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos69e6
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos69e6.zip
Local
Q4
2023
morgan-binary-fps
Ready
Morgan fingerprints in binary form (radius 3, 2048 dimensions)

Morgan fingerprints are one of the most widely used molecular representations. They are circular representations (starting from an atom, the atoms within a radius n around it are explored) and can have thousands of features. This implementation uses the RDKit package with radius 3 and 2048 dimensions, providing a binary vector as output. For Morgan counts, see model eos5axz.

Pretrained
Representation
Compound
Single
Value
Integer
List
Binary vector representing the SMILES
Descriptor
Fingerprint
https://github.com/ersilia-os/eos4wt0
https://pubmed.ncbi.nlm.nih.gov/20426451/
https://www.rdkit.org/docs
BSD-3.0
GemmaTuron
1/12/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos4wt0
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4wt0.zip
Local
Q4
2023
pmapper-3d
Ready
3D pharmacophore descriptor

The pharmacophore mapper (pmapper) identifies common 3D pharmacophores of active compounds against a specific target and uniquely encodes them with hashes suitable for fast identification of identical pharmacophores. The obtained signatures are amenable for downstream ML tasks.

Pretrained
Representation
Compound
Single
Value
Integer
List
Vector representation of pharmacophores
Descriptor
Fingerprint
https://github.com/ersilia-os/eos4x30
https://www.mdpi.com/1422-0067/20/23/5834
https://github.com/DrrDom/pmapper
BSD-3.0
GemmaTuron
28/11/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos4x30
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4x30.zip
Local
Q4
2023
molfeat-usrcat
Ready
USR descriptors with pharmacophoric constraints

USRCAT is a real-time ultrafast molecular shape recognition method with pharmacophoric constraints. It integrates atom type information into the traditional Ultrafast Shape Recognition (USR) descriptor to improve the performance of shape-based virtual screening, making it able to discriminate between compounds with similar shape but distinct pharmacophoric features.

Pretrained
Representation
Compound
Single
Value
Float
List
60 features based on USRCAT
Descriptor
Embedding
https://github.com/ersilia-os/eos1ut3
https://jcheminf.biomedcentral.com/articles/10.1186/1758-2946-4-27
https://molfeat.datamol.io/featurizers/usrcat
Apache-2.0
GemmaTuron
28/11/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos1ut3
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1ut3.zip
Local
Q4
2023
antitb-seattle
Ready
Antituberculosis activity prediction

Prediction of the activity of small molecules against Mycobacterium tuberculosis. This model has been developed by Ersilia thanks to the data provided by Seattle Children's (Dr. Tanya Parish research group). In vitro activity against M. tuberculosis was measured in a single-point inhibition assay (10000 molecules), and selected compounds (259) were assayed in MIC50 and MIC90 assays. Cut-offs have been determined according to the researcher's guidance.

In-house
Classification
Compound
Single
Compound
Float
List
Probability of inhibition of M.tb in vitro in the MIC50, MIC90 and whole cell assays at cut-offs 10 and 20 uM and 50%, respectively
M.tuberculosis
Antimicrobial activity
MIC90
Tuberculosis
https://github.com/ersilia-os/eos9ivc
https://pubmed.ncbi.nlm.nih.gov/30650074/
https://github.com/ersilia-os/lazy-qsar
GPL-3.0
GemmaTuron
24/11/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos9ivc
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9ivc.zip
No
Local
Q4
2023
molpmofit
Ready
Molecular Prediction Model Fine-Tuning (MolPMoFiT) encodings

Using self-supervised learning, the authors pre-trained a large model using one million unlabelled molecules from ChEMBL. This model can subsequently be fine-tuned for various QSAR tasks. Here, we provide the encodings for the molecular structures using the pre-trained model, not the fine-tuned QSAR models.

Pretrained
Representation
Compound
Single
Value
Float
List
Embedding vectors for each SMILES, represented as a matrix in which each row is the embedding vector of one SMILES character, with a dimension of 400. The pretrained model is loaded using the fastai library.
Descriptor
Embedding
https://github.com/ersilia-os/eos9zw0
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00430-x
https://github.com/XinhaoLi74/MolPMoFiT
CC
GemmaTuron
6/11/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos9zw0
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9zw0.zip
Local
Q4
2023
moler-enamine-blocks
Ready
Extending molecular scaffolds with building blocks

MoLeR is a graph-based generative model that combines fragment-based and atom-by-atom generation of new molecules with scaffold-constrained optimization. It does not depend on generation history and therefore MoLeR is able to complete arbitrary scaffolds. The model has been trained on the GuacaMol dataset. Here we sample the 300k building blocks library from Enamine.

Pretrained
Sampling
Compound
Single
Compound
String
List
1000 new molecules are sampled for each input molecule, preserving its scaffold.
Chemical graph model
Compound generation
https://github.com/ersilia-os/eos633t
https://arxiv.org/abs/2103.03864
https://github.com/microsoft/molecule-generation
MIT
miquelduranfrigola
3/11/2023
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos633t
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos633t.zip
Local
Q4
2023
small-world-wuxi
Ready
Small World Wuxi search

Small World is an index of chemical space containing more than 230B molecular substructures. Here we use the Small World API to post a query to the SmallWorld server. We sample 100 molecules within a distance of 10 specifically for the Wuxi map, not the entire SmallWorld domain. Please check other small-world models available in our hub.

Online
Sampling
Compound
Single
Compound
String
List
List of 100 nearest neighbors
Similarity
https://github.com/ersilia-os/eos3kcw
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3606195/
https://pypi.org/project/smallworld-api/
MIT
miquelduranfrigola
2/11/2023
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos3kcw
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3kcw.zip
Local
Q4
2023
small-world-zinc
Ready
Small World Zinc search

Small World is an index of chemical space containing more than 230B molecular substructures. Here we use the Small World API to post a query to the SmallWorld server. We sample 100 molecules within a distance of 10 specifically for the ZINC map, not the entire SmallWorld domain. Please check other small-world models available in our hub.

Online
Sampling
Compound
Single
Compound
String
List
List of 100 nearest neighbors
Similarity
https://github.com/ersilia-os/eos1d7r
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3606195/
https://pypi.org/project/smallworld-api/
MIT
miquelduranfrigola
2/11/2023
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos1d7r
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1d7r.zip
Local
Q4
2023
small-world-enamine-real
Ready
Small World Enamine REAL search

Small World is an index of chemical space containing more than 230B molecular substructures. Here we use the Small World API to post a query to the SmallWorld server. We sample 100 molecules within a distance of 10 specifically for the Enamine REAL map, not the entire SmallWorld domain. Please check other small-world models available in our hub.

Online
Sampling
Compound
Single
Compound
String
List
List of 100 nearest neighbors
Similarity
https://github.com/ersilia-os/eos9ueu
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3606195/
https://pypi.org/project/smallworld-api/
MIT
miquelduranfrigola
1/11/2023
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos9ueu
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9ueu.zip
Local
Q4
2023
mycetos
Ready
Inhibition of Eumycetoma from MycetOS

This model predicts the growth of the fungus M. mycetomatis, causal agent of Mycetoma, in the presence of small-molecule drugs. It has been developed using the data from MycetOS, an open source initiative aiming at finding new patent-free drugs. The model has been trained using the LazyQSAR package (MorganBinaryClassifier) from Ersilia.

In-house
Annotation
Compound
Single
Score
Float
Single
Probability of inhibition of M. mycetomatis (growth assay, cut-off at 20% growth)
Mycetoma
Antifungal activity
https://github.com/ersilia-os/eos4f95
https://www.ijidonline.com/article/S1201-9712(20)31735-5/fulltext
https://github.com/ersilia-os/lazy-qsar
GPL-3.0
GemmaTuron
27/9/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos4f95
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4f95.zip
No
Local
Q3
2023
hdac1-inhibition
Ready
Inhibition of HDAC1

Prediction of the inhibition of the human Histone Deacetylase 1 to revert HIV latency. The dataset is composed of all available pIC50 values from ChEMBL target 325, and the model has been developed using Ersilia's LazyQSAR package (MorganBinaryClassifier).

In-house
Annotation
Compound
Single
Score
Float
List
Probability of inhibition of HDAC1 at cut-offs pIC50 7 (0.1uM) and 8 (10nM)
HIV
Human
HDAC1
https://github.com/ersilia-os/eos2zmb
https://www.ebi.ac.uk/chembl/target_report_card/CHEMBL325/
https://github.com/ersilia-os/lazy-qsar
GPL-3.0
GemmaTuron
27/9/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos2zmb
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2zmb.zip
No
Local
Q3
2023
chembl-sampler
Ready
ChEMBL Molecular Sampler

A simple sampler of the ChEMBL database using their API. It looks for similar molecules to the input molecule and returns a list of 100 molecules by default. This model has been developed by Ersilia. It posts queries to an online server.

Pretrained
Sampling
Compound
Single
Compound
String
List
100 nearest molecules in ChEMBL
Similarity
https://github.com/ersilia-os/eos1noy
https://academic.oup.com/nar/article/40/D1/D1100/2903401
https://github.com/ersilia-os/chem-sampler/blob/main/chemsampler/samplers/chembl/sampler.py
GPL-3.0
GemmaTuron
4/9/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos1noy
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1noy.zip
Local
Q3
2023
hepg2-mmv
Ready
HepG2 Toxicity - MMV

This model predicts the toxicity of small molecules in HepG2 cells. It has been developed by Ersilia thanks to data provided by MMV. We have used two cut-offs to define activity (5 and 10 uM, respectively) with a dataset of 1335 molecules. 5-fold cross-validation showed an AUROC of 0.8 and 0.77, respectively.

In-house
Classification
Compound
Single
Probability
Float
List
Probability of toxicity in HepG2 cells. Cut-offs: 5 and 10 uM
Toxicity
Human
https://github.com/ersilia-os/eos3le9
https://ersilia.io
https://github.com/ersilia-os/lazy-qsar
GPL-3.0
GemmaTuron
24/8/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos3le9
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3le9.zip
No
Local
Q3
2023
malaria-mmv
Ready
Antimalarial activity (MMV)

Prediction of the in vitro antimalarial potential of small molecules. This model has been developed by Ersilia thanks to experimental data provided by MMV. The model provides the probability of inhibition of the malaria parasite (NF54) measured both as percentage of inhibition (with luminescence and LDH) and IC50. 5-fold cross-validation shows AUROC>0.75 in all models.

In-house
Classification
Compound
Single
Probability
Float
Single
Probability of inhibiting the malaria parasite (strain NF54) in IC50 (threshold 1uM) and percentage of inhibition (50%, measured by LDH and Lum)
Malaria
P.falciparum
IC50
https://github.com/ersilia-os/eos4rta
https://ersilia.io
https://github.com/ersilia-os/lazy-qsar
GPL-3.0
GemmaTuron
24/8/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos4rta
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4rta.zip
No
Local
Q3
2023
schisto-swisstph
Ready
Anti-schistosomiasis activity

Prediction of the activity of small molecules against the schistosoma parasite. This model has been developed by Ersilia thanks to the data provided by the Swiss TPH. In vitro activity against newly transformed schistosoma (NTS) and adult worms was measured (% of inhibition of activity and IC50, respectively).

In-house
Classification
Compound
Single
Probability
Float
List
The probabilities of the molecule being active against schistosoma in the NTS stage (in a % of inhibition assay at 10uM, with cut-offs at 70 and 90% inhibition) and the adult stage (in an IC50 assay at cut-offs 5 and 10uM)
Neglected tropical disease
Schistosomiasis
IC50
https://github.com/ersilia-os/eos2l0q
https://pubmed.ncbi.nlm.nih.gov/30398059
https://github.com/ersilia-os/lazy-qsar
GPL-3.0
GemmaTuron
24/8/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos2l0q
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2l0q.zip
No
Local
Q3
2023
chemprop-abaumannii
Ready
Inhibition of Acinetobacter baumannii growth

This model is a Chemprop neural network trained with a growth inhibition dataset. Authors screened ~7,500 molecules for those that inhibited the growth of A. baumannii in vitro. They discovered abaucin, an antibacterial compound with narrow-spectrum activity against A. baumannii.

Pretrained
Annotation
Compound
Single
Score
Float
Single
Probability of growth inhibition of the bacterium A. baumannii (threshold > 80%)
A.baumannii
Antimicrobial activity
https://github.com/ersilia-os/eos3804
https://www.nature.com/articles/s41589-023-01349-8
https://github.com/GaryLiu152/chemprop_abaucin
None
https://eos3894-gz5nz.ondigitalocean.app/
miquelduranfrigola
23/8/2023
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos3804
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3804.zip
https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos3804
Yes
Online
Q3
2023
pubchem-sampler
Ready
PubChem Molecular Sampler

A simple sampler of the PubChem database using their API. It looks for similar molecules to the input molecule and returns a list of 100 molecules by default. This model has been developed by Ersilia and posts queries to an online server.

Pretrained
Similarity
Compound
Single
Compound
String
List
100 nearest molecules in PubChem
Similarity
https://github.com/ersilia-os/eos2hzy
https://academic.oup.com/nar/article/51/D1/D1373/6777787
https://github.com/ersilia-os/chem-sampler/blob/main/chemsampler/samplers/pubchem/sampler.py
GPL-3.0
GemmaTuron
10/8/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos2hzy
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2hzy.zip
Local
Q3
2023
stoned-sampler
Ready
Stoned Sampler

The STONED sampler uses small modifications to molecules represented as SELFIES to perform a search of the chemical space and generate new molecules. The use of string modifications in the SELFIES molecular representation bypasses the need for large amounts of data while maintaining a performance comparable to deep generative models.

Pretrained
Generative
Compound
Single
Compound
String
List
Up to 1000 derivatives of the input molecule
Compound generation
https://github.com/ersilia-os/eos8fma
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8153210/
https://github.com/aspuru-guzik-group/stoned-selfies
Apache-2.0
GemmaTuron
8/8/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos8fma
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8fma.zip
Local
Q3
2023
smiles-pe
Ready
SmilesPE: tokenizer algorithm for SMILES, DeepSMILES, and SELFIES

The SMILES Pair Encoding method generates SMILES substring tokens based on high-frequency token pairs from large chemical datasets. This method is well-suited for both QSAR modeling and generative models. The model provided here has been pretrained using ChEMBL.

Pretrained
Generative
Compound
Single
Compound
String
Flexible List
A data-driven tokenization method for SMILES-based deep learning models in cheminformatics, demonstrating high performance in molecular generation and QSAR prediction tasks compared to atom-level tokenization
Chemical language model
Chemical notation
ChEMBL
https://github.com/ersilia-os/eos1mxi
https://pubs.acs.org/doi/abs/10.1021/acs.jcim.0c01127
https://github.com/XinhaoLi74/SmilesPE
Apache-2.0
Richiio
2/8/2023
https://github.com/Richiio
https://hub.docker.com/r/ersiliaos/eos1mxi
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1mxi.zip
Local
Q3
2023
osm-series4
Ready
Antimalarial activity from OSM

This model predicts the antimalarial potential of small molecules in vitro. We have collected the data available from the Open Source Malaria Series 4 molecules and used two cut-offs to define activity, 1 uM and 2.5 uM. The training has been done with the LazyQSAR package (Morgan Binary Classifier) and shows an AUROC >0.8 in a 5-fold cross-validation on 20% of the data held out as test. These models have been used to generate new series 4 candidates by Ersilia.

Pretrained
Classification
Compound
Single
Probability
Float
List
Probability of killing P.falciparum in vitro (IC50 < 1uM and 2.5uM, respectively)
Malaria
P.falciparum
IC50
https://github.com/ersilia-os/eos7yti
https://pubs.acs.org/doi/10.1021/acscentsci.6b00086
https://github.com/ersilia-os/lazy-qsar
GPL-3.0
GemmaTuron
2/8/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos7yti
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7yti.zip
No
Local
Q3
2023
fasmifra
Ready
FasmiFra molecule generator

FasmiFra is a molecular generator based on (Deep)SMILES fragments. The authors use DeepSMILES to ensure the generated molecules are syntactically valid, and by working with string operations they are able to obtain high performance (>340,000 molecules/s). Here, we use 100k compounds from ChEMBL to sample fragments. Only assembled molecules containing one of the fragments of the input molecule are retained.

Pretrained
Generative
Compound
Single
Compound
String
List
1000 generated molecules per input
Compound generation
https://github.com/ersilia-os/eos4qda
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00566-4
https://github.com/UnixJunkie/FASMIFRA
GPL-3.0
miquelduranfrigola
1/8/2023
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos4qda
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4qda.zip
Local
Q3
2023
malaria-mam
Ready
Antimalarial activity for sexual stage and asexual blood stage (ABS)

Prediction of the antimalarial potential of small molecules using data from various chemical libraries that were screened against the asexual and sexual (gametocyte) stages of the parasite. Several compounds' molecular fingerprints were used to train machine learning models to recognize stage-specific active and inactive compounds.

Pretrained
Annotation
Compound
Single
Score
Float
List
Probability of inhibition of the malaria parasite growth
Malaria
P.falciparum
https://github.com/ersilia-os/eos80ch
https://pubs.acs.org/doi/10.1021/acsomega.3c05664
https://github.com/M2PL
GPL-3.0
GemmaTuron
10/7/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos80ch
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos80ch.zip
Yes
Local
Q3
2023
ncats-cyp3a4
Ready
CYP3A4 metabolism

Analysis of metabolic stability, determining the inhibition of CYP3A4 activity and whether the compounds are a substrate for the CYP3A4 enzyme. The data to build these models has been made publicly available at PubChem (AID1645840, AID1645841, AID1645842) by ADME@NCATS.

Pretrained
Classification
Compound
Single
Probability
Float
List
Probability of inhibiting the enzyme and probability of being a substrate of the enzyme. Activity in both indicates the compound is a ligand of the enzyme.
CYP450
ADME
Metabolism
https://github.com/ersilia-os/eos3ev6
https://dmd.aspetjournals.org/content/49/9/822
https://github.com/ncats/ncats-adme
None
ZakiaYahya
6/7/2023
https://github.com/ZakiaYahya
https://hub.docker.com/r/ersiliaos/eos3ev6
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3ev6.zip
Yes
Local
Q3
2023
ncats-cyp2d6
Ready
CYP2D6 metabolism

Analysis of metabolic stability, determining the inhibition of CYP2D6 activity and whether the compounds are a substrate for the CYP2D6 enzyme. The data to build these models has been made publicly available at PubChem (AID1645840, AID1645841, AID1645842) by ADME@NCATS.

Pretrained
Classification
Compound
Single
Probability
Float
List
Probability of inhibiting the enzyme and probability of being a substrate of the enzyme. Activity in both indicates the compound is a ligand of the enzyme.
CYP450
ADME
Metabolism
https://github.com/ersilia-os/eos7nno
https://dmd.aspetjournals.org/content/49/9/822
https://github.com/ncats/ncats-adme
None
ZakiaYahya
6/7/2023
https://github.com/ZakiaYahya
https://hub.docker.com/r/ersiliaos/eos7nno
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7nno.zip
Yes
Local
Q3
2023
ncats-cyp2c9
Ready
CYP2C9 metabolism

Analysis of metabolic stability, determining the inhibition of CYP2C9 activity and whether the compounds are a substrate for the CYP2C9 enzyme. The data to build these models has been made publicly available at PubChem (AID1645840, AID1645841, AID1645842) by ADME@NCATS.

Pretrained
Classification
Compound
Single
Probability
Float
List
Probability of inhibiting the enzyme and probability of being a substrate of the enzyme. Activity in both indicates the compound is a ligand of the enzyme.
CYP450
ADME
Metabolism
https://github.com/ersilia-os/eos5jz9
https://dmd.aspetjournals.org/content/49/9/822
https://github.com/ncats/ncats-adme
None
ZakiaYahya
5/7/2023
https://github.com/ZakiaYahya
https://hub.docker.com/r/ersiliaos/eos5jz9
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5jz9.zip
Yes
Local
Q3
2023
bidd-molmap-fingerprint
Ready
Molecular fingerprint maps based on broadly learned knowledge-based representations

Molecular representation of small molecules via fingerprint-based molecular maps (images). Typically, the goal is to use these images as inputs for an image-based deep learning model such as a convolutional neural network. The authors have demonstrated high performance of MolMap out-of-the-box with a broad range of tasks from MoleculeNet.

Pretrained
Representation
Compound
Single
Image
Descriptor
Float
List
Image representation of a molecule. Each pixel represents a molecular feature (37 rows, 36 columns, flattened with reshape)
Fingerprint
https://github.com/ersilia-os/eos59rr
https://www.nature.com/articles/s42256-021-00301-6
https://github.com/shenwanxiang/bidd-molmap
GPL-3.0
samuelmaina
3/7/2023
https://github.com/samuelmaina
https://hub.docker.com/r/ersiliaos/eos59rr
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos59rr.zip
Local
Q3
2023
h3d-virtual-screening-cascade-light
Ready
H3D virtual screening cascade light

This panel of models provides predictions for the H3D virtual screening cascade. It leverages the Ersilia Compound Embedding and FLAML. The H3D virtual screening cascade contains models for Mycobacterium tuberculosis and Plasmodium falciparum IC50 predictions, as well as ADME, cytotoxicity and solubility assays

In-house
Classification
Compound
Single
Probability
Float
List
The raw scores are the ones emerging from the FLAML model. The ones with the suffix _perc represent the percentile, on a 0-1 scale, over a ChEMBL dataset of 200k compounds.
Malaria
P.falciparum
Tuberculosis
M.tuberculosis
ADME
Cytotoxicity
Solubility
https://github.com/ersilia-os/eos7kpb
https://www.nature.com/articles/s41467-023-41512-2
https://github.com/ersilia-os/h3d-screening-cascade-models
GPL-3.0
miquelduranfrigola
9/5/2023
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos7kpb
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7kpb.zip
Yes
Local
Q2
2023
ersilia-compound-embedding
Ready
Ersilia Compound Embeddings

Bioactivity-aware chemical embeddings for small molecules. Using transfer learning, we have created a fast network that produces embeddings of 1024 features condensing physicochemical as well as bioactivity information. The network was trained using the FS-Mol and ChEMBL datasets, and Grover, Mordred and ECFP descriptors.

In-house
Representation
Compound
Single
Descriptor
Float
List
Embedding of 1024 features representing a compound
Descriptor
Embedding
https://github.com/ersilia-os/eos2gw4
https://www.nature.com/articles/s41467-023-41512-2
https://github.com/ersilia-os/compound-embedding
GPL-3.0
GemmaTuron
13/4/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos2gw4
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2gw4.zip
Local
Q2
2023
molfeat-chemgpt
Ready
ChemGPT-4.7

ChemGPT (4.7M params) is a language-based transformer model for generative molecular modeling, which was pretrained on the PubChem10M dataset. Pre-trained ChemGPT models are also robust, self-supervised representation learners that generalize to previously unseen regions of chemical space and enable embedding-based nearest-neighbor search.

Pretrained
Representation
Compound
Single
Descriptor
Float
List
128 features based on a chemical language model
Descriptor
Chemical language model
Chemical graph model
Embedding
https://github.com/ersilia-os/eos3cf4
https://chemrxiv.org/engage/chemrxiv/article-details/627bddd544bdd532395fb4b5
https://molfeat.datamol.io/featurizers/ChemGPT-4.7M
Apache-2.0
GemmaTuron
11/4/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos3cf4
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3cf4.zip
Local
Q2
2023
molfeat-estate
Ready
Estate Molecular Descriptors

Electrotopological state (Estate) indices are numerical values computed for each atom in a molecule, and which encode information about both the topological environment of that atom and the electronic interactions due to all other atoms in the molecule

Pretrained
Representation
Compound
Single
Descriptor
Float
List
79 Electrotopological features
Fingerprint
Descriptor
https://github.com/ersilia-os/eos3zur
https://link.springer.com/article/10.1023/A:1015952613760
https://molfeat.datamol.io/featurizers/estate
Apache-2.0
GemmaTuron
11/4/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos3zur
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3zur.zip
Local
Q2
2023
ncats-pampa74
Ready
Parallel Artificial Membrane Permeability Assay (PAMPA) 7

Parallel Artificial Membrane Permeability is an in vitro surrogate to determine the permeability of drugs across cellular membranes. PAMPA at pH 7.4 was experimentally determined in a dataset of 5,473 unique compounds by the NIH-NCATS. 50% of the dataset was used to train a classifier (SVM) to predict the permeability of new compounds, and validated on the remaining 50% of the data, rendering an AUC = 0.88. The Peff was converted to logarithmic, log Peff value lower than 2.0 were considered to h

Pretrained
Classification
Compound
Single
Probability
Float
Single
Probability of a compound being poorly permeable (logPeff < 1)
ADME
Permeability
LogP
https://github.com/ersilia-os/eos9tyg
https://slas-discovery.org/article/S2472-5552(22)06765-X/fulltext
https://github.com/ncats/ncats-adme
None
pauline-banye
7/4/2023
https://github.com/pauline-banye
https://hub.docker.com/r/ersiliaos/eos9tyg
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9tyg.zip
Yes
Local
Q2
2023
ncats-cyp450
Ready
CYP450 metabolism

Analysis of metabolic stability, determining the inhibition of CYP450 activity and whether the compounds are substrates of the CYP450 enzymes. The data used to build these models are publicly available on PubChem (AID1645840, AID1645841, AID1645842). The tested CYPs include CYP2C9, CYP2D6 and CYP3A4.

Pretrained
Classification
Compound
Single
Probability
Float
List
Probability of inhibiting the enzyme and probability of being a substrate of the enzyme. Activity in both indicates the compound is a ligand of the enzyme.
CYP450
ADME
Metabolism
https://github.com/ersilia-os/eos44zp
https://dmd.aspetjournals.org/content/49/9/822
https://github.com/ncats/ncats-adme
None
GemmaTuron
6/4/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos44zp
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos44zp.zip
Yes
Local
Q2
2023
qcrb-tb
Ready
QcrB Inhibition (M. tuberculosis)

The cytochrome bcc complex (QcrB) is a subunit of the mycobacterial cyt-bcc-aa3 oxidoreductase in the electron transport chain (ETC), and it has been suggested as a good M.tb target due to the bacteria's dependence on oxidative phosphorylation for growth. The authors use a dataset of 352 molecules, of which 277 are classified as active (QIM < 1 uM), 58 as moderately active (1 < QIM < 20 uM) and 78 as inactive (QIM > 20 uM). QIM refers to quantification of intracellular mycobacteria.

Pretrained
Classification
Compound
Single
Other value
Integer
Single
Class 1: active (QIM < 1 uM), Class 2: moderately active (1 < QIM < 20 uM), Class 3: inactive (QIM > 20 uM)
M.tuberculosis
Antimicrobial activity
https://github.com/ersilia-os/eos24jm
https://pubs.acs.org/doi/full/10.1021/acsomega.2c01613
https://github.com/CoutinhoLab/Q-TB/
CC
GemmaTuron
6/4/2023
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos24jm
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos24jm.zip
Yes
Local
Q2
2023
rxn-fingerprint
Ready
RXNFP - chemical reaction fingerprints

RXNFP uses a pre-trained BERT language model to transform a reaction represented as SMILES into a fingerprint amenable to downstream applications. The authors show how the RXN fingerprints can be used to identify nearest neighbors in reaction datasets, or to map the reaction space without knowing the reaction centers.

Pretrained
Representation
Compound
Single
Descriptor
Float
Matrix
Fingerprint of the reaction.
Fingerprint
Embedding
Chemical synthesis
https://github.com/ersilia-os/eos6aun
https://www.nature.com/articles/s42256-020-00284-w
https://github.com/rxn4chemistry/rxnfp/tree/master/
MIT
samuelmaina
28/3/2023
https://github.com/samuelmaina
https://hub.docker.com/r/ersiliaos/eos6aun
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6aun.zip
Local
Q1
2023
ncats-hlm
Ready
Human Liver Microsomal Stability

The Human Liver Microsomal (HLM) assay takes into account liver-mediated drug metabolism to assess the stability of a compound in the human body. The NIH-NCATS group took a proprietary dataset of 4300 compounds with their associated HLM data (in vitro half-life; unstable ≤ 30 min, stable > 30 min) and used it to train a classifier.

Pretrained
Classification
Compound
Single
Probability
Float
Single
Probability of a compound being unstable in a HLM assay (half-life ≤ 30min)
Metabolism
ADME
Human
Microsomal stability
Half-life
https://github.com/ersilia-os/eos31ve
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00426-7
https://github.com/ncats/ncats-adme/tree/master
None
pauline-banye
27/3/2023
https://github.com/pauline-banye
https://hub.docker.com/r/ersiliaos/eos31ve
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos31ve.zip
Yes
Local
Q1
2023
s2dv-hepg2-toxicity
Ready
S2DV HepG2 toxicity

The model uses Word2Vec, a natural language processing technique to represent SMILES strings. The model was trained on over <2000 small molecules with associated experimental HepG2 cytotoxicity data (IC50) to classify compounds as HepG2 toxic (IC50 <= 30 uM) or non-toxic. Data was gathered from the public repository ChEMBL.

Pretrained
Classification
Compound
Single
Experimental value
Float
Single
Probability of HepG2 Toxicity (IC50 < 30 uM)
ChEMBL
IC50
Toxicity
https://github.com/ersilia-os/eos2fy6
https://pubmed.ncbi.nlm.nih.gov/35062019/
https://github.com/NTU-MedAI/S2DV
Apache-2.0
emmakodes
27/3/2023
https://github.com/emmakodes
https://hub.docker.com/r/ersiliaos/eos2fy6
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2fy6.zip
Local
Q1
2023
hob-pre
Ready
Human oral bioavailability prediction

HobPre predicts the oral bioavailability of small molecules in humans. It has been trained using public data on ~1200 molecules (Falcón-Cano et al, 2020, complemented with other literature and ChEMBL compounds). The molecules were labeled according to two cut-offs: HOB > 20% and HOB > 50%, due to ongoing discussions as to which would be a more appropriate cut-off.

Pretrained
Classification
Compound
Single
Probability
Float
List
Probability of a compound having high oral bioavailability (HOB >20% and HOB >50%)
ADME
Solubility
Human
https://github.com/ersilia-os/eos2lqb
https://doi.org/10.1186/s13321-021-00580-6
https://github.com/whymin/HOB
None
HellenNamulinda
27/3/2023
https://github.com/HellenNamulinda
https://hub.docker.com/r/ersiliaos/eos2lqb
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2lqb.zip
Yes
Local
Q1
2023
redial-2020
Ready
SARS-CoV-2 antiviral prediction: REDIAL-2020

Predictor of several endpoints related to SARS-CoV-2. It provides predictions for Live Virus Infectivity, Viral Entry, Viral Replication, In Vitro Infectivity and Human Cell Toxicity using a combination of three models. Consensus results are obtained by averaging the predictions of the three models for each activity and toxicity endpoint. The models have been built using NCATS COVID19 data. Further details on result interpretation can be found here: https://drugcentral.org/Redial

Pretrained
Classification
Compound
Single
Probability
Float
Single
The model returns the probability of 1 (active) in each assay. Good drugs are active in CPE, 3CL and are inactive in cytotox, hCYTOX and ACE2 and/or are active in at least one of the following: AlphaLISA, CoV-PPE, MERS-PPE, while inactive in the counter screen, respectively: TruHit, CoV-PPE_cs, MERS-PPE_cs.
Sars-CoV-2
COVID19
Antiviral activity
https://github.com/ersilia-os/eos8fth
https://www.nature.com/articles/s42256-021-00335-w#Sec9
https://github.com/sirimullalab/redial-2020/tree/v1.0
MIT
Pradnya2203
27/3/2023
https://github.com/Pradnya2203
https://hub.docker.com/r/ersiliaos/eos8fth
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8fth.zip
Yes
Local
Q1
2023
s2dv-hbv
Ready
Inhibition of Hepatitis B virus

The model uses Word2Vec, a natural language processing technique to represent SMILES strings. The model was trained on over <4000 small molecules with associated experimental HBV inhibition data (IC50) to classify compounds as HBV inhibitors (IC50 <= 1 uM) or non-inhibitors. Data was gathered from the public repository ChEMBL.

Pretrained
Classification
Compound
Single
Experimental value
Float
Single
Probability of inhibition of HBV (IC50 < 1uM)
Antiviral activity
IC50
HBV
ChEMBL
https://github.com/ersilia-os/eos8lok
https://pubmed.ncbi.nlm.nih.gov/35062019/
https://github.com/NTU-MedAI/S2DV
Apache-2.0
emmakodes
24/3/2023
https://github.com/emmakodes
https://hub.docker.com/r/ersiliaos/eos8lok
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8lok.zip
Yes
Local
Q1
2023
ncats-hlcs
Ready
Human Liver Cytosolic Stability

The human liver cytosol stability model is used for predicting the stability of a drug in the cytosol of human liver cells, which is beneficial for identifying potential drug candidates early during the drug discovery process. If a drug compound is quickly absorbed, it may not reach the intended target in the body or become toxic. On the other hand, if a drug compound is too stable, it could accumulate and cause detrimental effects. The authors use an NCATS dataset of 1450 compounds screened in

Pretrained
Classification
Compound
Single
Probability
Float
Single
Probability of a compound being unstable (half-life ≤ 30min) due to liver cells metabolism
ADME
Metabolism
Human
Half-life
https://github.com/ersilia-os/eos9yy1
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00426-7
https://github.com/ncats/ncats-adme
None
pauline-banye
1/3/2023
https://github.com/pauline-banye
https://hub.docker.com/r/ersiliaos/eos9yy1
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9yy1.zip
Yes
Local
Q1
2023
idl-ppbopt
Ready
Human Plasma Protein Binding (PPB) of Compounds

IDL-PPB aims to obtain the plasma protein binding (PPB) values of a compound. Based on an interpretable deep learning model and using the algorithm fingerprinting (AFP) this model predicts the binding affinity of the plasma protein with the compound.

Pretrained
Regression
Compound
Single
Experimental value
Float
Single
This model receives SMILES as input and returns the PPB fraction, which measures the binding affinity of the compound to plasma proteins. In their analysis of results, the authors define high affinity (PPB fraction > 80%), medium affinity (40% <= PPB fraction <= 80%) and low affinity (PPB fraction < 40%). Note: inorganics and salts are outside the applicability domain of the model, so for these compounds the output is Null.
Fraction bound
ADME
https://github.com/ersilia-os/eos22io
https://pubs.acs.org/doi/10.1021/acs.jcim.2c00297
https://github.com/Louchaofeng/IDL-PPBopt
GPL-3.0
carcablop
3/2/2023
https://github.com/carcablop
https://hub.docker.com/r/ersiliaos/eos22io
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos22io.zip
Local
Q1
2023
ncats-solubility
Ready
Aqueous Kinetic Solubility

Kinetic aqueous solubility (μg/mL) was experimentally determined using the same SOP in over 200 NCATS drug discovery projects. A final dataset of 11780 non-redundant molecules and their associated solubility was used to train an SVM classifier. Approximately half of the dataset has poor solubility (< 10 μg/mL), and two-thirds of these poorly soluble molecules report values of < 1 μg/mL. A subset of the data used is available at PubChem (AID 1645848).

Pretrained
Classification
Compound
Single
Probability
Float
Single
Probability of a compound having poor solubility (< 10 µg/mL)
ADME
Solubility
https://github.com/ersilia-os/eos74bo
https://slas-discovery.org/article/S2472-5552(22)06765-X/fulltext
https://github.com/ncats/ncats-adme
None
pauline-banye
31/1/2023
https://github.com/pauline-banye
https://hub.docker.com/r/ersiliaos/eos74bo
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos74bo.zip
Yes
Local
Q1
2023
ncats-pampa5
Ready
Parallel Artificial Membrane Permeability Assay 5

Parallel Artificial Membrane Permeability is an in vitro surrogate to determine the permeability of drugs across cellular membranes. PAMPA at pH 5 was experimentally determined in a dataset of 5,473 unique compounds by the NIH-NCATS. 50% of the dataset was used to train a classifier (SVM) to predict the permeability of new compounds, and validated on the remaining 50% of the data, rendering an AUC = 0.88. The Peff was converted to logarithmic, log Peff value lower than 2.0 were considered to hav

Pretrained
Classification
Compound
Single
Probability
Float
Single
Probability of a compound being poorly permeable (logPeff < 1)
ADME
Permeability
LogP
https://github.com/ersilia-os/eos81ew
https://www.sciencedirect.com/science/article/pii/S0968089621005964
https://github.com/ncats/ncats-adme
None
pauline-banye
29/1/2023
https://github.com/pauline-banye
https://hub.docker.com/r/ersiliaos/eos81ew
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos81ew.zip
Yes
Local
Q1
2023
image-mol-gpcr
Ready
imagemol-gpcr

ImageMol is a Representation Learning Framework that utilizes molecule images for encoding molecular inputs as machine readable vectors for downstream tasks such as bio-activity prediction, drug metabolism analysis, or drug toxicity prediction. The approach utilizes transfer learning, that is, pre-training the model on massive unlabeled datasets to help it in generalizing feature extraction and then fine tuning on specific tasks. This model is fine tuned on 10 GPCR assays with the largest number

Pretrained
Regression
Compound
Single
Score
Float
Single
Binding activity prediction (as a regression task) for the following GPCR assays: 5HT1A, 5HT2A, AA1R, AA2AR, AA3R, CNR2, DRD2, DRD3, HRH3, OPRM
Target identification
GPCR
https://github.com/ersilia-os/eos93h2
https://www.nature.com/articles/s42256-022-00557-6
https://github.com/HongxinXiang/ImageMol
MIT
DhanshreeA
25/1/2023
https://github.com/DhanshreeA
https://hub.docker.com/r/ersiliaos/eos93h2
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos93h2.zip
Local
Q1
2023
datamol-smiles2canonical
Ready
Converter of SMILES to canonical SMILES, SELFIES, InChI and InChIKey forms

Using the Datamol package, the model receives a SMILES string as input, then sanitizes and standardizes the molecule to generate four outputs: canonical SMILES, SELFIES, InChI and InChIKey

Pretrained
Representation
Compound
Single
Compound
String
Matrix
Compound represented in its canonical SMILES, SELFIES, InChI and InChIKey forms
Chemical notation
https://github.com/ersilia-os/eos7qga
https://doc.datamol.io/stable/tutorials/Preprocessing.html
https://github.com/datamol-org/datamol
Apache-2.0
carcablop
25/1/2023
https://github.com/carcablop
https://hub.docker.com/r/ersiliaos/eos7qga
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7qga.zip
Local
Q1
2023
image-mol-embeddings
Ready
Molecular representation learning

Representation Learning Framework that utilizes molecule images for encoding molecular inputs as machine readable vectors for downstream tasks such as bio-activity prediction, drug metabolism analysis, or drug toxicity prediction. The approach utilizes transfer learning, that is, pre-training the model on massive unlabeled datasets to help it in generalizing feature extraction and then fine tuning on specific tasks.

Pretrained
Representation
Compound
Single
Descriptor
Float
Matrix
ImageMol embeddings of shape [1512] reshaped as a NumPy 1D array before serializing. These embeddings can be used as the input features of a fully connected classification or regression layer in a neural network.
Embedding
https://github.com/ersilia-os/eos4avb
https://www.nature.com/articles/s42256-022-00557-6
https://github.com/HongxinXiang/ImageMol
MIT
DhanshreeA
25/1/2023
https://github.com/DhanshreeA
https://hub.docker.com/r/ersiliaos/eos4avb
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4avb.zip
Local
Q1
2023
sars-cov-2-antiviral-screen
Ready
SARS-CoV-2 antiviral screening

ImageMol is a Representation Learning Framework that utilizes molecule images for encoding molecular inputs as machine readable vectors for downstream tasks such as bio-activity prediction, drug metabolism analysis, or drug toxicity prediction. The approach utilizes transfer learning, that is, pre-training the model on massive unlabeled datasets to help it in generalizing feature extraction and then fine tuning on specific tasks. This model is fine tuned on 13 assays concerned with a number of t

Pretrained
Classification
Compound
Single
Boolean
Integer
List
The output is comprised of binary classification across thirteen assays that are as follows: 3C-like enzymatic activity (3CL), ACE2 enzymatic activity (ACE2), Human Embryonic Kidney 293 Cell line toxicity (HEK293), Human fibroblast toxicity (Human), MERS Pseudotyped particle entry (MERS_PPE), MERS Pseudotyped particle entry counterscreen (MERS_PPE_cs), SarsCov Pseudotyped particle entry (Cov_PPE), SarsCov Pseudotyped particle entry counterscreen (Cov_PPE_cs), SarsCov2 cytopathic effect (COV2_CPE), SarsCov2 cytopathic effect counterscreen (COV2_Cytotox), Spike ACE2 Protein-protein interaction (AlphaLISA), Spike ACE2 Protein-protein interaction counterscreen (TruHit), Transmembrane protease serine 2 enzymatic activity (TMPRSS2)
Sars-CoV-2
Antiviral activity
COVID19
https://github.com/ersilia-os/eos4cxk
https://www.nature.com/articles/s42256-022-00557-6
https://github.com/HongxinXiang/ImageMol
MIT
DhanshreeA
25/1/2023
https://github.com/DhanshreeA
https://hub.docker.com/r/ersiliaos/eos4cxk
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4cxk.zip
Yes
Local
Q1
2023
image-mol-bace
Ready
ImageMol human beta-secretase-1 (BACE-1) inhibition

This model has been developed using ImageMol, a deep learning model pretrained on 10 million unlabelled small molecules and fine-tuned in a second step to predict the binding of inhibitors to the human beta-secretase 1 (BACE-1) protein. The BACE-1 dataset from MoleculeNet contains 1522 compounds with their associated pIC50. A compound with pIC50 >= 7 is considered a BACE-1 inhibitor.

Pretrained
Classification
Compound
Single
Probability
Float
Single
Probability of BACE-1 inhibition (> 0.5: inhibitor). Compounds with pIC50 >= 7 are considered BACE-1 inhibitors
BACE
Chemical graph model
MoleculeNet
https://github.com/ersilia-os/eos8c0o
https://www.nature.com/articles/s42256-022-00557-6
https://github.com/ChengF-Lab/ImageMol
MIT
DhanshreeA
17/1/2023
https://github.com/DhanshreeA
https://hub.docker.com/r/ersiliaos/eos8c0o
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8c0o.zip
Local
Q1
2023
image-mol-hiv
Ready
ImageMol HIV growth inhibition

This model has been developed using ImageMol, a deep learning model pretrained on 10 million unlabelled small molecules and fine-tuned in a second step to predict the inhibition of the human immunodeficiency virus (HIV). The HIV dataset is from MoleculeNet and contains 43850 small molecules and their in vitro activity against HIV (CA - Confirmed active, CM - Confirmed moderately active, CI - Confirmed inactive). The classification was based on EC50 values and expert knowledge.

Pretrained
Classification
Compound
Single
Probability
Float
Single
Probability of HIV inhibition. Active compounds are considered those classified as CA/CM.
HIV
Antiviral activity
MoleculeNet
https://github.com/ersilia-os/eos6hy3
https://www.nature.com/articles/s42256-022-00557-6
https://github.com/ChengF-Lab/ImageMol
MIT
DhanshreeA
17/1/2023
https://github.com/DhanshreeA
https://hub.docker.com/r/ersiliaos/eos6hy3
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6hy3.zip
Yes
Local
Q1
2023
ncats-rlm
Ready
Rat liver microsomal stability

Hepatic metabolic stability is key to ensure the drug attains the desired concentration in the body. The Rat Liver Microsomal (RLM) stability is a good approximation of a compound’s stability in the human body, and NCATS has collected a proprietary dataset of 20216 compounds with its associated RLM (in vitro half-life; unstable ≤30 min, stable >30 min) and used it to train a classifier based on an ensemble of several ML approaches (random forest, deep neural networks, graph convolutional neural

Pretrained
Classification
Compound
Single
Probability
Float
Single
Probability of a compound being unstable in RLM assay (half-life ≤ 30min)
Microsomal stability
Rat
ADME
Metabolism
Half-life
https://github.com/ersilia-os/eos5505
https://slas-discovery.org/article/S2472-5552(22)06765-X/fulltext
https://github.com/ncats/ncats-adme
None
pauline-banye
12/1/2023
https://github.com/pauline-banye
https://hub.docker.com/r/ersiliaos/eos5505
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5505.zip
Yes
Local
Q1
2023
smiles2iupac
Ready
STOUT: SMILES to IUPAC name translator

Small molecules are represented by a variety of machine-readable strings (SMILES, InChi, SMARTS, among others). On the contrary, IUPAC (International Union of Pure and Applied Chemistry) names are devised for human readers. The authors trained a language translator model treating the SMILES and IUPAC as two different languages. 81 million SMILES were downloaded from PubChem and converted to SELFIES for model training. The corresponding IUPAC names for the 81 million SMILES were obtained with Che

Pretrained
Representation
Compound
Single
Text
String
Single
IUPAC name of a specific SMILES
Chemical notation
Chemical language model
https://github.com/ersilia-os/eos4se9
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00512-4
https://github.com/Kohulan/Smiles-TO-iUpac-Translator
MIT
carcablop
9/1/2023
https://github.com/carcablop
https://hub.docker.com/r/ersiliaos/eos4se9
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4se9.zip
Local
Q1
2023
drugtax
Ready
DrugTax: Drug taxonomy

DrugTax takes SMILES inputs and classifies the molecule according to their taxonomy, organic or inorganic kingdom and their subclasses, using a 0/1 binary classification for each one. It generates a vector of 163 features including the taxonomy classification and other key information such as number of carbons, nitrogens… These vectors can be used for subsequent molecular representation in chemoinformatic pipelines.

Pretrained
Representation
Compound
Single
Descriptor
Integer
List
A vector of 163 points, each one corresponding to a particular taxonomic or structural molecular feature
Fingerprint
Descriptor
https://github.com/ersilia-os/eos24ci
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-022-00649-w
https://github.com/MoreiraLAB/DrugTax
GPL-3.0
Femme-js
3/1/2023
https://github.com/Femme-js
https://hub.docker.com/r/ersiliaos/eos24ci
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos24ci.zip
Local
Q1
2023
embeddings-extraction
To do
Text Embeddings Extraction using Pretrained Language Models

Syntactic relationships and intrinsic information carried in textual input data can be represented in the form of text embeddings. These embeddings can be used for downstream tasks such as classification and regression. BioMed-RoBERTa-base is a transformer-based language model adapted from RoBERTa-base, pretrained on 2.68 million biomedical domain-specific scientific papers (7.55B tokens and 47GB of data). The multi-layer structure of transformer captures different levels of representat

Pretrained
Representation
Text
List
Descriptor
Float
List
A list of 768 floating-point values representing the textual input as a numerical vector.
Chemical language model
Embedding
https://github.com/ersilia-os/eos1086
https://aclanthology.org/2020.acl-main.740/
https://huggingface.co/allenai/biomed_roberta_base
Apache-2.0
Femme-js
25/1/2023
https://github.com/Femme-js
Local
Q1
2023
iupac2smiles
To do
STOUT: IUPAC name to SMILES translator

Small molecules are represented by a variety of machine-readable strings (SMILES, InChi, SMARTS, among others). On the contrary, IUPAC (International Union of Pure and Applied Chemistry) names are devised for human readers. The authors trained a language translator model treating the SMILES and IUPAC as two different languages. 81 million SMILES were downloaded from PubChem and converted to SELFIES for model training. The corresponding IUPAC names for the 81 million SMILES were obtained with Che

Pretrained
Representation
Text
Single
Compound
String
Single
SMILES of the molecule corresponding to the IUPAC name input
Chemical notation
Chemical language model
https://github.com/ersilia-os/eos5ecc
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00512-4
https://github.com/Kohulan/Smiles-TO-iUpac-Translator
MIT
carcablop
13/1/2023
https://github.com/carcablop
Local
Q1
2023
meta-trans
Ready
MetaTrans: human drug metabolites

Small molecules are metabolized by the liver in what is known as phase I and phase II reactions. These can lead to reduced drug efficacy and the generation of toxic metabolites, causing serious side effects. This model predicts the human metabolites of small molecules using a molecular transformer pre-trained on general chemical reactions and fine-tuned on human metabolism. It provides up to 10 metabolites for each input molecule.

Pretrained
Generative
Compound
Single
Compound
String
List
A maximum of 10 human metabolites generated from the input molecule
Metabolism
https://github.com/ersilia-os/eos935d
https://pubs.rsc.org/en/content/articlelanding/2020/sc/d0sc02639e#fn1
https://github.com/KavrakiLab/MetaTrans
BSD-3.0
carcablop
20/12/2022
https://github.com/carcablop
https://hub.docker.com/r/ersiliaos/eos935d
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos935d.zip
Local
Q4
2022
crem-structure-generation
Ready
CReM fragment based structure generation

CReM (chemically reasonable mutations) is a fragment-based generative model that takes as input a small molecule, breaks it down into fragments and iteratively replaces them with other fragments from a database. It has three implementations: MUTATE (arbitrarily replaces one fragment with another), GROW (arbitrarily replaces a hydrogen with another fragment) and LINK (replaces hydrogen atoms in two molecules to link them with a fragment). Here, we use a MUTATE and GROW approach, which prov

Pretrained
Generative
Compound
Single
Compound
String
List
Up to 100 newly generated molecules
Compound generation
https://github.com/ersilia-os/eos4q1a
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00431-w
https://github.com/DrrDom/crem
BSD-3.0
DhanshreeA
20/12/2022
https://github.com/DhanshreeA
https://hub.docker.com/r/ersiliaos/eos4q1a
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4q1a.zip
Local
Q4
2022
moler-enamine-fragments
Ready
Extending molecular scaffolds with fragments

MoLeR is a graph-based generative model that combines fragment-based and atom-by-atom generation of new molecules with scaffold-constrained optimization. It does not depend on generation history and therefore MoLeR is able to complete arbitrary scaffolds. The model has been trained on the GuacaMol dataset. Here we sample a fragment library from Enamine.

Pretrained
Generative
Compound
Single
Compound
String
List
1000 new molecules are sampled for each input molecule, preserving its scaffold.
Chemical graph model
Compound generation
https://github.com/ersilia-os/eos9taz
https://arxiv.org/abs/2103.03864
https://github.com/microsoft/molecule-generation
MIT
anamika-yadav99
16/11/2022
https://github.com/anamika-yadav99
https://hub.docker.com/r/ersiliaos/eos9taz
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9taz.zip
Local
Q4
2022
molt5-smiles-to-caption
Ready
MolT5: Translation between Molecules and Natural Language

MolT5 (Molecular T5) is a self-supervised learning framework pretrained on unlabeled natural language text and molecule strings with two end goals: molecular captioning (given a molecule, generate its description) and text-based de novo molecular generation (given a description, propose a molecule that matches it). This implementation is focused on molecular captioning.

Pretrained
Representation
Compound
Single
Text
String
Single
Description of a molecule
Chemical language model
Chemical notation
https://github.com/ersilia-os/eos2rd8
https://arxiv.org/abs/2204.11817
https://github.com/blender-nlp/MolT5
None
Amna-28
14/11/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos2rd8
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2rd8.zip
Local
Q4
2022
bayesian-drug-likeness
Ready
Drug-likeness prediction with Bayesian neural networks

To define drug-likeness, a set of 2136 approved drugs from DrugBank was taken as drug-like, and three negative datasets were selected from ZINC15 (19M), the Network of Organic Chemistry (6M) and ligands from the Protein Data Bank (13k), respectively. The drug dataset was combined with an equal subsampling of the negative dataset for each experiment, using five different molecular representations (Mold2, RDKit, MCS, EXFP4, Mol2Vec). We have re-trained it following the authors' specifications.

Retrained
Classification
Compound
Single
Probability
Float
Single
Drug-likeness probability
Drug-likeness
https://github.com/ersilia-os/eos9sa2
https://www.nature.com/articles/s42256-020-0209-y
https://github.com/Nanotekton/drugability/tree/v0.1
Non-commercial
Amna-28
9/11/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos9sa2
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9sa2.zip
Local
Q4
2022
molbloom
Ready
MolBloom: molecule purchasability in ZINC20

This model uses a Bloom filter to query the ZINC20 database and identify whether a molecule is purchasable. A Bloom filter is a space-efficient probabilistic data structure for testing whether an element is in a given set. Due to the nature of Bloom filters, false negatives are not possible (i.e., if the model returns False, the molecule is not purchasable). As stated by the author, if the model returns True, the molecule is purchasable with an error rate of 0.0003 (according to the ZINC20 catalog).
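
The one-sided error described above can be illustrated with a toy Bloom filter. This is a minimal sketch, not molbloom's implementation: the class name, the bit-array size, the number of hash functions and the SHA-256-based hashing are all assumptions chosen for readability, and the catalog entries are hypothetical.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter over strings (illustrative only, not molbloom's code)."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item: str):
        # Derive num_hashes positions by salting a SHA-256 digest with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: str) -> bool:
        # False means "definitely absent"; True means "probably present".
        return all(self.bits[pos] for pos in self._positions(item))


catalog = BloomFilter()
for smiles in ["CCO", "c1ccccc1", "CC(=O)O"]:  # hypothetical catalog entries
    catalog.add(smiles)

print("CCO" in catalog)  # True: an item that was added is always found
```

A lookup for a molecule that was never added usually returns False, but it can return True if all of its bit positions happen to be set by other entries. That is the source of the false-positive rate, which shrinks as the bit array grows relative to the number of entries.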

Pretrained
Classification
Compound
Single
Boolean
String
Single
Boolean (True/False) indicating whether the molecule is commercially available.
ZINC
Compound generation
https://github.com/ersilia-os/eos8a5g
https://github.com/whitead/molbloom/blob/main/CITATION.cff
https://github.com/whitead/molbloom
MIT
Amna-28
2/11/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos8a5g
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8a5g.zip
Local
Q4
2022
mesh-therapeutic-use
Ready
MeSH therapeutic use based on chemical structure

Drug function, defined as the Medical Subject Headings (MeSH) “therapeutic use”, is predicted based on the chemical structure. 6955 non-redundant molecules, pertaining to one of the twelve therapeutic use classes selected, were downloaded from PubChem and used to train a binary classifier. The model provides the probability that a molecule has one of the following therapeutic uses: antineoplastic, cardiovascular, central nervous system (CNS), anti-infective, gastrointestinal, anti-inflammatory, derma

In-house
Classification
Compound
Single
Probability
Float
List
Probability that the molecule belongs to each therapeutic use specified.
Therapeutic indication
https://github.com/ersilia-os/eos238c
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819987/
https://github.com/jgmeyerucsd/drug-class
GPL-3.0
Amna-28
17/10/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos238c
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos238c.zip
Local
Q4
2022
admetlab-2
Ready
ADMETlab-2

ADMETLab2 is the improved version of ADMETLab, a suite of models for systematic evaluation of ADMET properties. ADMETLab2 provides predictions on 17 physicochemical properties, 13 medicinal chemistry properties, 23 ADME properties, 27 toxicity endpoints and 8 toxicophore rules. The code and training data are not released, so using this model posts queries to the ADMETLab2 online server. The Ersilia Model Hub also offers ADMETLab (v1) as a downloadable package for IP-sensitive queries.

Online
Regression
Compound
Single
Experimental value
Probability
Float
List
Predicted relevant ADMET properties, Tox21 outcomes, physicochemical properties and drug-likeness. Outputs are of mixed type, including classification (labels) and continuous values.
Toxicity
ADME
Lipophilicity
Solubility
Permeability
https://github.com/ersilia-os/eos2v11
https://academic.oup.com/nar/article/49/W1/W5/6249611?login=false
https://admetmesh.scbdd.com/
Proprietary
miquelduranfrigola
16/9/2022
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos2v11
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2v11.zip
Local
Q3
2022
metabokiller
Ready
Carcinogenic potential of metabolites and small molecules

Carcinogenicity is a result of several potential effects on cells. This model predicts the carcinogenic potential of a small molecule based on its potential to induce cellular proliferation, genomic instability, oxidative stress, anti-apoptotic responses and epigenetic alterations. Metabokiller uses the Chemical Checker signaturizer to featurize the molecules, and the Lime package to provide interpretable results. Using Metabokiller, the authors screened a panel of human metabolites and exper

Pretrained
Classification
Compound
Single
Probability
Float
List
Probability that the molecule has each of the specified carcinogenic properties
Toxicity
Cancer
Metabolism
https://github.com/ersilia-os/eos1579
https://doi.org/10.1038/s41589-022-01110-7
https://github.com/the-ahuja-lab/Metabokiller
Non-commercial
brosular
30/8/2022
https://github.com/brosular
https://hub.docker.com/r/ersiliaos/eos1579
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1579.zip
Local
Q3
2022
bidd-molmap-desc
Ready
Molecular maps based on broadly learned knowledge-based representations

Molecular representation of small molecules via descriptor-based molecular maps (images). The fingerprint-based molecular maps are available at eos59rr. These images can be used as inputs for an image-based deep learning model such as a convolutional neural network. The authors have demonstrated high performance of MolMap out-of-the-box with a broad range of tasks from MoleculeNet.

Pretrained
Generative
Compound
Single
Image
Descriptor
Float
Matrix
Image representation of a molecule. Each pixel represents a molecular feature
Descriptor
https://github.com/ersilia-os/eos6m4j
https://www.nature.com/articles/s42256-021-00301-6
https://github.com/shenwanxiang/bidd-molmap
GPL-3.0
miquelduranfrigola
25/8/2022
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos6m4j
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6m4j.zip
Local
Q3
2022
maip-malaria
Ready
MAIP: antimalarial activity prediction

Prediction of the antimalarial potential of small molecules. This model is an ensemble of smaller QSAR models trained on proprietary data from various sources, up to a total of >7M compounds. The training sets belong to Evotec, Johns Hopkins, MRCT, MMV - St. Jude, AZ, GSK, and St. Jude Vendor Library. The code and training data are not released, so using this model posts queries to the MAIP online server. The Ersilia Model Hub also offers MAIP-surrogate as a downloadable package for IP-sensitiv

Online
Classification
Compound
Single
Score
Float
Single
Higher score indicates higher antimalarial potential
P.falciparum
Malaria
Antimicrobial activity
https://github.com/ersilia-os/eos4zfy
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00487-2
https://www.ebi.ac.uk/chembl/maip/
None
Amna-28
18/8/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos4zfy
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4zfy.zip
Yes
Local
Q3
2022
chembl-similarity
Ready
Similarity search in ChEMBL

Given a molecule, this model looks for its 100 nearest neighbors in the ChEMBL database, according to ECFP4 Tanimoto similarity. Due to size constraints, the model redirects queries to the ChEMBL server, so input molecules are posted online when using this model.
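
A nearest-neighbor search by Tanimoto similarity can be sketched as follows. This is a simplified illustration, not the server's implementation: fingerprints are represented as plain sets of on-bit indices rather than real ECFP4 fingerprints, and the function names, molecule names and bit values are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def nearest_neighbors(
    query: FrozenSet[int], library: Dict[str, FrozenSet[int]], k: int = 100
) -> List[Tuple[str, float]]:
    """Return the k library entries most similar to the query, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical fingerprints: each integer stands for one on-bit.
library = {
    "mol_a": frozenset({1, 2, 3, 4}),
    "mol_b": frozenset({3, 4, 5}),
    "mol_c": frozenset({7, 8}),
}
query = frozenset({1, 2, 3})
hits = nearest_neighbors(query, library, k=2)
print(hits)
```

In practice the fingerprints would come from a cheminformatics toolkit and the library would be indexed for speed; the exhaustive scan here only shows how the ranking is defined.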

Online
Similarity
Compound
Single
Compound
String
List
List of 100 nearest neighbors
ChEMBL
Similarity
https://github.com/ersilia-os/eos2a9n
https://www.frontiersin.org/articles/10.3389/fchem.2020.00046/full
http://130.92.106.217:8080/chemblMuti.v1/
None
Amna-28
18/8/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos2a9n
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2a9n.zip
Local
Q3
2022
medchem17-similarity
Ready
Similarity search in ChEMBL, DrugBank and UNPD

Given a molecule, this model looks for its 100 nearest neighbors, according to ECFP4 Tanimoto similarity, in the medicinal chemistry database ChEMBL17_DrugBank17_UNPD17. This combined database contains all the compounds from the three collections (DrugBank, ChEMBL22 and the Universal Natural Product Directory (UNPD)) with up to 17 heavy atoms. It features a total of 128k compounds. The whole ChEMBL17_DrugBank17_UNPD17 database is not downloaded with the model; by using it you post queries to an online ser

Online
Similarity
Compound
Single
Compound
String
List
List of 100 nearest neighbors
Similarity
ChEMBL
DrugBank
https://github.com/ersilia-os/eos9c7k
https://onlinelibrary.wiley.com/doi/abs/10.1002/minf.201900031
https://gdb-medchem-simsearch.gdb.tools/
None
Amna-28
18/8/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos9c7k
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9c7k.zip
Local
Q3
2022
gdbmedchem-similarity
Ready
GDBMedChem similarity search

The model looks for the 100 nearest neighbors of a given molecule, according to ECFP4 Tanimoto similarity, in the GDBMedChem database. GDBMedChem is a 10M molecule-sampling from GDB17, a database containing all the enumerated molecules of up to 17 heavy atoms (166.4B molecules). GDBMedChem compounds have reduced complexity and better synthetic accessibility than GDB17 but retain high sp3 carbon fraction and natural product likeness, providing a database of diverse molecules for drug design. Th

Online
Similarity
Compound
Single
Compound
String
List
List of 100 nearest neighbors
Similarity
ChEMBL
https://github.com/ersilia-os/eos7jlv
https://onlinelibrary.wiley.com/doi/abs/10.1002/minf.201900031
https://gdb-medchem-simsearch.gdb.tools/
None
Amna-28
18/8/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos7jlv
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7jlv.zip
Local
Q3
2022
gdbchembl-similarity
Ready
GDBChEMBL similarity search

The model looks for the 100 nearest neighbors of a given molecule, according to ECFP4 Tanimoto similarity, in the GDBChEMBL database. GDBChEMBL is a 10M molecule-sampling from GDB17, a database containing all the enumerated molecules of up to 17 heavy atoms (166.4B molecules). GDBChEMBL compounds were selected using a ChEMBL-likeness score, with the objective of having a collection with higher synthetic accessibility and high bioactivity while maintaining continuous coverage of the GDB17 chemi

Online
Similarity
Compound
Single
Compound
String
List
List of 100 nearest neighbors
Similarity
ChEMBL
https://github.com/ersilia-os/eos4b8j
https://www.frontiersin.org/articles/10.3389/fchem.2020.00046/full
https://gdb-chembl-simsearch.gdb.tools/
None
Amna-28
15/8/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos4b8j
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4b8j.zip
Local
Q3
2022
chemical-vae
Ready
Variational autoencoder for small molecule generation

This variational autoencoder (VAE) for chemistry uses an encoder-decoder-predictor framework to predict new small molecules. The input SMILES molecule is converted into a continuous vector, and the decoder converts this molecular representation back to a discrete SMILES. These continuous molecular representations allow for simple operations to generate new chemical matter. The decoder is constrained to produce valid molecules. In addition, a predictor estimates the chemical properties of the mol

Pretrained
Generative
Compound
Single
Compound
String
List
Compounds generated based on the input molecule
Compound generation
https://github.com/ersilia-os/eos3ae7
https://pubs.acs.org/doi/10.1021/acscentsci.7b00572
https://github.com/aspuru-guzik-group/chemical_vae
Apache-2.0
brosular
13/8/2022
https://github.com/brosular
https://hub.docker.com/r/ersiliaos/eos3ae7
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3ae7.zip
Local
Q3
2022
chemnet-distance
Ready
FCD: Fréchet ChemNet Distance to evaluate generative models

The Fréchet ChemNet Distance (FCD) is a metric to evaluate generative models. It unifies, in a single score, whether the generated molecules are valid according to chemical and biological properties as well as their diversity from the training set. The score is a Fréchet distance, analogous to the Fréchet Inception Distance used for images, computed between sets of molecules as represented by ChemNet, a deep neural network trained to predict biological and chemical properties of small molecules.
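
For reference, the Fréchet distance between two Gaussian distributions has the standard closed form below. The notation is introduced here for illustration and does not come from the catalog entry: $(\mu_G, \Sigma_G)$ denote the mean and covariance of the ChemNet activations of the generated molecules, and $(\mu_R, \Sigma_R)$ those of the reference set.

```latex
d^2\big((\mu_G, \Sigma_G), (\mu_R, \Sigma_R)\big)
  = \lVert \mu_G - \mu_R \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_G + \Sigma_R - 2\,(\Sigma_G \Sigma_R)^{1/2} \right)
```

The distance is zero when both sets have identical mean and covariance, and it grows as either the means or the covariances diverge.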

Pretrained
Similarity
Compound
Pair of Lists
Distance
Float
Single
Fréchet ChemNet Distance (FCD). A higher FCD indicates a greater difference from the training set.
Similarity
Bioactivity profile
Compound generation
https://github.com/ersilia-os/eos9be7
https://pubs.acs.org/doi/10.1021/acs.jcim.8b00234
https://github.com/bioinf-jku/FCD
LGPL-3.0
brosular
12/8/2022
https://github.com/brosular
https://hub.docker.com/r/ersiliaos/eos9be7
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9be7.zip
Local
Q3
2022
bayesherg
Ready
BayeshERG: hERG channel blockade

BayeshERG is a predictor of small molecule-induced blockade of the hERG ion channel. To increase its predictive power, the authors pretrained a Bayesian graph neural network with 300,000 molecules as a transfer learning exercise. The pretraining set was obtained from Du et al., 2015, and the fine-tuning dataset is a collection of 14,322 molecules from public databases (8488 positives and 5834 negatives). The model was validated on external datasets and experimentally, from 12 selected compounds (

Pretrained
Classification
Compound
Single
Probability
Float
Single
Probability of hERG channel blockade. The cut-off used in the training set to define hERG blockade was IC50 <= 10 μM
hERG
Toxicity
Cardiotoxicity
https://github.com/ersilia-os/eos4tcc
https://academic.oup.com/bib/article-abstract/23/4/bbac211/6609519
https://github.com/GIST-CSBL/BayeshERG
GPL-3.0
azycn
10/8/2022
https://github.com/azycn
https://hub.docker.com/r/ersiliaos/eos4tcc
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4tcc.zip
Local
Q3
2022
rexgen
Ready
Organic reaction outcome prediction

Utilizes a Weisfeiler-Lehman network (attentive mechanism) to predict the products of an organic reaction given the reactants. The model identifies the reaction centers (set of atoms/bonds that change from reactant to product) and obtains the products directly from a graph-based neural network.

Pretrained
Generative
Compound
List
Compound
String
Flexible List
Products of an organic reaction
Chemical synthesis
https://github.com/ersilia-os/eos5qfo
https://arxiv.org/pdf/1709.04555v3.pdf
https://github.com/connorcoley/rexgen_direct
GPL-3.0
svolk19-stanford
8/8/2022
https://github.com/svolk19-stanford
https://hub.docker.com/r/ersiliaos/eos5qfo
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5qfo.zip
Local
Q3
2022
deepsmiles
Ready
DeepSMILES, an alternate SMILES representation for deep learning

DeepSMILES converts a SMILES string to a syntax better suited to machine learning, changing how both branches (using only closing parentheses) and rings (using a single symbol at ring closure that also indicates ring size) are written. This syntax is particularly suitable in generative models, when the output is a SMILES string. With DeepSMILES, scientists can train a network using this new syntax, generate new molecules represented as DeepSMILES and then decode them back to nor

Pretrained
Representation
Compound
Single
Compound
String
Single
String representing a DeepSMILES
Chemical language model
Chemical notation
https://github.com/ersilia-os/eos2mrz
https://chemrxiv.org/engage/api-gateway/chemrxiv/assets/orp/resource/item/60c73ed6567dfe7e5fec388d/original/deep-smiles-an-adaptation-of-smiles-for-use-in-machine-learning-of-chemical-structures.pdf
https://github.com/baoilleach/deepsmiles
MIT
brosular
28/7/2022
https://github.com/brosular
https://hub.docker.com/r/ersiliaos/eos2mrz
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2mrz.zip
Local
Q3
2022
admetlab
Ready
ADMETlab models for evaluation of drug candidates

A series of models for the systematic ADMET evaluation of drug candidate molecules. Models include blood-brain barrier penetration; inhibition and substrate affinity for CYP1A2, CYP2C9, CYP2C19, CYP2D6, CYP3A4, and P-gp; F 20% and F 30% bioavailability; human intestinal absorption; Ames mutagenicity; skin sensitization; plasma protein binding; volume of distribution; LD50 of acute toxicity; human hepatotoxicity; hERG blocking; clearance; half-life; Papp (Caco-2 permeability); LogD distribution coeff

Pretrained
Classification
Compound
Single
Experimental value
Float
List
Regression models provide a numerical result (LogS (log mol/L), LogP (distribution coefficient), Papp (Caco-2 permeability in cm/s), PPB (%)). Classifications provide the probability of activity according to ADMETlab thresholds.
ADME
Toxicity
Lipophilicity
Solubility
Permeability
https://github.com/ersilia-os/eos2re5
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-018-0283-x
https://github.com/ifyoungnet/ADMETlab
GPL-3.0
svolk19-stanford
28/7/2022
https://github.com/svolk19-stanford
https://hub.docker.com/r/ersiliaos/eos2re5
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2re5.zip
Local
Q3
2022
deepherg
Ready
Classification of hERG blockers and nonblockers

This model uses a multitask deep neural network (DNN) to predict the probability that a molecule is a hERG blocker. It was trained using 7889 compounds with experimental data available (IC50). The checkpoints of the pretrained model were not available, therefore we re-trained the model using the same method but without mol2vec featurization. Molecule featurization was instead done with Morgan fingerprints. Six models were tested, with several thresholds for negative decoys (10, 20, 40, 60, 80 and

Retrained
Classification
Compound
Single
Probability
Float
Single
Probability of hERG blockade. Actives are defined as IC50<10, inactives are defined as IC50>80
Toxicity
hERG
Cardiotoxicity
https://github.com/ersilia-os/eos30gr
https://pubs.acs.org/doi/full/10.1021/acs.jcim.8b00769
https://github.com/ChengF-Lab/deephERG
None
azycn
22/7/2022
https://github.com/azycn
https://hub.docker.com/r/ersiliaos/eos30gr
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos30gr.zip
Local
Q3
2022
aizynthfinder
Ready
Retrosynthesis planning

A tool for planning retrosynthesis of a target molecule based on template reactions and a stock of precursors. The algorithm breaks down the input molecule into purchasable blocks until it has been completely solved.

Pretrained
Generative
Compound
Single
Score
String
Float
Flexible List
The fraction of solved precursors and the number of reactions required for synthesis. Close to 1.0 for a solved compound, less than 0.8 for unsolved.
Synthetic accessibility
Chemical synthesis
https://github.com/ersilia-os/eos526j
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00472-1
https://github.com/MolecularAI/aizynthfinder
MIT
svolk19-stanford
19/7/2022
https://github.com/svolk19-stanford
https://hub.docker.com/r/ersiliaos/eos526j
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos526j.zip
Local
Q3
2022
selfies
Ready
SELF-referencIng Embedded Strings

String representation of small molecules that is more robust than SMILES, since, by design, all SELFIES strings are valid molecules. It is particularly helpful when applied in generative models, as all the SELFIES proposed are valid molecules. The authors also found that, in generative models, SELFIES produces more diverse molecules than SMILES.

Pretrained
Representation
Compound
Single
Compound
String
Single
String representation of a molecule (SELFIES)
Chemical notation
Chemical language model
Compound generation
https://github.com/ersilia-os/eos6pbf
https://arxiv.org/pdf/1905.13741
https://github.com/aspuru-guzik-group/selfies
Apache-2.0
brosular
14/7/2022
https://github.com/brosular
https://hub.docker.com/r/ersiliaos/eos6pbf
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6pbf.zip
Local
Q3
2022
pkasolver
Ready
Microstate pKa values

This model employs transfer learning with graph neural networks to predict microstate pKa values of small molecules. The model enumerates the molecule's protonation states and predicts its pKa values. It was trained in two phases: first on a large ChEMBL dataset, and then fine-tuned on a small training set of molecules with available pKa values. The model in this repository is the pkasolver-light, which does not require an Epik license and is limited to monoprotic molecu

Pretrained
Regression
Compound
Single
Experimental value
Float
Single
Acidity of a molecule (lower pKa indicates stronger acid)
pKa
ADME
https://github.com/ersilia-os/eos2b6f
https://www.biorxiv.org/content/10.1101/2022.01.20.476787v1
https://github.com/mayrf/pkasolver
MIT
svolk19-stanford
13/7/2022
https://github.com/svolk19-stanford
https://hub.docker.com/r/ersiliaos/eos2b6f
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2b6f.zip
Local
Q3
2022
grover-qm8
Ready
Electronic spectra and excited state energy

Prediction of the electronic spectra and excited state energy of small molecules. The training set is the QM8 dataset from MoleculeNet, where the electronic properties have been calculated by multiple quantum mechanical methods. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Pretrained
Regression
Compound
Single
Other value
Float
List
Predicted electronic spectra and excited state energy
MoleculeNet
Chemical graph model
Quantum properties
https://github.com/ersilia-os/eos3xip
https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html
https://github.com/tencent-ailab/grover
MIT
Amna-28
13/7/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos3xip
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3xip.zip
Yes
Local
Q3
2022
grover-qm7
Ready
Atomization energy of small molecules

The model predicts the atomization energy of a molecule. It has been trained using the QM7 dataset from MoleculeNet, a subset of GDB13 containing molecules of up to 23 atoms, including up to 7 heavy atoms (C, N, O, S). This dataset contains the computed atomization energy of 7165 molecules. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Pretrained
Regression
Compound
Single
Other value
Float
Single
Atomization energy of the molecule
MoleculeNet
Chemical graph model
Quantum properties
https://github.com/ersilia-os/eos6o0z
https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html
https://github.com/tencent-ailab/grover
MIT
Amna-28
13/7/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos6o0z
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6o0z.zip
Yes
Local
Q3
2022
grover-lipo
Ready
Octanol/water distribution coefficient

Prediction of the octanol/water distribution coefficient (logD at pH 7.4), trained using the Lipophilicity dataset from MoleculeNet. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Pretrained
Regression
Compound
Single
Experimental value
Float
Single
Predicted logD at pH 7.4
MoleculeNet
Lipophilicity
ADME
LogD
Chemical graph model
https://github.com/ersilia-os/eos85a3
https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html
https://github.com/tencent-ailab/grover
MIT
Amna-28
13/7/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos85a3
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos85a3.zip
Yes
Local
Q3
2022
grover-esol
Ready
Water solubility

Prediction of water solubility (log solubility in mol/L) for common organic small molecules, trained using the ESOL dataset from MoleculeNet.

Pretrained
Regression
Compound
Single
Experimental value
Float
Single
Log solubility (mol/L)
Solubility
MoleculeNet
ADME
LogS
Chemical graph model
https://github.com/ersilia-os/eos8451
https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html
https://github.com/tencent-ailab/grover
MIT
Amna-28
13/7/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos8451
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8451.zip
Yes
Local
Q3
2022
grover-freesolv
Ready
Hydration free energy of small molecules in water

Model based on experimental and calculated hydration free energy of small molecules in water, the FreeSolv dataset from MoleculeNet. Hydration free energies are relevant to understanding the binding of a molecule (in solution) to its binding site. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Pretrained
Regression
Compound
Single
Other value
Float
Single
Calculated Hydration Free energy in kcal/mol
MoleculeNet
Chemical graph model
Quantum properties
https://github.com/ersilia-os/eos157v
https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html
https://github.com/tencent-ailab/grover
MIT
Amna-28
13/7/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos157v
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos157v.zip
Yes
Local
Q3
2022
grover-toxcast
Ready
ToxCast toxicity panel

Prediction across the ToxCast toxicity panel, containing hundreds of toxicity outcomes, as part of the MoleculeNet benchmark. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Pretrained
Classification
Compound
Single
Probability
Float
List
Probability of toxicity against 617 biological targets
Toxicity
ToxCast
Chemical graph model
https://github.com/ersilia-os/eos481p
https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html
https://github.com/tencent-ailab/grover
MIT
Amna-28
13/7/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos481p
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos481p.zip
Yes
Local
Q3
2022
grover-bace
Ready
BACE-1 inhibition

Prediction of beta-secretase 1 (BACE-1) inhibition. BACE-1 is expressed mainly in neurons and has been implicated in the development of Alzheimer's disease. This model has been trained on the BACE dataset from MoleculeNet using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Pretrained
Classification
Compound
Single
Probability
Float
Single
Probability that the molecule is a BACE-1 inhibitor (using a 0.1 μM cut-off)
Alzheimer
BACE
MoleculeNet
Chemical graph model
https://github.com/ersilia-os/eos2mhp
https://arxiv.org/abs/2007.02835
https://github.com/tencent-ailab/grover
MIT
Amna-28
13/7/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos2mhp
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2mhp.zip
Yes
Local
Q3
2022
grover-clintox
Ready
Toxicity at clinical trial stage

Using the ClinTox dataset from MoleculeNet, the authors trained a classification model to predict the likelihood of failure in clinical trials due to toxicity. The dataset has been built using FDA-approved drugs (non-toxic) and a set of drugs that have failed at advanced clinical trial stages. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Pretrained
Classification
Compound
Single
Probability
Float
List
Probability that a molecule is approved by the FDA and probability that a molecule shows toxicity in clinical trials
Toxicity
MoleculeNet
Chemical graph model
Side effects
https://github.com/ersilia-os/eos6fza
https://arxiv.org/abs/2007.02835
https://github.com/tencent-ailab/grover
MIT
Amna-28
13/7/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos6fza
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6fza.zip
Yes
Local
Q3
2022
grover-tox21
Ready
Predicts activity of compounds across the Tox21 panel

Predicts activity of compounds in the Tox21 toxicity panel, comprising 12 toxicity pathways, as part of the MoleculeNet benchmark datasets. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Pretrained
Classification
Compound
Single
Probability
Float
List
Toxicity measurements against 12 biological targets
Tox21
Toxicity
Chemical graph model
https://github.com/ersilia-os/eos5smc
https://papers.nips.cc/paper/2020/file/94aef38441efa3380a3bed3faf1f9d5d-Paper.pdf
https://github.com/tencent-ailab/grover
MIT
Amna-28
12/7/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos5smc
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5smc.zip
Yes
Local
Q3
2022
sa-score
Ready
Synthetic accessibility score

Estimation of the synthetic accessibility score (SAScore) of drug-like molecules based on molecular complexity and fragment contributions. The fragment contributions are based on a 1M sample from PubChem and the molecular complexity is based on the presence/absence of non-standard structural features. It has been validated by comparing the SAScore with the estimates of expert medicinal chemists for 40 molecules (r2 = 0.89). The SAScore has been contributed to the RDKit package.

Pretrained
Regression
Compound
Single
Score
Float
Single
Low scores indicate higher synthetic accessibility
Synthetic accessibility
Chemical synthesis
https://github.com/ersilia-os/eos9ei3
https://jcheminf.biomedcentral.com/articles/10.1186/1758-2946-1-8
https://github.com/rdkit/rdkit/tree/master/Contrib/SA_Score
BSD-3.0
https://eos9ei3-tkreo.ondigitalocean.app/
miquelduranfrigola
10/7/2022
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos9ei3
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9ei3.zip
https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos9ei3
Online
Q3
2022
chemtb
Ready
Mycobacterium tuberculosis inhibitor prediction

Identification of active molecules against Mycobacterium tuberculosis using an ensemble of data from ChEMBL25 (Target IDs 360, 2111188 and 2366634). The final model is a stacking model integrating four algorithms, including support vector machine, random forest, extreme gradient boosting and deep neural networks.

Pretrained
Classification
Compound
Single
Probability
Float
Single
Probability of M.tb inhibition (measured as IC50 at cut-off 5 uM)
M.tuberculosis
IC50
Tuberculosis
Antimicrobial activity
https://github.com/ersilia-os/eos46ev
https://academic.oup.com/bib/article-abstract/22/5/bbab068/6209685
http://cadd.zju.edu.cn/chemtb/
None
Amna-28
28/6/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos46ev
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos46ev.zip
Yes
Local
Q2
2022
ssl-gcn-tox21
Ready
Toxicity prediction across the Tox21 panel with semi-supervised learning

Toxicity prediction across the Tox21 panel from MoleculeNet, comprising 12 toxicity pathways. The model uses the Mean Teacher Semi-Supervised Learning (MT-SSL) approach to overcome the low number of data points experimentally annotated for toxicity tasks. For the MT-SSL, Tox21 (7,831 compounds and 12 different endpoints) was used as labeled data and a selection of 50K compounds from other MoleculeNet datasets was used as unlabeled data.

Pretrained
Classification
Compound
Single
Probability
Float
List
Probability of toxicity across 12 tasks defined in Tox21
Tox21
Toxicity
MoleculeNet
https://github.com/ersilia-os/eos69p9
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00570-8
https://github.com/chen709847237/SSL-GCN
None
Amna-28
16/6/2022
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos69p9
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos69p9.zip
Local
Q2
2022
coprinet-molecule-price
Ready
Small molecule price prediction

CoPriNet has been trained on 2D graph representations of small molecules with their associated price in the Mcule catalog. The predicted price provides a better overview of the compound availability than standard synthetic accessibility scores or retrosynthesis tools. The Mcule catalog is proprietary but the trained model as well as the test dataset (100K) are publicly available.

Pretrained
Regression
Compound
Single
Other value
Float
Single
Price value prediction
Price
Compound generation
Chemical synthesis
https://github.com/ersilia-os/eos7a45
https://pubs.rsc.org/en/content/articlelanding/2023/dd/d2dd00071g
https://github.com/oxpig/CoPriNet
MIT
anamika-yadav99
28/3/2022
https://github.com/anamika-yadav99
https://hub.docker.com/r/ersiliaos/eos7a45
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7a45.zip
Local
Q1
2022
compound-test-5
Test
Test model 5

Test model 5

Dummy
Dummy
Compound
List
Dummy
Dummy model
Dummy
GPL-3.0
miquelduranfrigola
19/8/2022
https://github.com/miquelduranfrigola
Local
Q3
2022
compound-test-4
Test
Test model 4

Test model 4

Dummy
Dummy
Compound
Single
Dummy
Dummy model
Dummy
GPL-3.0
miquelduranfrigola
21/7/2022
https://github.com/miquelduranfrigola
Local
Q3
2022
eos-template-test
Test
Test for the eos-template

This is a vanilla test for the eos-template

Dummy
Dummy
Compound
Single
Dummy
Dummy model
Dummy
https://github.com/ersilia-os/eost00
GPL-3.0
miquelduranfrigola
12/7/2022
https://github.com/miquelduranfrigola
Local
Q3
2022
deepfl-logp
Ready
Membrane permeability of fluorescent probes

A deep neural network was trained to predict the LogP value of small molecules and fluorescent probes using an experimentally annotated dataset of >13k molecules (OPERA). This dataset was complemented with fluorescent probes to improve the model accuracy in this space. Probes predicted impermeant to cell membranes consistently showed experimental LogP <1.

Pretrained
Regression
Compound
Single
Experimental value
Float
Single
LogP values of > 1 indicate membrane permeability
Permeability
ADME
LogP
https://github.com/ersilia-os/eos65rt
https://www.nature.com/articles/s41598-021-86460-3.epdf?sharing_token=zmYZd6qpwnDwc8tCOYGGf9RgN0jAjWel9jnR3ZoTv0OXuXXr_ZS6VuKQMyMJiA3PeIcqAJZTcpcNZJHblyChkQ2eTpzGXq23YsIcFlG8ayuEptKCJ1DeyIRGrh9O2d5JvvGGB9qG8cXgAuy_k-e1ncAMkAzpTegmR0XUbnftjv0%3D
https://github.com/k-soliman/DeepFl-LogP
GPL-3.0
miquelduranfrigola
10/11/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos65rt
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos65rt.zip
Local
Q4
2021
passive-permeability
Ready
Passive permeability based on simulations

Using Coarse Grained (CG) models, where several atoms are aggregated into a single bead, the authors obtain a set of 500,000 compounds with their simulated permeability across a single-component DOPC lipid bilayer. With this approach, the authors are able to cover a large and representative portion of the chemical space. We have used the data generated in this publication to train a simple regression model to predict compound permeability.

In-house
Regression
Compound
Single
Experimental value
Float
Single
Permeability coefficient (P). Cut-off: 6
Permeability
ADME
Papp
https://github.com/ersilia-os/eos2hbd
https://pubs.acs.org/doi/full/10.1021/acscentsci.8b00718?ref=recommended
https://pubs.acs.org/doi/full/10.1021/acscentsci.8b00718?ref=recommended
None
miquelduranfrigola
10/11/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos2hbd
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2hbd.zip
Yes
Local
Q4
2021
pampa-permeability
Ready
PAMPA effective permeability

The authors provide a dataset of 200 small molecules and their experimentally measured permeability in a PAMPA assay. Using this data, we have trained a model that predicts the logarithm of the effective permeability coefficient.

In-house
Regression
Compound
Single
Experimental value
Float
Single
logPe
Permeability
ADME
LogP
https://github.com/ersilia-os/eos97yu
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6651837/
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6651837/
None
miquelduranfrigola
10/11/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos97yu
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos97yu.zip
Yes
Local
Q4
2021
natural-product-fingerprint
Ready
Natural product fingerprint

The model uses a combination of two multilayer perceptron networks (baseline and auxiliary) and an autoencoder-like network to extract natural-product-specific fingerprints that outperform traditional methods for molecular representation. The training sets correspond to the COCONUT database (NP) and the ZINC database (synthetic).

Pretrained
Representation
Compound
Single
Descriptor
String
List
Descriptor of a molecule
Natural product
Fingerprint
Descriptor
https://github.com/ersilia-os/eos6tg8
https://www.sciencedirect.com/science/article/pii/S2001037021003226?via%3Dihub#f0010
https://github.com/kochgroup/neural_npfp
None
miquelduranfrigola
3/11/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos6tg8
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6tg8.zip
Local
Q4
2021
maip-malaria-surrogate
Ready
MAIP distillation: antimalarial potential prediction

Prediction of the antimalarial potential of small molecules. This model was originally trained on proprietary data from various sources, totalling >7M compounds. The training sets belong to Evotec, Johns Hopkins, MRCT, MMV - St. Jude, AZ, GSK, and St. Jude Vendor Library. In this implementation, we have used a teacher-student approach to train a surrogate model based on ChEMBL data (2M molecules) to provide a lite downloadable version of the original MAIP.

Retrained
Classification
Compound
Single
Score
Float
Single
Higher score indicates higher antimalarial potential
P.falciparum
Malaria
Antimicrobial activity
https://github.com/ersilia-os/eos2gth
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00487-2
https://www.ebi.ac.uk/chembl/maip/
None
miquelduranfrigola
2/11/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos2gth
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2gth.zip
No
Local
Q4
2021
syba-synthetic-accessibility
Ready
Bayesian prediction of synthetic accessibility

SYBA uses a fragment-based approach to classify whether a molecule is easy or hard to synthesize, and it can also be used to analyze the contribution of individual fragments to the total synthetic accessibility. The easy-to-synthesize dataset is an extract of the ZINC purchasable compounds, and the hard-to-synthesize dataset is generated using the Nonpher approach (introducing small molecular perturbations to transform molecules into more complex compounds). The fragments are calculated with ECFP8.

Pretrained
Regression
Compound
Single
Score
Float
Single
Higher score indicates higher confidence that the molecule is synthetically available
Synthetic accessibility
Chemical synthesis
https://github.com/ersilia-os/eos7pw8
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00439-2
https://github.com/lich-uct/syba
GPL-3.0
miquelduranfrigola
25/10/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos7pw8
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7pw8.zip
Local
Q4
2021
natural-product-score
Ready
Natural product score

A simple score to distinguish between natural products (or natural product-like molecules) and synthetic compounds. The score was calculated using an analysis of the structural features that distinguish natural products (NP) from synthetic molecules. NP structures were obtained from the CRC Dictionary of Natural Products and synthetic molecules belong to an in-house collection. This method has been contributed to the RDKit package; Ersilia is simply implementing the RDKit NP_Score.

Pretrained
Regression
Compound
Single
Score
Float
List
Higher score indicates higher natural product likeness
Natural product
Drug-likeness
https://github.com/ersilia-os/eos8ioa
http://pubs.acs.org/doi/abs/10.1021/ci700286x
https://github.com/rdkit/rdkit/tree/master/Contrib/NP_Score
BSD-3.0
miquelduranfrigola
19/10/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos8ioa
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8ioa.zip
Local
Q4
2021
natural-product-likeness
Ready
Natural product likeness score

The model is a derivation of the natural product fingerprint (eos6tg8). In addition to generating specific natural product fingerprints, the activation value of the neuron that predicts if a molecule is a natural product or not can be used as a NP-likeness score. The method outperforms the NP_Score implemented in RDKit.

Pretrained
Regression
Compound
Single
Score
Float
Single
Higher score indicates higher natural product likeness
Natural product
Drug-likeness
https://github.com/ersilia-os/eos9yui
https://www.sciencedirect.com/science/article/pii/S2001037021003226?
https://github.com/kochgroup/neural_npfp
None
https://eos9yui-7xpw3.ondigitalocean.app/
miquelduranfrigola
19/10/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos9yui
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9yui.zip
https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos9yui
Online
Q4
2021
retrosynthetic-accessibility
Ready
Retrosynthetic accessibility score

Retrosynthetic accessibility score based on the computer-aided synthesis planning tool AiZynthFinder. The authors selected a ChEMBL subset of 200,000 molecules and checked whether AiZynthFinder could identify a synthetic route for each of them. These data were used to train a classifier that computes 4500 times faster than the underlying AiZynthFinder. For molecules outside the applicability domain, such as those in the GDB database, the model needs to be fine-tuned to the use case.

Pretrained
Regression
Compound
Single
Score
Float
Single
Higher score indicates easier retrosynthetic accessibility
Synthetic accessibility
Chemical synthesis
https://github.com/ersilia-os/eos2r5a
https://pubs.rsc.org/en/content/articlelanding/2021/sc/d0sc05401a
https://github.com/reymond-group/RAscore
MIT
miquelduranfrigola
19/10/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos2r5a
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2r5a.zip
Local
Q4
2021
soltrannet-aqueous-solubility
Ready
Aqueous solubility prediction

Fast aqueous solubility prediction based on the Molecule Attention Transformer (MAT). The authors used AqSolDB to fine-tune the MAT network for solubility prediction, achieving competitive scores in the Second Challenge to Predict Aqueous Solubility (SC2).

Pretrained
Regression
Compound
Single
Experimental value
Float
Single
Predicted LogS (log of the solubility)
Solubility
ADME
LogS
https://github.com/ersilia-os/eos6oli
https://pubs.acs.org/doi/10.1021/acs.jcim.1c00331
https://github.com/gnina/SolTranNet
Apache-2.0
miquelduranfrigola
19/10/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos6oli
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6oli.zip
Yes
Local
Q4
2021
molgrad-ppb
Ready
Coloring molecules for plasma protein binding prediction

By combining a Message-Passing Graph Neural Network (MPGNN) and a Forward fully connected Neural Network (FNN) with an integrated gradients explainable artificial intelligence (XAI) method, the authors developed MolGrad and tested it on a number of ADME predictive tasks. MolGrad incorporates explainable features to facilitate interpretation of the predictions. In this model, they train MolGrad with data from a Plasma-protein binding assay (PPB) to predict the fraction bound in plasma of small mo

Pretrained
Regression
Compound
Single
Experimental value
Float
Single
Fraction (%) bound in plasma
ADME
Fraction bound
Chemical graph model
https://github.com/ersilia-os/eos6ao8
https://pubs.acs.org/doi/10.1021/acs.jcim.0c01344
https://github.com/josejimenezluna/molgrad/
AGPL-3.0
miquelduranfrigola
19/10/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos6ao8
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6ao8.zip
Yes
Local
Q4
2021
molgrad-herg
Ready
Coloring molecules for hERG blockade

By combining a Message-Passing Graph Neural Network (MPGNN) and a Forward fully connected Neural Network (FNN) with an integrated gradients explainable artificial intelligence (XAI) method, the authors developed MolGrad and tested it on a number of ADME predictive tasks. MolGrad incorporates explainable features to facilitate interpretation of the predictions. In this model, they train MolGrad with a dataset of hERG channel blockers/non-blockers to predict the cardiotoxicity of small molecules (I

Pretrained
Regression
Compound
Single
Experimental value
Float
Single
pIC50 of hERG inhibition
hERG
Toxicity
Cardiotoxicity
Chemical graph model
https://github.com/ersilia-os/eos43at
https://pubs.acs.org/doi/10.1021/acs.jcim.0c01344
https://github.com/josejimenezluna/molgrad/
AGPL-3.0
https://eos43at-zqx9x.ondigitalocean.app/
miquelduranfrigola
19/10/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos43at
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos43at.zip
https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos43at
Yes
Online
Q4
2021
molgrad-caco2
Ready
Coloring molecules for Caco-2 cell permeability

By combining a Message-Passing Graph Neural Network (MPGNN) and a Forward fully connected Neural Network (FNN) with an integrated gradients explainable artificial intelligence (XAI) method, the authors developed MolGrad and tested it on a number of ADME predictive tasks. MolGrad incorporates explainable features to facilitate interpretation of the predictions. This model has been trained using experimental data on the permeability of molecules across Caco-2 cell membranes (Papp, cm s-1).

Pretrained
Regression
Compound
Single
Experimental value
Float
Single
Log10 of the passive permeability (cm s-1)
Permeability
ADME
Papp
Chemical graph model
https://github.com/ersilia-os/eos1af5
https://pubs.acs.org/doi/10.1021/acs.jcim.0c01344
https://github.com/josejimenezluna/molgrad/
AGPL-3.0
miquelduranfrigola
19/10/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos1af5
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1af5.zip
Yes
Local
Q4
2021
cardiotoxnet-herg
Ready
Ligand-based prediction of hERG blockade

A robust predictor for hERG channel blockade based on an ensemble of five deep learning models. The authors have collected a dataset from public sources, such as BindingDB and ChEMBL on hERG blockers and non-blockers. The cut-off for hERG blockade was set at IC50 < 10 uM for the classifier.

Pretrained
Classification
Compound
Single
Probability
Float
Single
Probability that the compound inhibits hERG (IC50 < 10 uM)
hERG
Toxicity
Cardiotoxicity
https://github.com/ersilia-os/eos2ta5
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00541-z
https://github.com/Abdulk084/CardioTox
None
miquelduranfrigola
18/10/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos2ta5
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2ta5.zip
Local
Q4
2021
molgrad-cyp3a4
Ready
Coloring molecules for interaction with CYP3A4

By combining a Message-Passing Graph Neural Network (MPGNN) and a Forward fully connected Neural Network (FNN) with an integrated gradients explainable artificial intelligence (XAI) method, the authors developed MolGrad and tested it on a number of ADME predictive tasks. MolGrad incorporates explainable features to facilitate interpretation of the predictions. This model has been trained using a ChEMBL dataset of CYP450 3A4 inhibitors (0) and non-inhibitors (1).

Pretrained
Classification
Compound
Single
Probability
Float
Single
Probability that the molecule is metabolized by CYP3A4 (cut-off: 10 uM)
CYP450
ADME
Chemical graph model
https://github.com/ersilia-os/eos96ia
https://pubs.acs.org/doi/10.1021/acs.jcim.0c01344
https://github.com/josejimenezluna/molgrad/
GPL-3.0
miquelduranfrigola
18/10/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos96ia
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos96ia.zip
Yes
Local
Q4
2021
mycpermcheck
Ready
Membrane permeability in Mycobacterium tuberculosis

MycPermCheck predicts potential to permeate the Mycobacterium tuberculosis cell membrane based on physicochemical properties.

Pretrained
Classification
Compound
Single
Probability
Float
Single
Probability of permeability across the M.tb cell wall
Permeability
M.tuberculosis
ADME
Tuberculosis
https://github.com/ersilia-os/eos8d8a
https://academic.oup.com/bioinformatics/article/29/1/62/272745
https://www.mycpermcheck.aksotriffer.pharmazie.uni-wuerzburg.de/index.html
MIT
miquelduranfrigola
14/10/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos8d8a
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8d8a.zip
Yes
Local
Q4
2021
padel
Ready
PaDEL small molecule descriptors

PaDEL is a commonly used molecular descriptor calculator. It computes 1875 molecular descriptors (1444 1D and 2D descriptors, 431 3D descriptors) and 12 types of fingerprints for small molecule representation. Originally developed in Java, it is provided here through PaDELPy, its Python wrapper.

Pretrained
Representation
Compound
Single
Descriptor
Float
List
Vector representation of a molecule
Descriptor
https://github.com/ersilia-os/eos7asg
https://onlinelibrary.wiley.com/doi/10.1002/jcc.21707
https://github.com/ecrl/padelpy
MIT
miquelduranfrigola
27/9/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos7asg
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7asg.zip
Local
Q3
2021
smiles-transformer
Ready
SMILES transformer descriptor

Molecular embedding based on natural language processing. It converts SMILES into fingerprints using an unsupervised model pre-trained on a very large SMILES dataset from ChEMBL. The transformer is particularly well-suited for low-data drug discovery.

Pretrained
Representation
Compound
Single
Descriptor
Float
List
Vector representation of small molecules
Chemical language model
Descriptor
Embedding
https://github.com/ersilia-os/eos2lm8
https://arxiv.org/abs/1911.04738
https://github.com/DSPsleeporg/smiles-transformer
MIT
miquelduranfrigola
22/9/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos2lm8
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2lm8.zip
Local
Q3
2021
mordred
Ready
Mordred chemical descriptors

A set of ca. 1,800 chemical descriptors, including both RDKit and original modules. It is comparable to the well-known PaDEL-Descriptor (see eos7asg), but has shorter calculation times and can process larger molecules.

Pretrained
Representation
Compound
Single
Descriptor
Float
List
Vector representation of a molecule
Descriptor
https://github.com/ersilia-os/eos78ao
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-018-0258-y
https://github.com/mordred-descriptor/mordred
BSD-3.0
miquelduranfrigola
17/9/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos78ao
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos78ao.zip
Local
Q3
2021
rdkit-fingerprint
Ready
Path-based fingerprint

Path-based fingerprints calculated with the RDKit function Chem.RDKFingerprint. It is inspired by the Daylight fingerprint. As explained in the RDKit Book, the fingerprinting algorithm identifies all subgraphs in the molecule within a particular range of sizes, hashes each subgraph to generate a raw bit ID, mods that raw bit ID to fit in the assigned fingerprint size, and then sets the corresponding bit.

Pretrained
Representation
Compound
Single
Descriptor
Float
List
Vector representation of small molecules
Fingerprint
Descriptor
https://github.com/ersilia-os/eos7jio
https://www.rdkit.org/docs/RDKit_Book.html
https://github.com/rdkit/rdkit
BSD-3.0
miquelduranfrigola
17/9/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos7jio
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7jio.zip
Local
Q3
2021
molbert
Ready
MolBERT chemical language transformer

Molecular representation using the BERT language Transformer. The model has been pre-trained on the GuacaMol dataset (~1.6M molecules from ChEMBL), and can be fine-tuned to the desired QSAR tasks. It has been benchmarked in MoleculeNet.

Pretrained
Representation
Compound
Single
Descriptor
Float
List
Embedding representation of a molecule
Chemical language model
Embedding
Descriptor
https://github.com/ersilia-os/eos2thm
https://arxiv.org/abs/2011.13230
https://github.com/BenevolentAI/MolBERT
MIT
miquelduranfrigola
17/9/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos2thm
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2thm.zip
Local
Q3
2021
rdkit-descriptors
Ready
Physicochemical descriptors available from RDKit

A set of 200 physicochemical descriptors available from RDKit, including molecular weight, solubility and druggability parameters. We have used the Descriptastorus selection of RDKit descriptors for simplicity.

Pretrained
Representation
Compound
Single
Descriptor
Float
List
Vector representation of small molecules
Descriptor
https://github.com/ersilia-os/eos8a4x
https://www.rdkit.org/docs/RDKit_Book.html
https://github.com/bp-kelley/descriptastorus
Proprietary
miquelduranfrigola
17/9/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos8a4x
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8a4x.zip
Local
Q3
2021
avalon
Ready
Avalon fingerprint

Avalon is a path-based substructure key fingerprint (1024 bits), developed for substructure screen-out when searching. It is part of the Avalon Chemoinformatics Toolkit and has also been implemented as an external RDKit tool.

Pretrained
Representation
Compound
Single
Descriptor
Integer
List
Bitvector representation of a molecule
Fingerprint
https://github.com/ersilia-os/eos8h6g
https://pubs.acs.org/doi/full/10.1021/ci050413p
https://github.com/rdkit/rdkit/tree/master/External/AvalonTools
BSD-3.0
miquelduranfrigola
14/9/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos8h6g
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8h6g.zip
Local
Q3
2021
molecular-weight
Ready
Molecular weight

The model is simply an implementation of the function Descriptors.MolWt of the chemoinformatics package RDKit. It takes as input a small molecule (SMILES) and calculates its molecular weight in g/mol.

Pretrained
Regression
Compound
Single
Other value
Float
Single
Calculated molecular weight (g/mol)
Molecular weight
https://github.com/ersilia-os/eos3b5e
https://www.rdkit.org/docs/RDKit_Book.html
https://github.com/rdkit/rdkit
BSD-3.0
miquelduranfrigola
13/9/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos3b5e
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3b5e.zip
CPU
Local
Q3
2021
morgan-counts
Ready
Morgan counts fingerprints

Morgan fingerprints, or extended connectivity fingerprints (ECFP4), are among the most widely used molecular representations. They are circular representations (starting from an atom, they capture the atoms around it within a radius n) and can have thousands of features. This implementation uses the RDKit package with radius 3 and 2048 dimensions.

Pretrained
Representation
Compound
Single
Descriptor
Integer
List
Vector representation of a molecule
Fingerprint
Descriptor
https://github.com/ersilia-os/eos5axz
https://www.rdkit.org/docs/RDKit_Book.html
https://github.com/rdkit/rdkit
BSD-3.0
miquelduranfrigola
30/8/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos5axz
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5axz.zip
Local
Q3
2021
whales-descriptor
Ready
Holistic molecular descriptors for scaffold hopping

Weighted Holistic Atom Localization and Entity Shape (WHALES) is a descriptor based on 3D structure to facilitate natural product featurization. It is aimed at scaffold hopping exercises from natural products to synthetic compounds.

Pretrained
Representation
Compound
Single
Descriptor
Float
List
Vector representation of a molecule
Natural product
Descriptor
https://github.com/ersilia-os/eos3ae6
https://www.nature.com/articles/s42004-018-0043-x
https://github.com/ETHmodlab/scaffold_hopping_whales
MIT
miquelduranfrigola
15/7/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos3ae6
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3ae6.zip
Local
Q3
2021
grover-embedding
Ready
Large-scale graph transformer

GROVER is a self-supervised Graph Neural Network for molecular representation, pretrained with 10 million unlabelled molecules from ChEMBL and ZINC15. The model provided here is GROVERlarge. GROVER has then been fine-tuned to predict several activities from the MoleculeNet benchmark, consistently outperforming other state-of-the-art methods on several benchmark datasets.

Pretrained
Representation
Compound
Single
Descriptor
Float
List
Embedding representation of a molecule
Chemical graph model
Embedding
Descriptor
https://github.com/ersilia-os/eos7w6n
https://papers.nips.cc/paper/2020/file/94aef38441efa3380a3bed3faf1f9d5d-Paper.pdf
https://github.com/tencent-ailab/grover
MIT
miquelduranfrigola
2/7/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos7w6n
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7w6n.zip
Yes
Local
Q3
2021
cc-signaturizer
Ready
Chemical Checker signaturizer

A set of 25 Chemical Checker bioactivity signatures (including 2D & 3D fingerprints, scaffold, binding, crystals, side effects, cell bioassays, etc) to capture properties of compounds beyond their structures. Each signature has a length of 128 dimensions. In total, there are 3200 dimensions. The signaturizer is periodically updated. We use the 2020-02 version of the signaturizer.

Pretrained
Representation
Compound
Single
Descriptor
Float
List
2D projection of bioactivity signatures
Descriptor
Bioactivity profile
Embedding
https://github.com/ersilia-os/eos4u6p
https://www.nature.com/articles/s41467-021-24150-4
http://gitlabsbnb.irbbarcelona.org/packages/signaturizer
MIT
miquelduranfrigola
1/7/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos4u6p
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4u6p.zip
Local
Q3
2021
cdd-descriptor
Ready
Continuous and data-driven descriptors

Low-dimensional continuous descriptor based on a neural machine translation model. This model has been trained by inputting an IUPAC molecular representation to obtain its SMILES. The intermediate continuous vector produced by the encoder when reading the IUPAC name is a representation of the molecule, containing all the information needed to generate the output sequence (SMILES). This model has been pretrained on a large dataset combining ChEMBL and ZINC.

Pretrained
Representation
Compound
Single
Descriptor
Float
List
Embedding representation of a molecule
Descriptor
Chemical language model
https://github.com/ersilia-os/eos7a04
https://pubs.rsc.org/en/content/articlelanding/2019/sc/c8sc04175j
https://github.com/jrwnter/cddd
MIT
miquelduranfrigola
1/7/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos7a04
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7a04.zip
Local
Q3
2021
grover-sider
Ready
Adverse Drug Reactions

The model predicts the putative adverse drug reactions (ADR) of a molecule, using the SIDER database (MoleculeNet) that contains pairs of marketed drugs and their described ADRs. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Pretrained
Classification
Compound
Single
Probability
Float
List
Predicted ADRs classified in 27 groups
Toxicity
MoleculeNet
Side effects
https://github.com/ersilia-os/eos77w8
https://arxiv.org/abs/2007.02835
https://github.com/tencent-ailab/grover
MIT
Amna-28
4/6/2021
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos77w8
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos77w8.zip
Yes
Local
Q2
2021
grover-bbbp
Ready
Blood-brain barrier penetration

This model predicts the Blood-Brain Barrier (BBB) penetration potential of small molecules, using as training data the curated MoleculeNet benchmark containing 2000 experimental data points. It has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Pretrained
Classification
Compound
Single
Probability
Float
Single
Probability that a molecule crosses the blood brain barrier
Permeability
MoleculeNet
Chemical graph model
Alzheimer
https://github.com/ersilia-os/eos1amr
https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html
https://github.com/tencent-ailab/grover
MIT
Amna-28
4/6/2021
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos1amr
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1amr.zip
Yes
Local
Q2
2021
chembl-multitask-descriptor
Ready
Multi-target prediction based on ChEMBL data

This is a ligand-based target prediction model developed by the ChEMBL team. They trained the model using pairs of small molecules and their protein targets, and produced a multitask predictor. The thresholds of activity were determined by protein families (kinases: <= 30nM, GPCRs: <= 100nM, Nuclear Receptors: <= 100nM, Ion Channels: <= 10μM, Non-IDG Family Targets: <= 1μM). Here we provide the model trained on ChEMBL_28, which showed an accuracy of 85%.

Pretrained
Classification
Compound
Single
Probability
Float
List
Probability of having the protein (identified by ChEMBL ID), as target
Bioactivity profile
Target identification
ChEMBL
https://github.com/ersilia-os/eos1vms
http://chembl.blogspot.com/2019/05/multi-task-neural-network-on-chembl.html
https://github.com/chembl/chembl_multitask_model/
None
miquelduranfrigola
4/6/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos1vms
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1vms.zip
Local
Q2
2021
etoxpred
Ready
Toxicity and synthetic accessibility prediction

The eToxPred tool predicts two scores: the synthetic accessibility (SA) score, or how easy it is to make the molecule in the laboratory, and the toxicity (Tox) score, or the probability that the molecule is toxic to humans. The authors trained and cross-validated both predictors on a large number of datasets, and demonstrated the method's usefulness in building custom virtual libraries.

Pretrained
Regression
Compound
Single
Score
Float
Single
Higher scores indicate easier synthetic accessibility and higher toxicity, respectively
Toxicity
Synthetic accessibility
https://github.com/ersilia-os/eos92sw
https://bmcpharmacoltoxicol.biomedcentral.com/articles/10.1186/s40360-018-0282-6
https://github.com/pulimeng/eToxPred
GPL-3.0
miquelduranfrigola
4/6/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos92sw
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos92sw.zip
Local
Q2
2021
chemprop-sars-cov-inhibition
Ready
SARS-CoV inhibition

This model was developed to support early efforts to identify novel drugs against SARS-CoV-2. It predicts the probability that a small molecule inhibits SARS-3CLpro-mediated peptide cleavage. It was trained on data from a high-throughput screen against the 3CL protease of SARS-CoV-1, as no data were yet available for the new virus (SARS-CoV-2) causing the COVID-19 pandemic. It uses the ChemProp model.

Pretrained
Classification
Compound
Single
Probability
Float
Single
Probability of 3CL protease inhibition (%). The classifier was trained using a threshold of 12% inhibition.
COVID19
Antiviral activity
SARS-CoV-2
Chemical graph model
https://github.com/ersilia-os/eos9f6t
https://www.sciencedirect.com/science/article/pii/S0092867420301021
http://chemprop.csail.mit.edu/checkpoints
MIT
miquelduranfrigola
3/6/2021
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos9f6t
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9f6t.zip
Yes
Local
Q2
2021
compound-test-3
Test
Test model 3

Test model 3

Dummy
Dummy
Compound
Single
Dummy
Dummy model
Dummy
GPL-3.0
miquelduranfrigola
3/9/2021
https://github.com/miquelduranfrigola
Local
Q3
2021
compound-test-2
Test
Test model 2

Test model 2

Dummy
Dummy
Compound
Single
Dummy
Dummy model
Dummy
GPL-3.0
miquelduranfrigola
3/9/2021
https://github.com/miquelduranfrigola
Local
Q3
2021
compound-test-1
Test
Test model 1

Test model 1

Dummy
Dummy
Compound
Single
Dummy
Dummy model
Dummy
GPL-3.0
miquelduranfrigola
3/9/2021
https://github.com/miquelduranfrigola
Local
Q3
2021
chemprop-antibiotic
Ready
Broad spectrum antibiotic activity

Based on a simple E. coli growth inhibition assay, the authors trained a model capable of identifying antibiotic potential in compounds structurally divergent from conventional antibiotic drugs. One of the predicted active molecules, Halicin (SU3327), was experimentally validated in vitro and in vivo. Halicin had originally been investigated as a treatment for diabetes.

Pretrained
Classification
Compound
Single
Probability
Float
Single
Probability that a compound inhibits E. coli growth. The inhibition threshold was set at 80% growth inhibition in the training set.
E.coli
IC50
Antimicrobial activity
Chemical graph model
https://github.com/ersilia-os/eos4e40
https://pubmed.ncbi.nlm.nih.gov/32084340/
http://chemprop.csail.mit.edu/checkpoints
MIT
https://eos4e40-rovva.ondigitalocean.app/
miquelduranfrigola
6/6/2018
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos4e40
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4e40.zip
https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos4e40
Yes
Local
Q2
2018
