Identifier
1
eos74km
2
eos8ub5
3
eos2db3
4
eos9gg2
5
eos3mk2
6
eos9p4a
7
eos39co
8
eos3wzy
9
eos3nn9
10
eos1pu1
11
eos39dp
12
eos6ru3
13
eos6ost
14
eos8aox
15
eos57bx
16
eos5guo
17
eos24ur
18
eos2401
19
eos5gge
20
eos7d58
21
eos694w
22
eos42ez
23
eos21q7
24
eos18ie
25
eos8bhe
26
eos5cl7
27
eos1n4b
28
eos9ym3
29
eos30f3
30
eos5xng
31
eos69e6
32
eos4wt0
33
eos4x30
34
eos1ut3
35
eos9ivc
36
eos9zw0
37
eos633t
38
eos3kcw
39
eos1d7r
40
eos9ueu
41
eos4f95
42
eos2zmb
43
eos1noy
44
eos3le9
45
eos4rta
46
eos2l0q
47
eos3804
48
eos2hzy
49
eos8fma
50
eos1mxi
51
eos7yti
52
eos4qda
53
eos80ch
54
eos3ev6
55
eos7nno
56
eos5jz9
57
eos59rr
58
eos7kpb
59
eos2gw4
60
eos3cf4
61
eos3zur
62
eos9tyg
63
eos44zp
64
eos24jm
65
eos6aun
66
eos31ve
67
eos2fy6
68
eos2lqb
69
eos8fth
70
eos8lok
71
eos9yy1
72
eos22io
73
eos74bo
74
eos81ew
75
eos93h2
76
eos7qga
77
eos4avb
78
eos4cxk
79
eos8c0o
80
eos6hy3
81
eos5505
82
eos4se9
83
eos24ci
84
eos935d
85
eos4q1a
86
eos9taz
87
eos2rd8
88
eos9sa2
89
eos8a5g
90
eos238c
91
eos2v11
92
eos1579
93
eos6m4j
94
eos4zfy
95
eos2a9n
96
eos9c7k
97
eos7jlv
98
eos4b8j
99
eos3ae7
100
eos9be7
101
eos4tcc
102
eos5qfo
103
eos2mrz
104
eos2re5
105
eos30gr
106
eos526j
107
eos6pbf
108
eos2b6f
109
eos3xip
110
eos6o0z
111
eos85a3
112
eos8451
113
eos157v
114
eos481p
115
eos2mhp
116
eos6fza
117
eos5smc
118
eos9ei3
119
eos46ev
120
eos69p9
121
eos7a45
122
eos65rt
123
eos2hbd
124
eos97yu
125
eos6tg8
126
eos2gth
127
eos7pw8
128
eos8ioa
129
eos9yui
130
eos2r5a
131
eos6oli
132
eos6ao8
133
eos43at
134
eos1af5
135
eos2ta5
136
eos96ia
137
eos8d8a
138
eos7asg
139
eos2lm8
140
eos78ao
141
eos7jio
142
eos2thm
143
eos8a4x
144
eos8h6g
145
eos3b5e
146
eos5axz
147
eos3ae6
148
eos7w6n
149
eos4u6p
150
eos7a04
151
eos77w8
152
eos1amr
153
eos1vms
154
eos92sw
155
eos9f6t
156
eos4e40
Slug
Status
Repository
Title
Description
Input
Input Shape
Output
Output Shape
Output Type
Mode
Tag
GitHub
Publication
Source Code
License
Interpretation
Task
Contributor
Contributor Profile
Host URL
DockerHub
Docker Architecture
Runtime
S3
DO Deployment
Deployment
Biomodel Annotation
Secrets
Incorporation Date
Incorporation Quarter
Incorporation Year
antimicrobial-kg-ml
Ready
Antimicrobial class specificity prediction

Prediction of antimicrobial class specificity using simple machine learning methods applied to an antimicrobial knowledge graph. The knowledge graph is built on ChEMBL, Co-ADD and SPARK. Endpoints are broad terms such as activity against gram-positive or gram-negative bacteria. The best model according to the authors is a Random Forest with MHFP6 fingerprints.

Compound
Single
Score
List
Float
Pretrained
Antimicrobial activity
https://github.com/ersilia-os/eos74km
https://www.biorxiv.org/content/10.1101/2024.12.02.626313v1.full
https://github.com/IMI-COMBINE/broad_spectrum_prediction
MIT
Class probabilities for each antimicrobial class
Annotation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos74km.zip
Local
17/12/2024
Q4
2024
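The GitHub, DockerHub and S3 values in these rows appear to follow a regular pattern keyed on the model identifier. The sketch below rebuilds them from an identifier, assuming that pattern holds for every row. The function name `model_links` and the identifier regex are hypothetical, inferred only from the identifiers listed here. Not every model has a DockerHub image (the first row lists none), so the generated DockerHub link may not resolve.

```python
import re

# Assumption: identifiers look like "eos" + one digit + three lowercase
# alphanumerics, as inferred from the identifiers listed in this table.
IDENTIFIER_PATTERN = re.compile(r"^eos[0-9][a-z0-9]{3}$")


def model_links(identifier):
    """Build the GitHub, DockerHub and S3 URLs for a model identifier.

    The URL templates are copied from the rows of this table; whether each
    URL actually resolves is not checked here.
    """
    if not IDENTIFIER_PATTERN.match(identifier):
        raise ValueError(f"Unexpected identifier format: {identifier!r}")
    return {
        "github": f"https://github.com/ersilia-os/{identifier}",
        "dockerhub": f"https://hub.docker.com/r/ersiliaos/{identifier}",
        "s3": (
            "https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/"
            f"{identifier}.zip"
        ),
    }


links = model_links("eos74km")
print(links["github"])
print(links["s3"])
```

Validating the identifier first keeps malformed strings out of the generated URLs.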
chemical-space-projections-coconut
Ready
Projections against Coconut

This tool performs PCA, UMAP and tSNE projections taking the Coconut natural products database as a chemical space of reference. The Ersilia Compound Embeddings are used as descriptors. Four PCA components and two UMAP and tSNE components are returned.

Compound
Single
Value
List
Float
In-house
Embedding
https://github.com/ersilia-os/eos8ub5
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00478-9
https://github.com/ersilia-os/compound-embedding
GPL-3.0-or-later
Coordinates of 2D projections, namely PCA, UMAP and tSNE.
Representation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos8ub5
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8ub5.zip
Local
10/11/2024
Q4
2024
chemical-space-projections-chemdiv
Ready
Chemical space 2D projections against ChemDiv

This tool performs PCA, UMAP and tSNE projections taking a 100k ChemDiv diversity set as a chemical space of reference. The Ersilia Compound Embeddings are used as descriptors. Four PCA components and two UMAP and tSNE components are returned.

Compound
Single
Value
List
Float
In-house
Embedding
https://github.com/ersilia-os/eos2db3
https://www.chemdiv.com/catalog/diversity-libraries/representative-diversity-libraries-out-of-1-6m-stock/
https://github.com/ersilia-os/compound-embedding
GPL-3.0-or-later
Coordinates of 2D projections, namely PCA, UMAP and tSNE.
Representation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos2db3
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2db3.zip
Local
9/11/2024
Q4
2024
chemical-space-projections-drugbank
Ready
Chemical space 2D projections against DrugBank

This tool performs PCA, UMAP and tSNE projections taking the DrugBank chemical space as a reference. The Ersilia Compound Embeddings are used as descriptors. Four PCA components and two UMAP and tSNE components are returned.

Compound
Single
Value
List
Float
In-house
Embedding
https://github.com/ersilia-os/eos9gg2
https://academic.oup.com/nar/article/52/D1/D1265/7416367
https://github.com/ersilia-os/compound-embedding
GPL-3.0-or-later
Coordinates of 2D projections, namely PCA, UMAP and tSNE.
Representation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos9gg2
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9gg2.zip
Local
9/11/2024
Q4
2024
bbbp-marine-kinase-inhibitors
Ready
BBBP model tested on marine-derived kinase inhibitors

A set of three binary classifiers (random forest, gradient boosting classifier, and logistic regression) to predict the Blood-Brain Barrier (BBB) permeability of small organic compounds. The best models were applied to natural products of marine origin, able to inhibit kinases associated with neurodegenerative disorders. The training set size was around 300 compounds.

Compound
Single
Score
List
Float
Retrained
Drug-likeness
Permeability
https://github.com/ersilia-os/eos3mk2
https://pubmed.ncbi.nlm.nih.gov/30699889/
https://github.com/plissonf/BBB-Models
MIT
Classification score over three classifiers, namely random forest (rfc), gradient boosting classifier (gbc), and logistic regression (logreg).
Annotation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos3mk2
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3mk2.zip
Local
23/10/2024
Q4
2024
deep-dl
Ready
Drug-likeness scoring based on unsupervised learning

This model evaluates drug-likeness using an unsupervised learning approach, eliminating the need for labeled data and avoiding biases from incomplete negative sets. It extracts features directly from known drug molecules, identifying common characteristics through a recurrent neural network (RNN) language model. By representing molecules as SMILES strings, the model learns the probability distribution of known drugs and assesses new molecules based on their likelihood of appearing in this space.

Compound
Single
Score
Single
Float
Pretrained
Drug-likeness
https://github.com/ersilia-os/eos9p4a
https://pubs.rsc.org/en/content/articlehtml/2022/sc/d1sc05248a
https://github.com/SeonghwanSeo/DeepDL
GPL-3.0-or-later
Higher score indicates higher drug likeness
Annotation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://eos9p4a-izpny.ondigitalocean.app/
https://hub.docker.com/r/ersiliaos/eos9p4a
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9p4a.zip
https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos9p4a
Online
4/9/2024
Q3
2024
unimol-representation
Ready
Uni-Mol molecular representation

Uni-Mol offers a simple and effective SE(3) equivariant transformer architecture for pre-training molecular representations that capture 3D information. The model is trained on >200M conformations. The current model outputs a representation embedding.

Compound
Single
Value
List
Float
Pretrained
Fingerprint
https://github.com/ersilia-os/eos39co
https://openreview.net/forum?id=6K2RM6wVqKu
https://github.com/deepmodeling/Uni-Mol
GPL-3.0-only
Uni-Mol representation embedding
Representation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos39co
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos39co.zip
Local
22/7/2024
Q3
2024
qupkake
Ready
Predict micro-pKa of organic molecules

QupKake is an innovative approach that combines graph neural network (GNN) models with semiempirical quantum mechanical (QM) features to forecast the micro-pKa values of organic molecules. QM has a significant role in both identifying reaction sites and predicting micro-pKa values. Precisely predicting micro-pKa values is vital for comprehending and adjusting the acidity and basicity of organic compounds. This has significant applications in drug discovery, materials science, and environmental c

Compound
Single
Value
List
Float
Pretrained
pKa
https://github.com/ersilia-os/eos3wzy
https://doi.org/10.1021/acs.jctc.4c00328
https://github.com/hutchisonlab/QupKake
BSD-3-Clause
Up to 10 pKa values for the molecule
Annotation
LauraGomezjurado
https://github.com/LauraGomezjurado
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3wzy.zip
Local
17/7/2024
Q3
2024
mpro-covid19
Ready
Predict bioactivity against Main Protease of SARS-CoV-2

MProPred predicts the efficacy of compounds against the main protease of SARS-CoV-2, which is a promising drug target since it processes polyproteins of SARS-CoV-2. This model uses PaDEL-Descriptor to calculate molecular descriptors of compounds. It is based on a dataset of 758 compounds that have inhibition efficacy against the Main Protease, as published in peer-reviewed journals between January, 2020 and August, 2021. Input compounds are compared to compounds in the dataset to measure molecul

Compound
Single
Value
Single
Float
Pretrained
COVID19
https://github.com/ersilia-os/eos3nn9
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10289339/
https://github.com/Nadimfrds/Mpropred
MIT
Gives the pIC50 values for each compound to compare their bioactivity against the main protease
Annotation
HarmonySosa
https://github.com/HarmonySosa
https://hub.docker.com/r/ersiliaos/eos3nn9
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3nn9.zip
Local
1/7/2024
Q3
2024
cardiotox-dictrank
Ready
Cardiotoxicity Classifier

Prediction of drug-induced cardiotoxicity as a binary classification of cardiotoxicity risk. The probability score depicts risk of the compound being cardiotoxic. Classification is based on the chemical data such as SMILES representations of compounds and a variety of descriptors such as Morgan fingerprints and Mordred physicochemical descriptors that describe the molecular structure of the drug interactions. Biological data is also used including gene expression and cellular paintings after dru

Compound
Single
Score
Single
Float
Retrained
Cardiotoxicity
DrugBank
https://github.com/ersilia-os/eos1pu1
https://doi.org/10.1021/acs.jcim.3c01834
https://github.com/srijitseal/DICTrank
None
The model provides a probability score indicating the likelihood of a compound being cardiotoxic
Annotation
kurysauce
https://github.com/kurysauce
https://hub.docker.com/r/ersiliaos/eos1pu1
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1pu1.zip
Local
29/6/2024
Q2
2024
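The Incorporation Quarter and Incorporation Year columns can be derived from the Incorporation Date column. The sketch below assumes the dates are written day/month/year, as the rows here suggest (for example, 29/6/2024 is listed as Q2 of 2024). The function name `incorporation_quarter` is hypothetical.

```python
from datetime import datetime


def incorporation_quarter(date_text):
    """Return (quarter label, year) for a day/month/year date string."""
    # Assumption: dates use day/month/year order, e.g. "29/6/2024".
    parsed = datetime.strptime(date_text, "%d/%m/%Y")
    # Months 1-3 map to Q1, 4-6 to Q2, 7-9 to Q3, 10-12 to Q4.
    quarter = (parsed.month - 1) // 3 + 1
    return f"Q{quarter}", parsed.year


print(incorporation_quarter("29/6/2024"))
print(incorporation_quarter("1/7/2024"))
```

Recomputing the two derived columns this way is a quick consistency check on the dates in the table.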
phakinpro
Ready
Pharmacokinetics Profiler (PhaKinPro)

Pharmacokinetics Profiler (PhaKinPro) predicts the pharmacokinetic (PK) properties of drug candidates. It has been built using a manually curated database of 10,000 compounds with information for 12 PK endpoints. Each model provides a multi-classifier output for a single endpoint, along with a confidence estimate of the prediction and whether the query molecule is within the applicability domain of the model.

Compound
Single
Score
List
String
Pretrained
Microsomal stability
ADME
Metabolism
Half-life
Permeability
https://github.com/ersilia-os/eos39dp
https://pubs.acs.org/doi/10.1021/acs.jmedchem.3c02446
https://github.com/molecularmodelinglab/PhaKinPro
MIT
A list of several ADME predictions
Annotation
sucksido
https://github.com/sucksido
https://hub.docker.com/r/ersiliaos/eos39dp
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos39dp.zip
Local
3/5/2024
Q2
2024
whales-qmug
Ready
WHALES similarity search on 600k molecules from Q-Mug

Search Q-Mug based on WHALES descriptors. Q-Mug is a subset of 600k bioactive molecules from ChEMBL. Three conformers are given for each molecule. WHALES is a simple descriptor useful for scaffold hopping.

Compound
Single
Compound
List
String
Pretrained
Similarity
https://github.com/ersilia-os/eos6ru3
https://link.springer.com/protocol/10.1007/978-1-0716-1209-5_2
https://github.com/ETHmodlab/scaffold_hopping_whales
GPL-3.0
The top 100 most similar molecules are returned, based on WHALES descriptors. 3D conformer generation is done internally.
Sampling
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos6ru3
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6ru3.zip
Local
22/4/2024
Q2
2024
reinvent4-libinvent
Ready
REINVENT 4 LibInvent

REINVENT 4 LibInvent creates new molecules by appending R groups to a given input. If the input SMILES string contains specified attachment points, it is directly processed by LibInvent to generate new molecules. If no attachment points are given, the model tries to find potential attachment points and iterates through different combinations of these points, passing each combination to LibInvent to generate new molecules.

Compound
Single
Compound
List
String
Pretrained
Similarity
https://github.com/ersilia-os/eos6ost
https://chemrxiv.org/engage/chemrxiv/article-details/65463cafc573f893f1cae33a
https://github.com/MolecularAI/REINVENT4
Apache-2.0
Model generates up to 1000 similar molecules per input molecule.
Sampling
ankitskvmdam
https://github.com/ankitskvmdam
https://hub.docker.com/r/ersiliaos/eos6ost
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6ost.zip
Local
18/4/2024
Q2
2024
cc-signaturizer-3d
Ready
Chemical Checker Signaturizer 3D

Building on the Chemical Checker bioactivity signatures (available as eos4u6p), the authors use the relation between stereoisomers and bioactivity of over 1M compounds to train stereochemically-aware signaturizers that better describe small molecule bioactivity properties. In this implementation we provide the A1, A2, A3, B1, B4 and C3 signatures

Compound
Single
Value
List
Float
Pretrained
Descriptor
Bioactivity profile
Embedding
https://github.com/ersilia-os/eos8aox
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-024-00867-4
https://gitlabsbnb.irbbarcelona.org/packages/signaturizer3d
MIT
2D projection of bioactivity signatures
Representation
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos8aox
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8aox.zip
Local
19/3/2024
Q1
2024
reinvent4-mol2mol-scaffold
Ready
REINVENT 4 Mol2MolScaffold

Mol2MolScaffold uses REINVENT4's mol2mol scaffold prior and mol2mol scaffold generic prior to generate around 500 new molecules similar to the provided molecules. The generated molecules will be relatively similar to the input molecules.

Compound
Single
Compound
List
String
Pretrained
Similarity
https://github.com/ersilia-os/eos57bx
https://chemrxiv.org/engage/chemrxiv/article-details/65463cafc573f893f1cae33a
https://github.com/MolecularAI/REINVENT4
Apache-2.0
Model generates up to 500 similar molecules per input molecule.
Sampling
ankitskvmdam
https://github.com/ankitskvmdam
https://hub.docker.com/r/ersiliaos/eos57bx
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos57bx.zip
Local
8/3/2024
Q1
2024
erg-fingerprints
Ready
ErG 2D Descriptors

The Extended Reduced Graph (ErG) approach uses the description of pharmacophore nodes to encode molecular properties, with the goal of correctly describing pharmacophoric properties, size and shape of molecules. It was benchmarked against Daylight fingerprints and outperformed them in 10 out of 11 cases. ErG descriptors are well suited for scaffold hopping approaches.

Compound
Single
Value
List
Float
Pretrained
Descriptor
Fingerprint
https://github.com/ersilia-os/eos5guo
https://pubs.acs.org/doi/10.1021/ci050457y
https://www.rdkit.org/docs/source/rdkit.Chem.rdReducedGraphs.html
BSD-3.0
Vector representing ErG fingerprint values
Representation
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos5guo
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5guo.zip
Local
6/3/2024
Q1
2024
whales-scaled
Ready
WHALES scaled

Scaled version of the WHALES descriptors (see eos3ae6). WHALES are holistic molecular descriptors useful for scaffold hopping, based on 3D structure to facilitate natural product featurization. The scaling uses sklearn's Robust Scaler trained on a random set of 100K molecules from ChEMBL.

Compound
Single
Value
List
Float
Pretrained
Natural product
Descriptor
https://github.com/ersilia-os/eos24ur
https://www.nature.com/articles/s42004-018-0043-x
https://github.com/grisoniFr/scaffold_hopping_whales
MIT
Scaled vector representation of a molecule
Representation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos24ur
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos24ur.zip
Local
5/3/2024
Q1
2024
scaffold-decoration
Ready
Scaffold decoration

Sequential Attachment-based Fragment Embedding (SAFE) is a novel notation system that improves upon traditional molecular string representations like SMILES. SAFE reframes SMILES strings as an unordered sequence of interconnected fragment blocks while maintaining compatibility with existing SMILES parsers. This streamlines complex molecular design tasks by facilitating autoregressive generation under various constraints. The effectiveness of SAFE is demonstrated by trai

Compound
Single
Compound
List
String
Pretrained
Compound generation
https://github.com/ersilia-os/eos2401
https://arxiv.org/pdf/2310.10773.pdf
https://github.com/datamol-io/safe/tree/main
CC
Model generates up to 1000 new molecules from input molecule by replacing side chains of the scaffold
Sampling
Inyrkz
https://github.com/Inyrkz
https://hub.docker.com/r/ersiliaos/eos2401
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2401.zip
Local
20/2/2024
Q1
2024
dili-predictor
Ready
Early prediction of Drug-Induced Liver Injury

The DILI-Predictor predicts 10 features related to DILI toxicity, including in-vivo, in-vitro and physicochemical parameters. It has been developed by the Broad Institute using the DILIst dataset (1020 compounds) from the FDA and achieved a balanced accuracy of 70% on a test set of 255 compounds held out from the same dataset. The authors show how the model can correctly predict compounds that are not toxic in humans despite being toxic in mice.

Compound
Single
Score
List
Float
Pretrained
Toxicity
Metabolism
https://github.com/ersilia-os/eos5gge
https://pubs.acs.org/doi/10.1021/acs.chemrestox.4c00015
https://github.com/Manas02/dili-pip
None
Prediction of 10 DILI-related endpoints. The most important is the first, DILI. Threshold for DILI active is set at 0.16 by the authors.
Annotation
Zainab-ik
https://github.com/Zainab-ik
https://hub.docker.com/r/ersiliaos/eos5gge
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5gge.zip
Local
19/2/2024
Q1
2024
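The DILI-Predictor row above states that the authors set the threshold for "DILI active" at 0.16 on the first output. The sketch below shows how such a score could be binarized. The function name `is_dili_active` is hypothetical, and treating a score exactly at the threshold as active is an assumption, since the table does not say whether the cut-off is inclusive.

```python
# Threshold quoted in the Interpretation column of the DILI-Predictor row.
DILI_THRESHOLD = 0.16


def is_dili_active(score, threshold=DILI_THRESHOLD):
    """Binarize a DILI score against the authors' threshold.

    Assumption: scores greater than or equal to the threshold count as
    active; the source does not specify inclusive versus exclusive.
    """
    return score >= threshold


for score in (0.05, 0.16, 0.42):
    print(score, is_dili_active(score))
```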
admet-ai-prediction
Ready
ADMET properties prediction

ADMET AI is a framework for carrying out fast batch predictions for ADMET properties. It is based on an ensemble of five Chemprop-RDKit models and has been trained on 41 tasks from the ADMET group in Therapeutics Data Commons (v0.4.1). Out of these 41 tasks, there are 31 classification tasks and 10 regression tasks. In addition, the output also contains 8 physicochemical properties, namely, molecular weight, logP, hydrogen bond acceptors, hydrogen bond donors, Lipinski's Rule of 5, QED, stereo c

Compound
Single
Score
Value
List
Float
Pretrained
ADME
Toxicity
https://github.com/ersilia-os/eos7d58
https://academic.oup.com/bioinformatics/article/40/7/btae416/7698030
https://github.com/swansonk14/admet_ai
MIT
ADMET outcomes, including physicochemical properties and classification tasks, as well as percentile normalizations based on the DrugBank chemical space.
Annotation
DhanshreeA
https://github.com/DhanshreeA
https://eos7d58-awe6b.ondigitalocean.app/
https://hub.docker.com/r/ersiliaos/eos7d58
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7d58.zip
https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos7d58
Local
Yes
7/2/2024
Q1
2024
reinvent4-mol2mol-medium-similarity
Ready
REINVENT 4 Mol2MolMediumSimilarity

The Mol2MolMediumSimilarity leverages REINVENT4's mol2mol medium similarity prior to generate up to 100 unique molecules. The generated molecules will be relatively similar to the input molecule.

Compound
Single
Compound
List
String
Pretrained
Similarity
https://github.com/ersilia-os/eos694w
https://chemrxiv.org/engage/chemrxiv/article-details/65463cafc573f893f1cae33a
https://github.com/MolecularAI/REINVENT4
Apache-2.0
Model generates up to 100 similar molecules per input molecule.
Sampling
ankitskvmdam
https://github.com/ankitskvmdam
https://hub.docker.com/r/ersiliaos/eos694w
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos694w.zip
Local
7/2/2024
Q1
2024
antibiotics-ai-cytotox
Ready
Human cytotoxicity endpoints

The authors tested the dataset of 39312 compounds used to train the antibiotics-ai model (eos18ie) against several cytotoxicity endpoints: human liver carcinoma cells (HepG2), human primary skeletal muscle cells (HSkMCs) and human lung fibroblast cells (IMR-90). Cellular viability was measured after 2-3 days of treatment with each compound at 10 μM and activities were binarized using a 90% cell viability cut-off. 341 (8.5%), 490 (3.8%) and 447 (8.8%) compounds classified as cytotoxic for HepG2

Compound
Single
Score
List
Float
Pretrained
Cytotoxicity
https://github.com/ersilia-os/eos42ez
https://www.nature.com/articles/s41586-023-06887-8
https://github.com/felixjwong/antibioticsai
MIT
Predicting cytotoxicity in human liver carcinoma cells (HepG2), human primary skeletal muscle cells (HSkMCs) and human lung fibroblast cells (IMR-90)
Annotation
Richiio
https://github.com/Richiio
https://hub.docker.com/r/ersiliaos/eos42ez
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos42ez.zip
Local
Yes
5/2/2024
Q1
2024
inter-dili
Ready
InterDILI: drug-induced injury prediction

This model has been trained on a publicly available collection of 5 datasets manually curated for drug-induced liver injury (DILI). The DILI outcome has been binarised, and ECFP descriptors, together with physicochemical properties, have been used to train a random forest classifier, which achieves AUROC > 0.9.

Compound
Single
Score
Single
Float
Retrained
Toxicity
Human
Metabolism
https://github.com/ersilia-os/eos21q7
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-023-00796-8
https://github.com/bmil-jnu/InterDILI
None
Probability of Drug-Induced Liver Injury (DILI), higher score indicates higher risk
Annotation
leilayesufu
https://github.com/leilayesufu
https://hub.docker.com/r/ersiliaos/eos21q7
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos21q7.zip
Local
30/1/2024
Q1
2024
antibiotics-ai-saureus
Ready
Antibiotic activity prediction against Staphylococcus aureus

The authors use a mid-size dataset (more than 30k compounds) to train an explainable graph-based model to identify potential antibiotics with low cytotoxicity. The model uses a substructure-based approach to explore the chemical space. Using this method, they were able to screen 283 compounds and identify a candidate active against methicillin-resistant S. aureus (MRSA) and vancomycin-resistant enterococci.

Compound
Single
Score
Single
Float
Pretrained
Antimicrobial activity
ESKAPE
https://github.com/ersilia-os/eos18ie
https://www.nature.com/articles/s41586-023-06887-8
https://github.com/felixjwong/antibioticsai
MIT
Probability of growth inhibition (80% cut off at 50uM)
Annotation
Richiio
https://github.com/Richiio
https://hub.docker.com/r/ersiliaos/eos18ie
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos18ie.zip
Local
Yes
26/1/2024
Q1
2024
scaffold-morphing
Ready
Scaffold morphing

Sequential Attachment-based Fragment Embedding (SAFE) is a novel notation system that improves upon traditional molecular string representations like SMILES. SAFE reframes SMILES strings as an unordered sequence of interconnected fragment blocks while maintaining compatibility with existing SMILES parsers. This streamlines complex molecular design tasks by facilitating autoregressive generation under various constraints. The effectiveness of SAFE is demonstrated by trai

Compound
Single
Compound
List
String
Pretrained
Compound generation
https://github.com/ersilia-os/eos8bhe
https://arxiv.org/pdf/2310.10773.pdf
https://github.com/datamol-io/safe/tree/main
CC
Model generates new molecules from input molecule by replacing core structures of input molecule.
Sampling
Inyrkz
https://github.com/Inyrkz
https://hub.docker.com/r/ersiliaos/eos8bhe
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8bhe.zip
Local
12/1/2024
Q1
2024
ngonorrhoeae-inhibition
Ready
Growth Inhibitors of Neisseria gonorrhoeae

The authors curated a dataset of 282 compounds from ChEMBL, of which 160 (56.7%) were labeled as active N. gonorrhoeae inhibitor compounds. They used this dataset to build a naïve Bayesian model and used it to screen a commercial library. With this method, they identified and validated two hits. We have used the dataset to build a model using LazyQSAR with Ersilia Compound Embeddings as molecular descriptors. LazyQSAR is an AutoML Ersilia-developed library.

Compound
Single
Score
Single
Float
Retrained
Antimicrobial activity
ChEMBL
N.gonorrhoeae
https://github.com/ersilia-os/eos5cl7
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8274436/
https://github.com/ersilia-os/lazy-qsar
GPL-3.0
Probability of activity for the inhibition of the pathogen N. gonorrhoeae
Annotation
Richiio
https://github.com/Richiio
https://hub.docker.com/r/ersiliaos/eos5cl7
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5cl7.zip
Local
Yes
3/1/2024
Q1
2024
hdac3-inhibition
Ready
Identifying HDAC3 inhibitors

The model predicts the inhibitory potential of small molecules against Histone deacetylase 3 (HDAC3), a relevant human target for cancer, inflammation, neurodegenerative diseases and diabetes. The authors have used a dataset of 1098 compounds from ChEMBL and validated the model using the benchmark MUBD-HDAC3.

Compound
Single
Score
Single
Float
Pretrained
Cancer
ChEMBL
https://github.com/ersilia-os/eos1n4b
https://onlinelibrary.wiley.com/doi/10.1002/minf.202000105
https://github.com/jwxia2014/HDAC3i-Finder
GPL-3.0
Probability that the molecule is a HDAC3 inhibitor
Annotation
Richiio
https://github.com/Richiio
https://hub.docker.com/r/ersiliaos/eos1n4b
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1n4b.zip
Local
14/12/2023
Q4
2023
mrlogp
Ready
MRlogP: neural network-based logP prediction for druglike small molecules

The authors use a two-step approach to build a model that accurately predicts the lipophilicity (LogP) of small molecules. First, they train the model on a large amount of low accuracy predicted LogP values and then they fine-tune the network using a small, accurate dataset of 244 druglike compounds. The model achieves an average root mean squared error of 0.988 and 0.715 against druglike molecules from Reaxys and PHYSPROP.

Compound
Single
Value
Single
Float
Pretrained
Lipophilicity
LogP
https://github.com/ersilia-os/eos9ym3
https://www.mdpi.com/2227-9717/9/11/2029/htm
https://github.com/JustinYKC/MRlogP
MIT
Predicted LogP of small molecules
Annotation
leilayesufu
https://github.com/leilayesufu
https://hub.docker.com/r/ersiliaos/eos9ym3
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9ym3.zip
Local
12/12/2023
Q4
2023
dmpnn-herg
Ready
Prediction of hERG channel blockers with directed message passing neural networks

This model leverages the ChemProp network (D-MPNN) to build a predictor of hERG-mediated cardiotoxicity. The model has been trained using a published dataset which contains 7889 molecules with several cut-offs for hERG blocking activity. The authors select a 10 uM cut-off. This implementation of the model does not use any specific featurizer, though the authors suggest the moe206 descriptors (closed-source) improve performance even further.

Compound
Single
Score
Single
Float
Pretrained
Cardiotoxicity
hERG
Toxicity
Descriptor
https://github.com/ersilia-os/eos30f3
https://pubs.rsc.org/en/content/articlehtml/2022/ra/d1ra07956e
https://github.com/AI-amateur/DMPNN-hERG
None
Probability of blocking hERG (cut-off: 10uM)
Annotation
leilayesufu
https://github.com/leilayesufu
https://hub.docker.com/r/ersiliaos/eos30f3
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos30f3.zip
Local
4/12/2023
Q4
2023
chemprop-burkholderia
Ready
Burkholderia cenocepacia inhibition

Prediction of antimicrobial potential using a dataset of 29537 compounds screened against the antibiotic resistant pathogen Burkholderia cenocepacia. The model uses the Chemprop Directed Message Passing Neural Network (D-MPNN) and has an AUC score of 0.823 for the test set. It has been used to virtually screen the FDA approved drugs as well as a collection of natural products (>200k compounds) with hit rates of 26% and 12% respectively.

Compound
Single
Score
Single
Float
Pretrained
Antimicrobial activity
https://github.com/ersilia-os/eos5xng
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9624395/
https://github.com/cardonalab/Prediction-of-ATB-Activity
GPL-3.0
Probability that a compound inhibits the drug resistant bacteria Burkholderia cenocepacia. Scores range from 0 to 1, with 1 indicating the highest probability for growth inhibitory activity.
Annotation
Richioo
https://github.com/Richioo
https://hub.docker.com/r/ersiliaos/eos5xng
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5xng.zip
Local
Yes
3/12/2023
Q4
2023
pgmg-pharmacophore
Ready
Pharmacophore-guided molecular generation

Based on a molecule's pharmacophore, this model generates new molecules de-novo to match the pharmacophore. Internally, pharmacophore hypotheses are generated for a given ligand. A graph neural network encodes spatially distributed chemical features and a transformer decoder generates molecules.

Compound
Single
Compound
List
String
Pretrained
Chemical graph model
Compound generation
https://github.com/ersilia-os/eos69e6
https://www.nature.com/articles/s41467-023-41454-9
https://github.com/CSUBioGroup/PGMG
MIT
Model generates new molecules from input molecule by first creating pharmacophore hypotheses and then constraining generation.
Sampling
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos69e6
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos69e6.zip
Local
1/12/2023
Q4
2023
morgan-binary-fps
Ready
Morgan fingerprints in binary form (radius 3, 2048 dimensions)

The Morgan Fingerprints are one of the most widely used molecular representations. They are circular representations (starting from an atom, the atoms around it are searched within a radius n) and can have thousands of features. This implementation uses the RDKit package and is done with radius 3 and 2048 dimensions, providing a binary vector as output. For Morgan counts, see model eos5axz.

Compound
Single
Value
List
Integer
Pretrained
Descriptor
Fingerprint
https://github.com/ersilia-os/eos4wt0
https://pubmed.ncbi.nlm.nih.gov/20426451/
https://www.rdkit.org/docs
BSD-3.0
Binary vector representing the SMILES
Representation
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos4wt0
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4wt0.zip
Local
1/12/2023
Q4
2023
pmapper-3d
Ready
3D pharmacophore descriptor

The pharmacophore mapper (pmapper) identifies common 3D pharmacophores of active compounds against a specific target and uniquely encodes them with hashes suitable for fast identification of identical pharmacophores. The obtained signatures are amenable for downstream ML tasks.

Compound
Single
Value
List
Integer
Pretrained
Descriptor
Fingerprint
https://github.com/ersilia-os/eos4x30
https://www.mdpi.com/1422-0067/20/23/5834
https://github.com/DrrDom/pmapper
BSD-3.0
Vector representation of pharmacophores
Representation
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos4x30
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4x30.zip
Local
28/11/2023
Q4
2023
molfeat-usrcat
Ready
USR descriptors with pharmacophoric constraints

USRCAT is a real-time ultrafast molecular shape recognition with pharmacophoric constraints. It integrates atom type to the traditional Ultrafast Shape Recognition (USR) descriptor to improve the performance of shape-based virtual screening, being able to discriminate between compounds with similar shape but distinct pharmacophoric features.

Compound
Single
Value
List
Float
Pretrained
Descriptor
Embedding
https://github.com/ersilia-os/eos1ut3
https://jcheminf.biomedcentral.com/articles/10.1186/1758-2946-4-27
https://molfeat.datamol.io/featurizers/usrcat
Apache-2.0
60 features based on USRCAT
Representation
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos1ut3
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1ut3.zip
Local
28/11/2023
Q4
2023
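The shape part of the USRCAT descriptor above can be illustrated with a minimal sketch of plain USR. It assumes the usual formulation: four reference points (centroid, atom closest to the centroid, atom farthest from the centroid, atom farthest from that atom) and three moments of the distance distribution to each, giving 12 values. USRCAT repeats these 12 values over five atom subsets to reach 60 features; the atom typing is not reproduced here. The function names and the choice of moments (mean, standard deviation, signed cube root of the third central moment) are assumptions for illustration and may differ from the molfeat implementation.

```python
import math


def _moments(distances):
    """Mean, standard deviation and signed cube root of the third central moment."""
    n = len(distances)
    mean = sum(distances) / n
    var = sum((d - mean) ** 2 for d in distances) / n
    third = sum((d - mean) ** 3 for d in distances) / n
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return [mean, math.sqrt(var), skew]


def usr_descriptor(coords):
    """Return the 12 plain-USR values for a list of (x, y, z) atom coordinates."""
    n = len(coords)
    # Reference point 1: molecular centroid.
    ctd = tuple(sum(c[i] for c in coords) / n for i in range(3))
    d_ctd = [math.dist(c, ctd) for c in coords]
    # Reference points 2 and 3: atoms closest to and farthest from the centroid.
    cst = coords[d_ctd.index(min(d_ctd))]
    fct = coords[d_ctd.index(max(d_ctd))]
    d_cst = [math.dist(c, cst) for c in coords]
    d_fct = [math.dist(c, fct) for c in coords]
    # Reference point 4: atom farthest from the farthest-from-centroid atom.
    ftf = coords[d_fct.index(max(d_fct))]
    d_ftf = [math.dist(c, ftf) for c in coords]
    descriptor = []
    for distances in (d_ctd, d_cst, d_fct, d_ftf):
        descriptor.extend(_moments(distances))
    return descriptor


# Hypothetical 5-atom conformer, coordinates in arbitrary units.
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
print(len(usr_descriptor(atoms)))
```

Because the descriptor is built only from interatomic distances, it does not change when the conformer is translated or rotated, which is what makes shape comparison without alignment possible.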
antitb-seattle
Ready
Antituberculosis activity prediction

Prediction of the activity of small molecules against Mycobacterium tuberculosis. This model has been developed by Ersilia thanks to the data provided by the Seattle Children's (Dr. Tanya Parish research group). In vitro activity against M. tuberculosis was measured in a single-point inhibition assay (10000 molecules) and selected compounds (259) were assayed in MIC50 and MIC90 assays. Cut-offs have been determined according to the researcher's guidance.

Compound
Single
Compound
List
Float
In-house
M.tuberculosis
Antimicrobial activity
MIC90
Tuberculosis
https://github.com/ersilia-os/eos9ivc
https://pubmed.ncbi.nlm.nih.gov/30650074/
https://github.com/ersilia-os/lazy-qsar
GPL-3.0
Probability of inhibition of M.tb in vitro in the MIC50, MIC90 and whole cell assays at cut-offs 10 and 20 uM and 50%, respectively
Classification
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos9ivc
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9ivc.zip
Local
No
24/11/2023
Q4
2023
molpmofit
Ready
Molecular Prediction Model Fine-Tuning (MolPMoFiT) encodings

Using self-supervised learning, the authors pre-trained a large model using one million unlabelled molecules from ChEMBL. This model can subsequently be fine-tuned for various QSAR tasks. Here, we provide the encodings for the molecular structures using the pre-trained model, not the fine-tuned QSAR models.

Compound
Single
Value
List
Float
Pretrained
Descriptor
Embedding
https://github.com/ersilia-os/eos9zw0
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00430-x
https://github.com/XinhaoLi74/MolPMoFiT
CC
An embedding is obtained for each SMILES and represented as a matrix, where each row is the embedding vector of one SMILES character, with a dimension of 400. The pretrained model is loaded using the fastai library.
Representation
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos9zw0
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9zw0.zip
Local
6/11/2023
Q4
2023
moler-enamine-blocks
Ready
Extending molecular scaffolds with building blocks

MoLeR is a graph-based generative model that combines fragment-based and atom-by-atom generation of new molecules with scaffold-constrained optimization. It does not depend on generation history and therefore MoLeR is able to complete arbitrary scaffolds. The model has been trained on the GuacaMol dataset. Here we sample the 300k building blocks library from Enamine.

Compound
Single
Compound
List
String
Pretrained
Chemical graph model
Compound generation
https://github.com/ersilia-os/eos633t
https://arxiv.org/abs/2103.03864
https://github.com/microsoft/molecule-generation
MIT
1000 new molecules are sampled for each input molecule, preserving its scaffold.
Sampling
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos633t
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos633t.zip
Local
3/11/2023
Q4
2023
small-world-wuxi
Ready
Small World Wuxi search

Small World is an index of chemical space containing more than 230B molecular substructures. Here we use the Small World API to post a query to the SmallWorld server. We sample 100 molecules within a distance of 10 specifically for the Wuxi map, not the entire SmallWorld domain. Please check other small-world models available in our hub.

Compound
Single
Compound
List
String
Online
Similarity
https://github.com/ersilia-os/eos3kcw
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3606195/
https://pypi.org/project/smallworld-api/
MIT
List of 100 nearest neighbors
Sampling
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos3kcw
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3kcw.zip
Local
2/11/2023
Q4
2023
small-world-zinc
Ready
Small World Zinc search

Small World is an index of chemical space containing more than 230B molecular substructures. Here we use the Small World API to post a query to the SmallWorld server. We sample 100 molecules within a distance of 10 specifically for the ZINC map, not the entire SmallWorld domain. Please check other small-world models available in our hub.

Compound
Single
Compound
List
String
Online
Similarity
https://github.com/ersilia-os/eos1d7r
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3606195/
https://pypi.org/project/smallworld-api/
MIT
List of 100 nearest neighbors
Sampling
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos1d7r
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1d7r.zip
Local
2/11/2023
Q4
2023
small-world-enamine-real
Ready
Small World Enamine REAL search

Small World is an index of chemical space containing more than 230B molecular substructures. Here we use the Small World API to post a query to the SmallWorld server. We sample 100 molecules within a distance of 10 specifically for the Enamine REAL map, not the entire SmallWorld domain. Please check other small-world models available in our hub.

Compound
Single
Compound
List
String
Online
Similarity
https://github.com/ersilia-os/eos9ueu
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3606195/
https://pypi.org/project/smallworld-api/
MIT
List of 100 nearest neighbors
Sampling
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos9ueu
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9ueu.zip
Local
1/11/2023
Q4
2023
mycetos
Ready
Inhibition of Eumycetoma from MycetOS

This model predicts the growth of the fungus M. mycetomatis, causal agent of Mycetoma, in the presence of small molecules. It has been developed using the data from MycetOS, an open source initiative aiming at finding new patent-free drugs. The model has been trained using the LazyQSAR package (MorganBinaryClassifier) from Ersilia.

Compound
Single
Score
Single
Float
In-house
Mycetoma
Antifungal activity
https://github.com/ersilia-os/eos4f95
https://www.ijidonline.com/article/S1201-9712(20)31735-5/fulltext
https://github.com/ersilia-os/lazy-qsar
GPL-3.0
Probability of inhibition of M. mycetomatis (growth assay, cut-off at 20% growth)
Annotation
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos4f95
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4f95.zip
Local
No
27/9/2023
Q3
2023
hdac1-inhibition
Ready
Inhibition of HDAC1

Prediction of the inhibition of the human Histone Deacetylase 1 to revert HIV latency. The dataset is composed of all available pIC50 values from ChEMBL target 325, and the model has been developed using Ersilia's LazyQSAR package (MorganBinaryClassifier).

Compound
Single
Score
List
Float
In-house
HIV
Human
HDAC1
https://github.com/ersilia-os/eos2zmb
https://www.ebi.ac.uk/chembl/target_report_card/CHEMBL325/
https://github.com/ersilia-os/lazy-qsar
GPL-3.0
Probability of inhibition of HDAC1 at cut-offs pIC50 7 (0.1uM) and 8 (10nM)
Annotation
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos2zmb
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2zmb.zip
Local
No
27/9/2023
Q3
2023
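The HDAC1 cut-offs above are given both as pIC50 values and as concentrations. A minimal sketch of the conversion follows, using the standard definition pIC50 = -log10(IC50 in mol/L). The function names and the inclusive comparison at the cut-off are assumptions for illustration, not taken from the model code.

```python
import math


def pic50_to_molar(pic50):
    """IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


def molar_to_pic50(ic50_molar):
    """pIC50 for an IC50 given in mol/L."""
    return -math.log10(ic50_molar)


def label_at_cutoffs(pic50, cutoffs=(7.0, 8.0)):
    """Binary activity label per cut-off (1 = active); inclusive bound is an assumption."""
    return {cutoff: int(pic50 >= cutoff) for cutoff in cutoffs}


print(pic50_to_molar(7.0) * 1e6, "uM")   # pIC50 7 expressed in micromolar
print(pic50_to_molar(8.0) * 1e9, "nM")   # pIC50 8 expressed in nanomolar
print(label_at_cutoffs(7.5))
```

A compound with pIC50 7.5 would count as active at the first cut-off and inactive at the second, which is why the model returns one probability per cut-off.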
chembl-sampler
Ready
ChEMBL Molecular Sampler

A simple sampler of the ChEMBL database using their API. It looks for similar molecules to the input molecule and returns a list of 100 molecules by default. This model has been developed by Ersilia. It posts queries to an online server.

Compound
Single
Compound
List
String
Pretrained
Similarity
https://github.com/ersilia-os/eos1noy
https://academic.oup.com/nar/article/40/D1/D1100/2903401
https://github.com/ersilia-os/chem-sampler/blob/main/chemsampler/samplers/chembl/sampler.py
GPL-3.0
100 nearest molecules in ChEMBL
Sampling
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos1noy
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1noy.zip
Local
4/9/2023
Q3
2023
hepg2-mmv
Ready
HepG2 Toxicity - MMV

This model predicts the toxicity of small molecules in HepG2 cells. It has been developed by Ersilia thanks to data provided by MMV. We have used two cut-offs to define activity (5 and 10 uM, respectively) with a dataset of 1335 molecules. 5-fold cross-validation showed an AUROC of 0.8 and 0.77, respectively.

Compound
Single
Probability
List
Float
In-house
Toxicity
Human
https://github.com/ersilia-os/eos3le9
https://ersilia.io
https://github.com/ersilia-os/lazy-qsar
GPL-3.0
Probability of toxicity in HepG2 cells. Cut-offs: 5 and 10 uM
Classification
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos3le9
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3le9.zip
Local
No
24/8/2023
Q3
2023
malaria-mmv
Ready
Antimalarial activity (MMV)

Prediction of the in vitro antimalarial potential of small molecules. This model has been developed by Ersilia thanks to experimental data provided by MMV. The model provides the probability of inhibition of the malaria parasite (NF54) measured both as percentage of inhibition (with luminescence and LDH) and IC50. 5-fold crossvalidation of the models shows AUROC>0.75 in all models.

Compound
Single
Probability
Single
Float
In-house
Malaria
P.falciparum
IC50
https://github.com/ersilia-os/eos4rta
https://ersilia.io
https://github.com/ersilia-os/lazy-qsar
GPL-3.0
Probability of inhibiting the malaria parasite (strain NF54) in IC50 (threshold 1uM) and percentage of inhibition (50%, measured by LDH and Lum)
Classification
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos4rta
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4rta.zip
Local
No
24/8/2023
Q3
2023
schisto-swisstph
Ready
Anti-schistosomiasis activity

Prediction of the activity of small molecules against the schistosoma parasite. This model has been developed by Ersilia thanks to the data provided by the Swiss TPH. In vitro activity against newly transformed schistosoma (NTS) and adult worms was measured (% of inhibition of activity and IC50, respectively).

Compound
Single
Probability
List
Float
In-house
Neglected tropical disease
Schistosomiasis
IC50
https://github.com/ersilia-os/eos2l0q
https://pubmed.ncbi.nlm.nih.gov/30398059
https://github.com/ersilia-os/lazy-qsar
GPL-3.0
The probabilities of the molecule being active against schistosoma in NTS stage (in a % of inhibition assay at 70 and 90% inhibition at 10uM) and adult stage (in IC50 assay at cut-offs 5 and 10uM)
Classification
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos2l0q
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2l0q.zip
Local
No
24/8/2023
Q3
2023
chemprop-abaumannii
Ready
Inhibition of Acinetobacter baumannii growth

This model is a Chemprop neural network trained with a growth inhibition dataset. Authors screened ~7,500 molecules for those that inhibited the growth of A. baumannii in vitro. They discovered abaucin, an antibacterial compound with narrow-spectrum activity against A. baumannii.

Compound
Single
Score
Single
Float
Pretrained
A.baumannii
Antimicrobial activity
https://github.com/ersilia-os/eos3804
https://www.nature.com/articles/s41589-023-01349-8
https://github.com/GaryLiu152/chemprop_abaucin
None
Probability of growth inhibition of the bacterium A. baumannii (threshold > 80%)
Annotation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://eos3894-gz5nz.ondigitalocean.app/
https://hub.docker.com/r/ersiliaos/eos3804
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3804.zip
https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos3804
Online
Yes
23/8/2023
Q3
2023
pubchem-sampler
Ready
PubChem Molecular Sampler

A simple sampler of the PubChem database using their API. It looks for similar molecules to the input molecule and returns a list of 100 molecules by default. This model has been developed by Ersilia and posts queries to an online server.

Compound
Single
Compound
List
String
Pretrained
Similarity
https://github.com/ersilia-os/eos2hzy
https://academic.oup.com/nar/article/51/D1/D1373/6777787
https://github.com/ersilia-os/chem-sampler/blob/main/chemsampler/samplers/pubchem/sampler.py
GPL-3.0
100 nearest molecules in PubChem
Similarity
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos2hzy
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2hzy.zip
Local
10/8/2023
Q3
2023
stoned-sampler
Ready
Stoned Sampler

The STONED sampler uses small modifications to molecules represented as SELFIES to perform a search of the chemical space and generate new molecules. The use of string modifications in the SELFIES molecular representation bypasses the need for large amounts of data while maintaining a performance comparable to deep generative models.

Compound
Single
Compound
List
String
Pretrained
Compound generation
https://github.com/ersilia-os/eos8fma
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8153210/
https://github.com/aspuru-guzik-group/stoned-selfies
Apache-2.0
Up to 1000 derivatives of the input molecule
Generative
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos8fma
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8fma.zip
Local
8/8/2023
Q3
2023
smiles-pe
Ready
SmilesPE: tokenizer algorithm for SMILES, DeepSMILES, and SELFIES

The SMILES Pair Encoding method generates SMILES substring tokens based on high-frequency token pairs from large chemical datasets. This method is well-suited for both QSAR tasks and generative models. The model provided here has been pretrained using ChEMBL.

Compound
Single
Compound
Flexible List
String
Pretrained
Chemical language model
Chemical notation
ChEMBL
https://github.com/ersilia-os/eos1mxi
https://pubs.acs.org/doi/abs/10.1021/acs.jcim.0c01127
https://github.com/XinhaoLi74/SmilesPE
Apache-2.0
A data-driven tokenization method for SMILES-based deep learning models in cheminformatics, demonstrating high performance in molecular generation and QSAR prediction tasks compared to atom-level tokenization
Generative
Richiio
https://github.com/Richiio
https://hub.docker.com/r/ersiliaos/eos1mxi
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1mxi.zip
Local
2/8/2023
Q3
2023
osm-series4
Ready
Antimalarial activity from OSM

This model predicts the antimalarial potential of small molecules in vitro. We have collected the data available from the Open Source Malaria Series 4 molecules and used two cut-offs to define activity, 1 uM and 2.5 uM. The training has been done with the LazyQSAR package (Morgan Binary Classifier) and shows an AUROC >0.8 in a 5-fold cross-validation on 20% of the data held out as test. These models have been used to generate new series 4 candidates by Ersilia.

Compound
Single
Probability
List
Float
Pretrained
Malaria
P.falciparum
IC50
https://github.com/ersilia-os/eos7yti
https://pubs.acs.org/doi/10.1021/acscentsci.6b00086
https://github.com/ersilia-os/lazy-qsar
GPL-3.0
Probability of killing P.falciparum in vitro (IC50 < 1uM and 2.5uM, respectively)
Classification
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos7yti
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7yti.zip
Local
No
2/8/2023
Q3
2023
fasmifra
Ready
FasmiFra molecule generator

FasmiFra is a molecular generator based on (Deep)SMILES fragments. The authors use DeepSMILES to ensure the generated molecules are syntactically valid, and by working with string operations they are able to obtain high performance (>340,000 molecules/s). Here, we use 100k compounds from ChEMBL to sample fragments. Only assembled molecules containing one of the fragments of the input molecule are retained.

Compound
Single
Compound
List
String
Pretrained
Compound generation
https://github.com/ersilia-os/eos4qda
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00566-4
https://github.com/UnixJunkie/FASMIFRA
GPL-3.0
1000 generated molecules per input
Generative
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos4qda
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4qda.zip
Local
1/8/2023
Q3
2023
malaria-mam
Ready
Antimalarial activity for sexual stage and asexual blood stage (ABS)

Prediction of the antimalarial potential of small molecules using data from various chemical libraries that were screened against the asexual and sexual (gametocyte) stages of the parasite. Several compounds' molecular fingerprints were used to train machine learning models to recognize stage-specific active and inactive compounds.

Compound
Single
Score
List
Float
Pretrained
Malaria
P.falciparum
https://github.com/ersilia-os/eos80ch
https://pubs.acs.org/doi/10.1021/acsomega.3c05664
https://github.com/M2PL
GPL-3.0
Probability of inhibition of the malaria parasite growth
Annotation
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos80ch
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos80ch.zip
Local
Yes
10/7/2023
Q3
2023
ncats-cyp3a4
Ready
CYP3A4 metabolism

Analysis of metabolic stability, determining the inhibition of CYP3A4 activity and whether the compounds are a substrate for the CYP3A4 enzyme. The data to build these models has been made publicly available at PubChem (AID1645840, AID1645841, AID1645842) by ADME@NCATS.

Compound
Single
Probability
List
Float
Pretrained
CYP450
ADME
Metabolism
https://github.com/ersilia-os/eos3ev6
https://dmd.aspetjournals.org/content/49/9/822
https://github.com/ncats/ncats-adme
None
Probability of inhibiting the enzyme and probability of being a substrate of the enzyme. Activity in both indicates the compound is a ligand of the enzyme.
Classification
ZakiaYahya
https://github.com/ZakiaYahya
https://hub.docker.com/r/ersiliaos/eos3ev6
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3ev6.zip
Local
Yes
6/7/2023
Q3
2023
ncats-cyp2d6
Ready
CYP2D6 metabolism

Analysis of metabolic stability, determining the inhibition of CYP2D6 activity and whether the compounds are a substrate for the CYP2D6 enzyme. The data to build these models has been made publicly available at PubChem (AID1645840, AID1645841, AID1645842) by ADME@NCATS.

Compound
Single
Probability
List
Float
Pretrained
CYP450
ADME
Metabolism
https://github.com/ersilia-os/eos7nno
https://dmd.aspetjournals.org/content/49/9/822
https://github.com/ncats/ncats-adme
None
Probability of inhibiting the enzyme and probability of being a substrate of the enzyme. Activity in both indicates the compound is a ligand of the enzyme.
Classification
ZakiaYahya
https://github.com/ZakiaYahya
https://hub.docker.com/r/ersiliaos/eos7nno
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7nno.zip
Local
Yes
6/7/2023
Q3
2023
ncats-cyp2c9
Ready
CYP2C9 metabolism

Analysis of metabolic stability, determining the inhibition of CYP2C9 activity and whether the compounds are a substrate for the CYP2C9 enzyme. The data to build these models has been made publicly available at PubChem (AID1645840, AID1645841, AID1645842) by ADME@NCATS.

Compound
Single
Probability
List
Float
Pretrained
CYP450
ADME
Metabolism
https://github.com/ersilia-os/eos5jz9
https://dmd.aspetjournals.org/content/49/9/822
https://github.com/ncats/ncats-adme
None
Probability of inhibiting the enzyme and probability of being a substrate of the enzyme. Activity in both indicates the compound is a ligand of the enzyme.
Classification
ZakiaYahya
https://github.com/ZakiaYahya
https://hub.docker.com/r/ersiliaos/eos5jz9
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5jz9.zip
Local
Yes
5/7/2023
Q3
2023
bidd-molmap-fingerprint
Ready
Molecular fingerprint maps based on broadly learned knowledge-based representations

Molecular representation of small molecules via fingerprint-based molecular maps (images). Typically, the goal is to use these images as inputs for an image-based deep learning model such as a convolutional neural network. The authors have demonstrated high performance of MolMap out-of-the-box with a broad range of tasks from MoleculeNet.

Compound
Single
Image
Descriptor
List
Float
Pretrained
Fingerprint
https://github.com/ersilia-os/eos59rr
https://www.nature.com/articles/s42256-021-00301-6
https://github.com/shenwanxiang/bidd-molmap
GPL-3.0
Image representation of a molecule. Each pixel represents a molecular feature (37 rows, 36 columns, flattened with reshape)
Representation
samuelmaina
https://github.com/samuelmaina
https://hub.docker.com/r/ersiliaos/eos59rr
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos59rr.zip
Local
3/7/2023
Q3
2023
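The MolMap fingerprint output above is described as a 37 x 36 image flattened into a single list. A minimal sketch of how a consumer might restore the 2D layout is shown below. It assumes row-major flattening (as a default reshape would produce); the function name `unflatten_molmap` is hypothetical and not part of the model.

```python
def unflatten_molmap(flat, rows=37, cols=36):
    """Rebuild a rows x cols image from a row-major flattened list of pixel values."""
    if len(flat) != rows * cols:
        raise ValueError(f"expected {rows * cols} values, got {len(flat)}")
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


# Placeholder values standing in for one model output (37 * 36 = 1332 floats).
flat_output = [float(i) for i in range(37 * 36)]
image = unflatten_molmap(flat_output)
print(len(image), len(image[0]))
```

The resulting nested list can then be passed to an image-based model in whatever array type that model expects.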
h3d-virtual-screening-cascade-light
Ready
H3D virtual screening cascade light

This panel of models provides predictions for the H3D virtual screening cascade. It leverages the Ersilia Compound Embedding and FLAML. The H3D virtual screening cascade contains models for Mycobacterium tuberculosis and Plasmodium falciparum IC50 predictions, as well as ADME, cytotoxicity and solubility assays.

Compound
Single
Probability
List
Float
In-house
Malaria
P.falciparum
Tuberculosis
M.tuberculosis
ADME
Cytotoxicity
Solubility
https://github.com/ersilia-os/eos7kpb
https://www.nature.com/articles/s41467-023-41512-2
https://github.com/ersilia-os/h3d-screening-cascade-models
GPL-3.0
The raw scores are the ones emerging from the FLAML model. The ones with the suffix _perc represent the percentile on a 0-1 scale over a ChEMBL dataset of 200k compounds.
Classification
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos7kpb
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7kpb.zip
Local
Yes
9/5/2023
Q2
2023
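The _perc outputs of the H3D cascade above place a raw score on a 0-1 percentile scale relative to a reference set of ChEMBL compounds. A minimal sketch of one way to compute such a percentile follows. It assumes the percentile is the fraction of reference scores less than or equal to the query score; the exact definition used by the model, and the function name `percentile_rank`, are assumptions.

```python
from bisect import bisect_right


def percentile_rank(score, reference_sorted):
    """Fraction (0-1) of reference scores that are <= score; reference must be sorted."""
    if not reference_sorted:
        raise ValueError("reference set is empty")
    return bisect_right(reference_sorted, score) / len(reference_sorted)


# Toy reference scores standing in for the 200k-compound ChEMBL background.
reference = sorted([0.10, 0.20, 0.40, 0.80])
print(percentile_rank(0.40, reference))
```

Reading the raw score together with its percentile helps judge whether a prediction is unusually high compared with typical drug-like molecules.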
ersilia-compound-embedding
Ready
Ersilia Compound Embeddings

Bioactivity-aware chemical embeddings for small molecules. Using transfer learning, we have created a fast network that produces embeddings of 1024 features condensing physicochemical as well as bioactivity information. The training of the network has been done using the FS-Mol and ChEMBL datasets, and Grover, Mordred and ECFP descriptors.

Compound
Single
Descriptor
List
Float
In-house
Descriptor
Embedding
https://github.com/ersilia-os/eos2gw4
https://www.nature.com/articles/s41467-023-41512-2
https://github.com/ersilia-os/compound-embedding
GPL-3.0
Embedding of 1024 features representing a compound
Representation
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos2gw4
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2gw4.zip
Local
13/4/2023
Q2
2023
molfeat-chemgpt
Ready
ChemGPT-4.7

ChemGPT (4.7M params) is a language-based transformer model for generative molecular modeling, which was pretrained on the PubChem10M dataset. Pre-trained ChemGPT models are also robust, self-supervised representation learners that generalize to previously unseen regions of chemical space and enable embedding-based nearest-neighbor search.

Compound
Single
Descriptor
List
Float
Pretrained
Descriptor
Chemical language model
Chemical graph model
Embedding
https://github.com/ersilia-os/eos3cf4
https://chemrxiv.org/engage/chemrxiv/article-details/627bddd544bdd532395fb4b5
https://molfeat.datamol.io/featurizers/ChemGPT-4.7M
Apache-2.0
128 features based on a chemical language model
Representation
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos3cf4
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3cf4.zip
Local
11/4/2023
Q2
2023
molfeat-estate
Ready
Estate Molecular Descriptors

Electrotopological state (Estate) indices are numerical values computed for each atom in a molecule. They encode information about both the topological environment of that atom and the electronic interactions due to all other atoms in the molecule.

Compound
Single
Descriptor
List
Float
Pretrained
Fingerprint
Descriptor
https://github.com/ersilia-os/eos3zur
https://link.springer.com/article/10.1023/A:1015952613760
https://molfeat.datamol.io/featurizers/estate
Apache-2.0
79 Electrotopological features
Representation
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos3zur
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3zur.zip
Local
11/4/2023
Q2
2023
ncats-pampa74
Ready
Parallel Artificial Membrane Permeability Assay (PAMPA) 7

Parallel Artificial Membrane Permeability is an in vitro surrogate to determine the permeability of drugs across cellular membranes. PAMPA at pH 7.4 was experimentally determined in a dataset of 5,473 unique compounds by the NIH-NCATS. 50% of the dataset was used to train a classifier (SVM) to predict the permeability of new compounds, and validated on the remaining 50% of the data, rendering an AUC = 0.88. The Peff was converted to logarithmic, log Peff value lower than 2.0 were considered to h

Compound
Single
Probability
Single
Float
Pretrained
ADME
Permeability
LogP
https://github.com/ersilia-os/eos9tyg
https://slas-discovery.org/article/S2472-5552(22)06765-X/fulltext
https://github.com/ncats/ncats-adme
None
Probability of a compound being poorly permeable (logPeff < 1)
Classification
pauline-banye
https://github.com/pauline-banye
https://hub.docker.com/r/ersiliaos/eos9tyg
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9tyg.zip
Local
Yes
7/4/2023
Q2
2023
ncats-cyp450
Ready
CYP450 metabolism

Analysis of metabolic stability, determining the inhibition of CYP450 activity and whether the compounds are a substrate for the CYP450 enzymes. The data to build these models is publicly available at PubChem (AID1645840, AID1645841, AID1645842). The tested CYPs include CYP2C9, CYP2D6 and CYP3A4.

Compound
Single
Probability
List
Float
Pretrained
CYP450
ADME
Metabolism
https://github.com/ersilia-os/eos44zp
https://dmd.aspetjournals.org/content/49/9/822
https://github.com/ncats/ncats-adme
None
Probability of inhibiting the enzyme and probability of being a substrate of the enzyme. Activity in both indicates the compound is a ligand of the enzyme.
Classification
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos44zp
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos44zp.zip
Local
Yes
6/4/2023
Q2
2023
qcrb-tb
Ready
QcrB Inhibition (M. tuberculosis)

The cytochrome bcc complex (QcrB) is a subunit of the mycobacterial cyt-bcc-aa3 oxidoreductase in the electron transport chain (ETC), and it has been suggested as a good M.tb target due to the bacteria's dependence on oxidative phosphorylation for its growth. The authors use a dataset of 352 molecules, of which 277 are classified as active (QIM < 1 uM), 58 as moderately active (1 < QIM < 20 uM) and 78 as inactive (QIM > 20 uM). QIM refers to quantification of intracellular mycobacteria.

Compound
Single
Other value
Single
Integer
Pretrained
M.tuberculosis
Antimicrobial activity
https://github.com/ersilia-os/eos24jm
https://pubs.acs.org/doi/full/10.1021/acsomega.2c01613
https://github.com/CoutinhoLab/Q-TB/
CC
Class 1: active (QIM < 1uM), Class 2: moderately active (1 < QIM < 20uM), Class 3: inactive (QIM > 20uM)
Classification
GemmaTuron
https://github.com/GemmaTuron
https://hub.docker.com/r/ersiliaos/eos24jm
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos24jm.zip
Local
Yes
6/4/2023
Q2
2023
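The three QcrB activity classes above follow directly from the QIM thresholds. A minimal sketch of that mapping is given below. The source does not say how values exactly equal to 1 uM or 20 uM are handled, so assigning them to class 2 is an assumption, as is the function name `qim_class`.

```python
def qim_class(qim_um):
    """Map a QIM value (in uM) to the class labels used by the model output."""
    if qim_um < 0:
        raise ValueError("QIM must be non-negative")
    if qim_um < 1.0:
        return 1  # active
    if qim_um <= 20.0:
        return 2  # moderately active (boundary values placed here by assumption)
    return 3      # inactive


for value in (0.5, 5.0, 50.0):
    print(value, qim_class(value))
```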
rxn-fingerprint
Ready
RXNFP - chemical reaction fingerprints

RXNFP uses a pre-trained BERT Language Model to transform a reaction represented as smiles into a fingerprint amenable for downstream applications. The authors show how the RXN-fps can be used to identify nearest neighbors on reaction datasets, or map the reaction space without knowing the reaction centers.

Compound
Single
Descriptor
Matrix
Float
Pretrained
Fingerprint
Embedding
Chemical synthesis
https://github.com/ersilia-os/eos6aun
https://www.nature.com/articles/s42256-020-00284-w
https://github.com/rxn4chemistry/rxnfp/tree/master/
MIT
Fingerprint of the reaction.
Representation
samuelmaina
https://github.com/samuelmaina
https://hub.docker.com/r/ersiliaos/eos6aun
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6aun.zip
Local
28/3/2023
Q1
2023
ncats-hlm
Ready
Human Liver Microsomal Stability

The Human Liver Microsomal assay takes into account the liver-mediated drug metabolism to assess the stability of a compound in the human body. The NIH-NCATS group took a proprietary dataset of 4300 compounds with its associated HLM (in vitro half-life; unstable ≤ 30 min, stable > 30 min) and used it to train a classifier.

Compound
Single
Probability
Single
Float
Pretrained
Metabolism
ADME
Human
Microsomal stability
Half-life
https://github.com/ersilia-os/eos31ve
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00426-7
https://github.com/ncats/ncats-adme/tree/master
None
Probability of a compound being unstable in a HLM assay (half-life ≤ 30min)
Classification
pauline-banye
https://github.com/pauline-banye
https://hub.docker.com/r/ersiliaos/eos31ve
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos31ve.zip
Local
Yes
27/3/2023
Q1
2023
s2dv-hepg2-toxicity
Ready
S2DV HepG2 toxicity

The model uses Word2Vec, a natural language processing technique to represent SMILES strings. The model was trained on over <2000 small molecules with associated experimental HepG2 cytotoxicity data (IC50) to classify compounds as HepG2 toxic (IC50 <= 30 uM) or non-toxic. Data was gathered from the public repository ChEMBL.

Compound
Single
Experimental value
Single
Float
Pretrained
ChEMBL
IC50
Toxicity
https://github.com/ersilia-os/eos2fy6
https://pubmed.ncbi.nlm.nih.gov/35062019/
https://github.com/NTU-MedAI/S2DV
Apache-2.0
Probability of HepG2 Toxicity (IC50 < 30 uM)
Classification
emmakodes
https://github.com/emmakodes
https://hub.docker.com/r/ersiliaos/eos2fy6
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2fy6.zip
Local
27/3/2023
Q1
2023
hob-pre
Ready
Human oral bioavailability prediction

HobPre predicts the oral bioavailability of small molecules in humans. It has been trained using public data on ~1200 molecules (Falcón-Cano et al., 2020, complemented with other literature and ChEMBL compounds). The molecules were labeled according to two cut-offs, HOB > 20% and HOB > 50%, because of ongoing discussions as to which would be the more appropriate cut-off.

Compound
Single
Probability
List
Float
Pretrained
ADME
Solubility
Human
https://github.com/ersilia-os/eos2lqb
https://doi.org/10.1186/s13321-021-00580-6
https://github.com/whymin/HOB
None
Probability of a compound having high oral bioavailability (HOB >20% and HOB >50%)
Classification
HellenNamulinda
https://github.com/HellenNamulinda
https://hub.docker.com/r/ersiliaos/eos2lqb
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2lqb.zip
Local
Yes
27/3/2023
Q1
2023
redial-2020
Ready
SARS-CoV-2 antiviral prediction: REDIAL-2020

Predictor of several endpoints related to SARS-CoV-2. It provides predictions for Live Virus Infectivity, Viral Entry, Viral Replication, In Vitro Infectivity and Human Cell Toxicity using a combination of three models. Consensus results are obtained by averaging the predictions of the three models for each activity and toxicity endpoint. The models have been built using NCATS COVID19 data. Further details on result interpretation can be found here: https://drugcentral.org/Redial

Compound
Single
Probability
Single
Float
Pretrained
Sars-CoV-2
COVID19
Antiviral activity
https://github.com/ersilia-os/eos8fth
https://www.nature.com/articles/s42256-021-00335-w#Sec9
https://github.com/sirimullalab/redial-2020/tree/v1.0
MIT
The model returns the probability of 1 (active) in each assay. Good drugs are active in CPE, 3CL and are inactive in cytotox, hCYTOX and ACE2 and/or are active in at least one of the following: AlphaLISA, CoV-PPE, MERS-PPE, while inactive in the counter screen, respectively: TruHit, CoV-PPE_cs, MERS-PPE_cs.
Classification
Pradnya2203
https://github.com/Pradnya2203
https://hub.docker.com/r/ersiliaos/eos8fth
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8fth.zip
Local
Yes
27/3/2023
Q1
2023
s2dv-hbv
Ready
Inhibition of Hepatitis B virus

The model uses Word2Vec, a natural language processing technique, to represent SMILES strings. The model was trained on <4000 small molecules with associated experimental HBV inhibition data (IC50) to classify compounds as HBV inhibitors (IC50 <= 1 uM) or non-inhibitors. Data was gathered from the public repository ChEMBL.

Compound
Single
Experimental value
Single
Float
Pretrained
Antiviral activity
IC50
HBV
ChEMBL
https://github.com/ersilia-os/eos8lok
https://pubmed.ncbi.nlm.nih.gov/35062019/
https://github.com/NTU-MedAI/S2DV
Apache-2.0
Probability of inhibition of HBV (IC50 < 1 uM)
Classification
emmakodes
https://github.com/emmakodes
https://hub.docker.com/r/ersiliaos/eos8lok
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8lok.zip
Local
Yes
24/3/2023
Q1
2023
ncats-hlcs
Ready
Human Liver Cytosolic Stability

The human liver cytosol stability model is used for predicting the stability of a drug in the cytosol of human liver cells, which is beneficial for identifying potential drug candidates early during the drug discovery process. If a drug compound is quickly absorbed, it may not reach the intended target in the body or become toxic. On the other hand, if a drug compound is too stable, it could accumulate and cause detrimental effects. The authors use an NCATS dataset of 1450 compounds screened in

Compound
Single
Probability
Single
Float
Pretrained
ADME
Metabolism
Human
Half-life
https://github.com/ersilia-os/eos9yy1
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00426-7
https://github.com/ncats/ncats-adme
None
Probability of a compound being unstable (half-life ≤ 30 min) due to liver cell metabolism
Classification
pauline-banye
https://github.com/pauline-banye
https://hub.docker.com/r/ersiliaos/eos9yy1
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9yy1.zip
Local
Yes
1/3/2023
Q1
2023
idl-ppbopt
Ready
Human Plasma Protein Binding (PPB) of Compounds

IDL-PPB aims to obtain the plasma protein binding (PPB) values of a compound. Based on an interpretable deep learning model that uses the Attentive FP (AFP) algorithm, this model predicts the binding affinity between plasma proteins and the compound.

Compound
Single
Experimental value
Single
Float
Pretrained
Fraction bound
ADME
https://github.com/ersilia-os/eos22io
https://pubs.acs.org/doi/10.1021/acs.jcim.2c00297
https://github.com/Louchaofeng/IDL-PPBopt
GPL-3.0
This model receives SMILES as input and returns the PPB fraction, which measures the affinity of the compound for plasma proteins. In their analysis of the results, the authors distinguish high affinity (PPB fraction > 80%), medium affinity (40% <= PPB fraction <= 80%) and low affinity (PPB fraction < 40%). Note: inorganics and salts are outside the applicability domain of the model, so the output for these compounds is Null.
Regression
carcablop
https://github.com/carcablop
https://hub.docker.com/r/ersiliaos/eos22io
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos22io.zip
Local
3/2/2023
Q1
2023
ncats-solubility
Ready
Aqueous Kinetic Solubility

Kinetic aqueous solubility (μg/mL) was experimentally determined using the same SOP in over 200 NCATS drug discovery projects. A final dataset of 11780 non-redundant molecules and their associated solubility was used to train an SVM classifier. Approximately half of the dataset has poor solubility (< 10 μg/mL), and two-thirds of these poorly soluble molecules report values of < 1 μg/mL. A subset of the data used is available at PubChem (AID 1645848).

Compound
Single
Probability
Single
Float
Pretrained
ADME
Solubility
https://github.com/ersilia-os/eos74bo
https://slas-discovery.org/article/S2472-5552(22)06765-X/fulltext
https://github.com/ncats/ncats-adme
None
Probability of a compound having poor solubility (< 10 µg/mL)
Classification
pauline-banye
https://github.com/pauline-banye
https://hub.docker.com/r/ersiliaos/eos74bo
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos74bo.zip
Local
Yes
31/1/2023
Q1
2023
ncats-pampa5
Ready
Parallel Artificial Membrane Permeability Assay 5

Parallel Artificial Membrane Permeability is an in vitro surrogate to determine the permeability of drugs across cellular membranes. PAMPA at pH 5 was experimentally determined in a dataset of 5,473 unique compounds by the NIH-NCATS. 50% of the dataset was used to train a classifier (SVM) to predict the permeability of new compounds, and validated on the remaining 50% of the data, rendering an AUC = 0.88. The Peff was converted to logarithmic, log Peff value lower than 2.0 were considered to hav

Compound
Single
Probability
Single
Float
Pretrained
ADME
Permeability
LogP
https://github.com/ersilia-os/eos81ew
https://www.sciencedirect.com/science/article/pii/S0968089621005964
https://github.com/ncats/ncats-adme
None
Probability of a compound being poorly permeable (logPeff < 1)
Classification
pauline-banye
https://github.com/pauline-banye
https://hub.docker.com/r/ersiliaos/eos81ew
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos81ew.zip
Local
Yes
29/1/2023
Q1
2023
image-mol-gpcr
Ready
imagemol-gpcr

ImageMol is a Representation Learning Framework that utilizes molecule images for encoding molecular inputs as machine readable vectors for downstream tasks such as bio-activity prediction, drug metabolism analysis, or drug toxicity prediction. The approach utilizes transfer learning, that is, pre-training the model on massive unlabeled datasets to help it in generalizing feature extraction and then fine tuning on specific tasks. This model is fine tuned on 10 GPCR assays with the largest number

Compound
Single
Score
Single
Float
Pretrained
Target identification
GPCR
https://github.com/ersilia-os/eos93h2
https://www.nature.com/articles/s42256-022-00557-6
https://github.com/HongxinXiang/ImageMol
MIT
Binding activity prediction (as a regression task) for the following GPCR assays: 5HT1A, 5HT2A, AA1R, AA2AR, AA3R, CNR2, DRD2, DRD3, HRH3, OPRM
Regression
DhanshreeA
https://github.com/DhanshreeA
https://hub.docker.com/r/ersiliaos/eos93h2
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos93h2.zip
Local
25/1/2023
Q1
2023
datamol-smiles2canonical
Ready
Converter of SMILES into canonical SMILES, SELFIES, InChI and InChIKey forms

Using the Datamol package, the model receives a SMILES string as input, then sanitizes and standardizes the molecule to generate four outputs: canonical SMILES, SELFIES, InChI and InChIKey.

Compound
Single
Compound
Matrix
String
Pretrained
Chemical notation
https://github.com/ersilia-os/eos7qga
https://doc.datamol.io/stable/tutorials/Preprocessing.html
https://github.com/datamol-org/datamol
Apache-2.0
Compound represented in its canonical SMILES, SELFIES, InChI and InChIKey forms
Representation
carcablop
https://github.com/carcablop
https://hub.docker.com/r/ersiliaos/eos7qga
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7qga.zip
Local
25/1/2023
Q1
2023
image-mol-embeddings
Ready
Molecular representation learning

Representation Learning Framework that utilizes molecule images for encoding molecular inputs as machine readable vectors for downstream tasks such as bio-activity prediction, drug metabolism analysis, or drug toxicity prediction. The approach utilizes transfer learning, that is, pre-training the model on massive unlabeled datasets to help it in generalizing feature extraction and then fine tuning on specific tasks.

Compound
Single
Descriptor
Matrix
Float
Pretrained
Embedding
https://github.com/ersilia-os/eos4avb
https://www.nature.com/articles/s42256-022-00557-6
https://github.com/HongxinXiang/ImageMol
MIT
ImageMol embeddings of shape [1512], reshaped as a NumPy 1D array before serializing. These embeddings can be used as the input features of a fully connected classification or regression layer in a neural network.
Representation
DhanshreeA
https://github.com/DhanshreeA
https://hub.docker.com/r/ersiliaos/eos4avb
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4avb.zip
Local
25/1/2023
Q1
2023
sars-cov-2-antiviral-screen
Ready
SARS-CoV-2 antiviral screening

ImageMol is a Representation Learning Framework that utilizes molecule images for encoding molecular inputs as machine readable vectors for downstream tasks such as bio-activity prediction, drug metabolism analysis, or drug toxicity prediction. The approach utilizes transfer learning, that is, pre-training the model on massive unlabeled datasets to help it in generalizing feature extraction and then fine tuning on specific tasks. This model is fine tuned on 13 assays concerned with a number of t

Compound
Single
Boolean
List
Integer
Pretrained
Sars-CoV-2
Antiviral activity
COVID19
https://github.com/ersilia-os/eos4cxk
https://www.nature.com/articles/s42256-022-00557-6
https://github.com/HongxinXiang/ImageMol
MIT
The output comprises binary classifications across the following thirteen assays: 3C-like enzymatic activity (3CL), ACE2 enzymatic activity (ACE2), Human Embryonic Kidney 293 Cell line toxicity (HEK293), Human fibroblast toxicity (Human), MERS Pseudotyped particle entry (MERS_PPE), MERS Pseudotyped particle entry counterscreen (MERS_PPE_cs), SarsCov Pseudotyped particle entry (Cov_PPE), SarsCov Pseudotyped particle entry counterscreen (Cov_PPE_cs), SarsCov2 cytopathic effect (COV2_CPE), SarsCov2 cytopathic effect counterscreen (COV2_Cytotox), Spike ACE2 Protein-protein interaction (AlphaLISA), Spike ACE2 Protein-protein interaction counterscreen (TruHit), Transmembrane protease serine 2 enzymatic activity (TMPRSS2)
Classification
DhanshreeA
https://github.com/DhanshreeA
https://hub.docker.com/r/ersiliaos/eos4cxk
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4cxk.zip
Local
Yes
25/1/2023
Q1
2023
image-mol-bace
Ready
ImageMol human beta-secretase-1 (BACE-1) inhibition

This model has been developed using ImageMol, a deep learning model pretrained on 10 million unlabelled small molecules and fine-tuned in a second step to predict the binding of inhibitors to the human beta secretase 1 (BACE-1) protein. The BACE-1 dataset from MoleculeNet contains 1522 compounds with their associated pIC50. A compound with pIC50 ≥ 7 is considered a BACE-1 inhibitor.

Compound
Single
Probability
Single
Float
Pretrained
BACE
Chemical graph model
MoleculeNet
https://github.com/ersilia-os/eos8c0o
https://www.nature.com/articles/s42256-022-00557-6
https://github.com/ChengF-Lab/ImageMol
MIT
Probability of BACE-1 inhibition (>0.5: Inhibitor). Compounds with pIC50 ≥ 7 are considered BACE-1 inhibitors
Classification
DhanshreeA
https://github.com/DhanshreeA
https://hub.docker.com/r/ersiliaos/eos8c0o
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8c0o.zip
Local
17/1/2023
Q1
2023
image-mol-hiv
Ready
ImageMol HIV growth inhibition

This model has been developed using ImageMol, a deep learning model pretrained on 10 million unlabelled small molecules and fine-tuned in a second step to predict the inhibition of the human immunodeficiency virus (HIV). The HIV dataset is from MoleculeNet and contains 43850 small molecules and their in vitro activity against HIV (CA - Confirmed active, CM - Confirmed moderately active, CI - Confirmed inactive). The classification was based on EC50 values and expert knowledge.

Compound
Single
Probability
Single
Float
Pretrained
HIV
Antiviral activity
MoleculeNet
https://github.com/ersilia-os/eos6hy3
https://www.nature.com/articles/s42256-022-00557-6
https://github.com/ChengF-Lab/ImageMol
MIT
Probability of HIV inhibition. Active compounds are considered those classified as CA/CM.
Classification
DhanshreeA
https://github.com/DhanshreeA
https://hub.docker.com/r/ersiliaos/eos6hy3
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6hy3.zip
Local
Yes
17/1/2023
Q1
2023
ncats-rlm
Ready
Rat liver microsomal stability

Hepatic metabolic stability is key to ensure the drug attains the desired concentration in the body. The Rat Liver Microsomal (RLM) stability is a good approximation of a compound’s stability in the human body, and NCATS has collected a proprietary dataset of 20216 compounds with its associated RLM (in vitro half-life; unstable ≤30 min, stable >30 min) and used it to train a classifier based on an ensemble of several ML approaches (random forest, deep neural networks, graph convolutional neural

Compound
Single
Probability
Single
Float
Pretrained
Microsomal stability
Rat
ADME
Metabolism
Half-life
https://github.com/ersilia-os/eos5505
https://slas-discovery.org/article/S2472-5552(22)06765-X/fulltext
https://github.com/ncats/ncats-adme
None
Probability of a compound being unstable in an RLM assay (half-life ≤ 30 min)
Classification
pauline-banye
https://github.com/pauline-banye
https://hub.docker.com/r/ersiliaos/eos5505
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5505.zip
Local
Yes
12/1/2023
Q1
2023
smiles2iupac
Ready
STOUT: SMILES to IUPAC name translator

Small molecules are represented by a variety of machine-readable strings (SMILES, InChI, SMARTS, among others). In contrast, IUPAC (International Union of Pure and Applied Chemistry) names are devised for human readers. The authors trained a language translator model treating SMILES and IUPAC names as two different languages. 81 million SMILES were downloaded from PubChem and converted to SELFIES for model training. The corresponding IUPAC names for the 81 million SMILES were obtained with Che

Compound
Single
Text
Single
String
Pretrained
Chemical notation
Chemical language model
https://github.com/ersilia-os/eos4se9
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00512-4
https://github.com/Kohulan/Smiles-TO-iUpac-Translator
MIT
IUPAC name of a specific SMILES
Representation
carcablop
https://github.com/carcablop
https://hub.docker.com/r/ersiliaos/eos4se9
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4se9.zip
Local
9/1/2023
Q1
2023
drugtax
Ready
DrugTax: Drug taxonomy

DrugTax takes SMILES inputs and classifies each molecule according to its taxonomy (organic or inorganic kingdom and their subclasses), using a 0/1 binary classification for each class. It generates a vector of 163 features including the taxonomy classification and other key information such as the number of carbons, nitrogens, etc. These vectors can be used for subsequent molecular representation in chemoinformatic pipelines.

Compound
Single
Descriptor
List
Integer
Pretrained
Fingerprint
Descriptor
https://github.com/ersilia-os/eos24ci
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-022-00649-w
https://github.com/MoreiraLAB/DrugTax
GPL-3.0
A vector of 163 points, each one corresponding to a particular taxonomic or structural molecular feature
Representation
Femme-js
https://github.com/Femme-js
https://hub.docker.com/r/ersiliaos/eos24ci
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos24ci.zip
Local
3/1/2023
Q1
2023
meta-trans
Ready
MetaTrans: human drug metabolites

Small molecules are metabolized by the liver in what is known as phase I and phase II reactions. These can lead to reduced drug efficacy and the generation of toxic metabolites, causing serious side effects. This model predicts the human metabolites of small molecules using a molecular transformer pre-trained on general chemical reactions and fine-tuned to human metabolism. It provides up to 10 metabolites for each input molecule.

Compound
Single
Compound
List
String
Pretrained
Metabolism
https://github.com/ersilia-os/eos935d
https://pubs.rsc.org/en/content/articlelanding/2020/sc/d0sc02639e#fn1
https://github.com/KavrakiLab/MetaTrans
BSD-3.0
A maximum of 10 human metabolites generated from the input molecule
Generative
carcablop
https://github.com/carcablop
https://hub.docker.com/r/ersiliaos/eos935d
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos935d.zip
Local
20/12/2022
Q4
2022
crem-structure-generation
Ready
CReM fragment based structure generation

CReM (chemically reasonable mutations) is a fragment-based generative model that takes as input a small molecule, breaks it down into fragments and iteratively replaces them with other fragments from a database. It has three implementations: MUTATE (arbitrarily replaces one fragment with another), GROW (arbitrarily replaces a hydrogen with another fragment) and LINK (replaces hydrogen atoms in two molecules to link them with a fragment). Here, we use a MUTATE and GROW approach, which prov

Compound
Single
Compound
List
String
Pretrained
Compound generation
https://github.com/ersilia-os/eos4q1a
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00431-w
https://github.com/DrrDom/crem
BSD-3.0
Up to 100 newly generated molecules
Generative
DhanshreeA
https://github.com/DhanshreeA
https://hub.docker.com/r/ersiliaos/eos4q1a
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4q1a.zip
Local
20/12/2022
Q4
2022
moler-enamine-fragments
Ready
Extending molecular scaffolds with fragments

MoLeR is a graph-based generative model that combines fragment-based and atom-by-atom generation of new molecules with scaffold-constrained optimization. It does not depend on generation history and therefore MoLeR is able to complete arbitrary scaffolds. The model has been trained on the GuacaMol dataset. Here we sample a fragment library from Enamine.

Compound
Single
Compound
List
String
Pretrained
Chemical graph model
Compound generation
https://github.com/ersilia-os/eos9taz
https://arxiv.org/abs/2103.03864
https://github.com/microsoft/molecule-generation
MIT
1000 new molecules are sampled for each input molecule, preserving its scaffold.
Generative
anamika-yadav99
https://github.com/anamika-yadav99
https://hub.docker.com/r/ersiliaos/eos9taz
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9taz.zip
Local
16/11/2022
Q4
2022
molt5-smiles-to-caption
Ready
MolT5: Translation between Molecules and Natural Language

MolT5 (Molecular T5) is a self-supervised learning framework pretrained on unlabeled natural language text and molecule strings with two end goals: molecular captioning (given a molecule, generate its description) and text-based de novo molecular generation (given a description, propose a molecule that matches it). This implementation is focused on molecular captioning.

Compound
Single
Text
Single
String
Pretrained
Chemical language model
Chemical notation
https://github.com/ersilia-os/eos2rd8
https://arxiv.org/abs/2204.11817
https://github.com/blender-nlp/MolT5
None
Description of a molecule
Representation
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos2rd8
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2rd8.zip
Local
14/11/2022
Q4
2022
bayesian-drug-likeness
Ready
Drug-likeness prediction with Bayesian neural networks

To define drug-likeness, a set of 2136 approved drugs from DrugBank was taken as drug-like, and three negative datasets were selected from ZINC15 (19M), the Network of Organic Chemistry (6M) and ligands from the Protein Data Bank (13k), respectively. The drug dataset was combined with an equal subsampling of the negative dataset for each experiment, using five different molecular representations (Mold2, RDKit, MCS, EXFP4, Mol2Vec). We have re-trained the model following the authors' specifications.

Compound
Single
Probability
Single
Float
Retrained
Drug-likeness
https://github.com/ersilia-os/eos9sa2
https://www.nature.com/articles/s42256-020-0209-y
https://github.com/Nanotekton/drugability/tree/v0.1
Non-commercial
Drug-likeness probability
Classification
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos9sa2
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9sa2.zip
Local
9/11/2022
Q4
2022
molbloom
Ready
MolBloom: molecule purchasability in ZINC20

This model uses a Bloom filter to query the ZINC20 database to identify if a molecule is purchasable. A Bloom filter is a space-efficient probabilistic data structure to identify whether an element is in a given set. Due to the nature of Bloom filters, false negatives are not possible (i.e., if the model returns False, the molecule is not purchasable). As stated by the author, if the model returns True, the molecule is purchasable with an error rate of 0.0003 (according to the ZINC20 catalog).

Compound
Single
Boolean
Single
String
Pretrained
ZINC
Compound generation
https://github.com/ersilia-os/eos8a5g
https://github.com/whitead/molbloom/blob/main/CITATION.cff
https://github.com/whitead/molbloom
MIT
It returns a boolean (True/False) suggesting whether the molecule is commercially available or not.
Classification
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos8a5g
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8a5g.zip
Local
2/11/2022
Q4
2022
mesh-therapeutic-use
Ready
MeSH therapeutic use based on chemical structure

Drug function, defined as Medical Subject Headings (MeSH) “therapeutic use”, is predicted based on the chemical structure. 6955 non-redundant molecules, pertaining to one of the twelve therapeutic use classes selected, were downloaded from PubChem and used to train a binary classifier. The model provides the probability that a molecule has one of the following therapeutic uses: antineoplastic, cardiovascular, central nervous system (CNS), anti-infective, gastrointestinal, anti-inflammatory, derma

Compound
Single
Probability
List
Float
In-house
Therapeutic indication
https://github.com/ersilia-os/eos238c
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819987/
https://github.com/jgmeyerucsd/drug-class
GPL-3.0
Probability that the molecule belongs to each therapeutic use specified.
Classification
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos238c
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos238c.zip
Local
17/10/2022
Q4
2022
admetlab-2
Ready
ADMETlab-2

ADMETLab2 is the improved version of ADMETLab, a suite of models for systematic evaluation of ADMET properties. ADMETLab2 provides predictions on 17 physicochemical properties, 13 medicinal chemistry properties, 23 ADME properties, 27 toxicity endpoints and 8 toxicophore rules. The code and training data are not released; using this model sends queries to the ADMETLab2 online server. The Ersilia Model Hub also offers ADMETLab (v1) as a downloadable package for IP-sensitive queries.

Compound
Single
Experimental value
Probability
List
Float
Online
Toxicity
ADME
Lipophilicity
Solubility
Permeability
https://github.com/ersilia-os/eos2v11
https://academic.oup.com/nar/article/49/W1/W5/6249611?login=false
https://admetmesh.scbdd.com/
Proprietary
Predicted relevant ADMET properties, Tox21 outcomes, physicochemical properties and drug-likeness. Outputs are of mixed type, including classification (labels) and continuous values.
Regression
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos2v11
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2v11.zip
Local
16/9/2022
Q3
2022
metabokiller
Ready
Carcinogenic potential of metabolites and small molecules

Carcinogenicity is a result of several potential effects on cells. This model predicts the carcinogenic potential of a small molecule based on its potential to induce cellular proliferation, genomic instability, oxidative stress, anti-apoptotic responses and epigenetic alterations. Metabokiller uses the Chemical Checker signaturizer to featurize the molecules, and the LIME package to provide interpretable results. Using Metabokiller, the authors screened a panel of human metabolites and exper

Compound
Single
Probability
List
Float
Pretrained
Toxicity
Cancer
Metabolism
https://github.com/ersilia-os/eos1579
https://doi.org/10.1038/s41589-022-01110-7
https://github.com/the-ahuja-lab/Metabokiller
Non-commercial
Probability that the molecule has each of the specified carcinogenic properties
Classification
brosular
https://github.com/brosular
https://hub.docker.com/r/ersiliaos/eos1579
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1579.zip
Local
30/8/2022
Q3
2022
bidd-molmap-desc
Ready
Molecular maps based on broadly learned knowledge-based representations

Molecular representation of small molecules via descriptor-based molecular maps (images). The fingerprint-based molecular maps are available at eos59rr. These images can be used as inputs for an image-based deep learning model such as a convolutional neural network. The authors have demonstrated high performance of MolMap out-of-the-box with a broad range of tasks from MoleculeNet.

Compound
Single
Image
Descriptor
Matrix
Float
Pretrained
Descriptor
https://github.com/ersilia-os/eos6m4j
https://www.nature.com/articles/s42256-021-00301-6
https://github.com/shenwanxiang/bidd-molmap
GPL-3.0
Image representation of a molecule. Each pixel represents a molecular feature
Generative
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos6m4j
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6m4j.zip
Local
25/8/2022
Q3
2022
maip-malaria
Ready
MAIP: antimalarial activity prediction

Prediction of the antimalarial potential of small molecules. This model is an ensemble of smaller QSAR models trained on proprietary data from various sources, up to a total of >7M compounds. The training sets belong to Evotec, Johns Hopkins, MRCT, MMV - St. Jude, AZ, GSK, and St. Jude Vendor Library. The code and training data are not released; using this model sends queries to the MAIP online server. The Ersilia Model Hub also offers MAIP-surrogate as a downloadable package for IP-sensitiv

Compound
Single
Score
Single
Float
Online
P.falciparum
Malaria
Antimicrobial activity
https://github.com/ersilia-os/eos4zfy
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00487-2
https://www.ebi.ac.uk/chembl/maip/
None
Higher score indicates higher antimalarial potential
Classification
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos4zfy
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4zfy.zip
Local
Yes
18/8/2022
Q3
2022
chembl-similarity
Ready
Similarity search in ChEMBL

Given a molecule, this model looks for its 100 nearest neighbors in the ChEMBL database, according to ECFP4 Tanimoto similarity. Due to size constraints, the model redirects queries to the ChEMBL server, so using this model sends queries online.

Compound
Single
Compound
List
String
Online
ChEMBL
Similarity
https://github.com/ersilia-os/eos2a9n
https://www.frontiersin.org/articles/10.3389/fchem.2020.00046/full
http://130.92.106.217:8080/chemblMuti.v1/
None
List of 100 nearest neighbors
Similarity
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos2a9n
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2a9n.zip
Local
18/8/2022
Q3
2022
medchem17-similarity
Ready
Similarity search in ChEMBL, DrugBank and UNPD

Given a molecule, this model looks for its 100 nearest neighbors, according to ECFP4 Tanimoto similarity, in the medicinal chemistry database ChEMBL17_DrugBank17_UNPD17. This combined database contains all the compounds from the three collections (DrugBank, ChEMBL22 and Universal natural product directory (UNPD)) with up to 17 heavy atoms. It features a total of 128k compounds. The whole ChEMBL17_DrugBank17_UNPD17 database is not downloaded with the model; by using it you post queries to an online ser

Compound
Single
Compound
List
String
Online
Similarity
ChEMBL
DrugBank
https://github.com/ersilia-os/eos9c7k
https://onlinelibrary.wiley.com/doi/abs/10.1002/minf.201900031
https://gdb-medchem-simsearch.gdb.tools/
None
List of 100 nearest neighbors
Similarity
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos9c7k
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9c7k.zip
Local
18/8/2022
Q3
2022
gdbmedchem-similarity
Ready
GDBMedChem similarity search

The model looks for the 100 nearest neighbors of a given molecule, according to ECFP4 Tanimoto similarity, in the GDBMedChem database. GDBMedChem is a 10M molecule-sampling from GDB17, a database containing all the enumerated molecules of up to 17 heavy atoms (166.4B molecules). GDBMedChem compounds have reduced complexity and better synthetic accessibility than GDB17 but retain high sp3 carbon fraction and natural product likeness, providing a database of diverse molecules for drug design. Th

Compound
Single
Compound
List
String
Online
Similarity
ChEMBL
https://github.com/ersilia-os/eos7jlv
https://onlinelibrary.wiley.com/doi/abs/10.1002/minf.201900031
https://gdb-medchem-simsearch.gdb.tools/
None
List of 100 nearest neighbors
Similarity
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos7jlv
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7jlv.zip
Local
18/8/2022
Q3
2022
gdbchembl-similarity
Ready
GDBChEMBL similarity search

The model looks for the 100 nearest neighbors of a given molecule, according to ECFP4 Tanimoto similarity, in the GDBChEMBL database. GDBChEMBL is a 10M molecule-sampling from GDB17, a database containing all the enumerated molecules of up to 17 heavy atoms (166.4B molecules). GDBChEMBL compounds were selected using a ChEMBL-likeness score, with the objective of having a collection with higher synthetic accessibility and high bioactivity while maintaining continuous coverage of the GDB17 chemi

Compound
Single
Compound
List
String
Online
Similarity
ChEMBL
https://github.com/ersilia-os/eos4b8j
https://www.frontiersin.org/articles/10.3389/fchem.2020.00046/full
https://gdb-chembl-simsearch.gdb.tools/
None
List of 100 nearest neighbors
Similarity
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos4b8j
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4b8j.zip
Local
15/8/2022
Q3
2022
chemical-vae
Ready
Variational autoencoder for small molecule generation

This variational autoencoder (VAE) for chemistry uses an encoder-decoder-predictor framework to predict new small molecules. The input SMILES molecule is converted into a continuous vector, and the decoder converts this molecular representation back to a discrete SMILES. These continuous molecular representations allow for simple operations to generate new chemical matter. The decoder is constrained to produce valid molecules. In addition, a predictor estimates the chemical properties of the mol

Compound
Single
Compound
List
String
Pretrained
Compound generation
https://github.com/ersilia-os/eos3ae7
https://pubs.acs.org/doi/10.1021/acscentsci.7b00572
https://github.com/aspuru-guzik-group/chemical_vae
Apache-2.0
Compounds generated based on the input molecule
Generative
brosular
https://github.com/brosular
https://hub.docker.com/r/ersiliaos/eos3ae7
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3ae7.zip
Local
13/8/2022
Q3
2022
chemnet-distance
Ready
FCD: Fréchet ChemNet Distance to evaluate generative models

The Fréchet ChemNet distance is a metric to evaluate generative models. It unifies, in a single score, whether the generated molecules are valid according to chemical and biological properties as well as their diversity from the training set. The score measures the Fréchet Inception Distance between molecules represented by ChemNet, a deep neural network trained to predict biological and chemical properties of small molecules.

Compound
Pair of Lists
Distance
Single
Float
Pretrained
Similarity
Bioactivity profile
Compound generation
https://github.com/ersilia-os/eos9be7
https://pubs.acs.org/doi/10.1021/acs.jcim.8b00234
https://github.com/bioinf-jku/FCD
LGPL-3.0
Fréchet ChemNet Distance (FCD). A higher FCD indicates a greater difference from the training set
Similarity
brosular
https://github.com/brosular
https://hub.docker.com/r/ersiliaos/eos9be7
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9be7.zip
Local
12/8/2022
Q3
2022
bayesherg
Ready
BayeshERG: hERG channel blockade

BayeshERG is a predictor of small molecule-induced blockade of the hERG ion channel. To increase its predictive power, the authors pretrained a Bayesian graph neural network with 300,000 molecules as a transfer learning exercise. The pretraining set was obtained from Du et al., 2015, and the fine-tuning dataset is a collection of 14,322 molecules from public databases (8488 positives and 5834 negatives). The model was validated on external datasets and experimentally, from 12 selected compounds (

Compound
Single
Probability
Single
Float
Pretrained
hERG
Toxicity
Cardiotoxicity
https://github.com/ersilia-os/eos4tcc
https://academic.oup.com/bib/article-abstract/23/4/bbac211/6609519
https://github.com/GIST-CSBL/BayeshERG
GPL-3.0
Probability of hERG channel blockade. The cut-off used in the training set to define hERG blockade was IC50 <= 10 μM
Classification
azycn
https://github.com/azycn
https://hub.docker.com/r/ersiliaos/eos4tcc
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4tcc.zip
Local
10/8/2022
Q3
2022
rexgen
Ready
Organic reaction outcome prediction

Utilizes a Weisfeiler-Lehman network (attentive mechanism) to predict the products of an organic reaction given the reactants. The model identifies the reaction centers (set of atoms/bonds that change from reactant to product) and obtains the products directly from a graph-based neural network.

Compound
List
Compound
Flexible List
String
Pretrained
Chemical synthesis
https://github.com/ersilia-os/eos5qfo
https://arxiv.org/pdf/1709.04555v3.pdf
https://github.com/connorcoley/rexgen_direct
GPL-3.0
Products of an organic reaction
Generative
svolk19-stanford
https://github.com/svolk19-stanford
https://hub.docker.com/r/ersiliaos/eos5qfo
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5qfo.zip
Local
8/8/2022
Q3
2022
deepsmiles
Ready
DeepSMILES, an alternate SMILES representation for deep learning

DeepSMILES converts a SMILES string to a more accurate syntax for molecule representation, taking into account both the branches (closed parenthesis in the SMILES strings) and rings (using a single symbol at ring closure that also indicates ring size). This syntax is particularly suitable in generative models, when the output is a SMILES string. With DeepSMILES, scientists can train a network using this new syntax, generate new molecules represented as DeepSMILES and then decode them back to nor

Compound
Single
Compound
Single
String
Pretrained
Chemical language model
Chemical notation
https://github.com/ersilia-os/eos2mrz
https://chemrxiv.org/engage/api-gateway/chemrxiv/assets/orp/resource/item/60c73ed6567dfe7e5fec388d/original/deep-smiles-an-adaptation-of-smiles-for-use-in-machine-learning-of-chemical-structures.pdf
https://github.com/baoilleach/deepsmiles
MIT
String representing a DeepSMILES
Representation
brosular
https://github.com/brosular
https://hub.docker.com/r/ersiliaos/eos2mrz
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2mrz.zip
Local
28/7/2022
Q3
2022
admetlab
Ready
ADMETlab models for evaluation of drug candidates

A series of models for the systematic ADMET evaluation of drug candidate molecules. Models include blood-brain barrier penetration; inhibition and substrate affinity for CYP1A2, CYP2C9, CYP2C19, CYP2D6, CYP3A4, and P-gp; F 20% and F 30% bioavailability; human intestinal absorption; Ames mutagenicity; skin sensitization; plasma protein binding; volume of distribution; LD50 of acute toxicity; human hepatotoxicity; hERG blocking; clearance; half-life; Papp (Caco-2 permeability); LogD distribution coeff

Compound
Single
Experimental value
List
Float
Pretrained
ADME
Toxicity
Lipophilicity
Solubility
Permeability
https://github.com/ersilia-os/eos2re5
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-018-0283-x
https://github.com/ifyoungnet/ADMETlab
GPL-3.0
Regression models provide a numerical result (LogS (log mol/L), LogP (distribution coefficient), Papp (Caco-2 permeability in cm/s), PPB (%)). Classifications provide the probability of activity according to ADMETlab thresholds.
Classification
svolk19-stanford
https://github.com/svolk19-stanford
https://hub.docker.com/r/ersiliaos/eos2re5
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2re5.zip
Local
28/7/2022
Q3
2022
deepherg
Ready
Classification of hERG blockers and nonblockers

This model used a multitask deep neural network (DNN) to predict the probability that a molecule is a hERG blocker. It was trained using 7889 compounds with experimental data available (IC50). The checkpoints of the pretrained model were not available; therefore, we re-trained the model using the same method but without mol2vec featurization. Molecule featurization was instead done with Morgan fingerprints. Six models were tested, with several thresholds for negative decoys (10, 20, 40, 60, 80 and

Compound
Single
Probability
Single
Float
Retrained
Toxicity
hERG
Cardiotoxicity
https://github.com/ersilia-os/eos30gr
https://pubs.acs.org/doi/full/10.1021/acs.jcim.8b00769
https://github.com/ChengF-Lab/deephERG
None
Probability of hERG blockade. Actives are defined as IC50<10, inactives are defined as IC50>80
Classification
azycn
https://github.com/azycn
https://hub.docker.com/r/ersiliaos/eos30gr
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos30gr.zip
Local
22/7/2022
Q3
2022
aizynthfinder
Ready
Retrosynthesis planning

A tool for planning retrosynthesis of a target molecule based on template reactions and a stock of precursors. The algorithm breaks down the input molecule into purchasable blocks until it has been completely solved.

Compound
Single
Score
Flexible List
String
Float
Pretrained
Synthetic accessibility
Chemical synthesis
https://github.com/ersilia-os/eos526j
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00472-1
https://github.com/MolecularAI/aizynthfinder
MIT
The fraction of solved precursors and the number of reactions required for synthesis. Close to 1.0 for a solved compound, less than 0.8 for unsolved.
Generative
svolk19-stanford
https://github.com/svolk19-stanford
https://hub.docker.com/r/ersiliaos/eos526j
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos526j.zip
Local
19/7/2022
Q3
2022
selfies
Ready
SELF-referencIng Embedded Strings

String representation of small molecules that is more robust than SMILES, since, by design, all SELFIES strings are valid molecules. It is particularly helpful when applied in generative models, as all the SELFIES proposed are valid molecules. The authors also found that, in generative models, SELFIES produces more diverse molecules than SMILES.

Compound
Single
Compound
Single
String
Pretrained
Chemical notation
Chemical language model
Compound generation
https://github.com/ersilia-os/eos6pbf
https://arxiv.org/pdf/1905.13741
https://github.com/aspuru-guzik-group/selfies
Apache-2.0
String representation of a molecule (SELFIES)
Representation
brosular
https://github.com/brosular
https://hub.docker.com/r/ersiliaos/eos6pbf
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6pbf.zip
Local
14/7/2022
Q3
2022
pkasolver
Ready
Microstate pKa values

This model employs transfer learning with graph neural networks in order to predict micro-state pKa values of small molecules. The model enumerates the molecule's protonation states and predicts its pKa values. It was trained in two phases: first using a large ChEMBL dataset, and then fine-tuning the model on a small training set of molecules with available pKa values. The model in this repository is the pkasolver-light, which does not require an Epik license and is limited to monoprotic molecu

Compound
Single
Experimental value
Single
Float
Pretrained
pKa
ADME
https://github.com/ersilia-os/eos2b6f
https://www.biorxiv.org/content/10.1101/2022.01.20.476787v1
https://github.com/mayrf/pkasolver
MIT
Acidity of a molecule (lower pKa indicates stronger acid)
Regression
svolk19-stanford
https://github.com/svolk19-stanford
https://hub.docker.com/r/ersiliaos/eos2b6f
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2b6f.zip
Local
13/7/2022
Q3
2022
grover-qm8
Ready
Electronic spectra and excited state energy

Prediction of the electronic spectra and excited state energy of small molecules. The training set is the QM8 dataset from MoleculeNet, where the electronic properties have been calculated by multiple quantum mechanical methods. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Compound
Single
Other value
List
Float
Pretrained
MoleculeNet
Chemical graph model
Quantum properties
https://github.com/ersilia-os/eos3xip
https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html
https://github.com/tencent-ailab/grover
MIT
Predicted electronic spectra and excited state energy
Regression
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos3xip
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3xip.zip
Local
Yes
13/7/2022
Q3
2022
grover-qm7
Ready
Atomization energy of small molecules

The model predicts the atomization energy of a molecule. It has been trained using the QM7 dataset from MoleculeNet, a subset of GDB13 containing all molecules of up to 23 atoms (including up to 7 heavy atoms: C, N, O and S). This dataset contains the computed atomization energy of 7165 molecules. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Compound
Single
Other value
Single
Float
Pretrained
MoleculeNet
Chemical graph model
Quantum properties
https://github.com/ersilia-os/eos6o0z
https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html
https://github.com/tencent-ailab/grover
MIT
Atomization energy of the molecule
Regression
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos6o0z
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6o0z.zip
Local
Yes
13/7/2022
Q3
2022
grover-lipo
Ready
Octanol/water distribution coefficient

Prediction of the octanol/water distribution coefficient (logD at pH 7.4), trained using the Lipophilicity dataset from MoleculeNet. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Compound
Single
Experimental value
Single
Float
Pretrained
MoleculeNet
Lipophilicity
ADME
LogD
Chemical graph model
https://github.com/ersilia-os/eos85a3
https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html
https://github.com/tencent-ailab/grover
MIT
Predicted logD at pH 7.4
Regression
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos85a3
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos85a3.zip
Local
Yes
13/7/2022
Q3
2022
grover-esol
Ready
Water solubility

Prediction of water solubility (log solubility in mol per litre) for common organic small molecules, trained using the ESOL dataset from MoleculeNet.

Compound
Single
Experimental value
Single
Float
Pretrained
Solubility
MoleculeNet
ADME
LogS
Chemical graph model
https://github.com/ersilia-os/eos8451
https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html
https://github.com/tencent-ailab/grover
MIT
Log solubility (mol/L)
Regression
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos8451
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8451.zip
Local
Yes
13/7/2022
Q3
2022
grover-freesolv
Ready
Hydration free energy of small molecules in water

Model based on experimental and calculated hydration free energy of small molecules in water, the FreeSolv dataset from MoleculeNet. Hydration free energies are relevant to understand the binding interaction between a molecule (in solution) and its binding site. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Compound
Single
Other value
Single
Float
Pretrained
MoleculeNet
Chemical graph model
Quantum properties
https://github.com/ersilia-os/eos157v
https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html
https://github.com/tencent-ailab/grover
MIT
Calculated hydration free energy in kcal/mol
Regression
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos157v
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos157v.zip
Local
Yes
13/7/2022
Q3
2022
grover-toxcast
Ready
ToxCast toxicity panel

Prediction across the ToxCast toxicity panel, containing hundreds of toxicity outcomes, as part of the MoleculeNet benchmark. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Compound
Single
Probability
List
Float
Pretrained
Toxicity
ToxCast
Chemical graph model
https://github.com/ersilia-os/eos481p
https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html
https://github.com/tencent-ailab/grover
MIT
Probability of toxicity against 617 biological targets
Classification
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos481p
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos481p.zip
Local
Yes
13/7/2022
Q3
2022
grover-bace
Ready
BACE-1 inhibition

Prediction of Beta-secretase 1 (BACE-1) inhibition. BACE-1 is expressed mainly in neurons and has been implicated in the development of Alzheimer's disease. This model has been trained on the BACE dataset from MoleculeNet using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Compound
Single
Probability
Single
Float
Pretrained
Alzheimer
BACE
MoleculeNet
Chemical graph model
https://github.com/ersilia-os/eos2mhp
https://arxiv.org/abs/2007.02835
https://github.com/tencent-ailab/grover
MIT
Probability that the molecule is a BACE-1 inhibitor (using a 0.1 μM cut-off)
Classification
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos2mhp
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2mhp.zip
Local
Yes
13/7/2022
Q3
2022
grover-clintox
Ready
Toxicity at clinical trial stage

Using the ClinTox dataset from MoleculeNet, the authors trained a classification model to predict the likelihood of failure in clinical trials due to toxicity. The dataset has been built using FDA-approved drugs (non-toxic) and a set of drugs that have failed at advanced clinical trial stages. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Compound
Single
Probability
List
Float
Pretrained
Toxicity
MoleculeNet
Chemical graph model
Side effects
https://github.com/ersilia-os/eos6fza
https://arxiv.org/abs/2007.02835
https://github.com/tencent-ailab/grover
MIT
Probability that a molecule is approved by the FDA and probability that a molecule shows toxicity in clinical trials
Classification
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos6fza
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6fza.zip
Local
Yes
13/7/2022
Q3
2022
grover-tox21
Ready
Predicts activity of compounds across the Tox21 panel

Predicts activity of compounds in the Tox21 toxicity panel, comprising 12 toxicity pathways, as part of the MoleculeNet benchmark datasets. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details of the molecular featurization step with GROVER).

Compound
Single
Probability
List
Float
Pretrained
Tox21
Toxicity
Chemical graph model
https://github.com/ersilia-os/eos5smc
https://papers.nips.cc/paper/2020/file/94aef38441efa3380a3bed3faf1f9d5d-Paper.pdf
https://github.com/tencent-ailab/grover
MIT
Toxicity measurements against 12 biological targets
Classification
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos5smc
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5smc.zip
Local
Yes
12/7/2022
Q3
2022
sa-score
Ready
Synthetic accessibility score

Estimation of the synthetic accessibility score (SAScore) of drug-like molecules based on molecular complexity and fragment contributions. The fragment contributions are based on a 1M sample from PubChem and the molecular complexity is based on the presence/absence of non-standard structural features. It has been validated by comparing the SAScore with the estimates of expert medicinal chemists for 40 molecules (r2 = 0.89). The SAScore has been contributed to the RDKit package.

Compound
Single
Score
Single
Float
Pretrained
Synthetic accessibility
Chemical synthesis
https://github.com/ersilia-os/eos9ei3
https://jcheminf.biomedcentral.com/articles/10.1186/1758-2946-1-8
https://github.com/rdkit/rdkit/tree/master/Contrib/SA_Score
BSD-3.0
Low scores indicate higher synthetic accessibility
Regression
miquelduranfrigola
https://github.com/miquelduranfrigola
https://eos9ei3-tkreo.ondigitalocean.app/
https://hub.docker.com/r/ersiliaos/eos9ei3
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9ei3.zip
https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos9ei3
Online
10/7/2022
Q3
2022
chemtb
Ready
Mycobacterium tuberculosis inhibitor prediction

Identification of active molecules against Mycobacterium tuberculosis using an ensemble of data from ChEMBL25 (Target IDs 360, 2111188 and 2366634). The final model is a stacking model integrating four algorithms, including support vector machine, random forest, extreme gradient boosting and deep neural networks.

Compound
Single
Probability
Single
Float
Pretrained
M.tuberculosis
IC50
Tuberculosis
Antimicrobial activity
https://github.com/ersilia-os/eos46ev
https://academic.oup.com/bib/article-abstract/22/5/bbab068/6209685
http://cadd.zju.edu.cn/chemtb/
None
Probability of M.tb inhibition (measured as IC50 at cut-off 5 μM)
Classification
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos46ev
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos46ev.zip
Local
Yes
28/6/2022
Q2
2022
ssl-gcn-tox21
Ready
Toxicity prediction across the Tox21 panel with semi-supervised learning

Toxicity prediction across the Tox21 panel from MoleculeNet, comprising 12 toxicity pathways. The model uses the Mean Teacher Semi-Supervised Learning (MT-SSL) approach to overcome the low number of data points experimentally annotated for toxicity tasks. For the MT-SSL, Tox21 (7,831 compounds and 12 different endpoints) was used as labeled data and a selection of 50K compounds from other MoleculeNet datasets was used as unlabeled data.

Compound
Single
Probability
List
Float
Pretrained
Tox21
Toxicity
MoleculeNet
https://github.com/ersilia-os/eos69p9
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00570-8
https://github.com/chen709847237/SSL-GCN
None
Probability of toxicity across 12 tasks defined in Tox21
Classification
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos69p9
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos69p9.zip
Local
16/6/2022
Q2
2022
coprinet-molecule-price
Ready
Small molecule price prediction

CoPriNet has been trained on 2D graph representations of small molecules with their associated price in the Mcule catalog. The predicted price provides a better overview of the compound availability than standard synthetic accessibility scores or retrosynthesis tools. The Mcule catalog is proprietary but the trained model as well as the test dataset (100K) are publicly available.

Compound
Single
Other value
Single
Float
Pretrained
Price
Compound generation
Chemical synthesis
https://github.com/ersilia-os/eos7a45
https://pubs.rsc.org/en/content/articlelanding/2023/dd/d2dd00071g
https://github.com/oxpig/CoPriNet
MIT
Price value prediction
Regression
anamika-yadav99
https://github.com/anamika-yadav99
https://hub.docker.com/r/ersiliaos/eos7a45
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7a45.zip
Local
28/3/2022
Q1
2022
deepfl-logp
Ready
Membrane permeability of fluorescent probes

A deep neural network was trained to predict the LogP value of small molecules and fluorescent probes using an experimentally annotated dataset of >13k molecules (OPERA). This dataset was complemented with fluorescent probes to improve the model accuracy in this space. Probes predicted impermeant to cell membranes consistently showed experimental LogP <1.

Compound
Single
Experimental value
Single
Float
Pretrained
Permeability
ADME
LogP
https://github.com/ersilia-os/eos65rt
https://www.nature.com/articles/s41598-021-86460-3.epdf?sharing_token=zmYZd6qpwnDwc8tCOYGGf9RgN0jAjWel9jnR3ZoTv0OXuXXr_ZS6VuKQMyMJiA3PeIcqAJZTcpcNZJHblyChkQ2eTpzGXq23YsIcFlG8ayuEptKCJ1DeyIRGrh9O2d5JvvGGB9qG8cXgAuy_k-e1ncAMkAzpTegmR0XUbnftjv0%3D
https://github.com/k-soliman/DeepFl-LogP
GPL-3.0
LogP values of > 1 indicate membrane permeability
Regression
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos65rt
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos65rt.zip
Local
10/11/2021
Q4
2021
passive-permeability
Ready
Passive permeability based on simulations

Using coarse-grained (CG) models, where several atoms are aggregated into a single bead, the authors obtain a set of 500,000 compounds with their simulated permeability across a single-component DOPC lipid bilayer. With this approach, the authors are able to cover a large and representative portion of the chemical space. We have used the data generated in this publication to train a simple regression model to predict compound permeability.

Compound
Single
Experimental value
Single
Float
In-house
Permeability
ADME
Papp
https://github.com/ersilia-os/eos2hbd
https://pubs.acs.org/doi/full/10.1021/acscentsci.8b00718?ref=recommended
https://pubs.acs.org/doi/full/10.1021/acscentsci.8b00718?ref=recommended
None
Permeability coefficient (P). Cut-off: 6
Regression
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos2hbd
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2hbd.zip
Local
Yes
10/11/2021
Q4
2021
pampa-permeability
Ready
PAMPA effective permeability

The authors provide a dataset of 200 small molecules and their experimentally measured permeability in a PAMPA assay. Using this data, we have trained a model that predicts the logarithm of the effective permeability coefficient.

Compound
Single
Experimental value
Single
Float
In-house
Permeability
ADME
LogP
https://github.com/ersilia-os/eos97yu
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6651837/
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6651837/
None
logPe
Regression
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos97yu
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos97yu.zip
Local
Yes
10/11/2021
Q4
2021
natural-product-fingerprint
Ready
Natural product fingerprint

The model uses a combination of two multilayer perceptron networks (baseline and auxiliary) and an autoencoder-like network to extract natural-product specific fingerprints that outperform traditional methods for molecular representation. The training sets correspond to the COCONUT database (NP) and the ZINC database (synthetic).

Compound
Single
Descriptor
List
String
Pretrained
Natural product
Fingerprint
Descriptor
https://github.com/ersilia-os/eos6tg8
https://www.sciencedirect.com/science/article/pii/S2001037021003226?via%3Dihub#f0010
https://github.com/kochgroup/neural_npfp
None
Descriptor of a molecule
Representation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos6tg8
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6tg8.zip
Local
3/11/2021
Q4
2021
maip-malaria-surrogate
Ready
MAIP distillation: antimalarial potential prediction

Prediction of the antimalarial potential of small molecules. This model was originally trained on proprietary data from various sources, up to a total of >7M compounds. The training sets belong to Evotec, Johns Hopkins, MRCT, MMV - St. Jude, AZ, GSK, and St. Jude Vendor Library. In this implementation, we have used a teacher-student approach to train a surrogate model based on ChEMBL data (2M molecules) to provide a lite downloadable version of the original MAIP

Compound
Single
Score
Single
Float
Retrained
P.falciparum
Malaria
Antimicrobial activity
https://github.com/ersilia-os/eos2gth
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00487-2
https://www.ebi.ac.uk/chembl/maip/
None
Higher score indicates higher antimalarial potential
Classification
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos2gth
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2gth.zip
Local
No
2/11/2021
Q4
2021
syba-synthetic-accessibility
Ready
Bayesian prediction of synthetic accessibility

SYBA uses a fragment-based approach to classify whether a molecule is easy or hard to synthesize, and it can also be used to analyze the contribution of individual fragments to the total synthetic accessibility. The easy-to-synthesize dataset is an extract of the ZINC purchasable compounds, and the hard-to-synthesize dataset is generated using a Nonpher approach (introducing small molecular perturbations to transform molecules into more complex compounds). The fragments are calculated with ECFP8

Compound
Single
Score
Single
Float
Pretrained
Synthetic accessibility
Chemical synthesis
https://github.com/ersilia-os/eos7pw8
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00439-2
https://github.com/lich-uct/syba
GPL-3.0
Higher score indicates higher confidence that the molecule is synthetically accessible
Regression
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos7pw8
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7pw8.zip
Local
25/10/2021
Q4
2021
natural-product-score
Ready
Natural product score

A simple score to distinguish natural products and natural product-like compounds from synthetic compounds. The score was calculated using an analysis of the structural features that distinguish natural products (NP) from synthetic molecules. NP structures were obtained from the CRC Dictionary of Natural Products and synthetic molecules belong to an in-house collection. This method has been contributed to the RDKit package; Ersilia simply implements the RDKit NP_Score.

Compound
Single
Score
List
Float
Pretrained
Natural product
Drug-likeness
https://github.com/ersilia-os/eos8ioa
http://pubs.acs.org/doi/abs/10.1021/ci700286x
https://github.com/rdkit/rdkit/tree/master/Contrib/NP_Score
BSD-3.0
Higher score indicates higher natural product likeness
Regression
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos8ioa
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8ioa.zip
Local
19/10/2021
Q4
2021
natural-product-likeness
Ready
Natural product likeness score

The model is a derivation of the natural product fingerprint (eos6tg8). In addition to generating specific natural product fingerprints, the activation value of the neuron that predicts if a molecule is a natural product or not can be used as a NP-likeness score. The method outperforms the NP_Score implemented in RDKit.

Compound
Single
Score
Single
Float
Pretrained
Natural product
Drug-likeness
https://github.com/ersilia-os/eos9yui
https://www.sciencedirect.com/science/article/pii/S2001037021003226?
https://github.com/kochgroup/neural_npfp
None
Higher score indicates higher natural product likeness
Regression
miquelduranfrigola
https://github.com/miquelduranfrigola
https://eos9yui-7xpw3.ondigitalocean.app/
https://hub.docker.com/r/ersiliaos/eos9yui
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9yui.zip
https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos9yui
Online
19/10/2021
Q4
2021
retrosynthetic-accessibility
Ready
Retrosynthetic accessibility score

Retrosynthetic accessibility score based on the computer-aided synthesis planning tool AiZynthFinder. The authors selected a ChEMBL subset of 200,000 molecules and checked whether AiZynthFinder could identify a synthetic route for each of them. These data were used to train a classifier that computes 4500 times faster than the underlying AiZynthFinder. For molecules outside the applicability domain, such as those in the GDB database, the model needs to be fine-tuned to the use case.

Compound
Single
Score
Single
Float
Pretrained
Synthetic accessibility
Chemical synthesis
https://github.com/ersilia-os/eos2r5a
https://pubs.rsc.org/en/content/articlelanding/2021/sc/d0sc05401a
https://github.com/reymond-group/RAscore
MIT
Higher score indicates easier retrosynthetic accessibility
Regression
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos2r5a
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2r5a.zip
Local
19/10/2021
Q4
2021
soltrannet-aqueous-solubility
Ready
Aqueous solubility prediction

Fast aqueous solubility prediction based on the Molecule Attention Transformer (MAT). The authors used AqSolDB to fine-tune the MAT network to solubility prediction, achieving competitive scores in the Second Challenge to Predict Aqueous Solubility (SC2).

Compound
Single
Experimental value
Single
Float
Pretrained
Solubility
ADME
LogS
https://github.com/ersilia-os/eos6oli
https://pubs.acs.org/doi/10.1021/acs.jcim.1c00331
https://github.com/gnina/SolTranNet
Apache-2.0
Predicted LogS (log of the solubility)
Regression
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos6oli
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6oli.zip
Local
Yes
19/10/2021
Q4
2021
molgrad-ppb
Ready
Coloring molecules for plasma protein binding prediction

By combining a Message-Passing Graph Neural Network (MPGNN) and a Forward fully connected Neural Network (FNN) with an integrated gradients explainable artificial intelligence (XAI) method, the authors developed MolGrad and tested it on a number of ADME predictive tasks. MolGrad incorporates explainable features to facilitate interpretation of the predictions. In this model, they train MolGrad with data from a Plasma-protein binding assay (PPB) to predict the fraction bound in plasma of small mo

Compound
Single
Experimental value
Single
Float
Pretrained
ADME
Fraction bound
Chemical graph model
https://github.com/ersilia-os/eos6ao8
https://pubs.acs.org/doi/10.1021/acs.jcim.0c01344
https://github.com/josejimenezluna/molgrad/
AGPL-3.0
Fraction (%) bound in plasma
Regression
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos6ao8
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos6ao8.zip
Local
Yes
19/10/2021
Q4
2021
molgrad-herg
Ready
Coloring molecules for hERG blockade

By combining a Message-Passing Graph Neural Network (MPGNN) and a Forward fully connected Neural Network (FNN) with an integrated gradients explainable artificial intelligence (XAI) method, the authors developed MolGrad and tested it on a number of ADME predictive tasks. MolGrad incorporates explainable features to facilitate interpretation of the predictions. In this model, they train MolGrad with a dataset of hERG channel blockers/non-blockers to predict the cardiotoxicity of small molecules (I

Compound
Single
Experimental value
Single
Float
Pretrained
hERG
Toxicity
Cardiotoxicity
Chemical graph model
https://github.com/ersilia-os/eos43at
https://pubs.acs.org/doi/10.1021/acs.jcim.0c01344
https://github.com/josejimenezluna/molgrad/
AGPL-3.0
pIC50 of hERG inhibition
Regression
miquelduranfrigola
https://github.com/miquelduranfrigola
https://eos43at-zqx9x.ondigitalocean.app/
https://hub.docker.com/r/ersiliaos/eos43at
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos43at.zip
https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos43at
Online
Yes
19/10/2021
Q4
2021
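The molgrad-herg entry above reports its output as a pIC50. As a reminder of the scale, the sketch below converts an IC50 in micromolar to pIC50 using the standard definition pIC50 = -log10(IC50 in mol/L). The function name is hypothetical and the code is not part of MolGrad.

```python
import math


def ic50_um_to_pic50(ic50_um: float) -> float:
    """Convert an IC50 given in micromolar to pIC50.

    pIC50 = -log10(IC50 in mol/L) = 6 - log10(IC50 in uM).
    """
    if ic50_um <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 6.0 - math.log10(ic50_um)


# A few illustrative concentrations, in uM.
for ic50 in (0.1, 1.0, 10.0):
    print(ic50, ic50_um_to_pic50(ic50))
```

On this scale, higher values mean more potent inhibition: a tenfold drop in IC50 raises the pIC50 by one unit.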
molgrad-caco2
Ready
Coloring molecules for Caco-2 cell permeability

By combining a Message-Passing Graph Neural Network (MPGNN) and a Forward fully connected Neural Network (FNN) with an integrated gradients explainable artificial intelligence (XAI) method, the authors developed MolGrad and tested it on a number of ADME predictive tasks. MolGrad incorporates explainable features to facilitate interpretation of the predictions. This model has been trained using experimental data on the permeability of molecules across Caco-2 cell membranes (Papp, cm s-1).

Compound
Single
Experimental value
Single
Float
Pretrained
Permeability
ADME
Papp
Chemical graph model
https://github.com/ersilia-os/eos1af5
https://pubs.acs.org/doi/10.1021/acs.jcim.0c01344
https://github.com/josejimenezluna/molgrad/
AGPL-3.0
Log10 of the passive permeability (cm s-1)
Regression
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos1af5
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1af5.zip
Local
Yes
19/10/2021
Q4
2021
cardiotoxnet-herg
Ready
Ligand-based prediction of hERG blockade

A robust predictor for hERG channel blockade based on an ensemble of five deep learning models. The authors have collected a dataset from public sources, such as BindingDB and ChEMBL on hERG blockers and non-blockers. The cut-off for hERG blockade was set at IC50 < 10 uM for the classifier.

Compound
Single
Probability
Single
Float
Pretrained
hERG
Toxicity
Cardiotoxicity
https://github.com/ersilia-os/eos2ta5
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00541-z
https://github.com/Abdulk084/CardioTox
None
Probability that the compound inhibits hERG (IC50 < 10 uM)
Classification
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos2ta5
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2ta5.zip
Local
18/10/2021
Q4
2021
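As a rough illustration of the cardiotoxnet-herg entry above, the sketch below labels a compound as a blocker using the stated IC50 < 10 uM cut-off and combines five model outputs into a single probability. The function names and example values are hypothetical, and plain averaging is an assumption: the entry says only that the predictor is an ensemble of five deep learning models, not how their outputs are combined.

```python
from statistics import mean

# Cut-off taken from the entry above: blockers have IC50 < 10 uM.
HERG_IC50_CUTOFF_UM = 10.0


def is_herg_blocker(ic50_um: float) -> bool:
    """Return True when a measured IC50 (in uM) falls below the cut-off."""
    return ic50_um < HERG_IC50_CUTOFF_UM


def ensemble_probability(member_probabilities: list[float]) -> float:
    """Average the member outputs (an assumed combination rule)."""
    if len(member_probabilities) != 5:
        raise ValueError("expected one probability per ensemble member (five)")
    return mean(member_probabilities)


# Hypothetical outputs of the five models for one compound.
probs = [0.9, 0.8, 0.7, 0.85, 0.75]
print(ensemble_probability(probs))
print(is_herg_blocker(2.5), is_herg_blocker(10.0))
```

The strict inequality means a compound measured at exactly 10 uM counts as a non-blocker under this reading of the cut-off.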
molgrad-cyp3a4
Ready
Coloring molecules for interaction with CYP3A4

By combining a Message-Passing Graph Neural Network (MPGNN) and a Forward fully connected Neural Network (FNN) with an integrated gradients explainable artificial intelligence (XAI) method, the authors developed MolGrad and tested it on a number of ADME predictive tasks. MolGrad incorporates explainable features to facilitate interpretation of the predictions. This model has been trained using a ChEMBL dataset of CYP450 3A4 inhibitors (0) and non-inhibitors (1).

Compound
Single
Probability
Single
Float
Pretrained
CYP450
ADME
Chemical graph model
https://github.com/ersilia-os/eos96ia
https://pubs.acs.org/doi/10.1021/acs.jcim.0c01344
https://github.com/josejimenezluna/molgrad/
GPL-3.0
Probability that the molecule is metabolized by CYP3A4 (cut-off: 10 uM)
Classification
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos96ia
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos96ia.zip
Local
Yes
18/10/2021
Q4
2021
mycpermcheck
Ready
Membrane permeability in Mycobacterium tuberculosis

MycPermCheck predicts the potential of a compound to permeate the Mycobacterium tuberculosis cell membrane based on its physicochemical properties.

Compound
Single
Probability
Single
Float
Pretrained
Permeability
M.tuberculosis
ADME
Tuberculosis
https://github.com/ersilia-os/eos8d8a
https://academic.oup.com/bioinformatics/article/29/1/62/272745
https://www.mycpermcheck.aksotriffer.pharmazie.uni-wuerzburg.de/index.html
MIT
Probability of permeability across the M.tb cell wall
Classification
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos8d8a
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8d8a.zip
Local
Yes
14/10/2021
Q4
2021
padel
Ready
PaDEL small molecule descriptors

PaDEL is a commonly used molecular descriptor calculator. It computes 1875 molecular descriptors (1444 1D and 2D descriptors, 431 3D descriptors) and 12 types of fingerprints for small molecule representation. Originally developed in Java, it is provided here through PaDELPy, its Python wrapper.

Compound
Single
Descriptor
List
Float
Pretrained
Descriptor
https://github.com/ersilia-os/eos7asg
https://onlinelibrary.wiley.com/doi/10.1002/jcc.21707
https://github.com/ecrl/padelpy
MIT
Vector representation of a molecule
Representation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos7asg
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7asg.zip
Local
27/9/2021
Q3
2021
smiles-transformer
Ready
SMILES transformer descriptor

Molecular embedding based on natural language processing. It converts SMILES into fingerprints using an unsupervised model pre-trained on a very large SMILES dataset from ChEMBL. The transformer is particularly well-suited for low-data drug discovery.

Compound
Single
Descriptor
List
Float
Pretrained
Chemical language model
Descriptor
Embedding
https://github.com/ersilia-os/eos2lm8
https://arxiv.org/abs/1911.04738
https://github.com/DSPsleeporg/smiles-transformer
MIT
Vector representation of small molecules
Representation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos2lm8
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2lm8.zip
Local
22/9/2021
Q3
2021
mordred
Ready
Mordred chemical descriptors

A set of ca. 1,800 chemical descriptors, including both RDKit and original modules. It is comparable to the well-known PaDEL-Descriptor (see eos7asg), but has shorter calculation times and can process larger molecules.

Compound
Single
Descriptor
List
Float
Pretrained
Descriptor
https://github.com/ersilia-os/eos78ao
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-018-0258-y
https://github.com/mordred-descriptor/mordred
BSD-3.0
Vector representation of a molecule
Representation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos78ao
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos78ao.zip
Local
17/9/2021
Q3
2021
rdkit-fingerprint
Ready
Path-based fingerprint

Path-based fingerprints calculated with the RDKit function Chem.RDKFingerprint. It is inspired by the Daylight fingerprint. As explained in the RDKit Book, the fingerprinting algorithm identifies all subgraphs in the molecule within a particular range of sizes, hashes each subgraph to generate a raw bit ID, mods that raw bit ID to fit in the assigned fingerprint size, and then sets the corresponding bit.

Compound
Single
Descriptor
List
Float
Pretrained
Fingerprint
Descriptor
https://github.com/ersilia-os/eos7jio
https://www.rdkit.org/docs/RDKit_Book.html
https://github.com/rdkit/rdkit
BSD-3.0
Vector representation of small molecules
Representation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos7jio
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7jio.zip
Local
17/9/2021
Q3
2021
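The hash, fold, and set steps described in the rdkit-fingerprint entry above can be sketched in plain Python. This is a toy illustration only, not RDKit's algorithm: the subgraph strings are made up by hand instead of being enumerated from a molecule, SHA-1 stands in for RDKit's internal hash, and `fold_into_bits` is a hypothetical name.

```python
import hashlib


def fold_into_bits(subgraph_keys, n_bits=2048):
    """Hash each subgraph key, fold it into n_bits with a modulo, set the bit."""
    bits = [0] * n_bits
    for key in subgraph_keys:
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        raw_bit_id = int.from_bytes(digest[:8], "big")  # raw, unbounded bit ID
        bits[raw_bit_id % n_bits] = 1  # fold into the fingerprint size and set
    return bits


# Hypothetical path strings standing in for the enumerated subgraphs of ethanol.
toy_paths = ["C", "O", "C-C", "C-O", "C-C-O"]
fingerprint = fold_into_bits(toy_paths)
print(len(fingerprint), sum(fingerprint))
```

Because of the modulo step, two different subgraphs can land on the same bit, so the number of set bits is at most the number of distinct subgraphs.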
molbert
Ready
MolBERT chemical language transformer

Molecular representation using the BERT language Transformer. The model has been pre-trained on the GuacaMol dataset (~1.6M molecules from ChEMBL), and can be fine-tuned to the desired QSAR tasks. It has been benchmarked on MoleculeNet.

Compound
Single
Descriptor
List
Float
Pretrained
Chemical language model
Embedding
Descriptor
https://github.com/ersilia-os/eos2thm
https://arxiv.org/abs/2011.13230
https://github.com/BenevolentAI/MolBERT
MIT
Embedding representation of a molecule
Representation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos2thm
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos2thm.zip
Local
17/9/2021
Q3
2021
rdkit-descriptors
Ready
Physicochemical descriptors available from RDKit

A set of 200 physicochemical descriptors available from RDKit, including molecular weight, solubility and druggability parameters. We have used the Descriptastorus selection of RDKit descriptors for simplicity.

Compound
Single
Descriptor
List
Float
Pretrained
Descriptor
https://github.com/ersilia-os/eos8a4x
https://www.rdkit.org/docs/RDKit_Book.html
https://github.com/bp-kelley/descriptastorus
Proprietary
Vector representation of small molecules
Representation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos8a4x
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8a4x.zip
Local
17/9/2021
Q3
2021
avalon
Ready
Avalon fingerprint

Avalon is a path-based substructure key fingerprint (1024 bits), developed for substructure screen-out when searching. It is part of the Avalon Chemoinformatics Toolkit and has also been implemented as an external RDKit tool.

Compound
Single
Descriptor
List
Integer
Pretrained
Fingerprint
https://github.com/ersilia-os/eos8h6g
https://pubs.acs.org/doi/full/10.1021/ci050413p
https://github.com/rdkit/rdkit/tree/master/External/AvalonTools
BSD-3.0
Bitvector representation of a molecule
Representation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos8h6g
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos8h6g.zip
Local
14/9/2021
Q3
2021
molecular-weight
Ready
Molecular weight

The model is simply an implementation of the function Descriptors.MolWt of the chemoinformatics package RDKit. It takes as input a small molecule (SMILES) and calculates its molecular weight in g/mol.

Compound
Single
Other value
Single
Float
Pretrained
Molecular weight
https://github.com/ersilia-os/eos3b5e
https://www.rdkit.org/docs/RDKit_Book.html
https://github.com/rdkit/rdkit
BSD-3.0
Calculated molecular weight (g/mol)
Regression
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos3b5e
AMD64
CPU
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3b5e.zip
Local
13/9/2021
Q3
2021
morgan-counts
Ready
Morgan counts fingerprints

Morgan fingerprints, or extended-connectivity fingerprints (ECFP), are among the most widely used molecular representations. They are circular representations (starting from each atom, they encode the atoms found within a radius n) and can have thousands of features. This implementation uses the RDKit package with radius 3 and 2048 dimensions. Because ECFP names refer to the diameter, radius 3 corresponds to ECFP6 rather than ECFP4.

Compound
Single
Descriptor
List
Integer
Pretrained
Fingerprint
Descriptor
https://github.com/ersilia-os/eos5axz
https://www.rdkit.org/docs/RDKit_Book.html
https://github.com/rdkit/rdkit
BSD-3.0
Vector representation of a molecule
Representation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos5axz
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5axz.zip
Local
30/8/2021
Q3
2021
whales-descriptor
Ready
Holistic molecular descriptors for scaffold hopping

Weighted Holistic Atom Localization and Entity Shape (WHALES) is a descriptor based on 3D structure that facilitates natural product featurization. It is aimed at scaffold hopping from natural products to synthetic compounds.

Compound
Single
Descriptor
List
Float
Pretrained
Natural product
Descriptor
https://github.com/ersilia-os/eos3ae6
https://www.nature.com/articles/s42004-018-0043-x
https://github.com/ETHmodlab/scaffold_hopping_whales
MIT
Vector representation of a molecule
Representation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos3ae6
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3ae6.zip
Local
15/7/2021
Q3
2021
grover-embedding
Ready
Large-scale graph transformer

GROVER is a self-supervised graph neural network for molecular representation, pre-trained on 10 million unlabelled molecules from ChEMBL and ZINC15. The model provided here is GROVERlarge. GROVER has also been fine-tuned to predict several activities from the MoleculeNet benchmark, consistently outperforming other state-of-the-art methods on several benchmark datasets.

Compound
Single
Descriptor
List
Float
Pretrained
Chemical graph model
Embedding
Descriptor
https://github.com/ersilia-os/eos7w6n
https://papers.nips.cc/paper/2020/file/94aef38441efa3380a3bed3faf1f9d5d-Paper.pdf
https://github.com/tencent-ailab/grover
MIT
Embedding representation of a molecule
Representation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos7w6n
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7w6n.zip
Local
Yes
2/7/2021
Q3
2021
cc-signaturizer
Ready
Chemical Checker signaturizer

A set of 25 Chemical Checker bioactivity signatures (including 2D and 3D fingerprints, scaffold, binding, crystals, side effects, cell bioassays, etc.) to capture properties of compounds beyond their structures. Each signature has a length of 128 dimensions. In total, there are 3200 dimensions. The signaturizer is periodically updated. We use the 2020-02 version of the signaturizer.

Compound
Single
Descriptor
List
Float
Pretrained
Descriptor
Bioactivity profile
Embedding
https://github.com/ersilia-os/eos4u6p
https://www.nature.com/articles/s41467-021-24150-4
http://gitlabsbnb.irbbarcelona.org/packages/signaturizer
MIT
2D projection of bioactivity signatures
Representation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos4u6p
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4u6p.zip
Local
1/7/2021
Q3
2021
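The cc-signaturizer entry above describes an output of 25 signatures of 128 dimensions each (3200 in total). The sketch below splits such a vector back into its per-signature blocks. The A1 to E5 names follow the Chemical Checker's levels and sublevels, but the assumption that the blocks are concatenated in that order is not stated in the entry, and `split_signatures` is a hypothetical helper.

```python
SIGNATURE_LENGTH = 128
# The Chemical Checker organises its 25 spaces as levels A-E, sublevels 1-5.
SPACE_NAMES = [f"{level}{sub}" for level in "ABCDE" for sub in range(1, 6)]


def split_signatures(vector):
    """Split a concatenated vector into one 128-dimensional block per space.

    Assumes the blocks are concatenated in A1, A2, ..., E5 order.
    """
    expected = len(SPACE_NAMES) * SIGNATURE_LENGTH
    if len(vector) != expected:
        raise ValueError(f"expected {expected} values, got {len(vector)}")
    return {
        name: vector[i * SIGNATURE_LENGTH:(i + 1) * SIGNATURE_LENGTH]
        for i, name in enumerate(SPACE_NAMES)
    }


# A placeholder vector of the right length, not a real signature.
blocks = split_signatures(list(range(3200)))
print(len(blocks), len(blocks["A1"]), len(blocks["E5"]))
```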
cdd-descriptor
Ready
Continuous and data-driven descriptors

Low-dimensional continuous descriptor based on a neural machine translation model. The model has been trained to translate an IUPAC molecular representation into its SMILES. The intermediate continuous vector produced when encoding the IUPAC name is a representation of the molecule, containing all the information needed to generate the output sequence (SMILES). This model has been pretrained on a large dataset combining ChEMBL and ZINC.

Compound
Single
Descriptor
List
Float
Pretrained
Descriptor
Chemical language model
https://github.com/ersilia-os/eos7a04
https://pubs.rsc.org/en/content/articlelanding/2019/sc/c8sc04175j
https://github.com/jrwnter/cddd
MIT
Embedding representation of a molecule
Representation
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos7a04
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos7a04.zip
Local
1/7/2021
Q3
2021
grover-sider
Ready
Adverse Drug Reactions

The model predicts the putative adverse drug reactions (ADRs) of a molecule, using the SIDER database (MoleculeNet), which contains pairs of marketed drugs and their described ADRs. This model has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details on the molecular featurization step with GROVER).

Compound
Single
Probability
List
Float
Pretrained
Toxicity
MoleculeNet
Side effects
https://github.com/ersilia-os/eos77w8
https://arxiv.org/abs/2007.02835
https://github.com/tencent-ailab/grover
MIT
Predicted ADRs classified in 27 groups
Classification
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos77w8
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos77w8.zip
Local
Yes
4/6/2021
Q2
2021
grover-bbbp
Ready
Blood-brain barrier penetration

This model predicts the blood-brain barrier (BBB) penetration potential of small molecules, using as training data the curated MoleculeNet benchmark containing 2000 experimental data points. It has been trained using the GROVER transformer (see eos7w6n or grover-embedding for details on the molecular featurization step with GROVER).

Compound
Single
Probability
Single
Float
Pretrained
Permeability
MoleculeNet
Chemical graph model
Alzheimer
https://github.com/ersilia-os/eos1amr
https://papers.nips.cc/paper/2020/hash/94aef38441efa3380a3bed3faf1f9d5d-Abstract.html
https://github.com/tencent-ailab/grover
MIT
Probability that a molecule crosses the blood brain barrier
Classification
Amna-28
https://github.com/Amna-28
https://hub.docker.com/r/ersiliaos/eos1amr
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1amr.zip
Local
Yes
4/6/2021
Q2
2021
chembl-multitask-descriptor
Ready
Multi-target prediction based on ChEMBL data

This is a ligand-based target prediction model developed by the ChEMBL team. They trained the model using pairs of small molecules and their protein targets, and produced a multitask predictor. The activity thresholds were determined by protein family (kinases: <= 30 nM, GPCRs: <= 100 nM, nuclear receptors: <= 100 nM, ion channels: <= 10 μM, non-IDG family targets: <= 1 μM). Here we provide the model trained on ChEMBL_28, which showed an accuracy of 85%.

Compound
Single
Probability
List
Float
Pretrained
Bioactivity profile
Target identification
ChEMBL
https://github.com/ersilia-os/eos1vms
http://chembl.blogspot.com/2019/05/multi-task-neural-network-on-chembl.html
https://github.com/chembl/chembl_multitask_model/
None
Probability that the protein (identified by ChEMBL ID) is a target of the compound
Classification
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos1vms
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos1vms.zip
Local
4/6/2021
Q2
2021
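The family-specific activity thresholds listed in the chembl-multitask-descriptor entry above can be written as a lookup table. The sketch below is only an illustration of that rule: the dictionary keys, the function name, and the choice of nanomolar as the common unit are assumptions, not part of the ChEMBL code.

```python
# Activity thresholds from the entry above, expressed in nM.
ACTIVITY_THRESHOLD_NM = {
    "kinase": 30.0,
    "gpcr": 100.0,
    "nuclear receptor": 100.0,
    "ion channel": 10_000.0,  # 10 uM
    "non-idg": 1_000.0,       # 1 uM
}


def is_active(family: str, activity_nm: float) -> bool:
    """Return True when the measured activity is at or below the family threshold."""
    try:
        threshold = ACTIVITY_THRESHOLD_NM[family.lower()]
    except KeyError:
        raise ValueError(f"unknown protein family: {family!r}") from None
    return activity_nm <= threshold


# Hypothetical measurements, in nM.
print(is_active("kinase", 25.0))
print(is_active("kinase", 50.0))
print(is_active("ion channel", 5_000.0))
```

The same measured potency can therefore count as active for one family and inactive for another, which is why the labels are assigned per family.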
etoxpred
Ready
Toxicity and synthetic accessibility prediction

The eToxPred tool predicts two quantities: the synthetic accessibility (SA) score, or how easy it is to make the molecule in the laboratory, and the toxicity (Tox) score, or the probability that the molecule is toxic to humans. The authors trained and cross-validated both predictors on a large number of datasets, and demonstrated the method's usefulness in building custom virtual libraries.

Compound
Single
Score
Single
Float
Pretrained
Toxicity
Synthetic accessibility
https://github.com/ersilia-os/eos92sw
https://bmcpharmacoltoxicol.biomedcentral.com/articles/10.1186/s40360-018-0282-6
https://github.com/pulimeng/eToxPred
GPL-3.0
Higher scores indicate easier synthetic accessibility and higher toxicity, respectively
Regression
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos92sw
AMD64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos92sw.zip
Local
4/6/2021
Q2
2021
chemprop-sars-cov-inhibition
Ready
SARS-CoV inhibition

This model was developed to support early efforts to identify novel drugs against SARS-CoV-2. It predicts the probability that a small molecule inhibits SARS-3CLpro-mediated peptide cleavage. It was developed using a high-throughput screen against the 3CL protease of SARS-CoV-1, as no data were yet available for the new virus (SARS-CoV-2) causing the COVID-19 pandemic. It uses the ChemProp model.

Compound
Single
Probability
Single
Float
Pretrained
COVID19
Antiviral activity
Sars-CoV-2
Chemical graph model
https://github.com/ersilia-os/eos9f6t
https://www.sciencedirect.com/science/article/pii/S0092867420301021
http://chemprop.csail.mit.edu/checkpoints
MIT
Probability of 3CL protease inhibition (%). The classifier was trained using a threshold of 12% inhibition.
Classification
miquelduranfrigola
https://github.com/miquelduranfrigola
https://hub.docker.com/r/ersiliaos/eos9f6t
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9f6t.zip
Local
Yes
3/6/2021
Q2
2021
chemprop-antibiotic
Ready
Broad spectrum antibiotic activity

Based on a simple E. coli growth inhibition assay, the authors trained a model capable of identifying antibiotic potential in compounds structurally divergent from conventional antibiotic drugs. One of the predicted active molecules, Halicin (SU3327), was experimentally validated in vitro and in vivo. Halicin was originally investigated as a treatment for diabetes.

Compound
Single
Probability
Single
Float
Pretrained
E.coli
IC50
Antimicrobial activity
Chemical graph model
https://github.com/ersilia-os/eos4e40
https://pubmed.ncbi.nlm.nih.gov/32084340/
http://chemprop.csail.mit.edu/checkpoints
MIT
Probability that a compound inhibits E.coli growth. The inhibition threshold was set at 80% growth inhibition in the training set.
Classification
miquelduranfrigola
https://github.com/miquelduranfrigola
https://eos4e40-rovva.ondigitalocean.app/
https://hub.docker.com/r/ersiliaos/eos4e40
AMD64
ARM64
https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos4e40.zip
https://ersilia-app-cubsw.ondigitalocean.app/?model_id=eos4e40
Local
Yes
6/6/2018
Q2
2018
