CPSign - Conformal Prediction for Cheminformatics Modeling
Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Sweden.
Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Sweden; Department of Computer and Systems Sciences, Stockholm University, Sweden; MTM Research Centre, School of Science and Technology, Örebro University, Sweden.
Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Sweden.
Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Sweden; Department of Computer Science, Royal Holloway University of London, UK.
2024 (English). In: Journal of Cheminformatics, E-ISSN 1758-2946, Vol. 16, no 1, article id 75. Article in journal (Refereed), Published
Abstract [en]

Conformal prediction has seen many applications in pharmaceutical science, being able to calibrate outputs of machine learning models and producing valid prediction intervals. We here present the open source software CPSign that is a complete implementation of conformal prediction for cheminformatics modeling. CPSign implements inductive and transductive conformal prediction for classification and regression, and probabilistic prediction with the Venn-ABERS methodology. The main chemical representation is signatures but other types of descriptors are also supported. The main modeling methodology is support vector machines (SVMs), but additional modeling methods are supported via an extension mechanism, e.g. DeepLearning4J models. We also describe features for visualizing results from conformal models including calibration and efficiency plots, as well as features to publish predictive models as REST services. We compare CPSign against other common cheminformatics modeling approaches including random forest, and a directed message-passing neural network. The results show that CPSign produces robust predictive performance with comparative predictive efficiency, with superior runtime and lower hardware requirements compared to neural network based models. CPSign has been used in several studies and is in production-use in multiple organizations. The ability to work directly with chemical input files, perform descriptor calculation and modeling with SVM in the conformal prediction framework, with a single software package having a low footprint and fast execution time makes CPSign a convenient and yet flexible package for training, deploying, and predicting on chemical data. CPSign can be downloaded from GitHub at https://github.com/arosbio/cpsign.
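
As an illustration of the inductive conformal prediction idea mentioned in the abstract, the following is a minimal sketch for regression. It is not CPSign code and does not use the CPSign API. It assumes absolute residuals on a held-out calibration set as the nonconformity score, and the function names and toy numbers are hypothetical.

```python
import math


def icp_interval_halfwidth(calibration_residuals, confidence):
    """Half-width of an inductive (split) conformal prediction interval.

    calibration_residuals: absolute errors |y - y_hat| measured on a
        held-out calibration set (assumed nonconformity score).
    confidence: target coverage level, e.g. 0.8.
    """
    scores = sorted(calibration_residuals)
    n = len(scores)
    # The conformal quantile is the ceil((n + 1) * confidence)-th smallest score.
    rank = math.ceil((n + 1) * confidence)
    if rank > n:
        # Too few calibration examples to support this confidence level.
        return math.inf
    return scores[rank - 1]


def predict_interval(point_prediction, calibration_residuals, confidence):
    """Symmetric interval around the point prediction of any underlying model."""
    half = icp_interval_halfwidth(calibration_residuals, confidence)
    return (point_prediction - half, point_prediction + half)


# Toy calibration residuals (hypothetical values, not from the paper).
residuals = [0.1, 0.4, 0.2, 0.9, 0.3, 0.5, 0.7, 0.6, 0.8]
low, high = predict_interval(2.0, residuals, confidence=0.8)
print(low, high)
```

Under the usual exchangeability assumption, intervals built this way contain the true value with at least the requested confidence. A narrower interval at the same confidence corresponds to what the abstract calls better predictive efficiency.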

Scientific contribution

CPSign provides a single software package that allows users to perform data preprocessing and modeling, and to make predictions directly on chemical structures, using conformal and probabilistic prediction. Building and evaluating new models can be achieved at a high abstraction level, without sacrificing flexibility and predictive performance. This is showcased with a method evaluation against contemporary modeling approaches, where CPSign performs on par with a state-of-the-art deep learning based model.

Place, publisher, year, edition, pages
BioMed Central (BMC), 2024. Vol. 16, no 1, article id 75
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:hj:diva-63307
DOI: 10.1186/s13321-024-00870-9
ISI: 001258657400001
PubMedID: 38943219
Scopus ID: 2-s2.0-85197657994
Local ID: GOA;;1826415
OAI: oai:DiVA.org:hj-63307
DiVA, id: diva2:1826415
Funder
Swedish Research Council, 2020-03731, 2020-01865
Swedish Research Council Formas, 2022-00940
Swedish Cancer Society, 22 2412
EU, Horizon Europe, 101057014
Note

Originally posted on the preprint server bioRxiv on November 22, 2023.

Available from: 2024-01-11. Created: 2024-01-11. Last updated: 2024-07-15. Bibliographically approved.

Authority records

Carlsson, Lars
