Multi-assignment clustering: Machine learning from a biological perspective
School of Bioscience, University of Skövde, Skövde, Sweden.
School of Informatics, University of Skövde, Skövde, Sweden.
Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL). School of Informatics, University of Skövde, Skövde, Sweden. ORCID iD: 0000-0003-2900-9335
Takara Bio Europe AB, Gothenburg, Sweden.
2021 (English). In: Journal of Biotechnology, ISSN 0168-1656, E-ISSN 1873-4863, Vol. 326, p. 1-10. Article in journal (Refereed), Published
Abstract [en]

A common approach for analyzing large-scale molecular data is to cluster objects sharing similar characteristics. This assumes that genes with highly similar expression profiles are likely participating in a common molecular process. Biological systems are extremely complex and challenging to understand, with proteins having multiple functions that sometimes need to be activated or expressed in a time-dependent manner. Thus, the strategies applied for clustering of these molecules into groups are of key importance for translation of data to biologically interpretable findings. Here we implemented a multi-assignment clustering (MAsC) approach that allows molecules to be assigned to multiple clusters, rather than single ones as in commonly used clustering techniques. When applied to high-throughput transcriptomics data, MAsC increased power of the downstream pathway analysis and allowed identification of pathways with high biological relevance to the experimental setting and the biological systems studied. Multi-assignment clustering also reduced noise in the clustering partition by excluding genes with a low correlation to all of the resulting clusters. Together, these findings suggest that our methodology facilitates translation of large-scale molecular data into biological knowledge. The method is made available as an R package on GitLab (https://gitlab.com/wolftower/masc).
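
The core idea in the abstract, that a gene may belong to several clusters and that genes correlating poorly with every cluster are dropped, can be illustrated with a minimal sketch. This is not the MAsC implementation from the R package. The function name `multi_assign`, the use of Pearson correlation against cluster centroids, and the threshold `min_corr` are all assumptions made for illustration, and the toy data are invented.

```python
import numpy as np

def multi_assign(expr, centroids, min_corr=0.6):
    """Hypothetical sketch: assign each gene (row of expr) to every cluster
    whose centroid it correlates with at or above min_corr.

    Assumes no row is constant (a constant row has zero variance and
    would cause a division by zero).
    """
    def standardize(m):
        # Center each row and scale it to unit length, so that a dot
        # product between two rows equals their Pearson correlation.
        m = m - m.mean(axis=1, keepdims=True)
        return m / np.linalg.norm(m, axis=1, keepdims=True)

    corr = standardize(expr) @ standardize(centroids).T  # (genes, clusters)
    membership = corr >= min_corr          # a gene may be True in several columns
    excluded = ~membership.any(axis=1)     # low correlation to all clusters
    return membership, excluded

# Toy data (invented): two cluster centroids over five time points.
centroids = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],   # steadily rising profile
    [1.0, 3.0, 5.0, 3.0, 1.0],   # transient peak profile
])
expr = np.array([
    [2.0, 4.0, 6.0, 8.0, 10.0],  # follows the rising profile only
    [1.0, 3.0, 5.0, 3.0, 1.0],   # follows the peak profile only
    [2.0, 5.0, 8.0, 7.0, 6.0],   # mixture of both profiles
    [3.0, 1.0, 3.0, 1.0, 3.0],   # resembles neither profile
])

membership, excluded = multi_assign(expr, centroids)
```

In this toy example the third gene is assigned to both clusters and the fourth is excluded from the partition, whereas a single-assignment method would force each of them into exactly one cluster.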

Place, publisher, year, edition, pages
Elsevier, 2021. Vol. 326, p. 1-10
Keywords [en]
Clustering, K-means, annotation enrichment, multiple cluster assignment, pathways, transcriptomics
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:hj:diva-51257
DOI: 10.1016/j.jbiotec.2020.12.002
ISI: 000616124700001
PubMedID: 33285150
Scopus ID: 2-s2.0-85097644109
Local ID: HOA
OAI: oai:DiVA.org:hj-51257
DiVA, id: diva2:1510961
Funder
Knowledge Foundation, 2014/0301, 2017/0302
Available from: 2020-12-17 Created: 2020-12-17 Last updated: 2021-03-15 Bibliographically approved

Authority records

Riveiro, Maria