Calibrating probability estimation trees using Venn-Abers predictors
Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). ORCID iD: 0000-0003-0412-6199
Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). ORCID iD: 0000-0003-0274-9026
School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
2019 (English). In: SIAM International Conference on Data Mining, SDM 2019, Society for Industrial and Applied Mathematics, 2019, p. 28-36. Conference paper, Published paper (Refereed)
Abstract [en]

Class labels output by standard decision trees are not very useful for making informed decisions, e.g., when comparing the expected utility of various alternatives. In contrast, probability estimation trees (PETs) output class probability distributions rather than single class labels. It is well known that estimating class probabilities in PETs by relative frequencies often leads to extreme probability estimates, and a number of approaches for providing better-calibrated estimates have been proposed. In this study, a recent model-agnostic calibration approach, called Venn-Abers predictors, is considered for the first time in the context of decision trees. Results from a large-scale empirical investigation are presented, comparing the novel approach to previous calibration techniques with respect to several performance metrics, targeting both predictive performance and reliability of the estimates. All approaches are considered both with and without Laplace correction. The results show that using Venn-Abers predictors for calibration is a highly competitive approach, significantly outperforming Platt scaling, isotonic regression and no calibration with respect to almost all performance metrics used, regardless of whether Laplace correction is applied. The only exception is AUC, where using non-calibrated PETs together with Laplace correction is actually the best option, which can be explained by the fact that AUC is affected only by the relative, not the absolute, values of the probability estimates.
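
The ideas named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`laplace`, `pav`, `venn_abers`, `auc`) and the toy calibration data are hypothetical, the Venn-Abers part follows the usual inductive formulation (refit isotonic regression once per hypothetical test label), and the rule used to merge the two resulting probabilities into one, p1 / (1 - p0 + p1), is a common choice that the abstract itself does not specify.

```python
def laplace(k, n, classes=2):
    """Laplace-corrected leaf estimate: (k + 1) / (n + classes)."""
    return (k + 1) / (n + classes)


def pav(scores, labels):
    """Non-decreasing least-squares fit (pool-adjacent-violators).

    Returns a dict mapping each distinct score to its fitted value.
    """
    pooled = {}
    for s, y in zip(scores, labels):          # pool tied scores first
        total, count = pooled.get(s, (0.0, 0))
        pooled[s] = (total + y, count + 1)
    blocks = []                               # each block: [sum, count, scores]
    for s in sorted(pooled):
        total, count = pooled[s]
        blocks.append([total, count, [s]])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            t, c, members = blocks.pop()
            blocks[-1][0] += t
            blocks[-1][1] += c
            blocks[-1][2].extend(members)
    return {s: t / c for t, c, members in blocks for s in members}


def venn_abers(cal_scores, cal_labels, test_score):
    """Inductive Venn-Abers: fit once with the test label assumed 0, once with 1."""
    xs = list(cal_scores) + [test_score]
    p0 = pav(xs, list(cal_labels) + [0])[test_score]
    p1 = pav(xs, list(cal_labels) + [1])[test_score]
    return p0, p1


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical calibration set: each score is the raw leaf estimate of a PET.
cal_scores = [0.2, 0.2, 0.2, 0.2, 1.0, 1.0, 1.0, 1.0]
cal_labels = [0, 0, 1, 0, 1, 1, 0, 1]

# A pure leaf with 3 of 3 positives: relative frequency 1.0, Laplace 0.8.
print(3 / 3, laplace(3, 3))

# Venn-Abers output for a test instance that falls in the "1.0" leaf.
p0, p1 = venn_abers(cal_scores, cal_labels, 1.0)
print(p0, p1, p1 / (1 - p0 + p1))

# A strictly increasing rescaling changes the estimates but not their ranking.
rescaled = [0.25 + 0.5 * s for s in cal_scores]
print(auc(cal_scores, cal_labels), auc(rescaled, cal_labels))
```

On this toy data the extreme leaf estimate of 1.0 becomes the interval [0.6, 0.8], merged to roughly 0.67, while the rescaled scores give the same AUC as the original ones because only the ordering of the estimates enters the computation.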

Place, publisher, year, edition, pages
Society for Industrial and Applied Mathematics, 2019. p. 28-36
Keywords [en]
Calibration, Data mining, Decision trees, Forestry, Laplace transforms, Calibration techniques, Class probabilities, Empirical investigation, Performance metrics, Predictive performance, Probability estimate, Probability estimation trees, Relative frequencies, Probability distributions
National Category
Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:hj:diva-44355
DOI: 10.1137/1.9781611975673.4
Scopus ID: 2-s2.0-85066082095
ISBN: 9781611975673 (print)
OAI: oai:DiVA.org:hj-44355
DiVA, id: diva2:1322841
Conference
19th SIAM International Conference on Data Mining, SDM 2019, Hyatt Regency Calgary, Calgary, Canada, 2 - 4 May 2019
Available from: 2019-06-11 Created: 2019-06-11 Last updated: 2019-08-22 Bibliographically approved

Authority records

Johansson, Ulf; Löfström, Tuve
