Investigating the impact of calibration on the quality of explanations
Löfström, Helena. Jönköping University, Jönköping International Business School; Department of Information Technology, University of Borås, Borås, Sweden. ORCID iD: 0000-0001-9633-0423
Löfström, Tuwe. Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL). ORCID iD: 0000-0003-0274-9026
Johansson, Ulf. Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL). ORCID iD: 0000-0003-0412-6199
Sönströd, Cecilia. Jönköping University, School of Engineering, JTH, Department of Computing. ORCID iD: 0009-0009-0404-2586
2023 (English). In: Annals of Mathematics and Artificial Intelligence, ISSN 1012-2443, E-ISSN 1573-7470. Article in journal (Refereed). Epub ahead of print.
Abstract [en]

Predictive models used in Decision Support Systems (DSS) are often requested to explain their reasoning to users. Explanations of instances consist of two parts: the predicted label with an associated certainty, and a set of weights, one per feature, describing how each feature contributes to the prediction for the particular instance. In techniques like Local Interpretable Model-agnostic Explanations (LIME), the probability estimate from the underlying model is used as the measure of certainty; consequently, the feature weights represent how each feature contributes to that probability estimate. It is, however, well known that probability estimates from classifiers are often poorly calibrated, i.e., the probability estimates do not correspond to the actual probabilities of being correct. With this in mind, explanations from techniques like LIME risk becoming misleading, since the feature weights will only describe how each feature contributes to a possibly inaccurate probability estimate. This paper investigates the impact of calibrating predictive models before applying LIME. The study includes 25 benchmark data sets, using Random Forest and Extreme Gradient Boosting (XGBoost) as learners and Venn-Abers and Platt scaling as calibration methods. Results from the study show that explanations of better calibrated models are themselves better calibrated, with the ECE and log loss of the explanations after calibration aligning more closely with the model's ECE and log loss. The conclusion is that calibration makes both the models and the explanations better at accurately representing reality.
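The calibrate-then-evaluate pipeline the abstract describes can be sketched as follows. This is an illustrative reconstruction, not the authors' code: it uses scikit-learn's `CalibratedClassifierCV` with `method="sigmoid"` (i.e. Platt scaling) and a hand-rolled ECE function; the synthetic data set and all hyperparameters are assumptions made for the sake of a runnable example, and Venn-Abers calibration and LIME itself are omitted since they require third-party packages.

```python
# Sketch: calibrate a Random Forest with Platt scaling, then compare
# calibration quality (ECE and log loss) before and after calibration.
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

def expected_calibration_error(y_true, y_prob, n_bins=10):
    """ECE: bin predictions by confidence, then take the weighted average
    gap between mean confidence and observed accuracy in each bin."""
    conf = np.max(y_prob, axis=1)          # predicted-class probability
    pred = np.argmax(y_prob, axis=1)
    correct = (pred == y_true).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece

# Synthetic binary task standing in for one of the 25 benchmark data sets.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
platt = CalibratedClassifierCV(RandomForestClassifier(random_state=0),
                               method="sigmoid", cv=5).fit(X_tr, y_tr)

for name, model in [("uncalibrated", rf), ("Platt-calibrated", platt)]:
    p = model.predict_proba(X_te)
    print(f"{name}: ECE={expected_calibration_error(y_te, p):.4f} "
          f"log loss={log_loss(y_te, p):.4f}")
```

A LIME explanation built on the calibrated model would then distribute feature weights over the corrected probability estimate rather than the raw one, which is the effect the paper measures.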

Place, publisher, year, edition, pages
Springer, 2023.
Keywords [en]
Calibration, Decision support systems, Explainable artificial intelligence, Predicting with confidence, Uncertainty in explanations, Venn-Abers
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:hj:diva-60033
DOI: 10.1007/s10472-023-09837-2
ISI: 000948763400001
Scopus ID: 2-s2.0-85149810932
Local ID: HOA;;870772
OAI: oai:DiVA.org:hj-60033
DiVA, id: diva2:1746184
Funder
Knowledge Foundation
Available from: 2023-03-27 Created: 2023-03-27 Last updated: 2023-11-08
In thesis
1. Trustworthy explanations: Improved decision support through well-calibrated uncertainty quantification
2023 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

The use of Artificial Intelligence (AI) has transformed fields like disease diagnosis and defence. Utilising sophisticated Machine Learning (ML) models, AI predicts future events based on historical data, introducing complexity that challenges understanding and decision-making. Previous research emphasizes users' difficulty in discerning when to trust predictions due to model complexity, underscoring that addressing model complexity and providing transparent explanations are pivotal for facilitating high-quality decisions.

Many ML models offer probability estimates for predictions, commonly used in methods providing explanations to guide users on prediction confidence. However, these probabilities often do not accurately reflect the actual distribution in the data, leading to potential user misinterpretation of prediction trustworthiness. Additionally, most explanation methods fail to convey whether the model’s probability is linked to any uncertainty, further diminishing the reliability of the explanations.

Evaluating the quality of explanations for decision support is challenging, and although highlighted as essential in research, there are no benchmark criteria for comparative evaluations.

This thesis introduces an innovative explanation method that generates reliable explanations, incorporating uncertainty information supporting users in determining when to trust the model’s predictions. The thesis also outlines strategies for evaluating explanation quality and facilitating comparative evaluations. Through empirical evaluations and user studies, the thesis provides practical insights to support decision-making utilising complex ML models.


Place, publisher, year, edition, pages
Jönköping: Jönköping University, Jönköping International Business School, 2023. p. 72
Series
JIBS Dissertation Series, ISSN 1403-0470 ; 159
Keywords
Explainable Artificial Intelligence, Interpretable Machine Learning, Decision Support Systems, Uncertainty Estimation, Explanation Methods
National Category
Information Systems, Social aspects; Computer Sciences
Identifiers
urn:nbn:se:hj:diva-62865 (URN)
978-91-7914-031-1 (ISBN)
978-91-7914-032-8 (ISBN)
Public defence
2023-12-12, B1014, Jönköping International Business School, Jönköping, 13:15 (English)
Available from: 2023-11-08 Created: 2023-11-08 Last updated: 2023-11-08. Bibliographically approved.

Authority records

Löfström, Helena; Löfström, Tuwe; Johansson, Ulf; Sönströd, Cecilia
