  • 1.
    Ahlberg, Ernst
    et al.
    Predictive Compound ADME & Safety, Drug Safety & Metabolism, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Winiwarter, Susanne
    Predictive Compound ADME & Safety, Drug Safety & Metabolism, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Boström, Henrik
    Department of Computer and Systems Sciences, Stockholm University, Sweden.
    Linusson, Henrik
    Department of Information Technology, University of Borås, Sweden.
    Löfström, Tuve
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Norinder, Ulf
    Swetox, Karolinska Institutet, Unit of Toxicology Sciences, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Engkvist, Ola
    External Sciences, Discovery Sciences, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Hammar, Oscar
    Quantitative Biology, Discovery Sciences, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Bendtsen, Claus
    Quantitative Biology, Discovery Sciences, AstraZeneca IMED Biotech Unit, Cambridge, UK.
    Carlsson, Lars
    Quantitative Biology, Discovery Sciences, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Using conformal prediction to prioritize compound synthesis in drug discovery. 2017. In: Proceedings of Machine Learning Research: Volume 60: Conformal and Probabilistic Prediction and Applications, 13-16 June 2017, Stockholm, Sweden / [ed] Alex Gammerman, Vladimir Vovk, Zhiyuan Luo, and Harris Papadopoulos, Machine Learning Research, 2017, p. 174-184. Conference paper (Refereed)
    Abstract [en]

    The choice of how much money and resources to spend to understand certain problems is of high interest in many areas. This work illustrates how computational models can be more tightly coupled with experiments to generate decision data at lower cost without reducing the quality of the decision. Several different strategies are explored to illustrate the trade-off between lowering costs and maintaining decision quality.

    AUC is used as a performance metric and the number of objects that can be learnt from is constrained. Some of the strategies described reach AUC values over 0.9 and outperform strategies that are more random. The strategies that use conformal predictor p-values show varying results, although some are top performing.

    The application studied is taken from the drug discovery process. In the early stages of this process, compounds that could potentially become marketed drugs are routinely tested in experimental assays to understand their distribution and interactions in humans.

  • 2.
    Boström, Henrik
    et al.
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Mondrian conformal predictive distributions. 2021. In: Proceedings of the Tenth Symposium on Conformal and Probabilistic Prediction and Applications, PMLR, 2021, Vol. 152, p. 24-38. Conference paper (Refereed)
    Abstract [en]

    The distributions output by a standard (non-normalized) conformal predictive system all have the same shape but differ in location, while a normalized conformal predictive system outputs distributions that differ also in shape, through rescaling. An approach to further increasing the flexibility of the framework is proposed, called Mondrian conformal predictive distributions, which are (standard or normalized) conformal predictive distributions formed from multiple Mondrian categories. The effectiveness of the approach is demonstrated with an application to regression forests. By forming categories through binning of the predictions, it is shown that for this model class, the use of Mondrian conformal predictive distributions significantly outperforms the use of both standard and normalized conformal predictive distributions with respect to the continuous ranked probability score. It is further shown that the use of Mondrian conformal predictive distributions results in as tight prediction intervals as produced by normalized conformal regressors, while improving upon the point predictions of the underlying regression forest.

  • 3.
    Boström, Henrik
    et al.
    Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Linusson, Henrik
    Department of Information Technology, University of Borås, Borås, Sweden.
    Löfström, Tuve
    Department of Information Technology, University of Borås, Borås, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics. Jönköping University, School of Engineering, JTH. Research area Computer Science and Informatics.
    Evaluation of a variance-based nonconformity measure for regression forests. 2016. In: Conformal and Probabilistic Prediction with Applications, Springer, 2016, p. 75-89. Conference paper (Refereed)
    Abstract [en]

    In a previous large-scale empirical evaluation of conformal regression approaches, random forests using out-of-bag instances for calibration together with a k-nearest neighbor-based nonconformity measure were shown to obtain state-of-the-art performance with respect to efficiency, i.e., average size of prediction regions. However, the use of the nearest-neighbor procedure not only requires that all training data be retained in conjunction with the underlying model, but also incurs a significant computational overhead during both training and testing. In this study, a more straightforward nonconformity measure is investigated, where the difficulty estimate employed for normalization is based on the variance of the predictions made by the trees in a forest. A large-scale empirical evaluation is presented, showing that both the nearest-neighbor-based and the variance-based measures significantly outperform a standard (non-normalized) nonconformity measure, while no significant difference in efficiency between the two normalized approaches is observed. Moreover, the evaluation shows that state-of-the-art performance is achieved by the variance-based measure at a computational cost that is several orders of magnitude lower than when employing the nearest-neighbor-based nonconformity measure.

  • 4.
    Boström, Henrik
    et al.
    Department of Computer and Systems Sciences, Stockholm University, Stockholm, Sweden.
    Linusson, Henrik
    Department of Information Technology, University of Borås, Borås, Sweden.
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Borås, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Accelerating difficulty estimation for conformal regression forests. 2017. In: Annals of Mathematics and Artificial Intelligence, ISSN 1012-2443, E-ISSN 1573-7470, Vol. 81, no 1-2, p. 125-144. Article in journal (Refereed)
    Abstract [en]

    The conformal prediction framework allows for specifying the probability of making incorrect predictions by a user-provided confidence level. In addition to a learning algorithm, the framework requires a real-valued function, called nonconformity measure, to be specified. The nonconformity measure does not affect the error rate, but the resulting efficiency, i.e., the size of output prediction regions, may vary substantially. A recent large-scale empirical evaluation of conformal regression approaches showed that using random forests as the learning algorithm together with a nonconformity measure based on out-of-bag errors normalized using a nearest-neighbor-based difficulty estimate, resulted in state-of-the-art performance with respect to efficiency. However, the nearest-neighbor procedure incurs a significant computational cost. In this study, a more straightforward nonconformity measure is investigated, where the difficulty estimate employed for normalization is based on the variance of the predictions made by the trees in a forest. A large-scale empirical evaluation is presented, showing that both the nearest-neighbor-based and the variance-based measures significantly outperform a standard (non-normalized) nonconformity measure, while no significant difference in efficiency between the two normalized approaches is observed. The evaluation moreover shows that the computational cost of the variance-based measure is several orders of magnitude lower than when employing the nearest-neighbor-based nonconformity measure. The use of out-of-bag instances for calibration does, however, result in nonconformity scores that are distributed differently from those obtained from test instances, questioning the validity of the approach. An adjustment of the variance-based measure is presented, which is shown to be valid and also to have a significant positive effect on the efficiency. 
    For conformal regression forests, the variance-based nonconformity measure is hence a computationally efficient and theoretically well-founded alternative to the nearest-neighbor procedure.
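
The variance-based difficulty estimate described in this abstract can be illustrated with a minimal sketch. It assumes the commonly used normalized form, absolute error divided by (difficulty + beta). The function names and the `beta` parameter are illustrative and not taken from the paper, and the out-of-bag adjustment mentioned in the abstract is not modeled.

```python
from statistics import mean, pvariance

def forest_prediction(tree_predictions):
    """Point prediction of the forest: the mean of the individual tree predictions."""
    return mean(tree_predictions)

def variance_difficulty(tree_predictions):
    """Difficulty estimate: (population) variance of the individual tree predictions."""
    return pvariance(tree_predictions)

def normalized_nonconformity(y_true, tree_predictions, beta):
    """Absolute error scaled by the difficulty estimate.

    beta is a hypothetical smoothing constant (assumption) that keeps the
    denominator positive when all trees agree.
    """
    error = abs(y_true - forest_prediction(tree_predictions))
    return error / (variance_difficulty(tree_predictions) + beta)
```

Unlike a nearest-neighbor estimate, this needs only the per-tree predictions that the forest already produces, so no training data has to be retained.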

  • 5.
    Giri, Chandadevi
    et al.
    University of Borås, Department of Business Administration and Textile Management.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Predictive modeling of campaigns to quantify performance in fashion retail industry. 2019. In: Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019, IEEE, 2019, p. 2267-2273. Conference paper (Refereed)
    Abstract [en]

    Managing campaigns and promotions effectively is vital for the fashion retail industry. While retailers invest a lot of money in campaigns, customer retention is often very low. At innovative retailers, data-driven methods aimed at understanding and ultimately optimizing campaigns are being introduced. In this application paper, machine learning techniques are employed to analyze data about campaigns and promotions from a leading Swedish e-retailer. More specifically, predictive modeling is used to forecast the profitability and activation of campaigns using different kinds of promotions. In the empirical investigation, regression models are generated to estimate the profitability, and classification models are used to predict the overall success of the campaigns. In both cases, random forests are compared to individual tree models. As expected, the more complex ensembles are more accurate, but the usage of interpretable tree models makes it possible to analyze the underlying relationships, simply by inspecting the trees. In conclusion, the accuracy of the predictive models must be deemed high enough to make these data-driven methods attractive.

  • 6.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Bostrom, H.
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Investigating Normalized Conformal Regressors. 2021. In: 2021 IEEE Symposium Series on Computational Intelligence, SSCI 2021 - Proceedings, Institute of Electrical and Electronics Engineers (IEEE), 2021. Conference paper (Other academic)
    Abstract [en]

    Conformal prediction can be applied on top of any machine learning predictive regression model, thus turning it into a conformal regressor. Given a significance level $\epsilon$, conformal regressors output valid prediction intervals, i.e., the probability that the interval covers the true value is exactly $1-\epsilon$. To obtain validity, a calibration set that is not used for training the model must be set aside. In standard inductive conformal regression, the size of the prediction intervals is then determined by the absolute error made by the predictive model on a specific instance in the calibration set, where different significance levels correspond to different instances. In this setting, all prediction intervals will have the same size, making the resulting models very unspecific. When adding a technique called normalization, however, the difficulty of each instance is estimated, and the interval sizes are adjusted accordingly. An integral part of normalized conformal regressors is a parameter called $\beta$, which determines the relative importance of the difficulty estimation and the error of the model. In this study, the effects of different underlying models, difficulty estimation functions and $\beta$-values are investigated. The results from a large empirical study, using twenty publicly available data sets, show that better difficulty estimation functions will lead to both tighter and more specific prediction intervals. Furthermore, it is found that the $\beta$-values used strongly affect the conformal regressor. While there is no specific $\beta$-value that will always minimize the interval sizes, lower $\beta$-values lead to more variation in the interval sizes, i.e., more specific models. In addition, the analysis also identifies that the normalization procedure introduces a small but unfortunate bias in the models.
    More specifically, normalization using low $\beta$-values means that smaller intervals are more likely to be erroneous, while the opposite is true for higher $\beta$-values. © 2021 IEEE.
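
The difference between standard and normalized inductive conformal regression described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes absolute errors as nonconformity scores, the normalized form error / (difficulty + beta), and the usual rank rule ceil((1 - epsilon)(n + 1)) for picking the calibration instance. All function names are hypothetical.

```python
import math

def conformal_quantile(scores, epsilon):
    """Return the calibration score selected for significance level epsilon."""
    n = len(scores)
    # 1-indexed rank (ascending) of the calibration score to use.
    k = math.ceil((1 - epsilon) * (n + 1))
    if k > n:
        return math.inf  # too few calibration instances for this epsilon
    return sorted(scores)[k - 1]

def standard_interval(prediction, cal_errors, epsilon):
    """Same half-width for every test instance."""
    q = conformal_quantile(cal_errors, epsilon)
    return prediction - q, prediction + q

def normalized_interval(prediction, difficulty, cal_errors, cal_difficulties, epsilon, beta):
    """Half-width scaled by the estimated difficulty of the test instance."""
    scores = [err / (sigma + beta) for err, sigma in zip(cal_errors, cal_difficulties)]
    q = conformal_quantile(scores, epsilon)
    half_width = q * (difficulty + beta)
    return prediction - half_width, prediction + half_width
```

In this form, a large `beta` dominates the difficulty estimate, so the intervals approach a common size. A small `beta` lets the interval size vary more between instances, which matches the abstract's observation that lower $\beta$-values give more specific models.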

  • 7.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Boström, Henrik
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Conformal Prediction Using Decision Trees. 2013. Conference paper (Refereed)
    Abstract [en]

    Conformal prediction is a relatively new framework in which the predictive models output sets of predictions with a bound on the error rate, i.e., in a classification context, the probability of excluding the correct class label is lower than a predefined significance level. An investigation of the use of decision trees within the conformal prediction framework is presented, with the overall purpose of determining the effect of different algorithmic choices, including split criterion, pruning scheme and way to calculate the probability estimates. Since the error rate is bounded by the framework, the most important property of conformal predictors is efficiency, which concerns minimizing the number of elements in the output prediction sets. Results from one of the largest empirical investigations to date within the conformal prediction framework are presented, showing that in order to optimize efficiency, the decision trees should be induced using no pruning and with smoothed probability estimates. The choice of split criterion to use for the actual induction of the trees did not turn out to have any major impact on the efficiency. Finally, the experimentation also showed that when using decision trees, standard inductive conformal prediction was as efficient as the recently suggested method cross-conformal prediction. This is an encouraging result since cross-conformal prediction uses several decision trees, thus sacrificing the interpretability of a single decision tree.

  • 8.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Boström, Henrik
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Linusson, Henrik
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Regression conformal prediction with random forests. 2014. In: Machine Learning, ISSN 0885-6125, E-ISSN 1573-0565, Vol. 97, no 1-2, p. 155-176. Article in journal (Refereed)
    Abstract [en]

    Regression conformal prediction produces prediction intervals that are valid, i.e., the probability of excluding the correct target value is bounded by a predefined confidence level. The most important criterion when comparing conformal regressors is efficiency; the prediction intervals should be as tight (informative) as possible. In this study, the use of random forests as the underlying model for regression conformal prediction is investigated and compared to existing state-of-the-art techniques, which are based on neural networks and k-nearest neighbors. In addition to their robust predictive performance, random forests allow for determining the size of the prediction intervals by using out-of-bag estimates instead of requiring a separate calibration set. An extensive empirical investigation, using 33 publicly available data sets, was undertaken to compare the use of random forests to existing state-of-the-art conformal predictors. The results show that the suggested approach, on almost all confidence levels and using both standard and normalized nonconformity functions, produced significantly more efficient conformal predictors than the existing alternatives.

  • 9.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    König, Rikard
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Linusson, Henrik
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Boström, Henrik
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Rule Extraction with Guaranteed Fidelity. 2014. Conference paper (Refereed)
    Abstract [en]

    This paper extends the conformal prediction framework to rule extraction, making it possible to extract interpretable models from opaque models in a setting where either the infidelity or the error rate is bounded by a predefined significance level. Experimental results on 27 publicly available data sets show that all three setups evaluated produced valid and rather efficient conformal predictors. The implication is that augmenting rule extraction with conformal prediction allows extraction of models where test set errors or test set infidelities are guaranteed to be lower than a chosen acceptable level. Clearly this is beneficial for both typical rule extraction scenarios, i.e., either when the purpose is to explain an existing opaque model, or when it is to build a predictive model that must be interpretable.

  • 10.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    König, Rikard
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Boström, Henrik
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Evolved Decision Trees as Conformal Predictors. 2013. Conference paper (Refereed)
    Abstract [en]

    In conformal prediction, predictive models output sets of predictions with a bound on the error rate. In classification, this means that, in the long run, the probability of excluding the correct class is lower than a predefined significance level. Since the error rate is guaranteed, the most important criterion for conformal predictors is efficiency. Efficient conformal predictors minimize the number of elements in the output prediction sets, thus producing more informative predictions. This paper presents one of the first comprehensive studies where evolutionary algorithms are used to build conformal predictors. More specifically, decision trees evolved using genetic programming are evaluated as conformal predictors. In the experiments, the evolved trees are compared to decision trees induced using standard machine learning techniques on 33 publicly available benchmark data sets, with regard to predictive performance and efficiency. The results show that the evolved trees are generally more accurate, and the corresponding conformal predictors more efficient, than their induced counterparts. One important result is that the probability estimates of decision trees when used as conformal predictors should be smoothed, here using the Laplace correction. Finally, using the more discriminating Brier score instead of accuracy as the optimization criterion produced the most efficient conformal predictions.
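
How a decision tree's leaf counts could be turned into a conformal prediction set is sketched below. This is an illustration under stated assumptions rather than the paper's exact setup: the nonconformity score is taken to be one minus the Laplace-corrected probability estimate, p-values are computed inductively from a separate list of calibration scores, and all function names are hypothetical.

```python
def laplace_estimate(class_counts, label):
    """Laplace-corrected probability of `label` in a leaf.

    class_counts must contain an entry for every class, including zeros.
    """
    total = sum(class_counts.values())
    return (class_counts[label] + 1) / (total + len(class_counts))

def p_value(cal_scores, test_score):
    """Share of scores (calibration plus the test score) at least as nonconforming."""
    at_least = sum(1 for s in cal_scores if s >= test_score)
    return (at_least + 1) / (len(cal_scores) + 1)

def prediction_set(leaf_counts, cal_scores, epsilon):
    """Keep every label whose p-value exceeds the significance level epsilon."""
    labels = []
    for label in leaf_counts:
        score = 1.0 - laplace_estimate(leaf_counts, label)  # assumed nonconformity
        if p_value(cal_scores, score) > epsilon:
            labels.append(label)
    return labels
```

Without smoothing, a pure leaf would give probability 0 to every other class regardless of how few instances it holds. The Laplace correction pulls estimates from small leaves toward uniform.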

  • 11.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    König, Rikard
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Niklasson, Lars
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Increasing Rule Extraction Accuracy by Post-processing GP Trees. 2008. In: Proceedings of the Congress on Evolutionary Computation, IEEE, 2008, p. 3010-3015. Conference paper (Refereed)
    Abstract [en]

    Genetic programming (GP) is a very general and efficient technique, often capable of outperforming more specialized techniques on a variety of tasks. In this paper, we suggest a straightforward novel algorithm for post-processing of GP classification trees. The algorithm iteratively, one node at a time, searches for possible modifications that would result in higher accuracy. More specifically, the algorithm for each split evaluates every possible constant value and chooses the best. With this design, the post-processing algorithm can only increase training accuracy, never decrease it. In this study, we apply the suggested algorithm to GP trees, extracted from neural network ensembles. Experimentation, using 22 UCI datasets, shows that the post-processing results in higher test set accuracies on a large majority of datasets. As a matter of fact, for two setups of three evaluated, the increase in accuracy is statistically significant.

  • 12.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    König, Rikard
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Niklasson, Lars
    Using Imaginary Ensembles to Select GP Classifiers. 2010. In: Genetic Programming: 13th European Conference, EuroGP 2010, Istanbul, Turkey, April 7-9, 2010, Proceedings / [ed] A.I. Esparcia-Alcazar et al., Springer, 2010, p. 278-288. Conference paper (Refereed)
    Abstract [en]

    When predictive modeling requires comprehensible models, most data miners will use specialized techniques producing rule sets or decision trees. This study, however, shows that genetically evolved decision trees may very well outperform the more specialized techniques. The proposed approach evolves a number of decision trees and then uses one of several suggested selection strategies to pick one specific tree from that pool. The inherent inconsistency of evolution makes it possible to evolve each tree using all data, and still obtain somewhat different models. The main idea is to use these quite accurate and slightly diverse trees to form an imaginary ensemble, which is then used as a guide when selecting one specific tree. Simply put, the tree classifying the largest number of instances identically to the ensemble is chosen. In the experimentation, using 25 UCI data sets, two selection strategies obtained significantly higher accuracy than the standard rule inducer J48.
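
The selection rule summarized in this abstract ("the tree classifying the largest number of instances identically to the ensemble is chosen") can be sketched in a few lines. The sketch assumes the imaginary ensemble is a plain majority vote over the evolved trees and that each tree is represented only by its list of predicted labels. The paper describes several selection strategies, and this shows just one. Function names are hypothetical.

```python
from collections import Counter

def majority_vote(predictions_per_tree):
    """Ensemble prediction per instance: the most common label among the trees."""
    n_instances = len(predictions_per_tree[0])
    return [
        Counter(tree[i] for tree in predictions_per_tree).most_common(1)[0][0]
        for i in range(n_instances)
    ]

def select_tree(predictions_per_tree):
    """Index of the tree that agrees with the ensemble on the most instances."""
    ensemble = majority_vote(predictions_per_tree)
    agreement = [
        sum(p == e for p, e in zip(tree, ensemble))
        for tree in predictions_per_tree
    ]
    return max(range(len(agreement)), key=agreement.__getitem__)
```

Because agreement is measured against ensemble predictions rather than true labels, the rule needs no held-out labeled data.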

  • 13.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    König, Rikard
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Sönströd, Cecilia
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Niklasson, Lars
    Post-processing Evolved Decision Trees. 2009. In: Foundations of Computational Intelligence / [ed] Ajith Abraham, Springer, 2009, p. 149-164. Chapter in book (Other academic)
    Abstract [en]

    Although Genetic Programming (GP) is a very general technique, it is also quite powerful. As a matter of fact, GP has often been shown to outperform more specialized techniques on a variety of tasks. In data mining, GP has successfully been applied to most major tasks, e.g., classification, regression and clustering. In this chapter, we introduce, describe and evaluate a straightforward novel algorithm for post-processing genetically evolved decision trees. The algorithm works by iteratively, one node at a time, searching for possible modifications that will result in higher accuracy. More specifically, the algorithm, for each interior test, evaluates every possible split for the current attribute and chooses the best. With this design, the post-processing algorithm can only increase training accuracy, never decrease it. In the experiments, the suggested algorithm is applied to GP decision trees, either induced directly from datasets, or extracted from neural network ensembles. The experimentation, using 22 UCI datasets, shows that the suggested post-processing technique results in higher test set accuracies on a large majority of the datasets. As a matter of fact, the increase in test accuracy is statistically significant for one of the four evaluated setups, and substantial on two out of the other three.

  • 14.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Linusson, H.
    Department of Information Technology, University of Borås, Sweden.
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Boström, H.
    Department of Computer and Systems Sciences, Stockholm University, Sweden.
    Model-agnostic nonconformity functions for conformal classification. 2017. In: Proceedings of the International Joint Conference on Neural Networks, IEEE, 2017, p. 2072-2079. Conference paper (Refereed)
    Abstract [en]

    A conformal predictor outputs prediction regions; for classification, these are label sets. The key property of all conformal predictors is that they are valid, i.e., their error rate on novel data is bounded by a preset significance level. Thus, the key performance metric for evaluating conformal predictors is the size of the output prediction regions, where smaller (more informative) prediction regions are said to be more efficient. All conformal predictions rely on nonconformity functions, measuring the strangeness of an input-output pair, and the efficiency depends critically on the quality of the chosen nonconformity function. In this paper, three model-agnostic nonconformity functions, based on well-known loss functions, are evaluated with regard to how they affect efficiency. In the experimentation on 21 publicly available multi-class data sets, both single neural networks and ensembles of neural networks are used as underlying models for conformal classifiers. The results show that the choice of nonconformity function has a major impact on the efficiency, but also that different nonconformity functions should be used depending on the exact efficiency metric. For a high fraction of single-label predictions, a margin-based nonconformity function is the best option, while a nonconformity function based on the hinge loss obtained the smallest label sets on average.
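
Two of the nonconformity functions named in this abstract can be sketched from the class-probability estimates of any underlying model. The definitions below are the commonly used ones and are an assumption here, since the abstract does not spell out the formulas. The function names are illustrative.

```python
def hinge(probs, label):
    """Hinge-style nonconformity: one minus the probability assigned to `label`."""
    return 1.0 - probs[label]

def margin(probs, label):
    """Margin-style nonconformity: best competing probability minus that of `label`."""
    best_other = max(p for other, p in probs.items() if other != label)
    return best_other - probs[label]
```

Both functions are model-agnostic in the sense that they use only the probability estimates. The margin additionally reflects how strongly the most likely competing label is supported.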

  • 15.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Linusson, Henrik
    Department of Information Technology, University of Borås, Sweden.
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Boström, Henrik
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Interpretable regression trees using conformal prediction. 2018. In: Expert systems with applications, ISSN 0957-4174, E-ISSN 1873-6793, Vol. 97, p. 394-404. Article in journal (Refereed)
    Abstract [en]

    A key property of conformal predictors is that they are valid, i.e., their error rate on novel data is bounded by a preset level of confidence. For regression, this is achieved by turning the point predictions of the underlying model into prediction intervals. Thus, the most important performance metric for evaluating conformal regressors is not the error rate, but the size of the prediction intervals, where models generating smaller (more informative) intervals are said to be more efficient. State-of-the-art conformal regressors typically utilize two separate predictive models: the underlying model providing the center point of each prediction interval, and a normalization model used to scale each prediction interval according to the estimated level of difficulty for each test instance. When using a regression tree as the underlying model, this approach may cause test instances falling into a specific leaf to receive different prediction intervals. This clearly deteriorates the interpretability of a conformal regression tree compared to a standard regression tree, since the path from the root to a leaf can no longer be translated into a rule explaining all predictions in that leaf. In fact, the model cannot even be interpreted on its own, i.e., without reference to the corresponding normalization model. Current practice effectively presents two options for constructing conformal regression trees: to employ a (global) normalization model, and thereby sacrifice interpretability; or to avoid normalization, and thereby sacrifice both efficiency and individualized predictions. In this paper, two additional approaches are considered, both employing local normalization: the first approach estimates the difficulty by the standard deviation of the target values in each leaf, while the second approach employs Mondrian conformal prediction, which results in regression trees where each rule (path from root node to leaf node) is independently valid. 
    An empirical evaluation shows that the first approach is as efficient as current state-of-the-art approaches, thus eliminating the efficiency vs. interpretability trade-off present in existing methods. Moreover, it is shown that if a validity guarantee is required for each single rule, as provided by the Mondrian approach, a penalty with respect to efficiency has to be paid, but it is only substantial at very high confidence levels.

  • 16.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Producing Implicit Diversity in ANN Ensembles. 2012. Conference paper (Refereed)
    Abstract [en]

    Combining several ANNs into ensembles normally results in very accurate and robust predictive models. Many ANN ensemble techniques are, however, quite complicated and often explicitly optimize some diversity metric. Unfortunately, the lack of solid validation of the explicit algorithms, at least for classification, makes the use of diversity measures as part of an optimization function questionable. The merits of implicit methods, most notably bagging, are on the other hand experimentally established and well-known. This paper evaluates a number of straightforward techniques for introducing implicit diversity in ANN ensembles, including a novel technique producing diversity by using ANNs with different and slightly randomized link structures. The experimental results, comparing altogether 54 setups and two different ensemble sizes on 30 UCI data sets, show that all methods succeeded in producing implicit diversity, but that the effect on ensemble accuracy varied. Still, most setups evaluated did result in more accurate ensembles, compared to the baseline setup, especially for the larger ensemble size. As a matter of fact, several setups even obtained significantly higher ensemble accuracy than bagging. The analysis also identified that diversity was, relatively speaking, more important for the larger ensembles. Looking specifically at the methods used to increase the implicit diversity, setups using the technique that utilizes the randomized link structures generally produced the most accurate ensembles.

  • 17.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Boström, Henrik
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Overproduce-and-Select: The Grim Reality. 2013. Conference paper (Refereed)
    Abstract [en]

    Overproduce-and-select (OPAS) is a frequently used paradigm for building ensembles. In static OPAS, a large number of base classifiers are trained, before a subset of the available models is selected to be combined into the final ensemble. In general, the selected classifiers are supposed to be accurate and diverse for the OPAS strategy to result in highly accurate ensembles, but exactly how this is enforced in the selection process is not obvious. Most often, either individual models or ensembles are evaluated, using some performance metric, on available and labeled data. Naturally, the underlying assumption is that an observed advantage for the models (or the resulting ensemble) will carry over to test data. In the experimental study, a typical static OPAS scenario, using a pool of artificial neural networks and a number of very natural and frequently used performance measures, is evaluated on 22 publicly available data sets. The discouraging result is that although a fairly large proportion of the ensembles obtained higher test set accuracies, compared to using the entire pool as the ensemble, none of the selection criteria could be used to identify these highly accurate ensembles. Despite only investigating a specific scenario, we argue that the settings used are typical for static OPAS, thus making the results general enough to question the entire paradigm.

  • 18.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Boström, Henrik
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Random Brains. 2013. Conference paper (Refereed)
    Abstract [en]

    In this paper, we introduce and evaluate a novel method, called random brains, for producing neural network ensembles. The suggested method, which is heavily inspired by the random forest technique, produces diversity implicitly by using bootstrap training and randomized architectures. More specifically, for each base classifier multilayer perceptron, a number of randomly selected links between the input layer and the hidden layer are removed prior to training, thus resulting in potentially weaker but more diverse base classifiers. The experimental results on 20 UCI data sets show that random brains obtained significantly higher accuracy and AUC, compared to standard bagging of similar neural networks not utilizing randomized architectures. The analysis shows that the main reason for the increased ensemble performance is the ability to produce effective diversity, as indicated by the increase in the difficulty diversity measure.

  • 19.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Niklasson, Lars
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Empirically Investigating the Importance of Diversity. 2007. Conference paper (Refereed)
  • 20.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Niklasson, Lars
    Evaluating Standard Techniques for Implicit Diversity. 2008. In: Advances in Knowledge Discovery and Data Mining, Springer, 2008, p. 613-622. Conference paper (Refereed)
  • 21.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Niklasson, Lars
    The Importance of Diversity in Neural Network Ensembles: An Empirical Investigation. 2007. Conference paper (Refereed)
    Abstract [en]

    When designing ensembles, it is almost an axiom that the base classifiers must be diverse in order for the ensemble to generalize well. Unfortunately, there is no clear definition of the key term diversity, leading to several diversity measures and many, more or less ad hoc, methods for diversity creation in ensembles. In addition, no specific diversity measure has been shown to have a high correlation with test set accuracy. The purpose of this paper is to empirically evaluate ten different diversity measures, using neural network ensembles and 11 publicly available data sets. The main result is that all diversity measures evaluated, in this study too, show low or very low correlation with test set accuracy. Having said that, two measures, double fault and difficulty, show slightly higher correlations compared to the other measures. The study furthermore shows that the correlation between accuracy measured on training or validation data and test set accuracy also is rather low. These results challenge ensemble design techniques where diversity is explicitly maximized or where ensemble accuracy on a hold-out set is used for optimization.

  • 22.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Löfström, Tuve
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Sundell, Håkan
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Venn predictors using lazy learners. 2018. In: Proceedings of the 2018 International Conference on Data Science, ICDATA'18 / [ed] R. Stahlbock, G. M. Weiss & M. Abou-Nasr, CSREA Press, 2018, p. 220-226. Conference paper (Refereed)
    Abstract [en]

    Probabilistic classification requires well-calibrated probability estimates, i.e., the predicted class probabilities must correspond to the true probabilities. Venn predictors, which can be used on top of any classifier, are automatically valid multiprobability predictors, making them extremely suitable for probabilistic classification. A Venn predictor outputs multiple probabilities for each label, so the predicted label is associated with a probability interval. While all Venn predictors are valid, their accuracy and the size of the probability interval are dependent on both the underlying model and some interior design choices. Specifically, all Venn predictors use so-called Venn taxonomies for dividing the instances into a number of categories, each such taxonomy defining a different Venn predictor. A frequently used, but very basic, taxonomy is to categorize the instances based on their predicted label. In this paper, we investigate some more fine-grained taxonomies that use not only the predicted label but also some measures related to the confidence in individual predictions. The empirical investigation, using 22 publicly available data sets and lazy learners (kNN) as the underlying models, showed that the probability estimates from the Venn predictors, as expected, were extremely well-calibrated. Most importantly, using the basic (i.e., label-based) taxonomy produced significantly more accurate and informative Venn predictors compared to the more complex alternatives. In addition, the results also showed that when using lazy learners as underlying models, a transductive approach significantly outperformed an inductive one, with regard to accuracy and informativeness. This result is in contrast to previous studies, where other underlying models were used.

  • 23.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Sönströd, Cecilia
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Locally Induced Predictive Models. 2011. Conference paper (Refereed)
    Abstract [en]

    Most predictive modeling techniques utilize all available data to build global models. This is despite the well-known fact that for many problems, the targeted relationship varies greatly over the input space, thus suggesting that localized models may improve predictive performance. In this paper, we suggest and evaluate a technique inducing one predictive model for each test instance, using only neighboring instances. In the experimentation, several different variations of the suggested algorithm producing localized decision trees and neural network models are evaluated on 30 UCI data sets. The main result is that the suggested approach generally yields better predictive performance than global models built using all available training data. As a matter of fact, all techniques producing J48 trees obtained significantly higher accuracy and AUC, compared to the global J48 model. For RBF network models, with their inherent ability to use localized information, the suggested approach was only successful with regard to accuracy, while global RBF models had a better ranking ability, as seen by their generally higher AUCs.

  • 24.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Well-calibrated and specialized probability estimation trees. 2020. In: Proceedings of the 2020 SIAM International Conference on Data Mining, SDM 2020 / [ed] C. Demeniconi and N. Chawla, Society for Industrial and Applied Mathematics, 2020, p. 415-423. Conference paper (Refereed)
    Abstract [en]

    In many predictive modeling scenarios, the production set inputs that later will be used for the actual prediction are available and could be utilized in the modeling process. In fact, many predictive models are generated with an existing production set in mind. Despite this, few approaches utilize this information in order to produce models optimized on the production set at hand. If these models need to be comprehensible, the oracle coaching framework can be applied, often resulting in interpretable models, e.g., decision trees and rule sets, with accuracies on par with opaque models like neural networks and ensembles, on the specific production set. In oracle coaching, a strong but opaque predictive model is used to label instances, including the production set, which are later learned by a weaker but interpretable model. In this paper, oracle coaching is, for the first time, used for improving the calibration of probabilistic predictors. More specifically, setups where oracle coaching is combined with the techniques Platt scaling, isotonic regression and Venn-Abers are suggested and evaluated for calibrating probability estimation trees (PETs). A key contribution is the setup designs ensuring that the oracle-coached PETs, which by definition utilize knowledge about production data, remain well-calibrated. In the experimentation, using 23 publicly available data sets, it is shown that oracle-coached models are not only more accurate, but also significantly better calibrated, compared to standard induction. Interestingly enough, this holds both for the uncalibrated PETs, and for all calibration techniques evaluated, i.e., Platt scaling, isotonic regression and Venn-Abers. As expected, all three external techniques significantly improved the calibration of the original PETs. Finally, an outright comparison between the three external calibration techniques showed that Venn-Abers significantly outperformed the alternatives in most setups.

  • 25.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Boström, H.
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden.
    Conformal Predictive Distribution Trees. 2023. In: Annals of Mathematics and Artificial Intelligence, ISSN 1012-2443, E-ISSN 1573-7470. Article in journal (Refereed)
    Abstract [en]

    Being able to understand the logic behind predictions or recommendations on the instance level is at the heart of trustworthy machine learning models. Inherently interpretable models make this possible by allowing inspection and analysis of the model itself, thus exhibiting the logic behind each prediction, while providing an opportunity to gain insights about the underlying domain. Another important criterion for trustworthiness is the model’s ability to somehow communicate a measure of confidence in every specific prediction or recommendation. Indeed, the overall goal of this paper is to produce highly informative models that combine interpretability and algorithmic confidence. For this purpose, we introduce conformal predictive distribution trees, which are a novel form of regression trees where each leaf contains a conformal predictive distribution. Using this representation language, the proposed approach allows very versatile analyses of individual leaves in the regression trees. Specifically, depending on the chosen level of detail, the leaves, in addition to the normal point predictions, can provide either cumulative distributions or prediction intervals that are guaranteed to be well-calibrated. In the empirical evaluation, the suggested conformal predictive distribution trees are compared to the well-established conformal regressors, thus demonstrating the benefits of the enhanced representation.

  • 26.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Boström, H.
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden.
    Well-Calibrated and Sharp Interpretable Multi-Class Models. 2021. In: Lecture Notes in Computer Science: Modeling Decisions for Artificial Intelligence / [ed] V. Torra & Y. Narukawa, Springer Science and Business Media Deutschland GmbH, 2021, Vol. 12898, p. 193-204. Conference paper (Refereed)
    Abstract [en]

    Interpretable models make it possible to understand individual predictions, and are in many domains considered mandatory for user acceptance and trust. If coupled with communicated algorithmic confidence, interpretable models become even more informative, also making it possible to assess and compare the confidence expressed by the models in different predictions. To earn a user’s appropriate trust, however, the communicated algorithmic confidence must also be well-calibrated. In this paper, we suggest a novel way of extending Venn-Abers predictors to multi-class problems. The approach is applied to decision trees, providing well-calibrated probability intervals in the leaves. The result is one interpretable model with valid and sharp probability intervals, ready for inspection and analysis. In the experimentation, the proposed method is verified using 20 publicly available data sets showing that the generated models are indeed well-calibrated.

  • 27.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Boström, Henrik
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Calibrating multi-class models. 2021. In: Proceedings of the Tenth Symposium on Conformal and Probabilistic Prediction and Applications / [ed] Lars Carlsson, Zhiyuan Luo, Giovanni Cherubin, Khuong An Nguyen, PMLR, 2021, Vol. 152, p. 111-130. Conference paper (Refereed)
    Abstract [en]

    Predictive models communicating algorithmic confidence are very informative, but only if well-calibrated and sharp, i.e., providing accurate probability estimates adjusted for each instance. While almost all machine learning algorithms are able to produce probability estimates, these are often poorly calibrated, thus requiring external calibration. For multiclass problems, external calibration has typically been done using one-vs-all or all-vs-all schemes, thus adding to the computational complexity, but also making it impossible to analyze and inspect the predictive models. In this paper, we suggest a novel approach for calibrating inherently multi-class models. Instead of providing a probability distribution over all labels, the estimation is of the probability that the class label predicted by the underlying model is correct. In an extensive empirical study, it is shown that the suggested approach, when applied to both Platt scaling and Venn-Abers, is able to improve the probability estimates from decision trees, random forests and extreme gradient boosting.

  • 28.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Boström, Henrik
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Calibrating probability estimation trees using Venn-Abers predictors. 2019. In: SIAM International Conference on Data Mining, SDM 2019, Society for Industrial and Applied Mathematics, 2019, p. 28-36. Conference paper (Refereed)
    Abstract [en]

    Class labels output by standard decision trees are not very useful for making informed decisions, e.g., when comparing the expected utility of various alternatives. In contrast, probability estimation trees (PETs) output class probability distributions rather than single class labels. It is well known that estimating class probabilities in PETs by relative frequencies often leads to extreme probability estimates, and a number of approaches to provide more well-calibrated estimates have been proposed. In this study, a recent model-agnostic calibration approach, called Venn-Abers predictors, is, for the first time, considered in the context of decision trees. Results from a large-scale empirical investigation are presented, comparing the novel approach to previous calibration techniques with respect to several different performance metrics, targeting both predictive performance and reliability of the estimates. All approaches are considered both with and without Laplace correction. The results show that using Venn-Abers predictors for calibration is a highly competitive approach, significantly outperforming Platt scaling, isotonic regression and no calibration, with respect to almost all performance metrics used, independently of whether Laplace correction is applied or not. The only exception is AUC, where using non-calibrated PETs together with Laplace correction actually is the best option, which can be explained by the fact that AUC is not affected by the absolute, but only the relative, values of the probability estimates.

  • 29.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Boström, Henrik
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Sönströd, Cecilia
    Dept. of Information Technology, University of Borås, Sweden.
    Interpretable and Specialized Conformal Predictors. 2019. In: Conformal and Probabilistic Prediction and Applications / [ed] Alex Gammerman, Vladimir Vovk, Zhiyuan Luo, Evgueni Smirnov, 2019, p. 3-22. Conference paper (Refereed)
    Abstract [en]

    In real-world scenarios, interpretable models are often required to explain predictions, and to allow for inspection and analysis of the model. The overall purpose of oracle coaching is to produce highly accurate, but interpretable, models optimized for a specific test set. Oracle coaching is applicable to the very common scenario where explanations and insights are needed for a specific batch of predictions, and the input vectors for this test set are available when building the predictive model. In this paper, oracle coaching is used for generating underlying classifiers for conformal prediction. The resulting conformal classifiers output valid label sets, i.e., the error rate on the test data is bounded by a preset significance level, as long as the labeled data used for calibration is exchangeable with the test set. Since validity is guaranteed for all conformal predictors, the key performance metric is efficiency, i.e., the size of the label sets, where smaller sets are more informative. The main contribution of this paper is the design of setups making sure that when oracle-coached decision trees, which by definition utilize knowledge about test data, are used as underlying models for conformal classifiers, the exchangeability between calibration and test data is maintained. Consequently, the resulting conformal classifiers retain the validity guarantees. In the experimentation, using a large number of publicly available data sets, the validity of the suggested setups is empirically demonstrated. Furthermore, the results show that the more accurate underlying models produced by oracle coaching also improved the efficiency of the corresponding conformal classifiers.

  • 30.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Linusson, Henrik
    Högskolan i Borås, Department of Information Technology, Borås, Sweden.
    Boström, Henrik
    The Royal Institute of Technology (KTH), School of Electrical Engineering and Computer Science, Stockholm, Sweden.
    Efficient Venn Predictors using Random Forests. 2019. In: Machine Learning, ISSN 0885-6125, E-ISSN 1573-0565, Vol. 108, no 3, p. 535-550. Article in journal (Refereed)
    Abstract [en]

    Successful use of probabilistic classification requires well-calibrated probability estimates, i.e., the predicted class probabilities must correspond to the true probabilities. In addition, a probabilistic classifier must, of course, also be as accurate as possible. In this paper, Venn predictors, and their special case Venn-Abers predictors, are evaluated for probabilistic classification, using random forests as the underlying models. Venn predictors output multiple probabilities for each label, i.e., the predicted label is associated with a probability interval. Since all Venn predictors are valid in the long run, the size of the probability intervals is very important, with tighter intervals being more informative. The standard solution when calibrating a classifier is to employ an additional step, transforming the outputs from a classifier into probability estimates, using a labeled data set not employed for training of the models. For random forests, and other bagged ensembles, it is, however, possible to use the out-of-bag instances for calibration, making all training data available for both model learning and calibration. This procedure has previously been successfully applied to conformal prediction, but was here evaluated for the first time for Venn predictors. The empirical investigation, using 22 publicly available data sets, showed that all four versions of the Venn predictors were better calibrated than both the raw estimates from the random forest, and the standard techniques Platt scaling and isotonic regression. Regarding both informativeness and accuracy, the standard Venn predictor calibrated on out-of-bag instances was the best setup evaluated. Most importantly, calibrating on out-of-bag instances, instead of using a separate calibration set, resulted in tighter intervals and more accurate models on every data set, for both the Venn predictors and the Venn-Abers predictors.

  • 31.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Ståhl, Niclas
    Jönköping University, School of Engineering, JTH, Department of Computing.
    Well-Calibrated Rule Extractors. 2022. In: Proceedings of the Eleventh Symposium on Conformal and Probabilistic Prediction with Applications: Volume 179: Conformal and Probabilistic Prediction with Applications, 24-26 August 2022, Brighton, UK / [ed] U. Johansson, H. Boström, K. A. Nguyen, Z. Luo & L. Carlsson, ML Research Press, 2022, Vol. 179, p. 72-91. Conference paper (Refereed)
    Abstract [en]

    While explainability is widely considered necessary for trustworthy predictive models, most explanation modules give only a limited understanding of the reasoning behind the predictions. In pedagogical rule extraction, an opaque model is approximated with a transparent model induced using original training instances, but with the predictions from the opaque model as targets. The result is an interpretable model revealing the exact reasoning used for every possible prediction. The pedagogical approach can be applied to any opaque model and use any learning algorithm producing transparent models as the actual rule extractor. Unfortunately, even if the extracted model is induced to mimic the opaque model, test set fidelity may still be poor, thus clearly limiting the value of using the extracted model for explanations and analyses. In this paper, it is suggested to alleviate this problem by extracting probabilistic predictors with well-calibrated fitness estimates. For the calibration, Venn-Abers, with its unique validity guarantees, is employed. Using a setup where decision trees are extracted from MLP neural networks, the suggested approach is first demonstrated in detail on one real-world data set. After that, a large-scale empirical evaluation using 25 publicly available benchmark data sets is presented. The results show that the method indeed extracts interpretable models with well-calibrated fitness estimates, i.e., the extracted model can be used for explaining the opaque model. Specifically, in the setup used, every leaf in a decision tree contains a label and a well-calibrated probability interval for the fidelity. Consequently, a user could, in addition to obtaining explanations of individual predictions, find the parts of feature space where the decision tree is a good approximation of the MLP and where it is not. In fact, using the sizes of the probability intervals, the models also provide an indication of how certain individual fitness estimates are.

  • 32.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Sundell, Håkan
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Linusson, Henrik
    Department of Information Technology, University of Borås, Sweden.
    Gidenstam, Anders
    Department of Information Technology, University of Borås, Sweden.
    Boström, Henrik
    School of Information and Communication Technology, Royal Institute of Technology, Sweden.
    Venn predictors for well-calibrated probability estimation trees. 2018. In: Conformal and Probabilistic Prediction and Applications / [ed] A. Gammerman, V. Vovk, Z. Luo, E. Smirnov, & R. Peeters, 2018, p. 3-14. Conference paper (Refereed)
    Abstract [en]

    Successful use of probabilistic classification requires well-calibrated probability estimates, i.e., the predicted class probabilities must correspond to the true probabilities. The standard solution is to employ an additional step, transforming the outputs from a classifier into probability estimates. In this paper, Venn predictors are compared to Platt scaling and isotonic regression, for the purpose of producing well-calibrated probabilistic predictions from decision trees. The empirical investigation, using 22 publicly available data sets, showed that the probability estimates from the Venn predictor were extremely well-calibrated. In fact, in a direct comparison using the accepted reliability metric, the Venn predictor estimates were the most exact on every data set.

  • 33.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Sönströd, Cecilia
    Jönköping University, School of Engineering, JTH, Department of Computing.
    Löfström, Helena
    Jönköping University, Jönköping International Business School.
    Conformal Prediction for Accuracy Guarantees in Classification with Reject Option. 2023. In: Modeling Decisions for Artificial Intelligence: 20th International Conference, MDAI 2023, Umeå, Sweden, June 19–22, 2023, Proceedings / [ed] V. Torra and Y. Narukawa, Springer, 2023, p. 133-145. Conference paper (Refereed)
    Abstract [en]

    A standard classifier is forced to predict the label of every test instance, even when confidence in the predictions is very low. In many scenarios, it would, however, be better to avoid making these predictions, maybe leaving them to a human expert. A classifier with that alternative is referred to as a classifier with reject option. In this paper, we propose an algorithm that, for a particular data set, automatically suggests a number of accuracy levels, which it will be able to meet perfectly, using a classifier with reject option. Since the basis of the suggested algorithm is conformal prediction, it comes with strong validity guarantees. The experimentation, using 25 publicly available two-class data sets, confirms that the algorithm obtains empirical accuracies very close to the requested levels. In addition, in an outright comparison with probabilistic predictors, including models calibrated with Platt scaling, the suggested algorithm clearly outperforms the alternatives.

  • 34.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Sonstrod, C.
    Dept. of Information Technology, University of Borås, Sweden.
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Bostrom, H.
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Customized interpretable conformal regressors. 2019. In: Proceedings - 2019 IEEE International Conference on Data Science and Advanced Analytics, DSAA 2019, Institute of Electrical and Electronics Engineers (IEEE), 2019, p. 221-230, article id 8964179. Conference paper (Refereed)
    Abstract [en]

    Interpretability is recognized as a key property of trustworthy predictive models. Only interpretable models make it straightforward to explain individual predictions, and allow inspection and analysis of the model itself. In real-world scenarios, these explanations and insights are often needed for a specific batch of predictions, i.e., a production set. If the input vectors for this production set are available when generating the predictive model, a methodology called oracle coaching can be used to produce highly accurate and interpretable models optimized for the specific production set. In this paper, oracle coaching is, for the first time, combined with the conformal prediction framework for predictive regression. A conformal regressor, which is built on top of a standard regression model, outputs valid prediction intervals, i.e., the error rate on novel data is bounded by a preset significance level, as long as the labeled data used for calibration is exchangeable with production data. Since validity is guaranteed for all conformal predictors, the key performance metric is the size of the prediction intervals, where tighter (more efficient) intervals are preferred. The efficiency of a conformal model depends on several factors, but more accurate underlying models will generally also lead to improved efficiency in the corresponding conformal predictor. A key contribution in this paper is the design of setups ensuring that when oracle coached regression trees, that per definition utilize knowledge about production data, are used as underlying models for conformal regressors, these remain valid. The experiments, using 20 publicly available regression data sets, demonstrate the validity of the suggested setups. Results also show that utilizing oracle-coached underlying models will generally lead to significantly more efficient conformal regressors, compared to when these are built on top of models induced using only training data. 

  • 35.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Sönströd, Cecilia
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Boström, Henrik
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Chipper: A Novel Algorithm for Concept Description. 2008. Conference paper (Refereed)
    Abstract [en]

    In this paper, several demands placed on concept description algorithms are identified and discussed. The most important criterion is the ability to produce compact rule sets that, in a natural and accurate way, describe the most important relationships in the underlying domain. An algorithm based on the identified criteria is presented and evaluated. The algorithm, named Chipper, produces decision lists, where each rule covers a maximum number of remaining instances while meeting requested accuracy requirements. In the experiments, Chipper is evaluated on nine UCI data sets. The main result is that Chipper produces compact and understandable rule sets, clearly fulfilling the overall goal of concept description. In the experiments, Chipper's accuracy is similar to standard decision tree and rule induction algorithms, while rule sets have superior comprehensibility.

  • 36.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Sönströd, Cecilia
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    One Tree to Explain Them All. 2011. Conference paper (Refereed)
    Abstract [en]

    Random forest is an often used ensemble technique, renowned for its high predictive performance. Random forest models are, however, due to their sheer complexity inherently opaque, making human interpretation and analysis impossible. This paper presents a method of approximating the random forest with just one decision tree. The approach uses oracle coaching, a recently suggested technique where a weaker but transparent model is generated using combinations of regular training data and test data initially labeled by a strong classifier, called the oracle. In this study, the random forest plays the part of the oracle, while the transparent models are decision trees generated by either the standard tree inducer J48, or by evolving genetic programs. Evaluation on 30 data sets from the UCI repository shows that oracle coaching significantly improves both accuracy and area under ROC curve, compared to using training data only. As a matter of fact, resulting single tree models are as accurate as the random forest, on the specific test instances. Most importantly, this is not achieved by inducing or evolving huge trees having perfect fidelity; a large majority of all trees are instead rather compact and clearly comprehensible. The experiments also show that the evolution outperformed J48, with regard to accuracy, but that this came at the expense of slightly larger trees.

  • 37.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Sönströd, Cecilia
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Oracle Coached Decision Trees and Lists. 2010. Conference paper (Refereed)
    Abstract [en]

    This paper introduces a novel method for obtaining increased predictive performance from transparent models in situations where production input vectors are available when building the model. First, labeled training data is used to build a powerful opaque model, called an oracle. Second, the oracle is applied to production instances, generating predicted target values, which are used as labels. Finally, these newly labeled instances are utilized, in different combinations with normal training data, when inducing a transparent model. Experimental results, on 26 UCI data sets, show that the use of oracle coaches significantly improves predictive performance, compared to standard model induction. Most importantly, both accuracy and AUC results are robust over all combinations of opaque and transparent models evaluated. This study thus implies that the straightforward procedure of using a coaching oracle, which can be used with arbitrary classifiers, yields significantly better predictive performance at a low computational cost.

  • 38.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Sönströd, Cecilia
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Boström, Henrik
    Obtaining accurate and comprehensible classifiers using oracle coaching. 2012. In: Intelligent Data Analysis, ISSN 1088-467X, E-ISSN 1571-4128, Vol. 16, no 2, p. 247-263. Article in journal (Refereed)
    Abstract [en]

    While ensemble classifiers often reach high levels of predictive performance, the resulting models are opaque and hence do not allow direct interpretation. When employing methods that do generate transparent models, predictive performance typically has to be sacrificed. This paper presents a method of improving predictive performance of transparent models in the very common situation where instances to be classified, i.e., the production data, are known at the time of model building. This approach, named oracle coaching, employs a strong classifier, called an oracle, to guide the generation of a weaker, but transparent model. This is accomplished by using the oracle to predict class labels for the production data, and then applying the weaker method on this data, possibly in conjunction with the original training set. Evaluation on 30 data sets from the UCI repository shows that oracle coaching significantly improves predictive performance, measured by both accuracy and area under ROC curve, compared to using training data only. This result is shown to be robust for a variety of methods for generating the oracles and transparent models. More specifically, random forests and bagged radial basis function networks are used as oracles, while J48 and JRip are used for generating transparent models. The evaluation further shows that significantly better results are obtained when using the oracle-classified production data together with the original training data, instead of using only oracle data. An analysis of the fidelity of the transparent models to the oracles shows that performance gains can be expected from increasing oracle performance rather than from increasing fidelity. Finally, it is shown that further performance gains can be achieved by adjusting the relative weights of training data and oracle data.

  • 39.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Sönströd, Cecilia
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    König, Rikard
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Using Genetic Programming to Obtain Implicit Diversity. 2009. Conference paper (Refereed)
    Abstract [en]

    When performing predictive data mining, the use of ensembles is known to increase prediction accuracy, compared to single models. To obtain this higher accuracy, ensembles should be built from base classifiers that are both accurate and diverse. The question of how to balance these two properties in order to maximize ensemble accuracy is, however, far from solved and many different techniques for obtaining ensemble diversity exist. One such technique is bagging, where implicit diversity is introduced by training base classifiers on different subsets of available data instances, thus resulting in less accurate, but diverse base classifiers. In this paper, genetic programming is used as an alternative method to obtain implicit diversity in ensembles by evolving accurate, but different base classifiers in the form of decision trees, thus exploiting the inherent inconsistency of genetic programming. The experiments show that the GP approach outperforms standard bagging of decision trees, obtaining significantly higher ensemble accuracy over 25 UCI datasets. This superior performance stems from base classifiers having both higher average accuracy and more diversity. Implicitly introducing diversity using GP thus works very well, since evolved base classifiers tend to be highly accurate and diverse.

  • 40.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Sönströd, Cecilia
    Jönköping University, School of Engineering, JTH, Department of Computing.
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Boström, H.
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Rule extraction with guarantees from regression models. 2022. In: Pattern Recognition, ISSN 0031-3203, E-ISSN 1873-5142, Vol. 126, article id 108554. Article in journal (Refereed)
    Abstract [en]

    Tools for understanding and explaining complex predictive models are critical for user acceptance and trust. One such tool is rule extraction, i.e., approximating opaque models with less powerful but interpretable models. Pedagogical (or black-box) rule extraction, where the interpretable model is induced using the original training instances, but with the predictions from the opaque model as targets, has many advantages compared to the decompositional (white-box) approach. Most importantly, pedagogical methods are agnostic to the kind of opaque model used, and any learning algorithm producing interpretable models can be employed for the learning step. The pedagogical approach has, however, one main problem, clearly limiting its utility. Specifically, while the extracted models are trained to mimic the opaque, there are absolutely no guarantees that this will transfer to novel data. This potentially low test set fidelity must be considered a severe drawback, in particular when the extracted models are used for explanation and analysis. In this paper, a novel approach, solving the problem with test set fidelity by utilizing the conformal prediction framework, is suggested for extracting interpretable regression models from opaque models. The extracted models are standard regression trees, but augmented with valid prediction intervals in the leaves. Depending on the exact setup, the use of conformal prediction guarantees that either the test set fidelity or the test set accuracy will be equal to a preset confidence level, in the long run. In the extensive empirical investigation, using 20 publicly available data sets, the validity of the extracted models is demonstrated. In addition, it is shown how normalization can be used to provide individualized prediction intervals, thus providing highly informative extracted models.

  • 41.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Sönströd, Cecilia
    Jönköping University, School of Engineering, JTH, Department of Computing.
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Boström, Henrik
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Confidence Classifiers with Guaranteed Accuracy or Precision. 2023. In: Proceedings of the Twelfth Symposium on Conformal and Probabilistic Prediction with Applications / [ed] H. Papadopoulos, K. A. Nguyen, H. Boström & L. Carlsson, Proceedings of Machine Learning Research (PMLR), 2023, Vol. 204, p. 513-533. Conference paper (Refereed)
    Abstract [en]

    In many situations, probabilistic predictors have replaced conformal classifiers. The main reason is arguably that the set predictions of conformal classifiers, with the accompanying significance level, are hard to interpret. In this paper, we demonstrate how conformal classification can be used as a basis for a classifier with reject option. Specifically, we introduce and evaluate two algorithms that are able to perfectly estimate accuracy or precision for a set of test instances, in a classifier with reject scenario. In the empirical investigation, the suggested algorithms are shown to clearly outperform both calibrated and uncalibrated probabilistic predictors.

  • 42.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Sönströd, Cecilia
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Norinder, Ulf
    Boström, Henrik
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Using Feature Selection with Bagging and Rule Extraction in Drug Discovery. 2010. Conference paper (Refereed)
    Abstract [en]

    This paper investigates different ways of combining feature selection with bagging and rule extraction in predictive modeling. Experiments on a large number of data sets from the medicinal chemistry domain, using standard algorithms implemented in the Weka data mining workbench, show that feature selection can lead to significantly improved predictive performance. When combining feature selection with bagging, employing the feature selection on each bootstrap obtains the best result. When using decision trees for rule extraction, the effect of feature selection can actually be detrimental, unless the transductive approach oracle coaching is also used. However, employing oracle coaching will lead to significantly improved performance, and the best results are obtained when performing feature selection before training the opaque model. The overall conclusion is that it can make a substantial difference for the predictive performance exactly how feature selection is used in conjunction with other techniques.

  • 43.
    König, Rikard
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Johansson, Ulf
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Niklasson, Lars
    Improving GP Classification Performance by Injection of Decision Trees. 2010. Conference paper (Refereed)
    Abstract [en]

    This paper presents a novel hybrid method combining genetic programming and decision tree learning. The method starts by estimating a benchmark level of reasonable accuracy, based on decision tree performance on bootstrap samples of the training set. Next, a normal GP evolution is started with the aim of producing an accurate GP. At even intervals, the best GP in the population is evaluated against the accuracy benchmark. If the GP has higher accuracy than the benchmark, the evolution continues normally until the maximum number of generations is reached. If the accuracy is lower than the benchmark, two things happen. First, the fitness function is modified to allow larger GPs, able to represent more complex models. Secondly, a decision tree with increased size and trained on a bootstrap of the training data is injected into the population. The experiments show that the hybrid solution of injecting decision trees into a GP population gives synergetic effects producing results that are better than using either technique separately. The results, from 18 UCI data sets, show that the proposed method clearly outperforms normal GP, and is significantly better than the standard decision tree algorithm.

  • 44.
    Linusson, Henrik
    et al.
    Department of Information Technology, University of Borås, Borås, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Boström, Henrik
    School of Electrical Engineering and Computer Science, Royal Institute of Technology, Kista, Sweden.
    Löfström, Tuve
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Classification with reject option using conformal prediction. 2018. In: Advances in Knowledge Discovery and Data Mining: 22nd Pacific-Asia Conference, PAKDD 2018, Melbourne, VIC, Australia, June 3-6, 2018, Proceedings, Part I, Springer, 2018, p. 94-105. Conference paper (Refereed)
    Abstract [en]

    In this paper, we propose a practically useful means of interpreting the predictions produced by a conformal classifier. The proposed interpretation leads to a classifier with a reject option, that allows the user to limit the number of erroneous predictions made on the test set, without any need to reveal the true labels of the test objects. The method described in this paper works by estimating the cumulative error count on a set of predictions provided by a conformal classifier, ordered by their confidence. Given a test set and a user-specified parameter k, the proposed classification procedure outputs the largest possible amount of predictions containing on average at most k errors, while refusing to make predictions for test objects where it is too uncertain. We conduct an empirical evaluation using benchmark datasets, and show that we are able to provide accurate estimates for the error rate on the test set. 

  • 45.
    Linusson, Henrik
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Johansson, Ulf
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Boström, Henrik
    Dept. of Computer and Systems Sciences Stockholm University, Kista, Sweden.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Efficiency Comparison of Unstable Transductive and Inductive Conformal Classifiers. 2014. Conference paper (Refereed)
    Abstract [en]

    In the conformal prediction literature, it appears axiomatic that transductive conformal classifiers possess a higher predictive efficiency than inductive conformal classifiers; however, this depends on whether or not the nonconformity function tends to overfit misclassified test examples. With the conformal prediction framework’s increasing popularity, it thus becomes necessary to clarify the settings in which this claim holds true. In this paper, the efficiency of transductive conformal classifiers based on decision tree, random forest and support vector machine classification models is compared to the efficiency of corresponding inductive conformal classifiers. The results show that the efficiency of conformal classifiers based on standard decision trees or random forests is substantially improved when used in the inductive mode, while conformal classifiers based on support vector machines are more efficient in the transductive mode. In addition, an analysis is presented that discusses the effects of calibration set size on inductive conformal classifier efficiency.

  • 46.
    Linusson, Henrik
    et al.
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Johansson, Ulf
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Boström, Henrik
    Dept. of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Löfström, Tuve
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Reliable Confidence Predictions Using Conformal Prediction. 2016. In: Lecture Notes in Computer Science, 2016, p. 77-88. Conference paper (Refereed)
    Abstract [en]

    Conformal classifiers output confidence prediction regions, i.e., multi-valued predictions that are guaranteed to contain the true output value of each test pattern with some predefined probability. In order to fully utilize the predictions provided by a conformal classifier, it is essential that those predictions are reliable, i.e., that a user is able to assess the quality of the predictions made. Although conformal classifiers are statistically valid by default, the error probability of the prediction regions output are dependent on their size in such a way that smaller, and thus potentially more interesting, predictions are more likely to be incorrect. This paper proposes, and evaluates, a method for producing refined error probability estimates of prediction regions, that takes their size into account. The end result is a binary conformal confidence predictor that is able to provide accurate error probability estimates for those prediction regions containing only a single class label.

  • 47.
    Linusson, Henrik
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Johansson, Ulf
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Signed-Error Conformal Regression. 2014. In: Advances in Knowledge Discovery and Data Mining: 18th Pacific-Asia Conference, PAKDD 2014, Tainan, Taiwan, May 13-16, 2014, Proceedings, Part I, Springer, 2014, p. 224-236. Conference paper (Refereed)
    Abstract [en]

    This paper suggests a modification of the Conformal Prediction framework for regression that will strengthen the associated guarantee of validity. We motivate the need for this modification and argue that our conformal regressors are more closely tied to the actual error distribution of the underlying model, thus allowing for more natural interpretations of the prediction intervals. In the experimentation, we provide an empirical comparison of our conformal regressors to traditional conformal regressors and show that the proposed modification results in more robust two-tailed predictions, and more efficient one-tailed predictions.

  • 48.
    Linusson, Henrik
    et al.
    Department of Information Technology, University of Borås, Sweden.
    Norinder, Ulf
    Swetox, Karolinska Institutet, Unit of Toxicology Sciences, Sweden.
    Boström, Henrik
    Department of Computer and Systems Sciences, Stockholm University, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    On the calibration of aggregated conformal predictors. 2017. In: Proceedings of Machine Learning Research: Volume 60: Conformal and Probabilistic Prediction and Applications, 13-16 June 2017, Stockholm, Sweden / [ed] Alex Gammerman, Vladimir Vovk, Zhiyuan Luo, and Harris Papadopoulos, Machine Learning Research, 2017, p. 154-173. Conference paper (Refereed)
    Abstract [en]

    Conformal prediction is a learning framework that produces models that associate with each of their predictions a measure of statistically valid confidence. These models are typically constructed on top of traditional machine learning algorithms. An important result of conformal prediction theory is that the models produced are provably valid under relatively weak assumptions—in particular, their validity is independent of the specific underlying learning algorithm on which they are based. Since validity is automatic, much research on conformal predictors has been focused on improving their informational and computational efficiency. As part of the efforts in constructing efficient conformal predictors, aggregated conformal predictors were developed, drawing inspiration from the field of classification and regression ensembles. Unlike early definitions of conformal prediction procedures, the validity of aggregated conformal predictors is not fully understood—while it has been shown that they might attain empirical exact validity under certain circumstances, their theoretical validity is conditional on additional assumptions that require further clarification. In this paper, we show why validity is not automatic for aggregated conformal predictors, and provide a revised definition of aggregated conformal predictors that gains approximate validity conditional on properties of the underlying learning algorithm.

  • 49.
    Löfström, Helena
    et al.
    Jönköping University, Jönköping International Business School.
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Interpretable instance-based text classification for social science research projects. 2018. In: Archives of Data Science, Series A, ISSN 2363-9881, Vol. 5, no 1. Article in journal (Refereed)
    Abstract [en]

    In this study, two groups of respondents have evaluated explanations generated from an instance-based explanation method called WITE (Weighted Instance-based Text Explanations). One group consisted of 24 non-experts who answered a web survey about the words characterising the concepts of the classes and the other group consisted of three senior researchers and three respondents from a media house in Sweden who answered a questionnaire with open questions. The data used originates from one of the researchers’ project on media consumption in Sweden. The results from the non-experts indicate that WITE identified many words that corresponded to the human understanding but also included some insignificant or contrary words as important. In the results from the expert evaluation, there were indications that there is a risk that the explanations could persuade the users of the correctness of a prediction, even if it is incorrect. Consequently, the study indicates that an explanation method could be seen as a new actor which is able to persuade and interact with the humans and cause a change in the results of the classification of a text.

  • 50.
    Löfström, Helena
    et al.
    Jönköping University, Jönköping International Business School, JIBS, Informatics.
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Sönströd, Cecilia
    Jönköping University, School of Engineering, JTH, Department of Computing.
    Calibrated explanations: With uncertainty information and counterfactuals. 2024. In: Expert systems with applications, ISSN 0957-4174, E-ISSN 1873-6793, Vol. 246, article id 123154. Article in journal (Refereed)
    Abstract [en]

    While local explanations for AI models can offer insights into individual predictions, such as feature importance, they are plagued by issues like instability. The unreliability of feature weights, often skewed due to poorly calibrated ML models, deepens these challenges. Moreover, the critical aspect of feature importance uncertainty remains mostly unaddressed in Explainable AI (XAI). The novel feature importance explanation method presented in this paper, called Calibrated Explanations (CE), is designed to tackle these issues head-on. Built on the foundation of Venn-Abers, CE not only calibrates the underlying model but also delivers reliable feature importance explanations with an exact definition of the feature weights. CE goes beyond conventional solutions by addressing output uncertainty. It accomplishes this by providing uncertainty quantification for both feature weights and the model’s probability estimates. Additionally, CE is model-agnostic, featuring easily comprehensible conditional rules and the ability to generate counterfactual explanations with embedded uncertainty quantification. Results from an evaluation with 25 benchmark datasets underscore the efficacy of CE, making it stand as a fast, reliable, stable, and robust solution.
