  • 1.
    Ahlberg, Ernst
    et al.
    Predictive Compound ADME & Safety, Drug Safety & Metabolism, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Winiwarter, Susanne
    Predictive Compound ADME & Safety, Drug Safety & Metabolism, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Boström, Henrik
    Department of Computer and Systems Sciences, Stockholm University, Sweden.
    Linusson, Henrik
    Department of Information Technology, University of Borås, Sweden.
    Löfström, Tuve
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Norinder, Ulf
    Swetox, Karolinska Institutet, Unit of Toxicology Sciences, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Engkvist, Ola
    External Sciences, Discovery Sciences, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Hammar, Oscar
    Quantitative Biology, Discovery Sciences, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Bendtsen, Claus
    Quantitative Biology, Discovery Sciences, AstraZeneca IMED Biotech Unit, Cambridge, UK.
    Carlsson, Lars
    Quantitative Biology, Discovery Sciences, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Using conformal prediction to prioritize compound synthesis in drug discovery (2017). In: Proceedings of Machine Learning Research: Volume 60: Conformal and Probabilistic Prediction and Applications, 13-16 June 2017, Stockholm, Sweden / [ed] Alex Gammerman, Vladimir Vovk, Zhiyuan Luo, and Harris Papadopoulos, Machine Learning Research, 2017, p. 174-184. Conference paper (Refereed)
    Abstract [en]

    The choice of how much money and how many resources to spend to understand certain problems is of high interest in many areas. This work illustrates how computational models can be coupled more tightly with experiments to generate decision data at lower cost without reducing the quality of the decisions. Several different strategies are explored to illustrate the trade-off between lowering costs and maintaining decision quality.

    AUC is used as the performance metric, and the number of objects that can be learned from is constrained. Some of the strategies described reach AUC values over 0.9 and outperform more random strategies. The strategies that use conformal predictor p-values show varying results, although some are among the top performers.

    The application studied is taken from the drug discovery process. In the early stages of this process, compounds that could potentially become marketed drugs are routinely tested in experimental assays to understand their distribution and interactions in humans.
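    A minimal sketch of a p-value-based prioritization strategy (an illustration under stated assumptions, not code from the paper; the nonconformity scores, e.g. 1 - P(active), are assumed precomputed):

    ```python
    import numpy as np

    def conformal_p_value(cal_scores, test_score):
        # Standard conformal p-value: fraction of calibration examples at
        # least as nonconforming as the test example (+1 includes the test
        # example itself).
        return (np.sum(cal_scores >= test_score) + 1) / (len(cal_scores) + 1)

    def next_batch(cal_scores, test_scores, batch_size=96):
        # Rank unscreened compounds by the p-value of the tentative label
        # "active" and send the most credible ones to the assay first.
        p_active = np.array([conformal_p_value(cal_scores, s) for s in test_scores])
        return np.argsort(-p_active)[:batch_size]
    ```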

  • 2.
    Alkhatib, A.
    et al.
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Boström, H.
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Ennadir, S.
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Approximating Score-based Explanation Techniques Using Conformal Regression (2023). In: Proceedings of Machine Learning Research / [ed] H. Papadopoulos, K. A. Nguyen, H. Boström, L. Carlsson, ML Research Press, 2023, Vol. 204, p. 450-469. Conference paper (Refereed)
    Abstract [en]

    Score-based explainable machine-learning techniques are often used to understand the logic behind black-box models. However, such explanation techniques are often computationally expensive, which limits their application in time-critical contexts. Therefore, we propose and investigate the use of computationally less costly regression models for approximating the output of score-based explanation techniques, such as SHAP. Moreover, validity guarantees for the approximated values are provided by the employed inductive conformal prediction framework. We propose several non-conformity measures designed to take the difficulty of approximating the explanations into account while keeping the computational cost low. We present results from a large-scale empirical investigation, in which the approximate explanations generated by our proposed models are evaluated with respect to efficiency (interval size). The results indicate that the proposed method can significantly improve execution time compared to the fast version of SHAP, TreeSHAP. The results also suggest that the proposed method can produce tight intervals, while providing validity guarantees. Moreover, the proposed approach allows for comparing explanations of different approximation methods and selecting a method based on how informative (tight) the predicted intervals are.
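    The core idea can be sketched as follows (a simplification under stated assumptions; the paper's nonconformity measures are more elaborate). A cheap regressor is fitted to explanation scores computed once offline, and inductive conformal regression attaches intervals with validity guarantees:

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def conformal_explanation_approximator(X_tr, e_tr, X_cal, e_cal, alpha=0.1):
        # e_tr / e_cal: precomputed explanation scores for one feature
        # (e.g., SHAP values produced once by TreeSHAP on a labeled pool).
        model = RandomForestRegressor(n_estimators=100).fit(X_tr, e_tr)
        residuals = np.abs(e_cal - model.predict(X_cal))        # nonconformity
        q = np.quantile(residuals, 1 - alpha, method="higher")  # simplified quantile

        def approximate(X):
            mid = model.predict(X)          # fast approximate explanation
            return mid - q, mid + q         # interval with 1 - alpha coverage
        return approximate
    ```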

  • 3.
    Alkhatib, Amr
    et al.
    Boström, Henrik
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Assessing Explanation Quality by Venn Prediction (2022). In: Proceedings of the Eleventh Symposium on Conformal and Probabilistic Prediction with Applications: Volume 179: Conformal and Probabilistic Prediction with Applications, 24-26 August 2022, Brighton, UK / [ed] U. Johansson, H. Boström, K. A. Nguyen, Z. Luo & L. Carlsson, ML Research Press, 2022, Vol. 179, p. 42-54. Conference paper (Refereed)
    Abstract [en]

    Rules output by explainable machine learning techniques naturally come with a degree of uncertainty, as the complex functionality of the underlying black-box model often can be difficult to approximate by a single, interpretable rule. However, the uncertainty of these approximations is not properly quantified by current explanatory techniques. The use of Venn prediction is here proposed and investigated as a means to quantify the uncertainty of the explanations and thereby also allow for competing explanation techniques to be evaluated with respect to their relative uncertainty. A number of metrics of rule explanation quality based on uncertainty are proposed and discussed, including metrics that capture the tendency of the explanations to predict the correct outcome of a black-box model on new instances, how informative (tight) the produced intervals are, and how certain a rule is when predicting one class. An empirical investigation is presented, in which explanations produced by the state-of-the-art technique Anchors are compared to explanatory rules obtained from association rule mining. The results suggest that the association rule mining approach may provide explanations with less uncertainty towards the correct label, as predicted by the black-box model, compared to Anchors. The results also show that the explanatory rules obtained through association rule mining result in tighter intervals and in probabilities closer to either one or zero than Anchors, i.e., they are more certain towards a specific class label.
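    The mechanics of a simple, taxonomy-based Venn predictor can be sketched as follows (a minimal illustration, not the paper's exact setup; the category function, e.g. whether the explanation rule fires, is assumed given):

    ```python
    import numpy as np

    def venn_interval(cal_cats, cal_labels, test_cat):
        # For each tentative label of the test object, add the object to its
        # category and record the relative frequency of label 1 there; the
        # two frequencies form the lower/upper probability for label 1.
        freqs = []
        for tentative in (0, 1):
            cats = np.append(cal_cats, test_cat)
            labels = np.append(cal_labels, tentative)
            freqs.append(labels[cats == test_cat].mean())
        return min(freqs), max(freqs)
    ```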

  • 4.
    Arvidsson, Simon
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing.
    Gabrielsson, Patrick
    Jönköping University, School of Engineering, JTH, Department of Computing.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Texture Mapping of Flags onto Polandball Characters using Convolutional Neural Nets (2021). In: 2021 International Joint Conference on Neural Networks (IJCNN), 2021, p. 1-7. Conference paper (Refereed)
    Abstract [en]

    Polandball comics are hand-drawn satirical content that portray personified countries in a unique style. Although certain parts of these comics, such as ball outlines, are easy to draw, some country flags are complex and require time, effort, and skill to depict correctly. Convolutional Neural Networks have shown success in image synthesis tasks but lack the ability to rescale and rotate images for texture mapping. The domain of Virtual Try-On Networks has made great progress in networks that can handle spatially invariant transforms. We show that similar methods can be used in another domain dependent on texture mapping, namely generating valid, rule-abiding Polandball characters given an outline and a country flag. To evaluate our method, we use the Fréchet Inception Distance, achieving a score of 34.9. Multiple configurations of the model were evaluated to show that all modules used in the model contribute to the achieved performance. The main contributions of this paper are a model that can be used by Polandball artists to aid in comic creation and a dataset with over 40,000 labeled Polandball characters for computer vision tasks.

  • 5.
    Boström, Henrik
    et al.
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Mondrian conformal predictive distributions (2021). In: Proceedings of the Tenth Symposium on Conformal and Probabilistic Prediction and Applications, PMLR, 2021, Vol. 152, p. 24-38. Conference paper (Refereed)
    Abstract [en]

    The distributions output by a standard (non-normalized) conformal predictive system all have the same shape but differ in location, while a normalized conformal predictive system outputs distributions that differ also in shape, through rescaling. An approach to further increasing the flexibility of the framework is proposed, called Mondrian conformal predictive distributions, which are (standard or normalized) conformal predictive distributions formed from multiple Mondrian categories. The effectiveness of the approach is demonstrated with an application to regression forests. By forming categories through binning of the predictions, it is shown that for this model class, the use of Mondrian conformal predictive distributions significantly outperforms the use of both standard and normalized conformal predictive distributions with respect to the continuous ranked probability score. It is further shown that the use of Mondrian conformal predictive distributions results in as tight prediction intervals as produced by normalized conformal regressors, while improving upon the point predictions of the underlying regression forest.
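    A minimal sketch of the binning construction (one plausible reading, assuming a split-conformal setup with a held-out calibration set):

    ```python
    import numpy as np

    def mondrian_cpd(cal_pred, cal_y, test_pred, n_bins=10):
        # Mondrian categories are formed by binning the underlying model's
        # predictions; the predictive distribution for a test instance is
        # built from the calibration residuals of its own bin only.
        edges = np.quantile(cal_pred, np.linspace(0, 1, n_bins + 1))
        cal_bin = np.clip(np.digitize(cal_pred, edges[1:-1]), 0, n_bins - 1)
        test_bin = np.clip(np.digitize(test_pred, edges[1:-1]), 0, n_bins - 1)
        in_bin = cal_bin == test_bin
        return test_pred + np.sort(cal_y[in_bin] - cal_pred[in_bin])  # discrete CPD
    ```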

  • 6.
    Boström, Henrik
    et al.
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Vesterberg, Anders
    Scania CV AB, Sweden.
    Predicting with Confidence from Survival Data (2019). In: Conformal and Probabilistic Prediction and Applications / [ed] Alex Gammerman, Vladimir Vovk, Zhiyuan Luo, Evgueni Smirnov, 2019, p. 123-141. Conference paper (Refereed)
    Abstract [en]

    Survival modeling concerns predicting whether or not an event will occur before or on a given point in time. In a recent study, the conformal prediction framework was applied to this task, and the so-called conformal random survival forest was proposed. It was empirically shown that the error level of this model indeed is very close to the provided confidence level, and also that the error for predicting each outcome, i.e., event or no-event, can be controlled separately by employing a Mondrian approach. The addressed task concerned making predictions for time points as provided by the underlying distribution. However, if one instead is interested in making predictions with respect to some specific time point, the guarantee of the conformal prediction framework no longer holds, as one is effectively considering a sample from a different distribution than the one from which the calibration instances have been drawn. In this study, we propose a modification of the approach for specific time points, which transforms the problem into a binary classification task, thereby allowing the error level to be controlled. The latter is demonstrated by an empirical investigation using both a collection of publicly available datasets and two in-house datasets from a truck manufacturing company.
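    The reduction itself is compact; a sketch under stated assumptions (right-censored data with event indicators), after which any Mondrian conformal classifier can be applied to the resulting binary task:

    ```python
    import numpy as np

    def binarize_at_horizon(times, events, horizon):
        # Keep only instances whose status at the horizon is known: either
        # the event occurred by then, or the instance was still observed
        # (at risk) after it. Instances censored before the horizon have an
        # unknown label and are dropped.
        times, events = np.asarray(times), np.asarray(events, dtype=bool)
        known = (events & (times <= horizon)) | (times > horizon)
        labels = (events & (times <= horizon))[known].astype(int)
        return known, labels  # boolean mask into the feature matrix, and targets
    ```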

  • 7.
    Boström, Henrik
    et al.
    Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Linusson, Henrik
    Department of Information Technology, University of Borås, Borås, Sweden.
    Löfström, Tuve
    Department of Information Technology, University of Borås, Borås, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics. Jönköping University, School of Engineering, JTH. Research area Computer Science and Informatics.
    Evaluation of a variance-based nonconformity measure for regression forests (2016). In: Conformal and Probabilistic Prediction with Applications, Springer, 2016, p. 75-89. Conference paper (Refereed)
    Abstract [en]

    In a previous large-scale empirical evaluation of conformal regression approaches, random forests using out-of-bag instances for calibration together with a k-nearest neighbor-based nonconformity measure were shown to obtain state-of-the-art performance with respect to efficiency, i.e., average size of prediction regions. However, the use of the nearest-neighbor procedure not only requires that all training data be retained in conjunction with the underlying model, but also incurs a significant computational overhead, during both training and testing. In this study, a more straightforward nonconformity measure is investigated, where the difficulty estimate employed for normalization is based on the variance of the predictions made by the trees in a forest. A large-scale empirical evaluation is presented, showing that both the nearest-neighbor-based and the variance-based measures significantly outperform a standard (non-normalized) nonconformity measure, while no significant difference in efficiency between the two normalized approaches is observed. Moreover, the evaluation shows that state-of-the-art performance is achieved by the variance-based measure at a computational cost that is several orders of magnitude lower than when employing the nearest-neighbor-based nonconformity measure.
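    The variance-based measure is cheap because the per-tree predictions are computed anyway; a sketch (illustrative, not the paper's code, with the usual sensitivity parameter beta in the denominator as in related normalized measures):

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def tree_spread(forest, X):
        # Difficulty estimate: spread of the individual trees' predictions.
        return np.stack([t.predict(X) for t in forest.estimators_]).std(axis=0)

    def normalized_intervals(forest, X_cal, y_cal, X_test, alpha=0.1, beta=0.5):
        scores = np.abs(y_cal - forest.predict(X_cal)) / (tree_spread(forest, X_cal) + beta)
        q = np.quantile(scores, 1 - alpha, method="higher")
        mid = forest.predict(X_test)
        half = q * (tree_spread(forest, X_test) + beta)
        return mid - half, mid + half
    ```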

  • 8.
    Boström, Henrik
    et al.
    Department of Computer and Systems Sciences, Stockholm University, Stockholm, Sweden.
    Linusson, Henrik
    Department of Information Technology, University of Borås, Borås, Sweden.
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Borås, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Accelerating difficulty estimation for conformal regression forests (2017). In: Annals of Mathematics and Artificial Intelligence, ISSN 1012-2443, E-ISSN 1573-7470, Vol. 81, no 1-2, p. 125-144. Article in journal (Refereed)
    Abstract [en]

    The conformal prediction framework allows for specifying the probability of making incorrect predictions by a user-provided confidence level. In addition to a learning algorithm, the framework requires a real-valued function, called nonconformity measure, to be specified. The nonconformity measure does not affect the error rate, but the resulting efficiency, i.e., the size of output prediction regions, may vary substantially. A recent large-scale empirical evaluation of conformal regression approaches showed that using random forests as the learning algorithm together with a nonconformity measure based on out-of-bag errors normalized using a nearest-neighbor-based difficulty estimate resulted in state-of-the-art performance with respect to efficiency. However, the nearest-neighbor procedure incurs a significant computational cost. In this study, a more straightforward nonconformity measure is investigated, where the difficulty estimate employed for normalization is based on the variance of the predictions made by the trees in a forest. A large-scale empirical evaluation is presented, showing that both the nearest-neighbor-based and the variance-based measures significantly outperform a standard (non-normalized) nonconformity measure, while no significant difference in efficiency between the two normalized approaches is observed. The evaluation moreover shows that the computational cost of the variance-based measure is several orders of magnitude lower than when employing the nearest-neighbor-based nonconformity measure. The use of out-of-bag instances for calibration does, however, result in nonconformity scores that are distributed differently from those obtained from test instances, questioning the validity of the approach. An adjustment of the variance-based measure is presented, which is shown to be valid and also to have a significant positive effect on the efficiency. For conformal regression forests, the variance-based nonconformity measure is hence a computationally efficient and theoretically well-founded alternative to the nearest-neighbor procedure.

  • 9.
    Buendia, Ruben
    et al.
    Department of Information Technology, University of Borås, Borås, Sweden.
    Kogej, Thierry
    Discovery Sciences, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Engkvist, Ola
    Discovery Sciences, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Carlsson, Lars
    Discovery Sciences, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Linusson, Henrik
    Department of Information Technology, University of Borås, Borås, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Toccaceli, Paolo
    Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, United Kingdom.
    Ahlberg, Ernst
    Data Science and AI, Drug Safety & Metabolism, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Accurate Hit Estimation for Iterative Screening Using Venn-ABERS Predictors (2019). In: Journal of Chemical Information and Modeling, ISSN 1549-9596, E-ISSN 1549-960X, Vol. 59, no 3, p. 1230-1237. Article in journal (Refereed)
    Abstract [en]

    Iterative screening has emerged as a promising approach to increase the efficiency of high-throughput screening (HTS) campaigns in drug discovery. By learning from a subset of the compound library, inferences on what compounds to screen next can be made by predictive models. One of the challenges of iterative screening is to decide how many iterations to perform. This is mainly related to difficulties in estimating the prospective hit rate in any given iteration. In this article, a novel method based on Venn-ABERS predictors is proposed. The method provides accurate estimates of the number of hits retrieved in any given iteration during an HTS campaign. The estimates provide the necessary information to support the decision on the number of iterations needed to maximize the screening outcome. Thus, this method offers a prospective screening strategy for early-stage drug discovery.
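    Once per-compound Venn-Abers probability intervals [p0, p1] are available (from any implementation of the method; they are assumed precomputed here), the hit estimate for a planned iteration is essentially a sum; a sketch under that assumption:

    ```python
    import numpy as np

    def expected_hits(p0, p1, batch_size):
        # p0 / p1: lower and upper Venn-Abers probabilities that each
        # unscreened compound is a hit. Screen the compounds with the
        # highest upper probability and bound the expected number of hits.
        chosen = np.argsort(-p1)[:batch_size]
        return p0[chosen].sum(), p1[chosen].sum()
    ```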

  • 10.
    Carlsson, Lars
    et al.
    Drug Safety and Metabolism, AstraZeneca Innovative Medicines and Early Development, Mölndal, Sweden.
    Ahlberg, Ernst
    Drug Safety and Metabolism, AstraZeneca Innovative Medicines and Early Development, Mölndal, Sweden.
    Boström, Henrik
    Department of Systems and Computer Sciences, Stockholm University, Stockholm, Sweden.
    Johansson, Ulf
    School of Business and IT, University of Borås, Borås, Sweden.
    Linusson, Henrik
    School of Business and IT, University of Borås, Borås, Sweden.
    Modifications to p-Values of conformal predictors (2015). In: Statistical learning and data sciences, Springer, 2015, p. 251-259. Conference paper (Refereed)
    Abstract [en]

    The original definition of a p-value in a conformal predictor can sometimes lead to too conservative prediction regions when the number of training or calibration examples is small. The situation can be improved by using a modification to define an approximate p-value. Two modified p-values are presented that converge to the original p-value as the number of training or calibration examples goes to infinity. Numerical experiments empirically support the use of one of them, the interpolated p-value, for conformal prediction. The interpolated p-value seems to produce prediction sets whose error rate corresponds well to the prescribed significance level.
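    One plausible reading of an interpolated p-value (a reconstruction for illustration only, not necessarily the paper's exact definition) linearly interpolates the empirical survival function of the calibration scores instead of using its step form; ties are ignored for brevity:

    ```python
    import numpy as np

    def interpolated_p_value(cal_scores, test_score):
        # Step-function p-value: (#{a_i >= a} + 1) / (n + 1). Here the jump
        # at each calibration score is smoothed by linear interpolation.
        s, n = np.sort(cal_scores), len(cal_scores)
        pos = np.searchsorted(s, test_score, side="left")
        if pos == 0:
            return 1.0                 # less nonconforming than all of them
        if pos == n:
            return 1.0 / (n + 1)       # more nonconforming than all of them
        frac = (test_score - s[pos - 1]) / (s[pos] - s[pos - 1])
        return (n - pos + 2 - frac) / (n + 1)
    ```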

  • 11.
    Dahlbom, Anders
    et al.
    University of Skövde, Sweden.
    Riveiro, Maria
    University of Skövde, School of Informatics. University of Skövde, Informatics Research Centre.
    König, Rikard
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Johansson, Ulf
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Brattberg, Peter
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Supporting Golf Coaching with 3D Modeling of Swings (2014). In: Sportinformatik X: Jahrestagung der dvs-Sektion Sportinformatik, Hamburg: Feldhaus Verlag, 2014, 10, p. 142-148. Chapter in book (Refereed)
  • 12.
    Gabrielsson, Patrick
    et al.
    Department of Information Technology, University of Borås, Sweden.
    Johansson, Ulf
    Department of Information Technology, University of Borås, Sweden.
    High-frequency equity index futures trading using recurrent reinforcement learning with candlesticks (2015). In: Proceedings - 2015 IEEE Symposium Series on Computational Intelligence, SSCI 2015, IEEE, 2015, p. 734-741. Conference paper (Refereed)
    Abstract [en]

    In 1997, Moody and Wu presented recurrent reinforcement learning (RRL) as a viable machine learning method within algorithmic trading. Subsequent research has shown a degree of controversy with regard to the benefits of incorporating technical indicators in the recurrent reinforcement learning framework. In 1991, Nison introduced Japanese candlesticks to the global research community as an alternative to employing traditional indicators within the technical analysis of financial time series. The literature accumulated over the past two and a half decades of research contains conflicting results with regard to the utility of using Japanese candlestick patterns to exploit inefficiencies in financial time series. In this paper, we combine features based on Japanese candlesticks with recurrent reinforcement learning to produce a high-frequency algorithmic trading system for the E-mini S&P 500 index futures market. Our empirical study shows a statistically significant increase in both return and Sharpe ratio compared to relevant benchmarks, suggesting the existence of exploitable spatio-temporal structure in Japanese candlestick patterns and the ability of recurrent reinforcement learning to detect and take advantage of this structure in a high-frequency equity index futures trading environment.
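    The candlestick side reduces to simple per-bar geometry; a sketch of the kind of features involved (the paper's exact feature set is not reproduced here):

    ```python
    import numpy as np

    def candlestick_features(o, h, l, c):
        # Normalized body and shadow sizes for each OHLC bar; sequences of
        # such vectors are what pattern-based trading features build on.
        rng = np.where(h > l, h - l, np.finfo(float).eps)
        body = (c - o) / rng                   # signed, normalized body
        upper = (h - np.maximum(o, c)) / rng   # upper shadow
        lower = (np.minimum(o, c) - l) / rng   # lower shadow
        return np.column_stack([body, upper, lower])
    ```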

  • 13.
    Gabrielsson, Patrick
    et al.
    University of Borås, School of Business and IT.
    Johansson, Ulf
    University of Borås, School of Business and IT.
    König, Rikard
    University of Borås, School of Business and IT.
    Co-Evolving Online High-Frequency Trading Strategies Using Grammatical Evolution (2014). Conference paper (Refereed)
    Abstract [en]

    Numerous sophisticated algorithms exist for discovering reoccurring patterns in financial time series. However, the most accurate techniques available produce opaque models, from which it is impossible to discern the rationale behind trading decisions. It is therefore desirable to sacrifice some degree of accuracy for transparency. One fairly recent evolutionary computational technology that creates transparent models, using a user-specified grammar, is grammatical evolution (GE). In this paper, we explore the possibility of evolving transparent entry and exit trading strategies for the E-mini S&P 500 index futures market in a high-frequency trading environment using grammatical evolution. We compare the performance of models incorporating risk into their calculations with models that do not. Our empirical results suggest that profitable, risk-averse, transparent trading strategies for the E-mini S&P 500 can be obtained using grammatical evolution together with technical indicators.

  • 14.
    Gabrielsson, Patrick
    et al.
    University of Borås, School of Business and IT.
    König, Rikard
    University of Borås, School of Business and IT.
    Johansson, Ulf
    University of Borås, School of Business and IT.
    Evolving Hierarchical Temporal Memory-Based Trading Models (2013). Conference paper (Refereed)
    Abstract [en]

    We explore the possibility of using the genetic algorithm to optimize trading models based on the Hierarchical Temporal Memory (HTM) machine learning technology. Technical indicators, derived from intraday tick data for the E-mini S&P 500 futures market (ES), were used as feature vectors to the HTM models. All models were configured as binary classifiers, using a simple buy-and-hold trading strategy, and followed a supervised training scheme. The data set was partitioned into multiple folds to enable a modified cross validation scheme. Artificial Neural Networks (ANNs) were used to benchmark HTM performance. The results show that the genetic algorithm succeeded in finding predictive models with good performance and generalization ability. The HTM models outperformed the neural network models on the chosen data set and both technologies yielded profitable results with above average accuracy.

  • 15.
    Gabrielsson, Patrick
    et al.
    University of Borås, School of Business and IT.
    König, Rikard
    University of Borås, School of Business and IT.
    Johansson, Ulf
    University of Borås, School of Business and IT.
    Hierarchical Temporal Memory-based algorithmic trading of financial markets (2012). Conference paper (Refereed)
    Abstract [en]

    This paper explores the possibility of using the Hierarchical Temporal Memory (HTM) machine learning technology to create a profitable software agent for trading financial markets. Technical indicators, derived from intraday tick data for the E-mini S&P 500 futures market (ES), were used as feature vectors to the HTM models. All models were configured as binary classifiers, using a simple buy-and-hold trading strategy, and followed a supervised training scheme. The data set was divided into a training set, a validation set and three test sets; bearish, bullish and horizontal. The best performing model on the validation set was tested on the three test sets. Artificial Neural Networks (ANNs) were subjected to the same data sets in order to benchmark HTM performance. The results suggest that the HTM technology can be used together with a feature vector of technical indicators to create a profitable trading algorithm for financial markets. Results also suggest that HTM performance is, at the very least, comparable to commonly applied neural network models.

  • 16.
    Giri, Chandadevi
    et al.
    University of Borås, Department of Business Administration and Textile Management.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Predictive modeling of campaigns to quantify performance in fashion retail industry (2019). In: Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019, IEEE, 2019, p. 2267-2273. Conference paper (Refereed)
    Abstract [en]

    Managing campaigns and promotions effectively is vital for the fashion retail industry. While retailers invest a lot of money in campaigns, customer retention is often very low. At innovative retailers, data-driven methods, aimed at understanding and ultimately optimizing campaigns, are introduced. In this application paper, machine learning techniques are employed to analyze data about campaigns and promotions from a leading Swedish e-retailer. More specifically, predictive modeling is used to forecast the profitability and activation of campaigns using different kinds of promotions. In the empirical investigation, regression models are generated to estimate the profitability, and classification models are used to predict the overall success of the campaigns. In both cases, random forests are compared to individual tree models. As expected, the more complex ensembles are more accurate, but the usage of interpretable tree models makes it possible to analyze the underlying relationships, simply by inspecting the trees. In conclusion, the accuracy of the predictive models must be deemed high enough to make these data-driven methods attractive.

  • 17.
    Johansson, Ulf
    et al.
    Department of Information Technology, University of Borås, Borås, Sweden.
    Ahlberg, Ernst
    Drug Safety and Metabolism, AstraZeneca Innovative Medicines and Early Development, Mölndal, Sweden.
    Boström, Henrik
    Department of Systems and Computer Sciences, Stockholm University, Stockholm, Sweden.
    Carlsson, Lars
    Drug Safety and Metabolism, AstraZeneca Innovative Medicines and Early Development, Mölndal, Sweden.
    Linusson, Henrik
    Department of Information Technology, University of Borås, Borås, Sweden.
    Sönströd, Cecilia
    Department of Information Technology, University of Borås, Borås, Sweden.
    Handling small calibration sets in Mondrian inductive conformal regressors (2015). In: Statistical Learning and Data Sciences, Springer, 2015, p. 271-280. Conference paper (Refereed)
    Abstract [en]

    In inductive conformal prediction, calibration sets must contain an adequate number of instances to support the chosen confidence level. This problem is particularly prevalent when using Mondrian inductive conformal prediction, where the input space is partitioned into independently valid prediction regions. In this study, Mondrian conformal regressors, in the form of regression trees, are used to investigate two problematic aspects of small calibration sets. If there are too few calibration instances to support the significance level, we suggest using either extrapolation or altering the model. In situations where the desired significance level is between two calibration instances, the standard procedure is to choose the more nonconforming one, thus guaranteeing validity, but producing conservative conformal predictors. The suggested solution is to use interpolation between calibration instances. All proposed techniques are empirically evaluated and compared to the standard approach on 30 benchmark data sets. The results show that while extrapolation often results in invalid models, interpolation works extremely well and provides increased efficiency with preserved empirical validity.

  • 18.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Boström, H.
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Investigating Normalized Conformal Regressors (2021). In: 2021 IEEE Symposium Series on Computational Intelligence, SSCI 2021 - Proceedings, Institute of Electrical and Electronics Engineers (IEEE), 2021. Conference paper (Other academic)
    Abstract [en]

    Conformal prediction can be applied on top of any machine learning predictive regression model, thus turning it into a conformal regressor. Given a significance level $\epsilon$, conformal regressors output valid prediction intervals, i.e., the probability that the interval covers the true value is exactly $1-\epsilon$. To obtain validity, a calibration set that is not used for training the model must be set aside. In standard inductive conformal regression, the size of the prediction intervals is then determined by the absolute error made by the predictive model on a specific instance in the calibration set, where different significance levels correspond to different instances. In this setting, all prediction intervals will have the same size, making the resulting models very unspecific. When adding a technique called normalization, however, the difficulty of each instance is estimated, and the interval sizes are adjusted accordingly. An integral part of normalized conformal regressors is a parameter called $\beta$, which determines the relative importance of the difficulty estimation and the error of the model. In this study, the effects of different underlying models, difficulty estimation functions and $\beta$-values are investigated. The results from a large empirical study, using twenty publicly available data sets, show that better difficulty estimation functions will lead to both tighter and more specific prediction intervals. Furthermore, it is found that the $\beta$-values used strongly affect the conformal regressor. While there is no specific $\beta$-value that will always minimize the interval sizes, lower $\beta$-values lead to more variation in the interval sizes, i.e., more specific models. In addition, the analysis also identifies that the normalization procedure introduces a small but unfortunate bias in the models. More specifically, normalization using low $\beta$-values means that smaller intervals are more likely to be erroneous, while the opposite is true for higher $\beta$-values.

  • 19.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Boström, H.
    KTH Royal Institute of Technology, Stockholm, Sweden.
    Nguyen, K. A.
    University of Brighton, United Kingdom.
    Luo, Z.
    Royal Holloway, University of London, Egham, United Kingdom.
    Carlsson, L.
    Royal Holloway, University of London, Egham, United Kingdom.
    Preface (2022). In: Proceedings of Machine Learning Research / [ed] N. Lawrence, ML Research Press, 2022, Vol. 179, p. 1-3. Conference paper (Other academic)
  • 20.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    Boström, Henrik
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Conformal Prediction Using Decision Trees (2013). Conference paper (Refereed)
    Abstract [en]

    Conformal prediction is a relatively new framework in which the predictive models output sets of predictions with a bound on the error rate, i.e., in a classification context, the probability of excluding the correct class label is lower than a predefined significance level. An investigation of the use of decision trees within the conformal prediction framework is presented, with the overall purpose of determining the effect of different algorithmic choices, including split criterion, pruning scheme and way to calculate the probability estimates. Since the error rate is bounded by the framework, the most important property of conformal predictors is efficiency, which concerns minimizing the number of elements in the output prediction sets. Results from one of the largest empirical investigations to date within the conformal prediction framework are presented, showing that in order to optimize efficiency, the decision trees should be induced using no pruning and with smoothed probability estimates. The choice of split criterion to use for the actual induction of the trees did not turn out to have any major impact on the efficiency. Finally, the experimentation also showed that when using decision trees, standard inductive conformal prediction was as efficient as the recently suggested method cross-conformal prediction. This is an encouraging result, since cross-conformal prediction uses several decision trees, thus sacrificing the interpretability of a single decision tree.
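    The recommended configuration is easy to reproduce; a sketch (illustrative, not the paper's code) of an unpruned tree with Laplace-smoothed leaf probabilities, whose complement for the tentative label serves as the nonconformity score in standard inductive conformal classification:

    ```python
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def laplace_tree(X_train, y_train, n_classes):
        # Unpruned tree; leaf probabilities are Laplace-corrected label
        # counts, computed here from the training data for clarity.
        tree = DecisionTreeClassifier().fit(X_train, y_train)
        leaves = tree.apply(X_train)
        table = {leaf: (np.bincount(y_train[leaves == leaf], minlength=n_classes) + 1)
                       / (np.sum(leaves == leaf) + n_classes)
                 for leaf in np.unique(leaves)}
        # Returns smoothed class probabilities; 1 - probs[label] is the
        # nonconformity of a labeled example.
        return lambda X: np.array([table[leaf] for leaf in tree.apply(X)])
    ```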

  • 21.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    Boström, Henrik
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Linusson, Henrik
    University of Borås, School of Business and IT.
    Regression conformal prediction with random forests (2014). In: Machine Learning, ISSN 0885-6125, E-ISSN 1573-0565, Vol. 97, no 1-2, p. 155-176. Article in journal (Refereed)
    Abstract [en]

    Regression conformal prediction produces prediction intervals that are valid, i.e., the probability of excluding the correct target value is bounded by a predefined confidence level. The most important criterion when comparing conformal regressors is efficiency; the prediction intervals should be as tight (informative) as possible. In this study, the use of random forests as the underlying model for regression conformal prediction is investigated and compared to existing state-of-the-art techniques, which are based on neural networks and k-nearest neighbors. In addition to their robust predictive performance, random forests allow for determining the size of the prediction intervals by using out-of-bag estimates instead of requiring a separate calibration set. An extensive empirical investigation, using 33 publicly available data sets, was undertaken to compare the use of random forests to existing state-of-the-art conformal predictors. The results show that the suggested approach, on almost all confidence levels and using both standard and normalized nonconformity functions, produced significantly more efficient conformal predictors than the existing alternatives.
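    The out-of-bag variant requires no separate calibration set; a minimal, non-normalized sketch (illustrative, under stated assumptions):

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def oob_conformal_forest(X, y, X_new, alpha=0.1):
        forest = RandomForestRegressor(n_estimators=500, oob_score=True).fit(X, y)
        scores = np.sort(np.abs(y - forest.oob_prediction_))  # OOB residuals
        k = min(int(np.ceil((len(scores) + 1) * (1 - alpha))) - 1, len(scores) - 1)
        q = scores[k]                                         # conservative quantile
        mid = forest.predict(X_new)
        return mid - q, mid + q
    ```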

  • 22.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Gabrielsson, Patrick
    Dept. of Information Technology, University of Borås, Sweden.
    Are Traditional Neural Networks Well-Calibrated? (2019). In: Proceedings of the International Joint Conference on Neural Networks, IEEE, 2019, Vol. July, article id 8851962. Conference paper (Refereed)
    Abstract [en]

    Traditional neural networks are generally considered to be well-calibrated. Consequently, the established best practice is to not try to improve the calibration using general techniques like Platt scaling. In this paper, it is demonstrated, using 25 publicly available two-class data sets, that both single multilayer perceptrons and ensembles of multilayer perceptrons in fact often are poorly calibrated. Furthermore, from the experimental results, it is obvious that the calibration can be significantly improved by using either Platt scaling or Venn-Abers predictors. These results stand in sharp contrast to the standard recommendations for the use of neural networks as probabilistic classifiers. The empirical investigation also shows that for bagged ensembles, it is beneficial to calibrate on the out-of-bag instances, despite the fact that this leads to using substantially smaller ensembles for the predictions. Finally, an outright comparison between Platt scaling and Venn-Abers predictors shows that the latter most often produced significantly better calibrations, especially when calibrated on out-of-bag instances.
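    For reference, Platt scaling itself is just a one-dimensional logistic regression fitted on data not used to train the network; a sketch using held-out (or out-of-bag) scores, as the paper recommends:

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def platt_scale(raw_cal, y_cal, raw_test):
        # raw_*: uncalibrated network outputs (e.g., ensemble vote fractions).
        lr = LogisticRegression().fit(np.asarray(raw_cal).reshape(-1, 1), y_cal)
        return lr.predict_proba(np.asarray(raw_test).reshape(-1, 1))[:, 1]
    ```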

  • 23.
    Johansson, Ulf
    et al.
    Department of Information Technology, University of Borås, Sweden.
    König, R.
    Department of Information Technology, University of Borås, Sweden.
    Brattberg, P.
    Department of Information Technology, University of Borås, Sweden.
    Dahlbom, A.
    School of Informatics, University of Skövde, Sweden.
    Riveiro, Maria
    Department of Information Technology, University of Borås, Sweden.
    Mining Trackman golf data (2016). In: Proceedings - 2015 International Conference on Computational Science and Computational Intelligence, CSCI 2015, IEEE, 2016, p. 380-385. Conference paper (Refereed)
    Abstract [en]

    Recently, innovative technology like Trackman has made it possible to generate data describing golf swings. In this application paper, we analyze Trackman data from 275 golfers using descriptive statistics and machine learning techniques. The overall goal is to find non-trivial and general patterns in the data that can be used to identify and explain what separates skilled golfers from poor ones. Experimental results show that random forest models, generated from Trackman data, were able to predict the handicap of a golfer with a performance comparable to human experts. Based on interpretable predictive models, descriptive statistics and correlation analysis, the most distinguishing property of better golfers is their consistency. In addition, the analysis shows that better players have superior control of the club head at impact and generally hit the ball straighter. A very interesting finding is that better players also tend to swing flatter. Finally, an outright comparison between data describing the club head movement and ball flight data indicates that a majority of golfers do not hit the ball solidly enough for basic golf theory to apply.

  • 24.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    König, Rikard
    University of Borås, School of Business and IT.
    Linusson, Henrik
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Boström, Henrik
    University of Borås, School of Business and IT.
    Rule Extraction with Guaranteed Fidelity (2014). Conference paper (Refereed)
    Abstract [en]

    This paper extends the conformal prediction framework to rule extraction, making it possible to extract interpretable models from opaque models in a setting where either the infidelity or the error rate is bounded by a predefined significance level. Experimental results on 27 publicly available data sets show that all three setups evaluated produced valid and rather efficient conformal predictors. The implication is that augmenting rule extraction with conformal prediction allows extraction of models where test set errors or test set infidelities are guaranteed to be lower than a chosen acceptable level. Clearly, this is beneficial for both typical rule extraction scenarios, i.e., either when the purpose is to explain an existing opaque model, or when it is to build a predictive model that must be interpretable.

  • 25.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    König, Rikard
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Boström, Henrik
    University of Borås, School of Business and IT.
    Evolved Decision Trees as Conformal Predictors (2013). Conference paper (Refereed)
    Abstract [en]

    In conformal prediction, predictive models output sets of predictions with a bound on the error rate. In classification, this translates to that the probability of excluding the correct class is lower than a predefined significance level, in the long run. Since the error rate is guaranteed, the most important criterion for conformal predictors is efficiency. Efficient conformal predictors minimize the number of elements in the output prediction sets, thus producing more informative predictions. This paper presents one of the first comprehensive studies where evolutionary algorithms are used to build conformal predictors. More specifically, decision trees evolved using genetic programming are evaluated as conformal predictors. In the experiments, the evolved trees are compared to decision trees induced using standard machine learning techniques on 33 publicly available benchmark data sets, with regard to predictive performance and efficiency. The results show that the evolved trees are generally more accurate, and the corresponding conformal predictors more efficient, than their induced counterparts. One important result is that the probability estimates of decision trees when used as conformal predictors should be smoothed, here using the Laplace correction. Finally, using the more discriminating Brier score instead of accuracy as the optimization criterion produced the most efficient conformal predictions.

  • 26.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    König, Rikard
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Niklasson, Lars
    University of Borås, School of Business and IT.
    Increasing Rule Extraction Accuracy by Post-processing GP Trees (2008). In: Proceedings of the Congress on Evolutionary Computation, IEEE, 2008, p. 3010-3015. Conference paper (Refereed)
    Abstract [en]

    Genetic programming (GP) is a very general and efficient technique, often capable of outperforming more specialized techniques on a variety of tasks. In this paper, we suggest a straightforward novel algorithm for post-processing of GP classification trees. The algorithm iteratively, one node at a time, searches for possible modifications that would result in higher accuracy. More specifically, the algorithm for each split evaluates every possible constant value and chooses the best. With this design, the post-processing algorithm can only increase training accuracy, never decrease it. In this study, we apply the suggested algorithm to GP trees, extracted from neural network ensembles. Experimentation, using 22 UCI datasets, shows that the post-processing results in higher test set accuracies on a large majority of datasets. As a matter of fact, for two setups of three evaluated, the increase in accuracy is statistically significant.
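    The search itself is a guarantee-by-construction hill climb; a sketch with hypothetical node objects (the paper's tree representation is not reproduced here):

    ```python
    import numpy as np

    def tune_node_constants(nodes, X, y, predict):
        # For each internal node, try every observed value of its split
        # feature as the test constant and keep the best; a change is
        # accepted only if training accuracy improves, so accuracy can
        # never decrease.
        for node in nodes:  # hypothetical objects with .feature/.threshold
            best_t = node.threshold
            best_acc = np.mean(predict(X) == y)
            for t in np.unique(X[:, node.feature]):
                node.threshold = t
                acc = np.mean(predict(X) == y)
                if acc > best_acc:
                    best_t, best_acc = t, acc
            node.threshold = best_t  # commit the winner before moving on
    ```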

  • 27.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    König, Rikard
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Niklasson, Lars
    Using Imaginary Ensembles to Select GP Classifiers (2010). In: Genetic Programming: 13th European Conference, EuroGP 2010, Istanbul, Turkey, April 7-9, 2010, Proceedings / [ed] A.I. Esparcia-Alcazar et al., Springer, 2010, p. 278-288. Conference paper (Refereed)
    Abstract [en]

    When predictive modeling requires comprehensible models, most data miners will use specialized techniques producing rule sets or decision trees. This study, however, shows that genetically evolved decision trees may very well outperform the more specialized techniques. The proposed approach evolves a number of decision trees and then uses one of several suggested selection strategies to pick one specific tree from that pool. The inherent inconsistency of evolution makes it possible to evolve each tree using all data, and still obtain somewhat different models. The main idea is to use these quite accurate and slightly diverse trees to form an imaginary ensemble, which is then used as a guide when selecting one specific tree. Simply put, the tree classifying the largest number of instances identically to the ensemble is chosen. In the experimentation, using 25 UCI data sets, two selection strategies obtained significantly higher accuracy than the standard rule inducer J48.
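    The selection strategy is compact; a sketch (illustrative, not the paper's code) of the ensemble-agreement criterion:

    ```python
    import numpy as np

    def select_by_imaginary_ensemble(trees, X):
        # Majority vote over all evolved trees forms the "imaginary
        # ensemble"; the single tree agreeing with the vote on the most
        # instances is returned.
        preds = np.stack([t.predict(X) for t in trees]).astype(int)
        vote = np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, preds)
        agreement = (preds == vote).mean(axis=1)
        return trees[int(np.argmax(agreement))]
    ```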

  • 28.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    König, Rikard
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Sönströd, Cecilia
    University of Borås, School of Business and IT.
    Niklasson, Lars
    Post-processing Evolved Decision Trees (2009). In: Foundations of Computational Intelligence / [ed] Ajith Abraham, Springer, 2009, p. 149-164. Chapter in book (Other academic)
    Abstract [en]

    Although Genetic Programming (GP) is a very general technique, it is also quite powerful. As a matter of fact, GP has often been shown to outperform more specialized techniques on a variety of tasks. In data mining, GP has successfully been applied to most major tasks, e.g. classification, regression and clustering. In this chapter, we introduce, describe and evaluate a straightforward novel algorithm for post-processing genetically evolved decision trees. The algorithm works by iteratively, one node at a time, searching for possible modifications that will result in higher accuracy. More specifically, the algorithm, for each interior test, evaluates every possible split for the current attribute and chooses the best. With this design, the post-processing algorithm can only increase training accuracy, never decrease it. In the experiments, the suggested algorithm is applied to GP decision trees, either induced directly from datasets, or extracted from neural network ensembles. The experimentation, using 22 UCI datasets, shows that the suggested post-processing technique results in higher test set accuracies on a large majority of the datasets. As a matter of fact, the increase in test accuracy is statistically significant for one of the four evaluated setups, and substantial on two out of the other three.

  • 29.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Linusson, H.
    Department of Information Technology, University of Borås, Sweden.
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Boström, H.
    Department of Computer and Systems Sciences, Stockholm University, Sweden.
    Model-agnostic nonconformity functions for conformal classification (2017). In: Proceedings of the International Joint Conference on Neural Networks, IEEE, 2017, p. 2072-2079. Conference paper (Refereed)
    Abstract [en]

    A conformal predictor outputs prediction regions; for classification, these are label sets. The key property of all conformal predictors is that they are valid, i.e., their error rate on novel data is bounded by a preset significance level. Thus, the key performance metric for evaluating conformal predictors is the size of the output prediction regions, where smaller (more informative) prediction regions are said to be more efficient. All conformal predictors rely on nonconformity functions, measuring the strangeness of an input-output pair, and the efficiency depends critically on the quality of the chosen nonconformity function. In this paper, three model-agnostic nonconformity functions, based on well-known loss functions, are evaluated with regard to how they affect efficiency. In the experimentation on 21 publicly available multi-class data sets, both single neural networks and ensembles of neural networks are used as underlying models for conformal classifiers. The results show that the choice of nonconformity function has a major impact on the efficiency, but also that different nonconformity functions should be used depending on the exact efficiency metric. For a high fraction of single-label predictions, a margin-based nonconformity function is the best option, while a nonconformity function based on the hinge loss obtained the smallest label sets on average.
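    The two best-performing functions are one-liners over the model's class probability estimates; a sketch (illustrative, not the paper's code):

    ```python
    import numpy as np

    def hinge_nonconformity(probs, labels):
        # 1 - estimated probability of the (tentative) label.
        return 1 - probs[np.arange(len(labels)), labels]

    def margin_nonconformity(probs, labels):
        # Highest competing class probability minus the label's probability.
        competing = probs.copy()
        own = competing[np.arange(len(labels)), labels].copy()
        competing[np.arange(len(labels)), labels] = -np.inf
        return competing.max(axis=1) - own
    ```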

  • 30.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Linusson, Henrik
    Department of Information Technology, University of Borås, Sweden.
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Boström, Henrik
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Interpretable regression trees using conformal prediction (2018). In: Expert systems with applications, ISSN 0957-4174, E-ISSN 1873-6793, Vol. 97, p. 394-404. Article in journal (Refereed)
    Abstract [en]

    A key property of conformal predictors is that they are valid, i.e., their error rate on novel data is bounded by a preset level of confidence. For regression, this is achieved by turning the point predictions of the underlying model into prediction intervals. Thus, the most important performance metric for evaluating conformal regressors is not the error rate, but the size of the prediction intervals, where models generating smaller (more informative) intervals are said to be more efficient. State-of-the-art conformal regressors typically utilize two separate predictive models: the underlying model providing the center point of each prediction interval, and a normalization model used to scale each prediction interval according to the estimated level of difficulty for each test instance. When using a regression tree as the underlying model, this approach may cause test instances falling into a specific leaf to receive different prediction intervals. This clearly deteriorates the interpretability of a conformal regression tree compared to a standard regression tree, since the path from the root to a leaf can no longer be translated into a rule explaining all predictions in that leaf. In fact, the model cannot even be interpreted on its own, i.e., without reference to the corresponding normalization model. Current practice effectively presents two options for constructing conformal regression trees: to employ a (global) normalization model, and thereby sacrifice interpretability; or to avoid normalization, and thereby sacrifice both efficiency and individualized predictions. In this paper, two additional approaches are considered, both employing local normalization: the first approach estimates the difficulty by the standard deviation of the target values in each leaf, while the second approach employs Mondrian conformal prediction, which results in regression trees where each rule (path from root node to leaf node) is independently valid. An empirical evaluation shows that the first approach is as efficient as current state-of-the-art approaches, thus eliminating the efficiency vs. interpretability trade-off present in existing methods. Moreover, it is shown that if a validity guarantee is required for each single rule, as provided by the Mondrian approach, a penalty with respect to efficiency has to be paid, but it is only substantial at very high confidence levels.
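    The first, interpretability-preserving approach needs only one number per leaf; a sketch (illustrative, under stated assumptions):

    ```python
    import numpy as np

    def leaf_std_difficulty(tree, X_fit, y_fit, X):
        # Every instance in a leaf shares the same difficulty estimate (the
        # standard deviation of the training targets in that leaf), so all
        # predictions from one rule get one and the same interval width.
        leaves_fit = tree.apply(X_fit)
        std = {leaf: y_fit[leaves_fit == leaf].std() for leaf in np.unique(leaves_fit)}
        return np.array([std[leaf] for leaf in tree.apply(X)])
    ```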

  • 31.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Producing Implicit Diversity in ANN Ensembles (2012). Conference paper (Refereed)
    Abstract [en]

    Combining several ANNs into ensembles normally results in very accurate and robust predictive models. Many ANN ensemble techniques are, however, quite complicated and often explicitly optimize some diversity metric. Unfortunately, the lack of solid validation of the explicit algorithms, at least for classification, makes the use of diversity measures as part of an optimization function questionable. The merits of implicit methods, most notably bagging, are, on the other hand, experimentally established and well-known. This paper evaluates a number of straightforward techniques for introducing implicit diversity in ANN ensembles, including a novel technique that produces diversity by using ANNs with different and slightly randomized link structures. The experimental results, comparing altogether 54 setups and two different ensemble sizes on 30 UCI data sets, show that all methods succeeded in producing implicit diversity, but that the effect on ensemble accuracy varied. Still, most setups evaluated did result in more accurate ensembles compared to the baseline setup, especially for the larger ensemble size. In fact, several setups even obtained significantly higher ensemble accuracy than bagging. The analysis also identified that diversity was, relatively speaking, more important for the larger ensembles. Looking specifically at the methods used to increase implicit diversity, setups using the randomized link structures generally produced the most accurate ensembles.

  • 32.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Boström, Henrik
    University of Borås, School of Business and IT.
    Overproduce-and-Select: The Grim Reality (2013). Conference paper (Refereed)
    Abstract [en]

    Overproduce-and-select (OPAS) is a frequently used paradigm for building ensembles. In static OPAS, a large number of base classifiers are trained, before a subset of the available models is selected to be combined into the final ensemble. In general, the selected classifiers are supposed to be accurate and diverse for the OPAS strategy to result in highly accurate ensembles, but exactly how this is enforced in the selection process is not obvious. Most often, either individual models or ensembles are evaluated, using some performance metric, on available and labeled data. Naturally, the underlying assumption is that an observed advantage for the models (or the resulting ensemble) will carry over to test data. In the experimental study, a typical static OPAS scenario, using a pool of artificial neural networks and a number of very natural and frequently used performance measures, is evaluated on 22 publicly available data sets. The discouraging result is that although a fairly large proportion of the ensembles obtained higher test set accuracies, compared to using the entire pool as the ensemble, none of the selection criteria could be used to identify these highly accurate ensembles. Despite only investigating a specific scenario, we argue that the settings used are typical for static OPAS, thus making the results general enough to question the entire paradigm.

  • 33.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Boström, Henrik
    University of Borås, School of Business and IT.
    Random Brains (2013). Conference paper (Refereed)
    Abstract [en]

    In this paper, we introduce and evaluate a novel method, called random brains, for producing neural network ensembles. The suggested method, which is heavily inspired by the random forest technique, produces diversity implicitly by using bootstrap training and randomized architectures. More specifically, for each base classifier (a multilayer perceptron), a number of randomly selected links between the input layer and the hidden layer are removed prior to training, resulting in potentially weaker but more diverse base classifiers. The experimental results on 20 UCI data sets show that random brains obtained significantly higher accuracy and AUC compared to standard bagging of similar neural networks not utilizing randomized architectures. The analysis shows that the main reason for the increased ensemble performance is the ability to produce effective diversity, as indicated by the increase in the difficulty diversity measure.
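
    A minimal sketch of the random-brains idea, with a stated simplification: scikit-learn's MLP cannot mask individual links during training, so this sketch approximates per-link removal by dropping each network's view of a random subset of inputs (removing all of that input's links to the hidden layer), which is a coarser variant of the paper's technique. Bootstrap sampling provides the other source of implicit diversity; all names and parameters here are illustrative assumptions.

        import numpy as np
        from sklearn.datasets import load_breast_cancer
        from sklearn.model_selection import train_test_split
        from sklearn.neural_network import MLPClassifier
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(0)
        X, y = load_breast_cancer(return_X_y=True)
        X = StandardScaler().fit_transform(X)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

        ensemble = []
        for _ in range(15):
            boot = rng.integers(0, len(X_tr), len(X_tr))   # bootstrap replicate
            keep = rng.random(X.shape[1]) > 0.2            # drop each input (all its links) w.p. 0.2
            net = MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000,
                                random_state=int(rng.integers(1_000_000)))
            net.fit(X_tr[boot][:, keep], y_tr[boot])
            ensemble.append((net, keep))

        # average the probability estimates of the diverse base classifiers
        probs = np.mean([net.predict_proba(X_te[:, keep]) for net, keep in ensemble], axis=0)
        print("ensemble accuracy:", np.mean(probs.argmax(axis=1) == y_te))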

  • 34.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Niklasson, Lars
    University of Borås, School of Business and IT.
    Empirically Investigating the Importance of Diversity (2007). Conference paper (Refereed)
  • 35.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Niklasson, Lars
    Evaluating Standard Techniques for Implicit Diversity (2008). In: Advances in Knowledge Discovery and Data Mining, Springer, 2008, p. 613-622. Conference paper (Refereed)
  • 36.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Niklasson, Lars
    The Importance of Diversity in Neural Network Ensembles: An Empirical Investigation (2007). Conference paper (Refereed)
    Abstract [en]

    When designing ensembles, it is almost an axiom that the base classifiers must be diverse in order for the ensemble to generalize well. Unfortunately, there is no clear definition of the key term diversity, leading to several diversity measures and many, more or less ad hoc, methods for diversity creation in ensembles. In addition, no specific diversity measure has been shown to have a high correlation with test set accuracy. The purpose of this paper is to empirically evaluate ten different diversity measures, using neural network ensembles and 11 publicly available data sets. The main result is that all diversity measures evaluated, in this study too, show low or very low correlation with test set accuracy. Having said that, two measures, double fault and difficulty, show slightly higher correlations compared to the other measures. The study furthermore shows that the correlation between accuracy measured on training or validation data and test set accuracy is also rather low. These results challenge ensemble design techniques where diversity is explicitly maximized or where ensemble accuracy on a hold-out set is used for optimization.
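
    For concreteness, a minimal sketch of the pairwise double-fault measure mentioned above: the fraction of instances that both members of a classifier pair misclassify (lower values indicate more diversity). The toy predictions are invented for illustration only.

        import numpy as np

        def double_fault(pred_a, pred_b, y):
            # fraction of instances that both ensemble members misclassify
            return np.mean((pred_a != y) & (pred_b != y))

        y = np.array([0, 1, 1, 0, 1, 0])
        preds = np.array([[0, 1, 0, 0, 1, 1],
                          [0, 0, 0, 0, 1, 0],
                          [1, 1, 1, 0, 0, 0]])
        for i in range(len(preds)):
            for j in range(i + 1, len(preds)):
                print(f"classifiers {i},{j}: double fault = "
                      f"{double_fault(preds[i], preds[j], y):.2f}")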

  • 37.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Löfström, Tuve
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Sundell, Håkan
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Venn predictors using lazy learners (2018). In: Proceedings of the 2018 International Conference on Data Science, ICDATA'18 / [ed] R. Stahlbock, G. M. Weiss & M. Abou-Nasr, CSREA Press, 2018, p. 220-226. Conference paper (Refereed)
    Abstract [en]

    Probabilistic classification requires well-calibrated probability estimates, i.e., the predicted class probabilities must correspond to the true probabilities. Venn predictors, which can be used on top of any classifier, are automatically valid multiprobability predictors, making them extremely suitable for probabilistic classification. A Venn predictor outputs multiple probabilities for each label, so the predicted label is associated with a probability interval. While all Venn predictors are valid, their accuracy and the size of the probability interval depend on both the underlying model and certain internal design choices. Specifically, all Venn predictors use so-called Venn taxonomies for dividing the instances into a number of categories, with each taxonomy defining a different Venn predictor. A frequently used, but very basic, taxonomy categorizes the instances based on their predicted label. In this paper, we investigate some more fine-grained taxonomies that use not only the predicted label but also some measures related to the confidence in individual predictions. The empirical investigation, using 22 publicly available data sets and lazy learners (kNN) as the underlying models, showed that the probability estimates from the Venn predictors, as expected, were extremely well-calibrated. Most importantly, using the basic (i.e., label-based) taxonomy produced significantly more accurate and informative Venn predictors compared to the more complex alternatives. In addition, the results also showed that when using lazy learners as underlying models, a transductive approach significantly outperformed an inductive one with regard to accuracy and informativeness. This result is in contrast to previous studies, where other underlying models were used.
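
    A minimal sketch of a Venn predictor with the basic label-based taxonomy, using kNN as the underlying model. Note the simplification: this is an inductive variant with a held-out calibration set, not the transductive setup the paper favors, and the data set and neighborhood size are arbitrary choices made here. Each category holds the calibration instances sharing a predicted label; one empirical frequency per assumed test label yields the probability interval.

        import numpy as np
        from sklearn.datasets import load_breast_cancer
        from sklearn.model_selection import train_test_split
        from sklearn.neighbors import KNeighborsClassifier

        X, y = load_breast_cancer(return_X_y=True)
        X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
        X_cal, X_te, y_cal, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)
        knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)

        cal_cat = knn.predict(X_cal)   # basic taxonomy: category = predicted label
        for x in X_te[:3]:
            cat = knn.predict(x.reshape(1, -1))[0]
            in_cat = y_cal[cal_cat == cat]
            # one empirical frequency per assumed label of the test instance
            p1 = [(np.sum(in_cat == 1) + (assumed == 1)) / (len(in_cat) + 1)
                  for assumed in (0, 1)]
            print(f"predicted={cat}, P(class 1) in [{min(p1):.3f}, {max(p1):.3f}]")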

  • 38.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Sönströd, Cecilia
    University of Borås, School of Business and IT.
    Locally Induced Predictive Models (2011). Conference paper (Refereed)
    Abstract [en]

    Most predictive modeling techniques utilize all available data to build global models. This is despite the well-known fact that, for many problems, the targeted relationship varies greatly over the input space, suggesting that localized models may improve predictive performance. In this paper, we suggest and evaluate a technique that induces one predictive model for each test instance, using only neighboring instances. In the experimentation, several different variations of the suggested algorithm, producing localized decision trees and neural network models, are evaluated on 30 UCI data sets. The main result is that the suggested approach generally yields better predictive performance than global models built using all available training data. In fact, all techniques producing J48 trees obtained significantly higher accuracy and AUC compared to the global J48 model. For RBF network models, with their inherent ability to use localized information, the suggested approach was only successful with regard to accuracy, while global RBF models had better ranking ability, as seen by their generally higher AUCs.
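
    A minimal sketch of the local-induction idea: for every test instance, a model is trained on its nearest training neighbors only, then queried once. Scikit-learn decision trees stand in for the paper's J48 trees, and the data set and neighborhood size of 100 are assumptions, not the paper's settings.

        import numpy as np
        from sklearn.datasets import load_breast_cancer
        from sklearn.model_selection import train_test_split
        from sklearn.neighbors import NearestNeighbors
        from sklearn.tree import DecisionTreeClassifier

        X, y = load_breast_cancer(return_X_y=True)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        nn = NearestNeighbors(n_neighbors=100).fit(X_tr)

        preds = []
        for x in X_te:
            # induce one model per test instance, from its neighborhood only
            _, idx = nn.kneighbors(x.reshape(1, -1))
            local = DecisionTreeClassifier(random_state=0).fit(X_tr[idx[0]], y_tr[idx[0]])
            preds.append(local.predict(x.reshape(1, -1))[0])

        global_tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
        print("local models:", np.mean(np.array(preds) == y_te))
        print("global model:", global_tree.score(X_te, y_te))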

  • 39.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Well-calibrated and specialized probability estimation trees (2020). In: Proceedings of the 2020 SIAM International Conference on Data Mining, SDM 2020 / [ed] C. Demeniconi and N. Chawla, Society for Industrial and Applied Mathematics, 2020, p. 415-423. Conference paper (Refereed)
    Abstract [en]

    In many predictive modeling scenarios, the production set inputs that will later be used for the actual prediction are available and could be utilized in the modeling process. In fact, many predictive models are generated with an existing production set in mind. Despite this, few approaches utilize this information to produce models optimized for the production set at hand. If these models need to be comprehensible, the oracle coaching framework can be applied, often resulting in interpretable models, e.g., decision trees and rule sets, with accuracies on par with opaque models like neural networks and ensembles, on the specific production set. In oracle coaching, a strong but opaque predictive model is used to label instances, including the production set, which are later learned by a weaker but interpretable model. In this paper, oracle coaching is, for the first time, used for improving the calibration of probabilistic predictors. More specifically, setups where oracle coaching is combined with the techniques Platt scaling, isotonic regression and Venn-Abers are suggested and evaluated for calibrating probability estimation trees (PETs). A key contribution is the setup designs ensuring that the oracle-coached PETs, which by definition utilize knowledge about production data, remain well-calibrated. In the experimentation, using 23 publicly available data sets, it is shown that oracle-coached models are not only more accurate, but also significantly better calibrated, compared to standard induction. Interestingly enough, this holds both for the uncalibrated PETs and for all calibration techniques evaluated, i.e., Platt scaling, isotonic regression and Venn-Abers. As expected, all three external techniques significantly improved the calibration of the original PETs. Finally, an outright comparison between the three external calibration techniques showed that Venn-Abers significantly outperformed the alternatives in most setups.
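
    A minimal sketch of the core oracle-coaching step described above (without the calibration machinery the paper adds): a strong opaque model labels the unlabeled production inputs, and the interpretable model is trained on the true training labels plus those oracle labels. Model choices and the split are assumptions made for illustration.

        import numpy as np
        from sklearn.datasets import load_breast_cancer
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import train_test_split
        from sklearn.tree import DecisionTreeClassifier

        X, y = load_breast_cancer(return_X_y=True)
        X_tr, X_prod, y_tr, y_prod = train_test_split(X, y, test_size=0.3, random_state=0)

        oracle = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)  # strong, opaque
        # coach the interpretable model: true training labels plus the oracle's
        # labels for the (unlabeled) production inputs
        X_aug = np.vstack([X_tr, X_prod])
        y_aug = np.concatenate([y_tr, oracle.predict(X_prod)])
        coached = DecisionTreeClassifier(min_samples_leaf=5, random_state=0).fit(X_aug, y_aug)
        plain = DecisionTreeClassifier(min_samples_leaf=5, random_state=0).fit(X_tr, y_tr)

        print("plain tree on production set:  ", plain.score(X_prod, y_prod))
        print("coached tree on production set:", coached.score(X_prod, y_prod))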

  • 40.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Boström, H.
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden.
    Conformal Predictive Distribution Trees (2023). In: Annals of Mathematics and Artificial Intelligence, ISSN 1012-2443, E-ISSN 1573-7470. Article in journal (Refereed)
    Abstract [en]

    Being able to understand the logic behind predictions or recommendations on the instance level is at the heart of trustworthy machine learning models. Inherently interpretable models make this possible by allowing inspection and analysis of the model itself, thus exhibiting the logic behind each prediction, while providing an opportunity to gain insights about the underlying domain. Another important criterion for trustworthiness is the model's ability to somehow communicate a measure of confidence in every specific prediction or recommendation. Indeed, the overall goal of this paper is to produce highly informative models that combine interpretability and algorithmic confidence. For this purpose, we introduce conformal predictive distribution trees, a novel form of regression tree where each leaf contains a conformal predictive distribution. Using this representation language, the proposed approach allows very versatile analyses of individual leaves in the regression trees. Specifically, depending on the chosen level of detail, the leaves, in addition to the normal point predictions, can provide either cumulative distributions or prediction intervals that are guaranteed to be well-calibrated. In the empirical evaluation, the suggested conformal predictive distribution trees are compared to the well-established conformal regressors, thus demonstrating the benefits of the enhanced representation.

  • 41.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Boström, H.
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden.
    Well-Calibrated and Sharp Interpretable Multi-Class Models (2021). In: Lecture Notes in Computer Science: Modeling Decisions for Artificial Intelligence / [ed] V. Torra & Y. Narukawa, Springer Science and Business Media Deutschland GmbH, 2021, Vol. 12898, p. 193-204. Conference paper (Refereed)
    Abstract [en]

    Interpretable models make it possible to understand individual predictions, and are in many domains considered mandatory for user acceptance and trust. If coupled with communicated algorithmic confidence, interpretable models become even more informative, also making it possible to assess and compare the confidence expressed by the models in different predictions. To earn a user’s appropriate trust, however, the communicated algorithmic confidence must also be well-calibrated. In this paper, we suggest a novel way of extending Venn-Abers predictors to multi-class problems. The approach is applied to decision trees, providing well-calibrated probability intervals in the leaves. The result is one interpretable model with valid and sharp probability intervals, ready for inspection and analysis. In the experimentation, the proposed method is verified using 20 publicly available data sets showing that the generated models are indeed well-calibrated.

  • 42.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Boström, Henrik
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Calibrating multi-class models (2021). In: Proceedings of the Tenth Symposium on Conformal and Probabilistic Prediction and Applications / [ed] Lars Carlsson, Zhiyuan Luo, Giovanni Cherubin, Khuong An Nguyen, PMLR, 2021, Vol. 152, p. 111-130. Conference paper (Refereed)
    Abstract [en]

    Predictive models communicating algorithmic confidence are very informative, but only if well-calibrated and sharp, i.e., providing accurate probability estimates adjusted for each instance. While almost all machine learning algorithms are able to produce probability estimates, these are often poorly calibrated, thus requiring external calibration. For multi-class problems, external calibration has typically been done using one-vs-all or all-vs-all schemes, adding to the computational complexity but also making it impossible to analyze and inspect the predictive models. In this paper, we suggest a novel approach for calibrating inherently multi-class models. Instead of providing a probability distribution over all labels, the approach estimates the probability that the class label predicted by the underlying model is correct. In an extensive empirical study, it is shown that the suggested approach, when applied to both Platt scaling and Venn-Abers, is able to improve the probability estimates from decision trees, random forests and extreme gradient boosting.
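
    A minimal sketch of the binary reduction described above, using Platt-style scaling: instead of calibrating a full distribution over labels, the question "was the predicted label correct?" is calibrated from the model's top confidence score. The data set, the decision-tree underlying model and all parameters are assumptions for illustration.

        import numpy as np
        from sklearn.datasets import load_digits
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import train_test_split
        from sklearn.tree import DecisionTreeClassifier

        X, y = load_digits(return_X_y=True)
        X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
        X_cal, X_te, y_cal, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)
        tree = DecisionTreeClassifier(min_samples_leaf=10, random_state=0).fit(X_tr, y_tr)

        # binary reduction: did the underlying model predict the correct label?
        conf_cal = tree.predict_proba(X_cal).max(axis=1)
        correct_cal = (tree.predict(X_cal) == y_cal).astype(int)
        # Platt-style scaling: logistic regression from confidence to P(correct)
        platt = LogisticRegression().fit(conf_cal.reshape(-1, 1), correct_cal)

        conf_te = tree.predict_proba(X_te).max(axis=1)
        p_correct = platt.predict_proba(conf_te.reshape(-1, 1))[:, 1]
        print("mean raw confidence:       ", conf_te.mean())
        print("mean calibrated P(correct):", p_correct.mean())
        print("actual accuracy:           ", np.mean(tree.predict(X_te) == y_te))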

  • 43.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Boström, Henrik
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Calibrating probability estimation trees using Venn-Abers predictors (2019). In: SIAM International Conference on Data Mining, SDM 2019, Society for Industrial and Applied Mathematics, 2019, p. 28-36. Conference paper (Refereed)
    Abstract [en]

    Class labels output by standard decision trees are not very useful for making informed decisions, e.g., when comparing the expected utility of various alternatives. In contrast, probability estimation trees (PETs) output class probability distributions rather than single class labels. It is well known that estimating class probabilities in PETs by relative frequencies often leads to extreme probability estimates, and a number of approaches to provide more well-calibrated estimates have been proposed. In this study, a recent model-agnostic calibration approach, called Venn-Abers predictors, is, for the first time, considered in the context of decision trees. Results from a large-scale empirical investigation are presented, comparing the novel approach to previous calibration techniques with respect to several different performance metrics, targeting both predictive performance and reliability of the estimates. All approaches are considered both with and without Laplace correction. The results show that using Venn-Abers predictors for calibration is a highly competitive approach, significantly outperforming Platt scaling, isotonic regression and no calibration with respect to almost all performance metrics used, independently of whether Laplace correction is applied or not. The only exception is AUC, where using non-calibrated PETs together with Laplace correction is actually the best option, which can be explained by the fact that AUC is affected not by the absolute, but only the relative, values of the probability estimates.
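
    A minimal sketch of inductive Venn-Abers calibration applied to a PET's relative-frequency leaf estimates, in its simple (slow) direct form: for each test score, two isotonic regressions are fitted, one per hypothetical label of the test instance, yielding a probability interval. The data set and tree settings are assumptions, and efficient implementations avoid the per-instance refitting done here.

        import numpy as np
        from sklearn.datasets import load_breast_cancer
        from sklearn.isotonic import IsotonicRegression
        from sklearn.model_selection import train_test_split
        from sklearn.tree import DecisionTreeClassifier

        X, y = load_breast_cancer(return_X_y=True)
        X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
        X_cal, X_te, y_cal, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)
        pet = DecisionTreeClassifier(min_samples_leaf=10, random_state=0).fit(X_tr, y_tr)

        s_cal = pet.predict_proba(X_cal)[:, 1]   # relative-frequency leaf estimates
        def venn_abers(s):
            # one isotonic fit per assumed label of the test instance
            bounds = []
            for assumed in (0, 1):
                iso = IsotonicRegression(out_of_bounds="clip")
                iso.fit(np.append(s_cal, s), np.append(y_cal, assumed))
                bounds.append(iso.predict([s])[0])
            return min(bounds), max(bounds)

        for s in pet.predict_proba(X_te[:5])[:, 1]:
            lo, hi = venn_abers(s)
            print(f"leaf frequency {s:.2f} -> calibrated P(class 1) in [{lo:.3f}, {hi:.3f}]")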

  • 44.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Boström, Henrik
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Sönströd, Cecilia
    Dept. of Information Technology, University of Borås, Sweden.
    Interpretable and Specialized Conformal Predictors (2019). In: Conformal and Probabilistic Prediction and Applications / [ed] Alex Gammerman, Vladimir Vovk, Zhiyuan Luo, Evgueni Smirnov, 2019, p. 3-22. Conference paper (Refereed)
    Abstract [en]

    In real-world scenarios, interpretable models are often required to explain predictions, and to allow for inspection and analysis of the model. The overall purpose of oracle coaching is to produce highly accurate, but interpretable, models optimized for a specific test set. Oracle coaching is applicable to the very common scenario where explanations and insights are needed for a specific batch of predictions, and the input vectors for this test set are available when building the predictive model. In this paper, oracle coaching is used for generating underlying classifiers for conformal prediction. The resulting conformal classifiers output valid label sets, i.e., the error rate on the test data is bounded by a preset significance level, as long as the labeled data used for calibration is exchangeable with the test set. Since validity is guaranteed for all conformal predictors, the key performance metric is efficiency, i.e., the size of the label sets, where smaller sets are more informative. The main contribution of this paper is the design of setups making sure that when oracle-coached decision trees, which by definition utilize knowledge about test data, are used as underlying models for conformal classifiers, the exchangeability between calibration and test data is maintained. Consequently, the resulting conformal classifiers retain the validity guarantees. In the experimentation, using a large number of publicly available data sets, the validity of the suggested setups is empirically demonstrated. Furthermore, the results show that the more accurate underlying models produced by oracle coaching also improved the efficiency of the corresponding conformal classifiers.

  • 45.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Linusson, Henrik
    University of Borås, Department of Information Technology, Borås, Sweden.
    Boström, Henrik
    The Royal Institute of Technology (KTH), School of Electrical Engineering and Computer Science, Stockholm, Sweden.
    Efficient Venn Predictors using Random Forests (2019). In: Machine Learning, ISSN 0885-6125, E-ISSN 1573-0565, Vol. 108, no 3, p. 535-550. Article in journal (Refereed)
    Abstract [en]

    Successful use of probabilistic classification requires well-calibrated probability estimates, i.e., the predicted class probabilities must correspond to the true probabilities. In addition, a probabilistic classifier must, of course, also be as accurate as possible. In this paper, Venn predictors, and their special case Venn-Abers predictors, are evaluated for probabilistic classification, using random forests as the underlying models. Venn predictors output multiple probabilities for each label, i.e., the predicted label is associated with a probability interval. Since all Venn predictors are valid in the long run, the size of the probability intervals is very important, with tighter intervals being more informative. The standard solution when calibrating a classifier is to employ an additional step, transforming the outputs from a classifier into probability estimates, using a labeled data set not employed for training of the models. For random forests, and other bagged ensembles, it is, however, possible to use the out-of-bag instances for calibration, making all training data available for both model learning and calibration. This procedure has previously been successfully applied to conformal prediction, but was evaluated here for the first time for Venn predictors. The empirical investigation, using 22 publicly available data sets, showed that all four versions of the Venn predictors were better calibrated than both the raw estimates from the random forest and the standard techniques Platt scaling and isotonic regression. Regarding both informativeness and accuracy, the standard Venn predictor calibrated on out-of-bag instances was the best setup evaluated. Most importantly, calibrating on out-of-bag instances, instead of using a separate calibration set, resulted in tighter intervals and more accurate models on every data set, for both the Venn predictors and the Venn-Abers predictors.
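
    A minimal sketch of the out-of-bag twist described above, simplified to a label-based Venn taxonomy: each training instance is categorized by its out-of-bag prediction, so no separate calibration set is needed. This is an approximation of the paper's procedure with arbitrary data and parameter choices, not its exact setup.

        import numpy as np
        from sklearn.datasets import load_breast_cancer
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import train_test_split

        X, y = load_breast_cancer(return_X_y=True)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        rf = RandomForestClassifier(n_estimators=300, oob_score=True,
                                    random_state=0).fit(X_tr, y_tr)

        # out-of-bag predictions play the role of the calibration set, so all
        # training data is used for both model learning and calibration
        oob_pred = rf.oob_decision_function_.argmax(axis=1)
        cat_labels = {c: y_tr[oob_pred == c] for c in (0, 1)}  # label-based taxonomy

        for x in X_te[:3]:
            cat = rf.predict(x.reshape(1, -1))[0]
            in_cat = cat_labels[cat]
            lo = np.sum(in_cat == 1) / (len(in_cat) + 1)
            hi = (np.sum(in_cat == 1) + 1) / (len(in_cat) + 1)
            print(f"predicted={cat}, P(class 1) in [{lo:.3f}, {hi:.3f}]")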

  • 46.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Ståhl, Niclas
    Jönköping University, School of Engineering, JTH, Department of Computing.
    Well-Calibrated Rule Extractors (2022). In: Proceedings of the Eleventh Symposium on Conformal and Probabilistic Prediction with Applications: Volume 179: Conformal and Probabilistic Prediction with Applications, 24-26 August 2022, Brighton, UK / [ed] U. Johansson, H. Boström, K. A. Nguyen, Z. Luo & L. Carlsson, ML Research Press, 2022, Vol. 179, p. 72-91. Conference paper (Refereed)
    Abstract [en]

    While explainability is widely considered necessary for trustworthy predictive models, most explanation modules give only a limited understanding of the reasoning behind the predictions. In pedagogical rule extraction, an opaque model is approximated with a transparent model induced using the original training instances, but with the predictions from the opaque model as targets. The result is an interpretable model revealing the exact reasoning used for every possible prediction. The pedagogical approach can be applied to any opaque model and can use any learning algorithm producing transparent models as the actual rule extractor. Unfortunately, even if the extracted model is induced to mimic the opaque one, test set fidelity may still be poor, thus clearly limiting the value of using the extracted model for explanations and analyses. In this paper, it is suggested to alleviate this problem by extracting probabilistic predictors with well-calibrated fitness estimates. For the calibration, Venn-Abers, with its unique validity guarantees, is employed. Using a setup where decision trees are extracted from MLP neural networks, the suggested approach is first demonstrated in detail on one real-world data set. After that, a large-scale empirical evaluation using 25 publicly available benchmark data sets is presented. The results show that the method indeed extracts interpretable models with well-calibrated fitness estimates, i.e., the extracted model can be used for explaining the opaque one. Specifically, in the setup used, every leaf in a decision tree contains a label and a well-calibrated probability interval for the fidelity. Consequently, a user could, in addition to obtaining explanations of individual predictions, find the parts of feature space where the decision tree is a good approximation of the MLP and where it is not. In fact, using the sizes of the probability intervals, the models also provide an indication of how certain individual fitness estimates are.

  • 47.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Sundell, Håkan
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Linusson, Henrik
    Department of Information Technology, University of Borås, Sweden.
    Gidenstam, Anders
    Department of Information Technology, University of Borås, Sweden.
    Boström, Henrik
    School of Information and Communication Technology, Royal Institute of Technology, Sweden.
    Venn predictors for well-calibrated probability estimation trees (2018). In: Conformal and Probabilistic Prediction and Applications / [ed] A. Gammerman, V. Vovk, Z. Luo, E. Smirnov, & R. Peeters, 2018, p. 3-14. Conference paper (Refereed)
    Abstract [en]

    Successful use of probabilistic classification requires well-calibrated probability estimates, i.e., the predicted class probabilities must correspond to the true probabilities. The standard solution is to employ an additional step, transforming the outputs from a classifier into probability estimates. In this paper, Venn predictors are compared to Platt scaling and isotonic regression, for the purpose of producing well-calibrated probabilistic predictions from decision trees. The empirical investigation, using 22 publicly available data sets, showed that the probability estimates from the Venn predictor were extremely well-calibrated. In fact, in a direct comparison using the accepted reliability metric, the Venn predictor estimates were the most exact on every data set.

  • 48.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Sönströd, Cecilia
    Jönköping University, School of Engineering, JTH, Department of Computing.
    Löfström, Helena
    Jönköping University, Jönköping International Business School.
    Conformal Prediction for Accuracy Guarantees in Classification with Reject Option (2023). In: Modeling Decisions for Artificial Intelligence: 20th International Conference, MDAI 2023, Umeå, Sweden, June 19–22, 2023, Proceedings / [ed] V. Torra and Y. Narukawa, Springer, 2023, p. 133-145. Conference paper (Refereed)
    Abstract [en]

    A standard classifier is forced to predict the label of every test instance, even when confidence in the predictions is very low. In many scenarios, it would, however, be better to avoid making these predictions, maybe leaving them to a human expert. A classifier with that alternative is referred to as a classifier with reject option. In this paper, we propose an algorithm that, for a particular data set, automatically suggests a number of accuracy levels, which it will be able to meet perfectly, using a classifier with reject option. Since the basis of the suggested algorithm is conformal prediction, it comes with strong validity guarantees. The experimentation, using 25 publicly available two-class data sets, confirms that the algorithm obtains empirical accuracies very close to the requested levels. In addition, in an outright comparison with probabilistic predictors, including models calibrated with Platt scaling, the suggested algorithm clearly outperforms the alternatives.
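
    A minimal sketch of the underlying reject mechanism (not the paper's accuracy-level-suggestion algorithm): a split conformal classifier is built, a prediction is accepted only when the label set at significance eps is a singleton, and all other instances are rejected. The data set, underlying model and eps are assumptions made for illustration.

        import numpy as np
        from sklearn.datasets import load_digits
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import train_test_split

        X, y = load_digits(return_X_y=True)
        X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
        X_cal, X_te, y_cal, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)
        model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

        # hinge nonconformity scores on the calibration set
        alpha_cal = 1 - model.predict_proba(X_cal)[np.arange(len(y_cal)), y_cal]
        alpha_te = 1 - model.predict_proba(X_te)   # one score per candidate label
        p_vals = (np.sum(alpha_cal[None, None, :] >= alpha_te[:, :, None], axis=2) + 1) \
                 / (len(alpha_cal) + 1)

        eps = 0.05
        sets = p_vals > eps               # conformal label sets at significance eps
        accept = sets.sum(axis=1) == 1    # predict only when the set is a singleton
        acc = np.mean(sets[accept].argmax(axis=1) == y_te[accept])
        print(f"rejected {100 * (1 - accept.mean()):.1f}% of instances, "
              f"accuracy on accepted: {acc:.3f}")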

  • 49.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Sönströd, C.
    Department of Information Technology, University of Borås, Sweden.
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Department of Computing, Jönköping AI Lab (JAIL).
    Boström, H.
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Customized interpretable conformal regressors (2019). In: Proceedings - 2019 IEEE International Conference on Data Science and Advanced Analytics, DSAA 2019, Institute of Electrical and Electronics Engineers (IEEE), 2019, p. 221-230, article id 8964179. Conference paper (Refereed)
    Abstract [en]

    Interpretability is recognized as a key property of trustworthy predictive models. Only interpretable models make it straightforward to explain individual predictions, and allow inspection and analysis of the model itself. In real-world scenarios, these explanations and insights are often needed for a specific batch of predictions, i.e., a production set. If the input vectors for this production set are available when generating the predictive model, a methodology called oracle coaching can be used to produce highly accurate and interpretable models optimized for the specific production set. In this paper, oracle coaching is, for the first time, combined with the conformal prediction framework for predictive regression. A conformal regressor, which is built on top of a standard regression model, outputs valid prediction intervals, i.e., the error rate on novel data is bounded by a preset significance level, as long as the labeled data used for calibration is exchangeable with production data. Since validity is guaranteed for all conformal predictors, the key performance metric is the size of the prediction intervals, where tighter (more efficient) intervals are preferred. The efficiency of a conformal model depends on several factors, but more accurate underlying models will generally also lead to improved efficiency in the corresponding conformal predictor. A key contribution in this paper is the design of setups ensuring that when oracle-coached regression trees, which by definition utilize knowledge about production data, are used as underlying models for conformal regressors, these remain valid. The experiments, using 20 publicly available regression data sets, demonstrate the validity of the suggested setups. Results also show that utilizing oracle-coached underlying models will generally lead to significantly more efficient conformal regressors, compared to when these are built on top of models induced using only training data.

  • 50.
    Johansson, Ulf
    et al.
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Sundström, Malin
    University of Borås, Faculty of Textiles, Engineering and Business.
    Sundell, Håkan
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    König, Rickard
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Balkow, Jenny
    University of Borås, Faculty of Textiles, Engineering and Business.
    Dataanalys för ökad kundförståelse [Data analysis for improved customer understanding] (2016). Report (Other (popular science, discussion, etc.))