Search results 1 - 50 of 64
  • 1.
    Ahlberg, Ernst
    et al.
    Predictive Compound ADME & Safety, Drug Safety & Metabolism, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Winiwarter, Susanne
    Predictive Compound ADME & Safety, Drug Safety & Metabolism, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Boström, Henrik
    Department of Computer and Systems Sciences, Stockholm University, Sweden.
    Linusson, Henrik
    Department of Information Technology, University of Borås, Sweden.
    Löfström, Tuve
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Norinder, Ulf
    Swetox, Karolinska Institutet, Unit of Toxicology Sciences, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Engkvist, Ola
    External Sciences, Discovery Sciences, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Hammar, Oscar
    Quantitative Biology, Discovery Sciences, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Bendtsen, Claus
    Quantitative Biology, Discovery Sciences, AstraZeneca IMED Biotech Unit, Cambridge, UK.
    Carlsson, Lars
    Quantitative Biology, Discovery Sciences, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Using conformal prediction to prioritize compound synthesis in drug discovery (2017). In: Proceedings of Machine Learning Research, Volume 60: Conformal and Probabilistic Prediction and Applications, 13-16 June 2017, Stockholm, Sweden / [ed] Alex Gammerman, Vladimir Vovk, Zhiyuan Luo, and Harris Papadopoulos, Machine Learning Research, 2017, p. 174-184. Conference paper (Refereed)
    Abstract [en]

    The choice of how much money and resources to spend to understand certain problems is of high interest in many areas. This work illustrates how computational models can be more tightly coupled with experiments to generate decision data at lower cost without reducing the quality of the decision. Several different strategies are explored to illustrate the trade-off between lowering costs and quality in decisions.

    AUC is used as a performance metric and the number of objects that can be learnt from is constrained. Some of the strategies described reach AUC values over 0.9 and outperform more random strategies. The strategies that use conformal predictor p-values show varying results, although some are top performing.

    The application studied is taken from the drug discovery process. In the early stages of this process, compounds that could potentially become marketed drugs are routinely tested in experimental assays to understand their distribution and interactions in humans.

  • 2.
    Boström, Henrik
    et al.
    Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Linusson, Henrik
    Department of Information Technology, University of Borås, Borås, Sweden.
    Löfström, Tuve
    Department of Information Technology, University of Borås, Borås, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics. Jönköping University, School of Engineering, JTH. Research area Computer Science and Informatics.
    Evaluation of a variance-based nonconformity measure for regression forests (2016). In: Conformal and Probabilistic Prediction with Applications, Springer, 2016, p. 75-89. Conference paper (Refereed)
    Abstract [en]

    In a previous large-scale empirical evaluation of conformal regression approaches, random forests using out-of-bag instances for calibration, together with a k-nearest-neighbor-based nonconformity measure, were shown to obtain state-of-the-art performance with respect to efficiency, i.e., average size of prediction regions. However, the nearest-neighbor procedure not only requires that all training data be retained in conjunction with the underlying model, but also incurs a significant computational overhead during both training and testing. In this study, a more straightforward nonconformity measure is investigated, where the difficulty estimate employed for normalization is based on the variance of the predictions made by the trees in a forest. A large-scale empirical evaluation is presented, showing that both the nearest-neighbor-based and the variance-based measures significantly outperform a standard (non-normalized) nonconformity measure, while no significant difference in efficiency between the two normalized approaches is observed. Moreover, the evaluation shows that state-of-the-art performance is achieved by the variance-based measure at a computational cost that is several orders of magnitude lower than when employing the nearest-neighbor-based nonconformity measure.
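
    A minimal sketch of the variance-based normalized nonconformity measure described above, using scikit-learn's RandomForestRegressor on synthetic data; the data set, the smoothing constant beta and the significance level are illustrative assumptions, not the authors' setup.

        import numpy as np
        from sklearn.datasets import make_regression
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.model_selection import train_test_split

        X, y = make_regression(n_samples=2000, n_features=10, noise=10.0, random_state=0)
        X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
        X_cal, X_test, y_cal, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

        rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

        def tree_variance(model, X):
            # Difficulty estimate: variance of the individual trees' predictions.
            return np.stack([t.predict(X) for t in model.estimators_]).var(axis=0)

        beta = 1.0   # smoothing constant added to the difficulty estimate (illustrative)
        cal_scores = np.sort(np.abs(y_cal - rf.predict(X_cal)) / (tree_variance(rf, X_cal) + beta))

        eps = 0.1    # significance level: at most 10% of intervals may miss the target
        k = int(np.ceil((1 - eps) * (len(cal_scores) + 1))) - 1
        q = cal_scores[min(k, len(cal_scores) - 1)]

        half_width = q * (tree_variance(rf, X_test) + beta)
        pred = rf.predict(X_test)
        print("empirical coverage:", np.mean(np.abs(y_test - pred) <= half_width))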

  • 3.
    Boström, Henrik
    et al.
    Department of Computer and Systems Sciences, Stockholm University, Stockholm, Sweden.
    Linusson, Henrik
    Department of Information Technology, University of Borås, Borås, Sweden.
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Borås, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Accelerating difficulty estimation for conformal regression forests (2017). In: Annals of Mathematics and Artificial Intelligence, ISSN 1012-2443, E-ISSN 1573-7470, Vol. 81, no 1-2, p. 125-144. Article in journal (Refereed)
    Abstract [en]

    The conformal prediction framework allows for specifying the probability of making incorrect predictions by a user-provided confidence level. In addition to a learning algorithm, the framework requires a real-valued function, called nonconformity measure, to be specified. The nonconformity measure does not affect the error rate, but the resulting efficiency, i.e., the size of output prediction regions, may vary substantially. A recent large-scale empirical evaluation of conformal regression approaches showed that using random forests as the learning algorithm together with a nonconformity measure based on out-of-bag errors normalized using a nearest-neighbor-based difficulty estimate, resulted in state-of-the-art performance with respect to efficiency. However, the nearest-neighbor procedure incurs a significant computational cost. In this study, a more straightforward nonconformity measure is investigated, where the difficulty estimate employed for normalization is based on the variance of the predictions made by the trees in a forest. A large-scale empirical evaluation is presented, showing that both the nearest-neighbor-based and the variance-based measures significantly outperform a standard (non-normalized) nonconformity measure, while no significant difference in efficiency between the two normalized approaches is observed. The evaluation moreover shows that the computational cost of the variance-based measure is several orders of magnitude lower than when employing the nearest-neighbor-based nonconformity measure. The use of out-of-bag instances for calibration does, however, result in nonconformity scores that are distributed differently from those obtained from test instances, questioning the validity of the approach. An adjustment of the variance-based measure is presented, which is shown to be valid and also to have a significant positive effect on the efficiency. For conformal regression forests, the variance-based nonconformity measure is hence a computationally efficient and theoretically well-founded alternative to the nearest-neighbor procedure. 

  • 4.
    Buendia, Ruben
    et al.
    Department of Information Technology, University of Borås, Borås, Sweden.
    Kogej, Thierry
    Discovery Sciences, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Engkvist, Ola
    Discovery Sciences, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Carlsson, Lars
    Discovery Sciences, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Linusson, Henrik
    Department of Information Technology, University of Borås, Borås, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Toccaceli, Paolo
    Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, United Kingdom.
    Ahlberg, Ernst
    Data Science and AI, Drug Safety & Metabolism, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Accurate Hit Estimation for Iterative Screening Using Venn-ABERS Predictors (2019). In: Journal of Chemical Information and Modeling, ISSN 1549-9596, E-ISSN 1549-960X, Vol. 59, no 3, p. 1230-1237. Article in journal (Refereed)
    Abstract [en]

    Iterative screening has emerged as a promising approach to increase the efficiency of high-throughput screening (HTS) campaigns in drug discovery. By learning from a subset of the compound library, inferences on what compounds to screen next can be made by predictive models. One of the challenges of iterative screening is to decide how many iterations to perform. This is mainly related to difficulties in estimating the prospective hit rate in any given iteration. In this article, a novel method based on Venn-ABERS predictors is proposed. The method provides accurate estimates of the number of hits retrieved in any given iteration during an HTS campaign. The estimates provide the necessary information to support the decision on the number of iterations needed to maximize the screening outcome. Thus, this method offers a prospective screening strategy for early-stage drug discovery.

  • 5.
    Carlsson, Lars
    et al.
    Drug Safety and Metabolism, AstraZeneca Innovative Medicines and Early Development, Mölndal, Sweden.
    Ahlberg, Ernst
    Drug Safety and Metabolism, AstraZeneca Innovative Medicines and Early Development, Mölndal, Sweden.
    Boström, Henrik
    Department of Computer and Systems Sciences, Stockholm University, Stockholm, Sweden.
    Johansson, Ulf
    School of Business and IT, University of Borås, Borås, Sweden.
    Linusson, Henrik
    School of Business and IT, University of Borås, Borås, Sweden.
    Modifications to p-values of conformal predictors (2015). In: Statistical Learning and Data Sciences, Springer, 2015, p. 251-259. Conference paper (Refereed)
    Abstract [en]

    The original definition of a p-value in a conformal predictor can sometimes lead to overly conservative prediction regions when the number of training or calibration examples is small. The situation can be improved by using a modification to define an approximate p-value. Two modified p-values are presented that converge to the original p-value as the number of training or calibration examples goes to infinity. Numerical experiments empirically support the use of a p-value we call the interpolated p-value for conformal prediction. The interpolated p-value seems to produce prediction sets whose error rate corresponds well to the prescribed significance level.
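
    A minimal sketch contrasting the standard conformal p-value with one plausible linearly interpolated variant on a deliberately small calibration set; the exact interpolation rule defined in the paper may differ.

        import numpy as np

        def p_value_standard(cal_scores, score):
            # Standard conformal p-value: fraction of calibration scores at least
            # as nonconforming as the test score (plus one for the test object).
            return (np.sum(cal_scores >= score) + 1) / (len(cal_scores) + 1)

        def p_value_interpolated(cal_scores, score):
            # Interpolate linearly between the empirical p-values at the two
            # calibration scores bracketing the test score (illustrative construction).
            s = np.sort(cal_scores)
            n = len(s)
            ps = (n - np.arange(n)) / (n + 1)   # approximate p-values at sorted scores
            if score <= s[0]:
                return 1.0
            if score >= s[-1]:
                return 1.0 / (n + 1)
            j = np.searchsorted(s, score)       # s[j-1] < score <= s[j]
            frac = (score - s[j - 1]) / (s[j] - s[j - 1])
            return ps[j - 1] + frac * (ps[j] - ps[j - 1])

        rng = np.random.default_rng(0)
        cal = rng.normal(size=20)               # deliberately small calibration set
        print(p_value_standard(cal, 1.3), p_value_interpolated(cal, 1.3))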

  • 6.
    Dahlbom, Anders
    et al.
    University of Skövde.
    Riveiro, Maria
    University of Skövde, School of Informatics. University of Skövde, Informatics Research Centre.
    König, Rikard
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Johansson, Ulf
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Brattberg, Peter
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Supporting Golf Coaching with 3D Modeling of Swings (2014). In: Sportinformatik X: Jahrestagung der dvs-Sektion Sportinformatik, Hamburg: Feldhaus Verlag, 2014, 10, p. 142-148. Chapter in book (Refereed)
  • 7.
    Gabrielsson, Patrick
    et al.
    Department of Information Technology, University of Borås, Sweden.
    Johansson, Ulf
    Department of Information Technology, University of Borås, Sweden.
    High-frequency equity index futures trading using recurrent reinforcement learning with candlesticks (2015). In: Proceedings - 2015 IEEE Symposium Series on Computational Intelligence, SSCI 2015, IEEE, 2015, p. 734-741. Conference paper (Refereed)
    Abstract [en]

    In 1997, Moody and Wu presented recurrent reinforcement learning (RRL) as a viable machine learning method within algorithmic trading. Subsequent research has shown a degree of controversy with regard to the benefits of incorporating technical indicators in the recurrent reinforcement learning framework. In 1991, Nison introduced Japanese candlesticks to the global research community as an alternative to employing traditional indicators within the technical analysis of financial time series. The literature accumulated over the past two and a half decades of research contains conflicting results with regard to the utility of using Japanese candlestick patterns to exploit inefficiencies in financial time series. In this paper, we combine features based on Japanese candlesticks with recurrent reinforcement learning to produce a high-frequency algorithmic trading system for the E-mini S&P 500 index futures market. Our empirical study shows a statistically significant increase in both return and Sharpe ratio compared to relevant benchmarks, suggesting the existence of exploitable spatio-temporal structure in Japanese candlestick patterns and the ability of recurrent reinforcement learning to detect and take advantage of this structure in a high-frequency equity index futures trading environment.

  • 8.
    Gabrielsson, Patrick
    et al.
    University of Borås, School of Business and IT.
    Johansson, Ulf
    University of Borås, School of Business and IT.
    König, Rikard
    University of Borås, School of Business and IT.
    Co-Evolving Online High-Frequency Trading Strategies Using Grammatical Evolution (2014). Conference paper (Refereed)
    Abstract [en]

    Numerous sophisticated algorithms exist for discovering reoccurring patterns in financial time series. However, the most accurate techniques available produce opaque models, from which it is impossible to discern the rationale behind trading decisions. It is therefore desirable to sacrifice some degree of accuracy for transparency. One fairly recent evolutionary computational technology that creates transparent models, using a user-specified grammar, is grammatical evolution (GE). In this paper, we explore the possibility of evolving transparent entry and exit trading strategies for the E-mini S&P 500 index futures market in a high-frequency trading environment using grammatical evolution. We compare the performance of models incorporating risk into their calculations with models that do not. Our empirical results suggest that profitable, risk-averse, transparent trading strategies for the E-mini S&P 500 can be obtained using grammatical evolution together with technical indicators.

  • 9.
    Gabrielsson, Patrick
    et al.
    University of Borås, School of Business and IT.
    König, Rikard
    University of Borås, School of Business and IT.
    Johansson, Ulf
    University of Borås, School of Business and IT.
    Evolving Hierarchical Temporal Memory-Based Trading Models (2013). Conference paper (Refereed)
    Abstract [en]

    We explore the possibility of using the genetic algorithm to optimize trading models based on the Hierarchical Temporal Memory (HTM) machine learning technology. Technical indicators, derived from intraday tick data for the E-mini S&P 500 futures market (ES), were used as feature vectors to the HTM models. All models were configured as binary classifiers, using a simple buy-and-hold trading strategy, and followed a supervised training scheme. The data set was partitioned into multiple folds to enable a modified cross validation scheme. Artificial Neural Networks (ANNs) were used to benchmark HTM performance. The results show that the genetic algorithm succeeded in finding predictive models with good performance and generalization ability. The HTM models outperformed the neural network models on the chosen data set and both technologies yielded profitable results with above average accuracy.

  • 10.
    Gabrielsson, Patrick
    et al.
    University of Borås, School of Business and IT.
    König, Rikard
    University of Borås, School of Business and IT.
    Johansson, Ulf
    University of Borås, School of Business and IT.
    Hierarchical Temporal Memory-based algorithmic trading of financial markets (2012). Conference paper (Refereed)
    Abstract [en]

    This paper explores the possibility of using the Hierarchical Temporal Memory (HTM) machine learning technology to create a profitable software agent for trading financial markets. Technical indicators, derived from intraday tick data for the E-mini S&P 500 futures market (ES), were used as feature vectors to the HTM models. All models were configured as binary classifiers, using a simple buy-and-hold trading strategy, and followed a supervised training scheme. The data set was divided into a training set, a validation set and three test sets; bearish, bullish and horizontal. The best performing model on the validation set was tested on the three test sets. Artificial Neural Networks (ANNs) were subjected to the same data sets in order to benchmark HTM performance. The results suggest that the HTM technology can be used together with a feature vector of technical indicators to create a profitable trading algorithm for financial markets. Results also suggest that HTM performance is, at the very least, comparable to commonly applied neural network models.

  • 11.
    Johansson, Ulf
    et al.
    Department of Information Technology, University of Borås, Borås, Sweden.
    Ahlberg, Ernst
    Drug Safety and Metabolism, AstraZeneca Innovative Medicines and Early Development, Mölndal, Sweden.
    Boström, Henrik
    Department of Computer and Systems Sciences, Stockholm University, Stockholm, Sweden.
    Carlsson, Lars
    Drug Safety and Metabolism, AstraZeneca Innovative Medicines and Early Development, Mölndal, Sweden.
    Linusson, Henrik
    Department of Information Technology, University of Borås, Borås, Sweden.
    Sönströd, Cecilia
    Department of Information Technology, University of Borås, Borås, Sweden.
    Handling small calibration sets in Mondrian inductive conformal regressors (2015). In: Statistical Learning and Data Sciences, Springer, 2015, p. 271-280. Conference paper (Refereed)
    Abstract [en]

    In inductive conformal prediction, calibration sets must contain an adequate number of instances to support the chosen confidence level. This problem is particularly prevalent when using Mondrian inductive conformal prediction, where the input space is partitioned into independently valid prediction regions. In this study, Mondrian conformal regressors, in the form of regression trees, are used to investigate two problematic aspects of small calibration sets. If there are too few calibration instances to support the significance level, we suggest using either extrapolation or altering the model. In situations where the desired significance level is between two calibration instances, the standard procedure is to choose the more nonconforming one, thus guaranteeing validity, but producing conservative conformal predictors. The suggested solution is to use interpolation between calibration instances. All proposed techniques are empirically evaluated and compared to the standard approach on 30 benchmark data sets. The results show that while extrapolation often results in invalid models, interpolation works extremely well and provides increased efficiency with preserved empirical validity.
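
    A minimal sketch of a Mondrian (per-leaf) inductive conformal regression tree with a calibration-size check and interpolation between calibration instances; the conservative fallback and all parameters are illustrative assumptions rather than the paper's exact procedure.

        import numpy as np
        from sklearn.datasets import make_regression
        from sklearn.model_selection import train_test_split
        from sklearn.tree import DecisionTreeRegressor

        X, y = make_regression(n_samples=3000, n_features=8, noise=15.0, random_state=1)
        X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.5, random_state=1)
        X_cal, X_te, y_cal, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=1)

        tree = DecisionTreeRegressor(min_samples_leaf=100, random_state=1).fit(X_tr, y_tr)
        eps = 0.1
        cal_leaf = tree.apply(X_cal)
        cal_res = np.abs(y_cal - tree.predict(X_cal))

        def leaf_half_width(leaf, interpolate=True):
            res = np.sort(cal_res[cal_leaf == leaf])
            n = len(res)
            if n + 1 < 1 / eps:                  # too few instances to support eps
                return res[-1] if n else np.inf  # conservative fallback (illustrative)
            pos = (1 - eps) * (n + 1) - 1        # fractional index into sorted residuals
            lo, hi = int(np.floor(pos)), int(np.ceil(pos))
            if not interpolate or lo == hi:
                return res[hi]                   # standard: the more nonconforming residual
            return res[lo] + (pos - lo) * (res[hi] - res[lo])

        pred = tree.predict(X_te)
        hw = np.array([leaf_half_width(l) for l in tree.apply(X_te)])
        print("coverage:", np.mean(np.abs(y_te - pred) <= hw))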

  • 12.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    Boström, Henrik
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Conformal Prediction Using Decision Trees (2013). Conference paper (Refereed)
    Abstract [en]

    Conformal prediction is a relatively new framework in which the predictive models output sets of predictions with a bound on the error rate, i.e., in a classification context, the probability of excluding the correct class label is lower than a predefined significance level. An investigation of the use of decision trees within the conformal prediction framework is presented, with the overall purpose of determining the effect of different algorithmic choices, including split criterion, pruning scheme and way to calculate the probability estimates. Since the error rate is bounded by the framework, the most important property of conformal predictors is efficiency, which concerns minimizing the number of elements in the output prediction sets. Results from one of the largest empirical investigations to date within the conformal prediction framework are presented, showing that in order to optimize efficiency, the decision trees should be induced using no pruning and with smoothed probability estimates. The choice of split criterion to use for the actual induction of the trees did not turn out to have any major impact on the efficiency. Finally, the experimentation also showed that when using decision trees, standard inductive conformal prediction was as efficient as the recently suggested method cross-conformal prediction. This is an encouraging result since cross-conformal prediction uses several decision trees, thus sacrificing the interpretability of a single decision tree.
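
    A minimal sketch of the recommended setup from the abstract, i.e., an inductive conformal classifier on top of an unpruned decision tree with Laplace-smoothed leaf probabilities; the data set, nonconformity function and significance level are illustrative choices.

        import numpy as np
        from sklearn.datasets import load_iris
        from sklearn.model_selection import train_test_split
        from sklearn.tree import DecisionTreeClassifier

        X, y = load_iris(return_X_y=True)
        X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.5, random_state=0, stratify=y)
        X_cal, X_te, y_cal, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0, stratify=y_rest)

        tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)   # no pruning
        n_classes = len(tree.classes_)

        # Per-leaf class counts from the training data, Laplace-corrected.
        tr_leaves = tree.apply(X_tr)
        leaf_counts = {l: np.bincount(y_tr[tr_leaves == l], minlength=n_classes)
                       for l in np.unique(tr_leaves)}

        def laplace_proba(X):
            counts = np.array([leaf_counts[l] for l in tree.apply(X)])
            return (counts + 1) / (counts.sum(axis=1, keepdims=True) + n_classes)

        # Nonconformity: 1 minus the smoothed probability of the (hypothesized) label.
        cal_alpha = 1 - laplace_proba(X_cal)[np.arange(len(y_cal)), y_cal]

        eps = 0.1
        def prediction_set(probs):
            # Include every label whose conformal p-value exceeds the significance level.
            return [c for c in range(n_classes)
                    if (np.sum(cal_alpha >= 1 - probs[c]) + 1) / (len(cal_alpha) + 1) > eps]

        sets = [prediction_set(p) for p in laplace_proba(X_te)]
        print("avg set size:", np.mean([len(s) for s in sets]))
        print("coverage:", np.mean([t in s for t, s in zip(y_te, sets)]))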

  • 13.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    Boström, Henrik
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Linusson, Henrik
    University of Borås, School of Business and IT.
    Regression conformal prediction with random forests (2014). In: Machine Learning, ISSN 0885-6125, E-ISSN 1573-0565, Vol. 97, no 1-2, p. 155-176. Article in journal (Refereed)
    Abstract [en]

    Regression conformal prediction produces prediction intervals that are valid, i.e., the probability of excluding the correct target value is bounded by a predefined confidence level. The most important criterion when comparing conformal regressors is efficiency; the prediction intervals should be as tight (informative) as possible. In this study, the use of random forests as the underlying model for regression conformal prediction is investigated and compared to existing state-of-the-art techniques, which are based on neural networks and k-nearest neighbors. In addition to their robust predictive performance, random forests allow for determining the size of the prediction intervals by using out-of-bag estimates instead of requiring a separate calibration set. An extensive empirical investigation, using 33 publicly available data sets, was undertaken to compare the use of random forests to existing state-of-the-art conformal predictors. The results show that the suggested approach, on almost all confidence levels and using both standard and normalized nonconformity functions, produced significantly more efficient conformal predictors than the existing alternatives.
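
    A minimal sketch of regression conformal prediction with a random forest calibrated on out-of-bag residuals, so no separate calibration set is needed; the standard (non-normalized) nonconformity function and all parameters are illustrative.

        import numpy as np
        from sklearn.datasets import make_regression
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.model_selection import train_test_split

        X, y = make_regression(n_samples=2000, n_features=10, noise=10.0, random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

        rf = RandomForestRegressor(n_estimators=500, oob_score=True, random_state=0).fit(X_tr, y_tr)

        # Out-of-bag residuals serve as calibration scores.
        oob_res = np.sort(np.abs(y_tr - rf.oob_prediction_))
        eps = 0.1
        k = int(np.ceil((1 - eps) * (len(oob_res) + 1))) - 1
        half_width = oob_res[min(k, len(oob_res) - 1)]

        pred = rf.predict(X_te)
        print("coverage:", np.mean(np.abs(y_te - pred) <= half_width))
        print("interval width:", 2 * half_width)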

  • 14.
    Johansson, Ulf
    et al.
    Department of Information Technology, University of Borås, Sweden.
    König, R.
    Department of Information Technology, University of Borås, Sweden.
    Brattberg, P.
    Department of Information Technology, University of Borås, Sweden.
    Dahlbom, A.
    School of Informatics, University of Skövde, Sweden.
    Riveiro, Maria
    Department of Information Technology, University of Borås, Sweden.
    Mining Trackman golf data (2016). In: Proceedings - 2015 International Conference on Computational Science and Computational Intelligence, CSCI 2015, IEEE, 2016, p. 380-385. Conference paper (Refereed)
    Abstract [en]

    Recently, innovative technology like Trackman has made it possible to generate data describing golf swings. In this application paper, we analyze Trackman data from 275 golfers using descriptive statistics and machine learning techniques. The overall goal is to find non-trivial and general patterns in the data that can be used to identify and explain what separates skilled golfers from poor ones. Experimental results show that random forest models, generated from Trackman data, were able to predict the handicap of a golfer, with a performance comparable to human experts. Based on interpretable predictive models, descriptive statistics and correlation analysis, the most distinguishing property of better golfers is their consistency. In addition, the analysis shows that better players have superior control of the club head at impact and generally hit the ball straighter. A very interesting finding is that better players also tend to swing flatter. Finally, an outright comparison between data describing the club head movement and ball flight data indicates that a majority of golfers do not hit the ball solidly enough for basic golf theory to apply.

  • 15.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    König, Rikard
    University of Borås, School of Business and IT.
    Linusson, Henrik
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Boström, Henrik
    University of Borås, School of Business and IT.
    Rule Extraction with Guaranteed Fidelity (2014). Conference paper (Refereed)
    Abstract [en]

    This paper extends the conformal prediction framework to rule extraction, making it possible to extract interpretable models from opaque models in a setting where either the infidelity or the error rate is bounded by a predefined significance level. Experimental results on 27 publicly available data sets show that all three setups evaluated produced valid and rather efficient conformal predictors. The implication is that augmenting rule extraction with conformal prediction allows extraction of models where test set errors or test set infidelities are guaranteed to be lower than a chosen acceptable level. Clearly, this is beneficial in both typical rule extraction scenarios, i.e., when the purpose is to explain an existing opaque model, and when it is to build a predictive model that must be interpretable.

  • 16.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    König, Rikard
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Boström, Henrik
    University of Borås, School of Business and IT.
    Evolved Decision Trees as Conformal Predictors (2013). Conference paper (Refereed)
    Abstract [en]

    In conformal prediction, predictive models output sets of predictions with a bound on the error rate. In classification, this means that, in the long run, the probability of excluding the correct class is lower than a predefined significance level. Since the error rate is guaranteed, the most important criterion for conformal predictors is efficiency. Efficient conformal predictors minimize the number of elements in the output prediction sets, thus producing more informative predictions. This paper presents one of the first comprehensive studies where evolutionary algorithms are used to build conformal predictors. More specifically, decision trees evolved using genetic programming are evaluated as conformal predictors. In the experiments, the evolved trees are compared to decision trees induced using standard machine learning techniques on 33 publicly available benchmark data sets, with regard to predictive performance and efficiency. The results show that the evolved trees are generally more accurate, and the corresponding conformal predictors more efficient, than their induced counterparts. One important result is that the probability estimates of decision trees when used as conformal predictors should be smoothed, here using the Laplace correction. Finally, using the more discriminating Brier score instead of accuracy as the optimization criterion produced the most efficient conformal predictions.

  • 17.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    König, Rikard
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Niklasson, Lars
    University of Borås, School of Business and IT.
    Increasing Rule Extraction Accuracy by Post-processing GP Trees (2008). In: Proceedings of the Congress on Evolutionary Computation, IEEE, 2008, p. 3010-3015. Conference paper (Refereed)
    Abstract [en]

    Genetic programming (GP) is a very general and efficient technique, often capable of outperforming more specialized techniques on a variety of tasks. In this paper, we suggest a straightforward novel algorithm for post-processing of GP classification trees. The algorithm iteratively, one node at a time, searches for possible modifications that would result in higher accuracy. More specifically, for each split, the algorithm evaluates every possible constant value and chooses the best. With this design, the post-processing algorithm can only increase training accuracy, never decrease it. In this study, we apply the suggested algorithm to GP trees extracted from neural network ensembles. Experimentation, using 22 UCI datasets, shows that the post-processing results in higher test set accuracies on a large majority of datasets. As a matter of fact, for two of the three setups evaluated, the increase in accuracy is statistically significant.
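
    A minimal sketch of the post-processing idea on a toy tree representation standing in for an evolved GP tree: each split's constant is replaced by the candidate value that maximizes training accuracy; all names and data are illustrative.

        import numpy as np

        class Node:
            def __init__(self, feature=None, threshold=None, left=None, right=None, label=None):
                self.feature, self.threshold = feature, threshold
                self.left, self.right, self.label = left, right, label

        def predict(node, x):
            if node.label is not None:
                return node.label
            child = node.left if x[node.feature] <= node.threshold else node.right
            return predict(child, x)

        def accuracy(root, X, y):
            return np.mean([predict(root, x) == t for x, t in zip(X, y)])

        def post_process(root, X, y):
            # Visit nodes one at a time; at each split, evaluate every candidate
            # constant (midpoints of sorted feature values) and keep the best.
            def visit(node):
                if node.label is not None:
                    return
                vals = np.unique(X[:, node.feature])
                best_t, best_acc = node.threshold, accuracy(root, X, y)
                for t in (vals[:-1] + vals[1:]) / 2:
                    node.threshold = t
                    acc = accuracy(root, X, y)
                    if acc > best_acc:
                        best_t, best_acc = t, acc
                node.threshold = best_t   # can only improve training accuracy
                visit(node.left)
                visit(node.right)
            visit(root)
            return root

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 2))
        y = (X[:, 0] + X[:, 1] > 0).astype(int)
        tree = Node(0, 0.5, Node(label=0), Node(1, -0.5, Node(label=0), Node(label=1)))
        print("before:", accuracy(tree, X, y))
        post_process(tree, X, y)
        print("after:", accuracy(tree, X, y))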

  • 18.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    König, Rikard
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Niklasson, Lars
    Using Imaginary Ensembles to Select GP Classifiers (2010). In: Genetic Programming: 13th European Conference, EuroGP 2010, Istanbul, Turkey, April 7-9, 2010, Proceedings / [ed] A.I. Esparcia-Alcazar et al., Springer, 2010, p. 278-288. Conference paper (Refereed)
    Abstract [en]

    When predictive modeling requires comprehensible models, most data miners will use specialized techniques producing rule sets or decision trees. This study, however, shows that genetically evolved decision trees may very well outperform the more specialized techniques. The proposed approach evolves a number of decision trees and then uses one of several suggested selection strategies to pick one specific tree from that pool. The inherent inconsistency of evolution makes it possible to evolve each tree using all data, and still obtain somewhat different models. The main idea is to use these quite accurate and slightly diverse trees to form an imaginary ensemble, which is then used as a guide when selecting one specific tree. Simply put, the tree classifying the largest number of instances identically to the ensemble is chosen. In the experimentation, using 25 UCI data sets, two selection strategies obtained significantly higher accuracy than the standard rule inducer J48.
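
    A minimal sketch of the selection strategy: a pool of trees forms an imaginary ensemble, and the single tree that most often agrees with the ensemble's majority vote is selected. Bagged scikit-learn trees stand in for evolved GP trees; all settings are illustrative.

        import numpy as np
        from sklearn.datasets import load_breast_cancer
        from sklearn.model_selection import train_test_split
        from sklearn.tree import DecisionTreeClassifier
        from sklearn.utils import resample

        X, y = load_breast_cancer(return_X_y=True)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

        # Build a pool of slightly diverse trees (bootstrap samples + random seeds).
        pool = []
        for seed in range(25):
            Xb, yb = resample(X_tr, y_tr, random_state=seed)
            pool.append(DecisionTreeClassifier(max_depth=5, random_state=seed).fit(Xb, yb))

        preds = np.array([t.predict(X_tr) for t in pool])        # pool predictions
        ensemble_vote = (preds.mean(axis=0) >= 0.5).astype(int)  # imaginary ensemble

        agreement = (preds == ensemble_vote).mean(axis=1)        # fidelity to the ensemble
        best = pool[int(np.argmax(agreement))]
        print("selected tree test accuracy:", best.score(X_te, y_te))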

  • 19.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    König, Rikard
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Sönströd, Cecilia
    University of Borås, School of Business and IT.
    Niklasson, Lars
    Post-processing Evolved Decision Trees (2009). In: Foundations of Computational Intelligence / [ed] Ajith Abraham, Springer, 2009, p. 149-164. Chapter in book (Other academic)
    Abstract [en]

    Although Genetic Programming (GP) is a very general technique, it is also quite powerful. As a matter of fact, GP has often been shown to outperform more specialized techniques on a variety of tasks. In data mining, GP has successfully been applied to most major tasks, e.g. classification, regression and clustering. In this chapter, we introduce, describe and evaluate a straightforward novel algorithm for post-processing genetically evolved decision trees. The algorithm works by iteratively, one node at a time, searching for possible modifications that will result in higher accuracy. More specifically, the algorithm, for each interior test, evaluates every possible split for the current attribute and chooses the best. With this design, the post-processing algorithm can only increase training accuracy, never decrease it. In the experiments, the suggested algorithm is applied to GP decision trees, either induced directly from datasets or extracted from neural network ensembles. The experimentation, using 22 UCI datasets, shows that the suggested post-processing technique results in higher test set accuracies on a large majority of the datasets. As a matter of fact, the increase in test accuracy is statistically significant for one of the four evaluated setups, and substantial on two out of the other three.

  • 20.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Linusson, H.
    Department of Information Technology, University of Borås, Sweden.
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Boström, H.
    Department of Computer and Systems Sciences, Stockholm University, Sweden.
    Model-agnostic nonconformity functions for conformal classification (2017). In: Proceedings of the International Joint Conference on Neural Networks, IEEE, 2017, p. 2072-2079. Conference paper (Refereed)
    Abstract [en]

    A conformal predictor outputs prediction regions; for classification, these are label sets. The key property of all conformal predictors is that they are valid, i.e., their error rate on novel data is bounded by a preset significance level. Thus, the key performance metric for evaluating conformal predictors is the size of the output prediction regions, where smaller (more informative) prediction regions are said to be more efficient. All conformal predictors rely on nonconformity functions, measuring the strangeness of an input-output pair, and the efficiency depends critically on the quality of the chosen nonconformity function. In this paper, three model-agnostic nonconformity functions, based on well-known loss functions, are evaluated with regard to how they affect efficiency. In the experimentation on 21 publicly available multi-class data sets, both single neural networks and ensembles of neural networks are used as underlying models for conformal classifiers. The results show that the choice of nonconformity function has a major impact on the efficiency, but also that different nonconformity functions should be used depending on the exact efficiency metric. For a high fraction of single-label predictions, a margin-based nonconformity function is the best option, while a nonconformity function based on the hinge loss obtained the smallest label sets on average.
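
    A minimal sketch of two of the model-agnostic nonconformity functions discussed above, computed from any classifier's predicted class probabilities; the function names follow common conformal-prediction usage.

        import numpy as np

        def hinge_nonconformity(probs, label):
            # "Hinge": 1 minus the probability assigned to the hypothesized label.
            return 1.0 - probs[label]

        def margin_nonconformity(probs, label):
            # "Margin": highest competing probability minus the hypothesized label's.
            others = np.delete(probs, label)
            return others.max() - probs[label]

        probs = np.array([0.55, 0.35, 0.10])   # illustrative predicted distribution
        for c in range(3):
            print(c, hinge_nonconformity(probs, c), margin_nonconformity(probs, c))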

  • 21.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Linusson, Henrik
    Department of Information Technology, University of Borås, Sweden.
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Boström, Henrik
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Interpretable regression trees using conformal prediction (2018). In: Expert Systems with Applications, ISSN 0957-4174, E-ISSN 1873-6793, Vol. 97, p. 394-404. Article in journal (Refereed)
    Abstract [en]

    A key property of conformal predictors is that they are valid, i.e., their error rate on novel data is bounded by a preset level of confidence. For regression, this is achieved by turning the point predictions of the underlying model into prediction intervals. Thus, the most important performance metric for evaluating conformal regressors is not the error rate, but the size of the prediction intervals, where models generating smaller (more informative) intervals are said to be more efficient. State-of-the-art conformal regressors typically utilize two separate predictive models: the underlying model providing the center point of each prediction interval, and a normalization model used to scale each prediction interval according to the estimated level of difficulty for each test instance. When using a regression tree as the underlying model, this approach may cause test instances falling into a specific leaf to receive different prediction intervals. This clearly deteriorates the interpretability of a conformal regression tree compared to a standard regression tree, since the path from the root to a leaf can no longer be translated into a rule explaining all predictions in that leaf. In fact, the model cannot even be interpreted on its own, i.e., without reference to the corresponding normalization model. Current practice effectively presents two options for constructing conformal regression trees: to employ a (global) normalization model, and thereby sacrifice interpretability; or to avoid normalization, and thereby sacrifice both efficiency and individualized predictions. In this paper, two additional approaches are considered, both employing local normalization: the first approach estimates the difficulty by the standard deviation of the target values in each leaf, while the second approach employs Mondrian conformal prediction, which results in regression trees where each rule (path from root node to leaf node) is independently valid. An empirical evaluation shows that the first approach is as efficient as current state-of-the-art approaches, thus eliminating the efficiency vs. interpretability trade-off present in existing methods. Moreover, it is shown that if a validity guarantee is required for each single rule, as provided by the Mondrian approach, a penalty with respect to efficiency has to be paid, but it is only substantial at very high confidence levels.
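
    A minimal sketch of the first locally normalized approach described above: difficulty is estimated by the standard deviation of the training targets in each leaf, so all instances in a leaf receive the same interval width; data and parameters are illustrative.

        import numpy as np
        from sklearn.datasets import make_regression
        from sklearn.model_selection import train_test_split
        from sklearn.tree import DecisionTreeRegressor

        X, y = make_regression(n_samples=3000, n_features=6, noise=20.0, random_state=2)
        X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.5, random_state=2)
        X_cal, X_te, y_cal, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=2)

        tree = DecisionTreeRegressor(min_samples_leaf=50, random_state=2).fit(X_tr, y_tr)

        # Difficulty per leaf: std of the training targets (small constant avoids /0).
        tr_leaves = tree.apply(X_tr)
        leaf_std = {l: y_tr[tr_leaves == l].std() + 1e-6 for l in np.unique(tr_leaves)}

        def sigma(X):
            return np.array([leaf_std[l] for l in tree.apply(X)])

        eps = 0.1
        scores = np.sort(np.abs(y_cal - tree.predict(X_cal)) / sigma(X_cal))
        k = int(np.ceil((1 - eps) * (len(scores) + 1))) - 1
        q = scores[min(k, len(scores) - 1)]

        half = q * sigma(X_te)   # identical for all test instances in the same leaf
        print("coverage:", np.mean(np.abs(y_te - tree.predict(X_te)) <= half))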

  • 22.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Producing Implicit Diversity in ANN Ensembles (2012). Conference paper (Refereed)
    Abstract [en]

    Combining several ANNs into ensembles normally results in very accurate and robust predictive models. Many ANN ensemble techniques are, however, quite complicated and often explicitly optimize some diversity metric. Unfortunately, the lack of solid validation of the explicit algorithms, at least for classification, makes the use of diversity measures as part of an optimization function questionable. The merits of implicit methods, most notably bagging, are on the other hand experimentally established and well-known. This paper evaluates a number of straightforward techniques for introducing implicit diversity in ANN ensembles, including a novel technique producing diversity by using ANNs with different and slightly randomized link structures. The experimental results, comparing altogether 54 setups and two different ensemble sizes on 30 UCI data sets, show that all methods succeeded in producing implicit diversity, but that the effect on ensemble accuracy varied. Still, most setups evaluated did result in more accurate ensembles, compared to the baseline setup, especially for the larger ensemble size. As a matter of fact, several setups even obtained significantly higher ensemble accuracy than bagging. The analysis also identified that diversity was, relatively speaking, more important for the larger ensembles. Looking specifically at the methods used to increase the implicit diversity, setups using the technique that utilizes the randomized link structures generally produced the most accurate ensembles.

  • 23.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Löfström, Tuve
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Boström, Henrik
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden.
    Calibrating probability estimation trees using Venn-Abers predictors (2019). In: SIAM International Conference on Data Mining, SDM 2019, Society for Industrial and Applied Mathematics, 2019, p. 28-36. Conference paper (Refereed)
    Abstract [en]

    Class labels output by standard decision trees are not very useful for making informed decisions, e.g., when comparing the expected utility of various alternatives. In contrast, probability estimation trees (PETs) output class probability distributions rather than single class labels. It is well known that estimating class probabilities in PETs by relative frequencies often leads to extreme probability estimates, and a number of approaches to provide more well-calibrated estimates have been proposed. In this study, a recent model-agnostic calibration approach, called Venn-Abers predictors, is for the first time considered in the context of decision trees. Results from a large-scale empirical investigation are presented, comparing the novel approach to previous calibration techniques with respect to several different performance metrics, targeting both predictive performance and reliability of the estimates. All approaches are considered both with and without Laplace correction. The results show that using Venn-Abers predictors for calibration is a highly competitive approach, significantly outperforming Platt scaling, isotonic regression and no calibration, with respect to almost all performance metrics used, independently of whether Laplace correction is applied or not. The only exception is AUC, where using non-calibrated PETs together with Laplace correction actually is the best option; this can be explained by the fact that AUC is affected not by the absolute, but only the relative, values of the probability estimates.
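
    A minimal sketch of the inductive Venn-Abers idea: isotonic regression is fitted twice on the calibration scores, once with the test object postulated as each class, yielding a probability interval [p0, p1]; this is an illustrative simplification, not the authors' implementation.

        import numpy as np
        from sklearn.isotonic import IsotonicRegression

        def venn_abers(cal_scores, cal_labels, test_score):
            p = []
            for postulated in (0, 1):
                # Augment the calibration set with the test object under each label.
                s = np.append(cal_scores, test_score)
                l = np.append(cal_labels, postulated)
                iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip").fit(s, l)
                p.append(iso.predict([test_score])[0])
            return p[0], p[1]   # probability interval for P(class 1)

        rng = np.random.default_rng(0)
        cal_scores = rng.uniform(size=200)
        cal_labels = (rng.uniform(size=200) < cal_scores).astype(int)  # synthetic calibration
        print(venn_abers(cal_scores, cal_labels, 0.7))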

  • 24.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Boström, Henrik
    University of Borås, School of Business and IT.
    Overproduce-and-Select: The Grim Reality (2013). Conference paper (Refereed)
    Abstract [en]

    Overproduce-and-select (OPAS) is a frequently used paradigm for building ensembles. In static OPAS, a large number of base classifiers are trained, before a subset of the available models is selected to be combined into the final ensemble. In general, the selected classifiers are supposed to be accurate and diverse for the OPAS strategy to result in highly accurate ensembles, but exactly how this is enforced in the selection process is not obvious. Most often, either individual models or ensembles are evaluated, using some performance metric, on available and labeled data. Naturally, the underlying assumption is that an observed advantage for the models (or the resulting ensemble) will carry over to test data. In the experimental study, a typical static OPAS scenario, using a pool of artificial neural networks and a number of very natural and frequently used performance measures, is evaluated on 22 publicly available data sets. The discouraging result is that although a fairly large proportion of the ensembles obtained higher test set accuracies, compared to using the entire pool as the ensemble, none of the selection criteria could be used to identify these highly accurate ensembles. Despite only investigating a specific scenario, we argue that the settings used are typical for static OPAS, thus making the results general enough to question the entire paradigm.

  • 25.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Boström, Henrik
    University of Borås, School of Business and IT.
    Random Brains (2013). Conference paper (Refereed)
    Abstract [en]

    In this paper, we introduce and evaluate a novel method, called random brains, for producing neural network ensembles. The suggested method, which is heavily inspired by the random forest technique, produces diversity implicitly by using bootstrap training and randomized architectures. More specifically, for each base classifier (a multilayer perceptron), a number of randomly selected links between the input layer and the hidden layer are removed prior to training, thus resulting in potentially weaker but more diverse base classifiers. The experimental results on 20 UCI data sets show that random brains obtained significantly higher accuracy and AUC, compared to standard bagging of similar neural networks not utilizing randomized architectures. The analysis shows that the main reason for the increased ensemble performance is the ability to produce effective diversity, as indicated by the increase in the difficulty diversity measure.

  • 26.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Löfström, Tuve
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Linusson, Henrik
    University of Borås, Department of Information Technology, Borås, Sweden.
    Boström, Henrik
    The Royal Institute of Technology (KTH), School of Electrical Engineering and Computer Science, Stockholm, Sweden.
    Efficient Venn Predictors using Random Forests (2019). In: Machine Learning, ISSN 0885-6125, E-ISSN 1573-0565, Vol. 108, no 3, p. 535-550. Article in journal (Refereed)
    Abstract [en]

    Successful use of probabilistic classification requires well-calibrated probability estimates, i.e., the predicted class probabilities must correspond to the true probabilities. In addition, a probabilistic classifier must, of course, also be as accurate as possible. In this paper, Venn predictors, and their special case Venn-Abers predictors, are evaluated for probabilistic classification, using random forests as the underlying models. Venn predictors output multiple probabilities for each label, i.e., the predicted label is associated with a probability interval. Since all Venn predictors are valid in the long run, the size of the probability intervals is very important, with tighter intervals being more informative. The standard solution when calibrating a classifier is to employ an additional step, transforming the outputs from a classifier into probability estimates, using a labeled data set not employed for training of the models. For random forests, and other bagged ensembles, it is, however, possible to use the out-of-bag instances for calibration, making all training data available for both model learning and calibration. This procedure has previously been successfully applied to conformal prediction, but was here evaluated for the first time for Venn predictors. The empirical investigation, using 22 publicly available data sets, showed that all four versions of the Venn predictors were better calibrated than both the raw estimates from the random forest, and the standard techniques Platt scaling and isotonic regression. Regarding both informativeness and accuracy, the standard Venn predictor calibrated on out-of-bag instances was the best setup evaluated. Most importantly, calibrating on out-of-bag instances, instead of using a separate calibration set, resulted in tighter intervals and more accurate models on every data set, for both the Venn predictors and the Venn-Abers predictors.

  • 27.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Niklasson, Lars
    University of Borås, School of Business and IT.
    Empirically Investigating the Importance of Diversity (2007). Conference paper (Refereed)
  • 28.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Niklasson, Lars
    Evaluating Standard Techniques for Implicit Diversity (2008). In: Advances in Knowledge Discovery and Data Mining, Springer, 2008, p. 613-622. Conference paper (Refereed)
  • 29.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Niklasson, Lars
    The Importance of Diversity in Neural Network Ensembles: An Empirical Investigation (2007). Conference paper (Refereed)
    Abstract [en]

    When designing ensembles, it is almost an axiom that the base classifiers must be diverse in order for the ensemble to generalize well. Unfortunately, there is no clear definition of the key term diversity, leading to several diversity measures and many, more or less ad hoc, methods for diversity creation in ensembles. In addition, no specific diversity measure has been shown to have a high correlation with test set accuracy. The purpose of this paper is to empirically evaluate ten different diversity measures, using neural network ensembles and 11 publicly available data sets. The main result is that all diversity measures evaluated, in this study too, show low or very low correlation with test set accuracy. Having said that, two measures, double fault and difficulty, show slightly higher correlations compared to the other measures. The study furthermore shows that the correlation between accuracy measured on training or validation data and test set accuracy is also rather low. These results challenge ensemble design techniques where diversity is explicitly maximized or where ensemble accuracy on a hold-out set is used for optimization.

  • 30.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Löfström, Tuve
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Sundell, Håkan
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). Department of Information Technology, University of Borås, Sweden.
    Venn predictors using lazy learners (2018). In: Proceedings of the 2018 International Conference on Data Science, ICDATA'18 / [ed] R. Stahlbock, G. M. Weiss & M. Abou-Nasr, CSREA Press, 2018, p. 220-226. Conference paper (Refereed)
    Abstract [en]

    Probabilistic classification requires well-calibrated probability estimates, i.e., the predicted class probabilities must correspond to the true probabilities. Venn predictors, which can be used on top of any classifier, are automatically valid multiprobability predictors, making them extremely suitable for probabilistic classification. A Venn predictor outputs multiple probabilities for each label, so the predicted label is associated with a probability interval. While all Venn predictors are valid, their accuracy and the size of the probability interval are dependent on both the underlying model and some interior design choices. Specifically, all Venn predictors use so-called Venn taxonomies for dividing the instances into a number of categories, each such taxonomy defining a different Venn predictor. A frequently used but very basic taxonomy is to categorize the instances based on their predicted label. In this paper, we investigate some more fine-grained taxonomies that use not only the predicted label but also some measures related to the confidence in individual predictions. The empirical investigation, using 22 publicly available data sets and lazy learners (kNN) as the underlying models, showed that the probability estimates from the Venn predictors, as expected, were extremely well-calibrated. Most importantly, using the basic (i.e., label-based) taxonomy produced significantly more accurate and informative Venn predictors compared to the more complex alternatives. In addition, the results also showed that when using lazy learners as underlying models, a transductive approach significantly outperformed an inductive one with regard to accuracy and informativeness. This result is in contrast to previous studies, where other underlying models were used.
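
    A minimal sketch of a Venn predictor with the basic label-based taxonomy the abstract found best, using kNN as the underlying model; this is an inductive simplification of the transductive procedure, and all settings are illustrative.

        import numpy as np
        from sklearn.datasets import load_iris
        from sklearn.model_selection import train_test_split
        from sklearn.neighbors import KNeighborsClassifier

        X, y = load_iris(return_X_y=True)
        X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.5, random_state=0, stratify=y)
        X_cal, X_te, y_cal, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0, stratify=y_rest)

        knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
        classes = knn.classes_
        cal_cat = knn.predict(X_cal)   # taxonomy: category = predicted label

        def venn_multiprobability(x):
            # For each postulated label, place (x, label) in its category and read
            # off the category's label distribution; one probability row per postulate.
            cat = knn.predict(x.reshape(1, -1))[0]
            in_cat = cal_cat == cat
            rows = []
            for postulated in classes:
                labels = np.append(y_cal[in_cat], postulated)
                rows.append([np.mean(labels == c) for c in classes])
            return np.array(rows)

        P = venn_multiprobability(X_te[0])
        print("probability interval per class:", list(zip(P.min(axis=0), P.max(axis=0))))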

  • 31.
    Johansson, Ulf
    et al.
    University of Borås, School of Business and IT.
    Löfström, Tuve
    University of Borås, School of Business and IT.
    Sönströd, Cecilia
    University of Borås, School of Business and IT.
    Locally Induced Predictive Models (2011). Conference paper (Refereed)
    Abstract [en]

    Most predictive modeling techniques utilize all available data to build global models. This is despite the well-known fact that for many problems, the targeted relationship varies greatly over the input space, thus suggesting that localized models may improve predictive performance. In this paper, we suggest and evaluate a technique inducing one predictive model for each test instance, using only neighboring instances. In the experimentation, several different variations of the suggested algorithm producing localized decision trees and neural network models are evaluated on 30 UCI data sets. The main result is that the suggested approach generally yields better predictive performance than global models built using all available training data. As a matter of fact, all techniques producing J48 trees obtained significantly higher accuracy and AUC, compared to the global J48 model. For RBF network models, with their inherent ability to use localized information, the suggested approach was only successful with regard to accuracy, while global RBF models had a better ranking ability, as seen by their generally higher AUCs.

  • 32.
    Johansson, Ulf
    et al.
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Löfström, Tuwe
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Sundell, Håkan
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Linusson, Henrik
    Department of Information Technology, University of Borås, Sweden.
    Gidenstam, Anders
    Department of Information Technology, University of Borås, Sweden.
    Boström, Henrik
    School of Information and Communication Technology, Royal Institute of Technology, Sweden.
    Venn predictors for well-calibrated probability estimation trees2018In: Conformal and Probabilistic Prediction and Applications / [ed] A. Gammerman, V. Vovk, Z. Luo, E. Smirnov, & R. Peeters, 2018, p. 3-14Conference paper (Refereed)
    Abstract [en]

    Successful use of probabilistic classification requires well-calibrated probability estimates, i.e., the predicted class probabilities must correspond to the true probabilities. The standard solution is to employ an additional step, transforming the outputs from a classifier into probability estimates. In this paper, Venn predictors are compared to Platt scaling and isotonic regression, for the purpose of producing well-calibrated probabilistic predictions from decision trees. The empirical investigation, using 22 publicly available data sets, showed that the probability estimates from the Venn predictor were extremely well-calibrated. In fact, in a direct comparison using the accepted reliability metric, the Venn predictor estimates were the most exact on every data set.
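
    For reference, the two calibration baselines compared against are one-liners in scikit-learn; the data below is synthetic and the tree settings are illustrative:

    ```python
    from sklearn.calibration import CalibratedClassifierCV
    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=1000, random_state=0)
    tree = DecisionTreeClassifier(min_samples_leaf=10)
    # Platt scaling fits a sigmoid to the tree's scores; isotonic regression
    # fits a monotone step function. Both are fitted with internal CV here.
    platt = CalibratedClassifierCV(tree, method='sigmoid', cv=5).fit(X, y)
    isotonic = CalibratedClassifierCV(tree, method='isotonic', cv=5).fit(X, y)
    ```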

  • 33.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Sundström, Malin
    Högskolan i Borås, Akademin för textil, teknik och ekonomi.
    Sundell, Håkan
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    König, Rikard
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Balkow, Jenny
    Högskolan i Borås, Akademin för textil, teknik och ekonomi.
    Dataanalys för ökad kundförståelse [Data analysis for increased customer understanding]2016Report (Other (popular science, discussion, etc.))
  • 34.
    Johansson, Ulf
    et al.
    Department of Information Technology, University of Borås, Sweden.
    Sönströd, C.
    Department of Information Technology, University of Borås, Sweden.
    Linusson, H.
    Department of Information Technology, University of Borås, Sweden.
    Efficient conformal regressors using bagged neural nets2015In: Proceedings of the International Joint Conference on Neural Networks, IEEE, 2015Conference paper (Refereed)
    Abstract [en]

    Conformal predictors use machine learning models to output prediction sets. For regression, a prediction set is simply a prediction interval. All conformal predictors are valid, meaning that the error rate on novel data is bounded by a preset significance level. The key performance metric for conformal predictors is their efficiency, i.e., the size of the prediction sets. Inductive conformal predictors utilize real-valued functions, called nonconformity functions, and a calibration set, i.e., a set of labeled instances not used for the model training, to obtain the prediction regions. In state-of-the-art conformal regressors, the nonconformity functions are normalized, i.e., they include a component estimating the difficulty of each instance. In this study, conformal regressors are built on top of ensembles of bagged neural networks, and several nonconformity functions are evaluated. In addition, the option to calibrate on out-of-bag instances, instead of setting aside a calibration set, is investigated. The experiments, using 33 publicly available data sets, show that normalized nonconformity functions can produce smaller prediction sets, but the efficiency is highly dependent on the quality of the difficulty estimation. Specifically, in this study, the most efficient normalized nonconformity function estimated the difficulty of an instance by calculating the average error of neighboring instances. These results are consistent with previous studies using random forests as underlying models. Calibrating on out-of-bag instances, however, only led to more efficient conformal predictors on smaller data sets, which is in sharp contrast to the random forest study, where out-of-bag calibration was significantly better overall.
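
    A sketch of such a normalized inductive conformal regressor, where the difficulty of an instance is the average absolute calibration error of its k nearest neighbours; the ensemble configuration, k, the smoothing term beta, and a reasonably large calibration set are all assumptions:

    ```python
    import numpy as np
    from sklearn.ensemble import BaggingRegressor
    from sklearn.neighbors import NearestNeighbors
    from sklearn.neural_network import MLPRegressor

    def conformal_intervals(X_tr, y_tr, X_cal, y_cal, X_test,
                            significance=0.1, k=15, beta=0.5):
        model = BaggingRegressor(MLPRegressor(max_iter=1000),
                                 n_estimators=10).fit(X_tr, y_tr)
        cal_err = np.abs(y_cal - model.predict(X_cal))
        nn = NearestNeighbors(n_neighbors=k).fit(X_cal)

        def difficulty(X):
            # Average absolute error of the k nearest calibration neighbours
            # (for calibration instances this includes the instance itself,
            # a simplification).
            _, idx = nn.kneighbors(X)
            return cal_err[idx].mean(axis=1) + beta

        # Normalized nonconformity scores, sorted in descending order.
        alphas = np.sort(cal_err / difficulty(X_cal))[::-1]
        a = alphas[int(significance * (len(alphas) + 1)) - 1]
        y_hat = model.predict(X_test)
        half = a * difficulty(X_test)   # easy instances get narrower intervals
        return y_hat - half, y_hat + half
    ```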

  • 35.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Sönströd, Cecilia
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Boström, Henrik
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Chipper: A Novel Algorithm for Concept Description2008Conference paper (Refereed)
    Abstract [en]

    In this paper, several demands placed on concept description algorithms are identified and discussed. The most important criterion is the ability to produce compact rule sets that, in a natural and accurate way, describe the most important relationships in the underlying domain. An algorithm based on the identified criteria is presented and evaluated. The algorithm, named Chipper, produces decision lists, where each rule covers a maximum number of remaining instances while meeting requested accuracy requirements. In the experiments, Chipper is evaluated on nine UCI data sets. The main result is that Chipper produces compact and understandable rule sets, clearly fulfilling the overall goal of concept description. Chipper's accuracy is similar to that of standard decision tree and rule induction algorithms, while its rule sets have superior comprehensibility.
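
    The covering loop can be sketched as below, restricted to single-split rules for brevity; Chipper's actual rule representation and parameters may differ, and integer-encoded class labels are assumed:

    ```python
    import numpy as np

    def greedy_decision_list(X, y, min_acc=0.8, max_rules=10):
        """Repeatedly pick the single split that covers the most remaining
        instances while its majority class meets the accuracy requirement;
        prediction follows the first matching rule, then the default."""
        rules, remaining = [], np.ones(len(y), dtype=bool)
        while remaining.any() and len(rules) < max_rules:
            best = None  # (coverage, feature, threshold, side, label)
            Xr, yr = X[remaining], y[remaining]
            for f in range(X.shape[1]):
                for t in np.unique(Xr[:, f]):
                    for side in ('<=', '>'):
                        mask = Xr[:, f] <= t if side == '<=' else Xr[:, f] > t
                        if not mask.any():
                            continue
                        counts = np.bincount(yr[mask])
                        if counts.max() / mask.sum() >= min_acc and \
                                (best is None or mask.sum() > best[0]):
                            best = (mask.sum(), f, t, side, counts.argmax())
            if best is None:
                break
            _, f, t, side, label = best
            rules.append((f, t, side, label))
            covered = X[:, f] <= t if side == '<=' else X[:, f] > t
            remaining &= ~covered
        default = np.bincount(y[remaining] if remaining.any() else y).argmax()
        return rules, default
    ```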

  • 36.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Sönströd, Cecilia
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    König, Rikard
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Accurate and Interpretable Regression Trees using Oracle Coaching2014Conference paper (Refereed)
    Abstract [en]

    In many real-world scenarios, predictive models need to be interpretable, thus ruling out many machine learning techniques known to produce very accurate models, e.g., neural networks, support vector machines and all ensemble schemes. Most often, tree models or rule sets are used instead, typically resulting in significantly lower predictive performance. The overall purpose of oracle coaching is to reduce this accuracy vs. comprehensibility trade-off by producing interpretable models optimized for the specific production set at hand. The method requires production set inputs to be present when generating the predictive model, a demand fulfilled in most, but not all, predictive modeling scenarios. In oracle coaching, a highly accurate, but opaque, model is first induced from the training data. This model (“the oracle”) is then used to label both the training instances and the production instances. Finally, interpretable models are trained using different combinations of the resulting data sets. In this paper, oracle coaching produces regression trees, using neural networks and random forests as oracles. The experiments, using 32 publicly available data sets, show that oracle coaching leads to significantly improved predictive performance, compared to standard induction. In addition, it is shown that a highly accurate opaque model can be successfully used as a pre-processing step to reduce the noise typically present in data, even in situations where production inputs are not available. In fact, just augmenting or replacing training data with another copy of the training set, but with the predictions from the opaque model as targets, produced significantly more accurate and/or more compact regression trees.

  • 37.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Sönströd, Cecilia
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Linusson, Henrik
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Boström, Henrik
    Dept. of Computer and Systems Sciences Stockholm University, Sweden.
    Regression Trees for Streaming Data with Local Performance Guarantees2014Conference paper (Refereed)
    Abstract [en]

    Online predictive modeling of streaming data is a key task for big data analytics. In this paper, a novel approach for efficient online learning of regression trees is proposed, which continuously updates, rather than retrains, the tree as more labeled data become available. A conformal predictor outputs prediction sets instead of point predictions, which for regression translates into prediction intervals. The key property of a conformal predictor is that it is always valid, i.e., the error rate, on novel data, is bounded by a preset significance level. Here, we suggest applying Mondrian conformal prediction on top of the resulting models, in order to obtain regression trees where not only the tree, but also each and every rule, corresponding to a path from the root node to a leaf, is valid. Using Mondrian conformal prediction, it becomes possible to analyze and explore the different rules separately, knowing that their accuracy, in the long run, will not be below the preset significance level. An empirical investigation, using 17 publicly available data sets, confirms that the resulting rules are independently valid, but also shows that the prediction intervals are smaller, on average, than when only the global model is required to be valid. All in all, the suggested method provides a data miner or a decision maker with highly informative predictive models of streaming data.
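
    The per-rule guarantee can be illustrated with a batch sketch (the paper's online tree updating is omitted here): Mondrian conformal prediction simply calibrates the residuals separately within each leaf, so every rule gets its own interval width. It is assumed that every leaf contains calibration instances, e.g., by growing the tree with a minimum leaf size:

    ```python
    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def per_rule_intervals(tree, X_cal, y_cal, X_test, significance=0.1):
        """Mondrian conformal regression with one category per leaf (rule)."""
        cal_leaf = tree.apply(X_cal)                  # leaf id per instance
        cal_res = np.abs(y_cal - tree.predict(X_cal))
        y_hat = tree.predict(X_test)
        widths = np.empty(len(X_test))
        for i, leaf in enumerate(tree.apply(X_test)):
            res = np.sort(cal_res[cal_leaf == leaf])  # this rule's residuals
            j = int(np.ceil((1 - significance) * (len(res) + 1))) - 1
            widths[i] = res[min(j, len(res) - 1)]
        return y_hat - widths, y_hat + widths
    ```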

  • 38.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Sönströd, Cecilia
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    One Tree to Explain Them All2011Conference paper (Refereed)
    Abstract [en]

    Random forest is a frequently used ensemble technique, renowned for its high predictive performance. Random forest models are, however, inherently opaque due to their sheer complexity, making human interpretation and analysis impossible. This paper presents a method for approximating the random forest with just one decision tree. The approach uses oracle coaching, a recently suggested technique where a weaker but transparent model is generated using combinations of regular training data and test data initially labeled by a strong classifier, called the oracle. In this study, the random forest plays the part of the oracle, while the transparent models are decision trees generated either by the standard tree inducer J48 or by evolving genetic programs. Evaluation on 30 data sets from the UCI repository shows that oracle coaching significantly improves both accuracy and area under the ROC curve, compared to using training data only. As a matter of fact, the resulting single-tree models are as accurate as the random forest on the specific test instances. Most importantly, this is not achieved by inducing or evolving huge trees with perfect fidelity; a large majority of the trees are instead rather compact and clearly comprehensible. The experiments also show that evolution outperformed J48 with regard to accuracy, but that this came at the expense of slightly larger trees.

  • 39.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Sönströd, Cecilia
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Oracle Coached Decision Trees and Lists2010Conference paper (Refereed)
    Abstract [en]

    This paper introduces a novel method for obtaining increased predictive performance from transparent models in situations where production input vectors are available when building the model. First, labeled training data is used to build a powerful opaque model, called an oracle. Second, the oracle is applied to production instances, generating predicted target values, which are used as labels. Finally, these newly labeled instances are utilized, in different combinations with normal training data, when inducing a transparent model. Experimental results, on 26 UCI data sets, show that the use of oracle coaches significantly improves predictive performance, compared to standard model induction. Most importantly, both accuracy and AUC results are robust over all combinations of opaque and transparent models evaluated. This study thus implies that the straightforward procedure of using a coaching oracle, which can be used with arbitrary classifiers, yields significantly better predictive performance at a low computational cost.

  • 40.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Sönströd, Cecilia
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Boström, Henrik
    Obtaining accurate and comprehensible classifiers using oracle coaching2012In: Intelligent Data Analysis, ISSN 1088-467X, E-ISSN 1571-4128, Vol. 16, no. 2, p. 247-263Article in journal (Refereed)
    Abstract [en]

    While ensemble classifiers often reach high levels of predictive performance, the resulting models are opaque and hence do not allow direct interpretation. When employing methods that do generate transparent models, predictive performance typically has to be sacrificed. This paper presents a method of improving predictive performance of transparent models in the very common situation where instances to be classified, i.e., the production data, are known at the time of model building. This approach, named oracle coaching, employs a strong classifier, called an oracle, to guide the generation of a weaker, but transparent model. This is accomplished by using the oracle to predict class labels for the production data, and then applying the weaker method on this data, possibly in conjunction with the original training set. Evaluation on 30 data sets from the UCI repository shows that oracle coaching significantly improves predictive performance, measured by both accuracy and area under ROC curve, compared to using training data only. This result is shown to be robust for a variety of methods for generating the oracles and transparent models. More specifically, random forests and bagged radial basis function networks are used as oracles, while J48 and JRip are used for generating transparent models. The evaluation further shows that significantly better results are obtained when using the oracle-classified production data together with the original training data, instead of using only oracle data. An analysis of the fidelity of the transparent models to the oracles shows that performance gains can be expected from increasing oracle performance rather than from increasing fidelity. Finally, it is shown that further performance gains can be achieved by adjusting the relative weights of training data and oracle data.
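
    The core procedure fits in a few lines; here a random forest serves as the oracle and a CART tree stands in for J48/JRip, with synthetic stand-in data and the relative weighting of training and oracle data omitted:

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier

    # Synthetic stand-ins for the labelled training set and the unlabelled
    # production set.
    X, y = make_classification(n_samples=1000, random_state=0)
    X_train, y_train, X_prod = X[:600], y[:600], X[600:]

    # 1) Induce the opaque oracle from the training data.
    oracle = RandomForestClassifier(n_estimators=300).fit(X_train, y_train)
    # 2) Let the oracle predict class labels for the production data.
    y_oracle = oracle.predict(X_prod)
    # 3) Train the transparent model on training + oracle-labelled data.
    coached = DecisionTreeClassifier().fit(np.vstack([X_train, X_prod]),
                                           np.concatenate([y_train, y_oracle]))
    ```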

  • 41.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Sönströd, Cecilia
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    König, Rikard
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Using Genetic Programming to Obtain Implicit Diversity2009Conference paper (Refereed)
    Abstract [en]

    When performing predictive data mining, the use of ensembles is known to increase prediction accuracy, compared to single models. To obtain this higher accuracy, ensembles should be built from base classifiers that are both accurate and diverse. The question of how to balance these two properties in order to maximize ensemble accuracy is, however, far from solved and many different techniques for obtaining ensemble diversity exist. One such technique is bagging, where implicit diversity is introduced by training base classifiers on different subsets of available data instances, thus resulting in less accurate, but diverse base classifiers. In this paper, genetic programming is used as an alternative method to obtain implicit diversity in ensembles by evolving accurate, but different base classifiers in the form of decision trees, thus exploiting the inherent inconsistency of genetic programming. The experiments show that the GP approach outperforms standard bagging of decision trees, obtaining significantly higher ensemble accuracy over 25 UCI datasets. This superior performance stems from base classifiers having both higher average accuracy and more diversity. Implicitly introducing diversity using GP thus works very well, since evolved base classifiers tend to be highly accurate and diverse.

  • 42.
    Johansson, Ulf
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Sönströd, Cecilia
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Norinder, Ulf
    Boström, Henrik
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Using Feature Selection with Bagging and Rule Extraction in Drug Discovery2010Conference paper (Refereed)
    Abstract [en]

    This paper investigates different ways of combining feature selection with bagging and rule extraction in predictive modeling. Experiments on a large number of data sets from the medicinal chemistry domain, using standard algorithms implemented in the Weka data mining workbench, show that feature selection can lead to significantly improved predictive performance. When combining feature selection with bagging, employing the feature selection on each bootstrap obtains the best result. When using decision trees for rule extraction, the effect of feature selection can actually be detrimental, unless the transductive approach oracle coaching is also used. However, employing oracle coaching will lead to significantly improved performance, and the best results are obtained when performing feature selection before training the opaque model. The overall conclusion is that it can make a substantial difference for the predictive performance exactly how feature selection is used in conjunction with other techniques.
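
    In scikit-learn terms, "employing the feature selection on each bootstrap" amounts to making the selector part of the bagged base estimator, as sketched below; the selector and the number of retained features are illustrative choices:

    ```python
    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.pipeline import Pipeline
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, n_features=50, n_informative=8,
                               random_state=0)
    # Placing the selector inside the base estimator re-runs feature
    # selection on every bootstrap replicate, rather than once up front.
    base = Pipeline([('select', SelectKBest(f_classif, k=10)),
                     ('tree', DecisionTreeClassifier())])
    bagged = BaggingClassifier(base, n_estimators=25).fit(X, y)
    ```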

  • 43.
    König, R.
    et al.
    Department of Information Technology, University of Borås, Borås, Sweden.
    Johansson, Ulf
    Department of Information Technology, University of Borås, Borås, Sweden.
    Lindqvist, A.
    Brattberg, P.
    Department of Information Technology, University of Borås, Borås, Sweden.
    Interesting regression- and model trees through variable restrictions2015In: IC3K 2015 - Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, SciTePress, 2015, p. 281-292Conference paper (Refereed)
    Abstract [en]

    The overall purpose of this paper is to suggest a new technique for creating interesting regression and model trees. Interesting models are here defined as models that fulfill some domain-dependent restriction on how variables can be used in the models. The suggested technique, named ReReM, is an extension of M5 which can enforce variable constraints while creating regression and model trees. To evaluate ReReM, two case studies were conducted: the first concerned modeling of golf player skill, and the second modeling of fuel consumption in trucks. Both case studies had variable constraints, defined by domain experts, that should be fulfilled for models to be deemed interesting. When used for modeling golf player skill, ReReM created regression trees that were slightly less accurate than M5's regression trees. However, the models created with ReReM were deemed interesting by a golf teaching professional, while the M5 models were not. In the second case study, ReReM was evaluated against M5's model trees and a semi-automated approach often used in the automotive industry. Here, experiments showed that ReReM could achieve predictive performance comparable to M5 and clearly better than the semi-automated approach, while fulfilling the constraints regarding interesting models.

  • 44.
    König, Rikard
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Johansson, Ulf
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Rule Extraction using Genetic Programming for Accurate Sales Forecasting2014Conference paper (Refereed)
    Abstract [en]

    The purpose of this paper is to propose and evaluate a method for reducing the inherent tendency of genetic programming to overfit small and noisy data sets. In addition, the use of different optimization criteria for symbolic regression is demonstrated. The key idea is to reduce the risk of overfitting noise in the training data by introducing an intermediate predictive model in the process. More specifically, instead of directly evolving a genetic regression model based on labeled training data, the first step is to generate a highly accurate ensemble model. Since ensembles are very robust, the resulting predictions will contain less noise than the original data set. In the second step, an interpretable model is evolved, using the ensemble predictions, instead of the true labels, as the target variable. Experiments on 175 sales forecasting data sets, from one of Sweden’s largest wholesale companies, show that the proposed technique obtained significantly better predictive performance, compared to both straightforward use of genetic programming and the standard M5P technique. Naturally, the level of improvement depends critically on the performance of the intermediate ensemble.
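
    The two-step idea in miniature, with synthetic noisy data and a small regression tree standing in for the evolved GP model:

    ```python
    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.tree import DecisionTreeRegressor

    X, y = make_regression(n_samples=500, noise=25.0, random_state=0)
    # Step 1: a robust ensemble smooths the noisy targets.
    ensemble = RandomForestRegressor(n_estimators=300).fit(X, y)
    # Step 2: the interpretable model is fitted to the ensemble's predictions
    # rather than the raw, noisy labels.
    interpretable = DecisionTreeRegressor(max_depth=4).fit(X, ensemble.predict(X))
    ```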

  • 45.
    König, Rikard
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Johansson, Ulf
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Niklasson, Lars
    Improving GP Classification Performance by Injection of Decision Trees2010Conference paper (Refereed)
    Abstract [en]

    This paper presents a novel hybrid method combining genetic programming and decision tree learning. The method starts by estimating a benchmark level of reasonable accuracy, based on decision tree performance on bootstrap samples of the training set. Next, a normal GP evolution is started with the aim of producing an accurate GP. At regular intervals, the best GP in the population is evaluated against the accuracy benchmark. If the GP has higher accuracy than the benchmark, the evolution continues normally until the maximum number of generations is reached. If the accuracy is lower than the benchmark, two things happen. First, the fitness function is modified to allow larger GPs, able to represent more complex models. Second, a decision tree with increased size, trained on a bootstrap of the training data, is injected into the population. The experiments show that the hybrid solution of injecting decision trees into a GP population gives synergistic effects, producing results that are better than using either technique separately. The results, from 18 UCI data sets, show that the proposed method clearly outperforms normal GP and is significantly better than the standard decision tree algorithm.

  • 46.
    König, Rikard
    et al.
    University of Borås, Borås, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL). University of Borås, Borås, Sweden.
    Riveiro, Maria
    University of Skövde, Skövde, Sweden.
    Brattberg, Peter
    University of Borås, Borås, Sweden.
    Modeling golf player skill using machine learning2017In: Machine Learning and Knowledge Extraction, Springer, 2017, p. 275-294Conference paper (Refereed)
    Abstract [en]

    In this study we apply machine learning techniques to modeling golf player skill, using a dataset consisting of 277 golfers. The dataset includes 28 quantitative metrics, related to the club head at impact and ball flight, captured using a Doppler radar. For the modeling, cost-sensitive decision trees and random forests are used to discern between less skilled players and very good ones, i.e., Hackers and Pros. The results show that both random forests and decision trees achieve high predictive accuracy with regard to true positive rate, accuracy and area under the ROC curve. A detailed interpretation of the decision trees shows that they concur with modern swing theory, e.g., consistency is very important, while face angle, club path and dynamic loft are the most important evaluated swing factors when discerning between Hackers and Pros. Most of the Hackers could be identified by a rather large deviation in one of these values compared to the Pros. Hackers who had less variation in these aspects of the swing could instead be identified by a steeper swing plane and a lower club speed. The importance of the swing plane is an interesting finding, since it was not expected and is not easy to explain. © 2017, IFIP International Federation for Information Processing.
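
    Cost sensitivity of this kind can be emulated with class weights; the 1:5 cost ratio below is purely illustrative, not taken from the paper, and the data is a synthetic stand-in shaped like the described dataset (277 golfers, 28 metrics):

    ```python
    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier

    # Class 0 = Hacker, class 1 = Pro; misclassifying a Pro is made five
    # times as costly as misclassifying a Hacker (illustrative ratio).
    X, y = make_classification(n_samples=277, n_features=28, random_state=0)
    tree = DecisionTreeClassifier(class_weight={0: 1, 1: 5},
                                  min_samples_leaf=10).fit(X, y)
    ```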

  • 47.
    Linusson, Henrik
    et al.
    Department of Information Technology, University of Borås, Borås, Sweden.
    Johansson, Ulf
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Boström, Henrik
    School of Electrical Engineering and Computer Science, Royal Institute of Technology, Kista, Sweden.
    Löfström, Tuve
    Jönköping University, School of Engineering, JTH, Computer Science and Informatics, JTH, Jönköping AI Lab (JAIL).
    Classification with reject option using conformal prediction2018In: Advances in Knowledge Discovery and Data Mining: 22nd Pacific-Asia Conference, PAKDD 2018, Melbourne, VIC, Australia, June 3-6, 2018, Proceedings, Part I, Springer, 2018, p. 94-105Conference paper (Refereed)
    Abstract [en]

    In this paper, we propose a practically useful means of interpreting the predictions produced by a conformal classifier. The proposed interpretation leads to a classifier with a reject option that allows the user to limit the number of erroneous predictions made on the test set, without any need to reveal the true labels of the test objects. The method described in this paper works by estimating the cumulative error count on a set of predictions provided by a conformal classifier, ordered by their confidence. Given a test set and a user-specified parameter k, the proposed classification procedure outputs the largest possible number of predictions containing on average at most k errors, while refusing to make predictions for test objects where it is too uncertain. We conduct an empirical evaluation using benchmark datasets, and show that we are able to provide accurate estimates of the error rate on the test set.
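
    One plausible reading of the procedure (the paper's exact estimator is not reproduced here): force a singleton prediction for each test object, estimate its error probability as the p-value of the runner-up label, and keep the longest confidence-ordered prefix whose estimated cumulative error count stays at most k:

    ```python
    import numpy as np

    def predict_with_reject(p_values, k):
        """p_values: (n_test, n_labels) array from a conformal classifier.
        Returns the forced predictions and the indices of accepted objects."""
        pred = p_values.argmax(axis=1)              # forced singleton labels
        err_est = np.sort(p_values, axis=1)[:, -2]  # runner-up p-value
        order = np.argsort(err_est)                 # most confident first
        n_keep = int(np.searchsorted(np.cumsum(err_est[order]), k, side='right'))
        return pred, order[:n_keep]                 # the rest are rejected
    ```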

  • 48.
    Linusson, Henrik
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Johansson, Ulf
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Boström, Henrik
    Dept. of Computer and Systems Sciences Stockholm University, Kista, Sweden.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Efficiency Comparison of Unstable Transductive and Inductive Conformal Classifiers2014Conference paper (Refereed)
    Abstract [en]

    In the conformal prediction literature, it appears axiomatic that transductive conformal classifiers possess a higher predictive efficiency than inductive conformal classifiers; however, this depends on whether or not the nonconformity function tends to overfit misclassified test examples. With the conformal prediction framework's increasing popularity, it thus becomes necessary to clarify the settings in which this claim holds true. In this paper, the efficiency of transductive conformal classifiers based on decision tree, random forest and support vector machine classification models is compared to the efficiency of the corresponding inductive conformal classifiers. The results show that the efficiency of conformal classifiers based on standard decision trees or random forests is substantially improved when used in the inductive mode, while conformal classifiers based on support vector machines are more efficient in the transductive mode. In addition, an analysis is presented that discusses the effects of calibration set size on inductive conformal classifier efficiency.

  • 49.
    Linusson, Henrik
    et al.
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Johansson, Ulf
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Boström, Henrik
    Dept. of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Löfström, Tuve
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Reliable Confidence Predictions Using Conformal Prediction2016In: Lecture Notes in Computer Science, 2016, p. 77-88Conference paper (Refereed)
    Abstract [en]

    Conformal classifiers output confidence prediction regions, i.e., multi-valued predictions that are guaranteed to contain the true output value of each test pattern with some predefined probability. In order to fully utilize the predictions provided by a conformal classifier, it is essential that those predictions are reliable, i.e., that a user is able to assess the quality of the predictions made. Although conformal classifiers are statistically valid by default, the error probability of the output prediction regions depends on their size in such a way that smaller, and thus potentially more interesting, predictions are more likely to be incorrect. This paper proposes, and evaluates, a method for producing refined error probability estimates of prediction regions that takes their size into account. The end result is a binary conformal confidence predictor that is able to provide accurate error probability estimates for those prediction regions containing only a single class label.

  • 50.
    Linusson, Henrik
    et al.
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Johansson, Ulf
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Löfström, Tuve
    Högskolan i Borås, Institutionen Handels- och IT-högskolan.
    Signed-Error Conformal Regression2014In: Advances in Knowledge Discovery and Data Mining 18th Pacific-Asia Conference, PAKDD 2014 Tainan, Taiwan, May 13-16, 2014 Proceedings, Part I, Springer, 2014, p. 224-236Conference paper (Refereed)
    Abstract [en]

    This paper suggests a modification of the Conformal Prediction framework for regression that will strengthen the associated guarantee of validity. We motivate the need for this modification and argue that our conformal regressors are more closely tied to the actual error distribution of the underlying model, thus allowing for more natural interpretations of the prediction intervals. In the experiments, we provide an empirical comparison of our conformal regressors to traditional conformal regressors and show that the proposed modification results in more robust two-tailed predictions, and more efficient one-tailed predictions.
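
    The modification in miniature: calibrating signed residuals instead of absolute ones makes the two interval bounds independent tail quantiles of the error distribution, so intervals may be asymmetric or one-tailed. The indexing below assumes a calibration set large enough for both indices to be valid:

    ```python
    import numpy as np

    def signed_error_interval(y_hat_cal, y_cal, y_hat_test, significance=0.1):
        """Two-tailed interval from the signed calibration residuals, with
        half the significance spent on each tail."""
        res = np.sort(y_cal - y_hat_cal)            # signed residuals
        n = len(res)
        lo = res[int(np.floor(significance / 2 * (n + 1))) - 1]
        hi = res[int(np.ceil((1 - significance / 2) * (n + 1))) - 1]
        return y_hat_test + lo, y_hat_test + hi
    ```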
