Handling small calibration sets in Mondrian inductive conformal regressors
2015 (English). In: Statistical Learning and Data Sciences, Springer, 2015, p. 271-280. Conference paper, Published paper (Refereed)
Abstract [en]
In inductive conformal prediction, calibration sets must contain an adequate number of instances to support the chosen confidence level. This problem is particularly prevalent when using Mondrian inductive conformal prediction, where the input space is partitioned into independently valid prediction regions. In this study, Mondrian conformal regressors, in the form of regression trees, are used to investigate two problematic aspects of small calibration sets. If there are too few calibration instances to support the significance level, we suggest using either extrapolation or altering the model. In situations where the desired significance level is between two calibration instances, the standard procedure is to choose the more nonconforming one, thus guaranteeing validity, but producing conservative conformal predictors. The suggested solution is to use interpolation between calibration instances. All proposed techniques are empirically evaluated and compared to the standard approach on 30 benchmark data sets. The results show that while extrapolation often results in invalid models, interpolation works extremely well and provides increased efficiency with preserved empirical validity.
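The difference between the standard choice and interpolation can be illustrated with a minimal Python sketch. The abstract does not state the interpolation formula, so linear interpolation between the two neighbouring calibration scores is an assumption here, as are the function names and the use of absolute residuals as nonconformity scores. In a Mondrian setting, such a function would presumably be applied separately to the calibration scores of each category (for example, each leaf of a regression tree).

```python
import math


def standard_half_width(scores, eps):
    """Conservative choice: the floor(eps * (n + 1))-th largest score.

    scores: calibration nonconformity scores (assumed: absolute residuals).
    eps: significance level, e.g. 0.25 for 75% confidence.
    """
    ranked = sorted(scores, reverse=True)  # most nonconforming first
    k = math.floor(eps * (len(ranked) + 1))
    if k < 1:
        # Too few calibration instances to support this significance level.
        raise ValueError("calibration set too small for eps")
    return ranked[k - 1]


def interpolated_half_width(scores, eps):
    """Assumed variant: linear interpolation between neighbouring scores."""
    ranked = sorted(scores, reverse=True)
    n = len(ranked)
    pos = eps * (n + 1)  # fractional rank among the sorted scores
    k = math.floor(pos)
    if k < 1:
        raise ValueError("calibration set too small for eps")
    if k >= n:
        return ranked[n - 1]  # nothing less nonconforming to interpolate to
    frac = pos - k
    # Move from the k-th largest score towards the (k + 1)-th largest.
    return ranked[k - 1] + frac * (ranked[k] - ranked[k - 1])
```

With ten calibration scores 1, 2, ..., 10 and a significance level of 0.25, the fractional rank is 2.75. The standard procedure returns the second-largest score (9), while the interpolated variant returns 8.25, giving a tighter interval. With only two calibration scores, both functions raise an error, which corresponds to the case where the abstract suggests extrapolation or altering the model.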
Place, publisher, year, edition, pages
Springer, 2015. p. 271-280
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 9047
Keywords [en]
Extrapolation, Forecasting, Interpolation, Benchmark data, Confidence levels, Conformal predictions, Conformal predictors, Input space, Regression trees, Significance levels, Standard procedures, Calibration
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:hj:diva-38120
DOI: 10.1007/978-3-319-17091-6_22
Scopus ID: 2-s2.0-84949798529
Local ID: 0;0;miljJAIL
ISBN: 9783319170909 (print)
OAI: oai:DiVA.org:hj-38120
DiVA, id: diva2:1163928
Conference
3rd International Symposium on Statistical Learning and Data Sciences, SLDS 2015; Egham; United Kingdom; 20 April 2015 through 23 April 2015
2017-12-08, 2017-12-08, 2019-08-23. Bibliographically approved