Using machine learning to select variables in data envelopment analysis: Simulations and application using electricity distribution data
Jönköping University, Jönköping International Business School, JIBS, Statistics.
Lund University, Lund, Sweden.
Jönköping University, Jönköping International Business School, JIBS, Statistics. Jönköping University, Jönköping International Business School, JIBS, Centre for Entrepreneurship and Spatial Economics (CEnSE). ORCID iD: 0000-0002-4535-3630
Jönköping University, Jönköping International Business School, JIBS, Statistics. ORCID iD: 0000-0003-3144-2218
2023 (English) In: Energy Economics, ISSN 0140-9883, E-ISSN 1873-6181, Vol. 120, article id 106621
Article in journal (Refereed) Published
Abstract [en]

Agencies that regulate electricity providers often apply nonparametric data envelopment analysis (DEA) to assess the relative efficiency of each firm. The reliability and validity of DEA are contingent upon selecting relevant input variables. In the era of big (wide) data, the assumptions of traditional variable selection techniques are often violated due to challenges related to high-dimensional data and their standard empirical properties. Currently, regulators have access to a large number of potential input variables. Therefore, our aim is to introduce new machine learning methods for regulators of the energy market. We also propose a new two-step analytical approach where, in the first step, the machine learning-based adaptive least absolute shrinkage and selection operator (ALASSO) is used to select variables and, in the second step, selected variables are used in a DEA model. In contrast to previous research, we find, by using a more realistic data-generating process common for production functions (i.e., Cobb–Douglas and Translog), that the performance of different machine learning techniques differs substantially in different empirically relevant situations. Simulations also reveal that the ALASSO is superior to other machine learning and regression-based methods when the collinearity is low or moderate. However, in situations of multicollinearity, the LASSO approach exhibits the best performance. We also use real data from the Swedish electricity distribution market to illustrate the empirical relevance of selecting the most appropriate variable selection method.
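The first step of the two-step approach described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the simulated Cobb–Douglas data, the coefficient values, the penalty `lam`, the exponent `gamma`, and the helper names `lasso_cd` and `adaptive_lasso` are all assumptions made for the example. The adaptive LASSO is written here as a weighted LASSO solved by coordinate descent, with weights taken from an initial OLS fit. The second step (running DEA on the selected inputs) needs a linear-programming solver and is not shown.

```python
import numpy as np


def lasso_cd(X, y, lam, weights=None, n_iter=500, tol=1e-8):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * sum_j weights[j] * |b_j|."""
    n, p = X.shape
    w = np.ones(p) if weights is None else np.asarray(weights, dtype=float)
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()  # residual y - Xb, with b = 0 initially
    for _ in range(n_iter):
        max_step = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the partial residual (b_j excluded).
            rho = X[:, j] @ resid / n + col_sq[j] * b[j]
            # Soft-thresholding update with a coefficient-specific penalty.
            new = np.sign(rho) * max(abs(rho) - lam * w[j], 0.0) / col_sq[j]
            if new != b[j]:
                resid -= X[:, j] * (new - b[j])
                max_step = max(max_step, abs(new - b[j]))
                b[j] = new
        if max_step < tol:
            break
    return b


def adaptive_lasso(X, y, lam, gamma=1.0):
    """Weighted LASSO with weights 1 / |OLS coefficient|^gamma (assumed initial estimator)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    b_init, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    weights = 1.0 / (np.abs(b_init) ** gamma + 1e-12)  # small constant avoids division by zero
    return lasso_cd(Xc, yc, lam, weights)


# Hypothetical simulated data: Cobb-Douglas in logs, 8 candidate inputs,
# of which only the first 3 actually enter the production function.
rng = np.random.default_rng(0)
n, p = 200, 8
log_x = rng.normal(size=(n, p))
log_y = log_x[:, :3] @ np.array([0.4, 0.3, 0.2]) + 0.1 * rng.normal(size=n)

beta = adaptive_lasso(log_x, log_y, lam=0.01)
selected = np.flatnonzero(beta).tolist()  # indices of inputs kept by the ALASSO
print("selected inputs:", selected)
```

Here `selected` holds the column indices of the retained candidate inputs; in the two-step approach these columns would then enter the DEA model.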

Place, publisher, year, edition, pages
Elsevier, 2023. Vol. 120, article id 106621
Keywords [en]
Clustering algorithms, Commerce, Data envelopment analysis, Electric utilities, Regression analysis, Curse of dimensionality, Electricity distribution, Input variables, Least absolute shrinkage and selection operators, Machine-learning, Nonparametrics, Performance, Regulation, Relative efficiency, Variables selections, Machine learning, Variable selection
National Category
Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:hj:diva-60028
DOI: 10.1016/j.eneco.2023.106621
ISI: 000972681000001
Scopus ID: 2-s2.0-85150299788
Local ID: HOA;intsam;868713
OAI: oai:DiVA.org:hj-60028
DiVA, id: diva2:1746114
Available from: 2023-03-27 Created: 2023-03-27 Last updated: 2023-05-29 Bibliographically approved

Authority records

Duras, Toni; Månsson, Kristofer; Sjölander, Pär
