The Problem with Ranking Ensembles Based on Training or Validation Performance
University of Borås, School of Business and IT. ORCID iD: 0000-0003-0274-9026
University of Borås, School of Business and IT. ORCID iD: 0000-0003-0412-6199
2008 (English). In: Proceedings of the International Joint Conference on Neural Networks, IEEE, 2008. Conference paper, Published paper (Refereed)
Abstract [en]

The main purpose of this study was to determine whether results on training or validation data can be used to estimate ensemble performance on novel data. With the specific setup evaluated, i.e., ensembles built from a pool of independently trained neural networks and targeting diversity only implicitly, the answer is a resounding no. Experimentation using 13 UCI datasets shows that there is, in general, nothing to gain in performance on novel data by choosing an ensemble based on any of the training measures evaluated here. This holds even though the measures evaluated include all of the most frequently used ones: ensemble training and validation accuracy, base classifier training and validation accuracy, ensemble training and validation AUC, and two diversity measures. The main reason is that all ensembles tend to have quite similar performance unless the accuracy of the base classifiers is deliberately lowered. The key consequence is, of course, that a data miner can do no better than picking an ensemble at random. In addition, the results indicate that it is futile to look for an algorithm aimed at optimizing ensemble performance by somehow selecting a subset of available base classifiers.
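
The kind of check described in the abstract can be sketched as follows: rank candidate ensembles by validation accuracy, then see whether that ranking agrees with their accuracy on held-out data. This is only an illustrative sketch, not the authors' experimental code. The function names, the binary majority-vote rule, and the synthetic pool (base classifiers with equal accuracy and independent errors, so validation scores are uninformative by construction) are all assumptions made here for illustration; the paper itself used neural networks on 13 UCI datasets.

```python
import random
from statistics import mean


def majority_vote_accuracy(correct, members):
    """Share of instances on which a strict majority of `members` is correct.

    correct[m][i] is 1 if base classifier m is right on instance i (binary task).
    """
    n = len(correct[0])
    hits = sum(1 for i in range(n)
               if 2 * sum(correct[m][i] for m in members) > len(members))
    return hits / n


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for j in range(pos, end + 1):
            result[order[j]] = (pos + end) / 2 + 1
        pos = end + 1
    return result


def spearman(x, y):
    """Spearman rank correlation (Pearson correlation of the ranks)."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return 0.0 if vx == 0 or vy == 0 else cov / (vx * vy) ** 0.5


def rank_agreement(val_correct, test_correct, ensembles):
    """Compare a validation-based ranking of ensembles with test accuracy."""
    val_acc = [majority_vote_accuracy(val_correct, e) for e in ensembles]
    test_acc = [majority_vote_accuracy(test_correct, e) for e in ensembles]
    picked = max(range(len(ensembles)), key=lambda i: val_acc[i])
    return {
        "spearman": spearman(val_acc, test_acc),
        "picked_test_acc": test_acc[picked],  # ensemble chosen by validation
        "mean_test_acc": mean(test_acc),      # expected result of a random pick
    }


def synthetic_pool(pool_size=30, ensemble_size=11, n_ensembles=100,
                   n_val=200, n_test=500, base_acc=0.8, seed=0):
    """Hypothetical data: equally accurate base classifiers, independent errors."""
    rng = random.Random(seed)

    def draw(n):
        return [[1 if rng.random() < base_acc else 0 for _ in range(n)]
                for _ in range(pool_size)]

    ensembles = [rng.sample(range(pool_size), ensemble_size)
                 for _ in range(n_ensembles)]
    return draw(n_val), draw(n_test), ensembles


if __name__ == "__main__":
    val, test, ensembles = synthetic_pool()
    print(rank_agreement(val, test, ensembles))
```

On real data, `val_correct` and `test_correct` would come from the trained base classifiers' predictions. A rank correlation near zero, together with a `picked_test_acc` close to `mean_test_acc`, would be consistent with the abstract's claim that selecting by validation performance does no better than picking at random.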

Place, publisher, year, edition, pages
IEEE, 2008.
Keywords [en]
ensembles, diversity, Computer Science
Keywords [sv]
data mining
National subject category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:hj:diva-45816
DOI: 10.1109/IJCNN.2008.4634255
Local ID: 2320/3973
ISBN: 978-1-4244-1821-3 (print)
OAI: oai:DiVA.org:hj-45816
DiVA, id: diva2:1348930
Conference
IJCNN 2008, Hong Kong, June 1-6, 2008
Note

Sponsorship:

This work was supported by the Information Fusion Research Program (University of Skövde, Sweden) in partnership with the Swedish Knowledge Foundation under grant 2003/0104.

Available from: 2019-09-06. Created: 2019-09-06. Last updated: 2019-09-06. Bibliographically approved.

Open Access in DiVA

Full text (125 kB), 41 downloads
File information
File name: FULLTEXT01.pdf
File size: 125 kB
Checksum (SHA-512):
91e449579abf37bcaf25916e7a0547ddd876e6117822d3f1aa8af03a763639e5c3f8f7410d0cdef1d466fb9c13949bd744265b44c221d67e698fe87ee0e790a1
Type: fulltext
MIME type: application/pdf

Authority records

Löfström, Tuve; Johansson, Ulf
