Textual content, cited references, similarity order, and clustering: an experimental study in the context of science mapping
Department of e-Resources, University Library, Stockholm University.
Jönköping University, The University Library.
2009 (English). In: Proceedings of the 12th International Conference on Scientometrics and Informetrics, 2009, p. 862-873. Conference paper, Published paper (Refereed)
Abstract [en]

This paper deals with document-document similarity approaches, the issue of similarity order, and clustering methods, in the context of science mapping. Using two data sets of bibliographic records, associated with the fields of information retrieval and scientometrics, we investigate how well two document-document similarity approaches, a text-based approach and bibliographic coupling, agree with ground truth classifications (obtained by subject experts), under first-order and second-order similarities, and under four different clustering methods. The clustering methods are average linkage, complete linkage, Ward’s method and consensus clustering. The performance of first-order and second-order similarities is compared within the two document-document similarity approaches, and under each clustering method. We also compare the performance of the clustering methods. The results show that the text-based approach consistently outperformed bibliographic coupling with regard to the information retrieval data set, but performed consistently worse than the latter approach regarding the scientometrics data set. For the similarity order issue, second-order similarities performed better than first-order in 12 out of 16 cases. Average linkage had the best overall performance among the clustering methods, followed by consensus clustering. The main conclusion of the study is that second-order similarities seem to be a better choice than first-order in the science mapping context.
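
The abstract does not define the two similarity orders, so the following is only a minimal sketch under a common reading, not the authors' procedure. It assumes first-order similarity is the cosine between two documents' cited-reference vectors (bibliographic coupling), and second-order similarity is the cosine between two documents' rows of the first-order similarity matrix. The toy matrix `cited` and the helper names `cosine` and `similarity_matrix` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_u * norm_v)

def similarity_matrix(rows):
    """Pairwise cosine similarities between the given row vectors."""
    return [[cosine(u, v) for v in rows] for u in rows]

# Hypothetical toy data: rows are documents, columns are cited references,
# and 1 means "this document cites this reference".
cited = [
    [1, 1, 0, 0],  # document 0
    [0, 1, 1, 0],  # document 1
    [0, 0, 1, 1],  # document 2
]

# First order (assumed): compare documents by the references they share.
first_order = similarity_matrix(cited)

# Second order (assumed): compare documents by their first-order
# similarity profiles, i.e. by how similar they are to all other documents.
second_order = similarity_matrix(first_order)
```

In this toy data, documents 0 and 2 share no cited references, so their first-order similarity is 0. Both are coupled to document 1, however, so their second-order similarity is positive (about 0.2). Either matrix could then be converted to distances and passed to a clustering method such as average linkage.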

Place, publisher, year, edition, pages
2009. p. 862-873
Keywords [en]
Bibliometrics, Citation data, Text mining, Similarity order, Consensus clustering
National Category
Sociology
Identifiers
URN: urn:nbn:se:hj:diva-9771
OAI: oai:DiVA.org:hj-9771
DiVA, id: diva2:228971
Available from: 2009-08-10 Created: 2009-08-10 Last updated: 2011-02-02 (Bibliographically approved)

Open Access in DiVA

No full text in DiVA