Utvärdering av olika språkmodeller för identifiering av information i svenska texter
Jönköping University, School of Engineering, JTH, Department of Computer Science and Informatics.
2021 (Swedish)
Independent thesis, Basic level (degree of Bachelor), 180 HE credits
Student thesis
Alternative title
Evaluation of different language models for identifying entities in Swedish texts (English)
Abstract [en]

The purpose of the study was to investigate how different types of Natural Language Processing (NLP) models can be fine-tuned for Named Entity Recognition (NER) and how these models differ in their operation, with the aim of identifying which type of model performs best at entity extraction. The models were fine-tuned and tested using the PyTorch and Hugging Face Transformers libraries. The study also examined which techniques not based on machine learning can be used to solve the entity-extraction problem. The tests showed that the KB/BERT-base-swedish-cased model performed best. Other techniques can also be applied to entity extraction, but machine learning proved the most effective. Although KB/BERT-base-swedish-cased was the best model overall, the results indicate that the choice of model may depend on the purpose for which it is used. The study's limitations were that no large dataset was available for training and that there was insufficient time to produce a new one.
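To illustrate the kind of post-processing an NER system like the one described performs, the sketch below shows how per-token BIO tags (the output format of a fine-tuned token-classification model) are grouped into entity spans. This is a minimal illustrative example; the tag scheme, function name, and sample sentence are assumptions, not taken from the thesis.

```python
# Illustrative sketch: decoding BIO-tagged NER output into entity spans.
# The tag labels (PER, LOC) and the example sentence are assumptions.

def decode_bio(tokens, tags):
    """Group tokens tagged with the BIO scheme into (entity_type, text) spans."""
    entities, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):            # beginning of a new entity
            if current:
                entities.append(current)
            current = (tag[2:], [token])
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(token)        # continuation of the current entity
        else:                               # "O" (or inconsistent tag) ends the span
            if current:
                entities.append(current)
            current = None
    if current:
        entities.append(current)
    return [(etype, " ".join(words)) for etype, words in entities]

# Tags such as a fine-tuned Swedish BERT model might emit for this sentence:
tokens = ["Kalle", "bor", "i", "Stockholm"]
tags = ["B-PER", "O", "O", "B-LOC"]
print(decode_bio(tokens, tags))  # → [('PER', 'Kalle'), ('LOC', 'Stockholm')]
```

In a full pipeline, the tags would come from the fine-tuned model's per-token predictions rather than being hard-coded as here.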

Place, publisher, year, edition, pages
2021, p. 45
Keywords [sv]
Maskininlärning, Named Entity Recognition, Natural Language Processing, Transformers, BERT, ALBERT, ELECTRA, HFST-SweNER
National Category
Computer Engineering; Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:hj:diva-55101
ISRN: JU-JTH-DTA-1-20210159
OAI: oai:DiVA.org:hj-55101
DiVA id: diva2:1612215
External cooperation
Cybercom Group AB
Subject / course
JTH, Computer Engineering
Supervisors
Examiners
Available from: 2021-11-18 Created: 2021-11-17 Last updated: 2021-11-18
Bibliographically approved

Open Access in DiVA

No full text in DiVA

By organisation
JTH, Department of Computer Science and Informatics
Computer Engineering; Language Technology (Computational Linguistics)
