Representative image selection for data efficient word spotting
2020 (English). In: Lecture Notes in Computer Science: Document Analysis Systems / [ed] X. Bai, D. Karatzas, D. Lopresti, Springer, 2020, Vol. 12116, p. 383-397. Conference paper, Published paper (Refereed)
Abstract [en]
This paper compares three word image representations as a basis for label-free sample selection for word spotting in historical handwritten documents: a temporal pyramid representation based on pixel counts, a graph-based representation, and a pyramidal histogram of characters (PHOC) representation predicted by a PHOCNet trained on synthetic data. We show that the PHOC representation can reduce the number of required training samples by up to 69%, depending on the dataset, if it is learned iteratively in an active-learning-like fashion. While this works for larger datasets containing about 1,700 images, for smaller datasets with 100 images the temporal pyramid and graph representations perform better.
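For readers unfamiliar with the PHOC representation named in the abstract, the following is a minimal sketch of how a PHOC vector is commonly built from a word's transcription. It is not taken from the paper: the function name `phoc`, the default alphabet, the pyramid levels, and the 50% overlap rule are assumptions chosen for illustration, and the bigram part used in some PHOC variants is omitted. A PHOCNet predicts a vector of this kind from a word image, which is not shown here.

```python
import string


def phoc(word, alphabet=string.ascii_lowercase + string.digits, levels=(1, 2, 3)):
    """Build a binary pyramidal histogram of characters (illustrative sketch).

    Assumptions (not from the source record): the alphabet, the pyramid
    levels, and the rule that a character belongs to a region when at
    least half of the character's span overlaps that region.
    """
    word = word.lower()
    n = len(word)
    index = {c: i for i, c in enumerate(alphabet)}
    vector = []
    for level in levels:
        for region in range(level):
            hist = [0] * len(alphabet)
            # Work in integer units of 1 / (n * level) to avoid float rounding.
            r_start, r_end = region * n, (region + 1) * n
            for k, ch in enumerate(word):
                if ch not in index:
                    continue  # characters outside the alphabet are ignored
                c_start, c_end = k * level, (k + 1) * level
                overlap = min(c_end, r_end) - max(c_start, r_start)
                # Character span has length `level`; require >= 50% overlap.
                if 2 * overlap >= level:
                    hist[index[ch]] = 1
            vector.extend(hist)
    return vector
```

With the default settings the vector has 36 × (1 + 2 + 3) = 216 entries. Two such vectors can be compared with any standard distance, which is what makes the representation usable for selecting dissimilar samples.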
Place, publisher, year, edition, pages
Springer, 2020. Vol. 12116, p. 383-397
Keywords [en]
Active learning, Graph representation, PHOCNet, Sample selection, Word spotting, Knowledge representation, Graph-based representations, Handwritten document, Image selection, Synthetic data, Training sample, Graphic methods
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:hj:diva-50615
DOI: 10.1007/978-3-030-57058-3_27
Scopus ID: 2-s2.0-85090096109
ISBN: 9783030570576 (print)
ISBN: 978-3-030-57058-3 (electronic)
OAI: oai:DiVA.org:hj-50615
DiVA, id: diva2:1466967
Conference
14th IAPR International Workshop on Document Analysis Systems, DAS 2020; Wuhan; China; 26 July 2020 through 29 July 2020
2020-09-14; 2020-09-14; 2021-03-15; Bibliographically approved