A confidence-based interface for neuro-symbolic visual question answering
2022 (English). In: Combining Learning and Reasoning: Programming Languages, Formalisms, and Representations: CLeaR Workshop, 2022. Conference paper, Published paper (Refereed)
Abstract [en]
We present a neuro-symbolic visual question answering (VQA) approach for the CLEVR dataset that is based on the combination of deep neural networks and answer-set programming (ASP), a logic-based paradigm for declarative problem solving. We provide a translation mechanism from the questions included in CLEVR to ASP programs. By exploiting choice rules, we consider both deterministic and non-deterministic scene encodings. In addition, we introduce a confidence-based interface between the ASP module and the neural network which allows us to restrict the non-determinism to objects classified by the network with high confidence. Our experiments show that, in comparison with the deterministic approach, the non-deterministic scene encoding achieves good results even if the neural networks are trained rather poorly. This is important for building robust VQA systems whose network predictions are less than perfect.
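To make the described confidence-based interface concrete, the following Python sketch shows one plausible way a neural classifier's per-object attribute probabilities could be turned into ASP facts and choice rules, with the non-determinism restricted to values predicted above a confidence threshold. The predicate names (obj/1, has_attr/3), the threshold, and the input format are illustrative assumptions for this record only, not the encoding used in the paper.

```python
# Hypothetical sketch of a confidence-based interface between a neural
# object classifier and an ASP scene encoding. Predicate names, the
# threshold value, and the input format are assumptions, not the
# authors' actual encoding.

def scene_to_asp(predictions, threshold=0.9):
    """Turn per-object attribute probabilities into ASP facts and choice rules.

    `predictions` maps an object id to a dict of attributes, each holding
    a {value: probability} distribution produced by the neural network.
    Values whose confidence reaches `threshold` become the candidates of
    a choice rule; lower-confidence values are dropped, which restricts
    the non-determinism to high-confidence classifications.
    """
    rules = []
    for obj_id, attrs in predictions.items():
        rules.append(f"obj({obj_id}).")
        for attr, dist in attrs.items():
            candidates = [v for v, p in dist.items() if p >= threshold]
            if not candidates:
                # Fallback (an assumption of this sketch): if nothing is
                # confident enough, keep all values as candidates.
                candidates = list(dist)
            choices = "; ".join(
                f"has_attr({obj_id},{attr},{v})" for v in candidates
            )
            # Exactly one value per attribute is chosen by the ASP solver.
            rules.append(f"1 {{ {choices} }} 1.")
    return "\n".join(rules)


if __name__ == "__main__":
    preds = {
        0: {"color": {"red": 0.95, "blue": 0.93, "green": 0.02},
            "shape": {"cube": 0.99, "sphere": 0.01}},
    }
    print(scene_to_asp(preds))
```

For the example object above, the sketch would emit a choice rule over red and blue (both above the threshold) but a deterministic-looking singleton choice for the shape, mirroring how high-confidence predictions limit the search space handed to the solver.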
Place, publisher, year, edition, pages
2022.
Keywords [en]
neuro-symbolic reasoning, visual-question answering, answer-set programming, deep learning
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:hj:diva-63652
OAI: oai:DiVA.org:hj-63652
DiVA, id: diva2:1839599
Conference
CLeaR 2022, The First International Workshop on Combining Learning and Reasoning: Programming Languages, Formalisms, and Representations, in conjunction with the 36th AAAI Conference on Artificial Intelligence (AAAI-2022), February 22–March 1, 2022, Vancouver, BC, Canada
2024-02-21. Bibliographically approved