👷 Seminar on Machine Learning, Information Retrieval, and Scientific Visualization

[Šárka Ščavnická]: Multimodal Question Answering 13. 4. 2023

Abstract

Document question answering aims to provide users with accurate and efficient answers to their queries, thereby improving access to relevant information. This task has become increasingly important in recent years due to the vast amount of digital information available on the web.

In this talk, we present the first document question-answering model trained on Czech invoices. We also discuss different ways to ensure that the model can answer previously unseen questions, such as questions about entities in the text that it was not trained to extract.

Slides

Presentation slides: DVQA

Presentation recordings

Presentation recording

Readings

[1] Xu, Yiheng, et al. “LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding.” arXiv:2104.08836 (2021).
