👷 Seminar on Machine Learning, Information Retrieval, and Scientific Visualization

[Michal Štefánik et al.]: Intelligent Back Office: the past, present, and future (30. 3. 2023)

Abstract


Within Intelligent BackOffice (IBO), we are developing a system for the automated processing of invoices.

We decompose the solution into two main components:


  1. Text extraction: Using Optical Character Recognition (OCR) techniques, we extract the text elements from the bitmap image representation of the original document.

  2. Entity recognition: Based on (i) the contents of the text elements, (ii) the visual layout, and (iii) the relations between elements in the document, we classify the elements in the document into standard categories.
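
The two-component decomposition above can be illustrated with a minimal sketch. This is not the IBO implementation: the OCR stage is replaced by a stub that returns made-up text elements, and the classifier is a handful of hand-written rules standing in for whatever model the project actually uses. All names (`TextElement`, `extract_text`, `classify`, `process_invoice`) and the category labels are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class TextElement:
    text: str
    x: float  # left edge as a fraction of page width (0 = left)
    y: float  # top edge as a fraction of page height (0 = top)


def extract_text(page_image):
    """Stage 1 stand-in: a real system would run an OCR engine on the bitmap.

    The image argument is ignored; fixed, made-up elements are returned.
    """
    return [
        TextElement("ACME s.r.o.", 0.05, 0.05),
        TextElement("Invoice no.", 0.60, 0.05),
        TextElement("2023-0042", 0.80, 0.05),
        TextElement("Total", 0.60, 0.90),
        TextElement("1 250.00", 0.80, 0.90),
    ]


def left_neighbor(element, elements):
    """Return the closest element to the left on the same line, if any."""
    same_line = [e for e in elements
                 if abs(e.y - element.y) < 0.01 and e.x < element.x]
    return max(same_line, key=lambda e: e.x, default=None)


def classify(element, elements):
    """Stage 2 stand-in: hand-written rules instead of a trained model."""
    # (iii) relation to another element: the label printed to the left
    neighbor = left_neighbor(element, elements)
    label = neighbor.text.lower() if neighbor else ""
    # (i) content: looks like an amount, and the neighboring label says "Total"
    if re.fullmatch(r"[\d ]+\.\d{2}", element.text) and label == "total":
        return "total_amount"
    if label.startswith("invoice no"):
        return "invoice_number"
    # (ii) visual layout: unlabeled text in the top-left corner of the page
    if element.y < 0.10 and element.x < 0.50 and neighbor is None:
        return "supplier_name"
    return "other"


def process_invoice(page_image):
    """Run both stages and map each extracted text to its category."""
    elements = extract_text(page_image)
    return {e.text: classify(e, elements) for e in elements}
```

Calling `process_invoice(None)` runs both stages on the stubbed data. The point of the sketch is only the interface between the stages: the first produces positioned text elements, and the second assigns each element a category using its content, its position, and its neighbors.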

In this presentation, we will talk about the current state of the project and the options for further quality enhancements.

Slides

https://docs.google.com/document/d/1NfpNuTVCmrHrRqUIo684ssyMDP1qW_dl1fq1j_8d3Gk/edit?usp=sharing