Inv3D: a high-resolution 3D invoice dataset for template-guided single-image document unwarping

Hertlein, Felix 1; Naumann, Alexander 2; Philipp, Patrick
1 Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB), Karlsruher Institut für Technologie (KIT)
2 Institut für Fördertechnik und Logistiksysteme (IFL), Karlsruher Institut für Technologie (KIT)


Numerous business workflows involve printed forms, such as invoices or receipts, which are often manually digitized so that the data can be stored and searched persistently. As hardware scanners are costly and inflexible, smartphones are increasingly used for digitization. Here, processing algorithms need to deal with prevailing environmental factors, such as shadows or crumples. Current state-of-the-art approaches learn supervised image dewarping models based on pairs of raw images and rectification meshes. The available results show promising predictive accuracies for dewarping, but the remaining errors still lead to sub-optimal information retrieval. In this paper, we explore the potential of improving dewarping models using additional, structured information in the form of invoice templates. We provide two core contributions: (1) a novel dataset, referred to as Inv3D, comprising synthetic and real-world high-resolution invoice images with structural templates, rectification meshes, and a multiplicity of per-pixel supervision signals, and (2) a novel image dewarping algorithm, which extends the state-of-the-art approach GeoTr to leverage structural templates using attention. ...
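
To illustrate what a rectification mesh does in this setting, the following is a minimal sketch, not the authors' implementation. It assumes the mesh is available as a dense backward map of shape (H_out, W_out, 2) holding normalized (row, column) coordinates in [0, 1]; the function name `unwarp` and this storage layout are hypothetical and may differ from the format used by Inv3D. Nearest-neighbour sampling is used only to keep the example short.

```python
import numpy as np


def unwarp(image: np.ndarray, backward_map: np.ndarray) -> np.ndarray:
    """Resample a warped image with a dense backward map (assumed layout).

    backward_map[y, x] holds the normalized (row, col) position in the
    warped input image whose value belongs at output pixel (y, x).
    """
    h_in, w_in = image.shape[:2]
    # Scale normalized coordinates to pixel indices of the input image.
    rows = backward_map[..., 0] * (h_in - 1)
    cols = backward_map[..., 1] * (w_in - 1)
    # Nearest-neighbour lookup; clip guards against slightly out-of-range values.
    rows = np.clip(np.rint(rows), 0, h_in - 1).astype(int)
    cols = np.clip(np.rint(cols), 0, w_in - 1).astype(int)
    # Advanced indexing gathers one source pixel per output position.
    return image[rows, cols]
```

A dewarping model in this sketch would predict `backward_map` from the photographed document, and `unwarp` would then produce the flattened image; in practice, bilinear interpolation would typically replace the nearest-neighbour lookup.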

DOI: 10.1007/s10032-023-00434-x
Citations: 3
Affiliated institution(s) at KIT: Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB); Institut für Fördertechnik und Logistiksysteme (IFL)
Publication type: Journal article
Publication year: 2023
Language: English
Identifier: ISSN: 1433-2833, 1433-2825
KITopen-ID: 1000159172
Published in: International Journal on Document Analysis and Recognition (IJDAR)
Publisher: Springer
Published online ahead of print: 29 April 2023
Keywords: Document Unwarping, Dataset, Template, OCR, Transformer
