Graph representation learning for digitized document analysis

Extracting information from documents involves several tasks: document classification, text localization, OCR (optical character recognition), table extraction, and key information detection. In this context, graph-based approaches are an attractive way to process documents. Graphs are a natural way to represent connections between objects (text, blocks, images, etc.) and can reveal new and hidden insights from data. Text extracted from scanned documents can be modeled as a graph to make the best use of its features. In addition, understanding spatial relationships is crucial to the quality of text extraction in some applications, such as account analysis. The goal is to capture the structural relationships between keywords (account number, date, amount) and the corresponding value (the desired information). An effective approach requires a combination of spatial and textual information.
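As a rough illustration of combining spatial and textual information, the sketch below models OCR tokens as graph nodes and links each token to its nearest neighbors, storing the spatial offset on each edge. Everything here is an assumption made for the example, not a method prescribed by the workshop: the `Token` class, the `build_graph` and `value_for` helpers, the neighbor count `k`, the `line_tol` threshold, and the sample coordinates and values are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Token:
    """A hypothetical OCR token: its text and the center of its bounding box."""
    text: str
    x: float
    y: float


def build_graph(tokens, k=3):
    """Link each token to its k nearest neighbors.

    Returns {node index: [(neighbor index, (dx, dy)), ...]}, with neighbors
    ordered from nearest to farthest. The (dx, dy) offset is the spatial
    feature stored on the edge.
    """
    graph = {}
    for i, a in enumerate(tokens):
        ranked = sorted(
            (hypot(b.x - a.x, b.y - a.y), j)
            for j, b in enumerate(tokens)
            if j != i
        )
        graph[i] = [
            (j, (tokens[j].x - a.x, tokens[j].y - a.y)) for _, j in ranked[:k]
        ]
    return graph


def value_for(keyword, tokens, graph, line_tol=5.0):
    """Return the nearest neighbor to the right of `keyword` on the same line.

    Textual information selects the keyword node; spatial information
    (edge offsets) selects the value node. Returns None if nothing matches.
    """
    for i, tok in enumerate(tokens):
        if tok.text == keyword:
            for j, (dx, dy) in graph[i]:
                if dx > 0 and abs(dy) <= line_tol:
                    return tokens[j].text
    return None


# Made-up tokens laid out as three "keyword value" lines.
tokens = [
    Token("Account No.", 50, 20),
    Token("ACC-001", 150, 20),
    Token("Date", 50, 40),
    Token("2021-09-05", 150, 40),
    Token("Amount", 50, 60),
    Token("120.00", 150, 60),
]
graph = build_graph(tokens)
print(value_for("Amount", tokens, graph))
```

A learned model would replace the hand-written rule in `value_for`, but the graph structure (nodes carrying text, edges carrying relative positions) is the kind of representation the paragraph above describes.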

The context

Reliable reading, also known as automatic document image processing, is an important task in application fields such as billing, subject review, and prescription analysis, and it has significant business potential. Several approaches have been proposed in the literature, but data set availability and data confidentiality remain challenges.

Objective

This workshop aims to bring together experts from industry, science and academia to share ideas and discuss current research in graph representation learning for the analysis of digitized documents.

We encourage the description of new document image analysis problems or information retrieval applications that have emerged in recent years. We also encourage work on the development of new scanned document data sets for new applications.

Submission:

The workshop is open to original papers of a theoretical or practical nature. Papers must be formatted according to the LNCS guidelines for authors. GLESDO 2021 will follow a double-blind review process. Authors should not include their names and affiliations anywhere in the manuscript. Authors should also take care not to reveal their identity implicitly, by referring to their previous work in the third person and omitting acknowledgments until the camera-ready version. Papers must be submitted via the workshop's EasyChair submission page.

We welcome the following types of contributions:

Full research papers (12-15 pages): completed or consolidated research and development work that falls within one of the workshop topics.

Short papers (6-8 pages): work in progress with relevant preliminary results, open to discussion.

At least one author of each accepted paper must register for the workshop to present the paper. For more information, see the ICDAR 2021 page.