The biennial event on document analysis and recognition is now in its 14th edition and brings the scientific community together to present methodological and technological advances, one of the most relevant being Deep Learning.

The 14th International Conference on Document Analysis and Recognition (ICDAR 2017) took place from 9 to 15 November. The event is associated with the IAPR (International Association for Pattern Recognition), which represents the research community in this field. The conference continues to establish itself as a world reference, together with ICPR and DAS, among general conferences on document processing. Its objectives are still to analyse the state of the art in the field of document processing, to address future challenges and to strengthen the cohesion of the research community itself.

Serimag was keen to seize this new opportunity to learn about the latest research and development firsthand, from leading figures in these fields. Our commitment to R&D allows us to participate in a world in which advances are happening at great speed.

Our ID badge at ICDAR 2017

The event took place at the Tessa Hall in Kyoto, and was attended by about 500 participants from around the world. Upon our arrival, we were greeted by a perfectly organised group of volunteers made up of teachers and students who gave us our identification badges, handouts to make sure we wouldn’t miss a thing, and even a T-shirt with the words “No character, no life” in Japanese. Fantastic!

Every day we had the opportunity to enjoy a keynote speech from a distinguished researcher in his or her field. In the first of these speeches, Prof. Rangachar Kasturi from the University of South Florida outlined the developments that have taken place over the last few decades in the field of graphics recognition, before going on to describe the current state of the art. In the second speech, Prof. Andreas Dengel from the German Research Centre for Artificial Intelligence (DFKI) talked about the referential analysis of papers from a more emotionally intelligent perspective. The third speech was by Prof. Xiang Bai from Huazhong University of Science and Technology, who addressed document analysis in natural scenes, where text needs to be detected, recognised and read in the real world.

Keynote speech by Prof. Xiang Bai from Huazhong University of Science and Technology

There were also more than 50 lectures addressing a wide range of current issues, including document classification, word spotting, OCR, identification of the writer or signatory, scene recognition, processing of historical documents, layout analysis, recognition of online and offline handwritten text and graphics recognition. In addition, more than 150 posters were presented by different researchers. These excellent papers were displayed in three different rooms, where we had the chance to discuss them with their authors. Finally, a number of competitions were organised by ICDAR or by third parties. The Japanese cakes we ate during coffee breaks provided us with a much-needed rest after taking in such huge amounts of information.

Poster session in one of the rooms at ICDAR

This year’s event really opened our eyes to the number of definitions for the word “document”. At one time it described a sheet of paper with printed text only. Then images and graphics were added. Today the concept is much broader and covers any text that must be retrieved and interpreted from a whole range of sources. A video or a photograph is as much a document as a sheet of paper. Hence, the term is used in fields such as robotics, augmented reality, autonomous driving and surveillance.

Finally, although Deep Learning was still an innovative tool at the 2015 event held in Nancy, this year it has really established itself as the default processing method. But we mustn’t rest on our laurels. Deep Learning has allowed us to improve on the results obtained with older methodologies. However, it’s vital that we consider new horizons for this technology instead of simply using it to achieve improvements in current processes. There’s no doubt that we’ll see new developments in this area at upcoming conferences, such as ICML (the International Conference on Machine Learning). It will be interesting to observe how document processing can exploit such advances.

For Serimag, it was a fantastic experience to be able to share these advances with industry leaders. We have two years before the next event is held in Sydney. Perhaps we’ll present our own paper or lecture? We certainly have no shortage of knowledge, experience and desire!