In this paper we focus in this last concept. The linguistic model can range from simple n-grams (probabilities of character or word sequences), to sophisticated syntactic formalisms enriched with semantic information. The former is able to recognize the visual shape of characters or graphemes, and the second interprets them in their context based on some structural rules. Generally speaking, handwriting recognition relies on the combination of two models, the optical model and the linguistic model. The main difficulties are: paper degradation, differences in the handwriting style across centuries, and old vocabulary and syntax. Indeed, after decades of research, this task is still considered an open problem, specially when dealing with historical manuscripts. Within this field, one of the most challenging tasks is handwriting recognition , defined as the task of converting the text contained in a document image into a machine readable format. Document Image Analysis and Recognition (DIAR) is the pattern recognition research field devoted to the analysis, recognition and understanding of images of documents.
0 Comments
Leave a Reply. |