Handwritten Character Segmentation

Introduction and Objectives

Handwritten character segmentation is a topic of high interest and potential. Thanks to recent progress in machine learning and computer vision, it is becoming a powerful tool for industry. Automatic mail sorting systems and the recognition of handwritten text in historical documents are just a few of its real-world applications; Figure 1 shows one of them. Such systems reduce a large amount of manual work to a few simple tasks done by computers, which saves much time and human resources. Reliability and consistency then depend on choosing a suitable algorithm and training it well enough to get satisfying results.

Figure 1. Address Recognition

Project Methodology

Our project topic is wide and challenging. Researchers working on it have already achieved a great deal, but there is still room for development. Our ultimate goal is to work on one particular problem and improve the existing results using the techniques available for automatically recognising handwritten characters.

The challenge arises not only from the fact that different people have different writing styles, but also from the amount of work needed to understand the writing style of a single person. In handwritten text, neighbouring characters within a word are usually connected, so the word has to be segmented into individual characters. One common characteristic of the existing algorithms is that the character segmentation process is closely coupled with the recognition process.

Given the image of a horizontal text line, our goal is to segment it into characters. To do this, we will work with vertical feature vectors (features computed on vertical slices of the line) and perform a binary classification on them to find the best separation points between two characters. After this step, we obtain connected components and recognise each character separately.
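
As a rough illustration of this idea, the following Python sketch uses the simplest possible vertical feature, the amount of ink in each pixel column, and a threshold as the binary classifier. The function names and the `max_ink` parameter are assumptions made for this example; they do not come from the libraries we will use, and the project itself is planned in C/C++.

```python
def column_profile(image):
    """Count the ink pixels in every column of a binary image (1 = ink)."""
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def candidate_cuts(image, max_ink=0):
    """Return one cut position per run of columns with at most max_ink ink pixels.

    This threshold is a hypothetical stand-in for a trained binary classifier.
    Runs that touch the left or right border are margins, not gaps between
    characters, so they are ignored.
    """
    profile = column_profile(image)
    width = len(profile)
    cuts = []
    run_start = None
    # The extra sentinel value closes a run that reaches the right border.
    for x, ink in enumerate(profile + [max_ink + 1]):
        if ink <= max_ink:
            if run_start is None:
                run_start = x
        elif run_start is not None:
            if run_start > 0 and x < width:
                cuts.append((run_start + x - 1) // 2)  # middle of the run
            run_start = None
    return cuts
```

With `max_ink=0` this only finds gaps between characters that do not touch. Connected handwriting has no empty columns, which is why richer feature vectors and a trained classifier are needed.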

Next semester, our project will consist of theory (25% of the workload), development (50%) and experimental evaluation (25%). Although we draw a sharp distinction between these phases, they are closely related to each other. In the rest of this blog post, we describe each step in more detail.

In the first phase of the work, we plan to study the necessary theoretical background in the field of character segmentation. More precisely, we will familiarise ourselves with the existing libraries. Some of them will be provided by our teaching assistants (TAs), but we will also look for others. This step is crucial, since it is the foundation of the project. In this groundwork phase, we are going to learn the main machine learning and computer vision techniques that are applied to this task. Since it is a very topical and interesting field, many papers have been published on it that will help us understand these techniques. The paper ‘Character segmentation in handwritten words-an overview’ by Yi Lu and M. Shridhar is a good starting point for us. As that paper discusses, text processing includes three main stages:

  1. determining the skew (any tilt at which the document may have been scanned);
  2. finding columns, paragraphs, text lines, and words; and
  3. performing optical character recognition (OCR).
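
To make the first stage more concrete, here is a minimal Python sketch of one common way to estimate skew, the projection-profile method. It is our own illustration and is not necessarily the method used in the paper or in the libraries we will work with. For each candidate angle it shears the ink pixels back to horizontal and measures how sharply the ink concentrates into rows; the angle with the sharpest profile wins. The function names and the default range of candidate angles are assumptions.

```python
import math


def profile_sharpness(ink_pixels, angle_degrees):
    """Sum of squared row counts after undoing a skew of angle_degrees.

    The score is high when the ink falls into few rows, which is what
    happens when the text lines are horizontal.
    """
    slope = math.tan(math.radians(angle_degrees))
    rows = {}
    for x, y in ink_pixels:
        corrected_row = round(y - x * slope)
        rows[corrected_row] = rows.get(corrected_row, 0) + 1
    return sum(count * count for count in rows.values())


def estimate_skew_degrees(image, candidates=range(-10, 11)):
    """Return the candidate angle whose corrected row profile is sharpest.

    image is a list of rows with 1 for ink. Row indices grow downwards, so a
    positive angle means the text descends towards the right.
    """
    ink_pixels = [(x, y)
                  for y, row in enumerate(image)
                  for x, value in enumerate(row) if value]
    return max(candidates, key=lambda angle: profile_sharpness(ink_pixels, angle))
```

A finer angle step or a coarse-to-fine search would give a more precise estimate at a higher cost.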

At the end of this phase, we will write a summary of the important theoretical knowledge needed.

Once we clearly understand the techniques, we will move on to the development phase. Most of our time (around 2 months) will be devoted to this part of the project. In this phase, we will start working with the tools (in particular, libraries) provided by the TAs. We plan to use C/C++ for the software development, as it is one of the most widely used languages in this field. More specifically, there already exists an artificial vision system capable of locating and extracting content from images of historical documents. This system has been developed mainly by the TAs and works as follows:

It first improves the quality of the historical images by keeping the pixels of the text and removing all other pixels belonging to the paper, such as background, dirt and other unwanted effects. It then extracts the lines of text in the form of polygons circumscribing them. Our job will be to work with the resulting snake-shaped images, as shown in Figure 2. Ideally, we will apply the following approach, which describes the work of this phase in more detail:

  1. compute the skeletal image of the line of text;
  2. slide a window horizontally (one pixel at a time) along the baseline of the line of text;
  3. extract a feature vector at each window position; and
  4. classify the points of contact between two different characters (using projection profiles and other features).

Figure 2. Written historical Document
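
Steps 2 to 4 of the approach above can be sketched in Python as follows. This is only an illustration under several assumptions: it skips the skeletonisation of step 1 and takes an already binarised line image as input, the three features (ink density, stroke crossings, vertical extent of the ink) are examples we picked ourselves, and the thresholds in `is_contact_point` are hypothetical placeholders for a trained classifier.

```python
def window_features(image, x, half_width=1):
    """Feature vector for the window centred on column x of a binary image."""
    height = len(image)
    width = len(image[0])
    lo = max(0, x - half_width)
    hi = min(width, x + half_width + 1)
    # Fraction of ink pixels inside the window (a local projection profile).
    ink = sum(image[y][c] for y in range(height) for c in range(lo, hi))
    density = ink / (height * (hi - lo))
    # Number of separate strokes crossed by the centre column.
    column = [image[y][x] for y in range(height)]
    crossings = sum(1 for y in range(height)
                    if column[y] == 1 and (y == 0 or column[y - 1] == 0))
    # Highest and lowest ink rows in the centre column (-1 if it is empty).
    ink_rows = [y for y in range(height) if column[y]]
    top = ink_rows[0] if ink_rows else -1
    bottom = ink_rows[-1] if ink_rows else -1
    return [density, crossings, top, bottom]


def slide(image, half_width=1):
    """Move the window one pixel at a time and collect the feature vectors."""
    return [window_features(image, x, half_width) for x in range(len(image[0]))]


def is_contact_point(features, max_density=0.25, max_crossings=1):
    """Hypothetical rule: little ink and a single thin stroke suggest a ligature."""
    density, crossings, _, _ = features
    return density <= max_density and crossings <= max_crossings
```

For two blobs joined by a one-pixel stroke, only the columns in the middle of the joining stroke satisfy the rule. In the real system, the feature vectors would be passed to a classifier trained on labelled contact points instead of fixed thresholds.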

By following the procedure above and applying the corresponding algorithms, we expect to arrive at the solution we are aiming for. By the end of the second phase, we expect to have a working product that takes handwritten text as input and outputs the text that the algorithm recognises. We will also spend some time making this application user-friendly and good-looking.

In the last step, we plan to assess this product. This phase will take about the last 2-4 weeks of the project. The TAs will provide us with the corresponding test files, and we will examine how well our product works. The experimental evaluation phase will also include improving the product if we encounter problems. The following table summarises the project phases we discussed above: