Thursday, August 25, 2011

A12 - Preprocessing Text

Just some random thoughts... One of the very first things we learned when we started going to school was to write. Writing is a representation of a language through the use of a set of symbols (in our case, an alphabet). Before computers became popular, most people wrote texts, letters, etc. by hand, and handwritten text is unique to each individual.

With this in mind, how do people understand other people's handwritten text, especially if it's too "ugly"?
--> I guess it's our innate ability to read words not letter by letter but from the first and last letters only, and still decipher the exact word instantly.

In relation to the handwritten text I was talking about above, I'll show in this blog post how to extract handwritten text from an imaged document with lines.


The scanned image I will use is

 
Figure 1. Scanned document with handwritten and printed texts.

The first issue to address is the apparent tilt of the image. We need to find the appropriate angle to bring back the supposedly horizontal lines to the real horizontal orientation. One solution for this is through Fourier Transform.

For faster processing, we can crop figure 1 focusing on a part with handwritten and printed texts as shown

Figure 2. Cropped region of interest.

Taking the amplitude of figure 2's Fourier Transform in log scale and binarizing it with a chosen threshold gives

Figure 3. (a) Amplitude of FT of figure 2. (b) Binarized version with threshold = 0.6.
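The log-scale amplitude and the thresholding step can be sketched in a few lines. This is only a rough illustration in Python with NumPy, not the Scilab code used for the figures: the function names `log_amplitude_spectrum` and `binarize` are made up here, the spectrum is normalized to [0, 1] before thresholding (an assumption about how the 0.6 threshold was applied), and a synthetic page of perfectly horizontal ruled lines stands in for figure 2.

```python
import numpy as np

def log_amplitude_spectrum(image):
    """Return the centred log-scale FT amplitude, normalized to [0, 1]."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    log_amp = np.log1p(np.abs(spectrum))
    peak = log_amp.max()
    return log_amp / peak if peak > 0 else log_amp

def binarize(values, threshold=0.6):
    """Boolean mask of the entries at or above the threshold."""
    return values >= threshold

# Synthetic stand-in for the cropped page: a ruled line every 8 rows.
page = np.zeros((64, 64))
page[::8, :] = 1.0

amp = log_amplitude_spectrum(page)
mask = binarize(amp, 0.6)

# For untilted lines, every surviving peak sits on the central column.
peak_columns = np.unique(np.nonzero(mask)[1])
```

With untilted lines the peaks all fall on the central column of the spectrum; a tilted page would spread them along a slanted line instead.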

From Fourier Transform basics, we know that the Fourier Transform of a horizontal line is a vertical line in frequency space. But note that figures 3a and 3b do not show straight vertical peaks. This is because the initial image is tilted, and a tilt in real space corresponds to an equal tilt in frequency space. Thus, to know the tilt angle, we compare the results in figure 3 to the FT of a real horizontal line

Figure 4. (a) Horizontal line image. (b) Amplitude of FT of a.

Using trigonometry, we can measure the tilt angle using the inverse tangent with the formula
 
Figure 5. Trigonometric formula to get image tilt angle.

We can obtain the values of x and y by comparing figures 3b and 4b to get an approximate tilt angle of 1.04 degrees.
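As a small illustration of the inverse-tangent step, here is a sketch in Python. It assumes that x is the horizontal offset of a spectral peak from the vertical axis and y is its vertical distance from the center of the spectrum; the post's own formula is in figure 5, so this reading of x and y is an assumption. The function name `tilt_angle_degrees` and the pixel offsets below are made up for the example and were not measured from figures 3b and 4b.

```python
import math

def tilt_angle_degrees(x_offset, y_offset):
    """Angle between the vertical axis and the line from the spectrum
    center to a peak displaced by (x_offset, y_offset) pixels."""
    return math.degrees(math.atan2(x_offset, y_offset))

# Hypothetical offsets in pixels, chosen only to give a small angle.
angle = tilt_angle_degrees(2.0, 110.0)
```

Using `atan2` instead of a plain division avoids a divide-by-zero when the peak lies on the horizontal axis.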

We can then rectify the image to bring it back to its normal position using the mogrify(image, ['-rotate', angle]) function of Scilab as shown

Figure 6. Rotated version of figure 2 by 1.04 degrees.

The next objective is to remove the horizontal lines in figure 6. This can be done in frequency space: since the Fourier Transform of a horizontal line is a vertical line, we can remove the horizontal lines by masking that vertical line with two vertical rectangular strips, as shown in figure 7b. The result is shown in figure 7c.

 
Figure 7. (a) Fourier Transform of figure 6. (b) Applying two vertical rectangular
strips. (c) Resulting inverse FFT of b.
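The masking idea can be sketched as follows, again in Python with NumPy rather than Scilab. The function name `remove_horizontal_lines` and the parameters `half_width` and `keep_radius` are hypothetical. The sketch assumes the two strips sit above and below the center of the shifted spectrum and leave a small gap around it so the overall brightness survives; the actual strip sizes used for figure 7b are not stated in the post.

```python
import numpy as np

def remove_horizontal_lines(image, half_width=1, keep_radius=2):
    """Zero two vertical strips of the centred spectrum, sparing the rows
    within keep_radius of the center, then transform back."""
    rows, cols = image.shape
    cy, cx = rows // 2, cols // 2
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    mask = np.ones((rows, cols))
    left, right = cx - half_width, cx + half_width + 1
    mask[:cy - keep_radius, left:right] = 0.0      # strip above the center
    mask[cy + keep_radius + 1:, left:right] = 0.0  # strip below the center
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return np.real(filtered)

# Synthetic test page: ruled lines every 8 rows plus one vertical stroke.
page = np.zeros((64, 64))
page[::8, :] = 1.0
page[10:50, 20] = 1.0

cleaned = remove_horizontal_lines(page)
```

On this toy page the ruled lines flatten into a nearly uniform background while the vertical stroke stays clearly visible. Wider strips remove slightly tilted lines too, at the cost of eating into the strokes.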

Now, we binarize figure 7c and perform morphological operations as shown

Figure 8. Apply (a) binarization, (b) color inversion,
(c) opening, (d) closing, (e) erosion and (f) thinning

The steps performed in binarizing, cleaning and processing figure 7c are shown in figure 8 in chronological order from (a) to (f). In step (a), figure 7c is binarized using a threshold that removes most unnecessary pixels without distorting the letters of interest. This is followed by step (b), where the binarized image is simply inverted (black becomes white and vice versa). To remove isolated unnecessary pixels, we apply the opening morphological operator with a 2x1 structuring element, as shown in figure 8c. The closing operator is then applied, as shown in figure 8d, to close the gaps in each letter. To separate the letters from each other and reduce their thickness, erosion is applied using a 1x2 structuring element, as shown in figure 8e. Finally, thinning is applied to reduce each letter to one pixel in thickness, as shown in figure 8f.
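To make the opening and closing steps concrete, here is a minimal sketch of binary erosion and dilation with a rectangular structuring element, written in Python with NumPy instead of the Scilab morphology functions. The function names are made up for this example, "2x1" is read here as 2 rows by 1 column (the post does not say which way round it is), and the structuring element is anchored at its top-left pixel. Thinning is left out because it needs a more involved algorithm.

```python
import numpy as np

def erode(binary, se_shape):
    """A pixel survives only if the whole structuring element, placed with
    its top-left corner on that pixel, fits inside the foreground."""
    h, w = se_shape
    rows, cols = binary.shape
    padded = np.pad(binary, ((0, h - 1), (0, w - 1)),
                    mode="constant", constant_values=False)
    out = np.ones((rows, cols), dtype=bool)
    for dy in range(h):
        for dx in range(w):
            out &= padded[dy:dy + rows, dx:dx + cols]
    return out

def dilate(binary, se_shape):
    """Stamp the structuring element onto every foreground pixel."""
    h, w = se_shape
    rows, cols = binary.shape
    padded = np.pad(binary, ((h - 1, 0), (w - 1, 0)),
                    mode="constant", constant_values=False)
    out = np.zeros((rows, cols), dtype=bool)
    for dy in range(h):
        for dx in range(w):
            out |= padded[dy:dy + rows, dx:dx + cols]
    return out

def opening(binary, se_shape):
    """Erosion followed by dilation: removes specks smaller than the element."""
    return dilate(erode(binary, se_shape), se_shape)

def closing(binary, se_shape):
    """Dilation followed by erosion: fills gaps smaller than the element."""
    return erode(dilate(binary, se_shape), se_shape)

# Opening removes an isolated pixel but keeps a 3x3 block intact.
speck_and_block = np.zeros((8, 8), dtype=bool)
speck_and_block[1, 1] = True
speck_and_block[3:6, 3:6] = True
opened = opening(speck_and_block, (2, 1))

# Closing bridges a one-pixel gap in a vertical stroke.
broken_stroke = np.zeros((9, 3), dtype=bool)
broken_stroke[[2, 3, 5, 6], 1] = True
closed = closing(broken_stroke, (2, 1))
```

The two toy images mirror what steps (c) and (d) are for: the speck disappears under opening, and the broken stroke becomes one connected piece under closing.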

A final step would be to label (using bwlabel) the apparent letters formed and compare with the original image

Figure 9. Each connected blob (treated as a single letter) labelled with a different color.

The results in figure 9 show that 74 clusters were formed, but the actual number of letters in the cropped text is 46. This difference tells us that some letters broke apart and that the resulting fragments were counted as stand-alone letters.
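The labelling step itself is simple to sketch. The following plain-Python flood fill is a stand-in for bwlabel, not its actual implementation: the function name `label_components` is made up, and 8-connectivity is an assumption (with 4-connectivity, diagonal neighbours would count as separate blobs and the cluster count would be even higher).

```python
from collections import deque

def label_components(binary):
    """Label 8-connected foreground blobs; returns (labels, count)."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or labels[r][c] != 0:
                continue
            # New blob found: flood-fill it with the next label.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

grid = [
    [1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0],
]
labels, count = label_components(grid)
```

In this toy grid the two diagonal pixels on the right form one blob, so three blobs are found in total. A letter whose strokes are separated by even a single background pixel gets more than one label, which is how 46 letters can turn into 74 clusters.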

The resulting extracted handwriting is not as clear as we wanted. The only text that is almost clearly readable is the word "cable" in the lower portion.


As an extension of extracting text from an imaged document, we can find words in a text using template matching. Since the most abundant word in figure 1 is the printed word "DESCRIPTION", we will find all occurrences of it using the Scilab function imcorrcoef.



We first rotate figure 1 using the same angle used above (1.04 degrees) and binarize

 Figure 10. (a) Rotated version of figure 1. (b) Binarized version of a. 
(Occurrences of the "DESCRIPTION" are encircled)

We then use the template of "DESCRIPTION" shown in figure 11a, cropped from the original text. Since imcorrcoef requires the template to be an nxn image (where n is odd), I resized figure 11a to figure 11b by adding white pixels around it. The resized template is then binarized as shown in figure 11c, and its colors are inverted as shown in figure 11d.


 
Figure 11. (a) Template. (b) Resized to 83x83 image. (c) Binarized. 
(d) Colors inverted.

After the preliminary work of template creation, we can now proceed to using imcorrcoef. The result is shown in figure 12.


 
Figure 12. Correlation of the template with original image.

In figure 12, intensity is proportional to the degree of correlation: the brighter the spot, the higher the degree of correlation at that position. We can see in the encircled portions that there are three very bright spots, suggesting that these are the locations where the template "DESCRIPTION" can be found. The result agrees with the real locations of the word "DESCRIPTION" as shown in figure 10a.
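For readers who want to see what a correlation map like figure 12 is made of, here is a brute-force sketch in Python with NumPy. It computes the Pearson correlation coefficient between the template and every same-sized window of the image. It is not the implementation behind imcorrcoef; the function name `correlation_map` is made up, and unlike imcorrcoef this sketch does not need a square, odd-sized template. A small letter "E" stands in for the word "DESCRIPTION".

```python
import numpy as np

def correlation_map(image, template):
    """Correlation coefficient of the template with each window of the
    image; entry (y, x) refers to the window whose top-left corner is (y, x)."""
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    rows = image.shape[0] - th + 1
    cols = image.shape[1] - tw + 1
    scores = np.zeros((rows, cols))
    for y in range(rows):
        for x in range(cols):
            window = image[y:y + th, x:x + tw]
            w = window - window.mean()
            denom = t_norm * np.sqrt((w ** 2).sum())
            if denom > 0:  # flat windows keep a score of 0
                scores[y, x] = (t * w).sum() / denom
    return scores

# Toy template: a 5x5 letter "E" (white on black, like figure 11d).
template = np.array([
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
], dtype=float)

# Toy page with the letter stamped at two positions.
page = np.zeros((20, 20))
page[3:8, 4:9] = template
page[12:17, 10:15] = template

scores = correlation_map(page, template)
matches = np.argwhere(scores > 0.999)
```

The two stamped positions come out as the only near-perfect scores, the toy version of the bright spots in figure 12. The double loop is slow for a full scanned page, so a practical version would compute the same quantity with FFT-based correlation.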




12 Activities done, 6-8 activities to go! 

As a summary for this activity, I was able to implement a technique to find the tilt angle of a tilted image using the Fourier Transform. I was able to remove horizontal lines in an imaged document. I was able to separate text from the background, perform cleaning using multiple morphological operations and reduce the extracted text to one pixel in thickness. However, the resulting extraction was not perfect, because readability did not improve for all the text of interest. And finally, I was able to use the imcorrcoef function of Scilab to find the instances of the word "DESCRIPTION" in the given document.

For that, I think I would give myself a grade of 9.0.



