Just some random thoughts... One of the very first things we learned when we started going to school was how to write. Writing is the representation of a language through a set of symbols (in our case, the alphabet). Before computers became popular, most people wrote texts, letters, etc. by hand, and a person's handwriting is unique to that individual.
With this in mind, how do people understand other people's handwriting, especially if it's too "ugly"?
--> I guess it's our innate ability to read words not letter by letter but from their first and last letters, deciphering the exact word instantly.
In relation to the handwriting I was talking about above, I'll show in this blog post how to extract handwritten text from a scanned document with lines.
RECREATE, UNDERSTAND and FULFILL your imagined world through Image and Video Processing tools
Showing posts with label AP186.
Thursday, August 25, 2011
Thursday, August 18, 2011
A11 - Playing notes by image processing
Listening to music is an integral part of my everyday life. It relaxes me and gives me my private moment.
Artists and musicians play the music we hear using a score sheet, a sheet of musical notes that serves as a guide for performing a piece of music. So how do they play the music written on a score sheet? A common answer would be via an instrument or voice.
But did you know that we can actually play music using image processing? Cool!
In this blog post, we will extract musical notes from a digitized score sheet and play them in Scilab with the proper frequency and duration.
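As a quick teaser of the "play with the proper frequency and duration" step: the post does this in Scilab, but the idea translates directly. Below is a hedged NumPy sketch where a note is just a sampled sine wave; the function name, the 8 kHz sample rate, and the two example pitches are my assumptions, not the activity's actual code.

```python
import numpy as np

# A note is a sampled sine wave: frequency sets the pitch,
# duration sets how long it plays. fs is the sampling rate in Hz.
def note(freq_hz, duration_s, fs=8000):
    t = np.arange(int(fs * duration_s)) / fs
    return np.sin(2 * np.pi * freq_hz * t)

# Two quarter notes: A4 (440 Hz) followed by C5 (~523.25 Hz)
melody = np.concatenate([note(440.0, 0.25), note(523.25, 0.25)])
```

In Scilab the same sample array could be handed to the sound output; here it is simply an array of samples.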
Wednesday, August 3, 2011
A10 - Binary Operations
In connection with my previous blog post on Morphological Operations, I will discuss the techniques involved in Binary Operations. As you can infer from the name, these are operations that involve binarizing an image.
Why and where is this technique useful, then? Why binarize the image in the first place?
--> The answer is simple: binarizing an image makes it easier to separate the region of interest (ROI) from the background. Once the separation is successful, we can perform many further processes to understand the ROI. For example, in medical imaging, cancerous cells are often larger than normal cells, so after binarization we can detect them and separate them from the background and from normal cells by applying binary operations. Reading through this blog post will give you an idea of how this technique works.
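The separation step described above can be sketched with a simple threshold. This is a hedged NumPy illustration with purely synthetic data; the image, the 0.5 threshold, and all names are my assumptions, not the activity's actual procedure.

```python
import numpy as np

# Synthetic grayscale image: a dim background plus one bright "cell" (the ROI)
rng = np.random.default_rng(1)
gray = rng.random((32, 32)) * 0.4        # background values in [0, 0.4)
gray[10:20, 10:20] = 0.8                 # a 10x10 bright region of interest

# Binarize: pixels above the threshold become 1 (ROI), the rest 0 (background)
binary = (gray > 0.5).astype(np.uint8)

# With the ROI isolated, measurements become trivial, e.g. its area in pixels
roi_area = int(binary.sum())
```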
Monday, July 11, 2011
A6 - Fourier Transform Model of Image Formation
If you ask a physicist about the terms and concepts commonly used in his/her field, the Fourier Transform has a good chance of coming up. You know, the Fourier Transform may sound creepy to people working far from physics, but it's just a mathematical tool that works wonders =)
The Fourier transform decomposes a signal into its component frequencies. It has a wide range of applications, from optics to signal and image processing, and to real-life devices from spectroscopy to magnetic resonance imaging [2].
In strict mathematical terms, the Fourier Transform (FT) of a two-dimensional image f(x,y) is

F(f_x, f_y) = \int \int f(x, y) \, e^{-i 2\pi (f_x x + f_y y)} \, dx \, dy

Equation 1. Fourier transform
where fx and fy are the component frequencies along x and y. However, a computer cannot evaluate equation 1 directly; to implement it numerically, equation 1 is discretized into the form

F(m, n) = \sum_{k=0}^{M-1} \sum_{l=0}^{N-1} f(k x_0, l y_0) \, \exp\left[-i 2\pi \left(\frac{mk}{M} + \frac{nl}{N}\right)\right]

Equation 2. Discrete Fourier Transform
where x0 and y0 are the spatial intervals between consecutive signal samples along x and y [3]. In most programming software, the implementation is a fast algorithm called the Fast Fourier Transform (FFT). In Scilab, the function is fft2().
The output of the FFT is a complex-number array whose quadrants are diagonally interchanged, as shown below.
Figure 1. Quadrant conversion from image to FFT output.
To see what fft2 in Scilab can do, we apply it to a 128x128 image of a white circle and to a 128x128 image of the letter "A".
Figure 2. (A) binary image of a circle. (B) circle FFT. (C) circle shifted FFT.
(D) circle inverse FFT. (E) binary image of "A". (F) "A" FFT.
(G) "A" shifted FFT. (H) "A" inverse FFT.
Figures 2A and 2E are the images of a circle and of the letter "A", created using Scilab and Paint. We then use Scilab's fft2() to get the Fourier transform of these images, shown in 2B and 2F. Since the output of an FFT is a complex array, we display its modulus instead, taken with abs().
Recalling the quadrant interchange described in figure 1, the results in 2B and 2F must be shifted back to the original quadrant orientation using fftshift(). The results are shown in figures 2C and 2G; note that 2C actually verifies the analytical Fourier transform of a circle, an Airy-like pattern. To check whether the obtained FFT results are correct, an inverse transform is done by applying fft2() once more to the transforms of the circle and of "A". The corresponding results are shown in 2D and 2H, which are (most obviously for the letter "A") inverted.
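The whole round trip above can be sketched compactly. The post works in Scilab; the block below is a hedged NumPy equivalent, where np.fft.fft2 and np.fft.fftshift stand in for the Scilab calls and a synthetic circle stands in for the Paint-made images.

```python
import numpy as np

N = 128
y, x = np.mgrid[-1:1:N * 1j, -1:1:N * 1j]
circle = ((x**2 + y**2) <= 0.5**2).astype(float)   # white circle, as in Fig. 2A

F = np.fft.fft2(circle)               # complex array, quadrants interchanged
modulus = np.abs(F)                   # take the modulus for display, as with abs()
centered = np.fft.fftshift(modulus)   # shift quadrants back, as with fftshift()

# Applying fft2 twice recovers the image flipped in both x and y --
# the "inverted" result seen in Figs. 2D and 2H
recovered = np.abs(np.fft.fft2(F)) / (N * N)
```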
Now that we have familiarized ourselves with the FFT, we move one step higher by introducing my friend Mr. Convolution. Convolution combines two functions to produce a modified version of them, one function "smeared" by the other. Mathematically speaking, it is expressed as
h(x, y) = \int \int f(x', y') \, g(x - x', y - y') \, dx' \, dy'

Equation 3. Convolution integral
We now make use of this beautiful convolution integral by simulating an imaging device. Yes, an imaging device! Suppose you have a camera or something else with a lens in it; if you try covering parts of the lens, what effect do you observe in the output image? Hmmm... Verify your intuition by reading further.
So we create a 128x128 image of the letters "VIP" filling up approximately 50% of the space (figure 3) and a white circle, the same as in figure 2A. The letters will serve as the image to be captured and the circle as our aperture.
Figure 3. Created "VIP" image.
We can implement equation 3 by taking the fft2() of figure 3 and the fftshift() of figure 2A (the aperture is drawn centered, so in the frequency plane it only needs to be shifted, not transformed). To get the convolved image, we take the inverse transform of the element-wise product of the two. If we do this for circles of different radii, the results are as follows.

Figure 4. (A-E) Circular apertures of radii 0.1, 0.3, 0.5, 0.7 and 0.9.
(F-J) Convolved images corresponding to apertures in A to E.
We may observe that the resulting images are inverted compared to the original image in figure 3. And as the aperture's radius decreases, the image's resolution decreases as well, becoming "blurrier": a smaller aperture passes fewer frequency components. This holds in real life, as less light enters the device, producing a lower-quality image.
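The aperture experiment can be sketched as follows. This is a hedged NumPy translation of the Scilab procedure, with a plain white rectangle standing in for the "VIP" text image; the object, names, and radii here are my stand-ins.

```python
import numpy as np

N = 128
y, x = np.mgrid[-1:1:N * 1j, -1:1:N * 1j]

obj = np.zeros((N, N))
obj[40:90, 30:100] = 1.0   # stand-in for the "VIP" text image

def image_through_aperture(obj, radius):
    # The aperture is drawn centered, i.e. already in the frequency plane's
    # shifted orientation, so it only needs fftshift(), not fft2()
    aperture = ((x**2 + y**2) <= radius**2).astype(float)
    product = np.fft.fft2(obj) * np.fft.fftshift(aperture)
    # fft2 applied again acts as the inverse, up to an inversion of the image
    return np.abs(np.fft.fft2(product)) / (N * N)

sharp = image_through_aperture(obj, 0.9)   # wide aperture: most detail kept
blurry = image_through_aperture(obj, 0.1)  # narrow aperture: low-pass, blurred
```

The narrow aperture discards the object's high-frequency content, which is exactly the loss of resolution seen in figure 4.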
After saying hi to Mr. Convolution, I want you to meet Mr. Correlation. True to the word's English definition, correlation is a measure of similarity between two objects, or in our case, two functions. If you're interested in its mathematical expression, it is
p(x, y) = \int \int f^*(x', y') \, g(x + x', y + y') \, dx' \, dy'

Equation 4. Correlation function p
At first glance it looks pretty much the same as the convolution integral, but it's not. They coincide only when f and g are even functions. Cool!
But where do we use Mr. Correlation? --> We can use it for pattern recognition or template matching. I'll tell you how.
We first create a 128x128 image of the text "THE RAIN IN SPAIN STAYS MAINLY IN THE PLAIN." We will try to recreate this text using a 128x128 image of the letter "A" with the same font and font size as the text. We then take the inverse transform of the product of the FFT of "A" and the conjugate of the FFT of the text. In Scilab, the easy way to take a conjugate is conj(). If you got a bit confused, a snippet of the code is output = fft2(Fa.*(conj(Ftext))), where Fa is the FFT of "A" and Ftext is the FFT of the text. The results are

Figure 5. (A) Text. (B) letter "A". (C) output. (D) shifted FFT of C.
It can be seen in figure 5D that the text is mimicked to some extent. Locations of the letter "A" produce the highest correlation values, appearing as the brightest points, which indeed shows that correlation can be used for finding words in a document.
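A minimal sketch of the matching step, in hedged NumPy form: a small random patch plays the role of the letter "A", pasted into an otherwise blank "text" image at a known spot. The post applies fft2() a second time; here ifft2() is used instead so the correlation peak lands at the template's true location rather than a mirrored one. All names and the patch itself are my assumptions.

```python
import numpy as np

N = 128
rng = np.random.default_rng(0)

# The "letter A" template: an 8x8 random patch placed at the array origin
patch = rng.random((8, 8))
template = np.zeros((N, N))
template[0:8, 0:8] = patch

# The "text": the same patch pasted at a known location, (40, 70)
text = np.zeros((N, N))
text[40:48, 70:78] = patch

# Correlation via the FFT, in the spirit of  output = fft2(Fa .* conj(Ftext)),
# but using the inverse transform with the conjugate on the template side
Fa = np.fft.fft2(template)
Ftext = np.fft.fft2(text)
corr = np.abs(np.fft.ifft2(Ftext * np.conj(Fa)))

peak = np.unravel_index(np.argmax(corr), corr.shape)   # brightest point = match
```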
Another template-matching application is edge detection, where an edge pattern is matched against an image. We can implement this by making a 3x3 matrix of an edge pattern, where a value of -1 can be treated as an edge. (NOTE: the matrix elements must sum to zero!) In Scilab we can create different edge patterns such as horizontal, vertical, diagonal and spot using the matrices
pattern_hor = [-1 -1 -1; 2 2 2; -1 -1 -1];
pattern_ver = [-1 2 -1; -1 2 -1; -1 2 -1];
pattern_diag = [-1 -1 2; -1 2 -1; 2 -1 -1];
pattern_spot = [-1 -1 -1; -1 8 -1; -1 -1 -1];
These pattern matrices are convolved with the VIP image as in figure 3 using the Scilab function imcorrcoef(grayscale image, pattern). Results are shown below.
Figure 6. (A) horizontal pattern. (B) vertical pattern. (C) diagonal pattern.
(D) spot pattern. (E-H) Convolved image of VIP with A-D.
We can see from the results in figure 6 that the convolved images follow the edge patterns fed into them. The horizontal pattern makes the horizontal edges of "VIP" more pronounced, while the vertical pattern highlights the vertical edges. The spot pattern in turn gives the clearest edge-detection result, since it effectively combines the horizontal and vertical patterns.
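The patterns can be tried outside Scilab too. Below is a hedged Python sketch using scipy's convolve2d in place of imcorrcoef(), with a plain white square standing in for the "VIP" image. Since every kernel sums to zero, flat regions vanish and only edges matching each kernel's orientation respond.

```python
import numpy as np
from scipy.signal import convolve2d

# A white square: it has clean horizontal and vertical edges to detect
img = np.zeros((64, 64))
img[20:44, 20:44] = 1.0

pattern_hor = np.array([[-1, -1, -1], [2, 2, 2], [-1, -1, -1]])
pattern_ver = np.array([[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]])
pattern_spot = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]])

hor = convolve2d(img, pattern_hor, mode='same')    # responds on horizontal edges
ver = convolve2d(img, pattern_ver, mode='same')    # responds on vertical edges
spot = convolve2d(img, pattern_spot, mode='same')  # responds on edges in any direction
```

Note that these three kernels are symmetric under 180-degree rotation, so convolution and correlation give the same result here.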
Woohoo! This was a fun activity; I learned a lot... I already have thoughts of applying the methods in my future work.
I would probably give myself a grade of 10.0 for producing all the outputs needed in this activity and producing good figures to back my results.
References:
[1] 'Fourier Transform Model of Image Formation', 2010 Applied Physics 186 manual by Dr. Maricor Soriano
[2] http://en.wikipedia.org/wiki/Fourier_transform#Applications
[3] http://fourier.eng.hmc.edu/e101/lectures/Image_Processing/node6.html