Just some random thoughts... One of the very first things we learned when we started going to school was how to write. Writing is the representation of a language through a set of symbols (in our case, the alphabet). Before computers became popular, most people wrote texts, letters, etc. by hand, and a person's handwriting is unique to that individual.
With this in mind, how do people understand other people's handwriting, especially if it's too "ugly"?
--> I guess it's our innate ability to read words not letter by letter but from their first and last letters, deciphering the exact word instantly.
In relation to the handwriting I was talking about above, I'll show in this blog post how to extract handwritten text from a scanned document with lines.
RECREATE, UNDERSTAND and FULFILL your imagined world through Image and Video Processing tools
Showing posts with label AP186.
Thursday, August 25, 2011
Thursday, August 18, 2011
A11 - Playing notes by image processing
Listening to music is an integral part of my everyday life. It relaxes me and gives me my private moment.
Artists and musicians play the music we hear using a score sheet, a sheet of musical notes that serves as a guide for performing a piece of music. So how do they play the music written on a score sheet? A common answer would be via an instrument or voice.
But did you know that we can actually play music using image processing? Cool!
In this blog post, we will extract musical notes from a digitized score sheet and play them in Scilab with the proper frequency and duration.
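As a quick teaser of the "play with the proper frequency and duration" step: the post does this in Scilab, but the idea translates directly. Below is a hedged NumPy sketch where a note is just a sampled sine wave; the function name, the 8 kHz sample rate, and the two example pitches are my assumptions, not the activity's actual code.

```python
import numpy as np

# A note is a sampled sine wave: frequency sets the pitch,
# duration sets how long it plays. fs is the sampling rate in Hz.
def note(freq_hz, duration_s, fs=8000):
    t = np.arange(int(fs * duration_s)) / fs
    return np.sin(2 * np.pi * freq_hz * t)

# Two quarter notes: A4 (440 Hz) followed by C5 (~523.25 Hz)
melody = np.concatenate([note(440.0, 0.25), note(523.25, 0.25)])
```

In Scilab the same sample array could be handed to the sound output; here it is simply an array of samples.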
Wednesday, August 3, 2011
A10 - Binary Operations
In connection with my previous blog post on Morphological Operations, I will discuss the techniques involved in Binary Operations. As you can infer from the name, these are operations that involve binarizing an image.
Why and where is this technique useful, then? Why binarize the image in the first place?
--> The answer is simple: binarizing an image makes it easier to separate the region of interest (ROI) from the background. Once the separation is successful, we can perform many further processes to understand the ROI. For example, in medical imaging, cancerous cells are often larger than normal cells, so after binarization we can detect them and separate them from the background and from normal cells by applying binary operations. Reading through this blog post will give you an idea of how this technique works.
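The separation step described above can be sketched with a simple threshold. This is a hedged NumPy illustration with purely synthetic data; the image, the 0.5 threshold, and all names are my assumptions, not the activity's actual procedure.

```python
import numpy as np

# Synthetic grayscale image: a dim background plus one bright "cell" (the ROI)
rng = np.random.default_rng(1)
gray = rng.random((32, 32)) * 0.4        # background values in [0, 0.4)
gray[10:20, 10:20] = 0.8                 # a 10x10 bright region of interest

# Binarize: pixels above the threshold become 1 (ROI), the rest 0 (background)
binary = (gray > 0.5).astype(np.uint8)

# With the ROI isolated, measurements become trivial, e.g. its area in pixels
roi_area = int(binary.sum())
```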
Monday, July 11, 2011
A6 - Fourier Transform Model of Image Formation
If you ask a physicist about the terms and concepts commonly used in his/her field, the Fourier Transform has a good chance of coming up. You know, the Fourier Transform may sound creepy to people working far from physics, but it's just a mathematical tool that works wonders =)
The Fourier transform decomposes a signal into its component frequencies. It has a wide range of applications, from optics to signal and image processing, and to real-life devices from spectroscopy to magnetic resonance imaging [2].
In strict mathematical terms, the Fourier Transform (FT) of a two-dimensional image f(x,y) is

F(f_x, f_y) = \int \int f(x, y) \, e^{-i 2\pi (f_x x + f_y y)} \, dx \, dy

Equation 1. Fourier transform
where fx and fy are the component frequencies along x and y. However, a computer cannot evaluate equation 1 directly; to implement it numerically, equation 1 is discretized into the form

F(m, n) = \sum_{k=0}^{M-1} \sum_{l=0}^{N-1} f(k x_0, l y_0) \, \exp\left[-i 2\pi \left(\frac{mk}{M} + \frac{nl}{N}\right)\right]

Equation 2. Discrete Fourier Transform
where x0 and y0 are the spatial intervals between consecutive signal samples along x and y [3]. In most programming software, the implementation is a fast algorithm called the Fast Fourier Transform (FFT). In Scilab, the function is fft2().
The output of the FFT is a complex-number array whose quadrants are diagonally interchanged, as shown below.
Figure 1. Quadrant conversion from image to FFT output.
To see what fft2 in Scilab can do, we apply it to a 128x128 image of a white circle and to a 128x128 image of the letter "A".
Figure 2. (A) binary image of a circle. (B) circle FFT. (C) circle shifted FFT.
(D) circle inverse FFT. (E) binary image of "A". (F) "A" FFT.
(G) "A" shifted FFT. (H) "A" inverse FFT.
Figures 2A and 2E are the images of a circle and of the letter "A", created using Scilab and Paint. We then use Scilab's fft2() to get the Fourier transform of these images, shown in 2B and 2F. Since the output of an FFT is a complex array, we display its modulus instead, taken with abs().
Recalling the quadrant interchange described in figure 1, the results in 2B and 2F must be shifted back to the original quadrant orientation using fftshift(). The results are shown in figures 2C and 2G; note that 2C actually verifies the analytical Fourier transform of a circle, an Airy-like pattern. To check whether the obtained FFT results are correct, an inverse transform is done by applying fft2() once more to the transforms of the circle and of "A". The corresponding results are shown in 2D and 2H, which are (most obviously for the letter "A") inverted.
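The whole round trip above can be sketched compactly. The post works in Scilab; the block below is a hedged NumPy equivalent, where np.fft.fft2 and np.fft.fftshift stand in for the Scilab calls and a synthetic circle stands in for the Paint-made images.

```python
import numpy as np

N = 128
y, x = np.mgrid[-1:1:N * 1j, -1:1:N * 1j]
circle = ((x**2 + y**2) <= 0.5**2).astype(float)   # white circle, as in Fig. 2A

F = np.fft.fft2(circle)               # complex array, quadrants interchanged
modulus = np.abs(F)                   # take the modulus for display, as with abs()
centered = np.fft.fftshift(modulus)   # shift quadrants back, as with fftshift()

# Applying fft2 twice recovers the image flipped in both x and y --
# the "inverted" result seen in Figs. 2D and 2H
recovered = np.abs(np.fft.fft2(F)) / (N * N)
```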
Now that we have familiarized ourselves with the FFT, we move one step higher by introducing my friend Mr. Convolution. Convolution combines two functions to produce a modified version of them, one function "smeared" by the other. Mathematically speaking, it is expressed as
h(x, y) = \int \int f(x', y') \, g(x - x', y - y') \, dx' \, dy'

Equation 3. Convolution integral
We now make use of this beautiful convolution integral by simulating an imaging device. Yes, an imaging device! Suppose you have a camera or something else with a lens in it; if you try covering parts of the lens, what effect do you observe in the output image? Hmmm... Verify your intuition by reading further.
So we create a 128x128 image of the letters "VIP" filling up approximately 50% of the space (figure 3) and a white circle, the same as in figure 2A. The letters will serve as the image to be captured and the circle as our aperture.
Figure 3. Created "VIP" image.
We can implement equation 3 by taking the fft2() of figure 3 and the fftshift() of figure 2A (the aperture is drawn centered, so in the frequency plane it only needs to be shifted, not transformed). To get the convolved image, we take the inverse transform of the element-wise product of the two. If we do this for circles of different radii, the results are as follows.

Figure 4. (A-E) Circular apertures of radii 0.1, 0.3, 0.5, 0.7 and 0.9.
(F-J) Convolved images corresponding to apertures in A to E.
We may observe that the resulting images are inverted compared to the original image in figure 3. And as the aperture's radius decreases, the image's resolution decreases as well, becoming "blurrier": a smaller aperture passes fewer frequency components. This holds in real life, as less light enters the device, producing a lower-quality image.
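The aperture experiment can be sketched as follows. This is a hedged NumPy translation of the Scilab procedure, with a plain white rectangle standing in for the "VIP" text image; the object, names, and radii here are my stand-ins.

```python
import numpy as np

N = 128
y, x = np.mgrid[-1:1:N * 1j, -1:1:N * 1j]

obj = np.zeros((N, N))
obj[40:90, 30:100] = 1.0   # stand-in for the "VIP" text image

def image_through_aperture(obj, radius):
    # The aperture is drawn centered, i.e. already in the frequency plane's
    # shifted orientation, so it only needs fftshift(), not fft2()
    aperture = ((x**2 + y**2) <= radius**2).astype(float)
    product = np.fft.fft2(obj) * np.fft.fftshift(aperture)
    # fft2 applied again acts as the inverse, up to an inversion of the image
    return np.abs(np.fft.fft2(product)) / (N * N)

sharp = image_through_aperture(obj, 0.9)   # wide aperture: most detail kept
blurry = image_through_aperture(obj, 0.1)  # narrow aperture: low-pass, blurred
```

The narrow aperture discards the object's high-frequency content, which is exactly the loss of resolution seen in figure 4.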
After saying hi to Mr. Convolution, I want you to meet Mr. Correlation. True to the word's English definition, correlation is a measure of similarity between two objects, or in our case, two functions. If you're interested in its mathematical expression, it is
p(x, y) = \int \int f^*(x', y') \, g(x + x', y + y') \, dx' \, dy'

Equation 4. Correlation function p
At first glance it looks pretty much the same as the convolution integral, but it's not. They coincide only when f and g are even functions. Cool!
But where do we use Mr. Correlation? --> We can use it for pattern recognition or template matching. I'll tell you how.
We first create a 128x128 image of the text "THE RAIN IN SPAIN STAYS MAINLY IN THE PLAIN." We will try to recreate this text using a 128x128 image of the letter "A" with the same font and font size as the text. We then take the inverse transform of the product of the FFT of "A" and the conjugate of the FFT of the text. In Scilab, the easy way to take a conjugate is conj(). If you got a bit confused, a snippet of the code is output = fft2(Fa.*(conj(Ftext))), where Fa is the FFT of "A" and Ftext is the FFT of the text. The results are

Figure 5. (A) Text. (B) letter "A". (C) output. (D) shifted FFT of C.
It can be seen in figure 5D that the text is mimicked to some extent. Locations of the letter "A" produce the highest correlation values, appearing as the brightest points, which indeed shows that correlation can be used for finding words in a document.
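A minimal sketch of the matching step, in hedged NumPy form: a small random patch plays the role of the letter "A", pasted into an otherwise blank "text" image at a known spot. The post applies fft2() a second time; here ifft2() is used instead so the correlation peak lands at the template's true location rather than a mirrored one. All names and the patch itself are my assumptions.

```python
import numpy as np

N = 128
rng = np.random.default_rng(0)

# The "letter A" template: an 8x8 random patch placed at the array origin
patch = rng.random((8, 8))
template = np.zeros((N, N))
template[0:8, 0:8] = patch

# The "text": the same patch pasted at a known location, (40, 70)
text = np.zeros((N, N))
text[40:48, 70:78] = patch

# Correlation via the FFT, in the spirit of  output = fft2(Fa .* conj(Ftext)),
# but using the inverse transform with the conjugate on the template side
Fa = np.fft.fft2(template)
Ftext = np.fft.fft2(text)
corr = np.abs(np.fft.ifft2(Ftext * np.conj(Fa)))

peak = np.unravel_index(np.argmax(corr), corr.shape)   # brightest point = match
```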
Another template-matching application is edge detection, where an edge pattern is matched against an image. We can implement this by making a 3x3 matrix of an edge pattern, where a value of -1 can be treated as an edge. (NOTE: the matrix elements must sum to zero!) In Scilab we can create different edge patterns such as horizontal, vertical, diagonal and spot using the matrices
pattern_hor = [-1 -1 -1; 2 2 2; -1 -1 -1];
pattern_ver = [-1 2 -1; -1 2 -1; -1 2 -1];
pattern_diag = [-1 -1 2; -1 2 -1; 2 -1 -1];
pattern_spot = [-1 -1 -1; -1 8 -1; -1 -1 -1];
These pattern matrices are convolved with the VIP image as in figure 3 using the Scilab function imcorrcoef(grayscale image, pattern). Results are shown below.
Figure 6. (A) horizontal pattern. (B) vertical pattern. (C) diagonal pattern.
(D) spot pattern. (E-H) Convolved image of VIP with A-D.
We can see from the results in figure 6 that the convolved images follow the edge patterns fed into them. The horizontal pattern makes the horizontal edges of "VIP" more pronounced, while the vertical pattern highlights the vertical edges. The spot pattern in turn gives the clearest edge-detection result, since it effectively combines the horizontal and vertical patterns.
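The patterns can be tried outside Scilab too. Below is a hedged Python sketch using scipy's convolve2d in place of imcorrcoef(), with a plain white square standing in for the "VIP" image. Since every kernel sums to zero, flat regions vanish and only edges matching each kernel's orientation respond.

```python
import numpy as np
from scipy.signal import convolve2d

# A white square: it has clean horizontal and vertical edges to detect
img = np.zeros((64, 64))
img[20:44, 20:44] = 1.0

pattern_hor = np.array([[-1, -1, -1], [2, 2, 2], [-1, -1, -1]])
pattern_ver = np.array([[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]])
pattern_spot = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]])

hor = convolve2d(img, pattern_hor, mode='same')    # responds on horizontal edges
ver = convolve2d(img, pattern_ver, mode='same')    # responds on vertical edges
spot = convolve2d(img, pattern_spot, mode='same')  # responds on edges in any direction
```

Note that these three kernels are symmetric under 180-degree rotation, so convolution and correlation give the same result here.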
Woohoo! This was a fun activity; I learned a lot... I already have thoughts of applying the methods in my future work.
I would probably give myself a grade of 10.0 for producing all the outputs needed in this activity and producing good figures to back my results.
References:
[1] 'Fourier Transform Model of Image Formation', 2010 Applied Physics 186 manual by Dr. Maricor Soriano
[2] http://en.wikipedia.org/wiki/Fourier_transform#Applications
[3] http://fourier.eng.hmc.edu/e101/lectures/Image_Processing/node6.html