Wednesday, August 3, 2011

A10 - Binary Operations

Following up on my previous blog post on Morphological Operations, I will discuss the techniques involved in Binary Operations. As you can infer from the name, these are operations performed on a binarized image.

Why and where is this technique useful then? Why binarize the image in the first place?

--> The answer is simple: binarizing an image makes it easier to separate the region of interest (ROI) from the background. Once the separation is successful, we can run further processing to characterize the ROI. For example, in medical imaging, cancerous cells are often larger than normal cells, so we can detect them and separate them from the background and from normal cells by applying binary operations. Reading through this blog post will give you an idea of the procedure.




We first use a digitized image of scattered circular pieces of paper, shown in figure 1. These circles can be thought of as the "normal cells" imaged under a microscope. The main task with this image is to obtain the best estimate of the area (in pixel count) of the circles ("normal cells").

 
Figure 1. Scanned image of punched papers acting as 
"normal cells" in real life.

Figure 1 is a 658 x 823 image. To make the task faster and more accurate, we divide it into subimages of size 256 x 256. We can then repeat the area measurement on each subimage and obtain a statistically sound estimate. The cut version of figure 1 is shown in figure 2.

Figure 2. Cut version of figure 1. The size of each subimage is 256x256
with possible overlaps. The number at the upper left corner serves as the
subimage number for easy calling.
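The post does not say where each cut was made, so the following is only a sketch of one way to get overlapping 256 x 256 windows that cover the whole image. It is written in plain Python rather than Scilab, the helper name `tile_origins` is hypothetical, and evenly spaced window origins are an assumption. It also assumes 658 is the number of rows and 823 the number of columns; swapping them gives the same count.

```python
def tile_origins(length, tile):
    """Start offsets of windows of width `tile` that together cover [0, length)."""
    if length <= tile:
        return [0]
    count = -(-length // tile)            # ceiling division: windows needed
    step = (length - tile) / (count - 1)  # spacing so the last window ends at `length`
    return [round(i * step) for i in range(count)]

# Assumption: 658 rows by 823 columns, 256 x 256 windows.
rows = tile_origins(658, 256)
cols = tile_origins(823, 256)

# Each window is (top, left, bottom, right), with bottom/right exclusive.
windows = [(r, c, r + 256, c + 256) for r in rows for c in cols]
print(rows, cols, len(windows))
```

With these numbers the sketch yields 3 x 4 = 12 windows, which matches the 12 subimages used in the post.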

All 12 subimages are then converted to grayscale, and each one is binarized using a threshold chosen by examining its histogram. Binarization in Scilab can be done using im2bw(Img, threshold), where Img is the grayscale image. Results are shown in figure 3.

 
Figure 3. Binarized version of figure 2.
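To make the thresholding step concrete, here is a minimal sketch in plain Python of what a call like im2bw(Img, threshold) does conceptually. The function names and the toy pixel values are hypothetical, and the strict `>` comparison is an assumption about how the toolbox treats pixels exactly at the threshold.

```python
def histogram(gray, bins=10):
    """Count pixels per intensity bin for intensities in [0, 1]."""
    counts = [0] * bins
    for row in gray:
        for px in row:
            counts[min(int(px * bins), bins - 1)] += 1
    return counts

def binarize(gray, threshold):
    """Return 1 where the pixel is brighter than `threshold`, else 0."""
    return [[1 if px > threshold else 0 for px in row] for row in gray]

# Toy 3 x 3 grayscale image (made-up values): dark background,
# bright paper, and one mid-gray speck of dirt at 0.65.
gray = [
    [0.12, 0.08, 0.91],
    [0.15, 0.95, 0.92],
    [0.11, 0.65, 0.13],
]

print(histogram(gray))   # two clusters, with the speck in between
bw = binarize(gray, 0.8)
print(bw)                # [[0, 0, 1], [0, 1, 1], [0, 0, 0]]
```

A threshold placed in the gap between the two histogram clusters keeps the paper and drops most of the dirt; lowering it to 0.5 here would let the speck through.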

Notice that I intentionally used threshold values that do not perfectly remove the noise and dirt surrounding the ROI. The thresholds chosen are just enough to remove the majority of the dirt and, most importantly, do not distort or damage the integrity of the "cells". So how can we measure area accurately with automated tools if leftover dirt can interfere with the measurement? The answer is to use morphological operators such as closing and opening.

Closing tends to shrink background-colored holes and enlarge the boundaries of foreground regions, while opening does the opposite [2]. Implementation-wise, closing is a dilation followed by an erosion, and opening is an erosion followed by a dilation, both using the same structuring element (strel). The SIP toolbox in Scilab 4.1.2 has no closing and opening operators, so we can simply create them as shown below.

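//CLOSING: dilation followed by erosion with the same strel
function im2 = closing(im, strel)
  im1 = dilate(im, strel);   // grow foreground, filling small holes
  im2 = erode(im1, strel);   // shrink back to the original size
endfunction

//OPENING: erosion followed by dilation with the same strel
function im2 = opening(im, strel)
  im1 = erode(im, strel);    // remove anything smaller than the strel
  im2 = dilate(im1, strel);  // restore the size of what survived
endfunction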


Therefore, we can clean the unwanted dirt in figure 3 using the opening operator with a circular structuring element that is smaller than the "cells" but larger than the dirt.

Figure 4. Cleaned version of figure 3. 

And voilà! The unwanted parts are now gone! This shows that the choice of morphological operation and structuring element is correct. Technically, erosion, the first step of the opening operator, shrinks every part of the image by the chosen strel. Since the strel is larger than the floating pixels or dirt, these are removed entirely. However, the "cells" also shrink due to erosion. This is remedied by the second step of the opening operator, which dilates the resulting image using the same strel. By doing so, the size lost to erosion is restored.
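The erode-then-dilate argument can be checked on a toy image. The sketch below is plain Python, not the SIP implementation: the strel is represented as a list of (row, column) offsets, pixels outside the image are treated as background, and a 3 x 3 square stands in for the circular strel. All of these are assumptions made for illustration.

```python
def _is_on(img, r, c):
    """True if (r, c) is inside the image and is a foreground pixel."""
    return 0 <= r < len(img) and 0 <= c < len(img[0]) and img[r][c] == 1

def erode(img, offsets):
    """Keep a pixel only if the strel, centered there, fits inside the foreground."""
    return [[1 if all(_is_on(img, r + dr, c + dc) for dr, dc in offsets) else 0
             for c in range(len(img[0]))] for r in range(len(img))]

def dilate(img, offsets):
    """Turn on every pixel covered by the strel centered on a foreground pixel."""
    return [[1 if any(_is_on(img, r - dr, c - dc) for dr, dc in offsets) else 0
             for c in range(len(img[0]))] for r in range(len(img))]

def opening(img, offsets):
    """Erosion followed by dilation with the same strel."""
    return dilate(erode(img, offsets), offsets)

# 3 x 3 square strel as offsets from its center.
square = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]

# A 3 x 3 "cell" plus a single-pixel speck of dirt at row 2, column 5.
img = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

eroded = erode(img, square)     # only the cell's center pixel survives
cleaned = opening(img, square)  # cell restored to 9 pixels, speck gone
print(sum(map(sum, eroded)), sum(map(sum, cleaned)))  # 1 9
```

The speck never survives erosion, so dilation has nothing to grow back there, while the cell's surviving center pixel is dilated back to its original 3 x 3 footprint.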

After successfully cleaning the subimages, we can now estimate the area of each "cell". The easiest way is to use a function that automatically identifies the individual cell structures in each subimage. Luckily, Scilab has a function that does exactly this: bwlabel. Bwlabel is a connected-component labeling tool that numbers all the objects in a binary image. The number of pixels carrying a given label can then be taken as the area of that cell.
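As an illustration of what connected-component labeling followed by pixel counting amounts to, here is a small flood-fill sketch in plain Python. It is not how bwlabel is implemented; the function name `blob_areas` is hypothetical, and 8-connectivity is an assumption (a 4-connected labeling would split diagonally touching pixels).

```python
from collections import deque

def blob_areas(bw):
    """Return the pixel count of each 8-connected foreground blob, in scan order."""
    h, w = len(bw), len(bw[0])
    seen = [[False] * w for _ in range(h)]
    areas = []
    for r0 in range(h):
        for c0 in range(w):
            if bw[r0][c0] != 1 or seen[r0][c0]:
                continue
            # New blob found: flood-fill it and count its pixels.
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            area = 0
            while queue:
                r, c = queue.popleft()
                area += 1
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < h and 0 <= cc < w
                                and bw[rr][cc] == 1 and not seen[rr][cc]):
                            seen[rr][cc] = True
                            queue.append((rr, cc))
            areas.append(area)
    return areas

# Toy binary image with three blobs of 4, 3 and 1 pixels.
bw = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
areas = blob_areas(bw)
print(areas)  # [4, 3, 1]
```

Two overlapping cells form a single connected blob, so they are counted as one object with roughly twice the area, which is why very large areas show up in the distribution.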

The resulting distribution of the calculated areas of the structures from subimage 1 to subimage 12 is shown in figure 5.

 
Figure 5. Unnormalized histogram of the areas for all subimages with bin size 25.
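For readers who want to reproduce this kind of plot, the binning itself is simple. The sketch below is plain Python with a hypothetical helper `area_histogram`; the list of areas is made up for illustration and is NOT the measured data behind the figures.

```python
from collections import Counter

def area_histogram(areas, bin_size, low=None, high=None):
    """Count areas per bin; keys are the left edges of the non-empty bins.

    `low` (inclusive) and `high` (exclusive) restrict the range, which is
    what zooming into a region of the histogram amounts to.
    """
    kept = [a for a in areas
            if (low is None or a >= low) and (high is None or a < high)]
    counts = Counter((a // bin_size) * bin_size for a in kept)
    return dict(sorted(counts.items()))

# Hypothetical blob areas: a speck, single cells, and two overlapping clumps.
areas = [12, 455, 520, 530, 540, 550, 560, 1090, 1630]

coarse = area_histogram(areas, 25)
zoomed = area_histogram(areas, 10, low=400, high=600)
print(coarse)
print(zoomed)
```

Narrowing the range and shrinking the bin size is a way of discarding the clumps and specks before estimating the typical single-cell area.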


Note the presence of very large areas in figure 5. These come from the large structures in figure 4, which are just overlapping cells. We do not want the areas of these structures, so we zoom in on the region of figure 5 where the count is more than 2 (from 0 to 1000).

 
Figure 6. Zoomed version of figure 5 using a bin size of 10.

By the same argument, we focus on the region where the count is more than 2 (from 400 to 600).

  
Figure 7. Zoomed version of figure 6 using a bin size of 1. 

The calculated areas should be concentrated in a region where the occupied bins lie as close as possible to one another. Thus, in my opinion, the areas to be considered are those from 460 to 600 (excluding the first three occupied bins). The resulting best estimate of the "cell" area is 538 ± 20 pixels^2.
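An estimate of the form "mean ± spread" over a selected range can be computed as sketched below in plain Python. The post does not state whether the uncertainty is a sample or a population standard deviation, so the use of the sample standard deviation here is an assumption, and the input areas are made up rather than taken from the figures.

```python
from statistics import mean, stdev

def estimate_area(areas, low, high):
    """Mean and sample standard deviation of the areas within [low, high]."""
    kept = [a for a in areas if low <= a <= high]
    return mean(kept), stdev(kept), len(kept)

# Hypothetical blob areas, NOT the measured values from the post.
areas = [12, 455, 520, 530, 540, 550, 560, 1090, 1630]

m, s, n = estimate_area(areas, 460, 600)
print(f"{m} +- {s:.1f} pixels from {n} blobs")
```

The chosen window matters: widening it to include the clumps of overlapping cells would pull the mean up and inflate the spread.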

One may argue that this best estimate is biased by the choice of range. To address this, we can take subimage 10 and subimage 3 and perform the same procedure. These subimages, in particular, contain "cells" that are isolated and do not overlap with any other cells, so the histogram zooming technique demonstrated above is not needed.

 
Figure 8. Unnormalized histogram of the area(s) for subimage: (a) 10 (b) 3.

The resulting area estimate for subimage 10 is 539 ± 26 pixels^2, while for subimage 3 the estimate is 555 pixels^2. Note that the three estimates agree within their uncertainties. This means that the method presented above is successful.

We then move to the original goal of this activity, which is to use binary operations to isolate cancerous cells from the background and from normal cells. We can simulate these abnormal cells by again using punched circular papers, this time adding bigger circular papers to act as the abnormal cells (shown in figure 9).

 
Figure 9. Scanned image of punched papers of two sizes.
(Notice the presence of 5 enlarged "cells")

As before, we convert this to grayscale and binarize it by picking the right threshold value. The resulting binarized image is shown in figure 10.

 
Figure 10. Binarized version of figure 9 using a 0.815 threshold.

To isolate the "enlarged cells", we apply the opening operator using a circular strel with area equal to the best-estimate area of the "normal cells" obtained above (taking the upper bound, 539 + 26 pixels^2). Since a circle of radius r has area pi*r^2, the appropriate radius is sqrt[(539+26)/pi], or about 13.4 pixels. By doing so, the dirt and the "normal cells" are removed. The resulting isolated "cancerous cells" are shown below.

Figure 11. Isolated "cancerous cells".
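The strel radius follows from the area of a circle, A = pi * r^2, so r = sqrt(A / pi). The sketch below, in plain Python, computes that radius for A = 539 + 26 and builds a disk-shaped strel as a list of pixel offsets. The helper name `disk_offsets` and the offset-list representation are hypothetical; the post does not show how its circular strel was constructed.

```python
import math

def disk_offsets(radius):
    """Offsets (dr, dc) of all pixels whose centers lie within `radius` of the origin."""
    reach = int(math.floor(radius))
    return [(dr, dc)
            for dr in range(-reach, reach + 1)
            for dc in range(-reach, reach + 1)
            if dr * dr + dc * dc <= radius * radius]

# Upper bound of the normal-cell area estimate from subimage 10.
area_upper = 539 + 26
radius = math.sqrt(area_upper / math.pi)   # from A = pi * r^2
strel = disk_offsets(radius)

print(round(radius, 2))  # 13.41
print(len(strel))        # pixel count of the disk, close to the target area
```

A disk this size does not fit inside a normal cell, so normal cells vanish under erosion, while the larger cells keep a core that dilation grows back.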


That's it, folks! Honestly, I enjoyed this activity. I believe this is one of the activities where a direct real-life application is possible. I very much like doing things that can have a direct impact on our society.

With that, I want to give myself a grade of 10.0 for satisfactorily completing all the requirements of this activity and for successfully reaching the set objectives.

References:
[1] 'Binary Operations', 2010 Applied Physics 186 manual by Dr. Maricor Soriano
[2] http://homepages.inf.ed.ac.uk/rbf/HIPR2/close.htm

