Thursday, September 29, 2011

A18 - Basic Video Processing

In my past blog posts, I have tackled numerous tasks: classification, playing musical notes, measuring areas, cleaning images by Fourier transform, and more.

But what do they all have in common? All of them are based on extracting features from a single image. So the question is: how about videos? Can the same techniques be employed?

In this blog post, I will show you a simple experiment on extracting features from a kinematic event captured using a video camera.


Videos and still images are closely related. A video is just a sequence of images played back at a constant frame rate, measured in frames per second (fps). Most video cameras use a frame rate of 30 fps, meaning the camera captures an image every 1/30 s, or about 0.033 seconds.

Together with my collaborators Kirby and TJ, I took a video of a free-falling white ball. Our main goal is to compute the acceleration due to gravity by tracking the center of mass of the ball. Since a video is just a set of images, we chose only the 4 frames containing the significant data we need, to reduce computational time. We first converted the captured video to .avi format using the program Stoik Video Converter, with the following settings: 160x120 frame size, 30 fps frame rate, black-and-white video compression, and no audio. Using VirtualDub, we then marked the significant frames. The resulting chopped video is shown below.

 
Figure 1. Free-falling ball.
(Note: The images were combined into an animated GIF
using a frame rate of 5 fps for illustration purposes.)
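
As an aside, the frame extraction can also be done programmatically instead of through VirtualDub. Below is a minimal sketch using OpenCV in Python (not what we actually used); the filename ball.avi and the frame indices are hypothetical.

```python
import cv2

# Open the converted .avi and pull out the four frames of interest.
cap = cv2.VideoCapture("ball.avi")
print(cap.get(cv2.CAP_PROP_FPS))  # should report ~30 fps

for i in [10, 11, 12, 13]:  # hypothetical indices of the significant frames
    cap.set(cv2.CAP_PROP_POS_FRAMES, i)
    ok, frame = cap.read()
    if ok:
        cv2.imwrite(f"frame{i}.png", frame)
cap.release()
```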

We can now move on to processing each of the four images. As a start, we first binarize each image, as shown below.

 
Figure 2. Binarized version of figure 1.
(Note: The images were combined into an animated GIF
using a frame rate of 5 fps for illustration purposes.)
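
In code, the binarization could look like the following minimal sketch, assuming OpenCV and a fixed threshold of 200 (a hypothetical value; the white ball just needs to be brighter than the background):

```python
import cv2

# Load one of the extracted frames as grayscale.
frame = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

# Pixels brighter than the threshold become white (the ball), the rest black.
_, binary = cv2.threshold(frame, 200, 255, cv2.THRESH_BINARY)
cv2.imwrite("frame1_bin.png", binary)
```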
Using the same technique as in my previous blog posts, we calculated the center of mass of the ball in each frame. Since the center of mass should theoretically stay at the same x-location throughout the fall, it is simpler to track only its y-location. The obtained values are 21.609, 31.632, 42.261, and 53.803 pixels.
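
As a sketch of the centroid computation (one straightforward way to do it, not necessarily the exact code from my earlier posts), the coordinates of the white pixels can simply be averaged:

```python
import numpy as np

# `binary` is the thresholded frame from the sketch above; image rows
# increase downward, so `ys` is the vertical (fall) coordinate.
ys, xs = np.nonzero(binary > 0)
y_com = ys.mean()  # center of mass along y, in pixels
x_com = xs.mean()  # center of mass along x, in pixels
```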

Note that these values are in pixel units, so a conversion factor is needed to express them in physical units. Using the same images and locating clear features that can serve as markers, we obtained a conversion factor of 36 pixels = 0.491 m. Thus, the y-values become 0.295, 0.431, 0.576, and 0.734 m.
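
The conversion itself is a one-liner; here it is applied to our measured values:

```python
import numpy as np

# Center-of-mass y-positions in pixels, converted with 36 px = 0.491 m.
y_px = np.array([21.609, 31.632, 42.261, 53.803])
y_m = y_px * 0.491 / 36.0  # -> approx. [0.295, 0.431, 0.576, 0.734] m
```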

In addition, as mentioned above, the frame rate used is 30 fps, so each frame comes 0.033 seconds after the previous one. Plotting the y-locations versus time, we obtained the plot below.
Figure 3. Dependence of distance on time.

Comparing the resulting regression fit, y = 4.6624*t^2 + 3.6097*t + 0.1694, with the analytic expression for the position of a free-falling object, y = 0.5*g*t^2 + v0*t + y0 (with y measured downward, since the ball falls toward increasing y in the image), the quadratic coefficient gives g = 2*4.6624 = 9.3248 m/s^2. This deviates by 4.85% from the accepted value of 9.8 m/s^2.
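The fit and the extraction of g are easy to reproduce; here is a sketch using NumPy's polyfit (not necessarily the tool we used for Figure 3), taking the first selected frame to be at t = 1/30 s:

```python
import numpy as np

# Converted y-positions (see above) and the corresponding frame times:
# consecutive frames at 30 fps, first frame taken at t = 1/30 s.
y_px = np.array([21.609, 31.632, 42.261, 53.803])
y_m = y_px * 0.491 / 36.0
t = np.arange(1, 5) / 30.0

# Quadratic fit y = a*t^2 + b*t + c; matching y = 0.5*g*t^2 + v0*t + y0
# gives g = 2*a. Shifting the time origin only changes b and c, not a.
a, b, c = np.polyfit(t, y_m, 2)
print(f"fit: y = {a:.4f}*t^2 + {b:.4f}*t + {c:.4f}")
print(f"g = {2*a:.4f} m/s^2")  # about 9.32 m/s^2, ~4.85% below 9.8
```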


Woohoo!! Last blog post for the course (excluding the project of course)!!!

Anyway, for this activity I'm giving myself a 10.0 for successfully extracting features from a video and obtaining only a small deviation between the accepted value of g and the experimental value from video processing.

I would like to thank TJ Abregana and Kirby Cheng for being good groupmates for this activity. :)

References:
[1] M. Soriano, "Basic Video Processing," Applied Physics 186 manual, 2008.
