Tuesday, September 20, 2011

A16 - Probabilistic Classification

Pattern recognition has attracted a great deal of interest in recent years, and many classification tools have been developed to perform this complex task.

Following my previous blog post on pattern recognition using minimum distance classification, I will show you another tool, Linear Discriminant Analysis (LDA), which finds a linear combination of features that separates two or more classes of objects [2].


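For an object with feature vector x (written as a row vector), the fundamental formula governing Linear Discriminant Analysis [3] is

$$f_i = \mu_i C^{-1} x^T - \frac{1}{2} \mu_i C^{-1} \mu_i^T + \ln(P_i)$$

Equation 1. Linear discriminant analysis formula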

where 
  • fi is the Linear Discriminant function of class i
  • mu_i is the mean of features xi in class i
  • Pi is the prior probability of class i such that

$$P_i = \frac{n_i}{n}$$

Equation 2. Prior probability. ni is the number of samples of class i and n is the total number of samples from all classes.

  • C is the pooled covariance matrix with formula

$$C = \frac{1}{n} \sum_{i} n_i c_i$$

Equation 3. Pooled covariance matrix

The components of the pooled covariance are the number of samples (ni per class and n in total, both defined in equation 2) and the covariance matrix of class i, ci, derived using the formula

$$c_i = \frac{(x_i^o)^T x_i^o}{n_i}$$

Equation 4. Covariance matrix of class i

where xi^o is the mean-corrected feature data of class i, obtained by subtracting the mean of all samples per feature from the real feature data.
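To make the training quantities concrete, here is a minimal pure-Python sketch of the priors, class means, class covariances, and pooled covariance. It assumes Equation 4 is the product of the mean-corrected data with its transpose divided by ni, and that Equation 3 is the sample-weighted average of the class covariances. The data values and the helper names (`column_means`, `class_covariance`) are made up for illustration and are not the numbers in my tables.

```python
# Toy two-feature data (hypothetical values, not the post's Table 1).
# Each row is one object: (area, perimeter), assumed already normalized.
classes = {
    1: [(0.20, 0.30), (0.25, 0.35), (0.30, 0.28)],
    2: [(0.80, 0.70), (0.75, 0.82), (0.90, 0.78)],
}


def column_means(rows):
    """Mean of each feature (column) over the given rows."""
    count = len(rows)
    return [sum(r[j] for r in rows) / count for j in range(len(rows[0]))]


def class_covariance(rows, global_mean):
    """Assumed Equation 4: c_i = (x_i^o)^T x_i^o / n_i,
    where x_i^o is the class data minus the mean of ALL samples."""
    n_i = len(rows)
    d = len(global_mean)
    centered = [[r[j] - global_mean[j] for j in range(d)] for r in rows]
    return [[sum(c[r] * c[s] for c in centered) / n_i for s in range(d)]
            for r in range(d)]


all_rows = [row for rows in classes.values() for row in rows]
n = len(all_rows)
d = len(all_rows[0])
global_mean = column_means(all_rows)

# Equation 2: prior probability P_i = n_i / n
priors = {i: len(rows) / n for i, rows in classes.items()}

# mu_i: mean of the features in class i
means = {i: column_means(rows) for i, rows in classes.items()}

# c_i: covariance matrix of each class
covs = {i: class_covariance(rows, global_mean) for i, rows in classes.items()}

# Assumed Equation 3: pooled covariance C = (1/n) * sum_i n_i * c_i
pooled = [[sum(len(classes[i]) * covs[i][r][s] for i in classes) / n
           for s in range(d)] for r in range(d)]

print("priors:", priors)
print("pooled covariance:", pooled)
```

With a numerical library the same steps collapse to a few matrix products; the explicit loops are only there so that each equation is visible.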

In this blog post, I used the same data as in my earlier pattern recognition activity.

 Table 1. Training set features

When plotted as area versus perimeter, these features show a distinct separation between class 1 and class 2.
Figure 1. Features of training set.

The results of normalization, mean correction, and applying the linear discriminant function are shown in table 2.
 Table 2. Results for training set.

An object is assigned to the class that yields the maximum discriminant function value. In table 2, the maximum value for each object is highlighted in yellow. For the training set, classification is perfect: every object is assigned to the class it actually belongs to.
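The assignment rule can be sketched in a few lines of Python. This is only an illustration for two features, assuming the discriminant function has the usual form f_i = mu_i C^-1 x^T - 0.5 mu_i C^-1 mu_i^T + ln(P_i). The means, priors, and pooled covariance below are hypothetical numbers, not the values behind table 2, and `inverse_2x2`, `discriminant`, and `classify` are names I made up for this sketch.

```python
import math

# Hypothetical trained parameters for two classes and two features
# (illustrative numbers only, not the values behind Table 2).
means = {1: [0.25, 0.31], 2: [0.82, 0.77]}
priors = {1: 0.5, 2: 0.5}
pooled = [[0.09, 0.04], [0.04, 0.06]]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("pooled covariance matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def discriminant(x, mu, c_inv, prior):
    """Assumed Equation 1:
    f_i = mu_i C^-1 x^T - 0.5 * mu_i C^-1 mu_i^T + ln(P_i)."""
    # Row vector mu_i C^-1, reused by both the linear and the constant term.
    mu_c_inv = [sum(mu[r] * c_inv[r][s] for r in range(2)) for s in range(2)]
    linear = sum(mu_c_inv[s] * x[s] for s in range(2))
    constant = sum(mu_c_inv[s] * mu[s] for s in range(2))
    return linear - 0.5 * constant + math.log(prior)


def classify(x):
    """Return the class with the largest discriminant value, plus all scores."""
    c_inv = inverse_2x2(pooled)
    scores = {i: discriminant(x, means[i], c_inv, priors[i]) for i in means}
    return max(scores, key=scores.get), scores


label, scores = classify([0.22, 0.30])
print("assigned class:", label)
print("discriminant values:", scores)
```

Classifying a new object only changes the `x` passed to `classify`; the means, priors, and pooled covariance stay fixed once they have been computed from the training set.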

I then applied the discriminant functions obtained from the training set to a test set, changing only the x value in equation 1. The results are shown in table 3.
  Table 3. Results for test set.

Again, the maximum discriminant function value for each object is highlighted in yellow. A comparison of the predicted class of each test object with its actual class is shown in table 4.

  Table 4. Classification of test set.

The prediction results in table 4 show that the process obtained 100% classification accuracy: all test objects were correctly classified.
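The accuracy figure is simply the fraction of test objects whose predicted class matches the actual class. A small sketch, with placeholder labels rather than the actual entries of table 4 and a hypothetical helper named `accuracy`:

```python
# Hypothetical predicted and actual class labels for a small test set
# (placeholders, not the entries of Table 4).
predicted = [1, 1, 2, 2, 1, 2]
actual = [1, 1, 2, 2, 1, 2]


def accuracy(predicted, actual):
    """Fraction of objects whose predicted class matches the actual class."""
    if len(predicted) != len(actual):
        raise ValueError("label lists must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


print(f"accuracy = {accuracy(predicted, actual):.0%}")
```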

For this activity, I'm giving myself a 10.0 for successfully implementing Linear Discriminant Analysis to classify objects from two different classes. More importantly, I got a 100% correct classification.

References:
[1] 'Probabilistic Classification', 2007 Applied Physics 186 manual by Dr. Maricor Soriano
[2] http://en.wikipedia.org/wiki/Linear_discriminant_analysis 
[3] 'Linear Discriminant Analysis', prepared by Dr. S Marcos
