Sunday, January 29, 2012

Parameter Selection in Hidden Markov Models

Hidden Markov Models have been used in Computer Vision, Speech Recognition extensively.

In this post, I discuss the criterion for selecting parameters of an Hidden Markov Model (HMM) for optimum training. If you still do not properly understand whats an HMM, please refer this article on HMMs. In order to classify data, one HMM is needed per class.

Assume that we want to have a Discrete HMM represented by λ( A, B, π). O be the set of observations and N to be number of states in a HMM and M be the number of unique symbols.

P(O|λ)  = P = likelihood that an observation matches the given HMM model.


The following are some of the criteria in selecting parameters in Hidden Markov Models (taken from Bell System Journal):
  1. Initial Estimates of A and B: We know that HMM classifier can be trained using Baum-Welch algorithm. For the given set of observations sequences (O1, O2, ......) in Baum-Welch algorithm, λ is re-estimated iteratively such that P is maximized. This depends on the initial parameters of HMM i.e.  λ. Different values of A, B and π gives different likelihoods. Different values of λ give different results. There is no one right way to select initial estimates. Hence, they can be of any random values( however, the matrices A, B and π should sum to 1 along the rows). 

  2. HMM structure and number of states: Structure of an HMM has an effect on performance of the HMM. For example in speech recognition, a HMM which is modeled from left to right is preferred. There also is no need for a state to traverse to each and every state in the model. i.e. Few elements of A can be zero. If N is too large reliable determination of A and B can become hard. If N is large, also Viterbi Decoding becomes hard. There is also no right way to choose N as it need not be related to a "observable phenomenon" in the problem.

  3. Multiple Observation sequences: Multiple observation sequences (from different training examples of same class) need to be used. Mixing training samples from different sources is a good idea.

  4. Constraints on A, B matrices during training: The matrices A, B and π  can take any values (if  they follow the constraints such as along each row of a matrix sum of elements needs to be 1). Its found that a HMM where, a state can traverse to all the states is not so flexible. At the same time, a state transition should also not occur "within" the state or "only" to next state. Such restrictions might constrain the level to which the model can be trained. For example in speech models, the transition is usually from left to right, so the matrix elements are zero below the diagonal. If the matrix B is unconstrained and the data set for training is limited, it is likely that bj(k)=0. There are some solutions for this problem in the journal.

  5. Multiple estimates of A, B and averaging: It is known that different initial values of λ is likely to give different likelihoods for a testing set. So, different values of λ needs to implemented and averaged (average values of matrix elements of AB and π). However, it was observed that results of the averaged model was not better than other best solutions computed. So, picking the model that has low memory constraint and gives good results should be picked.

Friday, January 20, 2012

Install Boost Libraries on Ubuntu

I've heard a lot about Boost Libraries. They are well known in the programming community as one of the most well written, quite handy C++ libraries one can use. You can download them here.

For a quick install of Boost libraries on Ubuntu, use the following command:


To quickly learn about Boost Library, click here for free online accessible book.

Tuesday, January 17, 2012

Kmeans clustering in OpenCV with C++

Kmeans clustering is one of the most widely used UnSupervised Learning Algorithms. If you are not sure what Kmeans is, refer this article. Also if you have heard about the term Vector Quantization, Kmeans is closely related to that (refer this article to know more about it). Autonlab has a great ppt on Kmeans Clustering.

First, I'll talk about the kmeans usage in OpenCV with C++ and then I'll explain it with a program. If you are not yet comfortable in OpenCV with  C++, please refer to this article and the pretty much everything else is the same as in C API (where you use IplImage*,etc).

Btw, My other programs in OpenCV will be posted here.

Function call in C++ API of OpenCV accepts the input in following format:
double kmeans(const Mat& samples, int clusterCount, Mat& labels, TermCriteria termcrit, int attempts, int flags, Mat* centers);

Parameters explained as follows:
  1. samples: It contains the data. Each row represents a Feature Vector. Each co lumn in a row represent a dimension. So, we can have multiple dimensions of data in the feature vector. Example if we have 50, 5 dimensional feature vector, we will have 50 rows, 5 colums of this matrix. One thing interesting which I've noticed is kmeans doesn't work with CV_64F type.
  2. clusterCount: It should be specified beforehand. We need to know how many clusters do we divide the data into. It is an integer.
  3. labels: It is an output Matrix. If we had a Matrix of above specified size (i.e 50 x 5 ), we will have 50 x 1 output Matrix. It determines which cluster the feature vector belongs. It starts with 0, 1, .... (number of clusters-1).
  4. TermCriteria: It determines the criteria in applying the algorithm. Max iterations, accuracy,etc. 
  5. attempts: number of attempts made with different initial labelling. Also refer documentation for elaborate information on this parameter.
  6. flags: It can be
    KMEANS_RANDOM_CENTERS   (for random initialization of cluster centers).
    KMEANS_PP_CENTERS   (for kmeans++ version of initializing cluster centers)
    KMEANS_USE_INITIAL_LABELS   (for user defined initialization).
  7. centers: Matrix holding center of each cluster. If we divide the 50 x 5 feature vector into 2 clusters, we will have 2 centers of each in 5 dimensions.
Sample program is explained as follows:


Friday, January 13, 2012

Access Pixels in C++ using OpenCV

I'm writing this blog post since, I've seen many people while moving from C API to C++ API have faced a lot of confusion in understanding it.

This post explains on how to use cv::Mat datatype in accessing pixels of a Image or elements of a Matrix. An example of cv::Mat_ is also shown.

Btw, My other programs in OpenCV will be posted here.

Many people still write code using IplImage* or CvMat* datatype. The sight of pointers scare a lot of people (except for some). With moving to C++ API, one can make use of C++ references which are more easier to understand and also one can use Object Oriented Programming techniques to bring in more easier to write code and flexibity in programming.

Moving to C++ API hasn't been that tough for me but, it definitely took a while. Now, my code looks more cleaner that before. I recommend everyone to move to C++ API using the cv::Mat type.

Sometimes, when you don't want to deal with the return type of the Matrix while accessing the elements (i.e. without the use of ".at" operator), you can use cv::Mat_ (which is a template sub class of cv::Mat).

Taking care of the data-type of Matrix elements is really important. I didn't care much about it while using the C API. But C++ API makes you more aware of the data-type of elements of the Matrix. This is for good as it encourages efficient usage of memory.

To use C++ API, you need the following headers and namespace.

Fore more faster ways of accessing pixels, refer OpenCV documentation.

The following code is used to:
  1. Declare.
  2. Assign.
  3. Print or Access
 values of pixels in Images or elements in Matrices of cv::Mat type in OpenCV:

Declaring Matrices: You have to know the data-type of elements of the Matrix before accessing or printing them.

For assigning values: For printing values (same as above):

Thursday, January 12, 2012

Installing GHMM

There seems to be some dependency issues while installing GHMM in Linux (Ubuntu). The following script installs GHMM (A HMM lib written in C with a python wrapper and a GUI).

I running Ubuntu 10.04 LTS.