Tuesday, March 29, 2011

How to write clean and testable code

This is an interesting talk I came across a few months back. I wanted to share it with all of you guys.

Sunday, March 27, 2011

Interesting study: Histogram of gradients vs Fast Human Detection Using a Cascade of Histograms of Oriented Gradients

I came across this report, which reviews the Histogram of Gradients approach and Fast Human Detection Using a Cascade of Histograms of Oriented Gradients.


A good read for those who are dealing with object recognition. 

Thursday, March 24, 2011

Detect eyes in an Image

In this article we discuss how to detect eyes in an image using OpenCV.

Btw, my other OpenCV programs will be posted here.

I am sure that many of you have tried OpenCV's own eye detector. It doesn't really give good performance, so I've been searching the web for a better eye-detector Haar cascade. I found one at ftp://mozart.dis.ulpgc.es/pub/misc/EyeDetecrs.zip (the site is a little slow, so be patient). I downloaded the file using wget:

The above file is also available at Google Code. You can download it using:

I suggest you go ahead and download the file. The zip file contains ojoD.xml (right eye) and ojoI.xml (left eye). There is another Haar classifier, parojos.xml, which detects both eyes. These work pretty decently: ojoD.xml and ojoI.xml often mistake the left eye for the right and vice versa, so they are not reliable for detecting a specific eye, whereas parojos.xml, which detects both eyes, works very well.

Basically, we do it in OpenCV as follows:

  • Declare the Haar cascade, storage, image and camera capture structures.
  • Initialize the Haar cascade from the downloaded XML file, and initialize the other declared structures.
  • Capture the stream from the camera, apply the cascade and detect the eyes.
  • Draw a rectangle around the eyes and show the result.
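
As a rough illustration of the four steps above, here is a minimal Python sketch using OpenCV's cv2 bindings. It is not the C++ program from this post. The cascade path parojos.xml, camera index 0, the detection parameters and the helper name box_corners are all assumptions made for illustration.

```python
def box_corners(x, y, w, h):
    """Convert an (x, y, width, height) detection into the two corner
    points that cv2.rectangle expects."""
    return (x, y), (x + w, y + h)


def detect_eyes_from_camera(cascade_path="parojos.xml", camera_index=0):
    # cv2 is imported here so that box_corners stays usable without OpenCV.
    import cv2

    cascade = cv2.CascadeClassifier(cascade_path)    # load the Haar cascade
    if cascade.empty():
        raise IOError("could not load cascade: " + cascade_path)
    capture = cv2.VideoCapture(camera_index)         # open the camera
    try:
        while True:
            ok, frame = capture.read()               # grab one frame
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            # scaleFactor and minNeighbors are guesses; tune them yourself.
            eyes = cascade.detectMultiScale(gray, scaleFactor=1.1,
                                            minNeighbors=5)
            for (x, y, w, h) in eyes:
                top_left, bottom_right = box_corners(int(x), int(y),
                                                     int(w), int(h))
                cv2.rectangle(frame, top_left, bottom_right, (0, 255, 0), 2)
            cv2.imshow("eyes", frame)
            if cv2.waitKey(10) & 0xFF == 27:         # Esc quits
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()


if __name__ == "__main__":
    detect_eyes_from_camera()
```

Converting the frame to grayscale before detection is the usual practice with Haar cascades; the rectangle colour and thickness are arbitrary.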

The following is the code in C++:

Tuesday, March 22, 2011

reading .MAT files

There are many ways to read the .mat files generated by MATLAB. I'm sure you have come across such a requirement.

In this article I show how to retrieve a matrix from a .mat file in:

  • Java (using the jmatio library)
  • Python (using the scipy library)

First we deal with Java. Download and install the jmatio library (get the jmatio.jar file and include it in the Java classpath). We initialize the MatFileReader class and then convert the data into Java-compatible data types.

The code is explained as follows:


In the above line, where I'm accessing matfilereader.getMLArray("atest"), I could also use getArray() instead of contentToString(). I haven't done so because it looks like jmatio cannot read multi-dimensional arrays (more than 2 dimensions). I tried assigning the result of getArray() to a double array, like

double[][][] s =.....getMLArray("atest").getArray(); and it didn't work. It said it was expecting a 2D double array but found a 3D array.

I think the workaround is to print the contents into a string and use StringTokenizer to fill an array of multiple dimensions.

Now we deal with Python. In Python it's really simple with scipy installed. Go ahead and install scipy if you haven't already (on Ubuntu, give the command $ sudo apt-get install python-scipy).

Now the Python code:

Simple, isn't it? Reading a .mat file in 3 lines of code!
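
For reference, a minimal sketch of reading a variable with scipy might look like the following. The variable name atest comes from the Java example above; the file name test.mat and the helper name read_mat_variable are assumptions. The core of it is the single loadmat call.

```python
def read_mat_variable(path, name):
    """Return the variable `name` stored in the MATLAB file at `path`."""
    from scipy.io import loadmat   # needs scipy installed

    contents = loadmat(path)       # dict: variable name -> numpy array
    return contents[name]


if __name__ == "__main__":
    # "test.mat" is a placeholder file name for this sketch.
    matrix = read_mat_variable("test.mat", "atest")
    print(matrix.shape)
```

loadmat returns arrays with their full dimensionality, so a 3D MATLAB array arrives as a 3D numpy array without any string parsing.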

Sunday, March 20, 2011

simple tutorial on using LIBSVM

This article deals with how to use LIBSVM and test the accuracy of the classifier. LIBSVM is a tool for incorporating SVMs (support vector machines) into your project.

I'll be posting other tutorials/programs using LIBSVM here.

SVMs are used for classifying data in one or more dimensions into 2 or more classes. All they do is try to find a boundary that separates the classes from each other as clearly as possible.

labelled training data-------->SVM------->trained SVM

unlabelled/labeled testing data--------->trained SVM--------->predicted labels

They work on labelled data (for unlabelled data, you will need a ground truth to establish the accuracy of the SVM). I recommend reading the LIBSVM documentation completely (it is less than 16 pages). After reading it you will have some insight into what LIBSVM is about and how you can use it in your project.

Basically, an SVM takes in a set of labelled feature vectors (values in multiple dimensions) during training and outputs labels in the testing phase. Given labelled test data, we can measure how accurately it differentiates the classes.

Suppose we want to differentiate a square from a rectangle. We have a 2-dimensional feature vector: length and width.

The following logic defines our feature vector and thus our SVM.

If length = width it's a square
else it's a rectangle.

If this is the logic we want the SVM to learn, we have to give it the data in the following format:

1 1:2 2:2
1 1:3 2:3
1 1:4 2:4
1 1:5 2:5

-1 1:2 2:3
-1 1:1 2:2
-1 1:2 2:4
-1 1:3 2:1

I guess you have figured out what 1 and -1 mean. In the above data, the 1st column holds the label (square or rectangle: Square=1 and Rectangle=-1). The 2nd and 3rd columns hold the length and width. The prefixes 1: and 2: tell LIBSVM that the values are the 1st and 2nd dimensions of the data respectively.
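
To make the format concrete, here is a small Python sketch that produces such lines from (length, width) pairs. The function names to_libsvm_line and label_shape are made up for this example.

```python
def to_libsvm_line(label, features):
    """Format one sample as 'label 1:v1 2:v2 ...' (indices start at 1)."""
    pairs = ["%d:%g" % (index, value)
             for index, value in enumerate(features, start=1)]
    return " ".join([str(label)] + pairs)


def label_shape(length, width):
    # Labels follow the post: 1 for a square, -1 for a rectangle.
    return 1 if length == width else -1


if __name__ == "__main__":
    for length, width in [(2, 2), (3, 3), (2, 3), (3, 1)]:
        print(to_libsvm_line(label_shape(length, width), [length, width]))
```

Writing each returned string on its own line of a text file gives a file that can be used as training or testing data.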

The data can take other forms too. If you have a series of images as input, the input feature vector for each image depends on the image size and colorspace.

If you have a 640 x 480 black and white image as input (grayscale colorspace), you will have data of 307200 dimensions (each dimension with a range of 0 to 255 in grayscale). It would look something like this:

1 1:255 2:233 3:0 4:44 ........307200:233
-1 1:55 2:3 3:20 4:240 ........307200:233
1 1:155 2:123 3:50 4:42 ........307200:233

Here 1 and -1 represent the class label (defined by you). Each row represents an image.
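
A sketch of how an image turns into one such row, assuming the image is held as a list of pixel rows and is flattened row by row (the ordering is a choice of this example, not something LIBSVM prescribes):

```python
def image_to_features(rows):
    """Flatten a grayscale image (list of rows of 0-255 values) row by row."""
    return [pixel for row in rows for pixel in row]


if __name__ == "__main__":
    width, height = 640, 480
    blank = [[0] * width for _ in range(height)]   # an all-black test image
    print(len(image_to_features(blank)))           # one dimension per pixel
```

Whatever ordering you pick, it has to be the same for every image in both the training and the testing data.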

Now let's download LIBSVM and get started. I also suggest you install Python.

Now that we have downloaded and installed LIBSVM, let's try a simple classification.

Download the testing data and training data and put them in a folder.

In the downloaded data, newtraining.txt is the training data and newtesting.txt is the testing data.

Copy svm-predict, svm-train, easy.py and grid.py from the folder where you installed LIBSVM to the folder where you saved the testing and training data.

Now you can do this in 2 ways:
1) Manually:

Go to the command prompt / shell terminal and give the following command:
$svm-train <training_data>

where <training_data> is the text file which contains the training data.

Executing the above command prints the training statistics and generates the model file <training_data>.model.

Then issue the following command:

$svm-predict <testing_data> <training_data.model> outputlabels.txt

where <testing_data> is the text file which contains the testing data,
<training_data.model> is the model file generated in the previous step (svm-train <training_data>), and
outputlabels.txt is a text file that stores the predicted label for each input feature vector in the <testing_data> file.

Executing the above command shows the accuracy of the model generated by LIBSVM.
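
The accuracy figure is simply the percentage of predicted labels that match the labels in the testing file. A small Python sketch of that comparison, assuming outputlabels.txt holds one predicted label per line (the helper names are made up):

```python
def read_labels(lines):
    """Take the first whitespace-separated field of each non-empty line."""
    return [float(line.split()[0]) for line in lines if line.strip()]


def accuracy(true_labels, predicted_labels):
    """Percentage of predictions that match the ground truth."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists differ in length")
    if not true_labels:
        raise ValueError("no labels to compare")
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return 100.0 * correct / len(true_labels)


if __name__ == "__main__":
    with open("newtesting.txt") as f:
        truth = read_labels(f)
    with open("outputlabels.txt") as f:
        predicted = read_labels(f)
    print("Accuracy = %.2f%%" % accuracy(truth, predicted))
```

This only works when the testing file carries real labels in its first column, which is the ground truth mentioned earlier.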

If you follow the above procedure, it yields an accuracy of 51.85%.
This is considered low for an SVM. The reason is that we have not yet scaled the data or selected the best SVM parameters for the given input data. To do so, we use the automatic way of using LIBSVM. Jump to the next part.

2) Automatically:
I assume that you have installed Python and gnuplot. Open easy.py in the IDLE Python editor or any other text editor and go to the lines that say the following:

svmscale_exe = .....
svmtrain_exe = .....
svmpredict_exe = .......
grid_py = ......
gnuplot_exe = .....

Basically, these are the variables that point to the svm-train, svm-predict, etc. executables. Edit them to point to the executables on your system.

Once the above is done, just issue the command:

$python easy.py <training_data> <testing_data>

Here <training_data> and <testing_data> refer to the training and testing files.

The above command automatically scales the data (you should know what scaling is if you've read the LIBSVM documentation), does cross-validation and selects good parameters for the SVM's kernel, so you get a reliable SVM without having to worry about the training and testing parameters yourself.
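
To give an idea of what scaling means, here is a minimal Python sketch that linearly maps one feature onto the range [-1, 1]. It illustrates the idea only; it is not the code of LIBSVM's scaling tool, and the function name scale_column is made up.

```python
def scale_column(values, lower=-1.0, upper=1.0):
    """Linearly map one feature's values onto [lower, upper]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to the
        # midpoint (a choice made for this sketch).
        return [(lower + upper) / 2.0 for _ in values]
    span = float(hi - lo)
    return [lower + (upper - lower) * (v - lo) / span for v in values]


if __name__ == "__main__":
    print(scale_column([0, 5, 10]))
```

When scaling real data, the minimum and maximum should be computed on the training set and reused for the test set, so that both are scaled the same way.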

The output on the terminal/command prompt shows the accuracy of the SVM. The files containing the predicted labels are also written to the present folder.

Using the automatic way, we get an accuracy of 74%.

The automatic way is the best way to use LIBSVM, as it scales the training and test data, does cross-validation and selects the best SVM parameters.

Frame Differencing using OpenCV in C++

Frame differencing is a simple concept. It's basically the difference between two images. The regions of the image that differ are marked in the final image.

Btw, my other OpenCV programs will be posted here.

We compute the absolute difference between 2 images and usually mark each pixel with 255 (white) where the images differ or 0 (black) where they don't.

It all depends on how you define your frame differencing (you can customize it to suit your own requirements).

This concept is called Frame Differencing, or rather Static Frame Differencing. It is very useful for detecting motion from a static point of view (a stationary camera taking a video). However, it is not useful when the camera itself is moving. Such situations (where camera motion has to be considered) require different strategies.
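
The idea fits in a few lines of plain Python. This sketch treats each grayscale frame as a list of pixel rows; the threshold of 30 is an arbitrary value chosen for illustration, and this is not the OpenCV program from this post.

```python
def frame_difference(previous, current, threshold=30):
    """Return a mask with 255 where two grayscale frames differ by more
    than `threshold`, and 0 elsewhere. Frames are equal-sized lists of rows."""
    mask = []
    for prev_row, cur_row in zip(previous, current):
        mask.append([255 if abs(c - p) > threshold else 0
                     for p, c in zip(prev_row, cur_row)])
    return mask


if __name__ == "__main__":
    before = [[10, 10], [10, 10]]
    after = [[10, 200], [35, 10]]
    for row in frame_difference(before, after):
        print(row)
```

The threshold keeps small changes such as sensor noise from being marked as motion. With OpenCV's Python bindings the same two steps are available as cv2.absdiff and cv2.threshold.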

The program is explained as follows:

With the cleaner C++ API:

With the more C-like API:


Thursday, March 17, 2011

OpenCV installation with Ubuntu

I am sure that many of you have tried installing OpenCV with Python support on Linux but had no success, as executing a program gives you a "no module found" error. I recently figured out how to do it and fix all the errors.

Btw, my other OpenCV programs will be posted here.

In this post I explain how to install OpenCV on Ubuntu with Python and ffmpeg support.

Follow the steps and your problem will be solved.

  1. First install the python-dev package (with $ sudo apt-get install python-dev).
  2. Then install the files required for ffmpeg with $ sudo apt-get install libavformat-dev libavcodec-dev libavfilter-dev libswscale-dev
  3. On some systems you may also need to install gtk and other prerequisites. For those, execute the following: $ sudo apt-get install libavformat-dev libgtk2.0-dev pkg-config cmake libswscale-dev bzip2
  4. Then download the tarball from here. Alternatively, you can run an svn checkout of the latest OpenCV trunk.
  5. Save the above file in a folder.
  6. Open a terminal and browse to that folder.
  7. Give the command $ tar xvjf OpenCV-2.3.0.tar.bz2
  8. The files in the tarball will be extracted into a folder named OpenCV.... in the present folder.
  9. Browse to the extracted folder, create a folder named release and change to it by giving the following command: $ cd OpenCV-2.3.0; mkdir release; cd release
  10. Now we need to create the configuration files using CMake (if cmake isn't installed, install it with $ sudo apt-get install cmake) by giving the following command: $ cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D BUILD_PYTHON_SUPPORT=ON ..
  11. Give the following command to compile the source: $ make
  12. Give the following command to install the libraries: $ sudo make install
  13. Then we need to let the OS know where OpenCV is installed, so edit the LD_LIBRARY_PATH variable by giving the following command: $ export LD_LIBRARY_PATH=/usr/local/lib/:$LD_LIBRARY_PATH
  14. Then give the following command to update the ld paths: $ sudo ldconfig
  15. Now we need to move the cv.so file from the site-packages folder to the dist-packages folder: $ sudo mv /usr/local/lib/python2.6/site-packages/cv.so /usr/local/lib/python2.6/dist-packages/cv.so
  16. And we're done!! :-)
  17. And yeah! In order to compile, use the following: $ g++ `pkg-config opencv --cflags --libs` -g input.cpp -o output
  18. I usually use this alias in Ubuntu, which makes my life easier: alias opencvcc="g++ `pkg-config opencv --cflags --libs` -g". Now I can just compile using the following command: $ opencvcc input.cpp -o output

Or you can just execute the following script (which does the same as the above):

Now you are ready to run your first program in Python; click here for a simple 14-line Python program.


Monday, March 14, 2011

String Tokenizer in C++

Hi guys,
I'm sure that many of you have tried to tokenize a string in C++. Among the hassles of handling strings in C/C++, there is built-in support for this. It's declared in "string.h".

We use the char* strtok(char* string2btokenized, char* wildcardchars2bremoved) function, which takes 2 parameters.

The first is the string to be tokenized and the second is the set of delimiter characters to be removed.

In this program I pass those as the first 2 command-line arguments, so they are stored in argv[1] and argv[2]. The code is self-explanatory.
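
For readers who want to see the behaviour without compiling anything, here is a Python sketch of what repeated strtok calls produce: the string is cut at any of the delimiter characters, and empty tokens between consecutive delimiters are skipped. The function name tokenize is made up, and this is an illustration of the behaviour, not the C++ program.

```python
import sys


def tokenize(text, delimiters):
    """Split `text` on any character in `delimiters`, dropping empty
    tokens, which mirrors how repeated strtok calls walk a string."""
    tokens, current = [], []
    for ch in text:
        if ch in delimiters:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


if __name__ == "__main__":
    # Same calling convention as the post: string first, delimiters second.
    if len(sys.argv) == 3:
        print(tokenize(sys.argv[1], sys.argv[2]))
```

Unlike strtok, this version leaves the input string untouched and returns all tokens at once.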

Saturday, March 12, 2011

C++ List files in a directory

Hi guys, in this post we see how to list the files in a directory. Basically we include dirent.h, which specifies the format of directory entries in Linux.

WINDOWS users click here

In Linux the listing also includes the entries "." and "..", which refer to the current and parent directories and do not qualify as the regular files we are interested in. We skip them by checking the length of the file name (to be <2). Change the code as you wish.
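
One possible variation on the skip: comparing the names exactly against "." and ".." avoids dropping real files with very short names, which a pure length check can do. A Python sketch of that filter (the function names are made up for this example):

```python
import os


def regular_entries(names):
    """Drop the '.' and '..' entries by exact comparison."""
    return [name for name in names if name not in (".", "..")]


def list_directory(path="."):
    # os.listdir already omits '.' and '..'; the filter is kept to mirror
    # what a readdir() loop in C/C++ has to do by hand.
    return sorted(regular_entries(os.listdir(path)))


if __name__ == "__main__":
    for name in list_directory("."):
        print(name)
```

For example, a file named "a" survives this filter, while "." and ".." do not.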

Human Computer Interaction redefined through Computer Vision

Computer Vision technologies are redefining the way we interact with computers. Microsoft Kinect has generated more hype for things it can do with a PC than the games we can play on it with Xbox. Like the one shown in this video, many fun Kinect applications are yet to come. 

With Microsoft releasing the Kinect SDK, the possibilities are infinite. Fit cameras everywhere in your home and you have a smart home.

Android mobile phones such as the Motorola Droid Bionic and Atrix 4G have biometric sensors that can sense and validate fingerprints (an image processing application, vaguely related to computer vision).

Computer Vision is being increasingly built into Operating Systems.

Computer Vision is increasingly assisting drivers with driving and parking:

  • Many applications with respect to driving are mentioned here.
  • Google's self-driving cars, which I believe use sensors, lasers and the obstacle detection techniques of computer vision (using some kind of Haar classifier or an SVM), coupled with Google Maps navigation technology (a computer vision and image processing application available on the web).
  • Computer vision and face recognition can also be built into cars to recognize drivers and prevent car theft.
  • I have developed a simple-to-use personal free parking lot finder coupled with an Android app (further development to increase the accuracy is in the works).

Computer Vision is one of the pillars of Robotics and Photography.

  • Vision plays a very important role in how robots interact with their environment, including object recognition, target tracking, human-computer interaction, etc.
  • Willow Garage is making significant strides in this field with its PR2 robot.
  • An interesting review of the social implications of human-robot interaction through robot vision gaze can be found here.
  • Face detection in digital cameras.
  • Image stitching features in digital cameras.
  • 3D scene reconstruction using SLAM (Simultaneous Localization And Mapping).

Computer Vision is increasingly changing the way we live and how we interact with computers. There is still a long way to go for the benefit of humanity.

Tuesday, March 8, 2011

record data from Camera in Opencv

Hi all,

In this article I'll show how to record data from the camera (webcam) on your computer running Ubuntu.
NOTE: Click here if you are running Windows (recent version or Visual Studio 2008).

Btw, my other OpenCV programs will be posted here.

I don't think you will be able to use Kinect with this code. For that, you might want to refer to the OpenNI API.

Basically we include cv.h and highgui.h, get hold of the camera using the cvCreateCameraCapture() function and extract frames from it using cvQueryFrame().

The program is as follows:


You can change the cvCreateCameraCapture parameter (which is usually >=0) to select a specific camera (if you have more than one camera attached to your computer).

This program saves all the images in .jpg format in the present folder. You can also record into a .avi or any other video file; for that, click here.
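
For comparison, the same capture-and-save loop can be sketched with OpenCV's Python bindings, where cv2.VideoCapture plays the role of cvCreateCameraCapture and read() the role of cvQueryFrame. The file naming scheme, the 100-frame limit and the function names are assumptions of this sketch, not taken from the program above.

```python
def frame_filename(index, prefix="frame"):
    """Build a zero-padded name such as frame_0007.jpg."""
    return "%s_%04d.jpg" % (prefix, index)


def record_frames(camera_index=0, max_frames=100):
    """Save up to max_frames camera frames as .jpg files in the present
    folder and return how many were written."""
    import cv2   # imported lazily so frame_filename works without OpenCV

    capture = cv2.VideoCapture(camera_index)
    if not capture.isOpened():
        raise IOError("cannot open camera %d" % camera_index)
    saved = 0
    try:
        while saved < max_frames:
            ok, frame = capture.read()
            if not ok:
                break
            cv2.imwrite(frame_filename(saved), frame)
            saved += 1
    finally:
        capture.release()
    return saved


if __name__ == "__main__":
    print("saved %d frames" % record_frames())
```

Zero-padding the index keeps the saved files in capture order when they are sorted by name.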

Sunday, March 6, 2011

Cannot include Python.h problem solved: Calling python functions from a C program in Ubuntu

Hi all. I'm sure many of you who have tried to call Python functions from a C program, or to build/install something, have faced the above problem.

I'm writing this solution down because many have suggested installing the python-dev2.5 package, etc. Installing more than one Python package in Linux might create confusion.

On Ubuntu 10.04, this problem can be solved by including the Python.h file from the folder /usr/include/python2.6

Don't panic if you don't find the header file Python.h in the folder mentioned above. You need to install the python2.6-dev package. Simply go to your command prompt (err... Terminal) and execute the following command: sudo apt-get install python2.6-dev. This will install the Python.h file (and many more headers) in your /usr/include/python2.6 folder.

Then compile with that include path by giving the command: g++ new.cpp -I /usr/include/python2.6 -pthread -lm -ldl -lutil -lpython2.6 -o new
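
If you are not sure which include folder belongs to your interpreter, you can ask Python itself. This small sketch uses the standard sysconfig module, which is available from Python 2.7 onwards (so not on the 2.6 setup described above); the function name python_include_dir is made up.

```python
import os
import sysconfig


def python_include_dir():
    """Directory that should contain Python.h for the running interpreter."""
    return sysconfig.get_paths()["include"]


if __name__ == "__main__":
    include_dir = python_include_dir()
    header = os.path.join(include_dir, "Python.h")
    print("-I " + include_dir)
    if os.path.exists(header):
        print("Python.h found")
    else:
        print("Python.h missing: install the matching dev package")
```

The printed -I option can be pasted into the g++ command in place of the hard-coded folder.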

Hope this solves your problem of including the header.

In order to embed a Python function (program) in your C++ or C code, you might want to refer to http://docs.python.org/extending/embedding.html

Wednesday, March 2, 2011

Implementing Face Tracer search engine: Part 1

Hi all. I'm trying to implement this paper on the Face Tracer search engine from Columbia University. The concept looks very interesting, sometimes obvious, but very novel in terms of search engine implementation.

Presently, I've downloaded the database and am looking for ways to train SVMs for each of the attributes. The author has not provided the entire database on his website. Instead, he asks us to write a program to download the URLs listed in a text file. So I've written a small Python script to parse the entire file and download the files one by one.

I've made use of the urllib module in Python and used its urlretrieve function to download the files.

The database consists of 15,000 images. I tried to download all of them, but many were no longer there, as the paper and the links were themselves 3 years old. To make things quicker, I divided the faceindex.txt file into 5 parts (faceindex1.txt, faceindex2.txt, faceindex3.txt, faceindex4.txt, faceindex5.txt), with 3000 image URLs in each file. Downloading the parts this way was fast enough (it finished in 3 hours).
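
Splitting the index file can be sketched as follows. This is not the script I used; it assumes faceindex.txt has one entry per line, and the function names are made up for illustration.

```python
def split_into_parts(lines, parts):
    """Split a list into `parts` consecutive chunks of near-equal size."""
    size, extra = divmod(len(lines), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional line each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(lines[start:end])
        start = end
    return chunks


def write_parts(index_path="faceindex.txt", parts=5):
    """Write faceindex1.txt, faceindex2.txt, ... next to the original."""
    with open(index_path) as f:
        lines = f.readlines()
    for number, chunk in enumerate(split_into_parts(lines, parts), start=1):
        with open("faceindex%d.txt" % number, "w") as out:
            out.writelines(chunk)


if __name__ == "__main__":
    write_parts()
```

With 15,000 lines and 5 parts, each output file ends up with 3000 lines.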

I also made use of the PIL module to open each image and check that it's valid. If an exception was raised, the image was considered corrupt and was logged to a file, building a list of the corrupt files in the database.
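
A minimal sketch of the download-and-check loop could look like this. It is a reconstruction of the idea rather than my actual script: it assumes a plain list of image URLs, saves every file with a numbered .jpg name, and the function names download_all and is_valid_image are made up.

```python
import os

try:                                   # Python 3
    from urllib.request import urlretrieve
except ImportError:                    # Python 2
    from urllib import urlretrieve


def is_valid_image(path):
    """True if PIL can open and verify the file."""
    from PIL import Image              # needs PIL installed
    try:
        Image.open(path).verify()
        return True
    except Exception:
        return False


def download_all(urls, dest_dir, fetch=urlretrieve, validate=is_valid_image):
    """Download each URL into dest_dir and return the list of URLs that
    could not be fetched or did not contain a readable image."""
    bad = []
    for number, url in enumerate(urls):
        target = os.path.join(dest_dir, "%05d.jpg" % number)
        try:
            fetch(url, target)
        except (IOError, OSError, ValueError):
            bad.append(url)            # dead link or network error
            continue
        if not validate(target):
            bad.append(url)            # downloaded, but not a valid image
    return bad
```

The returned list can be written to a log file. Passing the fetch and validate functions as parameters makes it possible to try the loop without network access.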

I've arranged the dataset into different folders according to the facelabels.txt file, for convenience's sake.

Now that I've got the images, I need to use AdaBoost to select the best classifiers and then train SVMs to implement the search engine. Will keep you posted with updates.