Saturday, November 26, 2011

Action Recognition Survey

The paper titled "Machine Recognition of Human Activities: A Survey" is an interesting journal paper I read recently.

It covers recent advances, open research problems, and future research directions in Action Recognition, along with the different techniques used to do Action Recognition.

Friday, November 25, 2011

Image Overlap in Android OpenCV

I have recently been playing around with OpenCV on Android.

By the way, my other OpenCV programs will be posted here.

Basically, I've written a program that overlaps two images. One image lives in the "res" folder of the Android project and the other is the image coming from the camera.

The image coming from the camera is available as an array of bytes. We convert it into a Bitmap, then into a "Mat". The other image is read as a "Mat" directly using the Utils.loadResource method.

Now that both images are in Mat format, I use the Mat.getNativeObjAddr() method from the android-opencv project to pass each image to the JNI code. This is more convenient than passing raw bytes and reconstructing the image from them (look up the samples in the android-opencv project), because the cv::Mat can be recovered in the JNI code directly from the address.

Also, one thing to take care of is the number of channels in the images/Mats being processed. All the images being operated on must have the same number of channels. To get there, you might have to convert RGB to Gray, RGB to RGBA, etc.

In the sample program, I've written code that sends an RGBA image and an RGB image to the JNI (OpenCV C++) code and then converts the RGB image to RGBA using the cvtColor function. It then calls the "addWeighted" function on the two images and returns the result to the Java code.
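To make the blending step concrete, here is a minimal plain-Python sketch of the arithmetic involved. It is not the OpenCV call from my program: the helper names (rgb_to_rgba, saturate_u8, add_weighted), the pixel values, and the 0.5/0.5 weights are all made up for illustration. It only shows the idea of bringing both images to the same channel count and then computing w1*img1 + w2*img2 + gamma per channel.

```python
def rgb_to_rgba(pixel, alpha=255):
    """Append an opaque alpha channel to an (R, G, B) pixel."""
    r, g, b = pixel
    return (r, g, b, alpha)


def saturate_u8(value):
    """Round to the nearest integer and clamp to the 0..255 range of an 8-bit channel."""
    return max(0, min(255, int(round(value))))


def add_weighted(img1, w1, img2, w2, gamma=0.0):
    """Blend two images pixel by pixel: w1*img1 + w2*img2 + gamma on every channel."""
    if len(img1) != len(img2):
        raise ValueError("images must have the same number of pixels")
    blended = []
    for p1, p2 in zip(img1, img2):
        if len(p1) != len(p2):
            # Same rule as in the post: channel counts must match before blending.
            raise ValueError("images must have the same number of channels")
        blended.append(tuple(saturate_u8(w1 * a + w2 * b + gamma)
                             for a, b in zip(p1, p2)))
    return blended


# Hypothetical two-pixel images: the camera frame is RGBA, the resource is RGB.
camera_rgba = [(200, 100, 50, 255), (10, 20, 30, 255)]
resource_rgb = [(100, 200, 150), (30, 40, 50)]

# Convert the RGB image to RGBA first, then blend with equal weights.
resource_rgba = [rgb_to_rgba(p) for p in resource_rgb]
blended = add_weighted(camera_rgba, 0.5, resource_rgba, 0.5)
print(blended)
```

Passing the RGB image straight into add_weighted without the conversion raises a ValueError, which is the pure-Python analogue of the channel mismatch described above.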

I got the following image in my app after merging the two images:

We can do heavier image processing in C++, which is faster than doing it in Java on Android.

If you are looking for the program to merge images in OpenCV, take a look at this program.

Sunday, November 20, 2011

Hidden Markov Models for Dummies

This article is a great collection of the best resources available on the web which explain Hidden Markov Models and their applications.

I think there is never a "best place" to learn all the points of a new concept/idea. Instead, you need to go through a lot of sources (books, webpages, journals, etc.) to understand something.

If you are done with learning what HMMs can do, you can check out this post, which discusses selecting optimal parameters for training an HMM classifier.

  1. Here is a really good explanation of what HMMs are all about. You need some background in Probability (Bayesian Networks) though. The Autonlab website above (from Carnegie Mellon University) has really good explanations and slides on Data Mining, Machine Learning, and Pattern Recognition, which are helpful to Math, Statistics, and Computer Science researchers.
  2. Here is another link (from the University of Otago, which makes really good tutorials that start from the basics) that explains HMMs with the basics of Probability and a naive C/C++ implementation.
  3. Here is another link from UC Berkeley which explains Hidden Markov Models (in the Practical Machine Learning class). This one clearly covers learning and testing/classification without much deep explanation of how the equations were derived, and gives quick formulae to start with. This is very much unlike other tutorials on the web, which seemed confusing to me.
  4. Another tutorial here (Utah State University) describes HMMs and also talks about issues in implementing them (the floating point underflow problem).
  5. The Matlab documentation website does a good job of explaining Hidden Markov Models in a basic manner along with code (to use this code, you need the "Statistics" toolbox in Matlab).
  6. This article (University of Cambridge) compares Hidden Markov Models with Dynamic Bayesian Networks. It also covers other Computer Vision applications that use these Stochastic Models.
  7. The tutorial most often referred to by other tutorials is "A tutorial on Hidden Markov Models and selected applications in speech recognition".
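Item 4 in the list above mentions the floating point underflow problem. Here is a small plain-Python sketch of what that looks like and of one common workaround, rescaling the forward variables at every step and accumulating the logarithm of the scale factors. The toy model numbers and the function names are made up for illustration and are not taken from any of the linked tutorials.

```python
import math


def forward_naive(obs, start_p, trans_p, emit_p):
    """Plain forward algorithm: returns P(observations | model)."""
    n = len(start_p)
    alpha = [start_p[s] * emit_p[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[p] * trans_p[p][s] for p in range(n)) * emit_p[s][o]
                 for s in range(n)]
    return sum(alpha)


def forward_scaled(obs, start_p, trans_p, emit_p):
    """Forward algorithm with per-step rescaling: returns log P(observations | model)."""
    n = len(start_p)
    alpha = [start_p[s] * emit_p[s][obs[0]] for s in range(n)]
    log_likelihood = 0.0
    for t in range(len(obs)):
        if t > 0:
            o = obs[t]
            alpha = [sum(alpha[p] * trans_p[p][s] for p in range(n)) * emit_p[s][o]
                     for s in range(n)]
        # Normalise alpha so it sums to 1 and remember the factor in log space.
        scale = sum(alpha)
        log_likelihood += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_likelihood


# Hypothetical 2-state, 3-symbol model (all numbers are illustrative).
start_p = [0.6, 0.4]
trans_p = [[0.7, 0.3], [0.4, 0.6]]
emit_p = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]

short_obs = [0, 1, 2]
long_obs = [0, 1, 2] * 700  # 2100 observations

print(forward_naive(short_obs, start_p, trans_p, emit_p))
print(forward_naive(long_obs, start_p, trans_p, emit_p))   # underflows to 0.0
print(forward_scaled(long_obs, start_p, trans_p, emit_p))  # finite log-likelihood
```

On the short sequence both versions agree (after exponentiating the log-likelihood). On the long one the naive product of probabilities drops below what a double can represent, while the scaled version still returns a usable number.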

Here is a set of videos on YouTube that explain Hidden Markov Models in a more mathematical way! Other videos from the same author have excellent Machine Learning examples, explained very mathematically and clearly!

There are 3 problems to solve in a Hidden Markov Model, namely State Estimation, Decoding or Most Probable Path (MPP), and Training/Learning HMM.

The above 3 problems are solved using the following techniques:

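  1. State Estimation: the Forward-Backward technique is used for State Estimation (given a set of observations, how probable is each hidden state, and how probable are the observations under the model). This step is also known as Classification, since the observation likelihood can be compared across trained HMMs. It can be used for testing the classifier that was built.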
  2. Decoding or Most Probable Path: the Viterbi Decoding technique is used for estimating the Most Probable Path (given a set of observations, which sequence of states best explains them). This step is also known as Decoding. It can also be used for testing the classifier that was built.
  3. Training/Learning HMM: the Baum-Welch (Expectation Maximization) technique is used for Learning an HMM. Given a set of observations, we can estimate the maximum likelihood HMM that may have produced them (adjust the HMM so that it fits the data). This is also known as Training.
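The first two techniques in the list above can be sketched in a few lines of plain Python. This is only a toy illustration for a discrete HMM: the 2-state model, its numbers, and the function names are made up, and there is no scaling, so it is only suitable for short sequences.

```python
def forward_backward(obs, start_p, trans_p, emit_p):
    """Return (posteriors, likelihood): P(state at t | all observations) and P(observations)."""
    n, T = len(start_p), len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[1.0] * n for _ in range(T)]
    # Forward pass: probability of the observations so far, ending in state s.
    for s in range(n):
        alpha[0][s] = start_p[s] * emit_p[s][obs[0]]
    for t in range(1, T):
        for s in range(n):
            alpha[t][s] = sum(alpha[t - 1][p] * trans_p[p][s]
                              for p in range(n)) * emit_p[s][obs[t]]
    # Backward pass: probability of the remaining observations, starting from state s.
    for t in range(T - 2, -1, -1):
        for s in range(n):
            beta[t][s] = sum(trans_p[s][q] * emit_p[q][obs[t + 1]] * beta[t + 1][q]
                             for q in range(n))
    likelihood = sum(alpha[T - 1])
    posteriors = [[alpha[t][s] * beta[t][s] / likelihood for s in range(n)]
                  for t in range(T)]
    return posteriors, likelihood


def viterbi(obs, start_p, trans_p, emit_p):
    """Return (path, probability) of the single most probable state sequence."""
    n = len(start_p)
    delta = [start_p[s] * emit_p[s][obs[0]] for s in range(n)]
    back_pointers = []
    for o in obs[1:]:
        new_delta, pointers = [], []
        for s in range(n):
            best_prev = max(range(n), key=lambda p: delta[p] * trans_p[p][s])
            pointers.append(best_prev)
            new_delta.append(delta[best_prev] * trans_p[best_prev][s] * emit_p[s][o])
        delta = new_delta
        back_pointers.append(pointers)
    # Trace the best path backwards from the most probable final state.
    last = max(range(n), key=lambda s: delta[s])
    path = [last]
    for pointers in reversed(back_pointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, delta[last]


# Hypothetical 2-state, 3-symbol model (all numbers are illustrative).
start_p = [0.6, 0.4]
trans_p = [[0.7, 0.3], [0.4, 0.6]]
emit_p = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
obs = [0, 1, 2]

posteriors, likelihood = forward_backward(obs, start_p, trans_p, emit_p)
best_path, best_prob = viterbi(obs, start_p, trans_p, emit_p)
print(likelihood, posteriors)
print(best_path, best_prob)
```

For classification, one would typically train one HMM per class and pick the class whose model gives the highest likelihood for the test sequence.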

For Computer Vision folks, these slides explain how to apply HMMs to action recognition.

Once you are comfortable with the basics of HMMs, you might want to look into this paper. The authors describe how to select the initial parameters of an HMM.
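The reason initial parameters matter is that Baum-Welch only refines whatever starting guess it is given, one step at a time, so it ends up at an optimum near that guess. Below is a minimal plain-Python sketch of one re-estimation step for a discrete HMM trained on a single sequence. It is not the method from the paper: the function names, the starting guess, and the observation sequence are all made up for illustration, and there is no scaling, so it only works for short sequences.

```python
def forward_backward(obs, start_p, trans_p, emit_p):
    """Return the forward table, the backward table and P(observations | model)."""
    n, T = len(start_p), len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[1.0] * n for _ in range(T)]
    for s in range(n):
        alpha[0][s] = start_p[s] * emit_p[s][obs[0]]
    for t in range(1, T):
        for s in range(n):
            alpha[t][s] = sum(alpha[t - 1][p] * trans_p[p][s]
                              for p in range(n)) * emit_p[s][obs[t]]
    for t in range(T - 2, -1, -1):
        for s in range(n):
            beta[t][s] = sum(trans_p[s][q] * emit_p[q][obs[t + 1]] * beta[t + 1][q]
                             for q in range(n))
    return alpha, beta, sum(alpha[T - 1])


def baum_welch_step(obs, start_p, trans_p, emit_p):
    """One EM update; also returns the likelihood under the parameters passed in."""
    n, T, m = len(start_p), len(obs), len(emit_p[0])
    alpha, beta, lik = forward_backward(obs, start_p, trans_p, emit_p)
    # gamma[t][i]: probability of being in state i at time t given all observations.
    gamma = [[alpha[t][i] * beta[t][i] / lik for i in range(n)] for t in range(T)]
    new_start = gamma[0][:]
    new_trans = []
    for i in range(n):
        visits = sum(gamma[t][i] for t in range(T - 1))
        row = []
        for j in range(n):
            # Expected number of i -> j transitions given the observations.
            moves = sum(alpha[t][i] * trans_p[i][j] * emit_p[j][obs[t + 1]] * beta[t + 1][j]
                        for t in range(T - 1)) / lik
            row.append(moves / visits)
        new_trans.append(row)
    new_emit = []
    for i in range(n):
        visits = sum(gamma[t][i] for t in range(T))
        new_emit.append([sum(gamma[t][i] for t in range(T) if obs[t] == k) / visits
                         for k in range(m)])
    return new_start, new_trans, new_emit, lik


# Hypothetical training sequence and (deliberately asymmetric) starting guess.
obs = [0, 0, 1, 2, 2, 1, 0, 0, 2, 2, 1, 0]
start_p = [0.5, 0.5]
trans_p = [[0.6, 0.4], [0.5, 0.5]]
emit_p = [[0.4, 0.3, 0.3], [0.2, 0.3, 0.5]]

likelihoods = []
for _ in range(10):
    start_p, trans_p, emit_p, lik = baum_welch_step(obs, start_p, trans_p, emit_p)
    likelihoods.append(lik)
print(likelihoods)
```

The likelihood recorded at each iteration never goes down, which is the EM guarantee, but a different starting guess can settle on a different final model.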

A background in Statistical Pattern Recognition and Stochastics will definitely help in understanding Hidden Markov Models. Hidden Markov Models are widely used in Speech Recognition and Computer Vision (Gesture Recognition and Action Recognition).

There are 3rd party libraries available on the web for use in your project. Also, there is a Matlab toolkit available.

Wednesday, November 16, 2011

Move to the Cloud

This is an interesting article and video from UC Berkeley concerning Cloud Computing. It discusses the need for the Cloud in building Scalable Services, opportunities for Startups, and the advantages of the Cloud.

It also talks about present concerns about Cloud Computing and possible solutions. It's basically the way UC Berkeley would want to see Cloud Services evolve.

Wednesday, November 9, 2011

Possibilities with Array Cameras

This is an interesting video on using Array Cameras to avoid Occlusions. Array Cameras are a set of Cameras whose Images are Co-Planar. There are a lot of possibilities for Array Cameras in Surveillance and Occlusion Avoidance.

The link for this research paper from UCSD is here.

Wednesday, November 2, 2011

Apple's Siri and Android's Voice Control

It's been a while since Apple's iPhone 4S was released, and everyone is going gaga over Siri. It is funny to read Apple fanboys' claims about being able to talk to their phones and have them reply. That feature has been in Android for almost a year.

However, Siri is absolutely amazing. Apple has once again shown us that a better implementation always attracts positive attention.

Presently, Android's voice feature is not as extensive as Siri. For example, I can say "weather in Morgantown" in Android and it will give me a weather forecast on a Google search page. Siri will give me the weather directly by speaking to me (which is really cool).

We need to phrase the input properly in Android. For example, I can't say "drive to Pittsburgh" in Android (this just runs a Google search for "drive to pittsburgh"). Instead, I have to say "Navigate to Pittsburgh", which is not so natural (to me).

However, the translate feature on Android is awesome.

Android 4.0 features an updated text-to-speech feature, making the API cleaner. I think this move will encourage more developers to bring more apps to the Android Market. Previously, I believe developers had to write it in C++ (unofficial API); now there is no need for that.

In any case, Siri is a great feature, but I'm a little skeptical about its success, as talking to a phone is not a natural interface. Google voice search and the voice dialer are great features that have been on Android since the days of Gingerbread, but I see very few people talking about them or using them extensively.

Andy Rubin also claims the same. They want to wait and see how the public reception of this feature will be.

But on the other hand, maybe Apple did the right thing by giving voice control in the iPhone 4S a name ("Siri") and calling it a personal assistant, or maybe it is just a marketing tactic by Apple.

I guess it's time for Google to make the same move.