Wednesday, July 8, 2015

Pre-reqs for learning deep learning, unsupervised feature learning, neural networks

So, you are interested in understanding unsupervised feature learning, deep learning, convolutional networks, etc.? You have tried various tutorials, but you could not follow what the authors were talking about. This post lists most of the pre-reqs you will need to understand first (it is not an exhaustive list).

The best starting place on the web is Machine Learning on Coursera. It will get you started on what machine learning is all about. Next, try to understand the topics in the UFLDL tutorial. I think the UFLDL website isn't completely edited for the new version, so feel free to browse both the old and the new versions. If you are not so great at math, I would recommend this book as a good starting point for the basic calculus used in neural networks. It is also a good starting point for understanding backpropagation.

This is the order in which I studied them. If you know some of them already, feel free to skip them.

Pre-reqs to learn (try to learn the theory and how they work in practice):
  1. Linear regression (including squared error terms, mean error terms, the concept of regularization, L1 and L2 distance).
  2. Gradient descent (intuition for the difference between conjugate, stochastic, mini-batch, and batch/regular gradient descent). Awareness of the BFGS and L-BFGS optimization algorithms (you don't need to understand the details).
  3. Activation functions (sigmoid, tanh, ReLU) and their derivatives.
  4. Logistic regression (binary and multi-class classification).
  5. Softmax regression (multinomial logistic regression).
  6. Artificial Neural Networks (including feed forward back-propagation).
  7. SVM (with intuition of different kernels).
  8. Dimensionality reduction using LDA, PCA.
  9. Autoencoder (including the idea of denoising autoencoder and sparse autoencoder).
  10. Autoencoder with linear decoder (doesn't use a sigmoid like in regular autoencoder).
  11. Greedy layer wise training in deep learning.
  12. Stacked auto-encoder (how to combine multiple autoencoders with a softmax regression final layer).
  13. Convolution operation, Pooling (max, mean, stochastic).
  14. Convolutional Neural Networks.
  15. Restricted Boltzmann Machines.
  16. Deep Belief Networks.
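
To make the first two items concrete, here is a minimal sketch of batch gradient descent fitting a one-variable linear regression with a squared error term and an optional L2 regularization penalty. The function name, the learning rate, the step count, and the toy data are all made up for illustration; it is a sketch under those assumptions, not a tuned implementation.

```python
def fit_linear_regression(xs, ys, learning_rate=0.05, l2=0.0, steps=2000):
    """Fit y ~ w * x + b by batch gradient descent on the mean squared error.

    l2 is the strength of an L2 penalty on w (the bias b is not penalized).
    All default values here are illustrative, not recommendations.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Cost: (1/n) * sum((w*x + b - y)^2) + l2 * w^2
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs)) + 2.0 * l2 * w
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient, using the whole data set each time.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (invented for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_linear_regression(xs, ys)               # no regularization
w_reg, b_reg = fit_linear_regression(xs, ys, l2=1.0)  # slope shrinks toward 0
```

Without the penalty the fit recovers the slope and intercept of the toy line; with the penalty the slope is pulled toward zero, which is the basic intuition behind regularization. Stochastic and mini-batch variants differ only in using one sample or a small subset per step instead of the full data set.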

Convolution with filters obtained from autoencoders (unsupervised learning) helps Convolutional Neural Networks in unsupervised feature learning. Sometimes the convolution filters used are simple Gabor filters.
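
As a rough illustration of the convolution and pooling steps, the sketch below slides a small filter over an image and then max-pools the result. The filter here is a hand-written vertical-edge detector standing in for one learned by an autoencoder; the function names and the toy image are assumptions made for this example.

```python
def convolve2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum products.

    Like most CNN code, this is really cross-correlation: the kernel is
    applied as-is, without flipping it first.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; leftover border rows/columns are dropped."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# A 5x5 toy image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-written stand-in for a learned filter: responds to dark-to-bright edges.
kernel = [[-1, 1],
          [-1, 1]]

features = convolve2d_valid(image, kernel)  # 4x4 feature map
pooled = max_pool(features)                 # 2x2 after pooling
```

The feature map is large only where the edge sits, and pooling keeps that response while shrinking the map. Mean pooling would replace `max` with an average over the same window.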


Tuesday, December 16, 2014

OpenCV Essentials Review

OpenCV Essentials provides a good overview of what sort of applications can be built with the 2.4.9 release of OpenCV. There are a few references to what's coming in the 3.0 release. According to the official OpenCV website, 3.0 is to be released at the end of 2014. I like the way the author has provided a fair description of, and code for, the different 2-D feature detectors and descriptors in OpenCV. Cascade classification is also covered, and the author goes on to describe what can be used as cascade classifiers. Latent SVM is included in this description as well. Tracking and background subtraction using OpenCV are also covered, along with code samples.

Machine learning is also covered in this book, with code samples for classifiers such as KNN, Random Forest, and SVM.

I'm glad that the steps to configure the GPU (CUDA) and a couple of GPU-optimized OpenCV programs using NVIDIA CUDA are also provided.

If you are looking for a decent book on OpenCV that covers the most recent stable release (2.4.9), this is a good buy.

You can purchase it here.

Sunday, December 14, 2014

Programming a simple self driving robot

I had a lot of fun taking CpE 493M: Mobile Robotics this semester. We explored how to use various sensors and program them to build an autonomous robot that drives straight.

The hardware was pre-built and all we had to do was to program the robot using a MATLAB toolbox. More details on the hardware platform are provided here.

The robot has a bunch of ultrasonic sensors, an IMU sensor, a Kinect sensor, a magnetometer, and all the sensors (wheel encoders, bump sensors) of a programmable Roomba. The processing power is provided by an 11-inch ASUS X202E laptop with an Intel Core i3 processor.

We were a group of 5 students who were given the task of navigating the hallway of our building at WVU. We wrote a program to drive along the hallway. The hallway wasn't exactly straight, so even if we did dead reckoning, the robot would steer into a wall or go off the path. This is due to uneven friction on the wheels and other factors (such as cracks in the floor tiles).

My primary task was to design the module that would take images from the Kinect Sensor and give the right heading direction (to steer left or right). This data would be provided to a PID controller that controls the heading direction of the robot. The robot has a differential drive mechanism. This way we are able to control the heading direction by changing speeds of the wheels.
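
The course code was written with a MATLAB toolbox, so the following is only a Python sketch of the idea: a PID controller turns the heading error from the vision module into a correction, and the differential drive applies it as a speed difference between the wheels. The gains, the base speed, the sign convention, and every name here are assumptions for illustration, not the values we used on the robot.

```python
class PID:
    """Textbook PID controller acting on a scalar error."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None  # no derivative term on the first update

    def update(self, error, dt):
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def wheel_speeds(base_speed, correction):
    """Differential drive mixing: returns (left, right) wheel speeds.

    Assumed convention: a positive correction speeds up the left wheel and
    slows the right one, so the robot turns to the right.
    """
    return base_speed + correction, base_speed - correction


# Hypothetical gains; a real robot would need these tuned.
pid = PID(kp=0.5, ki=0.0, kd=0.1)

# Assumed convention: heading_error > 0 means the target is to the right.
heading_error = 0.1
left, right = wheel_speeds(0.2, pid.update(heading_error, dt=0.25))
```

With a positive heading error the left wheel ends up faster than the right, which steers the robot back toward the target heading.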

This is what the Kinect camera sees:

Then, we filter the red color from the rest of the image:
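
A minimal sketch of this kind of red filter, written in plain Python on a tiny hand-made frame. The threshold values and the rule itself (red channel bright and clearly above green and blue) are illustrative guesses, not the exact filter we ran on the Kinect images.

```python
def red_mask(image, min_red=150, margin=60):
    """Return a binary mask: 1 where a pixel looks strongly red, else 0.

    image is a list of rows, each row a list of (r, g, b) tuples in 0..255.
    A pixel counts as red when its red channel is at least min_red and
    exceeds both green and blue by at least margin (illustrative values).
    """
    return [
        [1 if r >= min_red and r - g >= margin and r - b >= margin else 0
         for (r, g, b) in row]
        for row in image
    ]


# Tiny made-up frame: red, gray, white, and a darker red pixel.
frame = [
    [(200, 30, 30), (90, 90, 90)],
    [(255, 255, 255), (180, 40, 50)],
]
mask = red_mask(frame)
```

Note that white pixels are rejected even though their red channel is at its maximum, because the red channel has to dominate the other two.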

Then, we apply blob detection to the image, filter the blobs, and identify their locations. The following blob labels, along with their locations, are obtained:
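
One simple way to do blob detection on a binary mask is connected-component labeling. The sketch below uses a breadth-first flood fill with 4-connectivity, drops blobs below a minimum area, and reports each blob's centroid. The function name, the area threshold, and the toy mask are assumptions for illustration; the toolbox we actually used may work differently.

```python
from collections import deque


def find_blobs(mask, min_area=2):
    """Find 4-connected regions of 1s in a binary mask.

    Returns a list of dicts with the blob's area and its centroid as
    (row, col). Blobs smaller than min_area are treated as noise.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Flood-fill one blob starting from this unvisited pixel.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            pixels = []
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            area = len(pixels)
            if area >= min_area:
                centroid = (sum(p[0] for p in pixels) / area,
                            sum(p[1] for p in pixels) / area)
                blobs.append({"area": area, "centroid": centroid})
    return blobs


# Toy mask: a 2x2 blob, a single noisy pixel, and a 1x2 blob.
mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
]
blobs = find_blobs(mask)
```

The isolated pixel is filtered out by the area threshold, leaving two blobs whose centroids serve as their locations.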

Then, we obtain a path by drawing 2 lines from the 2 cups at the bottom-left and the bottom-right and finding their intersection point. It's a naive technique, but it works well most of the time.
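
The intersection step can be sketched as below. I am assuming here that each line passes through the two nearest cups on one side of the hallway, and that the heading error is the horizontal offset of the intersection from the image center; both of these, along with the function names, the image width, and the cup coordinates, are assumptions for illustration.

```python
def line_intersection(p1, p2, p3, p4):
    """Intersection of the line through p1, p2 with the line through p3, p4.

    Points are (x, y) tuples. Returns None when the lines are parallel.
    """
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = p3
    x4, y4 = p4
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if denom == 0:
        return None
    det12 = x1 * y2 - y1 * x2
    det34 = x3 * y4 - y3 * x4
    x = (det12 * (x3 - x4) - (x1 - x2) * det34) / denom
    y = (det12 * (y3 - y4) - (y1 - y2) * det34) / denom
    return x, y


def heading_offset(target_x, image_width):
    """Normalized horizontal offset in [-1, 1]; 0 means dead ahead."""
    half = image_width / 2.0
    return (target_x - half) / half


# Made-up cup centroids (x, y) in a 640-pixel-wide image.
left_cups = [(100, 400), (220, 300)]
right_cups = [(540, 400), (420, 300)]

target = line_intersection(left_cups[0], left_cups[1],
                           right_cups[0], right_cups[1])
offset = heading_offset(target[0], image_width=640)
```

In this symmetric toy layout the two lines meet on the vertical center line of the image, so the offset is zero and no steering correction is needed. A nonzero offset would be the error handed to the heading controller.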

So, we do path correction after every few frames (5 frames). The data is collected at 20-30 frames per second.

We are then able to correct the path of the robot every few seconds. Here is a video of our robot driving (I'm sorry for the shaky video):