Monday, August 10, 2015

very basic intro to Caffe

This tutorial just gives you a basic idea of what Caffe can do when you define your own network to solve a Computer Vision task. It does not cover advanced use of Caffe, such as defining your own layers or using the C++/Python API.

Caffe is a framework for building different neural network architectures, in which you can train, test and deploy your models. It can serve as a testing ground for deep neural nets (if you want to try different combinations of existing neural net models).

I'm assuming you already have a background in Machine Learning, Statistical Pattern Recognition and have built Neural Networks in the past using some programming language.

Caffe lets you create and test neural nets of different styles using prototxt files. You write text files specifying the architecture of a neural net, how the net should be solved (learning rate, number of iterations), where it should pick the data from, what it should generate as output, etc.

To use Caffe (not the Caffe API) you need to create a data file, a network definition and a solver parameter file.

Step1:
Here is how you create a dataset for Caffe from your images (there are multiple ways of doing this). If you want a black-box technique, use this one.

This was done on Ubuntu 14.04 with an NVIDIA GPU.

Things to do:
  1. Put all your training images in one folder. It should contain images from all classes.
  2. Put all your test images in another folder. It should also contain images from all classes.
  3. Create train.txt and test.txt listing the files in the respective folders. Each line of these two files should read "name_of_the_file.jpg <space> label_or_class_id".
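
The list file from item 3 can be generated with a short script. The following is only a sketch under one loud assumption that is not part of the steps above: every image is named "<class>_<anything>.jpg" (for example cat_001.jpg), so the class can be read off the file name. The names build_listing, write_listing and class_ids are made up for this illustration.

```python
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def build_listing(filenames, class_ids):
    """Return 'file_name label' lines, one per image.

    Assumes (hypothetically) that the class is the part of the
    file name before the first underscore, e.g. 'cat_001.jpg'.
    """
    lines = []
    for name in sorted(filenames):
        if not name.lower().endswith(IMAGE_EXTENSIONS):
            continue  # skip anything that is not an image
        class_name = name.split("_", 1)[0]
        lines.append("%s %d" % (name, class_ids[class_name]))
    return lines


def write_listing(image_dir, class_ids, out_path):
    """Write the listing for all images in image_dir to out_path."""
    lines = build_listing(os.listdir(image_dir), class_ids)
    with open(out_path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return len(lines)
```

You would call it once per folder, for example write_listing("train", {"cat": 0, "dog": 1}, "train.txt"). If your labels live somewhere else (a CSV, sub-folders), only the line that derives class_name needs to change.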

You then pass the image folder and the list file to Caffe's convert_imageset binary, which writes the images and labels into a database (LMDB by default).
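
As a rough sketch, the call to convert_imageset can be wrapped in Python like this. The tool expects the image root folder, the list file and the output database name, in that order; the --shuffle, --resize_height and --resize_width flags are optional. The path to the binary depends on how you built Caffe, so it is passed in as an argument here, and the function names are made up for this illustration.

```python
import subprocess


def convert_imageset_cmd(binary, image_root, list_file, db_path,
                         resize=None, shuffle=True):
    """Build the argument list for Caffe's convert_imageset tool."""
    cmd = [binary]
    if shuffle:
        cmd.append("--shuffle")
    if resize is not None:
        height, width = resize
        cmd += ["--resize_height=%d" % height,
                "--resize_width=%d" % width]
    # The tool joins root folder and file name as plain strings,
    # so the root folder needs a trailing slash.
    if not image_root.endswith("/"):
        image_root += "/"
    cmd += [image_root, list_file, db_path]
    return cmd


def convert(binary, image_root, list_file, db_path, **options):
    """Run the conversion; raises CalledProcessError if the tool fails."""
    subprocess.check_call(
        convert_imageset_cmd(binary, image_root, list_file, db_path, **options))
```

Run it once for the training set and once for the test set, so that you end up with two databases.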


Step2:
Now, you define your network layer by layer using the protobuf text format. More on that is available here. You will need to define the network for training, testing and deployment (the input and output layers change). In this setup, each of the three gets its own network definition in a separate file.

In the train and test network files, you specify where the data comes from (the train and test LMDB databases you created, and also the image mean files). You can also scale the pixel values for better performance (for example from the 0-255 range down to 0-1). Subtracting the image mean usually improves performance considerably compared to not using it.
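
To make this concrete, here is a small sketch that renders the input (data) layer of a train or test network file as prototxt text. The layer fields (transform_param with mean_file and scale, data_param with source, batch_size and backend) are standard Caffe fields; the file names, batch size and the helper name data_layer are placeholders picked for illustration. The mean file is assumed to have been produced beforehand, for example with Caffe's compute_image_mean tool.

```python
DATA_LAYER_TEMPLATE = """layer {{
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {{ phase: {phase} }}
  transform_param {{
    mean_file: "{mean_file}"
    scale: {scale}
  }}
  data_param {{
    source: "{source}"
    batch_size: {batch_size}
    backend: LMDB
  }}
}}
"""


def data_layer(phase, source, mean_file, batch_size, scale=1.0 / 256):
    """Render the input layer for the TRAIN or TEST phase.

    The default scale of 1/256 maps pixel values from 0-255 to
    roughly 0-1 after the mean has been subtracted.
    """
    if phase not in ("TRAIN", "TEST"):
        raise ValueError("phase must be 'TRAIN' or 'TEST'")
    return DATA_LAYER_TEMPLATE.format(
        phase=phase, source=source, mean_file=mean_file,
        batch_size=batch_size, scale=scale)


if __name__ == "__main__":
    # Placeholder paths; point them at your own LMDB and mean file.
    print(data_layer("TRAIN", "train_lmdb", "mean.binaryproto", 64))
```

The remaining layers (convolution, pooling, inner product, loss and so on) follow after this one in the same file.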

Step3:
Now, you define your solver. It contains the parameters that tell Caffe what kind of gradient descent to use, how many iterations to run, the learning rate, etc., and how frequently the progress of the training/testing should be printed to the screen (to see if the network configuration is performing as expected). In the solver file you also name the specific network file you want to use.
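
A solver file is just a list of "key: value" lines. The sketch below writes one out from Python, assuming separate train and test network files. The field names (train_net, test_net, test_iter, test_interval, base_lr, momentum, weight_decay, lr_policy, display, max_iter, snapshot, snapshot_prefix, solver_mode) are real solver fields; every value is only a placeholder, not a recommendation, and solver_text is a name invented for this example.

```python
def solver_text(train_net, test_net, max_iter, base_lr=0.01,
                display=100, test_iter=100, test_interval=500,
                snapshot=1000, snapshot_prefix="snapshots/net",
                use_gpu=True):
    """Render a minimal solver definition as prototxt text."""
    fields = [
        ("train_net", '"%s"' % train_net),   # network used for training
        ("test_net", '"%s"' % test_net),     # network used for testing
        ("test_iter", test_iter),            # batches per test pass
        ("test_interval", test_interval),    # test every N iterations
        ("base_lr", base_lr),                # learning rate
        ("momentum", 0.9),
        ("weight_decay", 0.0005),
        ("lr_policy", '"fixed"'),            # keep the learning rate constant
        ("display", display),                # print progress every N iterations
        ("max_iter", max_iter),              # total training iterations
        ("snapshot", snapshot),              # save weights every N iterations
        ("snapshot_prefix", '"%s"' % snapshot_prefix),
        ("solver_mode", "GPU" if use_gpu else "CPU"),
    ]
    return "".join("%s: %s\n" % (key, value) for key, value in fields)


if __name__ == "__main__":
    with open("solver.prototxt", "w") as f:
        f.write(solver_text("train.prototxt", "test.prototxt", max_iter=10000))
```

String values are quoted while numbers and the solver_mode enum are not, which is why the quoting is done by hand above.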


Once all three steps are completed, you can train the network and watch its performance as per the configuration defined in step 3.
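
Training is started with the caffe command-line tool and its train action, pointing it at the solver file. Below is a sketch of that call from Python; the location of the caffe binary depends on your build, so it is passed in, and train_cmd / train are names made up for this example.

```python
import subprocess


def train_cmd(caffe_binary, solver_file, gpu=None, weights=None):
    """Build the argument list for 'caffe train'."""
    cmd = [caffe_binary, "train", "--solver=%s" % solver_file]
    if gpu is not None:
        cmd.append("--gpu=%d" % gpu)          # which GPU to use
    if weights is not None:
        cmd.append("--weights=%s" % weights)  # start from existing weights
    return cmd


def train(caffe_binary, solver_file, **options):
    """Run training; progress is printed at the solver's display interval."""
    subprocess.check_call(train_cmd(caffe_binary, solver_file, **options))
```

The weights option is what you would use for fine-tuning from an existing .caffemodel file.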

In the end, you will get a .caffemodel file. You can use this file together with the deployment network file (defined in prototxt) in a simple C++/Python program that loads your own data and runs it through the network. For a sample of the Python interface, you can look at the sample ipynb files provided in the Caffe examples folder.
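
A minimal sketch of such a Python program, along the lines of the classification notebook that ships with Caffe, could look like this. It assumes that pycaffe is importable, that the deployment network's input blob is called "data" and that its output blob is called "prob"; those two names depend entirely on your deploy file. It also applies no mean subtraction or scaling, so you would have to add whatever preprocessing you trained with. top_k and classify are helper names invented here.

```python
def top_k(probs, k=5):
    """Return the k most likely (class_id, probability) pairs."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in order[:k]]


def classify(deploy_file, weights_file, image_path, k=5):
    """Run one image through a trained network and return the top k classes."""
    import caffe  # imported here so top_k works without Caffe installed

    net = caffe.Net(deploy_file, weights_file, caffe.TEST)

    # Convert the loaded image to the layout the network was trained on.
    transformer = caffe.io.Transformer({"data": net.blobs["data"].data.shape})
    transformer.set_transpose("data", (2, 0, 1))     # HxWxC -> CxHxW
    transformer.set_raw_scale("data", 255)           # [0, 1] -> [0, 255]
    transformer.set_channel_swap("data", (2, 1, 0))  # RGB -> BGR

    image = caffe.io.load_image(image_path)
    net.blobs["data"].data[...] = transformer.preprocess("data", image)
    probs = net.forward()["prob"][0]  # "prob" is an assumed blob name
    return top_k(probs, k)
```

The class ids that come back are the same label_or_class_id values you wrote into train.txt.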

If you want to visualize your network as a graph, you can use the Python script <Caffe_dir>/python/draw_net.py. It takes the network prototxt file and the name of the output image file as arguments.

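
A minimal sketch of that call, assuming the script sits in the python sub-folder of your Caffe checkout and that pydot and Graphviz are installed (the script needs both). The helper names and the example file names are made up; --rankdir is the script's option for the drawing direction.

```python
import os
import subprocess


def draw_net_cmd(caffe_dir, net_file, image_file,
                 rankdir="LR", python="python"):
    """Build the command line for Caffe's draw_net.py script."""
    script = os.path.join(caffe_dir, "python", "draw_net.py")
    # rankdir controls the layout direction, e.g. LR (left to right)
    # or TB (top to bottom).
    return [python, script, net_file, image_file, "--rankdir", rankdir]


def draw_net(caffe_dir, net_file, image_file, **options):
    """Render the network in net_file to image_file."""
    subprocess.check_call(
        draw_net_cmd(caffe_dir, net_file, image_file, **options))
```

For example, draw_net("/path/to/caffe", "train.prototxt", "net.png") would write the drawing to net.png.
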
P.S. If this seems too complex for you, you could install the NVIDIA DIGITS software. It provides a GUI through a web-browser interface that lets you do the above tasks and download the trained models in the end. You can set parameters for training and for the solver files, scale the input data, etc. You can also fine-tune models and customize popular models (like LeNet, AlexNet, GoogLeNet).
