## Monday, May 23, 2011

### KNN classification algorithm

Hi all, in this post we discuss what the 'K-Nearest Neighbors' (KNN) algorithm is all about. If you ever come across a classification problem, KNN might be the simplest of all the classification algorithms you could possibly apply.

An implementation of the algorithm in Python is available here.

*(Figure: KNN example with blue squares, red triangles, and a green test point; credit: Wikipedia)*

The idea behind KNN is very intuitive. Assume we have two classes of training data (squares and triangles) and we need to assign a test point to one of these classes. So what we do is:

1. Choose k = 3.
2. Select a point from the test data (the green circle in the above image).
3. Find its 3 nearest neighbors.
4. Assign the test point to the class that occurs most often among those 3 neighbors (here 2 of the 3 are red triangles, so it is classified as a triangle).
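The four steps above can be traced on a toy 2-D data set. This is only an illustrative sketch: the coordinates and labels are made up, and Euclidean distance is computed with `math.dist` (Python 3.8+).

```python
import math
from collections import Counter

# Made-up training data: (coordinates, class label).
training = [((1.0, 1.0), 'square'), ((2.5, 2.5), 'square'),
            ((3.0, 3.5), 'triangle'), ((3.5, 3.0), 'triangle'),
            ((4.0, 4.0), 'triangle')]

test_point = (3.0, 3.0)   # the "green circle"
k = 3                     # step 1: choose k

# Step 3: find the k nearest neighbors by Euclidean distance.
neighbors = sorted(training,
                   key=lambda item: math.dist(item[0], test_point))[:k]

# Step 4: majority vote among the k neighbors.
labels = [label for _, label in neighbors]
predicted = Counter(labels).most_common(1)[0][0]
print(predicted)  # 2 triangles vs 1 square among the 3 neighbors
```

Here the three nearest neighbors are two triangles and one square, so the test point is classified as a triangle, mirroring the figure.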

Now, let us look at the method described above in algorithmic form.

In order for the classification to occur, we assume:
• the availability of labeled training data;
• a chosen value of k.

Algorithm:

Step 1: load the training data.
Step 2: initialize the value of k.
Step 3: for each test data item:
    for every training data item:
        calculate the distance between the two items
    arrange the distances in ascending order in a list
    select the first k distances in the sorted list
    assign the test item to the class which occurs the most among those k neighbors.
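The pseudocode above can be sketched as a small Python function. This is my own sketch, not the post's linked implementation; the function name and the example data are made up for illustration.

```python
import math
from collections import Counter

def knn_classify(training, test_item, k):
    """Classify one test item by a majority vote of its k nearest
    training items, using Euclidean distance."""
    # Calculate the distance from the test item to every training item,
    # then arrange the (distance, label) pairs in ascending order.
    distances = sorted(
        (math.dist(features, test_item), label)
        for features, label in training
    )
    # Select the first k distances and vote on their classes.
    top_k_labels = [label for _, label in distances[:k]]
    return Counter(top_k_labels).most_common(1)[0][0]

# Example: two classes in 2-D (made-up data).
training = [((0.0, 0.0), 'A'), ((0.1, 0.2), 'A'),
            ((1.0, 1.0), 'B'), ((0.9, 1.1), 'B')]
print(knn_classify(training, (0.2, 0.1), k=3))
```

Note that all the work happens at prediction time: there is no training step beyond storing the data, which is why KNN is often called a "lazy" learner.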

Pros:
• Intuitive.
• Simple to implement.
• Simple to understand.

Cons:

• Computationally expensive: every prediction requires computing the distance to all training items.