Tuesday, May 24, 2011

KNN classifier in Python

You may have read my previous article on the KNN classifier. This is the next part, where we implement it in Python.

My other machine learning articles will be posted here.

The data set required to run this program can be found here: train.txt and test.txt.


DISCLAIMER: I DON'T OWN THE DATASET.

The above data set was DERIVED from the famous Iris flower dataset.


This program uses Matplotlib and NumPy.

If you have not installed them yet (on Ubuntu), install them using the following commands:

sudo apt-get install python-matplotlib
sudo apt-get install python-numpy
sudo apt-get install python-scipy


The program is as follows:



Output:



Plot:



If you look at the plot, you can see that as the value of k increases, accuracy increases up to a point and then stops improving (in fact, it decreases).
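To make the relationship between k and accuracy concrete, here is a minimal sketch of that kind of experiment. It is not the program from this post: it uses only the standard library instead of NumPy and Matplotlib, it runs on synthetic 2-D points rather than train.txt and test.txt, and the names knn_predict, accuracy and make_points are hypothetical helpers made up for illustration. The sweep starts at k=1, since k=0 means "no neighbours" and has no meaningful accuracy.

```python
import math
import random
from collections import Counter


def knn_predict(train, query, k):
    """Return the majority label among the k training points nearest to query."""
    # train is a list of (feature_tuple, label) pairs.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(1 for features, label in test
                  if knn_predict(train, features, k) == label)
    return correct / len(test)


def make_points(rng, centre, label, n):
    """Draw n noisy 2-D points around centre (synthetic stand-in data)."""
    return [((rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0)), label)
            for _ in range(n)]


# Assumption: three overlapping synthetic classes, 25 training points each.
rng = random.Random(0)
centres = {1: (0.0, 0.0), 2: (2.5, 0.0), 3: (1.25, 2.5)}
train = [p for label, c in centres.items() for p in make_points(rng, c, label, 25)]
test = [p for label, c in centres.items() for p in make_points(rng, c, label, 10)]

# Sweep odd values of k, starting from 1.
results = {k: accuracy(train, test, k) for k in range(1, 16, 2)}
for k, acc in results.items():
    print(f"k={k:2d}  accuracy={acc:.2f}")
```

The exact numbers depend on the data and the random seed, so treat the printed table as something to experiment with, not as a reproduction of the plot above.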

Please leave a comment if you find this article useful.

4 comments:

  1. Very nicely written code and good detailed comments.

    However, the plot is misleading since it starts from k=0 for which the accuracy=0. I feel it would be better to start the plot from k=1.
  2. Thank you for noticing it. I will fix it!
  3. Hey, can you explain the following code to me:
    for everyvalue in mn_list:
        (ss,sss)=everyvalue

        if(sss<=25):
            c1=c1+1

        elif(sss>25 and sss<=50):
            c2=c2+1

        elif(sss>50 and sss<=75):
            c3=c3+1

    What do the values 25, 50, 75 mean?

    Thank You
  4. If you open the data file, you will see that the first 25 feature vectors are of class 1, the next 25 are of class 2, and the next 25 are of class 3.
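The index-to-class mapping discussed in the last two comments can be sketched as below. This is only an illustration under stated assumptions: each neighbour is taken to be a (distance, index) pair, the index is assumed to be the 1-based row number in the training file, and class_from_index and vote are hypothetical names that do not appear in the original program.

```python
def class_from_index(index, per_class=25, n_classes=3):
    """Map a 1-based row number in the training file to its class (1, 2 or 3)."""
    if not 1 <= index <= per_class * n_classes:
        raise ValueError(f"index {index} is outside 1..{per_class * n_classes}")
    # Rows 1-25 -> class 1, rows 26-50 -> class 2, rows 51-75 -> class 3.
    return (index - 1) // per_class + 1


def vote(neighbours, per_class=25, n_classes=3):
    """Count neighbours per class and return (winning_class, counts)."""
    counts = [0] * n_classes
    for _distance, index in neighbours:  # assumption: (distance, row index) pairs
        counts[class_from_index(index, per_class, n_classes) - 1] += 1
    # On a tie, the lowest class number wins (a choice made for this sketch).
    return counts.index(max(counts)) + 1, counts


# Rows 3 and 12 belong to class 1, row 27 to class 2.
print(vote([(0.1, 3), (0.2, 27), (0.3, 12)]))  # → (1, [2, 1, 0])
```

Computing the class from the row number replaces the chain of if/elif range checks with one integer division, but it only works while the file keeps exactly 25 consecutive rows per class.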