Graphical representation of a sample K Means Clustering classifier

Moving to the second lesson of this tutorial, I’ve learnt about the K Means Clustering classifier. Basically, we give the algorithm some points in space and it partitions them into K different sets. The algorithm is really easy; I suggest you read the tutorial for further information.
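
To give an idea of how little code the partitioning takes, here is a minimal sketch in Python. It is not the implementation linked in this post, just the textbook loop (often called Lloyd's algorithm) under a few assumptions of mine: points are 2D tuples, the starting centres are K points picked at random, and the names `kmeans`, `iterations` and `seed` are made up for the example.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """Partition 2D points into k sets (a plain sketch, not a tuned implementation)."""
    rng = random.Random(seed)
    # Assumption: start from k distinct points picked at random as the centres.
    centres = rng.sample(points, k)
    clusters = []
    for _ in range(iterations):
        # Assignment step: every point joins the set of its nearest centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centres[i][0]) ** 2 + (p[1] - centres[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: every centre moves to the mean of its set.
        new_centres = []
        for centre, cluster in zip(centres, clusters):
            if cluster:
                mean_x = sum(x for x, _ in cluster) / len(cluster)
                mean_y = sum(y for _, y in cluster) / len(cluster)
                new_centres.append((mean_x, mean_y))
            else:
                new_centres.append(centre)  # an empty set keeps its old centre
        if new_centres == centres:
            break  # nothing moved, so the partition is stable
        centres = new_centres
    return centres, clusters

# Two obvious groups of three points each.
sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centres, clusters = kmeans(sample, k=2)
for centre, cluster in zip(centres, clusters):
    print(centre, cluster)
```

The loop stops as soon as no centre moves, which for a small set of points like this usually happens after a handful of rounds.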

The only thing I want to explain here about this algorithm is that, given a certain dataset (in our case a set of points), you should already know how many sets to partition it into. Otherwise, the algorithm will still settle on a solution, but one that may not match the way the points are really grouped. To see this, try my little implementation (the link is below): make 6 sets of close points, then run the algorithm with K different from 6. You’ll understand why it’s important to have an accurate guess of the K value.
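
If you'd rather try the experiment in code, the following rough sketch does the same thing with made-up data. Everything in it is an assumption of mine rather than part of the linked implementation: the six groups are generated around hand-picked middle points, the starting centres are chosen by repeatedly taking the point farthest from the ones already chosen (swapped in for random picking so that the run is repeatable), and the result is judged by the total squared distance of the points from their centres.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two 2D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def spread_out_centres(points, k):
    """Assumed start: the first point, then repeatedly the point farthest from those chosen."""
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(dist2(p, c) for c in centres)))
    return centres

def kmeans_cost(points, k, iterations=50):
    """Run K Means, then return the total squared distance of points to their nearest centre."""
    centres = spread_out_centres(points, k)
    for _ in range(iterations):
        # Assign every point to its nearest centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centres[i]))
            clusters[nearest].append(p)
        # Move every centre to the mean of its set (empty sets stay where they are).
        centres = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else centres[i]
            for i, c in enumerate(clusters)
        ]
    return sum(min(dist2(p, c) for c in centres) for p in points)

def six_groups(seed=1, per_group=20):
    """Made-up data: six tight groups of points around well-separated middles."""
    rng = random.Random(seed)
    middles = [(0, 0), (20, 0), (40, 0), (0, 20), (20, 20), (40, 20)]
    return [
        (mx + rng.uniform(-1, 1), my + rng.uniform(-1, 1))
        for mx, my in middles
        for _ in range(per_group)
    ]

points = six_groups()
for k in (3, 6, 9):
    print("K =", k, "total squared distance =", round(kmeans_cost(points, k), 1))
```

With K = 3 some centres have to sit between two groups, so the total distance is far larger than with K = 6. With K = 9 the number can only be small, but some of the real groups get cut into pieces, so a small distance alone doesn't tell you the guess for K was right.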

I modified my recent K Nearest Neighbour implementation to use this algorithm.

Here’s the link.