K-means vs. k-NN and the Elbow Method

1. The difference between classification and clustering

Classification: supervised learning with labels.

Clustering: unsupervised learning without labels. 

Classification and clustering are two learning methods that sort objects into groups based on one or more features. The two processes look similar, but they differ in the context of data mining. The main difference is that classification is a supervised learning technique, in which predefined labels are assigned to instances according to their properties, whereas clustering is an unsupervised learning technique, in which similar instances are grouped based on their features or properties.

2. The difference between k-means and k-NN

k-means: an unsupervised algorithm used for clustering.

k-NN: a supervised algorithm used for classification. 


3. K-NN algorithm

k-nearest neighbours (k-NN) needs labelled data to train on. Given that data, k-NN classifies a new, unlabelled data point by examining its k nearest labelled data points.


Steps

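  1. Compute the distance between the test point and each training point.
  2. Sort the training points by increasing distance.
  3. Select the K points with the smallest distances.
  4. Count how often each class appears among those K points.
  5. Return the most frequent class among those K points as the predicted class for the test point.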
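
The five steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the function name `knn_predict`, the toy data, and the choice of Euclidean distance are all assumptions made for the example, since the steps do not prescribe a particular distance metric.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    # Step 1: distance from the query to every training point
    # (Euclidean distance is an assumption of this sketch).
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Step 2: sort by increasing distance.
    distances.sort(key=lambda pair: pair[0])
    # Step 3: keep the k closest points.
    nearest = distances[:k]
    # Step 4: count how often each class appears among them.
    votes = Counter(label for _, label in nearest)
    # Step 5: return the most frequent class.
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated groups labelled "a" and "b".
train_points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_labels = ["a", "a", "a", "b", "b", "b"]

prediction_near_a = knn_predict(train_points, train_labels, (0.5, 0.5), k=3)
prediction_near_b = knn_predict(train_points, train_labels, (5.5, 5.5), k=3)
```

Note that there is no separate training phase here: the labelled data is simply stored, and all the work happens when a new point is classified.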

4. K-means algorithm

Steps

  1. Initially, randomly pick k centroids/cluster centers. Try to make them near the data but different from one another.
  2. Then assign each data point to the closest centroid.
  3. Move each centroid to the average location of the data points assigned to it.
  4. Repeat the preceding two steps until the assignments don’t change, or change very little.
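
As a rough sketch, the four steps above might look like this in plain Python. The function name `kmeans`, the `max_iters` and `seed` parameters, the toy data, and the use of Euclidean distance are assumptions of this example. Picking the initial centroids from the data points themselves is one simple way to keep them "near the data but different from one another"; other initialisation schemes exist.

```python
import math
import random

def kmeans(points, k, max_iters=100, seed=0):
    """Cluster `points` into k groups; returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Step 1: pick k distinct data points as the initial centroids
    # (an assumption of this sketch, not the only possible choice).
    centroids = [tuple(p) for p in rng.sample(points, k)]
    assignments = None
    for _ in range(max_iters):
        # Step 2: assign each point to its closest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 4: stop once the assignments no longer change.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Step 3: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )
    return centroids, assignments

# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignments = kmeans(points, k=2)
```

Unlike k-NN, no labels are involved: the cluster indices in `assignments` are arbitrary, and only the grouping they induce is meaningful.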

 
