Image Compression with K-Means & Clustering with SciPy

Dataset

  • For the first example, we use two images, image.jpg and image2.jpg, to test image compression.
  • For the second example, we use fromage.txt, a small dataset for testing clustering with the SciPy packages.

K-Means Algorithm

The K-means algorithm is an iterative algorithm that tries to partition the dataset into K pre-defined, distinct, non-overlapping subgroups (clusters), where each data point belongs to only one group. It tries to make the intra-cluster data points as similar as possible while keeping the clusters as different (far apart) as possible. It assigns data points to clusters such that the sum of the squared distances between the data points and their cluster's centroid (the arithmetic mean of all the data points that belong to that cluster) is at a minimum. The less variation we have within clusters, the more homogeneous (similar) the data points are within the same cluster.
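
The objective described above can be written compactly. The notation is introduced here for illustration and is not taken from the notebook: $C_k$ is the set of points in cluster $k$, $\mu_k$ is its centroid, and $x_i$ is a data point.

$$
J = \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^2,
\qquad
\mu_k = \frac{1}{|C_k|} \sum_{x_i \in C_k} x_i
$$

K-means looks for the assignment of points to clusters that makes $J$ as small as possible.
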
The K-means algorithm works as follows:

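  • Specify the number of clusters K.
  • Initialize the centroids by first shuffling the dataset and then randomly selecting K data points as the centroids, without replacement.
  • Keep iterating until the centroids stop changing, i.e. the assignment of data points to clusters no longer changes. Each iteration assigns every data point to its nearest centroid and then recomputes each centroid as the mean of the points assigned to it.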
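
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code from main.ipynb: the function names `kmeans` and `squared_distance`, the `max_iter` and `seed` parameters, and the six-pixel "image" are all assumptions made for the example. It also shows the idea behind image compression, which is to replace every pixel with the centroid colour of its cluster.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of tuples) into `k` groups.

    Returns (centroids, labels), where labels[i] is the index of the
    centroid assigned to points[i]. Hypothetical helper for illustration.
    """
    rng = random.Random(seed)
    # Pick K distinct data points as the initial centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# A made-up "image" of six RGB pixels: three reddish, three bluish.
pixels = [(250, 10, 10), (245, 5, 0), (255, 0, 5),
          (10, 10, 250), (0, 5, 245), (5, 0, 255)]

centroids, labels = kmeans(pixels, k=2)

# Compression: every pixel is replaced by its cluster's centroid colour,
# so the image needs only K colours plus one small label per pixel.
compressed = [tuple(round(c) for c in centroids[lab]) for lab in labels]
print(sorted(set(compressed)))  # [(5, 5, 250), (250, 5, 5)]
```

For a real image the pixel list would come from the image file, and K controls the trade-off between file size and colour fidelity. K-means can settle in a local minimum, so results on larger data may vary with the random initialization.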

Requirements

  • Python 3.7 (64-bit)

Installation

  • All required packages are listed in requirements.txt and are installed from PyPI.
  • Run this command in a notebook cell (drop the leading ! when running it in a terminal):
!pip install -r requirements.txt

Test

Congratulations.

  • Open the notebook main.ipynb and edit it as you want.

  • main.ipynb

Authors

Thank you.