K-means and EM from scratch. A short discussion of their differences and performance.

derekwayne/clustering

Clustering Methods

Purpose

The goal of this project is to take a closer look at the most common methods of grouping objects by similarity. The K-means and EM algorithms can both be used for this purpose, but they differ in their strengths and weaknesses.

Packages

The comparison uses data generated from the bivariate normal distribution. The packages used are listed below.

  1. MASS: For generating bivariate data.
  2. ggplot2: For beautiful plots.
  3. cluster: Functions for clustering.
  4. factoextra: ggplot2-compatible silhouette plots.
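As a rough illustration of how bivariate normal test data might be generated with MASS, here is a minimal sketch. The means, covariance matrix, sample size, and seed are assumptions chosen for illustration and are not taken from the project; only `mvrnorm()` is an actual MASS function.

```r
library(MASS)  # provides mvrnorm()

set.seed(1)  # arbitrary seed, for reproducibility only

# Hypothetical cluster parameters (not the project's values)
mu1 <- c(0, 0)
mu2 <- c(4, 4)
sigma <- matrix(c(1.0, 0.3,
                  0.3, 1.0), nrow = 2, byrow = TRUE)

n <- 100  # points per cluster (assumed)

# Draw n points from each bivariate normal and stack them
x <- rbind(mvrnorm(n, mu = mu1, Sigma = sigma),
           mvrnorm(n, mu = mu2, Sigma = sigma))
colnames(x) <- c("x1", "x2")

# Keep the true labels so clustering results can be checked later
sim <- data.frame(x, cluster = factor(rep(1:2, each = n)))
```

A data frame in this shape can be passed straight to ggplot2, for example with `x1` and `x2` as the axes and `cluster` as the colour.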

I wrote most of the important functions myself for instructive purposes. To view the project in a browser, visit the link beside the repository description in the code tab.
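To give a flavour of what a from-scratch K-means can look like, here is a minimal sketch of the standard Lloyd-style iteration in base R. It is not the code from this repository: the function name `simple_kmeans`, its arguments, and the toy data are all hypothetical, and it assumes more than one data point and squared Euclidean distance.

```r
# Minimal Lloyd-style K-means; `simple_kmeans` is a hypothetical name.
simple_kmeans <- function(x, k, max_iter = 100) {
  x <- as.matrix(x)
  # Initialise centres with k distinct rows chosen at random
  centers <- x[sample(nrow(x), k), , drop = FALSE]
  assignment <- integer(nrow(x))

  for (iter in seq_len(max_iter)) {
    # Squared Euclidean distance from every point to every centre (n x k)
    d <- vapply(seq_len(k), function(j) {
      rowSums(sweep(x, 2, centers[j, ])^2)
    }, numeric(nrow(x)))

    # Assignment step: each point goes to its nearest centre
    new_assignment <- max.col(-d, ties.method = "first")
    if (all(new_assignment == assignment)) break  # no change: converged
    assignment <- new_assignment

    # Update step: each centre becomes the mean of its assigned points
    for (j in seq_len(k)) {
      members <- assignment == j
      if (any(members)) {
        centers[j, ] <- colMeans(x[members, , drop = FALSE])
      }
    }
  }
  list(cluster = assignment, centers = centers)
}

# Toy data (assumed): two well-separated groups of 50 points each
set.seed(42)
pts <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
             matrix(rnorm(100, mean = 6), ncol = 2))
fit <- simple_kmeans(pts, k = 2)
table(fit$cluster)
```

The result can be compared with the built-in `kmeans()` from the stats package on the same data. Cluster labels may be permuted between the two, so compare the groupings rather than the label values.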
