Clustering

Welcome to the clustering zone!

Upload a .csv file containing the data you would like to fit.

Data can be in any number of columns.

This module automatically tries to detect the best number of clusters for your dataset and then fits a clustering with that many clusters.

The "best number of clusters" is found using a simple heuristic: keep adding clusters until one more cluster no longer reduces the WCSS by at least 25%. The 25% threshold is totally arbitrary!
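For the curious, here is a minimal sketch of one way to read that heuristic. It is not the module's actual code: the function name pick_num_clusters, the min_drop parameter, and the WCSS numbers are all made up for illustration, and it assumes you already have one WCSS value for each cluster count starting at 1.

```python
def pick_num_clusters(wcss, min_drop=0.25):
    """Return the chosen k, where wcss[i] is the WCSS for k = i + 1 clusters.

    One reading of the heuristic: keep adding clusters while each extra
    cluster cuts the WCSS by at least `min_drop` (25% by default).
    """
    if not wcss:
        raise ValueError("need at least one WCSS value")
    for i in range(1, len(wcss)):
        prev, curr = wcss[i - 1], wcss[i]
        # Relative reduction when going from i clusters to i + 1 clusters.
        drop = (prev - curr) / prev if prev > 0 else 0.0
        if drop < min_drop:
            return i  # the extra cluster did not help enough, so keep i
    return len(wcss)  # every step helped, so use the largest k tried


# Made-up WCSS values for k = 1..6, for illustration only.
example_wcss = [1000.0, 400.0, 150.0, 130.0, 120.0, 115.0]
chosen_k = pick_num_clusters(example_wcss)
print(chosen_k)
```

With these made-up numbers, going from 3 to 4 clusters is the first step that shaves off less than 25%, so the sketch settles on 3.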

Note that since this is just a demo, it will probably not produce great clustering results. Clustering often requires fine-tuning and some subjective "art".

After submitting data, look at how sharp the elbow is on the WCSS (within-cluster sum of squares) plot. If it is not very sharp, your clusters are probably not very good.
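If the WCSS itself is unfamiliar: it adds up, over every point, the squared distance from that point to the centre (mean) of its own cluster. The sketch below computes it in plain Python for a tiny made-up dataset. The function name, the points, and the labels are illustrative assumptions, not how this module does it internally.

```python
def wcss(points, labels):
    """Sum of squared distances from each point to its cluster's centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dims = len(members[0])
        # The centroid is the per-dimension mean of the cluster's points.
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dims)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dims))
    return total


# Two obvious groups of points, far apart along the first axis.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
one_cluster = wcss(points, [0, 0, 0, 0])   # everything lumped together
two_clusters = wcss(points, [0, 0, 1, 1])  # split into the two groups
print(one_cluster, two_clusters)
```

A big drop in WCSS when moving to the "right" number of clusters, followed by only small drops afterwards, is what makes the elbow look sharp.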

Only numeric columns are used. Categorical variables, which would be one-hot encoded in a real-life problem, are simply dropped.
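As a rough idea of what dropping non-numeric columns can look like, here is a small sketch using only Python's standard library. It assumes a column counts as numeric when every value in it parses as a number. The module's real rule is not described here, and the function name and sample data are made up.

```python
import csv
import io


def numeric_columns(csv_text):
    """Return {column name: list of floats} for columns where every value is numeric."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    kept = {}
    for name in rows[0]:
        try:
            kept[name] = [float(row[name]) for row in rows]
        except (TypeError, ValueError):
            # A non-numeric or missing value: treat the column as categorical and drop it.
            continue
    return kept


# Made-up sample: two numeric columns and one categorical column.
sample = "height,weight,colour\n1.5,60,red\n1.8,80,blue\n"
result = numeric_columns(sample)
print(sorted(result))
```

Here the colour column is thrown away entirely, so any information it carried never reaches the clustering step.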

Ain't got no data?

The module may take a few seconds to run - please be patient!