
Effectively visualizing cluster flows and sizes for sequential cluster analyses using matplotlib.


johannesuhl/sequential_clustering_viz


Visualizing sequential clustering results


Many applications require cluster analysis of sequential or longitudinal data. While there are numerous approaches for sequential clustering, visual-analytical methods for illustrating clustering results are sparse. The script sequential_cluster_flows.py reads longitudinal data and, as an example, generates clusters for each temporal cross-section of the data. The number of instances per cluster at each point in time, as well as the number of instances transitioning between clusters at subsequent points in time, are then visualized using a network-based visualization technique built on matplotlib.
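The two quantities that drive the visualization (cluster sizes per point in time, and transitions between clusters at subsequent points in time) can be sketched as follows. This is a minimal illustration, not the code in sequential_cluster_flows.py: the function name `cluster_flows`, the input layout (one dict of instance id to cluster label per point in time), and the toy labels are all assumptions made for the example.

```python
from collections import Counter


def cluster_flows(labels_by_time):
    """Count cluster sizes and cluster-to-cluster transitions.

    labels_by_time: list of dicts {instance_id: cluster_label},
    one dict per point in time (hypothetical input layout).
    """
    # Number of instances per cluster, per point in time.
    sizes = [Counter(labels.values()) for labels in labels_by_time]

    # Number of instances moving from cluster a (at time t)
    # to cluster b (at time t+1), keyed by the pair (a, b).
    flows = []
    for prev, curr in zip(labels_by_time, labels_by_time[1:]):
        shared = prev.keys() & curr.keys()  # instances present at both times
        flows.append(Counter((prev[i], curr[i]) for i in shared))
    return sizes, flows


# Toy example: three instances observed at three points in time.
labels_by_time = [
    {"A": 0, "B": 0, "C": 1},
    {"A": 0, "B": 1, "C": 1},
    {"A": 1, "B": 1, "C": 1},
]
sizes, flows = cluster_flows(labels_by_time)
print(sizes[0])
print(flows[1])
```

In a network-style plot, the size counts could scale the nodes and the transition counts could scale the edge widths. Note that when each cross-section is clustered independently, the same label at two points in time does not necessarily denote the same cluster; the transition counts are what link clusters across time.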

The data used to demonstrate the visualization are 19 demographic characteristics reported for approx. 200 countries, from 1950 to present and projected up to the year 2100, in 5-year intervals (see United Nations 2019, https://population.un.org/wpp/Download/Files/1_Indicators%20(Standard)/CSV_FILES/WPP2019_Period_Indicators_Medium.csv). BIRCH clustering (Zhang et al. 1996) was used to derive the clusters (and the number of clusters) for each cross-section, using a range of thresholds that control the granularity of the cluster sequences (i.e., 0.1, 0.2, 0.3):

[Figures: cluster sequences for BIRCH thresholds of 0.1, 0.2, and 0.3]
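The per-cross-section BIRCH clustering described above can be sketched with scikit-learn's `Birch` estimator. This is an illustrative sketch under stated assumptions, not the repository's implementation: the function name `cluster_cross_sections`, the array layout (time, instance, feature), the min-max scaling step, the use of `n_clusters=None` (so that the threshold alone determines the number of clusters), and the random stand-in data are all assumptions.

```python
import numpy as np
from sklearn.cluster import Birch
from sklearn.preprocessing import MinMaxScaler


def cluster_cross_sections(data, threshold):
    """Cluster each temporal cross-section independently with BIRCH.

    data: array of shape (n_times, n_instances, n_features)
    (hypothetical layout). Returns one label array per point in time.
    """
    labels = []
    for X in data:
        # Assumption: scale features to [0, 1] so that the threshold
        # is comparable across features and cross-sections.
        X_scaled = MinMaxScaler().fit_transform(X)
        # n_clusters=None skips the global clustering step, so the
        # number of clusters follows from the threshold alone.
        model = Birch(threshold=threshold, n_clusters=None)
        labels.append(model.fit_predict(X_scaled))
    return labels


# Random stand-in data: 3 points in time, 50 instances, 4 features.
rng = np.random.default_rng(0)
data = rng.random((3, 50, 4))

for threshold in (0.1, 0.2, 0.3):
    labels = cluster_cross_sections(data, threshold)
    print(threshold, [len(np.unique(l)) for l in labels])
```

A smaller threshold generally yields more, finer-grained clusters per cross-section, which is what controls the granularity of the resulting cluster sequences.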

References:

United Nations, Department of Economic and Social Affairs, Population Division (2019). World Population Prospects 2019, Online Edition. Rev. 1. https://population.un.org/wpp/Download/Standard/CSV/

Zhang, T., Ramakrishnan, R., & Livny, M. (1996). BIRCH: an efficient data clustering method for very large databases. ACM SIGMOD Record, 25(2), 103-114.

Cluster sequences for BIRCH threshold of 0.2 enlarged:

[Figure: enlarged cluster sequences for a BIRCH threshold of 0.2]
