Commit

cleanup: ensure clustering doesn't run on small datasets
densumesh authored and skeptrunedev committed Aug 13, 2024
1 parent 6fa7e50 commit 9960930
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions docker/clustering-script/get_clusters.py
@@ -161,6 +161,10 @@ def insert_centroids(
# Fetch data
data = fetch_dataset_vectors(client, dataset_id[0], 3000)

if len(data) < 30:
    print(f"Skipping dataset {dataset_id[0]} due to insufficient data")
    continue

# Perform HDBSCAN clustering
hdbscan = hdbscan_clustering(data)
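
The guard added in this hunk can be sketched in isolation as follows. This is a minimal illustration, not the repository's code: `datasets_to_cluster`, `MIN_POINTS`, and the in-memory `vectors_by_dataset` mapping are hypothetical names introduced here, and the real script fetches vectors through `fetch_dataset_vectors` rather than from a dict. Only the threshold of 30 and the skip message come from the diff.

```python
from typing import Dict, List, Sequence

# Threshold taken from the diff; the constant name is an assumption.
MIN_POINTS = 30


def datasets_to_cluster(
    vectors_by_dataset: Dict[str, Sequence[Sequence[float]]],
    min_points: int = MIN_POINTS,
) -> List[str]:
    """Return the ids of datasets that hold at least `min_points` vectors."""
    selected: List[str] = []
    for dataset_id, vectors in vectors_by_dataset.items():
        if len(vectors) < min_points:
            # Same guard as in the commit: report and move to the next dataset.
            print(f"Skipping dataset {dataset_id} due to insufficient data")
            continue
        selected.append(dataset_id)
    return selected


if __name__ == "__main__":
    # Hypothetical sample data: one dataset below the threshold, one above.
    sample = {
        "small": [[0.0, 1.0]] * 5,
        "large": [[0.0, 1.0]] * 40,
    }
    print(datasets_to_cluster(sample))
```

Because the check uses a strict `<`, a dataset with exactly 30 vectors is still clustered; only datasets with 29 or fewer are skipped.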

