🌟 Stardex: Explore GitHub Stars Intelligently. Stardex is a powerful web app that lets you search, filter, and cluster any GitHub user's starred repositories. Discover hidden patterns and find your next favorite project with intelligent, AI-powered exploration.

BjornMelin/stardex

⭐ Stardex - Explore GitHub Stars Intelligently

🚀 Discover patterns in your GitHub stars through machine learning

Stardex helps you explore and understand your GitHub starred repositories through advanced machine learning clustering and interactive visualizations.

✨ Features

  • 🔍 Smart Analysis: Machine learning-based clustering of repositories
  • 📊 Interactive Visualization: Dynamic D3.js visualization of repository clusters
  • ⚡ Real-time Processing: Fast data processing and clustering
  • 🔄 Efficient Data Flow: Optimized communication between services
  • 🛡️ Type Safety: Full TypeScript and Python type coverage
  • 🎨 Modern UI: Clean, responsive interface with Tailwind CSS
  • 📱 Mobile Ready: Fully responsive design for all devices

🛠️ Technology Stack

  • Frontend

    • Next.js 13 with App Router
    • React 18 with TypeScript
    • TanStack Query for data management
    • D3.js for visualizations
    • Tailwind CSS for styling
    • Shadcn/ui components
  • Backend

    • FastAPI for REST API
    • scikit-learn for ML operations
    • Poetry for dependency management
    • Pydantic for data validation

🔎 Detailed Features

Search & Filtering

  • Real-time repository search
  • Language-based filtering
  • Star count range filtering
  • Topic-based filtering
  • Date range filtering
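
The filters listed above can be combined as a chain of simple predicates. The following is a minimal sketch in Python, not the project's actual implementation (which may well live in the TypeScript frontend). The `RepoFilter` class and the `matches` and `filter_repos` helpers are hypothetical names introduced for illustration; the repository fields follow the request schema in the API Reference.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class RepoFilter:
    """Hypothetical filter options mirroring the list above."""
    query: str = ""                       # free-text search
    language: Optional[str] = None        # language-based filtering
    min_stars: int = 0                    # star count range (lower bound)
    max_stars: Optional[int] = None       # star count range (upper bound)
    topic: Optional[str] = None           # topic-based filtering
    updated_after: Optional[datetime] = None  # date range (lower bound)


def matches(repo: dict, f: RepoFilter) -> bool:
    """Return True if a repository dict passes every active filter."""
    text = f"{repo['name']} {repo.get('description') or ''}".lower()
    if f.query and f.query.lower() not in text:
        return False
    if f.language and (repo.get("language") or "").lower() != f.language.lower():
        return False
    stars = repo["stargazers_count"]
    if stars < f.min_stars or (f.max_stars is not None and stars > f.max_stars):
        return False
    if f.topic and f.topic not in repo.get("topics", []):
        return False
    if f.updated_after is not None:
        # Assumes GitHub-style UTC timestamps such as "2025-01-26T12:00:00Z".
        updated = datetime.strptime(repo["updated_at"], "%Y-%m-%dT%H:%M:%SZ")
        if updated < f.updated_after:
            return False
    return True


def filter_repos(repos: list[dict], f: RepoFilter) -> list[dict]:
    """Keep only the repositories that pass the filter."""
    return [r for r in repos if matches(r, f)]
```

For example, `filter_repos(repos, RepoFilter(language="python", min_stars=100))` would keep only Python repositories with at least 100 stars.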

AI Clustering

  • Multi-algorithm clustering approach:
    • K-means for broad repository grouping
    • Hierarchical clustering for detailed relationships
    • PCA + Hierarchical clustering for large datasets
  • TF-IDF vectorization for text analysis
  • Configurable clustering parameters
  • Performance metrics tracking
  • Efficient processing of large datasets
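
To make the three clustering options above concrete, here is a minimal scikit-learn sketch. It is an illustration under stated assumptions, not Stardex's backend code: the function name `cluster_texts`, the `method` strings, and all parameter defaults are hypothetical. It vectorizes short texts with TF-IDF and then applies K-means, Ward hierarchical clustering, or PCA followed by hierarchical clustering.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer


def cluster_texts(texts, n_clusters=3, method="kmeans", n_components=2, seed=0):
    """Cluster short repository texts and return one integer label per text."""
    if method not in ("kmeans", "hierarchical", "pca_hierarchical"):
        raise ValueError(f"unknown method: {method}")

    # TF-IDF yields a sparse matrix; K-means can consume it directly.
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(texts)
    if method == "kmeans":
        model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
        return model.fit_predict(tfidf).tolist()

    # The hierarchical paths work on a dense array in this sketch.
    dense = tfidf.toarray()
    if method == "pca_hierarchical":
        # Reduce dimensionality first, which keeps Ward linkage cheaper
        # on larger inputs.
        dense = PCA(n_components=n_components, random_state=seed).fit_transform(dense)
    model = AgglomerativeClustering(n_clusters=n_clusters, linkage="ward")
    return model.fit_predict(dense).tolist()
```

Calling `cluster_texts(descriptions, n_clusters=5, method="pca_hierarchical")` returns a list of cluster labels aligned with the input order. Densifying the TF-IDF matrix is fine for a sketch but would not scale to very large collections.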

Visualization

  • Interactive D3.js force-directed graph
  • Cluster-based coloring
  • Zoom and pan capabilities
  • Repository details on hover
  • Smooth animations and transitions

🏗️ Architecture

The application is structured as a monorepo with two main services:

🎨 Frontend Service (Next.js)

  • Located in /frontend
  • Built with Next.js, React, and TypeScript
  • Uses TanStack Query for data fetching
  • Implements a responsive UI with Tailwind CSS
  • Visualizes repository clusters using D3.js

⚙️ Backend Service (FastAPI)

  • Located in /backend
  • Built with FastAPI and Python
  • Implements advanced clustering using scikit-learn
  • Provides RESTful API endpoints
  • Efficient data processing with sparse matrices
  • Parallel processing capabilities

🚀 Getting Started

  1. Clone & Install:

    # Clone the repository
    git clone https://github.com/BjornMelin/stardex.git
    cd stardex
    
    # Install root dependencies
    npm install
    
    # Install frontend dependencies
    cd frontend
    npm install
    
    # Install backend dependencies
    cd ../backend
    poetry install
  2. Environment Setup:

    # Frontend (.env.local)
    NEXT_PUBLIC_API_URL=http://localhost:8000
  3. Development:

    # Run both services
    npm run dev
    
    # Or run individually
    npm run dev:frontend
    npm run dev:backend

🔌 API Reference

🔄 POST /api/cluster

Clusters GitHub repositories based on their features.

Request Body
{
  "repositories": [
    {
      "id": number,
      "name": string,
      "full_name": string,
      "description": string | null,
      "html_url": string,
      "stargazers_count": number,
      "forks_count": number,
      "open_issues_count": number,
      "size": number,
      "watchers_count": number,
      "language": string | null,
      "topics": string[],
      "owner": {
        "login": string,
        "avatar_url": string
      },
      "updated_at": string
    }
  ]
}
Response
[
  {
    "repo": {
      // Repository data (same as input)
    },
    "cluster_id": number,
    "coordinates": [number, number]
  }
]
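
As an illustration of calling this endpoint, the sketch below builds and sends the request using only the Python standard library. The helper names `build_cluster_request` and `cluster_repositories` are hypothetical, and the base URL is assumed to be the `http://localhost:8000` value from the environment setup above.

```python
import json
import urllib.request


def build_cluster_request(base_url: str, repositories: list[dict]) -> urllib.request.Request:
    """Build (but do not send) a POST /api/cluster request."""
    body = json.dumps({"repositories": repositories}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/cluster",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def cluster_repositories(base_url: str, repositories: list[dict],
                         timeout: float = 30.0) -> list[dict]:
    """Send the request and return the parsed list of clustered repositories."""
    request = build_cluster_request(base_url, repositories)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the backend running locally, `cluster_repositories("http://localhost:8000", repos)` should return a list of items, each carrying `repo`, `cluster_id`, and `coordinates` as described in the response schema.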

🏥 GET /health

Health check endpoint.

{
  "status": "healthy"
}
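
A health endpoint like this is typically polled before sending real work. The following is a small sketch of such a readiness check; `fetch_health` and `wait_until_healthy` are hypothetical helpers, and the retry count and delay are arbitrary assumptions.

```python
import json
import time
import urllib.request


def fetch_health(base_url: str, timeout: float = 2.0) -> dict:
    """Call GET /health and return the parsed JSON body."""
    url = f"{base_url.rstrip('/')}/health"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def wait_until_healthy(base_url: str, attempts: int = 10, delay: float = 0.5,
                       fetch=fetch_health) -> bool:
    """Poll the health endpoint until it reports "healthy" or attempts run out."""
    for _ in range(attempts):
        try:
            if fetch(base_url).get("status") == "healthy":
                return True
        except (OSError, ValueError):
            # OSError covers connection failures (urllib.error.URLError is a
            # subclass); ValueError covers a body that is not valid JSON.
            pass
        time.sleep(delay)
    return False
```

The `fetch` parameter exists only so the polling logic can be exercised without a running server.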

🧪 Development

🔬 Technical Implementation

The clustering process follows these steps:

  1. 📊 Feature Extraction

    • TF-IDF vectorization for text data
    • Repository metadata processing
    • Language and topic encoding
  2. 📉 Dimensionality Reduction

    • PCA for high-dimensional data
    • Configurable number of components
    • Efficient sparse matrix operations
  3. 🎯 Clustering

    • K-means for initial grouping
    • Hierarchical clustering with Ward linkage
    • PCA-enhanced hierarchical clustering for large datasets
  4. 🎨 Visualization

    • Interactive D3.js rendering
    • Cluster-based coloring
    • Smooth animations
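
The first three steps above can be sketched end to end as follows. This is an assumption-laden illustration rather than the project's real pipeline: the helper names `build_features` and `cluster_pipeline`, the choice of log-scaled star and fork counts as metadata, and the use of the first two PCA components as plot coordinates are all hypothetical. The output shape mirrors the `/api/cluster` response.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_features(repos):
    """Step 1: combine text, metadata, and language into one sparse matrix."""
    # Text data: description plus topics, vectorized with TF-IDF.
    texts = [f"{r.get('description') or ''} {' '.join(r.get('topics', []))}" for r in repos]
    text_features = TfidfVectorizer().fit_transform(texts)

    # Metadata: log-scale the heavy-tailed counts, then standardize.
    counts = np.array([[r["stargazers_count"], r["forks_count"]] for r in repos], dtype=float)
    meta_features = StandardScaler().fit_transform(np.log1p(counts))

    # Language encoding: one-hot, with a placeholder for missing values.
    languages = [[r.get("language") or "unknown"] for r in repos]
    language_features = OneHotEncoder(handle_unknown="ignore").fit_transform(languages)

    return hstack([text_features, csr_matrix(meta_features), language_features]).tocsr()


def cluster_pipeline(repos, n_clusters=2, seed=0):
    """Steps 2 and 3: reduce to 2-D with PCA, then group with K-means."""
    features = build_features(repos)
    # Densified here for simplicity; suitable for small inputs only.
    coords = PCA(n_components=2, random_state=seed).fit_transform(features.toarray())
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(coords)
    return [
        {"repo": repo, "cluster_id": int(label), "coordinates": [float(x), float(y)]}
        for repo, label, (x, y) in zip(repos, labels, coords)
    ]
```

Clustering on the reduced coordinates keeps the plotted positions and the cluster assignments consistent with each other, at the cost of discarding variance beyond the first two components.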

🛠️ Code Quality

  • 📝 Style Guides

    • Frontend: ESLint + Prettier
    • Backend: Black + isort
  • ✅ Testing

    • Frontend: Jest + React Testing Library
    • Backend: pytest
  • 🔄 Git Workflow

    • Feature branches
    • Pull request reviews
    • Semantic versioning

📈 Performance

⚡ Backend Optimizations

  • Efficient sparse matrix operations
  • Parallel processing capabilities
  • Memory-optimized data structures
  • Request validation & caching
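
The README does not say how caching is implemented, so the following is only one possible shape for it: a small in-memory LRU cache keyed by a hash of the submitted repository IDs. The class name `ClusterCache` and its methods are hypothetical.

```python
import hashlib
import json
from collections import OrderedDict


class ClusterCache:
    """Small in-memory LRU cache keyed by a hash of the request's repository IDs."""

    def __init__(self, max_entries: int = 128):
        self._entries = OrderedDict()
        self._max_entries = max_entries

    @staticmethod
    def key_for(repositories) -> str:
        # Sort the IDs so the same set of repositories maps to the same key
        # regardless of request order.
        ids = sorted(r["id"] for r in repositories)
        return hashlib.sha256(json.dumps(ids).encode("utf-8")).hexdigest()

    def get_or_compute(self, repositories, compute):
        """Return a cached result, or call compute(repositories) and store it."""
        key = self.key_for(repositories)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        result = compute(repositories)
        self._entries[key] = result
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used
        return result
```

Keying on IDs alone means a changed description or topic list would not invalidate a cached result; hashing the full payload would be stricter at the cost of more hashing work per request.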

🚀 Frontend Optimizations

  • Optimized D3.js rendering
  • React Query data caching
  • Component lazy loading

👨‍💻 Author

Bjorn Melin

📚 How to Cite

If you use Stardex in your research or project, please cite it as follows:

@software{melin2024stardex,
  author = {Melin, Bjorn},
  title = {Stardex: GitHub Stars Explorer},
  year = {2024},
  publisher = {GitHub},
  url = {https://github.com/BjornMelin/stardex},
  version = {1.0.0},
  description = {A machine learning-powered tool for exploring and understanding GitHub starred repositories through clustering and interactive visualizations}
}

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.


Built with ❤️ by [Bjorn Melin](https://bjornmelin.io)