AI-HEXAGON

⚠️ Early Development: This project is currently in its early development phase and not accepting external architecture submissions yet. Star/watch the repository to be notified when we open for contributions.

πŸ“Š View Live Leaderboard & Results

DOI

AI-HEXAGON is an objective benchmarking framework designed to evaluate neural network architectures independently of natural language processing tasks. By isolating architectural capabilities from training techniques and datasets, it enables meaningful and efficient comparisons between different neural network designs.

🎯 Motivation

Traditional neural network benchmarking often conflates architectural performance with training techniques and dataset biases. This makes it challenging to:

  • Isolate true architectural capabilities
  • Iterate quickly on design changes
  • Compare models fairly

AI-HEXAGON solves these challenges by:

  • πŸ” Pure Architecture Focus: Tests that evaluate only the architecture, removing confounding factors like tokenization and dataset-specific optimizations
  • ⚑ Rapid Iteration: Enable quick testing of architectural changes without large-scale training
  • πŸ› οΈ Flexible Testing: Support both standard benchmarking and custom test suites

🌟 Key Features

  • πŸ“Š Pure Architecture Evaluation: Tests fundamental capabilities independently
  • βš–οΈ Controlled Environment: Fixed parameter budget and raw numerical inputs
  • πŸ“ Clear Metrics: Six independently measured fundamental capabilities
  • πŸ” Transparent Implementation: Clean, framework-agnostic code
  • πŸ€– Automated Testing: GitHub Actions for fair, manipulation-proof evaluation
  • πŸ“ˆ Live Results: Real-time benchmarking results at ai-hexagon.dev

πŸ“ Metrics (The Hexagon)

Each architecture is evaluated on six fundamental capabilities:

Metric                     Description
🧠 Memory Capacity         Store and recall information from training data
πŸ”„ State Management        Maintain and manipulate internal hidden states
🎯 Pattern Recognition     Recognize and extrapolate sequences
πŸ“ Position Processing     Handle positional information within sequences
πŸ”— Long-Range Dependency   Manage dependencies over long sequences
πŸ“ Length Generalization   Process sequences longer than training examples

πŸ“ Project Structure

ai-hexagon/
β”œβ”€β”€ ai_hexagon/
β”‚   └── modules/          # Common neural network modules
└── results/              # Model implementations and results
    β”œβ”€β”€ suite.json        # Default test suite configuration
    └── transformer/
        β”œβ”€β”€ model.py      # Transformer implementation
        └── modules/      # Custom modules (if needed)

βš™οΈ Parameter Budget

The default suite enforces a 4 MB parameter memory budget for fair comparisons, so the allowed parameter count depends on numerical precision:

Precision   Parameter Limit
Complex64   0.5M params
Float32     1M params
Float16     2M params
Int8        4M params
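
Each limit above is simply the 4 MB budget divided by the byte width of one parameter at that precision (for example, 4,194,304 bytes / 4 bytes per Float32 parameter β‰ˆ 1M parameters). The sketch below shows how a submission's parameter count could be checked against that budget; it is illustrative only, and the helpers param_limit and count_params are not part of the framework.

import jax
import numpy as np

BUDGET_BYTES = 4 * 1024 * 1024  # 4 MB budget shared by all precisions


def param_limit(dtype) -> int:
    """Parameter-count limit implied by the 4 MB budget for a given precision."""
    return BUDGET_BYTES // np.dtype(dtype).itemsize


def count_params(params) -> int:
    """Total number of parameters in a Flax/JAX parameter pytree."""
    return sum(leaf.size for leaf in jax.tree_util.tree_leaves(params))


# param_limit(np.float32) == 1_048_576 (~1M); param_limit(np.int8) == 4_194_304 (~4M)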

🀝 Contributing

We welcome contributions once the project is ready for external input. To contribute:

  1. Fork: Create your own fork of the project
  2. Install: Run poetry install (optionally with --with dev,cuda12) to get the ai-hex command
  3. Implement: Add your model in results/your_model_name/
  4. Document: Include comprehensive docstrings and references
  5. Submit: Create a pull request following our guidelines
  6. Wait: CI will automatically evaluate your model and update the leaderboard

Use ai-hex tests list to see available tests, ai-hex tests show test_name to view test schema, and ai-hex suite run ./path/to/model.py to run your model against the suite.

πŸ”§ Technical Stack: JAX and Flax

We chose JAX and Flax for their:

  • 🧩 Functional Design: Clear architecture definitions with immutable state
  • ⚑ Custom Operations: Comprehensive support through jax.numpy
  • 🎯 Reproducibility: First-class random number handling

πŸ“ Code Style: Using einops

We mandate einops for complex tensor operations to enhance readability. Compare the same transformation expressed both ways:

# Traditional approach - hard to understand the transformation
b, t, h, c = x.shape
x = x.reshape(b, t, h * 2, c // 2)
x = x.transpose(0, 2, 1, 3)

# Using einops - crystal clear intent
x = rearrange(x, 'b t h (s d) -> b (h s) t d', s=2)

πŸ“– Example Model Implementation

import flax.linen as nn
from einops import rearrange

class Transformer(nn.Module):
    """
    Transformer Decoder Stack architecture from 'Attention Is All You Need'.
    Reference: https://arxiv.org/abs/1706.03762
    """
    hidden_dim: int = 256
    num_layers: int = 4
    num_heads: int = 4

    @nn.compact
    def __call__(self, x):
        # Architecture implementation
        return x
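
The body above is left elided. Below is a minimal sketch of what one decoder layer might contain, assuming x is already a (batch, sequence, hidden_dim) array and using standard Flax modules (nn.SelfAttention, nn.LayerNorm, nn.Dense); it is an illustration, not the repository's actual implementation.

import jax.numpy as jnp
import flax.linen as nn


class DecoderBlock(nn.Module):
    """Hypothetical pre-norm decoder block, for illustration only."""

    hidden_dim: int = 256
    num_heads: int = 4

    @nn.compact
    def __call__(self, x):
        # Causal self-attention with a residual connection
        mask = nn.make_causal_mask(jnp.ones(x.shape[:2]))
        x = x + nn.SelfAttention(num_heads=self.num_heads)(nn.LayerNorm()(x), mask=mask)
        # Position-wise feed-forward network with a residual connection
        h = nn.Dense(4 * self.hidden_dim)(nn.LayerNorm()(x))
        x = x + nn.Dense(self.hidden_dim)(nn.gelu(h))
        return x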

πŸ” Test Suite Configuration

Test suites use a JSON configuration format:

{
    "name": "General 1M",
    "description": "General architecture performance evaluation",
    "metrics": [
        {
            "name": "Memory Capacity",
            "description": "Information storage and recall capability",
            "tests": [
                {
                    "weight": 1.0,
                    "test": {
                        "name": "hash_map",
                        "seed": 0,
                        "key_length": 8,
                        "value_length": 64,
                        "num_pairs_range": [32, 65536],
                        "vocab_size": 1024
                    }
                }
            ]
        }
    ]
}
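
The weight field controls how much each test contributes to its metric, and each test is fully specified by its parameters (here seed, key_length, value_length, num_pairs_range, and vocab_size). For intuition, the sketch below generates the kind of key-to-value recall data the hash_map test describes; it only illustrates the test's intent and is not the framework's actual data generator.

import numpy as np


def make_hash_map_sample(seed=0, key_length=8, value_length=64,
                         num_pairs=32, vocab_size=1024):
    """Illustrative key -> value recall sample mirroring the hash_map parameters."""
    rng = np.random.default_rng(seed)
    keys = rng.integers(0, vocab_size, size=(num_pairs, key_length))
    values = rng.integers(0, vocab_size, size=(num_pairs, value_length))
    # The model is shown every (key, value) pair, then queried with one key
    # and scored on reproducing the matching value.
    context = np.concatenate([np.concatenate([k, v]) for k, v in zip(keys, values)])
    query = rng.integers(0, num_pairs)
    return context, keys[query], values[query]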

πŸ“ˆ Results are automatically generated via GitHub Actions to ensure fairness. The leaderboard is updated in real-time at ai-hexagon.dev.

Support the Project

If you find AI-HEXAGON helpful, consider buying me a coffee!

Buy Me A Coffee

πŸ“œ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ“š Citation

If you use AI-HEXAGON in your research, please cite it as:

@software{ai_hexagon_2024,
  author       = {Jirka Klimes},
  title        = {AI-HEXAGON: Neural Architecture Benchmarking Framework},
  month        = feb,
  year         = 2024,
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.14060642},
  url          = {https://doi.org/10.5281/zenodo.14060642}
}
