AnalyticsMesh

Overview

Analyzing data spread across many computers is inherently difficult. AnalyticsMesh solves this problem by providing real-time insights using a local summary of your distributed data, which eliminates network round-trips and enables offline operations.

With AnalyticsMesh, nodes analyze data independently and seamlessly merge their insights into a unified view. This collaborative approach enables analysis of massive datasets and data streams while ensuring data remains accessible even when networks or devices fail.

Technical Details

A distinctive aspect of AnalyticsMesh is its novel enhancement of HyperLogLog, a probabilistic data structure, into a CRDT (Conflict-Free Replicated Data Type). This transformation enables efficient and scalable approximate count-distinct operations, which are essential for big data analytics.
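To make the CRDT idea concrete, here is a minimal sketch of a HyperLogLog whose merge is a register-wise maximum. It is not taken from the AnalyticsMesh source: the class name, the `p` precision parameter, and the use of a truncated SHA-1 hash are all assumptions made for illustration. The point is that taking the maximum of each register is commutative, associative, and idempotent, which is what lets the sketch behave as a state-based CRDT.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog whose merge is a state-based CRDT join."""

    def __init__(self, p: int = 12, registers=None) -> None:
        self.p = p                       # index bits; m = 2**p registers
        self.m = 1 << p
        self.registers = list(registers) if registers is not None else [0] * self.m

    def add(self, item: str) -> None:
        # Assumption: SHA-1 truncated to 64 bits stands in for the real hash.
        h = int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:8], "big")
        width = 64 - self.p
        index = h >> width                        # top p bits pick a register
        rest = h & ((1 << width) - 1)             # remaining bits
        rank = width - rest.bit_length() + 1      # 1-based position of first 1-bit
        self.registers[index] = max(self.registers[index], rank)

    def merge(self, other: "HyperLogLog") -> "HyperLogLog":
        # Register-wise max: commutative, associative, and idempotent.
        if self.p != other.p:
            raise ValueError("precision mismatch")
        merged = [max(x, y) for x, y in zip(self.registers, other.registers)]
        return HyperLogLog(self.p, merged)

    def count(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)     # bias constant, valid for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)   # small-range correction
        return raw


# Two nodes observe overlapping sets of users (10,000 distinct in total).
a, b = HyperLogLog(), HyperLogLog()
for i in range(0, 6000):
    a.add(f"user-{i}")
for i in range(4000, 10000):
    b.add(f"user-{i}")

assert a.merge(b).registers == b.merge(a).registers   # commutative
assert a.merge(a).registers == a.registers            # idempotent
print(round(a.merge(b).count()))                      # approximate distinct count
```

With `p = 12` the sketch uses 4,096 registers, so the typical relative error is on the order of a few percent. The merged estimate counts the overlapping users only once, because the merge operates on registers rather than adding the two estimates.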

Key features of AnalyticsMesh include:

  • Horizontally Scalable Architecture: AnalyticsMesh allows for easy scaling by adding more nodes, enhancing capacity and performance without disrupting existing operations.

  • Fault Tolerance: AnalyticsMesh withstands network partitions, disruptions, delays, and node failures, ensuring continuous data processing without data loss.

  • Strong Eventual Consistency: Because the HyperLogLog state is a CRDT, any two nodes that have received the same set of updates hold the same state, regardless of the order in which those updates arrived.

  • Tunable Durability and Atomicity: AnalyticsMesh allows tuning data durability and atomicity guarantees, adapting to various operational requirements.

  • Containerization Support: Containerization simplifies scalable deployment by providing consistent, isolated environments across platforms.
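The fault-tolerance and consistency properties above follow from the merge being insensitive to message order and duplication. The sketch below illustrates this with three hypothetical replicas whose register states are merged in every possible order, with each state redelivered one to three times. The node names and register values are invented for the example, and it does not model AnalyticsMesh's actual transport.

```python
from itertools import permutations


def merge(a, b):
    """Join two register states by taking the element-wise maximum."""
    return tuple(max(x, y) for x, y in zip(a, b))


# Hypothetical local states of three replicas (four registers each).
states = {
    "node-a": (3, 0, 5, 1),
    "node-b": (1, 4, 2, 1),
    "node-c": (0, 2, 7, 0),
}


def deliver(order, duplicates):
    """Merge states in the given order, redelivering each one `duplicates` times."""
    acc = (0, 0, 0, 0)
    for name in order:
        for _ in range(duplicates):
            acc = merge(acc, states[name])
    return acc


# Every delivery order and duplication count yields the same final state.
results = {
    deliver(order, dups)
    for order in permutations(states)
    for dups in (1, 2, 3)
}
assert results == {(3, 4, 7, 1)}
print(results)
```

Because reordered or repeated deliveries cannot change the outcome, a node that was partitioned away only needs to receive the other nodes' current states once the network recovers; no replay log or coordination step is required for convergence.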

Getting Started

Without Docker

Prerequisites:

  • Git
  • Apache Thrift compiler
  • Make
  • Python >= 3.12

  1. Install prerequisites (Ubuntu/Debian):
     sudo apt update
     sudo apt install -y git thrift-compiler make python3.12
  2. Clone the repository and navigate to the project directory:
     git clone https://github.com/asjadsyed/AnalyticsMesh
     cd AnalyticsMesh
  3. Install Python dependencies:
     python3 -m pip install -r requirements.txt
  4. Build the project:
     make
  5. Run the application:
     python3 ./src/main.py --help

With Docker

Prerequisites:

  • Git
  • Docker
  • Docker Compose

  1. Install prerequisites (Ubuntu/Debian):
     sudo apt update
     sudo apt install -y git
     sudo snap install docker
  2. Clone the repository and navigate to the project directory:
     git clone https://github.com/asjadsyed/AnalyticsMesh
     cd AnalyticsMesh
  3. Build Docker containers:
     docker-compose build
  4. Run the application:
     docker-compose run analytics-mesh --help

License

AnalyticsMesh is licensed under the Apache License 2.0. Refer to the LICENSE file for details.
