A high-performance distributed task scheduler that uses gRPC for efficient communication across multiple nodes, distributing and executing tasks in a scalable, fault-tolerant, and dynamic manner.
- Low Latency and High Throughput: Using gRPC and shared memory for efficient communication and data sharing.
- Scalability: Horizontal scaling by adding worker nodes.
- Advanced Scheduling Algorithms: Ensuring optimal task allocation and resource utilization.
- Fault Tolerance: Robust mechanisms for fault detection and task reassignment.
The system architecture includes the following components:
- Client: Submits tasks and handles asynchronous task responses.
- Worker: Executes tasks and sends heartbeat signals to the Register.
- Process Lifecycle Manager (PLM): Manages the task queue and task-worker mappings (see the service sketch after this list).
- Main Scheduler: Allocates tasks using advanced scheduling algorithms.
- Register: Maintains a registry of clients and workers, monitoring system health.
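
As a concrete illustration of how the PLM could expose task submission over gRPC, the sketch below shows a minimal server skeleton. It assumes a hypothetical `scheduler.proto` defining a `PLM` service with a `SubmitTask` RPC and `TaskRequest`/`TaskReply` messages; these names are illustrative and not taken from the project.

```cpp
// plm_server_sketch.cpp -- minimal skeleton; PLM, TaskRequest, and TaskReply
// come from an assumed scheduler.proto, not from the project itself.
#include <grpcpp/grpcpp.h>
#include <mutex>
#include <queue>
#include <utility>
#include "scheduler.grpc.pb.h"  // generated from the assumed proto

class PlmService final : public PLM::Service {
public:
    // Handles the submitTask RPC: assign an id, enqueue, and reply.
    grpc::Status SubmitTask(grpc::ServerContext*,
                            const TaskRequest* req,
                            TaskReply* reply) override {
        std::lock_guard<std::mutex> g(mu_);
        int id = next_id_++;
        queue_.push({req->priority(), id});  // max-heap keyed on priority
        reply->set_task_id(id);
        return grpc::Status::OK;
    }

private:
    std::mutex mu_;
    int next_id_ = 0;
    std::priority_queue<std::pair<int, int>> queue_;  // (priority, task id)
};

int main() {
    PlmService service;
    grpc::ServerBuilder builder;
    builder.AddListeningPort("0.0.0.0:50051",
                             grpc::InsecureServerCredentials());
    builder.RegisterService(&service);
    std::unique_ptr<grpc::Server> server = builder.BuildAndStart();
    server->Wait();  // serve until the process is terminated
}
```

In the actual system the queue lives in the shared-memory task heap described below rather than in a process-local `std::priority_queue`.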

- Task Submission: Client submits tasks to the PLM using the `submitTask` gRPC method (a client-side sketch follows these workflow steps).
- Task Assignment: Main Scheduler assigns tasks to workers.
- Task Execution and Result Reporting: Workers execute tasks and report results via heartbeat messages.
- Result Delivery: PLM sends task results to the client.
- Failure Handling: Register detects worker failures and informs PLM for task reassignment.
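
The matching client-side call for the task-submission step, under the same assumed proto definitions (shown synchronously for brevity; the real client handles responses asynchronously):

```cpp
// client_sketch.cpp -- illustrative; stub and message names follow the same
// hypothetical scheduler.proto as the server sketch above.
#include <grpcpp/grpcpp.h>
#include <iostream>
#include "scheduler.grpc.pb.h"

int main() {
    auto channel = grpc::CreateChannel("localhost:50051",
                                       grpc::InsecureChannelCredentials());
    std::unique_ptr<PLM::Stub> stub = PLM::NewStub(channel);

    TaskRequest req;
    req.set_priority(5);  // hypothetical priority field

    TaskReply reply;
    grpc::ClientContext ctx;
    grpc::Status status = stub->SubmitTask(&ctx, req, &reply);

    if (status.ok())
        std::cout << "task accepted, id = " << reply.task_id() << "\n";
    else
        std::cerr << "submitTask failed: " << status.error_message() << "\n";
}
```
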
- Shared Memory: Used for task and worker heaps, enabling quick access and updates.
- gRPC Communication: Provides efficient, low-latency communication between components.
- Initialization: Shared memory segments are created and mapped using POSIX system calls (see the sketch after this list).
- Data Structures: Priority queues for task and worker heaps.
- Synchronization: Mutexes and semaphores ensure data consistency.
- Memory Barriers: Prevent CPU memory operation reordering.
- Multiple Services: Separate gRPC services are dedicated to specific communication pathways.
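
To make the shared-memory points above concrete, here is a minimal sketch covering POSIX initialization, a process-shared mutex, and a release barrier. The segment name and layout (`/sched_heap`, `SharedHeap`) are assumptions for illustration, not the project's actual definitions.

```cpp
// shm_sketch.cpp -- illustrative only; compile with -lpthread (older glibc
// additionally needs -lrt for shm_open).
#include <fcntl.h>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <atomic>
#include <cstdio>

constexpr const char* SHM_NAME = "/sched_heap";  // hypothetical segment name
constexpr int MAX_TASKS = 1024;

struct TaskEntry { int id; int priority; };

// Layout of the shared segment: a process-shared mutex guarding a
// fixed-capacity array-based heap (std::priority_queue allocates on the
// process heap, so it cannot be placed in shared memory).
struct SharedHeap {
    pthread_mutex_t lock;
    std::atomic<int> size;  // atomic so readers observe a consistent count
    TaskEntry entries[MAX_TASKS];
};

int main() {
    // Create (or open) the shared memory object and size it.
    int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    if (ftruncate(fd, sizeof(SharedHeap)) < 0) { perror("ftruncate"); return 1; }

    auto* heap = static_cast<SharedHeap*>(
        mmap(nullptr, sizeof(SharedHeap),
             PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
    if (heap == MAP_FAILED) { perror("mmap"); return 1; }

    // The mutex must be PTHREAD_PROCESS_SHARED to synchronize across
    // processes; in practice only the creating process initializes it.
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&heap->lock, &attr);

    // Critical section: write one entry, then publish the new size.
    pthread_mutex_lock(&heap->lock);
    heap->entries[heap->size.load(std::memory_order_relaxed)] = {42, 7};
    heap->size.fetch_add(1, std::memory_order_release);  // release barrier:
    // the entry write above cannot be reordered past this increment.
    pthread_mutex_unlock(&heap->lock);

    munmap(heap, sizeof(SharedHeap));
    close(fd);
    return 0;
}
```
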
- Designed initial architecture and implemented basic task submission and worker assignment using gRPC.
- Enhanced task ID assignment and queue management, and integrated advanced scheduling algorithms.
- Replaced MPI with shared-memory reads, introduced centralized management, and enhanced the locking mechanisms.
- Moved task assignment to Main Scheduler, optimized shared memory usage.
- Implemented comprehensive gRPC communication, real-time monitoring, and sophisticated task reassignment strategies.
```sh
g++ -o plm plm.cpp heap.cpp -lpthread
./plm
```
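
These commands build only the PLM. The gRPC-enabled components additionally need stubs generated from the proto definitions and linkage against gRPC and protobuf; a typical invocation might look like the following (the file names `scheduler.proto` and `client.cpp` are assumed, not taken from the project's build scripts):

```sh
# Generate C++ message and service stubs from the proto definitions.
protoc -I. --cpp_out=. --grpc_out=. \
       --plugin=protoc-gen-grpc="$(which grpc_cpp_plugin)" scheduler.proto

# Build a gRPC component, linking via pkg-config.
g++ -o client client.cpp scheduler.pb.cc scheduler.grpc.pb.cc \
    $(pkg-config --cflags --libs grpc++ protobuf) -lpthread
```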
- Tanenbaum, A. S., & Van Steen, M. (2007). Distributed Systems: Principles and Paradigms (2nd ed.). Prentice Hall.
- The gRPC Authors. gRPC: A High-Performance, Open-Source Universal RPC Framework. Google. Available: https://grpc.io
- Stevens, W. R. (1999). UNIX Network Programming, Volume 2: Interprocess Communications (2nd ed.). Prentice Hall.
- Kerrisk, M. (2010). The Linux Programming Interface: A Linux and UNIX System Programming Handbook. No Starch Press.
- Google Inc. gRPC Core Concepts, Architecture and API. Available: https://grpc.io/docs/
- Love, R. (2013). Linux System Programming: Talking Directly to the Kernel and C Library (2nd ed.). O'Reilly Media.
For further details, refer to the full project report and codebase.