UOMI Node AI is a core service for running a validator node on the UOMI blockchain: it manages the AI model required for validation operations. This repository contains everything needed to install and run that service.
- Automated installation of dependencies via install script
- Systemd service integration for reliable operation
- CUDA-optimized AI model execution
- Deterministic model outputs for consistent validation
- RESTful API endpoint for model interactions
- Real-time logging and monitoring capabilities
- CUDA-capable GPU(s)
- Ubuntu/Debian-based system
- Conda package manager
- Systemd (for service management)
- At least 64 GB of RAM recommended
- CUDA Toolkit 11.x or higher
# Remove any existing NVIDIA drivers
sudo apt-get purge nvidia-*
sudo apt-get update
sudo apt-get autoremove
# Install the 530-series driver
sudo apt-get install libnvidia-common-530
sudo apt-get install nvidia-driver-530
# Reboot, then verify the driver is loaded
nvidia-smi
- Clone the repository:
git clone https://github.com/your-repo/uomi-node-ai
cd uomi-node-ai
- Run the installation script:
chmod +x install.sh
./install.sh
- Configure the systemd service:
sudo cp uomi-ai.service /etc/systemd/system/
sudo nano /etc/systemd/system/uomi-ai.service # Edit paths as needed
- Enable and start the service:
sudo systemctl enable uomi-ai
sudo systemctl start uomi-ai
UOMI Node AI includes built-in monitoring capabilities that can send real-time performance data to a WebSocket endpoint.
To enable monitoring, set the following environment variables:
# Required: WebSocket URL to send monitoring data to
export MONITORING_WEBSOCKET_URL="ws://your-monitoring-server.com:8080/monitoring"
# Optional: Monitoring interval in seconds (default: 10)
export MONITORING_INTERVAL_SECONDS=15
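For reference, reading these variables from Python follows the usual os.environ pattern. This is an illustrative sketch, not the service's actual startup code; the variable names and the default of 10 seconds come from the documentation above:

import os

# Illustrative only: how the documented variables map to settings.
websocket_url = os.environ.get("MONITORING_WEBSOCKET_URL")  # monitoring stays disabled without it
interval_seconds = int(os.environ.get("MONITORING_INTERVAL_SECONDS", "10"))  # documented default: 10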
The monitoring service sends the following data every interval:
- System metrics: CPU usage, memory usage, disk usage
- CUDA metrics: GPU memory allocation, device information
- Request statistics: Total requests, average response time, tokens per second
- Service uptime: Days, hours, minutes, seconds since startup
- Garbage collection: Memory cleanup statistics
Example payload:

{
  "type": "monitoring",
  "timestamp": "2025-05-28T10:30:00.123456",
  "data": {
    "uptime": {
      "total_seconds": 3600,
      "days": 0,
      "hours": 1,
      "minutes": 0,
      "seconds": 0
    },
    "system": {
      "cpu_percent": 45.2,
      "memory": {
        "total_gb": 64.0,
        "used_gb": 32.1,
        "percent": 50.2
      }
    },
    "cuda": {
      "device_count": 2,
      "devices": [
        {
          "name": "NVIDIA RTX 4090",
          "memory_allocated": 12.5
        }
      ]
    },
    "requests": {
      "total_requests": 150,
      "average_request_time": 2.34,
      "average_tokens_per_second": 45.6
    }
  }
}
Use the included test WebSocket server:
# Install websockets dependency for testing
pip install websockets
# Start test server
./test_monitoring.sh
# In another terminal, start uomi-ai with monitoring
export MONITORING_WEBSOCKET_URL="ws://localhost:8080"
python uomi-ai.py
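Alternatively, if you want to watch the raw payloads yourself, a standalone receiver is a few lines of Python. This is an illustrative sketch (not the bundled test server) and assumes a recent websockets release, where the connection handler takes a single argument:

import asyncio
import json

import websockets  # pip install websockets

async def handle(connection):
    # Print a short summary of each monitoring payload as it arrives.
    async for message in connection:
        payload = json.loads(message)
        print(payload["type"], payload["timestamp"])

async def main():
    # Listen where MONITORING_WEBSOCKET_URL points (ws://localhost:8080 here).
    async with websockets.serve(handle, "localhost", 8080):
        await asyncio.Future()  # run until interrupted

asyncio.run(main())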
The service exposes an HTTP endpoint at http://localhost:8888/run that accepts POST requests with the following JSON structure:
{
  "model": "casperhansen/mistral-small-24b-instruct-2501-awq",
  "input": {
    "messages": [
      {
        "role": "system",
        "content": "System message here"
      },
      {
        "role": "user",
        "content": "User input here"
      }
    ]
  },
  "enable_thinking": true
}
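For example, a request can be sent from Python with the requests library. This is a minimal client sketch: the URL and request body follow the structure above, but the shape of the response (a JSON body) is an assumption rather than documented behavior:

import requests

payload = {
    "model": "casperhansen/mistral-small-24b-instruct-2501-awq",
    "input": {
        "messages": [
            {"role": "system", "content": "System message here"},
            {"role": "user", "content": "User input here"},
        ]
    },
    "enable_thinking": True,
}

# Model inference can take a while, so allow a generous timeout.
response = requests.post("http://localhost:8888/run", json=payload, timeout=300)
response.raise_for_status()
print(response.json())  # assumes the service replies with a JSON body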
The enable_thinking parameter is optional and defaults to true. It controls whether the model should use thinking mode in the chat template. You can also specify it within the input object:
{
  "model": "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B",
  "input": {
    "messages": [
      {
        "role": "system",
        "content": "System message here"
      },
      {
        "role": "user",
        "content": "User input here"
      }
    ],
    "enable_thinking": false
  }
}
Note: the only model that supports enable_thinking is deepseek-ai/DeepSeek-R1-0528-Qwen3-8B. For other models, this parameter is ignored.
The service is configured for optimal performance with:
- Deterministic model execution
- CUDA optimization settings
- Automatic GPU device selection
- Fixed random seeds for reproducibility (see the sketch after this list)
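The exact settings live in the service source, but a typical deterministic setup for a PyTorch-based model looks like the following sketch. It is illustrative only; the seed value is a placeholder, not the one the service uses:

import os
import random

import numpy as np
import torch

SEED = 42  # placeholder value for illustration

def make_deterministic(seed: int = SEED) -> None:
    # Seed every RNG the model stack touches.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Prefer deterministic cuDNN kernels over autotuned ones.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    # Some CUDA ops require this workspace setting when
    # deterministic algorithms are enforced.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
    torch.use_deterministic_algorithms(True)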
- The service runs on port 8888 by default
- Implement appropriate firewall rules if exposing the service
Contributions are welcome! Please feel free to submit a Pull Request.
To run tests, execute the following command:
python -m unittest discover -s tests -p "*_test.py"
If you encounter issues:
- Check CUDA availability:
nvidia-smi
- Verify conda environment:
conda env list
- Check service logs for errors:
journalctl -xe -u uomi-ai