Overview
To enhance observability and performance monitoring of the AI Subnet, we are implementing a Grafana Dashboard for Gateway Metrics. This dashboard offers critical insights into the AI-job broadcasting operations performed by the Gateway, which relays AI-job requests to orchestrators across the network. Livepeer Cloud currently provides a free-to-use gateway, enabling users to test the network's capabilities. Metrics from this pull request are already being collected by the public gateway and displayed on a Grafana dashboard, accessible to the entire community. 🚀
Because the public gateway is the largest gateway on the network, this dashboard also gives orchestrators an approximate view of the full network traffic. Additionally, Livepeer Cloud is developing a more comprehensive system-wide metrics portal, which can be explored further here.
We are calling on the community to help implement this dashboard, which increases the visibility of network activity. It is crucial for monitoring and optimizing AI-job requests and orchestrator performance, ensuring that the AI Subnet operates efficiently and effectively. 🔥
Required Skillset
Experience with Prometheus for metrics collection.
Bounty Requirements
Implementation: Develop a user-friendly Grafana Dashboard that is easy to set up and tailored for Gateways within the network. Then create a pull request to the Livepeer Grafana dashboards repository.
Functionality: The dashboard should include the following metrics:
ai_models_requested: Number of requests per model per pipeline.
ai_request_latency_score: Latency score per pipeline to assess orchestrator performance. This metric should indicate the average time taken for a request (e.g., a 1024x1024 image with 25 time steps).
ai_request_price: The price per unit charged by orchestrators for processing jobs.
ai_request_errors: The number of errors per pipeline, providing insight into the reliability of the pipelines.
Ticket_value_sent: The value of the AI tickets sent to the Orchestrators.
Tickets_send: The total number of tickets sent to the Orchestrators.
These dashboards will provide comprehensive insights into the public gateway requests on the AI network, enabling better optimization and resource allocation.
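As a starting point, the metric list above could be turned into a dashboard skeleton programmatically. The following is only a minimal sketch under several assumptions: the `livepeer_` metric prefix, the label names (`pipeline`, `model_name`), the counter/gauge split, and the 5-minute rate window are all illustrative guesses and should be checked against what the gateway actually exposes on its metrics endpoint. The metric names are copied from the list as written.

```python
import json

# Assumed Prometheus prefix; verify against the gateway's /metrics output.
PREFIX = "livepeer_"

# (panel title, metric name as written in the bounty, assumed kind, assumed labels)
PANELS = [
    ("AI requests per model and pipeline", "ai_models_requested", "counter", ("pipeline", "model_name")),
    ("Latency score per pipeline", "ai_request_latency_score", "gauge", ("pipeline",)),
    ("Price per unit", "ai_request_price", "gauge", ("pipeline",)),
    ("Errors per pipeline", "ai_request_errors", "counter", ("pipeline",)),
    ("Ticket value sent", "Ticket_value_sent", "counter", ()),
    ("Tickets sent", "Tickets_send", "counter", ()),
]


def build_expr(metric, kind, labels):
    """Build a PromQL expression: rate() for counters, raw value for gauges."""
    name = PREFIX + metric.lower()
    if kind == "counter":
        agg, inner = "sum", f"rate({name}[5m])"
    else:
        agg, inner = "avg", name
    if labels:
        return f"{agg} by ({', '.join(labels)}) ({inner})"
    return f"{agg}({inner})"


def build_dashboard(title="AI Gateway Metrics (sketch)"):
    """Return a Grafana-style dashboard dict with one time series panel per metric."""
    panels = []
    for i, (panel_title, metric, kind, labels) in enumerate(PANELS):
        panels.append({
            "id": i + 1,
            "type": "timeseries",
            "title": panel_title,
            # Two panels per row on Grafana's 24-column grid.
            "gridPos": {"h": 8, "w": 12, "x": (i % 2) * 12, "y": (i // 2) * 8},
            "targets": [{"refId": "A", "expr": build_expr(metric, kind, labels)}],
        })
    return {"title": title, "time": {"from": "now-6h", "to": "now"}, "panels": panels}


if __name__ == "__main__":
    print(json.dumps(build_dashboard(), indent=2))
```

The printed JSON is meant as a scaffold to import into Grafana and then refine by hand; a real dashboard would also need a data source reference and likely template variables for pipeline and model.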
Scope Exclusions
This bounty does NOT cover the addition of new metrics such as livepeer_current_ai_sessions_total. These additional metrics can be considered in a future bounty.
Implementation Tips
To understand how to work with Gateway metrics, you can refer to a recent pull request that deals with related functionality:
Pull Request #3087
Additionally, make sure to:
Leverage Existing Metrics: Review the existing metrics related to AI requests on the Gateway, which are already implemented on the Cloud Gateway Dashboard. This provides an initial template to start from.
Utilize Documentation: Check out the Livepeer forum for basic instructions on how to get started.
How to Apply
Express Your Interest: Comment on this issue to indicate your interest and explain why you're the ideal candidate for the task.
Wait for Review: Our team will review expressions of interest and select the best candidate.
Get Assigned: If selected, we'll assign the GitHub issue to you.
Start Working: Dive into your task! If you need assistance or guidance, comment on the issue or join the discussions in the #developer-lounge channel on our Discord server.
Submit Your Work: Create a pull request in the relevant repository and request a review.
Notify Us: Comment on this GitHub issue when your pull request is ready for review.
Receive Your Bounty: We'll arrange the bounty payment once your pull request is approved.
Gain Recognition: Your valuable contributions will be showcased in our project's changelog.
Thank you for your interest in contributing to our project! 💛
Warning
Please wait for the issue to be assigned to you before starting work. To prevent duplication of effort, submissions for unassigned issues will not be accepted.
Not quite - the dashboard I published is for Orchestrator node operators. For gateways we'll want to omit machine info and probably stick to more detailed panels for pipelines/models. They'll certainly have some shared panels, and of course I'll be happy to take this one on.