Cache gpu_alloc_map in Redis, and Add RescanGPUAllocMaps mutation #3392

Open
jopemachine opened this issue Jan 8, 2025 — with Lablup-Issue-Syncer · 0 comments
@jopemachine
Member

Motivation  

  • Since gpu_alloc_map lives on the agent, resolving this field requires an RPC call to that agent.
    In a production environment with many agents, querying the field repeatedly is inefficient and can significantly slow down response times.

Required Features

  • Cache gpu_alloc_map in Redis, and add a new mutation (RescanGPUAllocMaps) that launches a background task to refresh the cache.
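
A minimal sketch of the idea, not the actual implementation. It assumes a hypothetical key layout (`gpu_alloc_map.<agent_id>`), a hypothetical `fetch` coroutine standing in for the agent RPC, and an in-memory class standing in for the Redis client so the example runs without a server. All function and class names here are illustrative.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable, Dict, Optional, Sequence


def cache_key(agent_id: str) -> str:
    # Hypothetical key layout; the issue does not specify one.
    return f"gpu_alloc_map.{agent_id}"


class InMemoryCache:
    """Stand-in for a Redis client (assumption: only get/set are needed)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    async def set(self, key: str, value: str) -> None:
        self._data[key] = value

    async def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


async def rescan_gpu_alloc_maps(
    agent_ids: Sequence[str],
    fetch: Callable[[str], Awaitable[Dict[str, Any]]],
    cache: InMemoryCache,
) -> int:
    """Body of the background task: query every agent and cache the result."""

    async def scan_one(agent_id: str) -> None:
        alloc_map = await fetch(agent_id)  # the expensive per-agent RPC
        await cache.set(cache_key(agent_id), json.dumps(alloc_map))

    # Query all agents concurrently instead of one by one.
    await asyncio.gather(*(scan_one(a) for a in agent_ids))
    return len(agent_ids)


async def get_cached_gpu_alloc_map(
    agent_id: str, cache: InMemoryCache
) -> Optional[Dict[str, Any]]:
    """Field resolver path: read from the cache only, never call the agent."""
    raw = await cache.get(cache_key(agent_id))
    return json.loads(raw) if raw is not None else None
```

With this split, reads never touch the agents. Only the rescan task pays the RPC cost, and a cache miss returns `None` rather than falling back to a live query.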

Impact  

  • This lets us monitor the GPU usage of all agents with less overhead. The trade-off is that cached values can go stale, so the cache must be refreshed to get an accurate picture.

Testing Scenarios  

  • Verify that executing the cache update mutation triggers the background task, and that once the task completes, the gpu_alloc_map of every agent is cached.
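
The scenario above could be exercised roughly as follows. This is a sketch under assumptions: a plain dict stands in for Redis, `fake_agent_rpc` stands in for the agent call, and `rescan_mutation` is a hypothetical stand-in for the mutation that only schedules the task and returns a handle.

```python
import asyncio
from typing import Any, Dict


async def run_scenario() -> Dict[str, Any]:
    cache: Dict[str, Any] = {}  # stand-in for Redis
    agent_ids = ["agent-1", "agent-2", "agent-3"]
    started = asyncio.Event()

    async def fake_agent_rpc(agent_id: str) -> Dict[str, float]:
        await asyncio.sleep(0)  # yield control, as a real RPC would
        return {"cuda0": 1.0}

    async def background_rescan() -> None:
        started.set()  # lets the test observe that the task was triggered
        for agent_id in agent_ids:
            cache[agent_id] = await fake_agent_rpc(agent_id)

    def rescan_mutation() -> "asyncio.Task[None]":
        # Hypothetical mutation body: schedule the task and return at once.
        return asyncio.create_task(background_rescan())

    task = rescan_mutation()
    assert not cache  # the mutation returned before anything was cached

    # 1. The background task is triggered.
    await asyncio.wait_for(started.wait(), timeout=1)
    # 2. After completion, every agent has a cached entry.
    await asyncio.wait_for(task, timeout=1)
    assert set(cache) == set(agent_ids)
    return cache
```

Calling `asyncio.run(run_scenario())` runs both checks. A real test would replace the dict and the fake RPC with the project's own fixtures.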