miner.py: implement checks and error handling for failing axons #105

Open: mvds00 wants to merge 1 commit into main

@mvds00 mvds00 commented Aug 13, 2024

The miner template seems to assume that starting an axon never fails. In e2e tests (which use this template as their miner) the axon failed to start because bittensor uses nest_asyncio while uvicorn uses regular asyncio. It would be better to terminate the miner process when the axon never starts at all.

This patch addresses this by:

  • wrapping run() in a try/except (this is a must in any Python threading application)
  • signalling exceptions from worker to main thread in a thread safe manner
  • terminating the miner if starting the axon fails
  • monitoring and reporting on whether the axon still runs
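
The first three points can be sketched roughly as follows. This is a minimal illustration and not the code in this patch: `Worker`, `failing_start` and `run_miner` are hypothetical names, and the sketch only models the startup phase, assuming the main thread waits on a `threading.Event` and then inspects an exception stored by the worker.

```python
import threading


class Worker:
    """Hypothetical stand-in for the miner's background thread."""

    def __init__(self, work):
        self._work = work
        self.exception = None              # written by the worker, read by main
        self.finished = threading.Event()  # set once run() has returned
        self._thread = threading.Thread(target=self.run, daemon=True)

    def start(self):
        self._thread.start()

    def run(self):
        try:
            self._work()
        except Exception as exc:
            # Never let the thread die silently: keep the exception for main.
            self.exception = exc
        finally:
            self.finished.set()


def failing_start():
    """Simulates an axon that cannot start (illustrative only)."""
    raise RuntimeError("axon failed to start")


def run_miner(work, timeout=5.0):
    """Return an exit code: 0 on a clean start, 1 if startup raised or hung."""
    worker = Worker(work)
    worker.start()
    if not worker.finished.wait(timeout):
        return 1
    if worker.exception is not None:
        print(f"terminating miner: {worker.exception!r}")
        return 1
    return 0
```

Calling `run_miner(failing_start)` yields a non-zero exit code that the main thread could pass to `sys.exit()`. The event is set only after the exception has been stored, so the main thread never reads a half-written result.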

Whether to keep the miner running if axon issues arise later is another question. The code as-is indicates this is intended behavior ("# In case of unforeseen errors, the miner will log the error and continue operations."), so this is not changed.

This patch depends on another patch to bittensor that adds .is_running() and .exception to class axon.
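
To illustrate how the monitoring could use that interface, here is a hedged sketch. `StubAxon` is a hypothetical stand-in and not bittensor's axon class; it only assumes that `is_running()` is a method returning a bool and that `exception` is an attribute holding the failure, as named above. `monitor` is likewise an illustrative helper.

```python
import time


class StubAxon:
    """Stand-in for the axon; NOT the real bittensor class."""

    def __init__(self, fail_after):
        self._checks_left = fail_after  # healthy checks before a simulated failure
        self.exception = None

    def is_running(self):
        if self._checks_left <= 0:
            self.exception = RuntimeError("server thread exited")
            return False
        self._checks_left -= 1
        return True


def monitor(axon, max_checks, interval=0.0):
    """Poll the axon and collect report lines; the miner itself keeps running."""
    log = []
    for _ in range(max_checks):
        if not axon.is_running():
            log.append(f"axon stopped: {axon.exception!r}")
            break  # report once, then stop polling
        time.sleep(interval)
    return log
```

In this sketch a stopped axon is only reported, which matches the existing "log the error and continue operations" behavior rather than terminating the miner.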
