blueprint_executor returned Zpool Not Found error when processing an expunged disk #7500
In case the rack has been reset by the time this ticket gets looked at, here is some data I dumped from the various database tables that capture dataset information:
The relevant lines from the sled-agent log:
The complete log file can be found at |
I think this is a bug in the conversion at omicron/nexus/types/src/deployment.rs, lines 952 to 963 in 6dbd673.
This doesn't filter out expunged datasets, so if the blueprint has a mix of in-service and expunged datasets (which the most recent blueprint on this system does, although it's kinda hard to confirm due to #7303), I believe the executor is trying to tell sled-agent about the expunged datasets even though it should skip them. It looks like the
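The filtering described above might look roughly like the following. This is a minimal sketch, not the actual omicron code: `Disposition`, `DatasetEntry`, and `in_service_datasets` are hypothetical, simplified stand-ins invented for illustration, and the real blueprint types carry many more fields.

```rust
// Hypothetical, simplified stand-ins for illustration only;
// these are NOT the real omicron blueprint types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Disposition {
    InService,
    Expunged,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct DatasetEntry {
    name: String,
    disposition: Disposition,
}

/// Return only the datasets that are still in service, so that expunged
/// datasets are never included in the request sent to sled-agent.
fn in_service_datasets(all: &[DatasetEntry]) -> Vec<DatasetEntry> {
    all.iter()
        .filter(|d| d.disposition == Disposition::InService)
        .cloned()
        .collect()
}
```

With a mix of in-service and expunged entries as input, only the in-service ones come back, which is the behavior the executor would need before building its request.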
Specifically on that second point: the original blueprint

    "720e2adf-a837-49af-9015-68af73122654": {
      "disposition": "in_service",
      "identity": {
        "vendor": "1b96",
        "model": "WUS4C6432DSP3X3",
        "serial": "A079DDE7"
      },
      "id": "720e2adf-a837-49af-9015-68af73122654",
      "pool_id": "3ef8333c-94a5-4650-a62a-9e440a1eabc6"
    },

I'm not sure off the top of my head why this disk is gone instead of present with
Yikes, it looks like dogfood is hitting this same problem:
This zpool matches the disk expunged in #7501.
#7308 is related - I thought the |
I updated the dublin environment to a build that has #7308. It is now throwing a different 500 error:
I generated a new blueprint and set it as the target, and it ran to successful completion. I'm not sure if this is the expected course of action. Ah, I just saw #7505, so I'll leave this ticket open and not regenerate or re-target blueprints on rack2, assuming that this other fix will take care of the bad blueprint.
I ran into this issue after upgrading the dublin racklet to a relatively recent omicron commit and attempting to expunge a disk that has a crucible zone and a pantry zone. The rack hadn't gone through any blueprint update previously.