fix(robot-server): maintain correct order of protocol analyses #14762
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##             edge   #14762      +/-   ##
==========================================
+ Coverage   67.17%   67.19%   +0.02%
==========================================
  Files        2495     2495
  Lines       71483    71405      -78
  Branches     9020     8992      -28
==========================================
- Hits        48020    47984      -36
+ Misses      21341    21305      -36
+ Partials     2122     2116       -6
```

Flags with carried forward coverage won't be shown.
Yay!
```python
# We only check for matching RTP values if the given protocol ID
# (& hence its summary) exists. So this assertion should never be false unless
# somehow a protocol resource is created without an analysis record.
assert len(analyses) > 0, "This protocol has no analysis associated with it."
```
This can happen, can't it?
- Upload a protocol. This starts an analysis in the background.
- Shut off the robot before the analysis completes.
- Power on the robot. The protocol will exist without any analyses.
That looks right. Previously, that would result in the response providing an empty list of analyses, while now it would error out. I feel like raising an error is slightly better than just returning an empty list, but both approaches make the protocol resource unusable without having a way to force reanalysis.

Hmm... it might be better to return `False` here so that it triggers a new analysis. There shouldn't be any risk of corrupting the database if we do that, ya?
Yeah, returning `False` and triggering a new analysis seems like a good idea. No, I don't see it causing any database problems.
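As a minimal sketch of that idea (all names here are illustrative placeholders, not the actual robot-server code):

```python
# Illustrative sketch only: if no analysis record exists (e.g. the robot lost
# power before the first analysis completed), report "no match" so the caller
# starts a fresh analysis instead of asserting.
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class StoredAnalysis:
    run_time_parameter_values: Dict[str, Any] = field(default_factory=dict)


def matches_last_analysis(
    analyses: List[StoredAnalysis], new_values: Dict[str, Any]
) -> bool:
    if not analyses:
        return False  # No analyses on record: trigger a new analysis.
    return analyses[-1].run_time_parameter_values == new_values
```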
```python
assert (
    last_analysis.status != AnalysisStatus.PENDING
), "Protocol must not already have a pending analysis."
```
This can also trigger, I think?
- Upload a protocol. This starts an analysis in the background.
- Before analysis completes, quickly upload the same protocol.
[Edit: Oh, yeah, and this is causing this.]
OK so correct me if I'm wrong, but as far as I can tell, the background is this: You don't want to start a redundant analysis. To determine if an analysis is redundant, you need the overrides supplied to the last analysis, and the protocol's default values when no overrides are supplied. To get the protocol's default values, you need some analysis to have completed.
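If I'm reading that right, the redundancy check amounts to something like this sketch (hypothetical names, just restating my understanding, not the actual implementation):

```python
# Sketch: an analysis is redundant if the last analysis's effective parameter
# values equal the new request's overrides merged over the protocol defaults.
from typing import Any, Dict


def is_redundant_analysis(
    last_analysis_values: Dict[str, Any],
    protocol_defaults: Dict[str, Any],
    new_overrides: Dict[str, Any],
) -> bool:
    effective_new_values = {**protocol_defaults, **new_overrides}
    return last_analysis_values == effective_new_values
```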
If that's right, then it seems like a solvable problem.
What we want, ideally, is for the server to be able to get a protocol's available parameters and their default values quickly (say, within 1–2 seconds) without having to do a full analysis with `aspirate()`s and `dispense()`s and everything.

So when a client does `POST /protocols` (a rough code sketch follows the list):
- If there are no analyses for that protocol yet:
  - Start one.
- If the last analysis is completed:
  - And it has the same runtime parameters as this `POST` request: keep that one.
  - And it has different runtime parameters from this `POST` request: start a new one.
- If the last analysis is pending:
  - Wait until we have details about the protocol's runtime parameters, which should only take 1–2 seconds. Then, handle as above.
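Roughly, in code (every name here is an illustrative placeholder, not the real robot-server API):

```python
# Rough sketch of the proposed POST /protocols decision flow. The analyzer,
# analyses, and attribute names are placeholders for whatever the real
# store/runner objects would be.
from enum import Enum, auto


class AnalysisStatus(Enum):
    PENDING = auto()
    COMPLETED = auto()


async def handle_post_protocols(analyzer, analyses, protocol_id, rtp_values):
    if not analyses:
        # No analyses yet: start the first one.
        return await analyzer.start(protocol_id, rtp_values)

    last = analyses[-1]
    if last.status is AnalysisStatus.PENDING:
        # Wait briefly (~1-2 s) for the pending analysis to report the
        # protocol's runtime parameters, then compare as usual.
        last = await analyzer.wait_for_parameters(last)

    if last.run_time_parameter_values == rtp_values:
        return last  # Same parameters: keep the existing analysis.
    return await analyzer.start(protocol_id, rtp_values)  # Different: start a new one.
```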
This all seems readily possible to me. The run-time parameters Python Protocol API was designed for it, with its isolated `add_parameters()` function, which we can call without calling `run()`.

Does that seem doable?
That's a good idea. Maybe we can aim for it in the long term (or even medium term), but it's not doable in the current architecture of the protocol runner and analyzer. Even though `add_parameters()` is separate from `run()`, for a few reasons discussed previously we decided to make parsing and setting the actual parameters part of protocol execution (inside the simulating runner, in the case of analysis), as opposed to handling it inside an intermediate layer like the protocol reader. That means the RTP information is not available until the analysis is complete.

Even if we refactor the runner to make the protocol's RTP values accessible before the analysis is completed, the protocol analysis and analysis store still work on a binary concept of 'pending' vs. 'completed' analysis, where a protocol's info is only available in the CompletedAnalysisStore. For a 'pending with RTP values parsed' state, we would need to either add new functionality to the pending store or rework the analysis store's binary-state concept.
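For illustration only, that intermediate state could look something like this; these names are placeholders, not a proposal for the actual store API:

```python
# Sketch: a third state between "pending" and "completed" where the runtime
# parameters have been parsed but the full analysis hasn't finished.
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Any, Dict


class AnalysisState(Enum):
    PENDING = auto()
    PARAMETERS_PARSED = auto()  # RTP defaults/overrides known, commands not yet run.
    COMPLETED = auto()


@dataclass
class AnalysisRecord:
    state: AnalysisState = AnalysisState.PENDING
    run_time_parameter_values: Dict[str, Any] = field(default_factory=dict)
```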
Even once we make that change, so that the server can successfully cross-check against the most recent RTP values, we would still have to decide what happens when a new request uses different RTP values than a pending request: do we start another analysis right away and keep the last one running too, or do we cancel the previous analysis?

So there are a lot of changes we would need to make in order to implement the behavior you're describing.

I agree that it's better behavior, though. I'm going to make a ticket for it. But until then, I think it's totally fine to say that you need to wait for the current analysis to complete before re-uploading the protocol. I believe the app makes it nearly impossible to re-upload a protocol anyway, because of how runs are created, so I don't think it's a case you would run into easily.
OK, your call!
In that case, though:

- If you think this is going to be the HTTP-client-facing behavior for the medium term, I would try to get it to return an intentional 503 busy response instead of a 500 internal server error.
- In any case, I would change this to not be an `assert`, since that implies this condition will never happen except by `robot-server` bugs, whereas it really depends on client behavior.
  - If you're switching the response to HTTP 503 and settling in for this to be the HTTP API, a tri-state "yes"/"no"/"busy" return value seems appropriate (sketched below).
  - If you're keeping the response as an HTTP 500 and calling this a known bug, I actually think a `NotImplementedError` would be correct?
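For concreteness, a minimal sketch of that tri-state value and how the busy case could map to a 503, assuming a FastAPI-style handler (names are illustrative):

```python
# Illustrative sketch only: a tri-state result for the RTP comparison, where
# the "busy" case becomes an intentional 503 instead of a failed assert (500).
from enum import Enum, auto

from fastapi import HTTPException


class ParameterMatch(Enum):
    MATCHES = auto()  # Same RTP values: reuse the last analysis.
    DIFFERS = auto()  # New RTP values: start a new analysis.
    BUSY = auto()     # The last analysis is still pending.


def raise_if_busy(result: ParameterMatch) -> None:
    if result is ParameterMatch.BUSY:
        raise HTTPException(
            status_code=503,
            detail="A previous analysis for this protocol is still in progress.",
        )
```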
I like that. I'll go with the 503 busy response.
robot-server/tests/integration/http_api/protocols/test_analyses.tavern.yaml (outdated; resolved)
A question on another `assert` and on a Tavern test. Otherwise, this looks good to me. Thanks!
robot-server/tests/integration/http_api/protocols/test_analyses.tavern.yaml (outdated; resolved)
yeet
Closes AUTH-229
Overview
Updates the `/protocols` endpoints to always keep the list of analyses ordered with the most recently started analysis last, and verifies whether a new analysis needs to be triggered because of new run-time-parameter values for a previously uploaded protocol. To achieve that, this PR does the following things:
Test Plan
(Can use test protocol from this PR)
Review requests
Risk assessment
Medium. Performs a database update and fixes the analysis order that was broken by #14688.