-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Limit logging of soft errors (draft) #245
base: split-arch
Are you sure you want to change the base?
Conversation
- Revised earlier changes to limit logging of ALREADY_EXISTS errors. Defined ALREADY_EXISTS and NOT_FOUND as soft errors, and implemented IsSoftError() and IsSoftTdiError() predicates to check whether a status code is a soft error. Signed-off-by: Derek Foster <[email protected]>
Signed-off-by: Derek Foster <[email protected]>
Signed-off-by: Derek Foster <[email protected]>
- Renamed filtering macro to RETURN_IF_ERROR_WITH_FILTERING(). - Add filtering to calls to GetTableEntry(). Signed-off-by: Derek Foster <[email protected]>
Signed-off-by: Derek Foster <[email protected]>
|
||
// Checks whether a TDI status code is a soft error. | ||
static inline bool IsSoftTdiError(tdi_status_t err) { | ||
return (err == TDI_ALREADY_EXISTS) || (err == TDI_OBJECT_NOT_FOUND) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the definition of a soft error in this implementation? Why is "object/table not found" considered a soft error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
When I reviewed the updated HSD-ES ticket, I discovered that it contained log messages like this:
E20231201 00:28:53.970028 5140 tdi_table_manager.cc:624] Return Error: tdi_sde_interface_->GetTableEntry(device_, session, table_id, table_key.get(), table_data.get()) failed with StratumErrorSpace::ERR_ENTRY_NOT_FOUND: 'table->entryGet(*real_session->tdi_session_, *dev_tgt, flags, *real_table_key->table_key_, real_table_data->table_data_.get())' failed with error message: Object not found. E20231201 00:28:53.970144 5140 es2k_node.cc:278] StratumErrorSpace::ERR_AT_LEAST_ONE_OPER_FAILED: One or more read operations failed. E20231201 00:28:53.970456 5140 p4_service.cc:353] Failed to read forwarding entries from node 1: One or more read operations failed.
-
The commentary showed that they were also bothered about these issues (i.e., my previous commit had addressed only half their problem).
-
ALREADY_EXISTS and NOT_FOUND are two sides of the same coin. The conditions are easily recoverable, which is what makes them "soft" errors.
-
The customer is complaining about the fact that the intermediate code is logging, not just ALREADY_EXISTING, but ALREADY_EXISTING and NOT_FOUND. They are therefore complaining about our logging soft errors.
-
When TDI status codes are converted to Stratum status codes, TDI_TABLE_NOT_FOUND and TDI_OBJECT_NOT_FOUND are both mapped to ERR_ENTRY_NOT_FOUND. I therefore treat them as identical.
-
The idea is to reduce the amount of special-purpose code by treating all errors with this classification identically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Point taken. I've added some additional documentation to the header file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More to the point, there is code (in ovsp4rt, and perhaps elsewhere) that checks for the existence of a table entry before it proceeds. TDI does not provide an EntryExists() method, so the code does a read and checks the status code.
And think about it. If "entry already exists" is a soft error, why wouldn't "entry does not exist" be a soft error? They're complementary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The chief effect of this change is to go from saying that a specific status code (ALREADY_EXISTS) deserves special handling to saying that there's a class of status codes that deserve special handling, and ALREADY_EXISTS and NOT_FOUND are both members.
The other effect is to call these soft errors instead of appending "(may not be an error)" to the message text. I suppose I could have changed the text to "(note that the application may not consider this to be an error)", but calling it a "soft error" is more elegant.
Signed-off-by: Derek Foster <[email protected]>
|
||
// Checks whether a TDI status code is a soft error. | ||
static inline bool IsSoftTdiError(tdi_status_t err) { | ||
return (err == TDI_ALREADY_EXISTS) || (err == TDI_OBJECT_NOT_FOUND) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More to the point, there is code (in ovsp4rt, and perhaps elsewhere) that checks for the existence of a table entry before it proceeds. TDI does not provide an EntryExists() method, so the code does a read and checks the status code.
And think about it. If "entry already exists" is a soft error, why wouldn't "entry does not exist" be a soft error? They're complementary.
|
||
// Checks whether a TDI status code is a soft error. | ||
static inline bool IsSoftTdiError(tdi_status_t err) { | ||
return (err == TDI_ALREADY_EXISTS) || (err == TDI_OBJECT_NOT_FOUND) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The chief effect of this change is to go from saying that a specific status code (ALREADY_EXISTS) deserves special handling to saying that there's a class of status codes that deserve special handling, and ALREADY_EXISTS and NOT_FOUND are both members.
The other effect is to call these soft errors instead of appending "(may not be an error)" to the message text. I suppose I could have changed the text to "(note that the application may not consider this to be an error)", but calling it a "soft error" is more elegant.
|
||
// Special version of RETURN_IF_ERROR() that suppresses logging if this | ||
// is a soft error. | ||
#define RETURN_IF_ERROR_WITH_FILTERING(expr) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RETURN_IF_ERROR_LOG_IF_NOT_SOFT()
?
I keep trying to come up with a good name for this macro, to no avail.
Revised and expanded on the changes made in PR #240.
IsSoftError()
predicates to check whether a status code is a soft error.