Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reusable details combo items #12

Merged
merged 18 commits into from
Jan 21, 2018

Conversation

randallsquared
Copy link
Contributor

@randallsquared randallsquared commented Jan 18, 2018

This is a suggestion for collapsing the health response format and the details object format into the same definition, as I mentioned over in #8 .

It builds on and includes #7 .

@inadarei
Copy link
Owner

Sorry, I don't get how this is different from #7? The example JSON seems to be the same?

@randallsquared
Copy link
Contributor Author

It is similar, for sure. Here are the differences:

  • There's a "self" link on a details object which could be separately queried.
  • The connections and uptime have been converted to details, since they are metrics like CPU and memory usage.
  • The uptime metric was changed to be a number, since it has a metricUnits of "s" for "seconds", now.
  • The JSON name release_id was changed to releaseID for consistency with other camelCasing, including serviceID.
  • The JSON name componentId was changed to componentID for consistency with other camelCasing, including serviceID (while writing this, when I realized it was different).
  • The metricUnit values were changed to be standard abbreviations.
  • The typo of type for componentType was corrected.

For the spec:

  • The Details specification was merged with the base specification.
  • JSON names that were duplicated between them were removed.
  • The details
  • 4xx was removed from the mentioned HTTP status codes.
  • Definition of custom abbreviations like "mb" and "pb" was removed.

The visible similarity to your original reusable-details JSON is intentional, since I wasn't trying to make major changes to output, only simplify.

@inadarei
Copy link
Owner

Thank you, Randall. Looking at the rendered representation of your version to get a better sense: http://rawgit.com/randallsquared/rfc-healthcheck/reusable-details-combo-items/index.html

I think what bothers me a lot in details (vs. top-level properties) is how cumbersome it is to parse. If I want to find "uptime" I have to traverse entire array of details to find which one has the name "uptime"... OK, that has an obvious cure, tho, right? :)

If instead of an array, detail is a JSON object where object keys are "componentName:metricName" (and it can also be a single string for simple things like uptime, where component name and metric name are the same thing), the details object IMHO becomes way more usable.

To keep building on what you did, how does this look instead:

{
  "status": "pass",
  "version": "1",
  "releaseID": "1.2.2",
  "notes": [""],
  "output": "",
  "details": {
    "cassandra:responseTime": [
      {
        "componentId": "dfd6cf2b-1b6e-4412-a0b8-f6f7797a60d2",
        "componentType": "datastore",
        "metricValue": 250,
        "metricUnit": "ms",
        "status": "pass",
        "time": "2018-01-17T03:36:48Z",
        "output": ""
      }
    ],
    "cassandra:connections": [
      {
        "componentId": "dfd6cf2b-1b6e-4412-a0b8-f6f7797a60d2",
        "type": "datastore",
        "metricValue": 75,
        "status": "warn",
        "time": "2018-01-17T03:36:48Z",
        "output": "",
        "links": {"self": "http://api.example.com/dbnode/dfd6cf2b/health"}
      }
    ],
    "uptime": [
      {
        "componentType": "system",
        "metricValue": 1209600.245,
        "metricUnit": "s",
        "status": "pass",
        "time": "2018-01-17T03:36:48Z"
      }
    ],
    "cpu:utilization": [
      {
        "componentId": "6fd416e0-8920-410f-9c7b-c479000f7227",
        "node": 1,
        "componentType": "system",
        "metricValue": 85,
        "metricUnit": "percent",
        "status": "warn",
        "time": "2018-01-17T03:36:48Z",
        "output": ""
      },
      {
        "componentId": "6fd416e0-8920-410f-9c7b-c479000f7227",
        "node": 2,
        "componentType": "system",
        "metricValue": 85,
        "metricUnit": "percent",
        "status": "warn",
        "time": "2018-01-17T03:36:48Z",
        "output": ""
      }
    ],
    "memory:utilization": [
      {
        "componentId": "6fd416e0-8920-410f-9c7b-c479000f7227",
        "node": 1,
        "componentType": "system",
        "metricValue": 8.5,
        "metricUnit": "GiB",
        "status": "warn",
        "time": "2018-01-17T03:36:48Z",
        "output": ""
      },
      {
        "componentId": "6fd416e0-8920-410f-9c7b-c479000f7227",
        "node": 2,
        "componentType": "system",
        "metricValue": 5500,
        "metricUnit": "MiB",
        "status": "pass",
        "time": "2018-01-17T03:36:48Z",
        "output": ""
      }
    ]
  },
  "links": {
    "about": "http://api.example.com/about/authz",
    "http://api.example.com/rel/thresholds": "http://api.example.com/about/authz/thresholds"
  },
  "serviceID": "f03e522f-1f44-4062-9b55-9587f91c9c41",
  "description": "health of authz service"
}

I admit - I like this last version a lot :)

This was referenced Jan 20, 2018
@randallsquared
Copy link
Contributor Author

This does make it slightly easier to get a specific details object, assuming the clent already knows the component name and metric name they're looking for. It seems more complex, introducing a number of new rules for a slight bit of usability in a specific case where we do not want to iterate over the entire details. If that's an important use case, a separate JSON object could be used to index into the details object, providing the same ease of use, but with fewer rules the client must understand, since it could be ignored:

{
  "details_index": {
    "cassandra:responseTime": [0],
    "cassandra:connections": [1],
    ...
  }
}

(to be clear, I'm not advocating this; I don't really think it's needed)

If we don't mind iterating over the details, then it's just a few lines to build such an index or an object like the one described. It therefore seems to me that switching details to an object is only compelling if the details array is quite large.

"What new rules," you ask?

We'd need to describe the name and value structures, including

  • whether a leading colon is permissible,
  • whether a trailing colon is permissible,
  • how to handle a component name that has colons (since a component name is probably out of scope of this spec),
  • how to handle a metric name that has colons,
  • whether order of the detail objects conveys any meaning,
  • whether clients encountering a non-array-wrapped details object as the value should recover or fail,
  • and probably others I'm not thinking of in the moment.

@inadarei
Copy link
Owner

Yes, I hear you and agree with several of the concerns you listed. That said, I think we won't be able to design something simple and user-friendly without making some compromises. The underlying point is still that this is very specifically a health-check format, optimized for ease of use and interoperability. I understand that some edge-cases may get unattended, but is that bigger concern than making parsing more involved?

I think changes in details were steps in the right direction but as we made format more powerful, we have to also pay the tax of keeping it simple and keyed object vs.array seems to hit the balance. It is relatively of similar complexity to retrive "response.cpu" and response.details.["cpu:utilization"], but not the same: to traverse array and fish for needed value.

We certainly can keep debating this issue, but what I would like to do is to:

  1. accept this PR
  2. Start introducing keyed object

so we have that version to look at without this big issue overshadowing everything else. We are still in early-enough draft to move quickly.

@inadarei inadarei merged commit 394f2dc into inadarei:master Jan 21, 2018
@dret
Copy link
Contributor

dret commented Feb 22, 2018

any progress on this one? i am really interested because the issues of naming conventions, how to handle extensibility, and how to handle namespaces are pretty much always the same for many of these formats, but seem to be tackled from the ground up every single time.

@randallsquared randallsquared deleted the reusable-details-combo-items branch February 23, 2018 13:37
@randallsquared
Copy link
Contributor Author

@dret Since this PR was merged/closed, let's discuss this in some issue that's more focused. I see that #10 is already about registries and/or defining what extensibility means for healthcheck; maybe we need a separate issue to discuss naming and how names can be added?

@dret
Copy link
Contributor

dret commented May 14, 2018 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants