Switch out /service/etcd service endpoint for /service/storage/status to support multiple backends #43

Merged
merged 12 commits into from
Aug 6, 2024

Conversation

synackd
Collaborator

@synackd synackd commented Aug 5, 2024

Introduces one of multiple fixes for #20.

BSS used to support only the etcd backend, so having a /service/etcd endpoint made sense. However, now that a PostgreSQL backend is also supported, a more generic endpoint makes more sense.

This endpoint is now changed to /service/storage/status, and the response JSON now includes which backend is being used (i.e. "etcd" or "postgres") as well as the status of the connection to that backend.

So instead of:

# curl -H "Authorization: Bearer $ACCESS_TOKEN" https://foobar.openchami.cluster/boot/v1/service/etcd

One now queries /boot/v1/service/data. Here is what a successful query looks like:

# curl -H "Authorization: Bearer $ACCESS_TOKEN" https://foobar.openchami.cluster/boot/v1/service/data
{"bss-storage-backend":{"name":"postgres","status":"connected"}}
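For a client consuming this response, a minimal Go sketch of decoding it might look like the following. The type and function names (`StorageBackend`, `StorageStatusResponse`, `parseStorageStatus`) are assumptions made for illustration and are not taken from the BSS source; only the JSON keys come from the output above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StorageBackend mirrors the "bss-storage-backend" object shown above.
// The Go type name is hypothetical.
type StorageBackend struct {
	Name   string `json:"name"`   // e.g. "etcd" or "postgres"
	Status string `json:"status"` // e.g. "connected" or "error"
}

// StorageStatusResponse is a hypothetical client-side wrapper type.
type StorageStatusResponse struct {
	Backend StorageBackend `json:"bss-storage-backend"`
}

// parseStorageStatus decodes the raw response body into the struct above.
func parseStorageStatus(body []byte) (StorageStatusResponse, error) {
	var resp StorageStatusResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return resp, fmt.Errorf("decoding storage status: %w", err)
	}
	return resp, nil
}

func main() {
	body := []byte(`{"bss-storage-backend":{"name":"postgres","status":"connected"}}`)
	resp, err := parseStorageStatus(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("backend=%s status=%s\n", resp.Backend.Name, resp.Backend.Status)
}
```

Keeping the backend name in the payload means a client does not need to know in advance which backend a given BSS instance is configured with.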

In the logs (with debug enabled):

2024/08/05 17:04:06 [bss/pMoC8hss4a-000288] "GET http://foobar.openchami.cluster/boot/v1/service/data HTTP/1.1" from 172.16.0.253 - 200 65B in 886.28µs                                                           
2024/08/05 17:04:06 Test access to postgres using BootParamsGetAll() succeeded
2024/08/05 17:04:06 DEBUG: Boot parameters returned: [{[] [b4:2e:99:a6:06:47] [] nomodeset ro root=live:http://10.100.0.1:9000/boot-images/compute/slurm/rocky8.10-compute-slurm-base-latest ip=dhcp overlayroot=tmpfs overlayroot_cfgdisk=disabled apparmor=0 selinux=0 console=tty0 console=ttyS0,115200 ip6=off ds=nocloud-net;s=http://10.100.0.1:8000/cloud-init/compute/slurm/ http://10.100.0.1:9000/boot-images/efi-images/compute/slurm/vmlinuz-4.18.0-553.5.1.el8_10.x86_64 http://10.100.0.1:9000/boot-images/efi-images/compute/slurm/initramfs-4.18.0-553.5.1.el8_10.x86_64.img {map[] map[] {      }}}] 

And here is what an unsuccessful query looks like, e.g. if Postgres goes down:

# curl -H "Authorization: Bearer $ACCESS_TOKEN" https://foobar.openchami.cluster/boot/v1/service/data
{"bss-storage-backend":{"name":"postgres","status":"error"}}

In the log:

2024/08/05 17:04:59 [bss/pMoC8hss4a-000300] "GET http://foobar.openchami.cluster/boot/v1/service/data HTTP/1.1" from 172.16.0.253 - 500 61B in 1.443017ms                                                         
2024/08/05 17:04:59 Test access to postgres failed: BootParamsGetAll(): postgres.GetBootParamsAll: Unable to query database: dial tcp: lookup postgres on 127.0.0.11:53: no such host
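The behaviour shown above (HTTP 200 with "connected" when the backend check succeeds, HTTP 500 with "error" when it fails) could be sketched roughly as below. This is not the actual BSS handler: `storageStatusHandler`, `probeUp`, `probeDown`, and `query` are hypothetical names, and the `probe` function only stands in for whatever check BSS performs (the logs mention `BootParamsGetAll()`).

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type backendStatus struct {
	Name   string `json:"name"`
	Status string `json:"status"`
}

type storageStatus struct {
	Backend backendStatus `json:"bss-storage-backend"`
}

// storageStatusHandler is a hypothetical handler: it runs probe and
// reports "connected" with 200 on success, or "error" with 500 on failure.
func storageStatusHandler(name string, probe func() error) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		resp := storageStatus{Backend: backendStatus{Name: name, Status: "connected"}}
		code := http.StatusOK
		if err := probe(); err != nil {
			resp.Backend.Status = "error"
			code = http.StatusInternalServerError
		}
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(code)
		_ = json.NewEncoder(w).Encode(resp) // Encode appends a trailing newline
	}
}

// Stand-in probes for a reachable and an unreachable backend (assumptions).
func probeUp() error   { return nil }
func probeDown() error { return errors.New("unable to query database") }

// query runs a handler in-process and returns the status code and body.
func query(h http.Handler) (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/boot/v1/service/storage/status", nil)
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	for _, probe := range []func() error{probeUp, probeDown} {
		code, body := query(storageStatusHandler("postgres", probe))
		fmt.Printf("%d %s", code, body)
	}
}
```

Passing the probe in as a function keeps the handler independent of the concrete backend, which is the point of making the endpoint generic.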

This PR also adds back in the /service/all endpoint to get all statuses at once:

# curl -H "Authorization: Bearer $ACCESS_TOKEN" https://foobar.openchami.cluster/boot/v1/service/all | jq
{
  "bss-version": "1.31.1",
  "bss-status": "running",
  "bss-status-hsm": "connected",
  "bss-storage-backend": {
    "name": "postgres",
    "status": "connected"
  }
}
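The combined response above could be modelled with a struct along these lines. This is only a sketch: the Go names (`AllStatus`, `StorageBackendStatus`, `encodeAllStatus`) are assumed for illustration and may not match the struct this PR adds; the JSON keys are the ones shown in the output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StorageBackendStatus is the nested "bss-storage-backend" object.
type StorageBackendStatus struct {
	Name   string `json:"name"`
	Status string `json:"status"`
}

// AllStatus is a hypothetical type for the /service/all response.
// Field order matches the key order in the example output.
type AllStatus struct {
	Version        string               `json:"bss-version"`
	Status         string               `json:"bss-status"`
	HSMStatus      string               `json:"bss-status-hsm"`
	StorageBackend StorageBackendStatus `json:"bss-storage-backend"`
}

// encodeAllStatus renders the status as two-space-indented JSON.
func encodeAllStatus(s AllStatus) (string, error) {
	b, err := json.MarshalIndent(s, "", "  ")
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encodeAllStatus(AllStatus{
		Version:        "1.31.1",
		Status:         "running",
		HSMStatus:      "connected",
		StorageBackend: StorageBackendStatus{Name: "postgres", Status: "connected"},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```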

BSS used to support only the EtcD backend, so having an /etcd endpoint
made sense. However, with the recent support of a PostgreSQL backend, it
makes more sense to make this more generic.

This endpoint is now changed to /data, and the response JSON now
includes which backend is being used (i.e. "etcd" or "postgres") as well
as the status of the connection to the printed backend.
@synackd synackd self-assigned this Aug 5, 2024
@synackd synackd added the not ready Not ready to merge label Aug 5, 2024
@synackd synackd marked this pull request as ready for review August 5, 2024 17:44
@synackd synackd added needs testing and removed not ready Not ready to merge labels Aug 5, 2024
@davidallendj
Contributor

It makes sense to change the endpoint to something that's agnostic to the storage backend. Naming it something like /data means that we wouldn't have to change it again in the future.

Just curious, does the endpoint just return general data in storage or is it returning something specific?

@synackd
Collaborator Author

synackd commented Aug 5, 2024

It essentially returns just the connection status to whichever backend BSS is using. The current implementation doesn't return any other details besides this.

@davidallendj
Contributor

It essentially returns just the connection status to whichever backend BSS is using. The current implementation doesn't return any other details besides this.

I'm wondering if it should be changed to something that reflects that, or if we want to leave it as /data with the intent to put more stuff there later.

@synackd
Collaborator Author

synackd commented Aug 5, 2024

I like the idea of being able to add to it later, though I'll admit I'm not sure what else would be useful to add at this point.

You bring up a good point, /data could be interpreted as retrieving data BSS is storing or some metadata, etc. Maybe something like /storage?

@davidallendj
Contributor

I like the idea of being able to add to it later, though I'll admit I'm not sure what else would be useful to add at this point.

You bring up a good point, /data could be interpreted as retrieving data BSS is storing or some metadata, etc. Maybe something like /storage?

Yeah, I think /storage or /storage/info or something similar is more informative about the expected response. We can change or add another endpoint for /data if we do decide later to add more stuff.

@synackd
Collaborator Author

synackd commented Aug 5, 2024

/storage/status would be consistent with e.g. /service/status, though /service has other endpoints besides /status, like /version. Even though we don't have any other endpoints to add under /storage right now, do you think it would be a good idea to add /status to it anyway, so that more could be added later if desired?

@davidallendj
Contributor

/storage/status would be consistent with e.g. /service/status, though /service has other endpoints besides /status, like /version. Even though we don't have any other endpoints to add under /storage right now, do you think it would be a good idea to add /status to it anyway, so that more could be added later if desired?

Yeah, I think that would be reasonable. I definitely prefer a more consistent and predictable API when possible.

@synackd
Collaborator Author

synackd commented Aug 5, 2024

That makes sense. I'll change it to that, then.

@synackd
Collaborator Author

synackd commented Aug 5, 2024

OK, instead of /service/data, the endpoint is now /service/storage/status.

There is also an endpoint documented in Swagger as /service/all, but there wasn't a route for it in routers.go. I also added that back and updated the docs to include the new storage backend struct.
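To illustrate what having a route for each documented endpoint means, here is a minimal sketch using the standard library's `http.ServeMux`. It is an assumption for illustration only: the real routers.go may use a different router and different handler names, and `newServiceMux`, `okHandler`, and `statusCodeFor` are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newServiceMux registers the two service endpoints discussed in this PR.
// The handlers are passed in so the sketch stays independent of BSS internals.
func newServiceMux(storageStatus, allStatus http.HandlerFunc) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/boot/v1/service/storage/status", storageStatus)
	mux.HandleFunc("/boot/v1/service/all", allStatus)
	return mux
}

// okHandler is a placeholder handler that always answers 200.
func okHandler(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusOK)
}

// statusCodeFor issues an in-process GET and returns the response code.
func statusCodeFor(mux *http.ServeMux, path string) int {
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	mux := newServiceMux(okHandler, okHandler)
	paths := []string{
		"/boot/v1/service/storage/status",
		"/boot/v1/service/all",
		"/boot/v1/service/etcd", // not registered in this sketch
	}
	for _, p := range paths {
		fmt.Println(p, statusCodeFor(mux, p))
	}
}
```

In this sketch, a path that is documented but never registered falls through to the mux's default 404 response.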

@synackd synackd changed the title Switch out /etcd service endpoint for /data to support multiple backends Switch out /service/etcd service endpoint for /service/storage/status to support multiple backends Aug 5, 2024
@synackd
Collaborator Author

synackd commented Aug 5, 2024

I updated the initial description to add the re-introduction of the /service/all endpoint:

# curl -H "Authorization: Bearer $ACCESS_TOKEN" https://foobar.openchami.cluster/boot/v1/service/all | jq
{
  "bss-version": "1.31.1",
  "bss-status": "running",
  "bss-status-hsm": "connected",
  "bss-storage-backend": {
    "name": "postgres",
    "status": "connected"
  }
}

Contributor

@davidallendj davidallendj left a comment

Just a couple of small string-related things to look at.

@davidallendj davidallendj self-requested a review August 6, 2024 00:01
Contributor

@davidallendj davidallendj left a comment

LGTM.

@synackd synackd merged commit 26ce114 into OpenCHAMI:main Aug 6, 2024
1 check passed
@synackd synackd deleted the storage-endpoint branch August 6, 2024 00:34
2 participants