Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor API for cleaner RESTful idioms and support pagination and batching #20

Merged
merged 7 commits into from
Mar 13, 2024

Conversation

caguado
Copy link
Collaborator

@caguado caguado commented Jan 25, 2024

This PR contains 4 stacked commits to modify the current API proposal to:

  • Model locations and sessions as REST resources: Removes the action from the URI for the resource to allow individual referencing and collections of nested resources.
  • Model authorization errors when accessing a resource: The current API namespace is shared across all the callers so responses will be filtered by visibility of the caller or denied in the absence of authorization to use the passed input parameters.
  • Add pagination to listing of locations and sessions: Implement a paginator over the list of sessions visible to a caller. The paginator is opportunistic as it will return an upper bounded set using the client hint, and opaque as it will make no commitments on the content of the pagination token.
  • Disambiguate items in batch processing for sessions: The creation of sessions allows to submit a set of them to treat them as an aggregate when requesting to operate on a certain peering posture. This commit adds a field to the input and output series such that the client can univocally identify which sessions have been processed successfully or not without relying on implicit ordering of the input or result set.

Remove the action from the URI for the resource to allow individual
referencing and collections of nested resources.
The current API namespace is shared across all the callers so responses
will be filtered by visibility of the caller or denied in the absence of
authorization to use the passed input parameters.
Implement a paginator over the list of sessions visible to a caller.
The paginator is opportunistic as it will return an upper bounded set
using the client hint, and opaque as it will make no commitments on the
content of the pagination token.
The creation of sessions allows to submit a set of them to treat them as
an aggregate when requesting to operate on a certain peering posture.
This commit adds a field to the input and output series such that the
client can univocally identify which sessions have been processed
successfully or not without relying on implicit ordering of the input or
result set.
Copy link

github-actions bot commented Jan 25, 2024

CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅

@caguado
Copy link
Collaborator Author

caguado commented Jan 25, 2024

I have read the CLA Document and I hereby sign the CLA

github-actions bot added a commit that referenced this pull request Jan 25, 2024
@caguado
Copy link
Collaborator Author

caguado commented Jan 25, 2024

I originally thought of opening 4 smaller PRs with one topic branch per change to comment/reject independently, but I noticed it complicates the parallelization of this merge, hence only one PR. I hope this is the expectation. If any comments require changes, please let me know if you want them added as newer commits, or prefer amending the offending ones.

@caguado caguado linked an issue Jan 25, 2024 that may be closed by this pull request
@grizz
Copy link
Member

grizz commented Jan 25, 2024

@caguado looks good, thanks! I have a couple questions for clarification.

I'm confused on the difference between the item_id and session_id? Could you explain?

  • session_id (of individual session generated by the server. Client provides a item ID to track the session within the request, but the server's session ID will be the final definitive ID)

I just found that, if it's just the server vs initiator, I think we should name it accordingly so it's more intuitive, thoughts?

How is next_token supposed to work? I see:

    * next_token (opaque string to hint the query and last result returned when fetching a new page)

but don't see anything else referencing that.

api/openapi.yaml Show resolved Hide resolved
api/openapi.yaml Show resolved Hide resolved
@jramseyer
Copy link
Collaborator

I'm confused on the difference between the item_id and session_id? Could you explain?

  • session_id (of individual session generated by the server. Client provides a item ID to track the session within the request, but the server's session ID will be the final definitive ID)

I just found that, if it's just the server vs initiator, I think we should name it accordingly so it's more intuitive, thoughts?

What if the server always issued both ids (for individual sessions and for batch configuration requests)? It might be simpler if the server were always the grantor of IDs, and the client the recipient of IDs (which they can map back internally as needed).

If the client grants the individual item_ids, how do we handle the case where the server proposes to add additional sessions within a request? Should the server ID those sessions? Should they be ID-less, until the client agrees to configure them?

@caguado
Copy link
Collaborator Author

caguado commented Jan 25, 2024

I'm confused on the difference between the item_id and session_id? Could you explain?

  • session_id (of individual session generated by the server. Client provides a item ID to track the session within the request, but the server's session ID will be the final definitive ID)

I just found that, if it's just the server vs initiator, I think we should name it accordingly so it's more intuitive, thoughts?

What if the server always issued both ids (for individual sessions and for batch configuration requests)? It might be simpler if the server were always the grantor of IDs, and the client the recipient of IDs (which they can map back internally as needed).

If the client grants the individual item_ids, how do we handle the case where the server proposes to add additional sessions within a request? Should the server ID those sessions? Should they be ID-less, until the client agrees to configure them?

I understood we had discarded that behaviour in the call 2 from Jan 10th where if the server wants to propose sessions outside of those requested would either fall the request or we would add another API resource for it.

Implement a method to depeer by deleting existing sessions. The
batched creation method does not create an obvious collection resource
on the server side that may be used for a depeering situation therefore
this commit only implements individual session deletion.

To tackle a situation where depeering is desired for a whole metro or
IX while respecting this RESTful interface, we will need to define a
separate aggregate resource that can be used to operate atomically on
the set of target sessions.
This commit adds the soft deleted semantics described in the original
spec where a set of sessions is requested for deletion and the response
will include a status in the deletion of the request.

This commit also recreates these semantics in the individual session
deletion.
@caguado
Copy link
Collaborator Author

caguado commented Feb 8, 2024

Regarding comments around IDs from the last call. I'm adding a description of the intent for the existing IDs, and the role of the additional ones. In every case, it is assumed that IDs are used to univocally identify the resources they refer to in their scope of existence.

Existing IDs in the current spec proposed:

  • session_id: reflects the server-side generated identifier of the BGP session resource it represents on the receiver system (resource that is owned by the initiator, read ownership from an authz sense, ie. in the OAuth2 sense). It is invariant for the lifetime of the session and unique within the relationship between initiator and receiver. Should the underlying resource cease to exist, the identifier remains unique in the relationship with the initiator. This constraint is to guarantee stability of usage on resource identifiers over time, should any of the parts assign additional semantics on either side (such as usage permissions, ACLs, etc). It is up to the receiver's implementation to do full/soft deletions, and to create a namespace where IDs include owner's ID or are globally unique across time. So in that sense, this identifier is also opaque to the client.
  • item_id: reflects a single element within a set of candidate resources, to be used during batched creation requests. This identifier is unique within the batch and for the lifecycle of the batch request. The identifier allows initiators to univocally identify items from responses, should any of them have not resulted in a new resource on the receiver side.

Discussed additional identifiers:

  • session_name (or similar): reflects the client provided identifier for the new resource to be created on the receiver. As it is under the initiator's control, it is not invariant and only unique for the lifetime of the session, not during the relationship between initiator and receiver. This name carries meaning for the initiator, similar to a description field. It may have a different format to the session_id. Given that the client considers as an additional identity of the session resource, it could also be used to replace the item_id specified above under the request scope.
  • idempotency_token (or similar nonce client-generated request ID): reflects the identifier of the network call to the receiver's API. This identifier is optional and generated by the client to help the receiver disambiguate duplicate requests in the presence of network partitions, reactions to timeouts, etc. and ensure the API remains idempotent for all mutating uses. Assuming batch create is already implemented with idempotency (e.g. via indexing the tuple (initiator IP, receiver IP, location), it will help disambiguate calls that are not idempotent in nature such as resource updates, unless versioning/optimistic locking is also added to the API spec.
  • request_id: reflects the server-side identifier provided by the receiver to identify the network call and help during troubleshooting, audit logging.

So to concrete my proposal, what if we:

  • keep session_id as is
  • add session_name to univocally identify resources from the client perspective and during the session creation (replacing item_id)
  • add idempotency_token (or any other name) as optional attribute to help on disambiguation of retries

@jramseyer
Copy link
Collaborator

Thanks for the updates!

A few questions:

It is invariant for the lifetime of the session and unique within the relationship between initiator and receiver. Should the underlying resource cease to exist, the identifier remains unique in the relationship with the initiator.

Let's say ClientA and ServerB want to peer. ClientA sends a request to peer on client_ip with server_ip. Server configures the request, and assigns it id_1.
Later, both sides delete the session.
A year later, ClientA re-requests peering on the same ips. Is it correct then that the server will return a new id, id_2, for this case?
I suppose my question is: what is the lifetime of the session? Does it end when the session is deleted, even if we later re-establish?

I would assume it ends when the session is deleted, and the server should give a new id, but wanted to confirm.

item_id

This is different from the session_id, in that, if (later, when the API supports it) the client wants to change the prefix limit on session (session_id_1), that request would be assigned item_id_1?

session_name

I don't think we need to define this--this identifier is exclusive to the client, so the client can choose to use it or not. I would vote we leave it out and let clients do as they wish. I don't see this getting passed back and forth between the client and server.

(I guess the use case would be that the client could send a request for a session with the session_name attached, and the server would reply with session_id, and thus the client could easily match between session_name and session_ids?)

idempotency_token

Is this necessary? Should we not just force the server to check for duplicate requests on their end?
I suppose it makes it easier for the server to check for duplicate requests, but I worry that we set ourselves up for further confusion here, if the client is badly configured
example:
client sends request with token A
client has error and thinks that requestA was not sent
client sends second request with tokenB
Server gets requests A and B, and naively filters on tokens instead of request contents. Thus, server configures requests A and B, because the tokens were different.
I understand that the above is an edge case, but I don't see what we get out of adding an additional token, since we should be uniquing by request contents anyway?

request_id

Makes sense to me.

Thanks for the detailed proposal!

api/openapi.yaml Outdated Show resolved Hide resolved
api/openapi.yaml Outdated Show resolved Hide resolved
@grizz
Copy link
Member

grizz commented Feb 18, 2024

Thanks for all of the added details Carlos!

session_id

It sounds like we all agree on this and it that it MUST be globally unique for currently enabled sessions. :)

I think having it unique over time should be up to the server, but I can see the argument for making it required if you all feel strongly about it.

The rest mostly seem unnecessary to me.

session_name

I don't think we need to define this--this identifier is exclusive to the client, so the client can choose to use it or not. I would vote we leave it out and let clients do as they wish. I don't see this getting passed back and forth between the client and server.

(I guess the use case would be that the client could send a request for a session with the session_name attached, and the server would reply with session_id, and thus the client could easily match between session_name and session_ids?)

I agree, at least in it the current implementation. If we were to add a request for get_id_from_session_name, that could make sense -- but I can't see a use case for that.

A unique way to identify the session is always going to be local/remote IPs (and local/remote ASNs if a network has multiple ones) as there can be only one, so I don't think we need idempotency_token. It's also how we're going to be able to generate session_ids for existing sessions.

item_id

I'm struggling to see the usefullness of this, we could do array offset of the request if we need to, otherwise we can still always fall back to local/remote IP to figure these out.

request_id

Again, it could be handy for troubleshooting, but users could just as easily to IP or ASN and timestamp, right?

I think we should keep the initial API spec as simple as possible, it's a lot easier to add fields later than to remove them.

@grizz grizz mentioned this pull request Feb 18, 2024
@caguado
Copy link
Collaborator Author

caguado commented Feb 21, 2024

I would assume it ends when the session is deleted, and the server should give a new id, but wanted to confirm.

That's the proposal, to make explicit not to recycle resource IDs or otherwise explain why.

This is different from the session_id, in that, if (later, when the API supports it) the client wants to change the prefix limit on session (session_id_1), that request would be assigned item_id_1?

I'm struggling to see the usefullness of this, we could do array offset of the request if we need to, otherwise we can still always fall back to local/remote IP to figure these out.

I am proposing to make request and response item matching explicit for the duration of the request alone. If the agreement is not to use a unique identifier that only exists for such duration, then how a developer should match items the response to the items they requested? Joining request and response lists by offset? Matching IP-pairs for the session in the response body? I don't agree those are simpler than matching by the client-side-assigned identity to the resource (session_name, or this item_id if we don't agree to add such session_name). But if everyone agrees, we should make that expectation explicit in the doc.

I would vote we leave it out and let clients do as they wish. I don't see this getting passed back and forth between the client and server.

I agree, at least in it the current implementation. If we were to add a request for get_id_from_session_name, that could make sense -- but I can't see a use case for that.

The main use case is ease of use of the API for a human interacting with it. Imagine presenting a list of sessions to act on in a drop down menu in the UI, what would that drop down expect the user to know about the sessions? their random ID? the IPv6 pair concatenated string for the session? Surely, it is up to each consumer of the API to decide, but this field is aiming to provide them with an option that both initiator-receiver can independently agree and explicit.

A unique way to identify the session is always going to be local/remote IPs (and local/remote ASNs if a network has multiple ones) as there can be only one, so I don't think we need idempotency_token.

My first comment on session_name also mentions that using it would make idempotency_token redundant. But I disagree that sessions are univocally identified by a 2-way or 4-way tuple, let's make it explicit how we choose one format, or the other, or both? IMO it is actually simpler for a client to identify the session with its ID :) Somewhat related, this other attempt at identifying auto-configuration is deliberately open to such definition so remains ambiguous to follow one or another but implementers will have to disambiguate off spec.

I think we should keep the initial API spec as simple as possible, it's a lot easier to add fields later than to remove them.

I argue that not defining behavior explicitly makes it not simple as people will have to agree off spec. It will be easy so long those changes are backward compatible. We are not defining how versioning of the API will work either (content negotiation? embedded in resource path (i.e. /v1/sessions)?) so we implicitly expect that new fields to be added will not be mandatory.

As per the different discussions, this commit removes references to
identifiers used to identify parts of an inflight request that
disambiguate failure scenarios. They may be addressed later.

It also removes the partial successful reponse conveyed through a
redirect error code. The rationale to use only the 200 success code
is that the response for a batched request has been successfully
processed and the response may still contain references to items which
were not successful. Clients will have to introspect the response to
decide what to do with those failed items.
Copy link
Collaborator

@jramseyer jramseyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@caguado caguado merged commit 09762b6 into bgp:main Mar 13, 2024
1 check passed
@caguado caguado deleted the refactor/batching branch March 13, 2024 17:37
@github-actions github-actions bot locked and limited conversation to collaborators Mar 13, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proposal for more idiomatic REST API
4 participants