Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: propagate a 429 status code if email-provider rate limiting occurs #5801

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

mzikherman
Copy link
Contributor

This is an alternate implementation of #5798 , different approach altogether.

This is a two-pronged thing:

First - if there are any 429's from an upstream service experienced by an email provider app making a request to MP - we log this via extensions, as emailProviderLimited: true. This is sort-of a place where we temporarily store the fact that this situation has occurred during this request, and some middleware elsewhere in the stack will introspect. The GraphQL spec allows for arbitrary data to live here, and while our middleware will actually introspect and intercept requests with this extension so it never makes it to an end user as-is, I think using extensions for this is reasonable.

Second - we have a middleware on our GraphQL route that is mounted before our regular GraphQL server. This allows it to register a custom res.end function to be able to intercept the response and introspect - if there is extensions and emailProviderLimited: true exists, we can send our custom 429/error response.

This overall still feels wonky to me as there's a bit of hackiness around res.end re-defining, I'm not sure if this is truly better or not than the patch in #5798 - but it is an alternative!

Copy link
Contributor

@artsyjian artsyjian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense to me!

Copy link
Contributor

@joeyAghion joeyAghion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I prefer this middleware-like version over patching a lib.

extensions["emailProviderLimited"] = true
return true
}
return false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these true/false return values do anything?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's just the early-returning of the .some iterator. I can double-check this snippet (it may or may not have been AI-generated).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use something other than some here? To joey's point, return true / false means that we should / should not return a result from some to be collected into a variable. If we're simply iterating over an array to check for values to be merged into an object, we can use forEach, which doesn't return anything and reads more correctly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like:

const extensions = result.errors?
  .flatMap(flattenErrors) 
  .reduce((acc, error) => {
    const httpStatusCode = statusCodeForError(error)
    
    if (httpStatusCode === 429) {
      acc.emailProviderLimited = true
    }
    
    return acc
  }, {})

return extensions

const contentType = res.get("content-type") || "text/html"

if (contentType.indexOf("application/json") === 0) {
const response = JSON.parse(chunk)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unfortunate we have to re-parse this.

Uninformed question: I've seen context used for passing request data around. Is it not possible to use that to make the 429 status available, as opposed to rendering it into the extensions response?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose one could use context for that, but that's a GraphQL-server-concept (as in, within the scope of the special express-graphql sub-app mounted at our /v2 endpoint) - and so this middleware (which is a totally separate Express middleware mounted before that sub-app) doesn't have access to that context.

In general trying to implement some version of this was tricky for exactly this reason!

  • Can't do it natively with express-graphql as there's no parseResponse or any other 'hook' available -> leading to possible patch.
  • Can't do it natively with Express middleware as that won't have access to the completed response, unless there's some tricksy res.on("end", ... overriding - and then this approach implies that we do need to store the 'this is an email provider request that resulted in rate limiting' fact in the response somewhere.

const response = JSON.parse(chunk)

if (response.extensions && response.extensions.emailProviderLimited) {
res.status(429).json({ error: "rate limit reached, try again later" })
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This drops the other response data (including the extensions data), and just returns this tiny error blob, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup

@@ -84,10 +85,17 @@ function createExtensions(document, result, requestID, userAgent) {
? fetchLoggerRequestDone(requestID, userAgent)
: {}

// Should this introspect xapp token/roles for `email_provider` instead?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wondered this same thing. It would be more reliable, but more complex code-wise. Overall I think it's OK to use the user-agent in this way because Braze is a well-behaved integration. The worst case is that a caller spoofing their user-agent gets this same, conservative treatment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly, worst-case winds up being this is applied more often/overzealously. I think it's fine that we try to call it 'email provider' in our code, but somewhere in the stack (ideally maybe kept to one or minimal spots), there is a check that winds up having to mention 'braze'.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may not know and this is a bit of a off topic question but I've wondered for a while about the ID thats part of Braze user-agent:

Braze Sender 954a8af2d130500d09e736a5e8ce1db59da5c25f; Metaphysics

What is the purpose of that ID? Is that something that is unique for the organization that Braze has a relationship with or perhaps its masking information to make it harder to spoof the user-agent? Does the ID stay constant or can it change at any point? 🤔

@mzikherman
Copy link
Contributor Author

I actually think this can be merged independently of any backend work we might do to add rate limiting, and more as just an enhancement within Metaphysics - since we know we want this behavior - in fact maybe we even want this behavior today in case Braze, via Metaphysics, is already being rate-limited by a backend app.

src/index.ts Show resolved Hide resolved
Copy link
Contributor

@ovasdi ovasdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure either, as far as which approach is better, but by my understanding this alternative approach would continue to work even if at some future point express-graphql is removed from the codebase. Assuming this is true, this gets a +1 from me over the other approach.

@@ -0,0 +1,22 @@
import { flattenErrors, statusCodeForError } from "lib/graphqlErrorHandler"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor, but can we rename this file to emailProviderRateLimitExtension so that it matches idiomatic JS? We have snake_case in MP, but that's very old.


// Log in `extensions` if this request was by an email provider and resulted
// in a 429 rate limit error from a downstream request made using `fetch`.
export const emailProviderRateLimitExtension = (_documentAST, result) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ie, with the named export here, and the file name being snake case, that creates a 1:2 naming representation in the codebase, which is unnecessary.

// rate limting occurred.
//
// If so, we return a 429 status code and a JSON error message.
app.use("/v2", function (req, res, next) {
Copy link
Member

@damassi damassi Jul 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're gonna chatGPT stuff we should at least follow modern JS conventions and audit the output 😄 we never see these types of anonymous function declarations anywhere, not for years. Please use arrow functions


const originalEnd = res.end

res.end = function (...args) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants