Allow custom path and root_path resolution logic in ASGI scope #147

Open
chestnutcase opened this issue Nov 17, 2020 · 12 comments
Labels
feature New feature or request improvement Improve an existing feature

Comments

@chestnutcase

chestnutcase commented Nov 17, 2020

FastAPI + mangum works well for single-backend API gateways, but for API gateways where different resources can point to different FastAPI applications in different stacks (a true "gateway") this can break pretty quickly. I think the logic for path and root_path variables in the ASGI scope object should be extensible by the user – right now, root_path is hardcoded to a blank string (which I do not think is correct according to the ASGI specification).

Consider an example:

/prod (stage name)
|--> /service_a
|    |--> /{proxy+}
|    |    |--> ANY => proxy to FastAPI lambda application
|--> /service_b
|    |--> /{proxy+}
|    |    |--> ANY => proxy to another FastAPI lambda application

For the FastAPI (mangum) applications to work in this setup, api_gateway_base_path in the handlers of each application must be hardcoded to exactly match /service_a and /service_b respectively (the resource names in API Gateway).

I do not think that this is a good practice, because this means that the lambda code needs to be aware of the value of the settings used in API Gateway (coupling?). This hurts the portability and "plug and play" functionality of API Gateway Integrations with lambda applications. It also means I am unable to use the same codebase to mount two separate copies of a service under two different service names within the same API gateway.

The good news is that, in theory, the application can infer its true root path (either /prod, /prod/service_a, or even just /service_a in the case of custom domain names) from the event that API Gateway passes: the requestContext key contains information such as the stage name, domain name and resource name. The AWS documentation also recommends using stage variables (which are also passed in the proxy event object) for application configuration:

You can also use stage variables to pass configuration parameters to a Lambda function through your mapping templates. For example, you might want to reuse the same Lambda function for multiple stages in your API, but the function should read data from a different Amazon DynamoDB table depending on which stage is being called. In the mapping templates that generate the request for the Lambda function, you can use stage variables to pass the table name to Lambda.

I believe this concept can be extended to the root path resolution as well. Through the use of stage variables, the API gateway decides where the FastAPI application should be mounted; the application simply reads this information at invocation time.
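
A minimal sketch of that idea, assuming a stage variable named `root_path` (a hypothetical name chosen for illustration; neither Mangum nor API Gateway defines it). API Gateway proxy events carry stage variables under the `stageVariables` key, which is null when the stage defines none:

```python
from typing import Any, Dict


def root_path_from_stage_variables(event: Dict[str, Any], default: str = "") -> str:
    """Return the mount point configured on the API Gateway stage, if any."""
    # "stageVariables" is null (None) in the event when the stage defines none.
    stage_variables = event.get("stageVariables") or {}
    # "root_path" is a hypothetical variable name used only for this sketch.
    root_path = stage_variables.get("root_path", default)
    # Normalise to a leading slash and no trailing slash.
    root_path = root_path.strip("/")
    return f"/{root_path}" if root_path else ""


event = {"stageVariables": {"root_path": "prod/service_a"}}
print(root_path_from_stage_variables(event))  # prints "/prod/service_a"
```

With this approach the Lambda code never names the resource it is mounted under; the same package could be deployed twice with different stage variable values.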

Suggested Solution

Instead of hardcoding a blank string in root_path and stripping path using api_gateway_base_path, simply define methods that can be overridden by child classes:

Inside adapter.py, Mangum.__call__:

            scope = {
                "type": "http",
                "http_version": "1.1",
                "method": http_method,
                "headers": [[k.encode(), v.encode()] for k, v in headers.items()],
                "path": self.resolve_asgi_path(event, context),
                "raw_path": self.resolve_asgi_raw_path(event, context),
                "root_path": self.resolve_asgi_root_path(event, context),
                "scheme": headers.get("x-forwarded-proto", "https"),
                "query_string": query_string,
                "server": server,
                "client": client,
                "asgi": {"version": "3.0"},
                "aws.event": event,
                "aws.context": context,
            }

Mangum.resolve_asgi_path and the others can contain the default implementation of stripping path of api_gateway_base_path for backward compatibility, but the point here is to allow other clients to customise the resolution logic depending on their use case (single backend api gateway? multiple backend api gateway? with or without custom domains?) by extending the class and overriding these methods.

@jordaneremieff
Collaborator

@chesnutcase This makes a lot of sense, thanks for the detailed issue description. 👍

Support for custom resolution logic is something that should probably be included here, though I'm not certain yet how it should be designed. I've generally wanted to avoid subclassing the adapter itself, but perhaps there isn't a better alternative in this case (have to think a bit more on it).

If we did end up going with your suggested solution, then I think a single resolve_scope method would probably be preferred over individual methods for specific keys.

Do you have an idea of how usage of the subclassed approach might look?

@jordaneremieff jordaneremieff added feature New feature or request improvement Improve an existing feature labels Nov 18, 2020
@ttamg

ttamg commented Feb 1, 2021

Hi.

Firstly, a thumbs up: I am seeing the same issue with a FastAPI deployment on Lambda behind AWS API Gateway. Everything works fine with Mangum when using a custom domain without a path proxy, but with a custom path proxy (e.g. /custom) the application gets confused and returns 404s.

I can fix it by adding a prefix on all paths in the app, but as @chesnutcase says, it makes it not very portable.

As reported here for a similar case (CodeGenieApp/serverless-express#216) it looks as though the path passed to the FastAPI app by the Lambda event is the full path including the base path.

Looking at the codebase I've just noticed the api_gateway_base_path option in Mangum. Presumably if I add in the custom base path here then the urls will be tidied up. I will test it now.

If that works, could this parameter then be set automatically from the pathParameters: { proxy: 'healthz' } element of the event, as mentioned here?

Just a half-baked idea. I'm not an expert on your package, but very much support the suggestion for better portability.

@ttamg

ttamg commented Feb 1, 2021

An update: I tested that and it works well as a workaround.

  1. Created a simple proxy variable settings.PROXY_PATH which was populated with "/custom".
  2. Set the app = FastAPI(root_path=settings.PROXY_PATH) as per the docs there.
  3. For the Mangum adapter, used handler = Mangum(app, api_gateway_base_path=settings.PROXY_PATH)

This all worked nicely. The issue raised by @chesnutcase on portability still applies, and it would be nice if we could get the PROXY_PATH from the Lambda events somehow, and also push it into the FastAPI root_path.

@ttamg

ttamg commented Mar 10, 2021

I see the merged PR #162. That looks good. Will that then resolve this issue? I've been through the source code and it looks as though it might, but I wanted to check, as we probably want to update the docs somewhere for the API Gateway / Mangum / FastAPI use case, which is what is affected here.

The workaround I use for now that works is to hardcode in the ROOT_PATH both in the FastAPI app and the Mangum handler.

I am assuming we can then remove both after you have released this update. But I haven't got my head around how the ROOT_PATH from the scope object is read by FastAPI yet.

Am I missing something and more is needed?

@ttamg

ttamg commented Mar 10, 2021

I found a workaround that does enable the ROOT_PATH to be set dynamically for the STAGED endpoints at least. Custom domains require a little more work.

Once the scope does pass the right path and root_path (which I see are not yet set by Mangum) then the following short additional middleware in FastAPI makes everything play nicely. It just injects the FastAPI root_path on the fly.

@app.middleware("http")
async def set_root_path_for_api_gateway(request: Request, call_next):
    """Sets the FastAPI root_path dynamically from the ASGI request data."""

    root_path = request.scope["root_path"]
    if root_path:
        app.root_path = root_path

    response = await call_next(request)
    return response

As it happens, Mangum doesn't set the root_path correctly yet, so a FastAPI middleware workaround (messy but working) is:

@app.middleware("http")
async def set_root_path_for_api_gateway(request: Request, call_next):
    """Sets the FastAPI root_path dynamically from the ASGI request data."""

    root_path = request.scope["root_path"]
    if root_path:
        # Assume set correctly in this case
        app.root_path = root_path

    else:
        # fetch from AWS requestContext
        if "aws.event" in request.scope:
            context = request.scope["aws.event"]["requestContext"]

            if "customDomain" not in context:
                # Only works for stage deployments currently
                root_path = f"/{context['stage']}"

                if request.scope["path"].startswith(root_path):
                    request.scope["path"] = request.scope["path"][len(root_path) :]
                request.scope["root_path"] = root_path
                app.root_path = root_path

                # NOT IMPLEMENTED FOR customDomain
                # root_path = f"/{context['customDomain']['basePathMatched']}"

    response = await call_next(request)
    return response

If we could get root_path and path set correctly by Mangum then that would be nice.
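
The path-splitting part of the workaround can be isolated as a pure function, which also gives a place to try the custom domain branch. This is a sketch under loud assumptions: it expects a REST API style requestContext, it takes `customDomain.basePathMatched` from the commented-out line above without having verified it against every configuration, and it assumes that value may be the placeholder string "(none)" when no base path is mapped. `split_root_path` is a name made up for this example.

```python
from typing import Any, Dict, Tuple


def split_root_path(path: str, request_context: Dict[str, Any]) -> Tuple[str, str]:
    """Return (root_path, path) for an API Gateway request."""
    custom_domain = request_context.get("customDomain")
    if custom_domain is None:
        # Stage deployment: the stage name is the mount point.
        prefix = request_context.get("stage", "")
    else:
        # ASSUMPTION: basePathMatched holds the mapped base path, and is the
        # literal "(none)" when the domain maps to the API without a base path.
        prefix = custom_domain.get("basePathMatched", "")
        if prefix == "(none)":
            prefix = ""

    prefix = prefix.strip("/")
    root_path = f"/{prefix}" if prefix else ""
    # Strip only on a segment boundary so "/prod" does not match "/production".
    if root_path and (path == root_path or path.startswith(root_path + "/")):
        path = path[len(root_path):] or "/"
    return root_path, path


print(split_root_path("/prod/users/1", {"stage": "prod"}))
print(split_root_path("/api/users/1", {"customDomain": {"basePathMatched": "api"}}))
```

Keeping this logic out of the middleware means it can be checked against recorded events from each deployment style before relying on it.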

@jordaneremieff
Collaborator

@ttamg how configurable would a solution here need to be? Could your middleware solution (or something similar) be used to accurately determine the correct scope keys in all cases without additional configuration/customisation?

If we can modify the adapter to handle this using a common method that is able to figure out the correct path variables (and deprecate the api_gateway_base_path parameter) then I think that'd be the ideal solution, but I haven't had a lot of time to dig into this so maybe I'm missing something.

Happy to work with you to reach a resolution on this, but I need more information. If you have an idea for a draft PR I'd be glad to review it.

@ttamg

ttamg commented Mar 17, 2021

@jordaneremieff we are getting beyond my pay grade here. My thinking was that at least on the FastAPI side, if we set the root_path correctly in the scope then that app can deal with everything that end. The simple middleware works, and I am sure there are other simple ways of managing that in the app too.

But that means mangum would need to parse the root_path, etc correctly for the different API Gateway calls.

I had a play with it myself a few days ago to see if it was consistent what is added in the requestContext object that AWS API Gateway passes to Lambda in the event.
For the case of an api on a staging point, it was easy enough.
For the case of a custom domain on an api, either at the root or on a sub-path of the URL, I couldn't quite see where all the elements were being placed in the requestContext.

So in summary I think it is very doable, but I worry it will be brittle unless we get someone who knows what's stable at AWS API Gateway to let us know the requestContext schema for the different cases. Once we have that I think a PR is relatively easy to create. Know anyone, or know where there is detailed AWS documentation? I've looked online.

@chestnutcase
Author

Hey all, sorry I disappeared from the discussion. Admittedly, I haven't touched FastAPI & API Gateway in a few months, so I need some time to get back up to speed with this issue.

The main point I had in mind when I opened this issue was to simply allow the user to customize the root path resolution logic (i.e. allowing the user to pass a string doesn't count). However, there are so many ways a HTTP application in a Lambda function can be integrated into API gateway depending on the user's needs. We can try to have a sensible default that uses heuristics to "guess" what the correct root path is. But just in case, we should leave it extensible by the user (probably by extending the Mangum class and overriding a method).

I too find the AWS documentation very lacking when it comes to explaining what each field in the context variable does in different scenarios (especially in the case of custom domain names with path mappings). Maybe to work on the heuristic we can first draw up a table of all the use cases we want to support, then do experiments in each case to discuss what strategies we have for evaluating the correct scope in each case.

| Gateway Type \ Use Case | Custom Domain Name | Single Application in Stage | Multiple Applications in Stage |
| --- | --- | --- | --- |
| HTTP API (Payload 2.0) | | | |
| HTTP API (Payload 1.0) | | | |
| REST API | | | |

Summary of use cases:

Custom Domain Name

Custom Domain with base path mapping that maps to API Id, Stage Name, Path (optional). If we can somehow inspect the base path mapping from the integration event or context this should be trivial.

Single Application in Stage

For "hello world" applications that do not use a domain name, where the proxy resource with the Lambda proxy integration is located at the root of the stage.

Multiple Applications in Stage

For more complex applications that do not use custom domain names (or that are private APIs inside a VPC), where the proxy resource with the Lambda proxy integration is not located at the root of the stage. The API Gateway may also host different proxy resources pointing to different Lambdas (what I originally outlined in the first comment).
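
To run experiments per gateway type, the events first need to be told apart. A sketch, assuming the payload shapes as I understand them from the AWS docs: HTTP API events carry a top-level `version` key ("1.0" or "2.0"), while REST API proxy events have `resource` and `requestContext` but no `version`. Treat these checks as assumptions to confirm against real events; `gateway_type` is a name invented for this example.

```python
from typing import Any, Dict


def gateway_type(event: Dict[str, Any]) -> str:
    """Classify an API Gateway event by payload shape (heuristic only)."""
    if "requestContext" not in event:
        return "unknown"
    version = event.get("version")
    if version == "2.0":
        return "HTTP API (Payload 2.0)"
    if version == "1.0":
        return "HTTP API (Payload 1.0)"
    # ASSUMPTION: REST API proxy events have no "version" key but do
    # include the matched "resource".
    if "resource" in event:
        return "REST API"
    return "unknown"


print(gateway_type({"version": "2.0", "requestContext": {}}))
print(gateway_type({"resource": "/{proxy+}", "requestContext": {}}))
```

Each row of the table could then be filled in by logging the classified event alongside the path-related fields it contains.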

@bafonso

bafonso commented Sep 17, 2021

I have not extensively tested this approach, but it seems to work if you are proxying with a stage link (i.e., /prod) and a custom domain mapped to a path.

@app.middleware("http")
async def set_root_path_for_api_gateway(request: Request, call_next):
    """Sets the FastAPI root_path dynamically from the ASGI request data."""

    root_path = request.scope["root_path"]
    print(f"root_path : {root_path}")

    if root_path:
        # Assume set correctly in this case
        app.root_path = root_path

    else:
        # fetch from AWS requestContext
        if "aws.event" in request.scope:
            event = request.scope["aws.event"]
            context = event["requestContext"]
            # pathParameters may be missing or None when there is no proxy segment
            path_parameters = event.get("pathParameters") or {}

            if "proxy" in path_parameters:
                proxy = path_parameters["proxy"]
                request.scope["path"] = f"/{proxy}"
                # Everything before the proxy segment is treated as the root path
                root_path = context["path"][: context["path"].find(proxy)]
                request.scope["root_path"] = root_path

    response = await call_next(request)
    return response

@rernst76

@bafonso Thank you so much for this breadcrumb. This helped me fix an issue I have been fighting all day. It should be noted though that the code provided only seems to work with API Gateway payload 1.0!

I ended up using the same approach but modified it to work with payload 2.0.

The issue I was running into was that aws.event.requestContext.path is not defined in payload v2.0.

I am using stages in APIGateway to deploy multiple versions of an API. We can use the aws.event.requestContext.stage value to determine if we need to do any path futzing. If we see $default we know this is the default stage that is being served from the base of our API. In this case nothing needs to be done.

If it's anything else we know that our root_path needs to be /{stage} and we can update our request.scope.path to be the value at aws.event.pathParameters.proxy.

This is not heavily tested by any means, so use at your own discretion. It has at least unblocked me for now. Hopefully it helps someone else!

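from starlette.middleware.base import BaseHTTPMiddleware, RequestResponseEndpoint
from starlette.requests import Request
from starlette.responses import Response
from starlette.types import ASGIApp


class AWSAPIGatewayMiddleware(BaseHTTPMiddleware):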
    """
    Handles the FastAPI path and root_path dynamically from the ASGI request data.
    Mangum injects the AWS event data which we can use to dynamically set the path
    and root_path.
    https://github.com/jordaneremieff/mangum/issues/147
    """

    def __init__(self, app: ASGIApp) -> None:
        """Initialize the middleware"""
        super().__init__(app)
        self.app = app

    async def dispatch(
        self, request: Request, call_next: RequestResponseEndpoint
    ) -> Response:
        """Process the request and call the next middleware"""

        root_path = request.scope["root_path"]
        if root_path:
            # Assume set correctly in this case
            self.app.root_path = root_path  # type: ignore

        else:

            stage = request.scope["aws.event"]["requestContext"]["stage"]
            # Check if stage is the default, if so, we don't need to do anything
            if stage != "$default":
                # If stage is not $default, it means we are behind an APIGateway
                # stage and we need to set the path and root_path values correctly

                # For example if the stage is "dev", and the path is "/dev/users/123"
                # the root_path should be "/dev" and the path should be "/users/123"

                # AWS/APIGateway conveniently provides pathParameters.proxy
                # which is the path after the stage_part. We can use this to
                # set the path.

                # Set root_path value to APIGateway stage in requestContext
                stage_path = f"/{stage}"
                self.app.root_path = stage_path
                request.scope["root_path"] = stage_path

                # Set path value to proxy path from event
                request.scope[
                    "path"
                ] = f"/{request.scope['aws.event']['pathParameters']['proxy']}"

        response = await call_next(request)
        return response
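
The stage rule described above can also be pulled out into a pure function, which makes it testable without FastAPI or Mangum. This is a sketch of that logic, not a tested implementation; `resolve_stage_paths` is a made-up name, and the fallback for a missing `pathParameters.proxy` is an addition beyond the middleware above.

```python
from typing import Any, Dict, Tuple


def resolve_stage_paths(event: Dict[str, Any], path: str) -> Tuple[str, str]:
    """Return (root_path, path) for a payload 2.0 event, based on its stage."""
    stage = event.get("requestContext", {}).get("stage", "$default")
    if stage == "$default":
        # The default stage is served from the base of the API: nothing to do.
        return "", path

    root_path = f"/{stage}"
    proxy = (event.get("pathParameters") or {}).get("proxy")
    if proxy is None:
        # Addition beyond the original middleware: with no proxy segment,
        # keep the incoming path as-is instead of raising.
        return root_path, path
    return root_path, f"/{proxy}"


event = {
    "requestContext": {"stage": "dev"},
    "pathParameters": {"proxy": "users/123"},
}
print(resolve_stage_paths(event, "/dev/users/123"))  # prints ('/dev', '/users/123')
```

The middleware's dispatch method could then reduce to calling this function and copying the two values into request.scope.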

@jordaneremieff
Collaborator

@rernst76 A handler pattern and a custom_handlers argument were introduced in 0.14.0, which you could try instead of a middleware.

The original suggested solution was to modify the main adapter class, but handlers are now responsible for the scope resolution details and would be the equivalent solution at this point.

You can have a look at the APIGateway handler and an example of a custom handler here.

@tlaundal

tlaundal commented Nov 30, 2023

Hi, haven't read through all the discussion, but I was able to come up with a quite neat solution based on Jordan's comment above.

My issue was specifically that I wanted FastAPI to dynamically set the root_path, so the FastAPI built-in OpenAPI documentation site would work regardless of whether I was running locally or behind API gateway.

I found I could infer the root path by comparing the requestContext.path and path properties of the request event sent by API Gateway. Specifically, path will contain the path of the endpoint relative to the API Gateway path, say /items/my-item, but requestContext.path will contain the full path, for instance /Stage/items/my-item. From this we can infer /Stage as the root path in a quite robust way.

The custom handler looks like this:

from typing import Any
from mangum import Mangum
from mangum.handlers import APIGateway
from mangum.types import Scope, LambdaEvent, LambdaContext, LambdaConfig

from .main import app


def find_root_path(event: LambdaEvent) -> str:
    # This is the full path, including /<stage> at the start
    request_path = event.get("requestContext", {}).get("path", "")
    # This is the path of the resource, not including a prefix
    resource_path = event.get("path", "")

    root_path = ""
    if request_path.endswith(resource_path):
        root_path = request_path[: -len(resource_path)]

    return root_path


class APIGatewayCorrectedRootPath(APIGateway):
    """A variant of the APIGateway Mangum handler which guesses the root path.

    The `root_path` property of the ASGI scope is intended to indicate a
    subpath the API is served from. This handler will try to guess this
    prefix based on the difference between the requested path and the
    resource path API gateway reports.

    Using this should alleviate the need to manually specify the root path in
    FastAPI.
    """

    def __init__(
        self,
        event: LambdaEvent,
        context: LambdaContext,
        config: LambdaConfig,
        *_args: Any
    ) -> None:
        super().__init__(event, context, config)

    @property
    def scope(self) -> Scope:
        return {**super().scope, "root_path": find_root_path(self.event)}


handler = Mangum(app, custom_handlers=[APIGatewayCorrectedRootPath])
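
For a sense of what the suffix comparison yields, here is the same check on plain dictionaries, with hand-written events. The paths are illustrative and were not captured from API Gateway; the function is a standalone copy typed against `dict` so it runs without Mangum installed.

```python
from typing import Any, Dict


def find_root_path(event: Dict[str, Any]) -> str:
    """Infer the root path as the part of requestContext.path before path."""
    request_path = event.get("requestContext", {}).get("path", "")
    resource_path = event.get("path", "")
    # An empty resource path carries no information, so report no root path.
    if resource_path and request_path.endswith(resource_path):
        return request_path[: len(request_path) - len(resource_path)]
    return ""


staged = {"path": "/items/my-item", "requestContext": {"path": "/Stage/items/my-item"}}
unprefixed = {"path": "/items/my-item", "requestContext": {"path": "/items/my-item"}}
mismatch = {"path": "/items/my-item", "requestContext": {"path": "/Stage/other"}}

print(repr(find_root_path(staged)))      # prints '/Stage'
print(repr(find_root_path(unprefixed)))  # prints ''
print(repr(find_root_path(mismatch)))    # prints ''
```

When the two paths disagree, the function falls back to an empty root path rather than guessing.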

@jordaneremieff, would you be interested in merging a PR adding this functionality as an opt-in to APIGateway together with some docs?

EDIT: I originally posted with a bogus suggestion for how it could be implemented, which I realized wouldn't work. I think it can be added as a separate handler, exactly like I do above, or by adding a new apigw_infer_root_path option to the LambdaConfig type, which would be handled in the APIGateway handler and could be specified when constructing the Mangum instance:

handler = Mangum(app, apigw_infer_root_path=True)

tlaundal added a commit to tlaundal/mangum that referenced this issue Dec 4, 2023
The `api_gateway_infer_root_path` option instructs Mangum to infer the
`root_path` ASGI scope property based on the AWS API Gateway event
object. This enables applications to know what subpath they are being
served from, without explicit configuration.

Relates to Kludex#147.