-
Notifications
You must be signed in to change notification settings - Fork 3k
Fix CDI request context activation race when Quarkus REST and Reactive Routes are used during the same request #51299
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,7 +1,10 @@ | ||
| package io.quarkus.vertx.http.runtime; | ||
|
|
||
| import java.util.Objects; | ||
| import java.util.Set; | ||
|
|
||
| import org.jboss.logging.Logger; | ||
|
|
||
| import io.vertx.core.http.HttpHeaders; | ||
| import io.vertx.core.http.impl.MimeMapping; | ||
| import io.vertx.core.net.impl.URIDecoder; | ||
|
|
@@ -10,10 +13,39 @@ | |
|
|
||
| public final class RoutingUtils { | ||
|
|
||
| private static final String CURRENT_CDI_REQUEST_CTX_OWNER = "io.quarkus.vertx.http.runtime#current-cdi-req-ctx-owner"; | ||
| private static final Logger LOG = Logger.getLogger(RoutingUtils.class); | ||
|
|
||
| private RoutingUtils() throws IllegalAccessException { | ||
| throw new IllegalAccessException("Avoid direct instantiation"); | ||
| } | ||
|
|
||
| /** | ||
| * Assumes ownership of the currently active CDI request context. | ||
| * Thus, code invoked (even asynchronously) from previous route handlers shouldn't deactivate it. | ||
| * | ||
| * @param ctx RoutingContext | ||
| * @param newOwner typically a route handler that needs CDI request context active | ||
| */ | ||
| public static void assumeCdiRequestContext(RoutingContext ctx, String newOwner) { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @michalvavrik is there any guarantee that the last handler to claim the context as their own is always going to be the last to actually attempt to use it? I am not familiar enough with the code to be able to tell so maybe I am missing something obvious.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
It is for the scenario I am trying to solve, but if we treat this method as SPI that theoretically anyone could reuse, then probably no. IMO there should be always only one route handler actively handling the incoming request, because if there were multiple handlers using the same request, we would be in for whole lot of issues like this one #49468. So what I meant to solve is:
And if in the future (or in
I think that would be a solution covering all eventualities, but the only way to keep it safe (be sure it is cleaned) would be synchronization (or other locking) which could have performance impact if it was hot path? I think there is a reason why IMO current code in Quarkus REST doesn't expect that someone else is handling active CDI request context and I have seen the same thing in gRPC. I'd keep it simple (as is, or just rename method / adjust javadoc if you have some suggestion) @manovotn , but I can do what you want if you deem it important.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. First off, I don't think this should be just a local discussion, all relevant people need to be involved. I think we've established that there may be multiple users (as in, Quarkus components) that want to access the same request context. We've also established that we want the request context created on demand, not eagerly. (Thanks for those remarks! It also feels like déjà vu to me. And I'm sure it will happen again.) I'm gonna claim a simple
Conversely, if someone wants to destroy the request context, two (or three, depending how thorough we are) things may happen:
I think the above should be enough to ensure that the request context can be activated on-demand and destroyed only after all possible users are done with it. I think either this could be put into a utility class, probably in ArC, or perhaps this could be done in the ArC implementation of request context itself?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
AFAIU this would not help with the use case @michalvavrik is trying to fix. This sentence from #51299 (comment) is important: "IMO current code in Quarkus REST doesn't expect that someone else is handling active CDI request context and I have seen the same thing in gRPC"
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yeah, but that assumption is wrong, otherwise the issue and this PR wouldn't exist. We clearly have multiple components assuming they own the request context, which is wrong.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I know I'm not the one to decide here, but I'm personally not. This PR is a result of great investigation, but the solution presented is a band-aid applied on a wrong abstraction level. If we actually want to support multiple "owners" of the same request context, I think something like I describe above is what we should do, and the more I think about it, the more I'm convinced the request context implementation itself should do it. (This is obviously performance sensitive...) Now, there's a difference between deactivation and destruction and I didn't think through this distinction, so there would be more work to do. Still, it feels better to do it properly once, rather than doing it improperly (at least once, but possibly multiple times) and then having to do it properly and undoing the hacks.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Frankly, ATM I don't even have time to carefully read all your comments, but I got a gist that @Ladicek have in mind solid solution and @manovotn agree in principle. I'd be fine to close this PR if you will handle the linked issue @Ladicek ? You can reuse that test, I wrote a lot of comments about re-running it, but it actually reproduced the issue pretty well from my workstation. I just don't know how stable reproducing will be from your laptop. Anyway, if noone else handles the issue, I can go back to it by the end of this week, read all your comments and try to come up with something.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I just talked to @mkouba and @manovotn and we concluded that concurrency in Quarkus is pretty much a mess (in how We also briefly discussed the difference between context deactivation and destruction and concluded that deactivation itself shouldn't "unregister" the owner, only destruction should. We didn't discuss what should constitute the "registration" of the request context owner, but it seems logical to me that it's the call to Looking at the implementation of
I'll try to figure something out. In the meantime, I'm certainly not blocking the merge of this PR, because both @mkouba and @manovotn are fine with it.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for all your comments and expertise everyone.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
OK, so that is clearly wrong. (I knew there was a hurdle!) Activating a request context with no initial state is creating a new request context. That only happens once. Registration of an additional owner may only happen when activating with non- |
||
| var previousOwner = ctx.data().put(CURRENT_CDI_REQUEST_CTX_OWNER, Objects.requireNonNull(newOwner)); | ||
| if (previousOwner != null && LOG.isDebugEnabled()) { | ||
| LOG.debugf("CDI request context owner has changed from '%s' to '%s'", previousOwner, newOwner); | ||
| } | ||
| } | ||
|
|
||
| /** | ||
| * Enables route handlers to determine if they can deactivate/destroy CDI request context without impacting | ||
| * any other extension. | ||
| * | ||
| * @param ctx RoutingContext | ||
| * @param owner typically a route handler that needs CDI request context active | ||
| * @return true if the CDI request context is owned by the {@code owner} | ||
| */ | ||
| public static boolean isCdiRequestContextOwner(RoutingContext ctx, String owner) { | ||
| return owner.equals(ctx.get(CURRENT_CDI_REQUEST_CTX_OWNER)); | ||
| } | ||
|
|
||
| /** | ||
| * Get the normalized and decoded path: | ||
| * - normalize based on RFC3986 | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| package io.quarkus.it.openapi.security; | ||
|
|
||
| import jakarta.enterprise.context.ApplicationScoped; | ||
|
|
||
| import io.quarkus.logging.Log; | ||
|
|
||
| @ApplicationScoped | ||
| public class FailureStorage { | ||
|
|
||
| private volatile Throwable throwable = null; | ||
|
|
||
| public Throwable getThrowable() { | ||
| return throwable; | ||
| } | ||
|
|
||
| public void setThrowable(Throwable throwable) { | ||
| Log.info("Setting throwable value to " + throwable); | ||
| this.throwable = throwable; | ||
| } | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,29 +1,91 @@ | ||
| package io.quarkus.it.openapi.security; | ||
|
|
||
| import jakarta.annotation.security.RolesAllowed; | ||
| import jakarta.enterprise.event.Observes; | ||
| import jakarta.inject.Inject; | ||
| import jakarta.ws.rs.DELETE; | ||
| import jakarta.ws.rs.GET; | ||
| import jakarta.ws.rs.Path; | ||
| import jakarta.ws.rs.core.Context; | ||
| import jakarta.ws.rs.core.HttpHeaders; | ||
| import jakarta.ws.rs.core.SecurityContext; | ||
|
|
||
| import io.quarkus.vertx.web.RouteFilter; | ||
| import io.vertx.core.Handler; | ||
| import io.vertx.ext.web.Router; | ||
| import io.vertx.ext.web.RoutingContext; | ||
|
|
||
| @Path("/security") | ||
| public class TestSecurityResource { | ||
|
|
||
| public static final String TEST_HEADER_NAME = "test-security-header"; | ||
| public static final String TEST_HEADER_VALUE = "hush-hush"; | ||
| public static int REQUEST_TIMEOUT = 3; | ||
|
|
||
| @Inject | ||
| FailureStorage failureStorage; | ||
|
|
||
| @Context | ||
| HttpHeaders httpHeaders; | ||
|
|
||
| @RolesAllowed("admin") | ||
| @GET | ||
| @Path("reactive-routes") | ||
| public String reactiveRoutes(@Context SecurityContext securityContext) { | ||
| return securityContext.getUserPrincipal().getName(); | ||
| } | ||
|
|
||
| @RolesAllowed("admin") | ||
| @GET | ||
| @Path("reactive-routes-with-delayed-response") | ||
| public String reactiveRoutesWithDelayedResponse(@Context SecurityContext securityContext) throws InterruptedException { | ||
| Thread.sleep(REQUEST_TIMEOUT); | ||
| try { | ||
| // ATM this code is not invoked every single time, it is timing issue | ||
| // but on my laptop it is invoked 7 times out of 10, therefore, if re-run multiple times, | ||
| // it sufficiently reproduces our original issue with accessing CDI request context | ||
| var headerValue = httpHeaders.getHeaderString(TEST_HEADER_NAME); | ||
|
|
||
| // just to do something - check the expected value | ||
| if (!TEST_HEADER_VALUE.equals(headerValue)) { | ||
| throw new IllegalStateException( | ||
| "Invalid header value, got '%s',but expected '%s' ".formatted(headerValue, TEST_HEADER_VALUE)); | ||
| } | ||
| } catch (Throwable t) { | ||
| failureStorage.setThrowable(t); | ||
| } | ||
| return securityContext.getUserPrincipal().getName(); | ||
| } | ||
|
|
||
| @Path("throwable") | ||
| @GET | ||
| public String getThrowable() { | ||
| return String.valueOf(failureStorage.getThrowable()); | ||
| } | ||
|
|
||
| @Path("empty-failure-storage") | ||
| @DELETE | ||
| public void emptyFailureStorage() { | ||
| failureStorage.setThrowable(null); | ||
| } | ||
|
|
||
| @RouteFilter(401) | ||
| public void doNothing(RoutingContext routingContext) { | ||
| // here so that the Reactive Routes extension activates CDI request context | ||
| routingContext.response().putHeader("reactive-routes-filter", "true"); | ||
| routingContext.next(); | ||
| } | ||
|
|
||
| void addFailureHandler(@Observes Router router) { | ||
| // this is necessary, because before the fix, the ContextNotActiveException was sometimes | ||
| // thrown in the CDI interceptors and handled by the QuarkusErrorHandler | ||
| router.route().order(Integer.MAX_VALUE - 100).failureHandler(new Handler<RoutingContext>() { | ||
| @Override | ||
| public void handle(RoutingContext routingContext) { | ||
| failureStorage.setThrowable(routingContext.failure()); | ||
| routingContext.end(); | ||
| } | ||
| }); | ||
| } | ||
|
|
||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this check unnecessary? I mean, at this point the owner must be
quarkus-reactive-routes(unless something replaced the CURRENT_CDI_REQUEST_CTX_OWNER)...Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is not how I read the situation. When
routingContext.next()happens, other route handlers are called synchronously until some code is scheduled (like run blocking), at which point this can be invoked. So it can easily be Quarkus REST.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I meant "synchronously".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but all those handlers share the same
RoutingContext...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point is that
finallyis run when Quarkus REST is processing becauseroutingContext.next()is run before.I don't know how to answer that :-) You will need to elaborate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so we don't want to deactivate the context either. Small note - if Vertx is used then
requestContext.deactivate()is a NOOP if duplicated context is used: https://github.com/quarkusio/quarkus/blob/main/extensions/vertx/runtime/src/main/java/io/quarkus/vertx/runtime/VertxCurrentContextFactory.java#L99There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't know, thanks. I thought that the bug is about the
endHandlerwhere isdestroyinvoked, but during the debugging I saw my breakpoint ondeactivatecalled after Quarkus REST started doing stuff, so I'd keep it here just in case you change howdeactivateworks in the future.