Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the embedded NR Editor work in a cross domain-setup #3800

Closed
cstns opened this issue Apr 30, 2024 · 42 comments · Fixed by #3801
Closed

Make the embedded NR Editor work in a cross domain-setup #3800

cstns opened this issue Apr 30, 2024 · 42 comments · Fixed by #3801
Assignees
Labels
area:api Work on the platform API area:frontend For any issues that require work in the frontend/UI headline Something to highlight in the release priority:high High Priority size:M - 3 Sizing estimation point story A user-oriented description of a feature
Milestone

Comments

@cstns
Copy link
Contributor

cstns commented Apr 30, 2024

Epic

#3646

Description

As a:
user

I want to:
Be able to use the embedded NR editor while the editor and app domains differ

The issue is easily reproduced in production where the app domain is .flowfuse.com and instance domains are .flowforge.cloud.

In order to access the embedded editor, use chrome, and from the instance overview page, replace the /overview path with /editor

Which customers would this be availble to

Everyone - CE/Starter/Team/Enterprise

Acceptance Criteria

I am able to access and manage the embedded Node-RED editor using any browser.

Have you provided an initial effort estimate for this issue?

I have provided an initial effort estimate

@cstns cstns added story A user-oriented description of a feature area:frontend For any issues that require work in the frontend/UI size:S - 2 Sizing estimation point size:M - 3 Sizing estimation point area:api Work on the platform API labels Apr 30, 2024
@cstns cstns self-assigned this Apr 30, 2024
@cstns cstns moved this to In Progress in 🛠 Development Apr 30, 2024
@joepavitt joepavitt added this to the 2.5 milestone May 8, 2024
@joepavitt joepavitt moved this to Started in ☁️ Product Planning May 8, 2024
@joepavitt
Copy link
Contributor

Currently blocked by FlowFuse/nr-launcher#230 and #3801

@cstns is investigating whether there are workaround here, @cstns can you report here with a status summarising the explicit blockers, and what prospect (if any) we have of unblocking this please? I'm considering abandoning and moving to something else given the time it's taking.

@cstns
Copy link
Contributor Author

cstns commented May 8, 2024

sure thing!

Spoke with @hardillb earlier this morning and he pointed me to the oauth lib that handles auth between the editor and ff.

I also stumbled across jaredhanson/passport#938 in which the author of the lib kind of leaves the cross domain implementation part to the specific implementation of each use case.

I'm currently trying to circumvent the lib's inability to set the connect.sid cookie on cross domains.

ETA wise, I really can't give a precise estimation because i'm somewhat at a loss because debugging the nr-launcher and the entire oauth/passport.js lib is cumbersome.

I would greatly appreciate some support on it if possible

@cstns
Copy link
Contributor Author

cstns commented May 8, 2024

I have a feeling that running the entire ff suite (and instances) with valid tls certificates locally would solve the issue, the set-cookie headers are being sent, but the cookie is not being persisted.

This may be due to the fact that the cookie's SameSite defaults to lax and is not taken into consideration by the browser. In order to be able to set the cookie, the SameSite attribute would need to be set to none, but that would need the Secure attr which in turn requires valid tls certs over https.

Unfortunately, the whole scenario cannot be replicated on the staging environment because of the identical domains for the FF app and instances and would just set the cookies from the start.

@knolleary
Copy link
Member

@cstns lets get some time together this week to dig into this together. I've spent a lot of time in the depths of passport and the oauth flows. I just need to understand a bit more on where this is currently failing; we know we can do cross-domain authentication because that's how it already works when not being embedded in an iframe. I'm not yet up to speed on how the iframe part is breaking things.

@cstns
Copy link
Contributor Author

cstns commented May 9, 2024

I would greatly appreciate it!

@cstns
Copy link
Contributor Author

cstns commented May 9, 2024

This is what we're dealing with in a nutshell:
image

@knolleary
Copy link
Member

For future reference, this blog post describes the issue we're hitting very well: https://www.codeproject.com/Articles/5330276/Cross-Domain-Embedding-Making-Third-Party-Cookies

@joepavitt
Copy link
Contributor

@knolleary does that mean we can circumvent the problems?

@cstns
Copy link
Contributor Author

cstns commented May 10, 2024

Status update:

After setting up a local docker installation that uses valid TLS certificates over HTTPS with different domains, I was partially successful in loading the NR editor in an embedded iframe in chrome.

By partially successful I mean that we overcame the browser's security policies by assigning the CSP headers which allowed the NR editor to load but got stuck in an authentication loop.

Might be related to: #2841, #2582, #1184 and NR #4684

NR #4684 might resolve the problem, we should check with NR v3.1.10 or v3.4 when they come out.

In the whole oauth process, the connect.sid's SameSite needs to be set to none on the NR side in order for the cookie to be picked up by the NR instance domain and complete the authentication callback.

Currently the cookies SameSite attribute defaulted to lax which causes a strange situation where the cookie is present on the FF domain (with the correct NR instance's domain set) but not on the NR instance domain.

image
image

N.B my dev setup used the

  • *.docker.flowforge domain for the FF app
  • *.docker.flowfuse for the NR instance domain

@cstns cstns added blocked and removed size:S - 2 Sizing estimation point labels May 10, 2024
@knolleary
Copy link
Member

The linked NR issue regarding login loop will stop the loop, but only to show a login prompt in Node-RED. It won't address the core issue here related to the cookie options required to be set by Node-RED's session handling.

I will look at the Node-RED side on where this cookie is being set, and what would be needed to get the cookie options to be configurable.

We also need to figure out the impact all of this has of the localfs setup which doesn't use https - because all of this cookie handling is contingent on https usage. That will complicate localfs setups (including local dev setups - although there appears to be some leeway available when working purely on localhost domains).

@cstns
Copy link
Contributor Author

cstns commented May 10, 2024

The impact for development will be quite limited, as you said, localhost is quite lenient specifically for these types of scenarios and localfs won't require valid https configuration unless the NR instances and FF app reside on different domains.

@knolleary
Copy link
Member

unless the NR instances and FF app reside on different domains.

With the localfs driver, the core app and node-red instances run on different ports. If the instance is being accessed via an external IP address (ie not localhost), then the different port numbers mean they are treated as different domains. So I'm pretty certain that will be impacted.

@cstns
Copy link
Contributor Author

cstns commented May 10, 2024

Port-wise, we're safe, the documentation for CSP headers frame-ancestor host-sources directive allows the use of wildcards.

If we're successful in resolving the cookie issue, the next step would be to set the CSP headers automatically from the runtimeSettings in nr-launcher and that should cover the scenario for the port differences.

Unfortunately it still won't work without Https because the cookie will not be set properly.

@cstns
Copy link
Contributor Author

cstns commented May 23, 2024

POC of cross domain setup with valid tls certificates

csp-2024-05-23_14.30.47.mp4

Changes required are as follows:

  • @FlowFuse
    • we need to allow the app to embed itself for auth redirects
    • setting Content-Security-Policy header with "frame-ancestors 'self' *.main-ff-domain:*" *.main-ff-other domain:*" directive
      • the self directive would suffice in this case, but we can also add a custom list of domains that accept wildcards for domains/subdomains/ports
      • this directive determines what domains can embed the FF app
  • @nr-launcher
    • we need to configure NR instances by adding a global CSP header to determine a list of domains which can embed the NR editor
    • setting Content-Security-Policy header with "frame-ancestors 'self' *.main-ff-domain:*" *.main-ff-other domain:*" directive
    • these headers should be determined programmatically as opposed to the flowfuse repo as domains might differ
    • the self directive should not be required here, but left it as is
  • @node-red
    • we would need to add additional cookie attributes on the oauth callback method in @node-red/packages/node_modules/@node-red/editor-api/lib/auth/index.js:167
    • ```js
        cookie: {
          sameSite: 'None',
          secure: true,
          partitioned: true, // not available yet
        },
      ```
      
    • change needs to be added in lib/runtimeSettings.js:327

Cookie CHIPS

On a different note, Google is apparently in the middle of phasing out 3rd party cookies:

Google's decision to phase out the third-party cookie in Chrome was announced in 2020 and has been postponed a couple of times. Then as of May 2023, Google declared that they had reached a point of no return and would begin the process in January 2024

Luckily, they postponed it yet again to Q4 in 2024.

The way they're going forward is with 3rd Party Cookie CHIPS (Cookies Having Independent Partitioned State ), which is currently only available in chrome an is more or less namespace-ing for cookies.

This entire shift to cookie with CHIPS is so new that is causing problems everywhere, and, in my opinion Google will just postpone their phasing out yet again. There are a lot of articles online on this matter.

Long story short, we need to set the partitioned attribute on the express session in NR as well, but can't just yet.
There's an open issue on the expressjs expressjs/express#5275 which sums everything up nicely.

Express uses the cookie.js lib for cookie management, but cookie.js doesn't support the partitioned cookie attribute only from v0.6.0 onward, a version that's not currently bundled with express.
- a straightforward fix would be expressjs/express#5275 (comment) in the same issue.
- another fix would be to avoid the partitioned attribute altogether and handle it when time comes as will everybody else

As an aside, the changes Google will be implementing will impact all cross domain interactions moving forward, not only this specific scenario

N.B. Google's 3rd party deprecation info

@joepavitt
Copy link
Contributor

@knolleary can you verify please?

@knolleary
Copy link
Member

I will doing some testing on staging to verify.

My main concern with this whole feature at this point is that if it requires configuration changes outside of the helm setup, then how do self-hosted customers handle it? What documentation do we need to have in place as part of the upgrade guide?

Do we need a way to disable the feature via config in case a customer has a setup that breaks it somehow? Otherwise they will be locked out of the editor.

@knolleary
Copy link
Member

@ppawlowski sorry, should have tagged you into the above comment on documentation for self-hosted users.

@joepavitt
Copy link
Contributor

joepavitt commented Jun 19, 2024

if it requires configuration changes outside of the helm setup,

Can you expand on what these are please? I suspect in most cases the NR instances and the FF instance would be at the same domain (as per our staging), FF Cloud's setup I would expect to be fairly unique, but I may be wrong

@knolleary
Copy link
Member

@joepavitt
Copy link
Contributor

and we'll have no way of knowing, within the scope of FF whether or not that's been correctly configured?

@knolleary
Copy link
Member

Correct. We have to provide the cookie options statically to Node-RED before we ever see a request come in which would tell us if it was using http or https.

@joepavitt
Copy link
Contributor

So, our options are:

  1. On by default for all FF - which will break for all self-hosted instances
  2. Configuration Option - toggled on, it'll run, but requires pre-configuration meaning barely anyone outside of FF Cloud will use it
  3. We don't deliver at all

We're already constrained by NR4.0+ and a min. nr-launcher version too, right?

@ppawlowski
Copy link
Contributor

ppawlowski commented Jun 19, 2024

Changes introduced in https://github.com/FlowFuse/CloudProject/pull/440 are scoped to the Nginx Ingress Controller. We are not installing any ingress controller as a part of our Helm chart. However, we do require an Nginx Ingress Controller installed for a self-hosted solution.

In my opinion, we should:

  • make the option configurable in the app or make an explicit requirement for the self-hosted infrastructure, that if it is running behind any kind of load balancer, headers should be passed correctly to the backend (especially x-forwarded-proto header)
  • update our documentation regarding ingress controller here , here and here

@cstns
Copy link
Contributor Author

cstns commented Jun 19, 2024

  1. On by default for all FF - which will break for all self-hosted instances

It would only affect self-hosted instances where they have different domains for their FF app & instances

@knolleary
Copy link
Member

I can confirm on staging the x-forwarded- headers are now being passed through. However something is still broken in the authentication when we enable secure cookies. I will need to debug further to see where it's falling down.

@knolleary
Copy link
Member

Okay - made some progress. The cookie is being set now - which is good. However, as I had already logged into the editor prior to testing this, I already had a session cookie without the secure/paritioned flag. When testing with the new cookie options, it set a second cookie with the new flags - but the session then used the old cookie to get the session id, which didn't match a known session so auth failed.

By manually deleting the old cookie, I was then able to login as expected.

The question is then if/how we can delete the old cookie automatically so users don't have to do that themselves.

I need to park this now as I have other priorities to focus today. I will think on how to handle the change in cookie options.

@joepavitt joepavitt assigned knolleary and unassigned cstns Jun 19, 2024
@joepavitt joepavitt added the headline Something to highlight in the release label Jun 19, 2024
@joepavitt
Copy link
Contributor

Have marked this is a headline too for 2.6, as it should be called out in relevant socials once it's available

@joepavitt
Copy link
Contributor

@knolleary when are you hoping to get to this? Would be nice to prioritise this now that 4.0 is out.

@knolleary
Copy link
Member

@joepavitt it's high on the todo list. The cookie issue I mentioned before is complicated - I don't have any idea yet on how to fix it without asking users to delete their cookies; something we cannot do.

@joepavitt
Copy link
Contributor

Is this cleared if a user just logs out, and logs back in again? If so, it'll only impact users logged in at the time the feature is delivered?

@cstns
Copy link
Contributor Author

cstns commented Jun 21, 2024

unpopular opinion:
If we need to clear the auth cookies on the NR instance when the the nr-launcher and NR versions are met we could do that in the nr-launcher httpAdminMiddleware runtime settings for auth cookies that are older than xxx. I think that would trigger an authentication request if i'm not mistaken.. It's stupid but it might work, Correct me if I'm wrong

@knolleary
Copy link
Member

This is what I need to investigate - including figuring out the true impact.

The cookie in question is coming from the internals of the auth system; it is only required for the duration of the auth exchange - once logged in, it is no longer required. However the cookie is not getting cleared - even after logout. Any changes to that are going to be in the core of Node-RED - which doesn't help us with existing instances.

So we do need some logic to clear the cookie - the hard part is working out the right place to get that done without breaking the auth flow - or anything else related to that cookie.

@joepavitt
Copy link
Contributor

@knolleary status update please? Release next week, and marketing need to know if they can talk about this in release blog

@knolleary
Copy link
Member

Still blocked, but working on it. Have managed to rule out plenty of possible fixes with more still to look at. This is my primary task at the moment. Will update as soon as I have anything positive to share.

@knolleary
Copy link
Member

knolleary commented Jun 27, 2024

Summary of current status.

Problem

  1. The oauth flow relies on a cookie called connect.sid - this is set by the interals of the node-red auth system.
  2. If a user has previously logged in, they will have an existing connect.sid cookie with the old options (not secure, not partitioned).
  3. For the immersive editor to work, this cookie needs to have the secure and partitioned flags set
  4. Once we update nr-launcher to set these flags, we end up with two cookies called connect.sid - the old unsecure one and the new secure one.
  5. The auth code (express-sesssion) grabs the first cookie with the name connect.sid - the old unsecure one. This doesn't match the one it had set in the previous request (the secure one) so the auth fails.

At the server side, the http request comes in with the two cookies. You don't get to see any of the secure/partioned flags - you just get the two cookies with the same name but different values. There is no way to distinguish between them.

The goal is to find a way to detect this situation and proactively delete the old cookie. That is proving quite difficult to do.

We simply don't have the hooks in the Node-RED auth code in order to do this custom action - the underlying req and response objects are never exposed in a way we can examine them - and more importantly, modify the response to clear the cookie.

I've looked at changing Node-RED to delete the cookie if auth fails. However, to delete a cookie you have to provide the exact same secure/partitioned flags as were set originally. Node-RED only knows what options should be set on the real cookie - it doesn't know (and should not need to know) about a situation like ours where we want to migrate to a new set of cookie options.

I have one final approach to explore:

  1. Currently Node-RED doesn't set the name of the session cookie - it defaults to connect.sid. We could update Node-RED to allow that cookie name to be customised.
  2. Do a maintenance release of NR with that change in.
  3. Update the launcher to customise the name of the cookie at the same time as setting the new secure options (thereby no longer clashing with the old cookie)
  4. Update the version number checks the UI uses to determine if the immersive editor should be available or not - the minimum version will have to be the new NR maintenance release.

Will test that approach tomorrow.

@knolleary
Copy link
Member

knolleary commented Jul 1, 2024

Currently on track to get this working for the release this week. This is the sequence of events:

@joepavitt
Copy link
Contributor

After a long journey - we've finally got this working - follow on tasks to link up some relevant components will be tracked independently, and I will close this issue out now.

@github-project-automation github-project-automation bot moved this from In Progress to Done in 🛠 Development Jul 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:api Work on the platform API area:frontend For any issues that require work in the frontend/UI headline Something to highlight in the release priority:high High Priority size:M - 3 Sizing estimation point story A user-oriented description of a feature
Projects
Archived in project
Status: Done
Development

Successfully merging a pull request may close this issue.

4 participants