Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: allow bgworkers to access JWT claims #11

Merged
merged 21 commits into from
Oct 11, 2024
Merged

Conversation

mrl5
Copy link
Contributor

@mrl5 mrl5 commented Oct 8, 2024

related to https://github.com/neondatabase/cloud/issues/16041
and this checkpoint from #6

Are the thread local variables supposed to be per-backend? Doesn't seem like they need to be thread local to me. I think this has been previously discussed.

closes #6

@mrl5
Copy link
Contributor Author

mrl5 commented Oct 8, 2024

src/gucs.rs Outdated Show resolved Hide resolved
src/lib.rs Show resolved Hide resolved
src/gucs.rs Outdated Show resolved Hide resolved
@mrl5
Copy link
Contributor Author

mrl5 commented Oct 9, 2024

tested manually with:

export PGOPTIONS='-c neon.auth.kid=1 -c neon.auth.jwk={"crv":"P-256","kty":"EC","x":"CV9leSCKaIvTHgFe1Anhl8ylOu73QmZfGQ7nibnAFpo","y":"8PZKd0_9EZ6-Qm_on5DdO2LErp3T4LCEs7gBZWj3ACs"} -c neon.auth.jwt=eyJraWQiOjF9.eyJzdWIiOiJrb3RiZWhlbW90IiwianRpIjozfQ.5anyLNZ-8w8LycJnhIke_j3UZI-lEKXek1JhFWxLYD71VulwnBjPoTFVXY5fW31i2khdeMCxOeYGB0C10H1w3w'
cargo pgrx run pg16
drop extension pg_session_jwt; create extension pg_session_jwt;
select auth.user_id();

@mrl5 mrl5 marked this pull request as ready for review October 9, 2024 11:57
src/lib.rs Outdated
Spi::run(format!("SET {} = '{}'", NEON_AUTH_JWT_RUNTIME_PARAM, jwt).as_str())
.unwrap_or_else(|e| {
error_code!(
PgSqlErrorCode::ERRCODE_S_R_E_PROHIBITED_SQL_STATEMENT_ATTEMPTED,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd appreciate some1 confirming or suggesting proper ERRCODE for this case

Comment on lines +7 to +9
Forked off of pgrx 0.11.3 by Conrad Ludgate for the purposes of adding support for
1. Providing options used for initialising GucContext::Backend
2. Running tests as non-superuser
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will chat with the pgrx team and see what we can do here to upstream these ideas

Ok(())
}

// fn discard() -> eyre::Result<()> {
Copy link
Contributor Author

@mrl5 mrl5 Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@conradludgate can I remove this commented test case or you'd prefer to have it committed as is?

@mrl5 mrl5 merged commit b70ccea into neondatabase:main Oct 11, 2024
3 checks passed
@mrl5 mrl5 deleted the issue-16041 branch October 11, 2024 09:58
mrl5 added a commit that referenced this pull request Oct 15, 2024
## assumptions

1. `pg_session_jwt.jwk` is set by local proxy when connection process
(a.k.a backend) is created
2. `pg_session_jwt.jwt` is set by local proxy before query is run
3. bgworker process is started by query optimizer, as per [this
ref](https://www.postgresql.org/docs/current/how-parallel-query-works.html)
4. bgworker process has access to runtime parameters of a "leader" which
is a connection process in this case, as per [this
ref](https://www.postgresql.org/docs/current/bgworker.html)
5. bgworker process ends after execution

## rationale
* `auth.user_id()` and `auth.session()` functions can be marked as
`PARALLEL SAFE`:
* because rust's variables are always seeded by runtime params mentioned
in `#1` and `#2`. [Code
ref](https://github.com/neondatabase/pg_session_jwt/blob/b30d7df196d0cb816365e565a7579f9c78221206/src/lib.rs#L267-L270)
* both runtime params are already set once optimizer spawns bgworker
process. otherwise functions will panic
  * bgworker will always exit after exection
*
[ref](https://www.postgresql.org/docs/current/parallel-safety.html#PARALLEL-LABELING)
> PARALLEL SAFE indicates that the function is safe to run in parallel
mode without restriction, including in parallel worker processes.
  
> Similarly, functions must be marked PARALLEL RESTRICTED if they access
(...) miscellaneous backend-local state that the system cannot
synchronize across workers.

* `auth.user_id()` and `auth.session()` can be marked as `STABLE`,
because once runtime params mentioned in `#1` and `#2` are set, they
will consistently return the same result.
[Ref](https://www.postgresql.org/docs/current/sql-createfunction.html)
> STABLE indicates that the function cannot modify the database, and
that within a single table scan it will consistently return the same
result for the same argument values, but that its result could change
across SQL statements. This is the appropriate selection for functions
whose results depend on database lookups, parameter variables (such as
the current time zone), etc.

## how it was tested
1. set JWK and start new connection
```console
MY_JWK='{"crv":"P-256","kty":"EC","x":"CV9leSCKaIvTHgFe1Anhl8ylOu73QmZfGQ7nibnAFpo","y":"8PZKd0_9EZ6-Qm_on5DdO2LErp3T4LCEs7gBZWj3ACs"}'
export PGOPTIONS="-c pg_session_jwt.jwk=$MY_JWK"
cargo pgrx run pg16
```
2. init rust's variables
```sql
SELECT auth.init();
SELECT auth.jwt_session_init('eyJraWQiOjF9.eyJzdWIiOiJrb3RiZWhlbW90IiwianRpIjozfQ.5anyLNZ-8w8LycJnhIke_j3UZI-lEKXek1JhFWxLYD71VulwnBjPoTFVXY5fW31i2khdeMCxOeYGB0C10H1w3w');
```
3. create large table for parallel execution
```sql
CREATE TABLE foo (
  id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  bar text
);
INSERT INTO foo (bar)
  SELECT md5(baz::text) from generate_series(1,1e6) as baz;

INSERT INTO foo (bar) VALUES ( 'kotbehemot' );

INSERT INTO foo (bar)
  SELECT md5(baz::text) from generate_series(1,1e3) as baz;
```
4. ensure that optimizer runs query in parallel for static value
```sql
EXPLAIN ANALYZE SELECT * from foo where bar = 'kotbehemot';
```
5. validate that optimizer run query in parallel also for
`auth.user_id()`:
```sql
EXPLAIN ANALYZE SELECT * from foo where bar = auth.user_id();
```
6. compare the results of both queries
```sql
SELECT * from foo where bar = 'kotbehemot';
SELECT * from foo where bar = auth.user_id();
```

## test results
query plan and the result of last two queries was the same:
```
                                                     QUERY PLAN                                                      
---------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..16612.79 rows=1 width=41) (actual time=323.547..324.734 rows=1 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on foo  (cost=0.00..15612.69 rows=1 width=41) (actual time=317.550..317.658 rows=0 loops=3)
         Filter: (bar = auth.user_id())
         Rows Removed by Filter: 333667
 Planning Time: 3.390 ms
 Execution Time: 324.842 ms
(8 rows)
```
```
   id    |    bar     
---------+------------
 1000001 | kotbehemot
(1 row)
```

## additional context
this is a follow up of
#11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Compute team review
2 participants