fix(migration): Support non-alphanumeric passwords in alembic. #26094

Open
wants to merge 4 commits into
base: master
from

Conversation

ramki88

@ramki88 ramki88 commented Nov 24, 2023

This commit modifies alembic's env.py, so that it supports non-alphanumeric passwords.

SUMMARY

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • [x] Has associated issue: Fixes "URL encoded passwords are getting decoded in env.py" #26029
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

This commit modifies alembic's `env.py`, so that it supports non-alphanumeric passwords.
@@ -42,8 +42,8 @@
 "SQLite Database support for metadata databases will \
 be removed in a future version of Superset."
 )
-decoded_uri = urllib.parse.unquote(DATABASE_URI)
-config.set_main_option("sqlalchemy.url", decoded_uri)
+escaped_uri = DATABASE_URI.replace('%', '%%')
Member

This looks like it might break existing functionality. Should we instead do the replace on the "decoded_uri" var?

Author

@craig-rueda Do you mean reuse the same variable name, or apply the replace on top of urllib.parse.unquote(DATABASE_URI)?

Member

Like this:

decoded_uri = urllib.parse.unquote(DATABASE_URI)
decoded_uri = decoded_uri.replace('%', '%%')
config.set_main_option("sqlalchemy.url", decoded_uri)

So basically just add the replace in there.

Note that if your password has a % symbol in it, you may need to URL-encode it going in, i.e. some_pass_with_%_in_it would need to be URL-encoded as some_pass_with_%25_in_it.
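
As a rough illustration of the encoding this note describes, here is a small sketch using only Python's standard urllib.parse. The variable names are made up for the example; the last step simply applies the unquote-then-replace order suggested above.

```python
from urllib.parse import quote_plus, unquote

# Hypothetical password containing a literal percent sign.
raw_password = "some_pass_with_%_in_it"

# URL-encode it before it goes into the connection URI.
encoded = quote_plus(raw_password)
print(encoded)  # some_pass_with_%25_in_it

# Decoding restores the original password ...
decoded = unquote(encoded)

# ... and doubling '%' afterwards is the escaping suggested in this thread.
escaped = decoded.replace("%", "%%")
print(escaped)  # some_pass_with_%%_in_it
```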

Author

Ah, I get it now. Sorry, I should have explained the need for this change in more detail.

Example: a password containing @, such as super2p@ss. As the SQLAlchemy docs suggest, the password has to be escaped, either before it is passed to Superset or in superset_config.py using urllib.parse.quote_plus("super2p@ss"). In both cases alembic's env.py unquotes it again, and we end up with "super2p@ss", which psycopg2 cannot parse correctly.

  1. user passes -> "super2p@ss"
  2. superset_config.py escapes it -> "super2p%40ss"
  3. superset_config.py sets DATABASE_URI -> "postgresql+psycopg2://user:super2p%40ss@localhost:5432/superset"
  4. env.py decodes DATABASE_URI into decoded_uri -> "postgresql+psycopg2://user:super2p@ss@localhost:5432/superset"

The Superset app starts up, but the migration fails with the error "psycopg2.OperationalError: could not translate host name "ss@localhost" to address", because the URI is split at the "@" inside the password.
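
The failure can be reproduced in miniature with the sketch below. Note that naive_host is a hypothetical stand-in that ends the credentials at the first "@"; it is not the actual parsing code of SQLAlchemy or psycopg2, only an assumption chosen to match the host name in the error message.

```python
from urllib.parse import quote_plus, unquote


def naive_host(uri: str) -> str:
    """Hypothetical parser: treats the FIRST '@' as the end of the credentials."""
    rest = uri.split("://", 1)[1]
    _credentials, _, host_part = rest.partition("@")
    # Drop the port and the database path to keep only the host name.
    return host_part.split(":", 1)[0].split("/", 1)[0]


password = "super2p@ss"
encoded_uri = (
    f"postgresql+psycopg2://user:{quote_plus(password)}@localhost:5432/superset"
)
decoded_uri = unquote(encoded_uri)  # what env.py did before this change

print(naive_host(encoded_uri))  # localhost
print(naive_host(decoded_uri))  # ss@localhost
```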

Just removing the unquote in env.py will not fix the problem: the URI is set as a config option, which goes through configparser, so each "%" has to be escaped as "%%" to be valid.

So the flow becomes:

  1. user passes -> "super2p@ss"
  2. superset_config.py escapes it -> "super2p%40ss"
  3. superset_config.py sets DATABASE_URI -> "postgresql+psycopg2://user:super2p%40ss@localhost:5432/superset"
  4. env.py double-escapes DATABASE_URI into escaped_uri -> "postgresql+psycopg2://user:super2p%%40ss@localhost:5432/superset"
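
A minimal sketch of why the doubling is needed, assuming the option ends up in a standard-library configparser.ConfigParser with its default interpolation, as described above. The section name "alembic" is only a placeholder for this example.

```python
import configparser

uri = "postgresql+psycopg2://user:super2p%40ss@localhost:5432/superset"

parser = configparser.ConfigParser()
parser.add_section("alembic")  # placeholder section name

# A lone '%' is rejected by the default (basic) interpolation.
try:
    parser.set("alembic", "sqlalchemy.url", uri)
    raw_accepted = True
except ValueError:
    raw_accepted = False

# Doubling every '%' makes the value valid ...
escaped_uri = uri.replace("%", "%%")
parser.set("alembic", "sqlalchemy.url", escaped_uri)

# ... and reading it back collapses '%%' to '%' again.
round_tripped = parser.get("alembic", "sqlalchemy.url")
print(raw_accepted)          # False
print(round_tripped == uri)  # True
```

So under this assumption the consumer of the option still sees the single-encoded "super2p%40ss" form, not the doubled one.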

References:
psycopg/psycopg2#1546
sqlalchemy/alembic#700 (comment)

Member

OK, makes sense. From what it looks like here, we're just trying to keep an errant @ from finding its way into the connection string.

Hoping this doesn't break anyone 🙏

CC - @john-bodley @dpgaspar ??


codecov bot commented Nov 29, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (57d61df) 69.09% compared to head (626ea48) 70.64%.
Report is 362 commits behind head on master.

Files                        Patch %   Lines
superset/migrations/env.py   0.00%     2 Missing ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #26094      +/-   ##
==========================================
+ Coverage   69.09%   70.64%   +1.54%     
==========================================
  Files        1940     1945       +5     
  Lines       75867    82222    +6355     
  Branches     8444     8444              
==========================================
+ Hits        52423    58087    +5664     
- Misses      21268    21959     +691     
  Partials     2176     2176              
Flag       Coverage Δ
hive       54.20% <0.00%> (+0.51%) ⬆️
mysql      77.97% <0.00%> (-0.19%) ⬇️
postgres   79.36% <0.00%> (+1.10%) ⬆️
presto     54.24% <0.00%> (+0.60%) ⬆️
python     83.86% <0.00%> (+0.91%) ⬆️
sqlite     78.71% <0.00%> (+1.79%) ⬆️
unit       56.16% <0.00%> (+0.36%) ⬆️


@ramki88 ramki88 changed the title Support non-alphanumeric passwords in alembic. fix:(migration) Support non-alphanumeric passwords in alembic. Nov 29, 2023
@ramki88 ramki88 changed the title fix:(migration) Support non-alphanumeric passwords in alembic. fix(migration): Support non-alphanumeric passwords in alembic. Nov 29, 2023
@rusackas
Member

Hmm... title looks fine... but other CI tasks are stuck too. Closing/reopening to kick this. It might need a rebase, and @craig-rueda might want to take a fresh look.

@rusackas rusackas closed this Jan 31, 2024
@rusackas rusackas reopened this Jan 31, 2024
@github-actions github-actions bot added the risk:db-migration PRs that require a DB migration label Jan 31, 2024
@rusackas
Member

rusackas commented Apr 8, 2024

@ramki88 would you mind running the pre-commit hooks on this so it can pass CI/linting?

@ramki88
Author

ramki88 commented Apr 9, 2024

@ramki88 would you mind running the pre-commit hooks on this so it can pass CI/linting?

Sure, I've run it and rebased onto master.

@rusackas
Member

rusackas commented Apr 9, 2024

Running CI. Fingers crossed!

@kahlua-kol

Hey! Do you have any news on this topic? :) Would be really helpful to have a working fix for special characters

@rusackas
Member

I think the fact that some tests aren't running, and we have a weird (possibly unrelated) MySQL test error, indicates that some of these CI tasks have been fixed on master but aren't being reflected here. A rebase of this branch/PR (again, I know, sorry...) might resolve that. CI has been a bit turbulent lately.

@ramki88
Author

ramki88 commented May 15, 2024

I think the fact that some tests aren't running, and we have a weird (possibly unrelated) MySQL test error, indicates that some of these CI tasks have been fixed on master but aren't being reflected here. A rebase of this branch/PR (again, I know, sorry...) might resolve that. CI has been a bit turbulent lately.

Sure, I have rebased onto master (8fb2cae).

Labels
risk:db-migration PRs that require a DB migration size/XS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

URL encoded passwords are getting decoded in env.py
4 participants