Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SIP] Ability to connect to postgres with different schema with superset_config #29582

Closed
justmike1 opened this issue Jul 13, 2024 · 10 comments
Closed
Labels
change:backend Requires changing the backend

Comments

@justmike1
Copy link
Contributor

justmike1 commented Jul 13, 2024

Please make sure you are familiar with the SIP process documented
here. The SIP will be numbered by a committer upon acceptance.

[SIP] Proposal for Ability to connect to postgres with different schema with superset_config

Motivation

I need to connect to the same database as my application but use a different schema name than public

Proposed Change

Currently we have:
SQLALCHEMY_DATABASE_URI = f"postgresql+psycopg2://{env('DB_USER')}:{env('DB_PASS')}@{env('DB_HOST')}:{env('DB_PORT')}/{env('DB_NAME')}"

I suggest adding:
SQLALCHEMY_DATABASE_CONNECT_ARGS = {}

New or Changed Public Interfaces

NONE

New dependencies

NONE

Migration Plan and Compatibility

NONE as default would be {}

Rejected Alternatives

NONE

@justmike1 justmike1 added the sip Superset Improvement Proposal label Jul 13, 2024
@dosubot dosubot bot added the change:backend Requires changing the backend label Jul 13, 2024
@justmike1
Copy link
Contributor Author

I successfully connected to another schema using:

SQLALCHEMY_ENGINE_OPTIONS = {
    'connect_args': {'options': '-csearch_path=newtest'}
}

but now I recieve:

2024-07-13 15:34:08 INFO  [alembic.runtime.migration] Running upgrade  -> 4e6a06bad7a8, Init
2024-07-13 15:34:08 Traceback (most recent call last):
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1910, in _execute_context
2024-07-13 15:34:08     self.dialect.do_execute(
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
2024-07-13 15:34:08     cursor.execute(statement, parameters)
2024-07-13 15:34:08 psycopg2.errors.UndefinedTable: relation "ab_user" does not exist
2024-07-13 15:34:08 
2024-07-13 15:34:08 
2024-07-13 15:34:08 The above exception was the direct cause of the following exception:
2024-07-13 15:34:08 
2024-07-13 15:34:08 Traceback (most recent call last):
2024-07-13 15:34:08   File "/usr/bin/superset", line 8, in <module>
2024-07-13 15:34:08     sys.exit(superset())
2024-07-13 15:34:08              ^^^^^^^^^^
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
2024-07-13 15:34:08     return self.main(*args, **kwargs)
2024-07-13 15:34:08            ^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/click/core.py", line 1078, in main
2024-07-13 15:34:08     rv = self.invoke(ctx)
2024-07-13 15:34:08          ^^^^^^^^^^^^^^^^
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
2024-07-13 15:34:08     return _process_result(sub_ctx.command.invoke(sub_ctx))
2024-07-13 15:34:08                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
2024-07-13 15:34:08     return _process_result(sub_ctx.command.invoke(sub_ctx))
2024-07-13 15:34:08                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
2024-07-13 15:34:08     return ctx.invoke(self.callback, **ctx.params)
2024-07-13 15:34:08            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/click/core.py", line 783, in invoke
2024-07-13 15:34:08     return __callback(*args, **kwargs)
2024-07-13 15:34:08            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
2024-07-13 15:34:08     return f(get_current_context(), *args, **kwargs)
2024-07-13 15:34:08            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/flask/cli.py", line 357, in decorator
2024-07-13 15:34:08     return __ctx.invoke(f, *args, **kwargs)
2024-07-13 15:34:08            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/click/core.py", line 783, in invoke
2024-07-13 15:34:08     return __callback(*args, **kwargs)
2024-07-13 15:34:08            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/flask_migrate/cli.py", line 149, in upgrade
2024-07-13 15:34:08     _upgrade(directory, revision, sql, tag, x_arg)
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/flask_migrate/__init__.py", line 98, in wrapped
2024-07-13 15:34:08     f(*args, **kwargs)
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/flask_migrate/__init__.py", line 185, in upgrade
2024-07-13 15:34:08     command.upgrade(config, revision, sql=sql, tag=tag)
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/alembic/command.py", line 406, in upgrade
2024-07-13 15:34:08     script.run_env()
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/alembic/script/base.py", line 582, in run_env
2024-07-13 15:34:08     util.load_python_file(self.dir, "env.py")
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/alembic/util/pyfiles.py", line 95, in load_python_file
2024-07-13 15:34:08     module = load_module_py(module_id, path)
2024-07-13 15:34:08              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/alembic/util/pyfiles.py", line 113, in load_module_py
2024-07-13 15:34:08     spec.loader.exec_module(module)  # type: ignore
2024-07-13 15:34:08     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-07-13 15:34:08   File "<frozen importlib._bootstrap_external>", line 940, in exec_module
2024-07-13 15:34:08   File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/superset/extensions/../migrations/env.py", line 127, in <module>
2024-07-13 15:34:08     run_migrations_online()
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/superset/extensions/../migrations/env.py", line 119, in run_migrations_online
2024-07-13 15:34:08     context.run_migrations()
2024-07-13 15:34:08   File "<string>", line 8, in run_migrations
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/alembic/runtime/environment.py", line 946, in run_migrations
2024-07-13 15:34:08     self.get_context().run_migrations(**kw)
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/alembic/runtime/migration.py", line 628, in run_migrations
2024-07-13 15:34:08     step.migration_fn(**kw)
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/superset/migrations/versions/2015-09-21_17-30_4e6a06bad7a8_init.py", line 35, in upgrade
2024-07-13 15:34:08     op.create_table(
2024-07-13 15:34:08   File "<string>", line 8, in create_table
2024-07-13 15:34:08   File "<string>", line 3, in create_table
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/alembic/operations/ops.py", line 1311, in create_table
2024-07-13 15:34:08     return operations.invoke(op)
2024-07-13 15:34:08            ^^^^^^^^^^^^^^^^^^^^^
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/alembic/operations/base.py", line 442, in invoke
2024-07-13 15:34:08     return fn(self, operation)
2024-07-13 15:34:08            ^^^^^^^^^^^^^^^^^^^
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/alembic/operations/toimpl.py", line 131, in create_table
2024-07-13 15:34:08     operations.impl.create_table(table)
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/alembic/ddl/impl.py", line 369, in create_table
2024-07-13 15:34:08     self._exec(schema.CreateTable(table))
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/alembic/ddl/impl.py", line 210, in _exec
2024-07-13 15:34:08     return conn.execute(construct, params)
2024-07-13 15:34:08            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1385, in execute
2024-07-13 15:34:08     return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
2024-07-13 15:34:08            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/sqlalchemy/sql/ddl.py", line 80, in _execute_on_connection
2024-07-13 15:34:08     return connection._execute_ddl(
2024-07-13 15:34:08            ^^^^^^^^^^^^^^^^^^^^^^^^
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1477, in _execute_ddl
2024-07-13 15:34:08     ret = self._execute_context(
2024-07-13 15:34:08           ^^^^^^^^^^^^^^^^^^^^^^
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1953, in _execute_context
2024-07-13 15:34:08     self._handle_dbapi_exception(
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2134, in _handle_dbapi_exception
2024-07-13 15:34:08     util.raise_(
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
2024-07-13 15:34:08     raise exception
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1910, in _execute_context
2024-07-13 15:34:08     self.dialect.do_execute(
2024-07-13 15:34:08   File "/usr/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
2024-07-13 15:34:08     cursor.execute(statement, parameters)
2024-07-13 15:34:08 sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedTable) relation "ab_user" does not exist
2024-07-13 15:34:08 
2024-07-13 15:34:08 [SQL: 
2024-07-13 15:34:08 CREATE TABLE clusters (
2024-07-13 15:34:08     created_on TIMESTAMP WITHOUT TIME ZONE NOT NULL, 
2024-07-13 15:34:08     changed_on TIMESTAMP WITHOUT TIME ZONE NOT NULL, 
2024-07-13 15:34:08     id SERIAL NOT NULL, 
2024-07-13 15:34:08     cluster_name VARCHAR(250), 
2024-07-13 15:34:08     coordinator_host VARCHAR(255), 
2024-07-13 15:34:08     coordinator_port INTEGER, 
2024-07-13 15:34:08     coordinator_endpoint VARCHAR(255), 
2024-07-13 15:34:08     broker_host VARCHAR(255), 
2024-07-13 15:34:08     broker_port INTEGER, 
2024-07-13 15:34:08     broker_endpoint VARCHAR(255), 
2024-07-13 15:34:08     metadata_last_refreshed TIMESTAMP WITHOUT TIME ZONE, 
2024-07-13 15:34:08     created_by_fk INTEGER, 
2024-07-13 15:34:08     changed_by_fk INTEGER, 
2024-07-13 15:34:08     PRIMARY KEY (id), 
2024-07-13 15:34:08     UNIQUE (cluster_name), 
2024-07-13 15:34:08     FOREIGN KEY(created_by_fk) REFERENCES ab_user (id), 
2024-07-13 15:34:08     FOREIGN KEY(changed_by_fk) REFERENCES ab_user (id)
2024-07-13 15:34:08 )
2024-07-13 15:34:08 
2024-07-13 15:34:08 ]

Copy link

dosubot bot commented Jul 13, 2024

Hey there, @justmike1! I'm Dosu, a bot here to help you with bugs, questions, and becoming a contributor. Let's squash those bugs together! 🤖

To resolve the error "relation 'ab_user' does not exist" when trying to connect to another schema using SQLAlchemy, you need to ensure that the schema is correctly specified in your SQLAlchemy connection string or in your query. The ab_user table is part of the schema used by Apache Superset, and if you are connecting to a different schema, you need to reference it properly.

Here is an example of how you can specify the schema in your SQLAlchemy connection:

from sqlalchemy import create_engine, MetaData, Table

# Example connection string with schema specified
engine = create_engine('postgresql://user:password@localhost/dbname')

# Reflect the tables from the specified schema
metadata = MetaData(schema='your_schema_name')
ab_user_table = Table('ab_user', metadata, autoload_with=engine)

# Now you can use the ab_user_table for your queries
with engine.connect() as connection:
    result = connection.execute(ab_user_table.select())
    for row in result:
        print(row)

In this example, replace 'your_schema_name' with the actual schema name where the ab_user table resides. This ensures that SQLAlchemy looks for the ab_user table in the correct schema.

Additionally, ensure that the ab_user table and its related tables are correctly defined in your schema. For example, in Apache Superset, the ab_user table is referenced in various other tables such as sqlatable_user and druiddatasource_user:

sqlatable_user = sa.Table(
    "sqlatable_user",
    sa.MetaData(),
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("user_id", sa.Integer, sa.ForeignKey("ab_user.id")),
    sa.Column("table_id", sa.Integer, sa.ForeignKey("tables.id")),
)

druiddatasource_user = sa.Table(
    "druiddatasource_user",
    sa.MetaData(),
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("user_id", sa.Integer, sa.ForeignKey("ab_user.id")),
    sa.Column("datasource_id", sa.Integer, sa.ForeignKey("datasources.id")),
)

Make sure that these relationships are properly set up in your database schema to avoid any issues with missing relations [1][2].

To continue talking to Dosu, mention @dosu.

@justmike1 justmike1 changed the title [SIP] Ability to connect to postgres with connection arguments with superset_config [SIP] Ability to connect to postgres with different schema with superset_config Jul 13, 2024
@Habeeb556
Copy link

@justmike1 the error you encountered sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedTable) relation "ab_user" does not exist indicates that the tables are created under the public schema, causing the application to be unable to locate the ab_user table with the new schema, which contains the old login information.

To resolve this, you should connect to your database and move the tables from the public schema to the new schema. After that, run the following commands to redeploy the database connection:

export FLASK_APP=superset
superset db upgrade
superset init

@justmike1
Copy link
Contributor Author

@Habeeb556 but why they were created in public in the first place that’s what I am wondering? As I have deployed a clean deployment. I can’t do this in production as I don’t have access to public schema and need superset to connect to another one I will create for it

@Habeeb556
Copy link

@justmike1 I believe this is because it's the default schema, and the Superset deployment is based on it without specifying a schema. The question is, why do you need to change the schema if you don't have access to the database? What difference would it make for you?

@justmike1
Copy link
Contributor Author

@Habeeb556

We have an internal policy which we can't use the default schema, which is public in postgres, so I need to be able to run migrations and connect and run on a schema which is the name of the applicaiton, so it will look like:

develop PSQL -> bitools (database name) -> superset (schema name)

@Habeeb556
Copy link

@justmike1

I tried simulating this setup, and it worked for me using the following configuration in superset_config.py:

# Metadata Repository
SQLALCHEMY_DATABASE_URI = 'postgresql://username:password@localhost/databasename?options=-c%20search_path=schemaname'

Ensure you connect to the database and create the new schema:

\c databasename
CREATE SCHEMA schemaname;
GRANT ALL ON SCHEMA schemaname TO username;

Then, run the following commands to upgrade and initialize the new configuration on this schema:

export FLASK_APP=superset
superset db upgrade
superset init

This will initialize a new configuration, not migrate from the old schema to the new schema. If you want to migrate production data, I recommend setting the new schema name in superset_config.py and simply renaming the public schema.

ALTER SCHEMA schema_name RENAME TO new_name;

Note: If you're working in a production environment, make sure to take a backup of the database before proceeding to safeguard against any potential issues.

@justmike1
Copy link
Contributor Author

@Habeeb556 Great, thanks!

currently it is running in testing environment which we are looking to move to production, thats how I found this need, thanks!

Is it possible to run the migration in different user as well? I have also a production requirement to run the migrations in a user which can create/delete tables, but another runs on runtime which can write/read on the tables

@Habeeb556
Copy link

You're welcome. If it's resolved, we can close this for now and continue with thread #29570.

@rusackas
Copy link
Member

Removing the SIP label, and we won't number this since it's effectively solved. Thanks again @Habeeb556, keep up the great work!

@rusackas rusackas removed the sip Superset Improvement Proposal label Aug 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
change:backend Requires changing the backend
Projects
None yet
Development

No branches or pull requests

3 participants