Skip to content

Conversation

nicolasmueller
Copy link
Contributor

@nicolasmueller nicolasmueller commented Feb 24, 2024

Enable the creation of external TableReferences that reference tables in external schemas. This is useful to tell pipedag about tables that already exist in the database and should not be copied. It avoid manually reflecting them, which can sometimes be cumbersome as one might want to call a Task with a pipedag Table and sometimes with an external table.

Checklist

  • Added a docs/source/changelog.md entry

@nicolasmueller nicolasmueller self-assigned this Feb 24, 2024
@NicolasMuellerQC
Copy link
Contributor

Enabling external views for DuckDB depends on duckdb/duckdb#10322

@nicolasmueller nicolasmueller marked this pull request as ready for review February 26, 2024 23:54
@nicolasmueller nicolasmueller requested a review from a team as a code owner February 26, 2024 23:54
@windiana42
Copy link
Member

Enabling external views for DuckDB depends on duckdb/duckdb#10322

We could implement a workaround since this problem is quite common among some SQLAlchemy backends. We have such a workaround in other places of the codebase.

dict(
name=tbl.name,
schema=store.get_schema(tbl.stage.current_name).get(),
schema=store.get_schema(tbl.stage.current_name, tbl).get(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is good to add external tables here. However, it might be worth noting which ones are external ones. We use this list to request a shared reader lock for DB2 on source tables. This list is not used to produce the actual query. Thus it might be ok to exclude external tables for some actions. An external table might also be a federated production table in DB2 which we should not lock or don't have the permission to lock.
=> This is important to keep in mind during roll-out for DB2

Suggested change
schema=store.get_schema(tbl.stage.current_name, tbl).get(),
schema=store.get_schema(tbl.stage.current_name, tbl).get(),
external_schema=tbl.external_schema,

If we transport external_schema separately, we might not need to add tbl as argument to store.get_schema

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a new argument db2_shared_lock_allowed (still to be tested). Does that make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A DB2 test has been added in a5769bc

@nicolasmueller
Copy link
Contributor Author

We could implement a workaround since this problem is quite common among some SQLAlchemy backends. We have such a workaround in other places of the codebase.

Done.

@NicolasMuellerQC
Copy link
Contributor

@windiana42 Let me know if it looks good now 😃

Also adjusted documentation to be more generic.
Redefined default to False.
src_tables = [
f"{preparer.format_schema(tbl['schema'])}.{preparer.quote(tbl['name'])}"
for tbl in create.source_tables
if tbl["shared_lock_allowed"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to remind myself: this is needed to prevent DB2 dialect from locking external table reference input tables if not allowed.

Copy link
Member

@windiana42 windiana42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me (I changed a few things directly on the branch).

@windiana42
Copy link
Member

@nicolasmueller good catch! did you find something else?

@NicolasMuellerQC
Copy link
Contributor

@nicolasmueller good catch! did you find something else?

@windiana42 No, I am fine with merging then 👍

# Conflicts:
#	docs/source/changelog.md
@nicolasmueller nicolasmueller merged commit 9b57f7c into main Mar 7, 2024
@nicolasmueller nicolasmueller deleted the external_table_reference branch March 7, 2024 13:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants