Add dbt_dev environment #580

Open
wants to merge 10 commits into
base: main

aaronsteers
Contributor

@aaronsteers aaronsteers commented Feb 27, 2023

Resolves #579

@aaronsteers
Contributor Author

aaronsteers commented Feb 27, 2023

@pnadolny13 - Any ideas?

05:02:39  Database Error in model cloud_ip_ranges (models/common/cloud_ip_ranges.sql)
05:02:39    002003 (02000): SQL compilation error:
05:02:39    Schema 'USERDEV_PREP.SNAPSHOT' does not exist or not authorized.
05:02:39    compiled Code at ../.meltano/transformers/dbt/target/run/squared/models/common/cloud_ip_ranges.sql

Update: This is now fixed. Resolved by meltano run dbt-snowflake:snapshot. I added a note to CONTRIBUTING.md.

@pnadolny13
Contributor

@pnadolny13 - Any ideas?

05:02:39  Database Error in model cloud_ip_ranges (models/common/cloud_ip_ranges.sql)
05:02:39    002003 (02000): SQL compilation error:
05:02:39    Schema 'USERDEV_PREP.SNAPSHOT' does not exist or not authorized.
05:02:39    compiled Code at ../.meltano/transformers/dbt/target/run/squared/models/common/cloud_ip_ranges.sql

@aaronsteers I actually think this might be a bug even though you got it figured out. I would expect that schema name to be prefixed with your user prefix rather than being just a plain .SNAPSHOT.

@pnadolny13
Contributor

@aaronsteers thanks for poking around and opening this PR!

Can you explain more about why dbt_dev is necessary for this case? I haven't spent much time thinking about others using this repo recently because I've been flying solo, but my thought was that userdev should be set up to work for anyone running dbt into their own name-prefixed schemas, and if they want, they can toggle a few settings to run EL + dbt. More docs on how that should work are definitely needed, and if it doesn't work like I expect then I'd consider that a bug.

If there's a use case I'm missing then I'm open to a new environment but thought we had all our bases covered (although there might be bugs that make it not work 😅 ).

@aaronsteers
Contributor Author

aaronsteers commented Feb 28, 2023

@pnadolny13 re:

@aaronsteers thanks for poking around and opening this PR!

My pleasure! 😅

Can you explain more about why the dbt_dev is necessary for this case?

Under userdev, I believe the raw DBs are all expected to be recreated on a per-user basis. Instead of a BYO-raw-data approach, the dbt_dev profile defaults all raw data locations to the prod DB - so that the contributor can immediately focus just on building transforms on top of existing datasets.
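
A minimal sketch of the distinction described above, assuming hypothetical database names (RAW for the prod raw data and a user-prefixed copy for userdev). The function name, the prefix format, and the fallback branch are illustrative assumptions, not the project's actual configuration:

```python
# All names below are hypothetical placeholders, not taken from the repo.
PROD_RAW_DATABASE = "RAW"


def raw_database(environment, user_prefix):
    """Pick the database that raw sources are read from."""
    if environment == "dbt_dev":
        # Transform-only development: read the existing prod raw data.
        return PROD_RAW_DATABASE
    if environment == "userdev":
        # EL or end-to-end development: each user loads their own raw copy.
        return f"{user_prefix}_{PROD_RAW_DATABASE}"
    # Assumed fallback: any other environment reads prod raw directly.
    return PROD_RAW_DATABASE
```

Under these assumptions, only the raw source location differs between the two environments; where the transformed models are written is a separate concern.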

If this pattern works, we might want to rename userdev to el_dev or e2e_dev - to emphasize the different use case of EL and/or end-to-end development. None of this needs to change permissions - this would just toggle behaviors more quickly based on the type of development being done.

Wdyt?

@pnadolny13
Contributor

Under userdev, I believe the raw DBs are all expected to be recreated on a per-user basis. Instead of a BYO-raw-data approach, the dbt_dev profile defaults all raw data locations to the prod DB - so that the contributor can immediately focus just on building transforms on top of existing datasets.

@aaronsteers oh yeah, I must have changed the default at some point. Originally I had it set to read from prod RAW, with a commented section instructing users how to toggle between prod RAW and their own personal EL raw data. I see why the uncommenting approach is less ideal and how a new environment would help. Initially I'm hesitant to copy/paste the config into a new environment file, to avoid drift if those settings ever need to be updated, but I think that's a low risk.

If this pattern works, we might want to rename userdev to el_dev or e2e_dev - to emphasize the different use case of EL and/or end-to-end development. None of this needs to change permissions - this would just toggle behaviors more quickly based on the type of development being done.

Makes sense to me. So if I understand the intended setup correctly, running dbt using userdev (or whatever it gets renamed to) and dbt_dev would produce the same output as long as I don't run models that read from RAW, since the schema prefixing is exactly the same across both. If that's the case, I think we'd need to update the following to make that work:

{%- if env_var("MELTANO_ENVIRONMENT") in ["userdev", "cicd"] and env_var("MELTANO_UTILITY_NAME", "") != "sqlfluff" -%}{{ env_var("DBT_SNOWFLAKE_TARGET_SCHEMA_PREFIX") + new_schema_name | trim }}{% else %}{{ new_schema_name | trim }}{% endif %}
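
For illustration, the Jinja condition above can be mirrored in plain Python to show what the change might look like. This is a sketch only: adding "dbt_dev" to the environment list is an assumption about the intended update, the function name is hypothetical, and the real logic lives in the dbt macro, not in Python.

```python
import os

# "userdev" and "cicd" come from the Jinja condition above; including
# "dbt_dev" is the assumed change, not something confirmed in the thread.
PREFIXED_ENVIRONMENTS = ["userdev", "cicd", "dbt_dev"]


def resolve_schema_name(new_schema_name, env=None):
    """Mirror the Jinja conditional as plain Python (illustrative only)."""
    env = os.environ if env is None else env
    # The template calls env_var("MELTANO_ENVIRONMENT") without a default,
    # so a missing variable is an error there; the KeyError mirrors that.
    environment = env["MELTANO_ENVIRONMENT"]
    utility = env.get("MELTANO_UTILITY_NAME", "")
    # Jinja's `trim` filter binds tighter than `+`, so only the schema
    # name is trimmed before the prefix is prepended.
    schema = new_schema_name.strip()
    if environment in PREFIXED_ENVIRONMENTS and utility != "sqlfluff":
        return env["DBT_SNOWFLAKE_TARGET_SCHEMA_PREFIX"] + schema
    return schema
```

With a hypothetical prefix such as "JDOE_", a schema named SNAPSHOT would resolve to JDOE_SNAPSHOT under userdev, cicd, or dbt_dev, and stay as plain SNAPSHOT elsewhere or when the utility is sqlfluff.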

@pnadolny13
Contributor

@aaronsteers whenever you have time and pick this back up, I added a new contributing guide for the meltano project along with a custom extension for cloning snowflake objects into our dev environment. It should take ~1 min to get a prod replica configured.

I decided not to create a standalone dbt_dev meltano environment for now and instead set the default of userdev to read from prod RAW, i.e. the default behavior is transform-only development.

Successfully merging this pull request may close these issues.

Environment userdev doesn't work for team members wanting to do dbt development
2 participants