[DPE-4487] Add Integration Tests for Azure Storage #89

Merged
theoctober19th merged 72 commits into 3.4-22.04/edge from azure-tests on Jun 20, 2024

Conversation

theoctober19th
Member

No description provided.

@theoctober19th theoctober19th marked this pull request as ready for review June 15, 2024 03:19
@theoctober19th theoctober19th changed the title from "Draft: Add Integration Tests for Azure Storage" to "[DPE-4487] Add Integration Tests for Azure Storage" Jun 17, 2024
Contributor

@deusebio deusebio left a comment

The code does what it should, but I have a suggestion that could make the tests more maintainable and reusable, and also more consistent.

Right now we handle the S3 setup and the Azure setup a bit differently: for instance, the S3 configuration is fed into the spark-submit command, while the Azure configuration is fed in using kubectl commands. Also, a fair amount of the logic in the iceberg test is just copied and pasted. I'm wondering whether we could rewrite the tests so that they read:

(setup_user_context && setup_object_storage_s3 && test_iceberg_example_in_pod && cleanup_user_success) || cleanup_user_failure_in_pod

(setup_user_context && setup_object_storage_azure && test_iceberg_example_in_pod && cleanup_user_success) || cleanup_user_failure_in_pod

The only custom part (the difference between the two) is the setup_object_storage_* step, where we both set up the storage using the CLI and inject the right configuration using the spark-client.service-account-registry that is embedded in the OCI image.

Then the iceberg test should just use the configuration that has already been injected, and the spark-submit command should add the iceberg configuration only.
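
To make the chained form above concrete, here is a minimal sketch of how the shared flow could be wrapped in one function, with only the storage setup step selected per backend. The step names are the ones proposed above; the run_iceberg_test wrapper and all function bodies are hypothetical placeholders (they only print their own name), not the code in this PR.

```shell
#!/usr/bin/env bash
# Sketch only: every function body is a placeholder that prints its name.
# The real steps would call kubectl, the storage CLI and spark-submit.

setup_user_context()          { echo "setup_user_context"; }
setup_object_storage_s3()     { echo "setup_object_storage_s3"; }
setup_object_storage_azure()  { echo "setup_object_storage_azure"; }
test_iceberg_example_in_pod() { echo "test_iceberg_example_in_pod"; }
cleanup_user_success()        { echo "cleanup_user_success"; }

# Cleans up after a failed run and reports the failure to the caller.
cleanup_user_failure_in_pod() { echo "cleanup_user_failure_in_pod"; return 1; }

# Hypothetical wrapper: runs the shared flow for one backend ("s3" or "azure").
# Only the setup_object_storage_<backend> step differs between backends.
run_iceberg_test() {
  local backend="$1"
  (setup_user_context \
    && "setup_object_storage_${backend}" \
    && test_iceberg_example_in_pod \
    && cleanup_user_success) \
    || cleanup_user_failure_in_pod
}

run_iceberg_test s3
run_iceberg_test azure
```

With this shape, adding another backend would only mean adding one more setup_object_storage_* function, assuming the iceberg test itself reads the configuration that was already injected.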

It is not too critical, but I would honestly spend some time on this right now, so that adding other backends (blob storage, abfs, etc.) becomes easier and more straightforward. It should also be very easy to translate this into more structured tests, where the Azure or S3 configuration is set up by the integration-hub.

tests/integration/integration-tests-kyuubi.sh (two review threads, both outdated and resolved)
Collaborator

@welpaolo welpaolo left a comment

LGTM!

My only comment is about the heavy use of bash. As future work, I strongly believe we should start using the spark-test library and move all our tests to pytest.

@theoctober19th theoctober19th requested a review from deusebio June 19, 2024 09:24
Contributor

@deusebio deusebio left a comment

Great! The tests look really great! thanks!

Edit: I noticed that the SQL tests run with S3 only. Would it be possible to run them for both S3 and Azure?

@theoctober19th
Member Author

> Great! The tests look really great! thanks!
>
> Edit: I noticed that the SQL tests run with S3 only. Would it be possible to run them for both S3 and Azure?

Thanks @deusebio. I've just updated the PR to run the SQL tests against both S3 and Azure storage.

@deusebio
Contributor

deusebio commented Jun 19, 2024

Great!!! Thanks! Feel free to merge! I'm very happy with this PR; I believe it also provides an improved structure for the functionality!

@theoctober19th theoctober19th merged commit 69d493c into 3.4-22.04/edge Jun 20, 2024
9 checks passed
@theoctober19th theoctober19th deleted the azure-tests branch June 20, 2024 02:23
3 participants