Support creating s3 pipeline with KMS key config #35

Open
wants to merge 1 commit into
base: main
1 change: 1 addition & 0 deletions .github/workflows/pytest.yml
@@ -41,5 +41,6 @@ jobs:
DATABRICKS_ACCESS_TOKEN: ${{ secrets.DATABRICKS_ACCESS_TOKEN }}
S3_UPLOAD_BUCKET: ${{ secrets.S3_UPLOAD_BUCKET }}
S3_OUTPUT_BUCKET: ${{ secrets.S3_OUTPUT_BUCKET }}
+S3_KMS_KEY_ARN: ${{ secrets.S3_KMS_KEY_ARN }}
run: |
poetry run pytest
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "tonic-textual"
-version = "3.6.0"
+version = "3.6.1"
description = "Wrappers around the Tonic Textual API"
authors = ["Adam Kamor <[email protected]>", "Joe Ferrara <[email protected]>", "Ander Steele <[email protected]>", "Ethan Philpott <[email protected]>", "Lyon Van Voorhis <[email protected]>", "Kirill Medvedev <[email protected]>", "Travis Matthews <[email protected]>"]
license = "MIT"
3 changes: 2 additions & 1 deletion tests/sample.env
@@ -9,4 +9,5 @@ AWS_DEFAULT_REGION=us-east-1
AZURE_ACCOUNT_KEY=
AZURE_ACCOUNT_NAME=
DATABRICKS_URL=
-DATABRICKS_ACCESS_TOKEN=
+DATABRICKS_ACCESS_TOKEN=
+S3_KMS_KEY_ARN=
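
The new test reads this variable with `os.environ["S3_KMS_KEY_ARN"]`, which raises `KeyError` when the variable is unset, and an entry left blank as in `sample.env` would be passed through as an empty string. A minimal sketch of treating both cases as "no key", assuming a hypothetical helper `read_kms_key_arn` that is not part of this PR or of `tonic_textual`:

```python
import os
from typing import Mapping, Optional


def read_kms_key_arn(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the S3_KMS_KEY_ARN value, or None when it is unset or blank.

    Hypothetical helper for illustration only; not part of tonic_textual.
    """
    value = env.get("S3_KMS_KEY_ARN", "").strip()
    return value or None
```

Returning `None` here lines up with the `kms_key_arn is not None` check in `create_s3_pipeline`, so a blank entry would leave the pipeline without a KMS config rather than sending an empty key.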
21 changes: 21 additions & 0 deletions tests/tests/parse_tests/test_pipeline_creation.py
@@ -31,6 +31,27 @@ def test_s3_pipelines(textual_parse):
credentials=creds,
)

+# This test only checks that no exception is raised.
+def test_s3_pipeline_with_kms(textual_parse):
+    for synth in [False, True]:
+        for cred_source in ["user_provided", "from_environment"]:
+            creds = (
+                PipelineAwsCredential(
+                    aws_access_key_id=os.environ["S3_UPLOAD_ACCESS_KEY"],
+                    aws_region=os.environ["AWS_DEFAULT_REGION"],
+                    aws_secret_access_key=os.environ["S3_UPLOAD_SECRET_KEY"],
+                )
+                if cred_source == "user_provided"
+                else None
+            )
+            textual_parse.create_s3_pipeline(
+                f"aws_{cred_source}_{str(synth)}_{uuid.uuid4()}",
+                aws_credentials_source=cred_source,
+                synthesize_files=synth,
+                credentials=creds,
+                kms_key_arn=os.environ["S3_KMS_KEY_ARN"],
+            )
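
The nested loops in `test_s3_pipeline_with_kms` cover four cases: two `synthesize_files` values times two credential sources. As a rough sketch, the same cases can be enumerated with `itertools.product`; the helper name `kms_pipeline_cases` is hypothetical and only mirrors the naming scheme used in the test:

```python
import itertools
import uuid
from typing import Iterator, Tuple


def kms_pipeline_cases() -> Iterator[Tuple[str, bool, str]]:
    """Yield (pipeline_name, synthesize_files, cred_source) for each case.

    Hypothetical helper; it mirrors the loops in test_s3_pipeline_with_kms.
    """
    for synth, cred_source in itertools.product(
        [False, True], ["user_provided", "from_environment"]
    ):
        # Same naming scheme as the test: a UUID suffix keeps names unique.
        name = f"aws_{cred_source}_{synth}_{uuid.uuid4()}"
        yield name, synth, cred_source
```

A list like this could feed `pytest.mark.parametrize`, so that a failure in one combination is reported separately from the others.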


def test_local_pipelines(textual_parse):
for synth in [False, True]:
2 changes: 1 addition & 1 deletion tonic_textual/__init__.py
@@ -1 +1 @@
-__version__ = "3.6.0"
+__version__ = "3.6.1"
7 changes: 6 additions & 1 deletion tonic_textual/parse_api.py
@@ -90,6 +90,7 @@ def create_s3_pipeline(
credentials: Optional[PipelineAwsCredential] = None,
aws_credentials_source: Optional[str] = "user_provided",
synthesize_files: Optional[bool] = False,
+kms_key_arn: Optional[str] = None,
) -> S3Pipeline:
"""Create a new pipeline with files from Amazon S3.

@@ -105,7 +106,8 @@ def create_s3_pipeline(
Whether to generate a redacted version of the file in addition to the parsed output. Default value is `False`.
aws_credentials_source: Optional[str]
For an Amazon S3 pipeline, how to obtain the AWS credentials. Options are `user_provided` and `from_environment`. For `user_provided`, you provide the credentials in the `credentials` parameter. For `from_environment`, the credentials are read from your Textual instance.

+kms_key_arn: Optional[str]
+When provided, the KMS key denoted by the ARN will be used to encrypted files prior to writing to output location via SSE-KMS. This value cannot be changed later.

Suggested change
-When provided, the KMS key denoted by the ARN will be used to encrypted files prior to writing to output location via SSE-KMS. This value cannot be changed later.
+When provided, the KMS key denoted by the ARN will be used by AWS to encrypt files prior to writing to output location via SSE-KMS. This value cannot be changed later.
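
Because the docstring says the value cannot be changed later, a caller may want to sanity-check the ARN before creating the pipeline. The sketch below is an assumption-heavy illustration, not something this PR does: `looks_like_kms_key_arn` is a hypothetical helper, and it assumes only key ARNs of the form `arn:<partition>:kms:<region>:<account-id>:key/<key-id>` are intended (alias ARNs are rejected).

```python
import re

# Shape of an AWS KMS key ARN: arn:<partition>:kms:<region>:<account-id>:key/<key-id>
_KMS_KEY_ARN = re.compile(
    r"arn:aws[a-z-]*:kms:[a-z0-9-]+:\d{12}:key/[A-Za-z0-9-]+"
)


def looks_like_kms_key_arn(value: str) -> bool:
    """Return True when value has the shape of a KMS key ARN.

    Hypothetical helper; this checks the format only, not that the key
    exists or that the caller is allowed to use it.
    """
    return _KMS_KEY_ARN.fullmatch(value) is not None
```

A format check like this catches copy-paste mistakes early; whether the key is usable can only be confirmed by AWS when files are written.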

Returns
-------
S3Pipeline
@@ -145,6 +147,9 @@ def create_s3_pipeline(
if aws_credentials_source is not None and fs == FileSource.aws:
data["awsCredentialSource"] = aws_cred_source

+if kms_key_arn is not None:
+    data["fileSourceConfig"] = {"awsS3ServerSideEncryptionType": "Kms", "awsS3ServerSideEncryptionKey": kms_key_arn}

p = self.client.http_post("/api/parsejobconfig", data=data)
return S3Pipeline(p.get("name"), p.get("id"), self.client)
except RequestException as req_err:
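
To make the payload change in this hunk easier to review, here is a minimal standalone sketch of the same logic. The function name `with_kms_config` is hypothetical; the two `awsS3ServerSideEncryption*` keys and the `"Kms"` value are taken from the diff, and everything else about the request body is assumed.

```python
from typing import Any, Dict, Optional


def with_kms_config(data: Dict[str, Any], kms_key_arn: Optional[str]) -> Dict[str, Any]:
    """Return a copy of data with the SSE-KMS file source config added.

    Hypothetical helper mirroring the hunk above; not part of tonic_textual.
    """
    payload = dict(data)  # shallow copy so the caller's dict is left untouched
    if kms_key_arn is not None:
        payload["fileSourceConfig"] = {
            "awsS3ServerSideEncryptionType": "Kms",
            "awsS3ServerSideEncryptionKey": kms_key_arn,
        }
    return payload
```

Note that the assignment replaces any `fileSourceConfig` already present in the payload; whether `data` can carry one at this point is not visible in the hunk.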