[Issue #2794] Modify export_opportunity_data_task to create ExtractMetadata database records #2998

Merged

@mikehgrantsgov mikehgrantsgov commented Nov 22, 2024

Summary

Fixes #2794

Time to review: 15 mins

Changes proposed

Add records to database when running export_opportunity_data_task.py.

Context for reviewers

This completes the ExtractMetadata API work. These records will later be queryable by the already-built API around ExtractMetadata records.

Additional information

See attached unit test

@mikehgrantsgov mikehgrantsgov marked this pull request as ready for review November 22, 2024 20:16
Comment on lines 154 to 157
csv_buffer = StringIO()
opportunities_to_csv(opportunities, csv_buffer)
csv_data = csv_buffer.getvalue()
csv_size = len(csv_data.encode("utf-8"))

Rather than effectively building the file twice, what if we just ask the file system/S3 for the file's size?

We'd need to write that ourselves, but a method in our file_utils could look like this (hastily thrown together from copying a few different bits of code):

import os

def get_file_length_bytes(path: str) -> int:
    # S3 objects: ask S3 for the object's metadata instead of downloading it
    if is_s3_path(path):
        s3_client = get_s3_client()  # from our aws utils - handles some of the weird localstack stuff

        bucket, key = split_s3_url(path)
        file_metadata = s3_client.head_object(Bucket=bucket, Key=key)
        return file_metadata["ContentLength"]

    # Local files: read the size from the file system
    file_stats = os.stat(path)
    return file_stats.st_size
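
For what it's worth, here is a rough sketch of why the two approaches should give the same number for a local file. It only covers the local-file branch. The rows, column names, file name, and the `local_file_length_bytes` helper are all made up for illustration and are not taken from the export task. It assumes the file is written as UTF-8 with no newline translation.

```python
import csv
import os
import tempfile
from io import StringIO


def local_file_length_bytes(path: str) -> int:
    """Hypothetical helper: size on disk in bytes (local-file branch only)."""
    return os.stat(path).st_size


# Made-up rows standing in for the exported opportunities
rows = [
    {"opportunity_id": 1, "title": "Café grant"},
    {"opportunity_id": 2, "title": "Plain ASCII title"},
]

# Approach in the diff: build the CSV in memory and measure the encoded bytes
csv_buffer = StringIO()
writer = csv.DictWriter(csv_buffer, fieldnames=["opportunity_id", "title"])
writer.writeheader()
writer.writerows(rows)
csv_data = csv_buffer.getvalue()
in_memory_size = len(csv_data.encode("utf-8"))

# Suggested approach: write the file once, then ask the file system for its size
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "opportunity_data.csv")
    # newline="" keeps the csv module's line endings untouched on every platform
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(csv_data)
    on_disk_size = local_file_length_bytes(path)

print(in_memory_size, on_disk_size)
```

The non-ASCII title is there on purpose: the byte count is larger than the character count, so `len(csv_data)` alone would under-report the size, while both the encoded length and `st_size` count bytes.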

@mdragon mdragon left a comment

LGTM

@mikehgrantsgov mikehgrantsgov merged commit 0c9b01b into main Nov 26, 2024
2 checks passed
@mikehgrantsgov mikehgrantsgov deleted the mikehgrantsgov/2794-create-extract-metadata-records branch November 26, 2024 15:07