[COST-5213] - fix S3 prepare #5194

Merged
merged 6 commits into from
Jun 28, 2024

@lcouzens lcouzens commented Jun 28, 2024

Jira Ticket

COST-5213

Description

This change reverts the default of the s3_parquet_cleared flag. Daily processing should then set the flag as needed.

Previously, each worker flipped this flag back and forth instead of holding it constant across all running workers. As a result, as more files were processed for a single provider, the filtering and collection of files for deletion took longer and longer while ultimately deleting nothing.
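
For reviewers, here is a rough sketch of the behaviour described above. It is a toy model under stated assumptions, not Koku's code: `Manifest`, `process_report_file`, `run`, the `stale/` and `current/` key prefixes, and the default value shown are all hypothetical, and the workers are simulated sequentially rather than concurrently. It only illustrates why resetting the flag in every worker makes the collection step grow with the number of files, while setting it once does not.

```python
from dataclasses import dataclass


@dataclass
class Manifest:
    """Toy stand-in for a manifest row; not Koku's real model."""

    s3_parquet_cleared: bool = True  # assumed default: nothing to clear
    keys_inspected: int = 0  # S3 keys walked while collecting deletions


def process_report_file(manifest: Manifest, bucket: list, new_key: str,
                        worker_resets_flag: bool) -> None:
    """Process one report file, clearing stale parquet files first if flagged."""
    if worker_resets_flag:
        # Old behaviour (hypothetical model): every worker flips the flag back.
        manifest.s3_parquet_cleared = False
    if not manifest.s3_parquet_cleared:
        # Collecting candidates means walking every key currently in the bucket.
        manifest.keys_inspected += len(bucket)
        bucket[:] = [key for key in bucket if not key.startswith("stale/")]
        manifest.s3_parquet_cleared = True
    bucket.append(new_key)


def run(num_files: int, worker_resets_flag: bool) -> int:
    """Return how many keys were inspected while processing num_files files."""
    manifest = Manifest()
    if not worker_resets_flag:
        # New behaviour (hypothetical model): daily processing sets the flag once.
        manifest.s3_parquet_cleared = False
    bucket = ["stale/old.parquet"]
    for i in range(num_files):
        process_report_file(manifest, bucket, f"current/file_{i}.parquet",
                            worker_resets_flag)
    return manifest.keys_inspected


if __name__ == "__main__":
    print("flag reset by every worker:", run(100, worker_resets_flag=True))
    print("flag set once per run:", run(100, worker_resets_flag=False))
```

In this model, 100 files cost 4951 key inspections when every worker resets the flag and 1 inspection when the flag is set once, even though only a single stale file is ever deleted in either case.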

Testing

  1. Check out the branch
  2. Restart Koku with multiple workers
  3. Load AWS/Azure data with finalised bills that have multiple report files (X2)
  4. Watch the logs and confirm that removing S3 files and marking the manifest s3_parquet_cleared happens only once per month.

Release Notes

  • proposed release note
* [COST-5213](https://issues.redhat.com/browse/COST-5213) Improve collection for delete from S3

@lcouzens lcouzens requested review from a team as code owners June 28, 2024 12:12
@lcouzens lcouzens added the smoke-tests label (pr_check will build the image and run minimal required smokes) Jun 28, 2024
@lcouzens

/retest

@lcouzens lcouzens changed the title fix S3 prepare [COST-5213] - fix S3 prepare Jun 28, 2024

codecov bot commented Jun 28, 2024

Codecov Report

Attention: Patch coverage is 80.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 94.2%. Comparing base (4eddf8e) to head (2834d66).

Additional details and impacted files
@@          Coverage Diff          @@
##            main   #5194   +/-   ##
=====================================
  Coverage   94.2%   94.2%           
=====================================
  Files        376     376           
  Lines      31257   31252    -5     
  Branches    3735    3734    -1     
=====================================
- Hits       29435   29432    -3     
+ Misses      1161    1160    -1     
+ Partials     661     660    -1     

@lcouzens lcouzens enabled auto-merge (squash) June 28, 2024 20:00
@lcouzens lcouzens merged commit a42cf32 into main Jun 28, 2024
10 of 11 checks passed
@lcouzens lcouzens deleted the COST-5213 branch June 28, 2024 20:38
djnakabaale pushed a commit that referenced this pull request Jul 9, 2024
* Switch default parquet flag to prevent iterating on all files in each worker when there is nothing to delete
Labels
smoke-tests (pr_check will build the image and run minimal required smokes), smokes-required
4 participants