-
Notifications
You must be signed in to change notification settings - Fork 15
Refactor: ingester changes #1574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
||
|
||
def build_instances_from_submission(data: dict[str, Any]) -> dict[TableNames, list]: | ||
def build_instances_from_submission(data: dict[str, Any]) -> ProcessedSubmission: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to avoid using Any? Maybe with generic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this data is the submission json, so the Any
means any value that an item can have: str, int, list, dict, none... IMO in this case it is ok to leave it as Any
(and other similar cases around these functions too). This use will at least define that they dict keys are str
, otherwise we would be using just dict
which just means dict[Any, Any]
STORAGE_BASE_URL = os.environ.get( | ||
"STORAGE_BASE_URL", "https://files-staging.kernelci.org" | ||
) | ||
CONVERT_LOG_EXCERPT = False # If True, convert log_excerpt to output_files url |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not an env as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything was defined statically in the file, I just added verbose as an env var. I'll make the new constants file and I can make them all env vars 👍
) | ||
CONVERT_LOG_EXCERPT = False # If True, convert log_excerpt to output_files url | ||
|
||
TREES_FILE = "/app/trees.yaml" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you think a new file:
constants/ingester.py
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good
if VERBOSE: | ||
logger.info("Uploading logexcerpt for %s to %s", id, upload_url) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
idk this VERBOSE variable could be more descriptive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean the description for the VERBOSE
variable or the string in this specific logger.info
?
str: On success upload: the reference url. On failed upload: the original logexcerpt | ||
""" | ||
|
||
upload_url = f"{STORAGE_BASE_URL}/upload" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
outside of function
Even with this reduction, it's still hard to lower the complexity even further without creating idempotent functions Closes kernelci#1571
fa7db92
to
1255402
Compare
Changes
Refactors ingester code:
How to test
The behavior should be the same as before. So:
backend/kernelCI_app/management/commands/tmp_submissions
(you can change the path, just update the command below)/backend/volume_data
as a result of thetreeproof
command. If you are feeling lazy and you don't want to run the command, there's an example file here (Google Drive link)USE_DASHBOARD_DB=True
poetry run python3 ./manage.py monitor_submissions --spool-dir kernelCI_app/management/commands/tmp_submissions --trees-file volume_data/trees-name.yaml
(being in the/backend
directory, update the paths if necessary)You can also test the ingester on docker, make sure the environment variables are on
.env.backend
and usedocker compose run --rm backend
before running the command shown above.Closes #1571