-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixing a bug in thredds loader that limited crawling ability #44
Conversation
@@ -157,3 +159,5 @@ def ingest(self) -> None: | |||
update=self.update, | |||
session=self._session, | |||
) | |||
counter += 1 | |||
LOGGER.info(f"Processed {counter} data items") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be a warning log if the STAC item was not generated/returned? Possibility an unexpected error that was silently ignored?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good point. I've been wanting to add some more error management code. For the moment, I have added an error log message here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add the item name and loc to the log error message so we can debug the one that failed appropriately?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, sorry I missed this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just last logging comment. Then, good to merge.
@@ -157,3 +159,5 @@ def ingest(self) -> None: | |||
update=self.update, | |||
session=self._session, | |||
) | |||
counter += 1 | |||
LOGGER.info(f"Processed {counter} data items") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add the item name and loc to the log error message so we can debug the one that failed appropriately?
I missed it too! |
I've added this to #45. |
A subtle bug was limiting the number of files that could be successfully crawled by the crawler before the crawler abruptly stopped (without error). Investigation released that this had to do with the "depth" variable in the
THREDDSLoader
class, that was originally put in place to limit the depth to which the crawler will crawl. This original logic was largely retained, however, it turns out, there was an error in the logic which only showed up while crawling several deeply-nested data structures like the datasets in the UofT CMIP6 data catalog. Essentially, the depth variable was always linearly decreasing and when it became 0, the crawler stopped working. This PR fixes that issue.