-
Notifications
You must be signed in to change notification settings - Fork 1.1k
fix: use shared enqueue links wrapper in AdaptivePlaywrightCrawler
#3188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
packages/playwright-crawler/src/internals/adaptive-playwright-crawler.ts
Outdated
Show resolved
Hide resolved
|
Note - this will be fun to merge with #3119 |
498ec43 to
0a94f70
Compare
0a94f70 to
31edb70
Compare
|
I've fixed all of the remarks under this one: Force pushed for a rebase onto: |
Fixes #3188 (comment) in a separate PR for clean history and review Ensures that the underlying retries are awaited properly, so that 1. the main promise does not resolve early, which could've caused race conditions on usage of this method 2. they do not lead to orphan non-awaited promises, that if they throw an error, it propagates all the way to the root level of runtime, causing it to crash
31edb70 to
3185919
Compare
janbuchar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM, just a few comments
|
Let me know if you consider everything as resolved 👍 Otherwise, I consider this done :) |
AdaptivePlaywrightCrawler
janbuchar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Fixes the adaptive playwright crawler's issue of not using the new crawl depth implementation of
enqueueLinksDiscovered in WCC here:
enqueueLinkswrapper into a new method of the basic crawler -_crawlingContextEnqueueLinksWrapperUp for discussion:
addRequestswrapper -_crawlingContextAddRequestsGeneratorOnly after finishing up my tests, I realized that this one is not actually needed :D Let me know if you think we should keep it or move it back :) It seems like a nice standalone function, but maybe we should keep it as
privateand notprotected🤔_crawlingContextEnqueueLinksWrapper?