How do I set up the rate limiting since the CrawlerRunConfig class no longer has rate limit parameters? #1091
Replies: 1 comment
The old parameters were removed from CrawlerRunConfig. For arun_many, set the delay on the config:

config = CrawlerRunConfig(
    mean_delay=1.0,  # base delay between requests (seconds)
    max_range=0.5,   # random jitter added on top (0 to 0.5s)
)
results = await crawler.arun_many(urls, config=config)

Under the hood, arun_many uses a default dispatcher. For full control, pass your own dispatcher:

from crawl4ai import RateLimiter, MemoryAdaptiveDispatcher
dispatcher = MemoryAdaptiveDispatcher(
    rate_limiter=RateLimiter(
        base_delay=(1.0, 3.0),        # random delay between 1-3s per domain
        max_delay=60.0,               # max backoff delay
        max_retries=3,                # retries before giving up on a domain
        rate_limit_codes=[429, 503],  # HTTP codes that trigger backoff
    )
)
results = await crawler.arun_many(urls, config=config, dispatcher=dispatcher)

The [...] For single URL crawling ([...]

You're right that the docs are out of date on this: the old params were removed but the docs weren't fully updated.
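To make the parameters above a bit more concrete, here is a small standalone sketch of the idea behind them: a base delay with jitter, plus per-domain backoff on rate-limit status codes. It uses only the standard library and is not crawl4ai's implementation. The names `jittered_delay` and `DomainBackoff` are hypothetical, and the doubling-on-failure rule is an assumption chosen for illustration; the real RateLimiter may compute its delays differently.

```python
import random


def jittered_delay(mean_delay: float, max_range: float) -> float:
    """Base delay plus uniform jitter in [0, max_range] (illustrative only)."""
    return mean_delay + random.uniform(0.0, max_range)


class DomainBackoff:
    """Hypothetical per-domain delay tracker; NOT crawl4ai's RateLimiter."""

    def __init__(self, base_delay=(1.0, 3.0), max_delay=60.0,
                 max_retries=3, rate_limit_codes=(429, 503)):
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.max_retries = max_retries
        self.rate_limit_codes = set(rate_limit_codes)
        self.current = {}   # domain -> current delay in seconds
        self.failures = {}  # domain -> consecutive rate-limited responses

    def next_delay(self, domain: str) -> float:
        """Delay to wait before the next request to this domain."""
        if domain not in self.current:
            self.current[domain] = random.uniform(*self.base_delay)
        return self.current[domain]

    def record(self, domain: str, status: int) -> bool:
        """Update state from a response; return False once retries run out."""
        delay = self.next_delay(domain)
        if status in self.rate_limit_codes:
            self.failures[domain] = self.failures.get(domain, 0) + 1
            if self.failures[domain] > self.max_retries:
                return False
            # Assumed policy: double the delay, capped at max_delay.
            self.current[domain] = min(delay * 2, self.max_delay)
        else:
            # Any other status: reset to a fresh base delay.
            self.failures[domain] = 0
            self.current[domain] = random.uniform(*self.base_delay)
        return True


# Simulate one OK response followed by repeated 429s for a single domain.
limiter = DomainBackoff(base_delay=(1.0, 3.0), max_delay=5.0, max_retries=3)
for status in (200, 429, 429, 429, 429):
    keep_going = limiter.record("example.com", status)
    print(status, keep_going, round(limiter.next_delay("example.com"), 2))
```

In this sketch the delay grows on each 429 until it hits `max_delay`, and `record` returns False once a domain has exceeded `max_retries`, which is roughly the behavior the comments on the RateLimiter arguments describe.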
The CrawlerRunConfig class described in the "CrawlerRunConfig Essentials" section of the docs lists parameters that are not actually in the latest class. The docs show the parameters below, but the class no longer supports them:
enable_rate_limiting=False,
rate_limit_config=None,
How do I set up rate limiting? The docs describe a RateLimiter class and include a section on how to use it, but there is no documentation on how to use it now that CrawlerRunConfig no longer supports this configuration.
Thank you