Description
Hi guys,
Since you rarely change the pipeline, I'd like to hear your opinion on this:
In one of our studies (see Section 4.2.4), we showed that interacting with web pages triggers much more HTTP traffic and reveals more about the visited pages. We simulated keys like `Page Down`, `Page Up`, and `End`. I understand that in such large measurements this can lead to longer crawling times, but if we skip it, we may miss a lot (e.g., because of lazy loading): in many visits we miss images, CSS files, XMLHttpRequests, and JavaScript files, and I think also some of the technologies identified by Wappalyzer.
That's why I'd suggest at least a minimal simulation (e.g., only `End`). This would let the crawler scroll to the end of the page and load more data than we currently capture.
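The minimal variant could be as small as a single key press; again sketched with Selenium as an assumption:

```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
import time

driver = webdriver.Firefox()
driver.get("https://example.com")

# A single End press jumps straight to the bottom of the page,
# which is enough to trigger most below-the-fold lazy loading.
driver.find_element(By.TAG_NAME, "body").send_keys(Keys.END)
time.sleep(2)  # give lazy-loaded requests a moment to complete

driver.quit()
```

The fixed `sleep` is crude; if the pipeline already instruments outgoing requests, waiting on that instead would add less overhead per visit.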