Interacting on websites #972

Open
@nrllh

Hi guys,

Since you are currently making major changes to the pipeline, I'd like to hear your opinion:

In one of our studies (see Section 4.2.4), we showed that interacting with web pages triggers considerably more HTTP traffic and reveals more about the visited pages. We simulated key presses such as Page Down, Page Up, and End. I understand that in measurements of this scale, such interactions can increase crawl time. But if we skip them, we may miss a lot (e.g., because of lazy loading): in many visits we miss images, CSS files, XMLHttpRequests, and JavaScript files, and I think also technologies that Wappalyzer would otherwise identify.

That's why I'd suggest adding at least a minimal simulation (e.g., only the End key). This would let the crawler scroll to the bottom of the page and load more data than we currently see.
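
To make the suggestion concrete, here is a minimal sketch of what such a step could look like. It is not taken from the crawler's code. It assumes a Playwright-style synchronous `page` object (with `keyboard.press`, `wait_for_timeout`, and `evaluate`); the function name `scroll_to_end` and the parameters `settle_ms` and `max_rounds` are hypothetical. With the default `max_rounds=1` it presses End once, which matches the minimal variant proposed above; higher values are an optional extension for pages that keep growing.

```python
# Sketch only: assumes a Playwright-like sync Page object is passed in.
# The names scroll_to_end, settle_ms and max_rounds are illustrative,
# not part of any existing crawler API.

HEIGHT_JS = "document.documentElement.scrollHeight"


def scroll_to_end(page, settle_ms=1000, max_rounds=1):
    """Press End up to max_rounds times and return the number of presses.

    Stops early once the document height no longer grows, i.e. when the
    last key press did not cause additional content to be appended.
    """
    presses = 0
    last_height = page.evaluate(HEIGHT_JS)
    for _ in range(max_rounds):
        # Simulate the End key so the viewport jumps to the page bottom.
        page.keyboard.press("End")
        presses += 1
        # Give lazy-loaded resources some time to be requested.
        page.wait_for_timeout(settle_ms)
        height = page.evaluate(HEIGHT_JS)
        if height <= last_height:
            break  # page did not grow, nothing more to trigger
        last_height = height
    return presses
```

The cost per page is bounded by roughly `settle_ms * max_rounds`, so the extra crawl time can be capped up front. Whether the End key actually scrolls depends on where keyboard focus is on the page, so this would need checking against real sites.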
