Describe the bug
I am still checking how consistent this is, but I am running basic urlfinder scans with JSONL output. When I extract the url field and look at the unique values, I get domains that were not in my input.
urlfinder version
Include the version of urlfinder you are using, urlfinder -version
Complete command you used to reproduce this
urlfinder -list target.txt -output output.txt
jq -r '.url' output.txt | unfurl -u domain
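For anyone who wants to reproduce the check without jq and unfurl, the pipeline above can be approximated in Python. This is only a sketch: it assumes each JSONL record carries a `url` field (as the jq filter implies), `hostname_counts` is a hypothetical helper, and the sample records are made up for illustration.

```python
import json
from collections import Counter
from urllib.parse import urlsplit


def hostname_counts(jsonl_lines):
    """Count how often each hostname appears in the "url" field of JSONL records."""
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        host = urlsplit(record.get("url", "")).hostname
        if host:
            counts[host.lower()] += 1
    return counts


# Hypothetical sample records, not real urlfinder output.
sample = [
    '{"url": "https://www.google.com/maps"}',
    '{"url": "https://www.google.com/search?q=x"}',
    '{"url": "https://bar.example/page"}',
]

for host, n in sorted(hostname_counts(sample).items()):
    print(host, n)
```

To run it against a real scan, pass an open file handle instead of `sample`, for example `hostname_counts(open("output.txt"))`, assuming that is where the JSONL was written.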
Not sure whether this is related or should be its own bug, but I did a run with a target file containing 3 domains. I passed that file to both -list and -match, hoping that would restrict the results to those domains, but some external domains still showed up.
Edit: the second case, using a match file, is likely working as expected. I had stripped off the paths so I could check the domains, and I missed the fact that the paths contained my target domains. For example, if blogspot.com were my target, the matcher would return https://allo.google.com/url?q=http://rivexapa.blogspot.com/
So the matcher seems to do what it should; I just had not considered this case. The original problem, where the results include a domain that is not in my target list in the first place, is still an issue.
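The gap between matching a domain anywhere in the URL and matching it against the hostname can be sketched as follows. This is an illustration in Python, not urlfinder's matcher: `matches_substring` and `host_in_scope` are hypothetical helpers, and the substring check is only an assumption about how a plain pattern match behaves.

```python
from urllib.parse import urlsplit


def matches_substring(url: str, target: str) -> bool:
    """True when the target appears anywhere in the URL, including the query string."""
    return target in url


def host_in_scope(url: str, target: str) -> bool:
    """True only when the URL's hostname is the target or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    target = target.lower()
    return host == target or host.endswith("." + target)


url = "https://allo.google.com/url?q=http://rivexapa.blogspot.com/"

print(matches_substring(url, "blogspot.com"))  # True: the domain appears in the query string
print(host_in_scope(url, "blogspot.com"))      # False: the hostname is allo.google.com
print(host_in_scope(url, "google.com"))        # True: allo.google.com is a subdomain
```

Filtering the output by hostname after the scan is one way to drop results like the one above when only the host matters.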
The extractor tries to extract every URL found in the results, including URLs embedded inside other URLs. So if the target is foo.com and it encounters https://foo.com/redirect?url=https://bar.com, the regex also extracts https://bar.com and adds it to the results.
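A rough Python sketch of that behavior, for illustration only: the pattern below is an assumption and not urlfinder's actual regex, and `extract_urls` is a hypothetical helper. It uses a lookahead so that a URL nested inside another URL is reported as its own match.

```python
import re
from urllib.parse import urlsplit

# Assumed pattern: anything starting with http:// or https:// up to whitespace or < >.
# The lookahead lets matches overlap, so embedded URLs are found as well.
URL_RE = re.compile(r"(?=(https?://[^\s<>]+))")


def extract_urls(text: str) -> list[str]:
    """Return every URL-looking substring, including ones nested in query strings."""
    return URL_RE.findall(text)


found = extract_urls("https://foo.com/redirect?url=https://bar.com")
print(found)
# ['https://foo.com/redirect?url=https://bar.com', 'https://bar.com']

# The nested URL carries a different hostname than the target.
print([urlsplit(u).hostname for u in found])
# ['foo.com', 'bar.com']
```

This shows how a scan of foo.com alone can still yield bar.com entries: the second match is taken from the query string of an in-scope URL.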
ehsandeep changed the title from "[Issue] Getting out of scope domains in my results" to "Getting out of scope domains output" on Nov 16, 2024
target.txt contains just google.com