unhar: Content extraction insufficient #58

Open
baltpeter opened this issue Dec 4, 2023 · 1 comment
Labels: bug (Something isn't working)

Comments

@baltpeter
Member

In unhar(), we are currently assuming that a HAR can only hold a request body in request.postData.text:

content: e.request.postData?.text,

That is not true. It can also have POST params in request.postData.params (which we are currently just ignoring):

http://www.softwareishard.com/blog/har-12-spec/#postData
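A minimal sketch of what a fallback to params could look like. The field names follow the HAR 1.2 postData object; the helper name `extractRequestContent` and the decision to only rebuild `application/x-www-form-urlencoded` bodies are assumptions for illustration, not the current unhar() code:

```typescript
// Shapes modelled on the HAR 1.2 postData object. All fields are optional here
// because real-world HAR files do not reliably populate them (assumption).
interface HarParam {
    name: string;
    value?: string;
    fileName?: string;
    contentType?: string;
}

interface HarPostData {
    mimeType?: string;
    text?: string;
    params?: HarParam[];
}

// Hypothetical helper: prefer the raw body in `text`, otherwise try `params`.
function extractRequestContent(postData?: HarPostData): string | undefined {
    if (!postData) return undefined;

    // The raw body is the most faithful representation when it exists.
    if (postData.text !== undefined && postData.text !== '') return postData.text;

    if (!postData.params || postData.params.length === 0) return undefined;

    // A urlencoded body can be rebuilt from name/value pairs.
    const mimeType = (postData.mimeType ?? '').toLowerCase();
    if (mimeType.startsWith('application/x-www-form-urlencoded')) {
        const search = new URLSearchParams();
        for (const p of postData.params) search.append(p.name, p.value ?? '');
        return search.toString();
    }

    // Other encodings (e.g. multipart/form-data) are left unhandled in this sketch.
    return undefined;
}
```

Note that the rebuilt string is not guaranteed to be byte-identical to what was sent on the wire (for example, `URLSearchParams` encodes spaces as `+`).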

baltpeter added the bug label Dec 4, 2023
@baltpeter
Member Author

It's really unfortunate that there are so many differences between HAR implementations. :(

The spec claims: "Note that text and params fields are mutually exclusive."

Yeah, that's not true at all in practice. Let's go through a few examples.

File upload

I've used https://cgi-lib.berkeley.edu/ex/fup.cgi to capture a HAR of a simple file upload in Firefox (file-upload-firefox.json) and Chrome (file-upload-chrome.har).

The site uses multipart/form-data as the encoding:

image

In Firefox, the raw multipart encoded data ends up as a string in text in the HAR:

image

In Chrome, meanwhile, both text and params are populated in the HAR:

image

In params, the file I uploaded is "helpfully" replaced with (binary); in text, it appears to be missing entirely. In fact, as far as I can tell, the uploaded file isn't included anywhere in the HAR. o.o

And indeed, I can't seem to find a way to get to it in the Chrome dev tools, either:

image

image

So, don't use Chrome to generate HAR files if you want them to actually contain everything you've uploaded, I guess? Phenomenal stuff.
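If we wanted to at least warn about such lossy HARs, a check along these lines might work. This is only a sketch: it assumes the placeholder is the literal string `(binary)` in the param's value, as observed in the Chrome capture above, and the function name `findOmittedUploads` is made up:

```typescript
interface HarParam {
    name: string;
    value?: string;
    fileName?: string;
    contentType?: string;
}

// Assumption: Chrome writes the literal string "(binary)" in place of file contents.
const BINARY_PLACEHOLDER = '(binary)';

// Hypothetical helper: returns the names of params whose content was apparently dropped.
// A form field whose real value is "(binary)" would be a false positive.
function findOmittedUploads(params: HarParam[]): string[] {
    return params.filter((p) => p.value === BINARY_PLACEHOLDER).map((p) => p.name);
}
```

A caller could log a warning for every returned name instead of silently treating the placeholder as the uploaded data.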

HTML form

I also tried the simpler case of this basic HTML form in Chrome (post-chrome.json) and Firefox (post-firefox.har):

<!DOCTYPE html>
<html>
<body>
<form action="https://example.org" method="post">
    <input type="text" name="test">
    <input type="submit">
</form>
</body>
</html>

As I didn't set an enctype, the data is transmitted as application/x-www-form-urlencoded (the default).

In Firefox, both text and params are populated, with the raw and parsed data, respectively:

image

The same is the case in Chrome:

image
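When both fields are populated for application/x-www-form-urlencoded, the raw text can be parsed back into the same kind of name/value pairs that params holds. A small sketch using the standard `URLSearchParams` API (the function name `parseUrlencodedText` is hypothetical):

```typescript
// Hypothetical helper: turn a raw urlencoded body into HAR-style name/value pairs.
function parseUrlencodedText(text: string): { name: string; value: string }[] {
    const pairs: { name: string; value: string }[] = [];
    // URLSearchParams decodes percent-escapes and treats "+" as a space.
    new URLSearchParams(text).forEach((value, name) => {
        pairs.push({ name, value });
    });
    return pairs;
}
```

This could be used to cross-check text against params, or to derive one from the other when an implementation only provides text.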

Other implementations

I also tried two other HAR implementations. First, Insomnia (post-insomnia.har, multipart-insomnia.har), which only populates params in both cases:

image

image

And, more importantly for us, the mitmproxy HAR dump script. I only had an example for application/x-www-form-urlencoded lying around. In that case, it populates both params and text:

image
