Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP 1.0 range support? #18

Open
MikolajMarkiel opened this issue Jul 5, 2024 · 3 comments
Open

HTTP 1.0 range support? #18

MikolajMarkiel opened this issue Jul 5, 2024 · 3 comments
Assignees
Labels
bug Something isn't working

Comments

@MikolajMarkiel
Copy link

MikolajMarkiel commented Jul 5, 2024

The code provides a Range header only to HTTP 1.0 and not for HTTP 1.1. Is that for sure correct? Shouldn't be that reversed?

	if (op == HTTPC_GET && h->os->flags & HTTPC_OPT_HTTP_1_0 && h->position && h->accept_ranges) {
		char range[64 + 1] = { 0, };
		if (httpc_buffer_add_string(h, b0, "Range: bytes=") < 0)
			goto fail;
		httpc_num_to_str(range, h->position, 10);
		if (httpc_buffer_add_string(h, b0, range) < 0)
			goto fail;
		if (httpc_buffer_add_string(h, b0, "-\r\n") < 0)
			goto fail;
	}
		case FLD_ACCEPT_RANGES:
			if (httpc_case_insensitive_search(line, "bytes")) {
				h->accept_ranges = !!(h->os->flags & HTTPC_OPT_HTTP_1_0);
				return info(h, "Accept-Ranges: bytes");
			}

Also using modified at run time Range header during retrying previous transfer after network failure cause obtaining reduced Content-Length and because of that GET request ends somewhere in middle of file. The first assign of h->length should persist from the first Content-Length response header, and shouldn't be modified during retrying. Do you agree?

howerj added a commit that referenced this issue Jul 9, 2024
Apart from fixing issue #18, this changes a few comments
@howerj
Copy link
Owner

howerj commented Jul 9, 2024

I've fixed the flag, I'm not sure what you have written at the end though.

@MikolajMarkiel
Copy link
Author

There are 2 flags related to HTTPC_OPT_HTTP_1_0, so one more still needs to be changed

case FLD_ACCEPT_RANGES:
	if (httpc_case_insensitive_search(line, "bytes")) {
		h->accept_ranges = !!(h->os->flags & HTTPC_OPT_HTTP_1_0);  // it should be negated once
		return info(h, "Accept-Ranges: bytes");
	}

With that change if error occurs during HTTP 1.1 GET request, httpc.c will try to retry transfer from last successful position. It will add Range header in subsequent request. Because of range header, the server will respond with lower value of Content-Length That lowered value will be assigned to h->length variable. But it should persist with previous value, because it points to last byte and closing transfer function depends on that variable. So if for example we tried to download 500kB at the end we will end with ~300kB file or something.

I reproduced the issue. Below are logs from debug session that visualize that problem:

info:1143 Program: Embeddable HTTP 1.0/1.1 client
info:1144 Version: 0.0.0
info:1145 Repo:    https://github.com/howerj/httpc
info:1146 Author:  Richard James Howe
info:1147 Email:   [email protected]
info:1148 Options: stk=1024 tst=1 grw=1 log=1 cons=10 redirs=3 hmax=8192 sz=2176
info:1152 License: The Unlicense (public domain)
info:534 domain:    ...
info:535 port:      443
info:536 SSL:       true
...
debug:1363 state -- initial   -> open     
debug:1363 state -- open      -> send-head
debug:674 custom header 'User-Agent: http client' added
info:682 GET  request complete
debug:1363 state -- send-head -> send-body
info:1084 no callback - nothing to do
debug:1363 state -- send-body -> recv-head
info:952 HEADER: HTTP/1.1 200 OK/15
info:788 Content Length: 527360
info:754 Accept-Ranges: bytes
info:777 connection close mandatory
info:968 header done
debug:1363 state -- recv-head -> recv-body
App: Downloaded bytes: 1024
App: Downloaded bytes: 2048
App: Downloaded bytes: 3072
App: Downloaded bytes: 4096
...
App: Downloaded bytes: 210337
App: Downloaded bytes: 211361
App: Downloaded bytes: 212385
---------------------------------------------------------- network error  ---------------------------------------------------------- 
error:261 network read error -1
error:1002 read error
debug:1363 state -- recv-body -> back-off 
debug:1363 state -- back-off  -> sleeps   
info:703 backing off for 500 ms, retried 0
debug:1363 state -- sleeps    -> open     
debug:1363 state -- open      -> back-off 
debug:1363 state -- back-off  -> sleeps   
info:703 backing off for 1000 ms, retried 1
debug:1363 state -- sleeps    -> open     
debug:1363 state -- open      -> back-off 
debug:1363 state -- back-off  -> sleeps   
info:703 backing off for 2000 ms, retried 2
debug:1363 state -- sleeps    -> open     
debug:1363 state -- open      -> send-head
debug:674 custom header 'User-Agent: http client' added
info:682 GET  request complete
debug:1363 state -- send-head -> send-body
info:1084 no callback - nothing to do
debug:1363 state -- send-body -> recv-head
info:952 HEADER: HTTP/1.1 206 Partial Content/28
info:788 Content Length: 314975                                         <-- new Content Length
info:810 unknown field: Content-Range: bytes 212385-527359/527360       <-- successful applied Range
info:754 Accept-Ranges: bytes
info:777 connection close mandatory
info:968 header done
debug:1363 state -- recv-head -> recv-body
App: Downloaded bytes: 213409
App: Downloaded bytes: 214433
App: Downloaded bytes: 215457
...
App: Downloaded bytes: 313088
App: Downloaded bytes: 314112
App: Downloaded bytes: 315136             <-- last packet should be at 527360 byte. It ended there because of new Content Length value
debug:1363 state -- recv-body -> done

I tried to do something like that and it worked fine for me. But I don't know if it could be useful also for other cases.

case FLD_CONTENT_LENGTH:
	if (h->length_set == 0){
		if (httpc_scan_number(&line[fld->length], &h->length, 10) < 0)
			return error(h, "invalid content length: %s", line);
		h->length_set = 1;
		return info(h, "Content Length: %lu", (unsigned long)h->length);
	}
	return info(h, "Content Length remains from previous request: %lu", (unsigned long)h->length);

and also needed to remove these (they probably could be in initial step before open):

static int httpc_parse_response_header(httpc_t *h, httpc_buffer_t *b0) {
 	h->v1 = 0;
 	h->v2 = 0;
 	os->response = 0;
-	h->length = 0;         <-- remove
 	h->identity = 1;
-	h->length_set = 0;   <-- remove
 	h->keep_alive = !(os->flags & HTTPC_OPT_HTTP_1_0);
 	h->accept_ranges = !(os->flags & HTTPC_OPT_HTTP_1_0);
 	b0->used = 0;

and lastly... if server respond with Transfer-encoding: identity then h->position will be reset. The solution above doesn't handle that, so there could be some issues. I would suggest to reset the position at fault only with server that does not accept ranges or with HTTP 1.0. Then we are downloading from 0 byte but already downloaded packets are discarded (as you provided in response body parser) and position is updated like it should to. Isn't that correct?

@howerj howerj added the bug Something isn't working label Jul 10, 2024
@howerj howerj self-assigned this Jul 10, 2024
howerj added a commit that referenced this issue Jul 10, 2024
This should enable Range for HTTP 1.1
@howerj
Copy link
Owner

howerj commented Jul 10, 2024

Right you are about the second flag, I've pushed a fix for that. Thanks for writing up a detailed report, I think you are right, but it will take more time for me to analyze this which I do not have at the moment. I'll get around to looking at this properly sometime, I'd prefer not to make changes that could potentially break anything else.

I'd keep a local/personal of the other changes you've made. I'm not planning on making massive changes to this library any time soon so any personal branch would not go stale.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants