WIP: Factorisation of implementation into modules and layers (#135)
* Replace FIFOBuffer implementation with BufferStream/IOBuffer wrapper.
  - Add mark/reset support to Form <: IO.
  - Don't call eof() in body(::IO, ...) if content length is known (eof() can block).
  - Work around JuliaLang/julia#24465.
  - Use non-blocking readavailable() to read serverlog in test/server.jl.
  - Hack test/fifobuffer.jl to work with the BufferStream/IOBuffer implementation. The BufferStream API does not expose the size of the buffer, so tests that passed a fixed size to FIFOBuffer(n) have been tweaked to pass with BufferStream's automatic buffer size management behaviour.
  - Add note re @test_broken and https://github.com/kennethreitz/httpbin/issues/340#issuecomment-330176449
* Rename canonicalize!(::String) -> tocameldash!(::String). Move header canonicalization out of parser.
* Remove URI, header and body size limits from parser.
* Replace unused parser option "lenient" with global const "strict".
* Parser simplification:
  - Move "close(response.body)" from parser to client.
  - Remove @nread -- redundant now that header size is unrestricted.
  - Remove unused handlers: onmessagecomplete, onstatus, onmessagebegin, onheaderscomplete.
* Add tests for parsing one character at a time. Fix #125 and #126.
  - Replace local `KEY` in `parse!()` with `parser.previous_field`.
  - Add onurlbytes(). Accumulates URL in `parser.valuebuffer`.
  - Use `Ref{String}` to return `extra`/`upgrade` so that the caller can tell the difference between an upgrade with an empty string and no upgrade.
  - Add `s_req_fragment_start` to the list of states where the url_mark should be reset to 1.
* Store method, major, minor, url and status in Parser -- begin to separate Parser from Request/Response.
* Use Vector{Pair} instead of Dict for Headers. See #108.
  - Store Vector{Pair} headers in Parser struct.
  - Move cookie processing from parser to client.
  - Move special treatment of multi-line and repeated headers out of the main parser function.
* Use IOBuffer instead of Vector{UInt8} for field and value buffers in Parser. append!() is replaced by write().
  Faster, less allocation. unsafe_string(pointer(x.buf), length(x.buf)) is replaced by String(take!(x.buf)).

  ```julia
  mutable struct s1
      buf::Vector{UInt8}
  end

  mutable struct s2
      buf::IOBuffer
  end

  a(x::s1, s) = append!(x.buf, s)
  a(x::s2, s) = write(x.buf, s)

  r(x::s1) = unsafe_string(pointer(x.buf), length(x.buf))
  r(x::s2) = String(take!(x.buf))

  function go(x)
      inbuf = Vector{UInt8}([0])
      for i = 1:100000
          inbuf[1] = UInt8(i % 255)
          a(x, inbuf)
      end
      outbuf = String[]
      for i = 1:1000
          push!(outbuf, r(x))
      end
      println(length(outbuf))
  end

  t1() = go(s1(Vector{UInt8}()))
  t2() = go(s2(IOBuffer()))
  t1(); t2()
  println("t1, Vector...")
  @time t1()
  println()
  println("t2, IOBuffer...")
  @time t2()
  ```
  ```
  t1, Vector...
  1000
    0.101289 seconds (1.12 k allocations: 95.733 MiB, 65.69% gc time)
  t2, IOBuffer...
  1000
    0.002676 seconds (2.04 k allocations: 241.281 KiB)
  ```
* Reduce main parse! function to a single method by keeping a reference to body in the Parser struct.
* Remove *_mark from Parser.
  - Explicitly handle data within the main parse loop instead of using *_mark to clean up loose ends. The number of places that callbacks are called is reduced.
  - Add `while p <= len` inner loop for URL parsing (follows the pattern used for header field and value parsing).
  - Remove on*bytes functions, just `write()` to valuebuffer or fieldbuffer as needed.
* Remove (mostly) unused Parser.nread field. Its purpose is not clear, and it wasn't working anyway in some cases in the origin master branch, e.g. in the "Foo: F\01ailure" test case nread was buffer length + 1.
* Explicitly return HPE_OK (not errno) in Parser in places where there is no error. Remove redundant write-back of parser.state.
* Clean up gotos and main loop condition in parser.
* Replace @debug(PARSING_DEBUG, ParsingStateCode(p_state)) in every ifelse with a single @debug at the top of the main loop.
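The Vector{Pair} header representation above, combined with the case-insensitive header() / setheader() lookup added later in this series, can be sketched roughly like this (a minimal illustration with hypothetical helper names, not HTTP.jl's actual implementation):

```julia
# Sketch: case-insensitive header lookup over Vector{Pair} headers.
const Headers = Vector{Pair{String,String}}

# HTTP field names are ASCII case-insensitive (RFC 7230).
header_eq(a::AbstractString, b::AbstractString) = lowercase(a) == lowercase(b)

function header(h::Headers, key::AbstractString, default::AbstractString="")
    i = findfirst(p -> header_eq(p.first, key), h)
    return i === nothing ? default : h[i].second
end

function setheader!(h::Headers, key::AbstractString, value::AbstractString)
    i = findfirst(p -> header_eq(p.first, key), h)
    if i === nothing
        push!(h, key => value)           # append, preserving insertion order
    else
        h[i] = h[i].first => value       # replace value, keep original key case
    end
    return h
end

h = ["Content-Type" => "text/plain"]
@assert header(h, "content-type") == "text/plain"
setheader!(h, "CONTENT-TYPE", "application/json")
@assert header(h, "Content-Type") == "application/json"
@assert length(h) == 1
```

Unlike a Dict, the Vector{Pair} form preserves header order and repeated fields, at the cost of linear lookup.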
* Move header and body processing out of parser (via callbacks).
* Throw message-incomplete error from parse(); fix tests for invalid content length.
* Move upgrade flag to Parser struct so that state is preserved between calls to parse!(). Ensure that the catch-all return at the end of the main loop is only used for incomplete messages.
* Throw ArgumentError for zero len passed to parse!.
* Simplify ParsingError display.
* Replace multiple status vars returned by parse!() with state accessor functions. Replace multiple return points in the main parse!() loop with a single return at the bottom. Make s_message_done a loop termination condition.
* Replace "while true" read loop in client with "while !eof(socket)" loop.
* Throw errors directly from parse!().
* Call show(r) for verbose logging and write(r) for comms (was using string(r) for both). Work around JuliaLang/MbedTLS.jl#113. Try to rationalise Base.write, Base.string, Base.show and opts.logbody (WIP).
* Fix parse error when fragmentation splits CRLF.
* Simple debug macro.
* function absuri(u::URI, context::URI)
* Parser interface tweaks in server (untested).
* Work in progress: separation of client functionality into layers. New modules:
  - Connect.jl - Make a simple TCP or SSL connection.
  - Connections.jl - Connection pool with interleaved requests/responses.
  - Messages.jl - Refactored Request and Response.
  - Body.jl - Static and streaming bodies.
  parser.jl:
  - Add isheadresponse flag.
  - Add onheaderscomplete callback.
  - Remove "extra" view; return number of bytes consumed instead.
  - Change bytes parameter to be a SubArray.
  client.jl and types.jl:
  - Unfinished refactoring of redirects, connection management etc.
* Added closeread/closewrite interface to manage pooling/interleaving. Simplify pool implementation.
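The layered separation above can be sketched as nested request methods, each layer dispatching to the next (an illustrative pattern only; the layer names and signatures here are hypothetical, not HTTP.jl's exact API):

```julia
# Sketch: a request "stack" where each layer type wraps the next layer.
abstract type Layer end
struct RetryLayer{Next} <: Layer end
struct ConnectLayer <: Layer end

# Retry layer: try the next layer a few times before giving up.
function request(::Type{RetryLayer{Next}}, url) where Next
    for attempt in 1:3
        try
            return request(Next, url)
        catch
            attempt == 3 && rethrow()
        end
    end
end

# Bottom layer: stand-in for actually connecting and transferring the message.
request(::Type{ConnectLayer}, url) = "response from $url"

resp = request(RetryLayer{ConnectLayer}, "http://example.com")
@assert resp == "response from http://example.com"
```

Because the stack is just a parametric type, layers can be added, removed, or reordered without touching the layers themselves — which is what later commits in this series exploit to make the stack dynamically configurable.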
* Add high level client interfaces: CookieRequest, RetryRequest.
  - Add redirect and header normalisation to SendRequest.
  - Move ::IO API extensions from Connect.jl to IOExtras.jl (unread!, closeread, closewrite).
  - Add chunked mode to write(::IO, ::Body); add setlengthheader().
  - Replace writebusy and readlock with writecount/readcount for managing connection interleaving in Connections.jl.
  - Clean up using and import statements.
  - Hook up StatusError.
  - Replace old parse() function with Request(::String) and Response(::String) constructors.
  - Replace "offset" scheme in URIs with SubStrings (WIP).
* Move default port (80/443) to getconnection.
  - Change `hostname` to `host` per obsoleting of `hostname` in the RFC: https://tools.ietf.org/html/rfc3986#appendix-D.2
  - Add `hostport` that returns e.g. "foobar.com:8080" or "[1:2::3:4]:8080" (the old `host` function returned "foobar.com8080" or "1:2::3:48080").
  - Replace Offset with SubString in urlparser.jl.
* Simplify Base.print(io::IO, u::URI), just print the string directly.
* Split util functions into: Pairs.jl, Strings.jl, debug.jl, parseutils.jl.
* Debug tweaks.
* Tweak import of MbedTLS.SSLContext.
* Update client.jl API to use refactored internals.
  - Bodies.jl - Only use chunked encoding if length is unknown.
  - Move redirect from SendRequest to CookieRequest layer so that cookies can be processed before redirect.
  - header() and setheader() - make header lookup case-insensitive.
  - Revert to original FIFOBuffer implementation.
* Improve show(io::IO, c::Connection).
* Parsers.jl cleanup:
  - Split output fields from struct Parser into struct Message.
  - Docstrings and cosmetics.
  - Pass Parsers.Message to p.onheaderscomplete callback.
  - Add const NOMETHOD.
  - Use @passert for inner parser assertions (see enable_passert).
  - Rename @strictcheck -> @errorifstrict.
  - Move http_should_keep_alive to server.jl.
  - Use HPE_INVALID_INTERNAL_STATE instead of error("unhandled state").
  - Fix up end of loop assertion: p_state == s_message_done || p == len.
* Doc examples.
* Architecture and Internal Interface Documentation.
* Whoops.
* Split stack further into separate modules. See module RequestStack and const DefaultStack in HTTP.jl.
* More refinement of layer separation, doc updates.
* Note that header functions are case-insensitive.
* Doc typos.
* Cache Parser in ConnectionPool. Make stack dynamically configurable.
* Simplify Base.read!(::IO, ::Parser) per JuliaLang/MbedTLS.jl#114.
* Fix prior broken commit.
* test/REQUIRE JSON.
* Fix for syntax: unexpected "catch" on 0.7.
* Temporarily limit travis to v0.6/linux.
* Ensure that all old client.jl keyword options are handled or produce a deprecation message. Simplify Client.options by using ::Vector{Tuple{Symbol,Any}} instead of ::RequestOptions. Add minimal flag in HTTP.jl to use a stripped-down stack config.
* Revert rename of parser.jl -> Parsers.jl to avoid confusing GitHub's diff display.
* Tweaks to "minimal" configuration.
* Fix connection pool locking for concurrent access to the same connection.
* Parser tweaks:
  - waitingforeof() true if p.state == s_start_req_or_res.
  - Check isheadresponse before F_CHUNKED in s_headers_done state. A HEAD response can include chunked, but has no body.
  - Consume trailing end of line after message. Saves unreading a stray newline at the end of a message.
* Cosmetics.
* Add state and status to ParsingError. Handle eof() before message has begun.
* Fix read/write locks in ConnectionPool.jl.
* Typo in @err macro.
* Add note about header fragmentation.
* ConnectionPool scheme refinement. Add `max_duplicates` setting. Up to `max_duplicates` connections for a certain scheme/host/port: create new connections if the existing connections are already locked for writing.
  If there are already `max_duplicates` connections for a scheme/host/port, return one of the existing connections (even though it is locked for writing by another task). The new call to lock(c.writelock) will then block, forcing the calling task to wait.
  This seems to work nicely with the AWSS3 concurrency tests: https://github.com/samoconnor/AWSS3.jl/blob/master/test/runtests.jl#L231
  Also added a new test at the bottom of test/client.jl. Inspection with tcpdump/wireshark shows that multiple requests are being written to the socket before the first response comes back; and once responses start coming, requests and responses stream continuously, utilising both halves of the TCP connection.
* HTTP.jl
  - Update top-level RequestStack.request method to convert body stream/data to a ::Body ASAP.
  - Pass response_body as positional (not kw) arg.
  Bodies.jl
  - Add isstreamfresh(::Body) to check if a stream is untouched and safe to use in a retry.
  Messages.jl
  - Add exception field to ::Response, set by writeandread() when there is an error processing the request. Rethrown by wait() and waitforheaders().
  SocketRequest.jl
  - Handle exceptions thrown by the writeandread() background task for streamed Response Bodies.
  RetryRequest.jl
  - Retry for Base.ArgumentError.msg == "stream is closed or unusable".
  - Don't retry if the previous attempt has altered the state of the request stream or the response stream.
  ConnectionPool.jl
  - Use writecount/readcount to ensure that the readlock is obtained in the correct sequence so that Responses match Requests. See closewrite().
  - Revised pool strategy:
    - Try to find a connection with no readers or writers, then
    - Open new connections up to the limit, then
    - Try to find a connection with pending readers, but writable, then
    - Sleep waiting for writes to finish, then try again.
    The previous scheme ended up over-interleaving requests while other connections sat idle.
  - New async test cases to validate connection interleaving.
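The four-step pool strategy above can be sketched in a few lines (a simplified illustration with hypothetical names — Conn, findconnection! — not ConnectionPool.jl's actual code):

```julia
# Sketch: revised connection-pool selection strategy.
mutable struct Conn
    writebusy::Bool
    readbusy::Bool
end

newconnection!(pool) = (c = Conn(false, false); push!(pool, c); c)

function findconnection!(pool::Vector{Conn}, limit::Int)
    i = findfirst(c -> !c.writebusy && !c.readbusy, pool)  # 1. fully idle
    i !== nothing && return pool[i]
    length(pool) < limit && return newconnection!(pool)    # 2. open new, up to limit
    i = findfirst(c -> !c.writebusy, pool)                 # 3. readers pending, but writable
    i !== nothing && return pool[i]
    return nothing                                         # 4. caller sleeps, then retries
end

pool = Conn[]
c1 = findconnection!(pool, 2)                 # opens first connection
c1.writebusy = true
c2 = findconnection!(pool, 2)                 # opens second (first is write-busy)
c2.writebusy = c2.readbusy = true
@assert findconnection!(pool, 2) === nothing  # all busy, at limit -> wait
c1.writebusy = false; c1.readbusy = true
@assert findconnection!(pool, 2) === c1       # writable with a pending reader
```

Preferring idle connections first, and only then interleaving onto busy ones, is what avoids the over-interleaving noted above.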
* v0.7 compat.
* Remove old async test, see new test/async.jl.
* v0.7 updates.
* Split parse! into parseheaders! and parsebody!.
* Remove unreachable state s_dead.
* Add readbody! and readheaders!.
* API notes.
* More extreme async and streaming tests: async.jl.
  New HTTPStream <: IO module replaces Bodies.jl:
  - Brings together Parser, Message and Connection.
  - Provides transparent IO-compatible writing/reading of chunked bodies.
  - Provides an HTTP Body-aware eof().
  - Supports better streaming via new open() API.
  New HTTP.open() API function for streaming bodies:
      HTTP.open(...) do io ... end
  - Removes the need for tasks and condition signaling.
  parser.jl:
  - Remove callback functions.
  - Reduce interface down to parseheaders() and parsebody().
  - parseheaders is now called as: parseheaders(p, bytes) do header ... end
  - parseheaders returns a SubArray of the unprocessed bytes.
  - parsebody returns a tuple of SubArrays: the processed and unprocessed bytes.
  Messages.jl:
  - Replace ::Body with ::Vector{UInt8}. Simpler.
  - Add explicit Response.request and Request.response fields.
  - Remove obsolete waiting functions.
  - Remove connectparser. Parser no longer needs connecting.
  - Add readtrailers().
  - Remove writeandread() - now handled by StreamLayer.
  ConnectionRequest.jl:
  - Close io on error.
  ConnectionPool.jl:
  - Add reuse_limit option. e.g. httpbin.org seems to drop the connection after 100 requests, so it's better to set a limit to avoid sending hundreds of requests to a connection when only the first few will get processed.
  IOExtras.jl:
  - Add unread! methods for IOBuffer and BufferStream (for use in testing).
  Move MessageLayer above RetryLayer and ExceptionLayer to give RetryLayer access to Request and Response structs. Used to determine retryability of body streaming.
  MessageRequest.jl
  - Add bodylength and bodybytes functions (with many methods for different types).
  - Use these to set Content-Length/Transfer-Encoding.
  RetryRequest.jl
  - Use special const values and === to mark "body_was_streamed" to avoid reusing a stream.
  consts.jl
  - Remove unused consts.
  Move VERSION specific stuff to compat.jl.
* Move request() function from RequestStack module to top-level HTTP module. Add async streaming PUT and GET tests using AWS S3 (added simple AWS4AuthRequest.jl layer to sign requests).
* Whoops.
* More async streaming tests (and resulting bug-fixes).
  ConnectionPoolLayer
  - Accept connectionpool::Bool=true option instead of separate ConnectLayer.
  ConnectionPool.jl
  - Add NonReentrantLock, wrapper for ReentrantLock with assertion of no multiple locking.
  - Many new locking logic assertions.
  - Make locking logic more rigid.
  HTTPStream.jl
  - Fix write-to-read transition logic in eof().
  RetryRequest.jl
  - Reset response body before processing retry.
  parser.jl
  - Rename setheadresponse() to setnobody().
* Need eof bugfix in MbedTLS 0.5.2+.
* Tweak AWS4AuthRequest.jl debug levels.
* DEBUG_LEVEL = 0 by default.
* Added lockedby() to compat.jl.
* Type annotation.
* Cosmetics.
* Reinstate verbose= option.
* ConnectionPool.jl enhancements:
  - Remove NonReentrantLock for simplicity; testing indicates that locking is ok now; retain assertions.
  - Add pipeline_limit option.
  - Split startread() code out of closewrite().
  - Cosmetics.
  - Move localport and peerport to IOExtras.jl.
* Test tweaks.
* Added TimeoutLayer.
* Concurrency enhancements to handle RFC7230 6.5, early termination:
  - Add struct ConnectionPool.Transaction <: IO to manage state of a single transaction (Request/Response) on a shared Connection. The Transaction struct holds a sequence number and a reference to the connection.
  - Decouple read/write lock/unlock sequencing in ConnectionPool. Each Transaction now knows its sequence number. closewrite() no longer calls startread().
    Reading can now begin before writing has finished, or start much later, without the need to atomically unlockread+lockwrite.
  - Remove redundant Connection.writelock.
  - Handle RFC7230 6.5, early send-body termination on error: the send body runs in an @async task and closeread(::HTTPStream) checks for error code and "Connection: close".
  - iserror(::Messages) false if status is 0 (not yet received).
  - Don't write(::IO, ::Request) the whole request in StreamLayer; instead startwrite(::HTTPStream) sends the headers automatically and StreamRequest.jl sends the body (static or streamed).
  - Call startwrite(::HTTPStream) in the HTTPStream{Response} constructor to send headers automatically when the stream is created.
  - Rename readheaders(::HTTPStream) -> startread(::HTTPStream).
  - Rename writeend(::HTTPStream) -> closewrite(::HTTPStream).
  - Rename close(::HTTPStream) -> closeread(::HTTPStream). StreamLayer now calls closeread (not close); ConnectionPoolLayer now calls close when pooling is disabled (i.e. one-request-per-connection mode).
  - Add @require from https://github.com/JuliaLang/julia/pull/15495/files
  - IOExtras.jl stubs for startwrite(), startread() and closeread().
* Typo.
* ConnectionPool eof() handles closed Transaction.
  - Remove auto startwrite() in HTTPStream constructor (better to have the symmetry of startwrite()/closewrite() in request(StreamLayer, ...)).
  - In closeread(http::HTTPStream) don't call closeread(::Transaction) if the Transaction is already closed (e.g. due to exception or early abort).
  - Split out default_iofunction from request(StreamLayer, ...).
* Fix pipeline_limit off-by-one; add Base.close(c::Connection) = Base.close(c.io).
* More variation of parameters in test/async.jl.
* Add test/WebSockets.jl to test connection upgrade.
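The Transaction sequencing described above — each transaction knowing its position in the pipeline so responses are read in request order — can be sketched like this (simplified, hypothetical field names; not the real ConnectionPool.jl structs):

```julia
# Sketch: pipelined transaction sequencing on a shared connection.
mutable struct Connection
    writecount::Int   # transactions that have finished writing
    readcount::Int    # transactions that have finished reading
end

struct Transaction
    c::Connection
    sequence::Int     # this transaction's position in the pipeline (0-based)
end

# A transaction may start reading only when all earlier responses are done,
# so Responses always match up with their Requests.
canstartread(t::Transaction) = t.c.readcount == t.sequence

c = Connection(0, 0)
t1, t2 = Transaction(c, 0), Transaction(c, 1)
@assert canstartread(t1) && !canstartread(t2)
c.readcount = 1          # t1 has finished reading its response
@assert canstartread(t2)
```

Because each Transaction carries its own sequence number, closewrite() on one transaction no longer has to hand the read lock directly to the next: reads simply wait until the counter reaches their slot.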
* Ignore 100 Continue message in startread(http::HTTPStream).
* Test for 100-continue.
* MUST NOT automatically retry a request with a non-idempotent method: https://tools.ietf.org/html/rfc7230#section-6.3.1
* Fix deadlock when no connection duplication is allowed.
* https://tools.ietf.org/html/rfc7230#section-6.3.2 "A user agent SHOULD NOT pipeline requests after a non-idempotent method, until the final response status code for that method has been received."
* Fix isrecoverable()/isidempotent() bug. Retry on 408 timeout and 403 forbidden (a credentials timeout is fixable by retrying). Improve retry debug output.
* Add loopback test for abort while sending body (& fix bugs). Add loopback tests for streaming and collection body types (& fix bugs). Do closewrite() in close(t::Transaction) to ensure aborted connections are not stuck in writable state. Implement post-abort exception handling in StreamRequest.jl.
* Don't set chunked header when upgrade header is set.
* Add tests for pipeline_limit and duplicate_limit.
* Pass Content-Length: 0 to S3 GET in tests to avoid automatic chunked header. Tweak debug levels.
* Added tests for no pipelining after non-idempotent.
* Fix for closed connections stuck in pool. When a connection failed during write, readcount was one less than writecount. Remove read/write count check from purge; now simply remove closed connections from the pool. Any outstanding Transactions have their own reference to the Connection anyway.
* Loopback test connection pool cleanup.
* close before purge in close(c::Connection).
* Set Content-Length: 0 by default for iofunction GET requests.
* Tweak delays in loopback test to remove race conditions.
* Don't run AWS S3 tests if creds ENV not set.
* Whoops.
* Docstring cleanup, work in progress.
* Arch doc.
* Doc cleanup.
* Fix !minimal bug in stack().
* Doc cleanup. Wrap IO errors in IOError type.
* Remove Connect.jl module. Rename duplicate_limit to connection_limit. Rename timeout to readtimeout (to match the original API).
  Use const nobody for the default empty body to avoid allocation. Make default headers = Header[] rather than just [] for type stability. Move IOExtras unread!(::IOBuffer) and unread!(::BufferStream) to ../test/. Improve Streams.jl doc strings. Add `URI(uri::URI) = uri` to avoid reconstructing URIs.
* Doc updates.
* Typo.
* Fix show(::IOError). Close response_stream after writing in StreamRequest.jl.
* Handlers tweaks for compatibility with HTTP.Request.
* Remove redundant connectionclosed(p::Parser), replaced by hasheader(http, "Connection", "close").
* Added @ensure to debug.jl.
* Remove unneeded abbreviation of request to req.
* Added hasheader(::Message, key, value).
* ConnectionPool and HTTP.Stream tweaks for server mode compatibility.
  ConnectionPool:
  - Make startwrite/closewrite locking symmetrical with startread/closeread.
  - Use a Condition and busy flag instead of ReentrantLock to ensure consistent behaviour regardless of what task the start/end functions are called from.
  - Add Connection.sequence for initialisation of new Transactions' sequence numbers.
  - Base.isopen(t::Transaction) returns false after read and write are closed.
  MessageRequest.jl:
  - Don't set "chunked" header; this is now done in startwrite(http::Stream) for client and server modes.
  Streams.jl:
  - Add messagetowrite(http::Stream{*}).
  - Set "chunked" header in startwrite(http::Stream) when needed.
  - Call startwrite() in first call to Base.unsafe_write().
  - Add closewrite(http::Stream{Request}) and closeread(http::Stream{Request}) for server mode.
* Cosmetics.
* WIP integrating connection pipelining and HTTP.Stream with server.jl.
* Add HTTP.listen server API:

      HTTP.listen() do http
          write(http, "Hello World!")
      end

  Add HTTP.WebSockets.listen API as a test case for HTTP.listen:

      HTTP.WebSockets.listen() do ws
          write(ws, "WebSockets Echo Server...\n")
          while !eof(ws)
              s = String(readavailable(ws))
              println("in: $s")
              write(ws, "echo: $s\n")
          end
      end

* Parser cleanup post "Connection: Upgrade" testing.
  - Rename ULLONG_MAX => unknown_length.
  - Rationalise reset!() / constructors.
  - Move some init from reset!() to state s_start_req_or_res.
  - Replace flags with chunked::Bool and trailing::Bool.
  - Dead code removal:
    - No need for special treatment of Connection: and Upgrade: headers. The Parser does not use these internally. (Content-Length: and Transfer-Encoding: are still used to determine how to parse the body.)
    - The hard-coded list of methods precludes use of other registered/custom methods, e.g. https://tools.ietf.org/html/rfc3253#section-3.5. Method is a token: https://tools.ietf.org/html/rfc7230#section-3.1.1 https://tools.ietf.org/html/rfc7230#section-3.2.6 So, parse the method until the first non-token character.
* Remove commented-out dead code. Add parse_token function, used to parse method.
* Add Base.readbytes!(t::Transaction, a::Vector{UInt8}, nb::Int).
* Messages.jl
  - Fix case insensitivity of hasheader.
  - Add RFC-compliant bodylength(::Request/::Response) methods. See nodejs/http-parser#403.
  - Update readbody() to only use Parser for chunked bodies.
* Revise Streams.jl to bypass the parser for non-chunked bodies. readavailable(::HTTP.Stream) for non-chunked messages is now a very thin wrapper for readavailable(::TCPSocket) (or readbytes!(::TCPSocket, ...) if nb_available() is more than Content-Length). Parser overhead is removed. unread! overhead is removed.
* Use ConnectionPool.byteview convenience fn in Base.readavailable(http::Stream).
* Remove redundant header interpretation from Parser. bodylength(::HTTP.Message) implements RFC-compliant body length determination. The Parser no longer needs to interpret Content-Length or "chunked" headers. No need to "parse" non-chunked bodies; just let the client read them directly from the stream.
  Update tests per: nodejs/http-parser#403
* #135 (review)
* #135 (review)
* Doc tweak.
* Typo.
* #135 (review)
* Doc fix.
* Doc tweaks.
* Remove redundant import in Messages.jl.
* Typos, and add escapepath to URIs module.
* Rename Nitrogen -> Servers. 042dab6#r26814240. Add logging macros to compat.jl.
* 0.6 tweaks. Rename some files to match module names.
* Update README.
* Merge samoconnor#1. Merged by hand; many of these changes were already made.
* Compat for Val.
* IOExtras.closeread(::Stream{Request}) fix for server mode.
* Add http://localhost:8081 server for ConnectionPool debug. Connecting to http://localhost:8081 while running the async test shows an auto-refreshing page displaying connection pool state.
* JuliaLang/julia#25479
* Latest v0.7 master compat tweaks:
  - f(;kw...): kw now seems to be an iterator. The only way I could see to get a NamedTuple was merge(NamedTuple, kw).
  - b"fooo" now yields CodeUnits, so Vector becomes AbstractVector in sniff.
* Move STATUS_CODES const to Messages.jl (the only place it is used) and use a Vector instead of a Dict.
* Don't include consts.jl in the top-level HTTP namespace.
  - Rename STATUS_CODES => STATUS_MESSAGES.
  - Use symbols for ParsingErrorCode to avoid duplicating the list of error names.
  - Put constants only used by the parser in Parsers.jl.
  - Remove Method enum. Don't want to limit methods to the ones in the enum.
* Server doc updates.
* Untested integration of Server and serve with listen.
* FIXME
* Add MicroLogging for Windows v0.6.
* Rename handlers.jl to Handlers.jl.
* Untested implementation of HTTP.serve() calling HTTP.listen().
* Server integration and tests. Add kw option to set http_version. In closewrite(::Stream{Request}) close the connection for non-keep-alive HTTP/1.0.
* Oops.
* Disable extra debug printf in test/server.jl.
* Return status 413 for too-large headers. #135 (comment)
* Randomise server port number in tests.
* iso8859_1_to_utf8 per #141.
* Typo.
* Naming consistency tweaks.
  URLs are the subset of URIs that specify a means of accessing a resource in addition to specifying the resource name, e.g. http://foo.com/bar?a=b is a URL but bar?a=b is not.
  Changes:
  - Variables of type URI that must contain a complete URL renamed uri -> url.
  - Remove the "URL" constructor for URIs. It seems to have been constructing URIs that were not URLs for some input, which seems wrong. The other "URI" constructors should suffice.
  - Simplify the main kwargs URI constructor by putting the sanity checks in preconditions.
  - Rename HTTP.Request.uri -> HTTP.Request.target.
  - Add "target=" kw arg to request(MessageLayer, ...) to allow setting the target to something that cannot be derived from the request URL.
  - Remove the old behaviour (I assume untested) of handling "CONNECT" as a special case and using the Request URL host:port as the target. It seems that this would have created a request for the server to proxy to itself.
  - Remove the now-redundant option isconnect from the url parser. A client making a CONNECT Request should use target="$host:$port". A server implementing a CONNECT proxy should use http_parse_host(target).
* Update tests and compat.
* Try to fix InexactError in win32.
* Update @ensure macro to print values of failed comparison (expr-fu stolen from Test.jl).
* Add postconditions to URI(;kw...) and parse(::URI, ...) to ensure that the result can be losslessly transformed back into the input.
  - Don't allow '*' at the start of a URI path. 1226c4b#r26878192
  - Add sentinel const blank_userinfo to handle test cases with empty '@' userinfo.
  - Remove UF_XXX and UF_XXX_MASK constants. It's simpler to use the state machine constants to keep track of state, and local variables to keep track of values.
* Handle asterisk-form Request Target: https://tools.ietf.org/html/rfc7230#section-5.3.4
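The "method is a token" change above (parse the method until the first non-token character, rather than matching a hard-coded method list) can be sketched like this — an illustrative stand-in, not the actual parse_token in Parsers.jl:

```julia
# Sketch: parse an HTTP method as an RFC 7230 "token".
# tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
#         "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA
istchar(c::Char) = isascii(c) &&
    (isletter(c) || isdigit(c) || c in "!#\$%&'*+-.^_`|~")

# Split `s` into the leading token and the remainder.
function parse_token(s::AbstractString)
    i = firstindex(s)
    while i <= lastindex(s) && istchar(s[i])
        i = nextind(s, i)
    end
    return SubString(s, firstindex(s), prevind(s, i)), SubString(s, i)
end

@assert parse_token("GET / HTTP/1.1") == ("GET", " / HTTP/1.1")
@assert parse_token("M-SEARCH * HTTP/1.1")[1] == "M-SEARCH"   # custom method is fine
```

Accepting any token lets registered or extension methods (e.g. the WebDAV methods of RFC 3253) pass through the parser without maintaining an enum of known methods.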