Avoid keeping hold of partial bytes forever. (#984)
Motivation:

The HTTPDecoder is a complex object that has very careful state management goals. One source of this complexity is that it is fed a stream of bytes with arbitrary chunk sizes, but needs to produce a collection of objects that are contiguous in memory. For example, each header field name and value must be turned into a String, which requires a contiguous sequence of bytes.

As a result, it is quite common to have a situation where the HTTPDecoder has only *part* of an object that must be emitted atomically. In this situation, the HTTPDecoder would like to instruct its ByteToMessageHandler to keep hold of the bytes that form the beginning of that object. To avoid asking http_parser to parse those bytes twice, the HTTPDecoder uses a value called httpParserOffset to keep track.

As an example, consider what would happen if the "Connection: keep-alive\r\n" header field was delivered in two chunks: first "Connection: keep-al", and then "ive\r\n". The header field name can be emitted in its entirety, but the partial field value must be preserved. To achieve this, the HTTPDecoder will store an offset internally to keep track of which bytes have been parsed. In this case, the offset will be set to 7: the number of bytes in "keep-al". It will then tell the rest of the code that only 12 bytes of the original 19 byte message were consumed, causing the ByteToMessageHandler to preserve those 7 bytes.

However, when the next chunk is received, the ByteToMessageHandler will *replay* those bytes to HTTPDecoder. To avoid parsing them a second time, HTTPDecoder keeps track of how many bytes it is expecting to see replayed. This is the value in httpParserOffset.

Due to a logic error in the HTTPDecoder, the httpParserOffset field was never returned to zero. This field would be modified whenever a partial field was received, but would never be returned to zero when a complete message was parsed. This would cause the HTTPDecoder to unnecessarily keep hold of extra bytes in the ByteToMessageHandler even when they were no longer needed. In some cases the number could get smaller, such as when a new partial field was received, but it could never drop to zero even when a complete HTTP message was received.

Happily, due to the rest of the HTTPDecoder logic this never produced an invalid message: while ByteToMessageHandler was repeatedly producing extra bytes, it never actually passed them to http_parser again, or caused any other issue. The only situation in which a problem would occur is if the HTTPDecoder had a RemoveAfterUpgradeStrategy other than .dropBytes. In that circumstance, decodeLast would not consume any extra bytes, but those bytes would have remained in the buffer passed to decodeLast, which would then incorrectly *forward them on*. This is the only circumstance in which this error manifested, and in most applications it led to surprising and irregular crashes on connection teardown. In all other applications the only effect was unnecessarily preserving a few tens of extra bytes on some connections, until receiving EOF caused us to drop all that memory anyway.

Modifications:

- Return httpParserOffset to 0 when a full message has been delivered.

Result:

Fewer weird crashes.

(cherry picked from commit ae3d298)
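
The offset bookkeeping described in the motivation can be sketched with a toy model. This is not the real HTTPDecoder or any SwiftNIO API: `ToyDecoder`, `feed`, `parserOffset`, and `retainedAfterExample` are hypothetical names, the model tracks only byte counts, and the second chunk is assumed to also terminate the message so that the reset path is exercised.

```swift
// Toy model only: these names are hypothetical and are not SwiftNIO API.
struct ToyDecoder {
    /// Bytes the caller was asked to retain and will replay next time.
    /// Plays the role of httpParserOffset in the commit message.
    var parserOffset = 0

    /// When false, the offset is never reset, which models the bug.
    let resetOnMessageComplete: Bool

    init(resetOnMessageComplete: Bool) {
        self.resetOnMessageComplete = resetOnMessageComplete
    }

    /// - Parameters:
    ///   - byteCount: size of the buffer, including any replayed prefix.
    ///   - partialTail: trailing bytes that belong to an incomplete object.
    ///   - messageComplete: whether this buffer finished an HTTP message.
    /// - Returns: how many bytes the caller may discard.
    mutating func feed(byteCount: Int, partialTail: Int, messageComplete: Bool) -> Int {
        precondition(parserOffset <= byteCount && partialTail <= byteCount)
        if partialTail > 0 {
            // A partial object must be kept: remember how many bytes to skip on replay.
            parserOffset = partialTail
        } else if messageComplete && resetOnMessageComplete {
            // The fix: nothing needs to be retained once a full message is delivered.
            parserOffset = 0
        }
        return byteCount - parserOffset
    }
}

/// Runs the two-chunk example and returns the bytes still retained afterwards.
func retainedAfterExample(resetOnMessageComplete: Bool) -> Int {
    var decoder = ToyDecoder(resetOnMessageComplete: resetOnMessageComplete)
    // Chunk 1: "Connection: keep-al" is 19 bytes; the last 7 are a partial value.
    let firstConsumed = decoder.feed(byteCount: 19, partialTail: 7, messageComplete: false)
    let retained = 19 - firstConsumed
    // Chunk 2 (assumed): 7 new bytes, "ive\r\n\r\n", which also end the message.
    let bufferSize = retained + 7
    let secondConsumed = decoder.feed(byteCount: bufferSize, partialTail: 0, messageComplete: true)
    return bufferSize - secondConsumed
}
```

In this model, calling `retainedAfterExample(resetOnMessageComplete: false)` leaves the 7 stale bytes held after the message completes, while passing `true` releases them, which is the behaviour the modification aims for.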