Optimize text with escapes parsing #2719
846d398 to 78027c7
sbc100
left a comment
This seems like some extra code complexity. Are we sure it's worth it? Can you quantify the memory savings?
Presumably the savings are only in peak memory during parsing, and the memory usage at the end of parsing will be unchanged?
  return Result::Ok;
}

static const size_t kInlineBufferSize = 96;
No reason. A 100-byte stack is not big, and people prefer shorter names.
Maybe worth a comment, even if only to mention that it's somewhat arbitrary.
I made a measurement with CPU cycle counters. The parsing time of "[Method] 012345778\61::func" is reduced from 90960 cycles to 77543 cycles in release mode, which is a 17% speedup (about 15% fewer cycles). The overall runtime is 0.0000379 sec, so this is negligible. This patch is intended to be a small improvement. Of course it is not that important, and I don't mind if it is rejected. The GC patches are more important to me.
I'm somewhat conflicted because I think we should aim to keep wabt simple where possible. But performance is nice. Perhaps we could leave this open and reconsider if folks are noticing slow parse times for text files. Another measurement I'd be curious about: does this change have any measurable effect on the time it takes to run the test suite?
This patch is a small optimization for quoted text parsing: it reduces the number of allocations, and the final string is allocated once (capacity == size, so no bytes are wasted at the end of the buffer).
An internal buffer is allocated at most once, and only when the inline buffer is too small. The final size of the unescaped buffer is always less than or equal to the original size, so the original size is a safe upper bound for that buffer. This observation can be used to reduce memory allocations.
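
For readers who want to see the idea in isolation, here is a minimal sketch of the approach described above. It is not the code from this patch: `UnescapeText` and `HexDigit` are hypothetical names, the escape set is simplified to `\n`, `\t`, `\r`, `\\`, `\'`, `\"` and `\hh` (two hex digits), and only `kInlineBufferSize = 96` is taken from the diff. Whether the result string ends up with capacity == size depends on the standard library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Value taken from the diff; the reviewers note it is somewhat arbitrary.
static const size_t kInlineBufferSize = 96;

// Hypothetical helper: returns the value of a hex digit, or -1 if invalid.
static int HexDigit(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Hypothetical function: decodes a simplified escape set into *out.
// Returns false (leaving *out untouched) on a malformed escape.
bool UnescapeText(const std::string& text, std::string* out) {
  char inline_buffer[kInlineBufferSize];
  std::unique_ptr<char[]> heap_buffer;
  char* buffer = inline_buffer;

  // Every escape consumes at least two input bytes and produces one output
  // byte, so text.size() is an upper bound on the decoded length. The heap
  // buffer is therefore allocated at most once, and only for long inputs.
  if (text.size() > kInlineBufferSize) {
    heap_buffer.reset(new char[text.size()]);
    buffer = heap_buffer.get();
  }

  size_t length = 0;
  for (size_t i = 0; i < text.size(); ++i) {
    char c = text[i];
    if (c != '\\') {
      buffer[length++] = c;
      continue;
    }
    if (++i >= text.size()) return false;  // Trailing backslash.
    switch (text[i]) {
      case 'n': buffer[length++] = '\n'; break;
      case 't': buffer[length++] = '\t'; break;
      case 'r': buffer[length++] = '\r'; break;
      case '\\': buffer[length++] = '\\'; break;
      case '\'': buffer[length++] = '\''; break;
      case '"': buffer[length++] = '"'; break;
      default: {
        // Expect two hex digits, e.g. \61 for 'a'.
        if (i + 1 >= text.size()) return false;
        int hi = HexDigit(text[i]);
        int lo = HexDigit(text[i + 1]);
        if (hi < 0 || lo < 0) return false;
        buffer[length++] = static_cast<char>(hi * 16 + lo);
        ++i;  // Skip the second hex digit.
        break;
      }
    }
  }

  assert(length <= text.size());
  // The final string is built once, from the already known final length.
  out->assign(buffer, length);
  return true;
}
```

With this sketch, calling `UnescapeText("012345778\\61::func", &out)` stays within the inline buffer and decodes `\61` to `a`, while an input longer than 96 bytes takes the single heap allocation path.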