Memory management #1337
Comments
There is some relevant information about the impact of using TL;DR the libxml2 parser was measured as being 1.12x - 1.34x slower when using
I think the best solution here is to only use arena allocations. Since this seems performance-critical, I think it should be in for 1.0. But I also don't know what the vision/general requirements for 1.0 are. I think we need to review and clean up all nodes for 1.0, as after that the AST will be expected to be fully stable.
Yeah @flavorjones I remember you showing me the nokogiri metrics. I think we should build it either way in order to be able to measure it on YARP for sure.
@HParker did you want to take a look at the first bullet here?
Sure! I might have already started looking into it. :)
At the moment, we allocate memory as we need it, making a single `malloc` call for every node in the tree, as well as for whichever fields on those nodes need memory. When we're done with the tree, we recursively visit it and deallocate everything at once. This is a really good candidate for arena allocation. But more than that, it would be good enough just to centralize our memory management functions, which we need to do anyway. Below are a couple of tasks related to memory that would improve our general memory story.

- We currently call the `malloc`/`calloc`/`realloc`/`free` functions whenever we need memory. (We purposefully avoid `alloca` because it's not C standard.) Those function calls are scattered throughout the codebase. For all subsequent bullets, we need them to go through centralized functions. These functions should all accept a `yp_parser_t *`, but it can be marked as unused for now (`YP_ATTRIBUTE_UNUSED`). For this first task, they should all simply call the stdlib function. For example, a `yp_malloc` function would call `malloc` with the same arguments.
- Add function pointers to `yp_parser_t` such that the `yp_` versions of the memory functions call the function pointers if they are provided. We need this for CRuby, which provides `xmalloc`, which will attempt a GC if a call to `malloc` fails. This should follow the same pattern we have for the `encoding_changed` callback, which is that it provides a function that accepts a function pointer.
- Use arena allocation. `yp_malloc` would then change to call `malloc`/`xmalloc` only when the current arena has run out of allocatable memory. We can be very naive with this and make it simply a bump allocator since we so rarely deallocate nodes while parsing.

The first two of these tasks are necessary for our integration into CRuby. The last one is a nice-to-have enhancement, and is not necessary for v1.0.
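
To make the three tasks concrete, here is a rough sketch of how they could stack on top of each other. It is only an illustration under some assumptions, not the actual implementation: the `yp_parser_t` below is a stripped-down stand-in for the real struct, and `yp_alloc_fn_t`, `yp_free_fn_t`, `yp_arena_t`, `yp_parser_register_allocator`, `yp_arena_free_all`, and `YP_ARENA_BLOCK_SIZE` are hypothetical names invented for this example. Overflow checks and `calloc`/`realloc` equivalents are left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical callback types, matching the shapes of malloc and free.
typedef void *(*yp_alloc_fn_t)(size_t size);
typedef void (*yp_free_fn_t)(void *ptr);

// Hypothetical arena block: a header followed by the bytes we hand out.
typedef struct yp_arena {
    struct yp_arena *prev; // previously filled block, released with the tree
    size_t capacity;       // usable bytes in data
    size_t used;           // bytes already handed out
    _Alignas(max_align_t) unsigned char data[];
} yp_arena_t;

// Simplified stand-in for the real yp_parser_t (assumption for this sketch).
typedef struct {
    yp_alloc_fn_t alloc_fn; // NULL means "use stdlib malloc"
    yp_free_fn_t free_fn;   // NULL means "use stdlib free"
    yp_arena_t *arena;      // current block, or NULL before the first allocation
} yp_parser_t;

#define YP_ARENA_BLOCK_SIZE ((size_t) 64 * 1024)
#define YP_ARENA_ALIGN (_Alignof(max_align_t))

// Round a size up so every returned pointer is suitably aligned.
static size_t yp_align_up(size_t size) {
    return (size + YP_ARENA_ALIGN - 1) & ~(YP_ARENA_ALIGN - 1);
}

// Task 2: a function that accepts function pointers. An embedder such as
// CRuby could pass its own allocator here. Call it before allocating.
void yp_parser_register_allocator(yp_parser_t *parser, yp_alloc_fn_t alloc_fn, yp_free_fn_t free_fn) {
    parser->alloc_fn = alloc_fn;
    parser->free_fn = free_fn;
}

// Tasks 1 and 2: the only place that talks to the underlying allocator.
// For task 1 alone, the body would just be `return malloc(size);` with the
// parser parameter marked YP_ATTRIBUTE_UNUSED.
static void *yp_system_malloc(yp_parser_t *parser, size_t size) {
    if (parser->alloc_fn != NULL) return parser->alloc_fn(size);
    return malloc(size);
}

static void yp_system_free(yp_parser_t *parser, void *ptr) {
    if (parser->free_fn != NULL) parser->free_fn(ptr);
    else free(ptr);
}

// Task 3: naive bump allocation. The underlying allocator is only called
// when the current block cannot satisfy the request.
void *yp_malloc(yp_parser_t *parser, size_t size) {
    assert(parser != NULL);
    size = yp_align_up(size);

    yp_arena_t *arena = parser->arena;
    if (arena == NULL || arena->capacity - arena->used < size) {
        size_t capacity = size > YP_ARENA_BLOCK_SIZE ? size : YP_ARENA_BLOCK_SIZE;
        yp_arena_t *block = yp_system_malloc(parser, sizeof(yp_arena_t) + capacity);
        if (block == NULL) return NULL;

        block->prev = arena;
        block->capacity = capacity;
        block->used = 0;
        parser->arena = arena = block;
    }

    void *result = arena->data + arena->used;
    arena->used += size;
    return result;
}

// Release every block at once instead of walking the tree node by node.
void yp_arena_free_all(yp_parser_t *parser) {
    yp_arena_t *arena = parser->arena;
    while (arena != NULL) {
        yp_arena_t *prev = arena->prev;
        yp_system_free(parser, arena);
        arena = prev;
    }
    parser->arena = NULL;
}
```

With something like this, call sites would only ever see `yp_malloc(parser, size)`, so moving from the plain stdlib wrapper to the callback version to the arena version would not require touching them again. Individual nodes are never freed in this sketch, which matches the "we so rarely deallocate nodes while parsing" observation but would need revisiting for any code path that relies on `realloc`.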