fix(proxy-wasm) fix memory growth on low-isolation modes #420
Conversation
Force-pushed from 7a0b343 to 7fb5bd2.
Codecov Report

```
@@             Coverage Diff              @@
##                 main        #420       +/-  ##
===================================================
- Coverage   90.26802%   90.15278%   -0.11525%
===================================================
  Files             46          46
  Lines           8395        8378         -17
===================================================
- Hits            7578        7553         -25
- Misses           817         825          +8
```
Very nice! And also good updates overall.
> Turns out that `ngx_pfree`, which was being used in the code to free `args` and `rets`, does not free small allocations. It only frees large allocations; small allocations are only freed when the pool itself is destroyed.
Hmm... Welp, that much I knew (that's also why the nopool patch exists), so my first thought upon reading your summary was that request pools are also destroyed... 🤔 So far it's been the intention to attach everything to the request pool. Checking ngx_http_proxy_wasm.c:361 shows that for main requests we allocate the filter chain in `r->connection->pool`; I think that was a wrong choice, since those pools might not get destroyed, pretty sure; overall I don't think this ever changed since the prototype... When subrequest filter chains became tested I fixed it for them but left main requests on the connection pool, probably not wanting to change something that, at the time, seemed to work.

Maybe using `r->pool` in all cases would have been a sufficient fix, but I do like giving filter chains their own pools better.
```diff
@@ -242,7 +242,8 @@ ngx_proxy_wasm_ctx_alloc(ngx_pool_t *pool)
         return NULL;
     }

-    pwctx->pool = pool;
+    pwctx->pool = ngx_create_pool(128, pool->log);
```
The call can fail, let's catch the error; perhaps also make `128` a `#define`? It probably is a little low too, no?
Another note too: I think we could first create the pool, then allocate `pwctx` inside of it? That way there's no need to keep track of the parent pool at all, and it could even disappear from this function's arguments.
> The call can fail, let's catch the error

Will do!

> perhaps also make `128` a `#define`? It probably is a little low too, no?
It was a coin toss, as both 128 and 512 are used in other pools:

```
] g -r create_pool
common/lua/ngx_wasm_lua.c                        128: pool = ngx_create_pool(512, log);
common/ngx_wasm_socket_tcp.c                     573: c->pool = ngx_create_pool(128, sock->log);
http/proxy_wasm/ngx_http_proxy_wasm_dispatch.c   210: call->pool = ngx_create_pool(512, r->connection->log);
http/ngx_http_wasm_util.c                        626: c->pool = ngx_create_pool(128, c->log);
```
Ok, so I actually took the time now to understand how this `size` argument for `ngx_create_pool` works: it's the size of the memory block that holds the `ngx_pool_s` plus an allotment for small allocations; when that gets full, another block of the same size gets allocated; and it's intended to be at most `ngx_pagesize - 1`, usually 4095. So yeah, 512 should be fine.

I didn't make it a `#define` because none of the others did, so I was following the existing patterns. Was there a criterion for picking 128 or 512, or should we `#define` them all to the same number?
Yes, it is how it works. Maybe we can have `NGX_WASM_POOL_SIZE_S` and `NGX_WASM_POOL_SIZE_M` with `128` and `512`? Fine to hard-code `512` as well given we already do it; we can `#define` later.
The reasoning for not giving `pwctx` its own pool was to avoid an additional memory allocation for each request (which nginx tries to avoid in general) -> everything goes into the request memory pool. For dispatch calls or fake connections, we give them their own pool since not all filters will use those operations. But all requests will have to allocate a filter chain, hence `r->pool`... All in all, I think we should keep the `r->pool` option open (hence the `#if 0` block comment).
> I think we could first create the pool, then allocate `pwctx` inside of it?
Can do, if you prefer this pattern. Personally, I always try to organize memory allocations as an ownership tree, that helps me track the lifetimes of allocations by looking at the structure definitions rather than by tracking the execution flow. Having the structure that owns the pool stored inside the pool itself makes the ownership relationship circular, which takes more attentive coding as it gets harder and harder to track all flows in your head as things grow, so it's a pattern I've always avoided. (I believe it's the kind of pattern you cannot even represent (easily?) in Rust, for example — I guess I've been "thinking in Rust" before Rust existed 😅 )
I wanted to avoid refactors to keep the PR small and the review easier, but if it were up to me, I'd prefer instead to make `ngx_proxy_wasm_ctx_alloc` and `ngx_proxy_wasm_ctx_destroy` more symmetrical, not by removing the parent pool argument, but by passing it to the destroy function as well, making them more like `ngx_pcalloc` and `ngx_pfree` (which would then remove the `parent_pool` field from the struct).

That, of course, requires that the owner of the parent pool is reachable at destruction time. From a quick look at the code it seems that those alloc and destroy functions are called in pretty different places (one under `http/`, another under `common/`), so a change to make things more symmetrical would be more code surgery than I think should happen in this PR (though I still think it could have benefits, at the very least forcing a rethink/clarification of the pool lifetimes). I'm not a big fan of the `parent_pool` field either (I try to avoid "backlinks" to the owner structure whenever possible), but at least I like it better than the apparently-circular ownership. I can go either way, no strong feelings here, just talking shop to share my thought process.
```c
        wargs = &swargs[0];

    } else {
        wargs = ngx_pcalloc(instance->pool,
```
It would be great to keep both of these cases covered via the `ngx_wasm_debug` module, with a dummy host function interface perhaps.
I took a look but I wasn't able to find my way right away... I'm assuming the idea is to add a new `ngx_command_t` from which to call `ngx_wavm_instance_call_func` (or `funcref`?) on a Wasm function with 11 args and 11 rets.

- How do I get an `ngx_wavm_instance_t *` from the debug module?
- And how do I get an `ngx_wavm_func_t *` (or `ngx_wavm_funcref_t *`?) For this, I think I can do `ngx_wavm_module_func_lookup` by name if I have an `ngx_wavm_module_t`; and to get the module I can do `ngx_wavm_module_lookup`. But then the question becomes: how do I get an `ngx_wavm_t *` from the debug module?

I'm assuming I'll need to add a .wat file in the test that declares the 11-arg and 11-ret function (since neither Rust nor Go does multi-ret functions); but that part I think I can figure out.
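For reference, a minimal sketch of what such a .wat module could look like (multi-value returns; the export name here is made up for illustration): a function with 11 i32 params and 11 i32 results that just returns its arguments.

```wat
(module
  ;; 11 i32 parameters, 11 i32 results (needs multi-value support,
  ;; which Wasmtime enables by default).
  (func (export "echo11")
        (param i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32)
        (result i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32)
    local.get 0
    local.get 1
    local.get 2
    local.get 3
    local.get 4
    local.get 5
    local.get 6
    local.get 7
    local.get 8
    local.get 9
    local.get 10))
```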
Both the instance creation and the dummy host function interface will have to be coded into the debug module.
I managed to declare the host function interface and write the test case, but I tried to access the `wcf->vm` from the `ngx_wasm_core` module to link the module, to avoid having to recreate the whole VM setup, and I think I ended up lost in module-initialization-ordering shenanigans. I think I might have sunk more time trying to get this test case running than fixing the memory issue took. If the lack of coverage is a blocker, I'm inclined to drop the `perf(wasmtime)` commit from this PR to get the `fix(proxy-wasm)` one merged.
We will do it later then.

I think the 2 fixes are the same one; they don't really make a lot of sense individually, since on their own they don't fix the memory growth issue.
Force-pushed from 7fb5bd2 to 58f8960.
We were observing linear memory growth when running requests in low-isolation modes. The root cause comes from the fact that small allocations had instance-long lifetimes (including, for example, the arrays for `args` and `rets` for each Wasm call, causing the observable linear growth when running wrk). Turns out that `ngx_pfree`, which was being used in the code to free `args` and `rets`, does not free small allocations. It only frees large allocations; small allocations are only freed when the pool itself is destroyed.

The solution is to use a request-lived pool for `pwctx`, so we don't hoard small request-specific allocations over the lifetime of an instance. We also change the setup of the `pwexec` pool so that it uses the `pwctx` pool under all isolation modes, like it does for logs. Wasm isolation mode should pertain only to the Wasm-accessible store, not to the temporary C-side data structures, which we now free on each request.
Force-pushed from 58f8960 to 11ed65c.
Perform adjustment of arguments and return values in the stack for the common case of functions with fewer than 10 arguments and/or 10 returns.
Force-pushed from 11ed65c to b10014c.
This Valgrind report looks legitimate:
From what I could tell, it seems to happen because when calling an FFI function from I think before the lack of a
Patch with fix in #421. I will merge tomorrow, but if you have more to squeeze in, feel free to apply the patch here; it will be a little tricky rebasing on my proxy_wasm.c refactor, which also fixes some lingering memory issues with instances. Most of the file has changed, but it is almost ready as well.
Such as the fake request passed to the FFI via the `init_worker_by_lua` fake request.
Awesome, thanks! I've cherry-picked your commit. No further changes on my part unless you have further comments, so if you think it's ready, feel free to merge!
Merged!
We were observing linear memory growth when running requests in low-isolation modes.

The root cause comes from the fact that small allocations had instance-long lifetimes (including, for example, the arrays for `args` and `rets` for each Wasm call, causing the observable linear growth when running `wrk`).

Turns out that `ngx_pfree`, which was being used in the code to free `args` and `rets`, does not free small allocations. It only frees large allocations; small allocations are only freed when the pool itself is destroyed.

I observed an initial improvement by moving `args` and `rets` allocations to the stack. I added this as a `perf()` commit here since I think this is still a worthwhile change, given these are usually very small arrays.

The general solution, though, is to use a request-lived pool for `pwctx`, so we don't hoard small request-specific allocations over the lifetime of an instance.

A third commit changes the setup of the `pwexec` pool so that it uses the `pwctx` pool under all isolation modes, like it does for logs. Wasm isolation mode should pertain only to the Wasm-accessible store, not to the temporary C-side data structures, which we now free on each request.

Given our new understanding of `ngx_pfree`, I think there are probably more opportunities in the codebase for replacing known-small data which is deterministically freed either with a lighter-weight `ngx_alloc`/`ngx_free` pair, or even with stack allocations, but I wanted to keep this PR straight-to-the-point and not go hunting for more changes.

Before and after memory graphs: