wasm: out of memory with lots of HeapIdle #3237

Open
anuraaga opened this issue Oct 18, 2022 · 20 comments
Labels
wasm WebAssembly

Comments

@anuraaga
Contributor

I am debugging an OOM panic with a wasm binary and want to check whether this is just a fact of life. The binary handles HTTP requests in Envoy:

https://github.com/corazawaf/coraza-proxy-wasm

After a while, it crashes with the backtrace below, from a C++ function calling malloc that I have linked into the binary (recall the fix that allows malloc to not be corrupted for polyglot use):

Proxy-Wasm plugin in-VM backtrace:
  0:  0xaf0a6 - runtime.runtimePanic
  1:  0xac0ac - runtime.alloc
  2:  0xc02a3 - malloc
  3:  0x1db10 - _Znwm
  4:  0x9967b - _ZN3re24Prog14MarkSuccessorsEPNS_11SparseArrayIiEES3_PNSt3__26vectorINS5_IiNS4_9allocatorIiEEEENS6_IS8_EEEEPNS_10SparseSetTIvEEPS8_
  5:  0x98bab - _ZN3re24Prog7FlattenEv
  6:  0xa141d - _ZN3re28Compiler6FinishEPNS_6RegexpE
  7:  0xa06bf - _ZN3re28Compiler7CompileEPNS_6RegexpEbx
  8:  0xa1511 - _ZN3re26Regexp20CompileToReverseProgEx
  9:  0xa8128 - _ZNSt3__217__call_once_proxyINS_5tupleIJOZNK3re23RE211ReverseProgEvE3$_1OPKS3_EEEEEvPv

I print memstats after each request, and they look like the below. There is no upward trend in the number of live objects, so I don't think we have a memory leak. HeapIdle remains high throughout the program - is this because the memory gets fragmented, and something we can't improve? Or is there by any chance a possible bug? I'm not too sure why it panics even though there's still plenty of headroom from this 512MB heap up to the theoretical 4GB max - this may be an Envoy quirk. But either way I'd expect 512MB to be enough, especially given the amount of idle heap. So I'm wondering if there is a way to debug what could cause this and perhaps reduce heap fragmentation somehow.
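For reference, the stats are collected with a plain runtime.ReadMemStats call after each request, roughly like the minimal sketch below (the real plugin logs through the proxy-wasm host rather than fmt, and the exact wording may differ slightly):

```go
package main

import (
	"fmt"
	"runtime"
)

// logMemStats prints the same fields as the log lines below; "Live" is
// derived as Mallocs - Frees.
func logMemStats() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("Sys: %d, HeapSys: %d, HeapIdle: %d, HeapInuse: %d, HeapReleased: %d, "+
		"TotalAlloc: %d, Mallocs: %d, Frees: %d, Live: %d, GCSys: %d\n",
		m.Sys, m.HeapSys, m.HeapIdle, m.HeapInuse, m.HeapReleased,
		m.TotalAlloc, m.Mallocs, m.Frees, m.Mallocs-m.Frees, m.GCSys)
}
```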

Thanks!

[2022-10-18 03:41:48.211576][18][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 305275600, HeapInuse: 235409808, HeapReleased: 0, TotalAlloc: 6501802146, Mallocs: 29364142, Frees: 29072016, Live: 292126, GCSys: 8448210
[2022-10-18 03:41:50.277445][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 310489168, HeapInuse: 230196240, HeapReleased: 0, TotalAlloc: 6417949203, Mallocs: 28800064, Frees: 28502873, Live: 297191, GCSys: 8448210
[2022-10-18 03:41:50.357212][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 303646512, HeapInuse: 237038896, HeapReleased: 0, TotalAlloc: 6424564660, Mallocs: 28829160, Frees: 28502873, Live: 326287, GCSys: 8448210
[2022-10-18 03:41:50.436784][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 296803760, HeapInuse: 243881648, HeapReleased: 0, TotalAlloc: 6431180187, Mallocs: 28858258, Frees: 28502873, Live: 355385, GCSys: 8448210
[2022-10-18 03:41:50.520306][18][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 298149872, HeapInuse: 242535536, HeapReleased: 0, TotalAlloc: 6508661253, Mallocs: 29398622, Frees: 29072016, Live: 326606, GCSys: 8448210
[2022-10-18 03:41:50.599077][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 289961008, HeapInuse: 250724400, HeapReleased: 0, TotalAlloc: 6437795718, Mallocs: 28887356, Frees: 28502873, Live: 384483, GCSys: 8448210
[2022-10-18 03:41:50.678640][18][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 291307120, HeapInuse: 249378288, HeapReleased: 0, TotalAlloc: 6515276784, Mallocs: 29427720, Frees: 29072016, Live: 355704, GCSys: 8448210
[2022-10-18 03:41:50.768192][18][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 284167968, HeapInuse: 256517440, HeapReleased: 0, TotalAlloc: 6522151153, Mallocs: 29462448, Frees: 29072016, Live: 390432, GCSys: 8448210
[2022-10-18 03:41:50.846150][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 283118112, HeapInuse: 257567296, HeapReleased: 0, TotalAlloc: 6444411385, Mallocs: 28916455, Frees: 28502873, Live: 413582, GCSys: 8448210
[2022-10-18 03:41:50.929658][18][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 277325312, HeapInuse: 263360096, HeapReleased: 0, TotalAlloc: 6528766612, Mallocs: 29491544, Frees: 29072016, Live: 419528, GCSys: 8448210
[2022-10-18 03:41:51.012431][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 275936928, HeapInuse: 264748480, HeapReleased: 0, TotalAlloc: 6451328804, Mallocs: 28951026, Frees: 28502873, Live: 448153, GCSys: 8448210
[2022-10-18 03:41:51.090576][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 269094176, HeapInuse: 271591232, HeapReleased: 0, TotalAlloc: 6457944335, Mallocs: 28980124, Frees: 28502873, Live: 477251, GCSys: 8448210
[2022-10-18 03:41:51.168760][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 262251520, HeapInuse: 278433888, HeapReleased: 0, TotalAlloc: 6464559788, Mallocs: 29009220, Frees: 28502873, Live: 506347, GCSys: 8448210
[2022-10-18 03:41:51.253787][18][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 270273440, HeapInuse: 270411968, HeapReleased: 0, TotalAlloc: 6535549993, Mallocs: 29526228, Frees: 29072016, Live: 454212, GCSys: 8448210
[2022-10-18 03:41:51.331706][18][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 263430768, HeapInuse: 277254640, HeapReleased: 0, TotalAlloc: 6542165458, Mallocs: 29555325, Frees: 29072016, Live: 483309, GCSys: 8448210
[2022-10-18 03:41:51.409618][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 255408768, HeapInuse: 285276640, HeapReleased: 0, TotalAlloc: 6471175319, Mallocs: 29038318, Frees: 28502873, Live: 535445, GCSys: 8448210
[2022-10-18 03:41:51.493998][18][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 256056576, HeapInuse: 284628832, HeapReleased: 0, TotalAlloc: 6549271681, Mallocs: 29589918, Frees: 29072016, Live: 517902, GCSys: 8448210
[2022-10-18 03:41:51.573293][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 248566032, HeapInuse: 292119376, HeapReleased: 0, TotalAlloc: 6477790844, Mallocs: 29067415, Frees: 28502873, Live: 564542, GCSys: 8448210
[2022-10-18 03:41:51.653381][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 241723136, HeapInuse: 298962272, HeapReleased: 0, TotalAlloc: 6484406515, Mallocs: 29096514, Frees: 28502873, Live: 593641, GCSys: 8448210
[2022-10-18 03:41:51.732819][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 234880464, HeapInuse: 305804944, HeapReleased: 0, TotalAlloc: 6491021978, Mallocs: 29125611, Frees: 28502873, Live: 622738, GCSys: 8448210
[2022-10-18 03:41:51.817217][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 227416400, HeapInuse: 313269008, HeapReleased: 0, TotalAlloc: 6498218488, Mallocs: 29160311, Frees: 28502873, Live: 657438, GCSys: 8448210
[2022-10-18 03:41:51.898968][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 220573664, HeapInuse: 320111744, HeapReleased: 0, TotalAlloc: 6504834011, Mallocs: 29189408, Frees: 28502873, Live: 686535, GCSys: 8448210
[2022-10-18 03:41:51.979107][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 213730912, HeapInuse: 326954496, HeapReleased: 0, TotalAlloc: 6511449540, Mallocs: 29218506, Frees: 28502873, Live: 715633, GCSys: 8448210
[2022-10-18 03:41:52.063271][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 206663520, HeapInuse: 334021888, HeapReleased: 0, TotalAlloc: 6518252118, Mallocs: 29252702, Frees: 28502873, Live: 749829, GCSys: 8448210
[2022-10-18 03:41:52.142527][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 199820624, HeapInuse: 340864784, HeapReleased: 0, TotalAlloc: 6524867789, Mallocs: 29281801, Frees: 28502873, Live: 778928, GCSys: 8448210
[2022-10-18 03:41:52.222177][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 192977888, HeapInuse: 347707520, HeapReleased: 0, TotalAlloc: 6531483310, Mallocs: 29310898, Frees: 28502873, Live: 808025, GCSys: 8448210
[2022-10-18 03:41:52.305710][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 185846912, HeapInuse: 354838496, HeapReleased: 0, TotalAlloc: 6538353295, Mallocs: 29344928, Frees: 28502873, Live: 842055, GCSys: 8448210
[2022-10-18 03:41:52.392574][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 179004160, HeapInuse: 361681248, HeapReleased: 0, TotalAlloc: 6544968822, Mallocs: 29374026, Frees: 28502873, Live: 871153, GCSys: 8448210
[2022-10-18 03:41:52.472070][18][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 249213904, HeapInuse: 291471504, HeapReleased: 0, TotalAlloc: 6555887144, Mallocs: 29619015, Frees: 29072016, Live: 546999, GCSys: 8448210
[2022-10-18 03:41:54.262522][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 172161408, HeapInuse: 368524000, HeapReleased: 0, TotalAlloc: 6551584353, Mallocs: 29403124, Frees: 28502873, Live: 900251, GCSys: 8448210
[2022-10-18 03:41:56.306579][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 308585056, HeapInuse: 232100352, HeapReleased: 0, TotalAlloc: 6558266266, Mallocs: 29433374, Frees: 29134881, Live: 298493, GCSys: 8448210
[2022-10-18 03:41:56.388781][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 301742176, HeapInuse: 238943232, HeapReleased: 0, TotalAlloc: 6564881929, Mallocs: 29462472, Frees: 29134881, Live: 327591, GCSys: 8448210
[2022-10-18 03:41:56.470830][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 294899536, HeapInuse: 245785872, HeapReleased: 0, TotalAlloc: 6571497382, Mallocs: 29491567, Frees: 29134881, Live: 356686, GCSys: 8448210
[2022-10-18 03:41:56.550087][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 288056784, HeapInuse: 252628624, HeapReleased: 0, TotalAlloc: 6578112911, Mallocs: 29520665, Frees: 29134881, Live: 385784, GCSys: 8448210
[2022-10-18 03:41:56.628840][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 281213888, HeapInuse: 259471520, HeapReleased: 0, TotalAlloc: 6584728580, Mallocs: 29549764, Frees: 29134881, Live: 414883, GCSys: 8448210
[2022-10-18 03:41:56.707301][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 274371216, HeapInuse: 266314192, HeapReleased: 0, TotalAlloc: 6591344043, Mallocs: 29578861, Frees: 29134881, Live: 443980, GCSys: 8448210
[2022-10-18 03:41:56.785671][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 267528496, HeapInuse: 273156912, HeapReleased: 0, TotalAlloc: 6597959562, Mallocs: 29607957, Frees: 29134881, Live: 473076, GCSys: 8448210
[2022-10-18 03:41:57.041439][17][debug][wasm] [source/extensions/common/wasm/context.cc:1167] wasm log coraza-filter my_vm_id: Sys: 549133632, HeapSys: 540685408, HeapIdle: 260685744, HeapInuse: 279999664, HeapReleased: 0, TotalAlloc: 6604575091, Mallocs: 29637055, Frees: 29134881, Live: 502174, GCSys: 8448210
@dgryski
Member

dgryski commented Oct 18, 2022

I'm not surprised the C heap and the TinyGo garbage collector aren't playing nicely together. Getting some more debug info is definitely the way to go. If the crash is easily reproducible, we should think about what other info we might want to get out of the TinyGo allocator that would make this easier to hunt down.

@anuraaga
Contributor Author

anuraaga commented Oct 19, 2022

Thanks @dgryski - just to clarify, malloc/free for C are still serviced by TinyGo as []byte.

https://github.com/tinygo-org/tinygo/blob/release/src/runtime/arch_tinygowasm.go#L97

Would we expect this to also not play nicely together? I suppose it could just as well have been Go code - but perhaps a pool of references held in a map is a pattern we wouldn't normally see in Go and could be triggering some bad behavior.
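For anyone not familiar with that file, it boils down to roughly the following (a simplified sketch, not the exact runtime source): each C allocation is a Go []byte kept alive in a map keyed by its address, and free just deletes the map entry so the GC can reclaim it.

```go
// Simplified sketch of TinyGo's malloc/free shims for linked-in C code;
// see src/runtime/arch_tinygowasm.go for the real implementation.
package runtime

import "unsafe"

var allocs = make(map[uintptr][]byte)

//export malloc
func libc_malloc(size uintptr) unsafe.Pointer {
	if size == 0 {
		return nil
	}
	buf := make([]byte, size) // the C allocation lives on the Go GC heap
	ptr := unsafe.Pointer(&buf[0])
	allocs[uintptr(ptr)] = buf // hold a reference so the GC keeps it alive
	return ptr
}

//export free
func libc_free(ptr unsafe.Pointer) {
	if ptr == nil {
		return
	}
	delete(allocs, uintptr(ptr)) // drop the reference; the GC reclaims the []byte later
}
```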

Happy to run with custom builds to see what we can find - I'll also poke around to see if I can find something to instrument myself. If anyone is interested, it's relatively simple to run our test with tinygo, go, and docker installed (they're invoked from PATH, so a custom-built TinyGo would be picked up as-is):

git clone https://github.com/corazawaf/coraza-proxy-wasm.git
cd coraza-proxy-wasm
MEMSTATS=true go run mage.go build
MEMSTATS=true go run mage.go ftw

It'll just print those memstat lines. It's obvious when the GC is pausing as most requests take milliseconds and the paused requests take about a second, with live objects not really trending up. The pausing becomes more frequent until finally it grows the heap - then the pauses are less frequent until they become frequent again. Feels like heap fragmentation but not sure without more investigation.

The build step also takes an INITIALPAGES= parameter, and it can be interesting to start with a larger heap.

@aykevl
Member

aykevl commented Oct 19, 2022

It's hard to say what's going on without more information.
You could try modifying this value here: https://github.com/tinygo-org/tinygo/pull/2884/files#diff-60c0f1b66b810ecb7830cdd34bc74156d4ed240b0da9c7c3488c88a2e0b85dc8R294
Other than that, you could try printing some more details when the heap is out of memory. Does growHeap() return false? What size does growHeap request, and is it a reasonable value? In any case, it should grow the heap to well beyond 512MB if needed.
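For anyone picking this up, the instrumentation could be as small as a couple of printlns in the out-of-memory path, something like the hypothetical sketch below (the exact shape of the slow path in gc_conservative.go may differ):

```go
// Hypothetical debug prints in the allocation slow path of the conservative
// GC, just before it gives up; "size" is the requested allocation size.
if !growHeap() {
	println("alloc: growHeap failed; heap size =", heapEnd-heapStart, "requested =", size)
	runtimePanic("out of memory")
}
```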

...that said, the GC is not at all optimized, and a 1-second pause time doesn't surprise me at all with such a large heap.

@anuraaga
Contributor Author

I suspect Envoy may be preventing growHeap from working past 512MB, but I didn't completely dig in. I'm more concerned that the memory usage keeps going up even though there is plenty of idle heap that should be usable, and I wonder if there are known allocation patterns that could cause that.

For improving performance, I tried an approach replacing the current map-based malloc with a large buffer, e.g. 128MB, that is used as the backing heap for mimalloc, letting mimalloc handle malloc/free. I figured this would dramatically reduce the number of Go objects ([]byte) and speed up the GC, but I didn't notice much speed difference. Looking at the code, I may be misunderstanding, but it seems like the GC has to go through mostly every word on the heap to see if it is pointer-like for marking, so I guess the number of []byte objects doesn't have much impact if the number of bytes on the heap is high anyway.
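That reading matches how conservative marking generally works: the cost scales with the number of heap bytes scanned rather than the number of objects, since every word is checked for being pointer-like. Schematically (an illustrative sketch, not the actual TinyGo code; markObject stands in for the real block-marking logic):

```go
// Illustrative conservative scan of a memory region: load every word and
// treat anything that falls within the heap bounds as a potential pointer.
func scanConservatively(start, end uintptr) {
	for addr := start; addr < end; addr += unsafe.Sizeof(uintptr(0)) {
		word := *(*uintptr)(unsafe.Pointer(addr))
		if word >= heapStart && word < heapEnd { // "pointer-like" check
			markObject(word) // hypothetical helper; the real GC marks the containing block
		}
	}
}
```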

So I went ahead and made the big change of having the Go GC use a fixed-size heap allocated by mimalloc, letting mimalloc own the wasm heap (I happened to use mimalloc as a self-contained malloc for easier tweaking, but I will also try enabling dlmalloc in TinyGo's wasi-libc and see if it performs similarly). Even with a 128MB fixed-size heap for Go, our benchmark runs to completion, and much faster: pauses are fewer and stay around ~100ms throughout, versus ~1s for a comparably sized heap with stock TinyGo. I suspect some work could allow multiple arenas so the heap need not be fixed-size, which I believe is roughly how the Go GC uses system-allocated buffers.

The changes are here
https://github.com/tinygo-org/tinygo/compare/dev...anuraaga:custommalloc?expand=1

Notably, growHeap is stubbed to return false, pending possible support for multiple arenas. And initHeap overrides heapStart and heapEnd to point into a buffer allocated with malloc.
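Boiled down, the branch amounts to something like the sketch below (simplified from the linked diff; the 128MB constant and the external mi_malloc declaration are illustrative):

```go
// Let the linked-in mimalloc own the wasm linear memory and carve a
// fixed-size arena out of it for the Go GC.

//export mi_malloc
func mi_malloc(size uintptr) unsafe.Pointer // provided by the linked-in mimalloc

const goHeapSize = 128 * 1024 * 1024 // illustrative fixed-size Go heap

func initHeap() {
	buf := mi_malloc(goHeapSize)
	heapStart = uintptr(buf)
	heapEnd = heapStart + goHeapSize
	// ...followed by the usual GC metadata setup
}

// Stubbed until multiple arenas are supported, so the Go heap never grows.
func growHeap() bool {
	return false
}
```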

I doubt there is interest in actually adding this GC to TinyGo, so I am going to keep looking into how it could be plugged in by the app. I tried gc=none and had some trouble related to marking stack objects:

https://github.com/corazawaf/coraza-proxy-wasm/compare/main...anuraaga:mimalloc?expand=1#diff-576e3949e80525db1398982ad4e707779ef9f5ba2452983f79c4b1569104ec59R9

I'm not too sure the linkname approach is even supposed to work, but while I could get the tinygo compilation to succeed with this code, the result was an invalid wasm file.

I'll look more into whether this can be made pluggable more cleanly, but if anyone happens to have any suggestions it would be great to hear them. One big issue I saw with gc=none is that it stubs out initHeap:

https://github.com/tinygo-org/tinygo/blob/release/src/runtime/gc_none.go#L38

I wonder if this and maybe all the other functions could instead be kept as stubs to allow external overriding - alloc, the most important function of all, is already a stub, so gc=none doesn't compile without some external code anyway.
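To make that concrete, the hope is that an application (or a helper library) could supply the missing pieces itself, along the lines of the purely hypothetical sketch below - whether go:linkname may target runtime.alloc like this, and what alloc's exact signature is, is precisely what needs verifying:

```go
package customgc

import "unsafe"

//export GC_malloc
func GC_malloc(size uintptr) unsafe.Pointer // from a linked-in collector such as bdwgc

// Hypothetical external implementation of runtime.alloc for gc=none builds.
// The layout parameter and the linkname target are assumptions, not confirmed API.
//
//go:linkname alloc runtime.alloc
func alloc(size uintptr, layout unsafe.Pointer) unsafe.Pointer {
	return GC_malloc(size)
}
```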

@deadprogram deadprogram added the wasm WebAssembly label Oct 20, 2022
@anuraaga
Contributor Author

anuraaga commented Oct 21, 2022

I think I had some confusion with go:extern and go:linkname in my hacking, but I was able to make the GC pluggable with this hopefully-not-too-invasive change:

#3245

This would be a huge unblocker for us (and maybe for TinyGo too, e.g. #3162 (review) :) ). Hopefully the change makes sense.

@dgryski
Member

dgryski commented Oct 23, 2022

@anuraaga I'd be curious to know the results of your testing with #3236 applied

@dgryski
Member

dgryski commented Feb 2, 2023

Did we get any further investigating this? Can we place the blame on the GC or elsewhere?

@anuraaga
Contributor Author

anuraaga commented Feb 2, 2023

Swapping in bdwgc removed the problem - not definitive, but I guess that most likely puts the blame on the GC. I haven't tried the new precise GC yet.

@gmlewis

gmlewis commented Aug 13, 2024

@anuraaga - I've got a TinyGo wasi program that is running out of memory on Spin Cloud, and I was wondering if you have an example of how you swapped out the garbage collector when compiling to wasm with TinyGo?

@anuraaga
Contributor Author

Hi @gmlewis - I wrote the nottinygc library to integrate them, but I found there are still too many cases of memory leaks to maintain it as a stable library. I described some of the issues I ran into in detail here:

tetratelabs/proxy-wasm-go-sdk#450 (comment)

You could still give the package a try just to see how it works for you. Many of the issues are solvable with, I think, a large amount of drive, so if anyone is interested, maybe they can resurrect the approach with the hints in that issue / codebase.

@dgryski
Member

dgryski commented Aug 13, 2024

@gmlewis I'm curious to know if #4400 makes any difference to your program.

@gmlewis

gmlewis commented Aug 13, 2024

@gmlewis I'm curious to know if #4400 makes any difference to your program.

@dgryski - I'm running on a Mac M2 Max and followed the BUILDING.md steps (with your 1-file change) but stopped at the ldd step since I can't find it...

Now, when I build my Spin app with: tinygo build -target=wasi -gc=leaking -no-debug -o main.wasm . I get:

error: could not find wasi-libc, perhaps you need to run `make wasi-libc`?

I'll try again on a Linux Mint 21.3 Cinnamon box to see if I can get it to work there. Thank you!

@dgryski
Member

dgryski commented Aug 13, 2024

If you're using the leaking gc, then neither my change nor using nottinygc will make a difference.

@gmlewis

gmlewis commented Aug 13, 2024

If you're using the leaking gc, then neither my change nor using nottinygc will make a difference.

Oh, sorry, that's the default that spin build uses. What should my command-line be?
Just tinygo build -target=wasi -no-debug -o main.wasm . ?

@dgryski
Member

dgryski commented Aug 13, 2024

@gmlewis Yes, that's correct. That will switch from the leaking collector to the default conservative one.

@gmlewis

gmlewis commented Aug 13, 2024

@dgryski - I built your version both on Linux and on Mac... it turns out that all I really needed to do was indeed follow the directions and type "make wasi-libc" before building tinygo... then both platforms were able to generate (different) main.wasm files.

I tried both main.wasm files (one from Linux, one from macOS) on Spin Cloud, and unfortunately both of them hung for 30 seconds and timed out with this change (but no panic, which was nice).

Previously, the wasm endpoint would respond in under one second.

@dgryski
Member

dgryski commented Aug 13, 2024

@gmlewis If you have a solid reproducer (hang vs. 1-second response), could you bisect the git commits and figure out which commit caused the regression? That would be a huge help. Is the program available somewhere, or is it proprietary?

@gmlewis

gmlewis commented Aug 13, 2024

@gmlewis If you have a solid reproducer (hang vs. 1-second response), could you bisect the git commits and figure out which commit caused the regression? That would be a huge help. Is the program available somewhere, or is it proprietary?

It's not open source, but I can try to bisect the git commits or alternatively try to make a simpler test case.

@dgryski
Member

dgryski commented Sep 24, 2024

Any luck getting a smaller reproducer you can share?

@gmlewis

gmlewis commented Sep 24, 2024

Any luck getting a smaller reproducer you can share?

No, sorry. I switched this wasm app to MoonBit and the problem went away.
