Resolved: Aspell memleak; was: Memory fragmentation #1373
-
Gnuplot for raw data: Scroll down for the raw data, then scroll some more for the graphs.
-
I cannot understand how significant memory fragmentation can be created, as for each sentence the data structures are mostly of the same size, so free blocks can just be reused instead of splitting bigger free blocks. Some data structures are of "random" size and may be reallocated frequently while tokenizing/parsing (like the tokenizer alternatives and wordgraph arrays, or the link arrays), but these are relatively small. Is it possible to add a histogram of allocated sizes?

Regarding virtual memory size, there is a potential problem in the pool allocator when it is requested to allocate power-of-2 block sizes:

    sent->Disjunct_pool = pool_new(__func__, "Disjunct",
                                   /*num_elements*/2048, sizeof(Disjunct),
                                   /*zero_out*/false, /*align*/false, /*exact*/false);
    sent->Connector_pool = pool_new(__func__, "Connector",
                                    /*num_elements*/8192, sizeof(Connector),
                                    /*zero_out*/true, /*align*/false, /*exact*/false);

If malloc adds its own bookkeeping info to the allocated block, and itself allocates power-of-2 blocks, then it would need to allocate blocks that are double the requested size, most of which goes unused (a quick way to check this is sketched at the end of this comment). It may be interesting to change …

In any case, the current memory allocation can be improved in many places to prevent many small allocations that grow on each added element, e.g. in the tokenizer. This would also increase the total speed by a few percent.
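Here is that check: a minimal sketch, assuming glibc (malloc_usable_size() is a glibc extension, and the request sizes here are illustrative, not the library's actual pool-block sizes). It prints the slack malloc adds on top of power-of-2 requests:

    #include <stdio.h>
    #include <stdlib.h>
    #include <malloc.h>   /* malloc_usable_size() -- glibc extension */

    int main(void)
    {
        /* Request power-of-2 sizes, as the pool allocator may do, and
         * print how much usable space malloc actually hands back. */
        for (size_t req = (size_t)1 << 10; req <= (size_t)1 << 24; req <<= 1)
        {
            void *p = malloc(req);
            if (p == NULL) return 1;
            size_t usable = malloc_usable_size(p);
            printf("requested %10zu  usable %10zu  slack %6zu\n",
                   req, usable, usable - req);
            free(p);
        }
        return 0;
    }

If the slack column stays small (glibc rounds requests to a small multiple plus a header, rather than to the next power of 2), the double-size scenario does not apply to that allocator; other allocators may behave differently.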
-
FYI, a blog entry describing glibc memory fragmentation and the reasons for it: https://blog.arkey.fr/drafts/2021/01/22/native-memory-fragmentation-with-glibc/ It examines two alternatives, tcmalloc and jemalloc. Some key takeaways, quoting from the blog:

"An important difference in terms of thread management."

"There is also the problem that since jemalloc spins up new caches to accommodate new thread ids, having a sudden spike of threads will leave you with (mostly) empty caches in the subsequent calm phase. As a result, I would recommend tcmalloc in the general case, and reserve jemalloc for very specific usages (low variation on the number of threads during the lifetime of the application)."
-
It's the freakin' spell-guesser. Setting … This can't be our fault, at least not directly. The instrumented debug code, which overloads malloc and free, does not find any problems in …
-
I disabled aspell by default in #1376 -- I consider this issue closed. I'm happy that the resolution was this easy.
-
Issue #1366 caused me to take a good look at memory fragmentation in link-grammar. Basically, when running for a long time, RAM usage will slowly grow. This is not a memory leak (as far as I can tell); this is memory fragmentation. Here's a report of what I've found.

Test case is the tests/multi-thread binary, altered to run 50000 iterations.

There's no conventional memleak: valgrind --leak-check=full reports no leakage, i.e. all alloc'ed memory is freed at the end. It could still happen that some mempool grows without bound; however, this does not seem to happen. The size of alloc'ed memory stays constant. (Actually, it seems to shrink, slightly!??)
Measurement: redefine malloc, realloc, strdup and free, and count who is mallocing and freeing. This is done in malloc-dbg2.c. It is non-trivial to get this to compile, so the malloc-debug-installed branch in linas/link-grammar contains the needed hacks to get multi-thread to compile and run. The printout reports which files are mallocing and freeing, and how much; a sketch of the interposition idea is shown below.
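This is not the actual malloc-dbg2.c (which also wraps realloc and strdup); it is just a minimal sketch of the interposition technique, assuming glibc/Linux (dlsym() with RTLD_NEXT, loaded via LD_PRELOAD). It counts calls and also keeps the allocated-sizes histogram suggested earlier in the thread:

    /* Build (older glibc needs -ldl):  gcc -shared -fPIC -o mcount.so mcount.c
     * Run:  LD_PRELOAD=./mcount.so tests/multi-thread
     * Caveat: a production interposer must guard against recursion,
     * since dlsym() itself may allocate; this is only a sketch. */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>
    #include <stddef.h>
    #include <stdatomic.h>

    static void *(*real_malloc)(size_t);
    static void (*real_free)(void *);
    static atomic_ullong n_malloc, n_free;
    static atomic_ullong hist[48];  /* bucket b: requests of size in [2^b, 2^(b+1)) */

    void *malloc(size_t size)
    {
        if (!real_malloc)
            real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
        atomic_fetch_add(&n_malloc, 1);
        unsigned b = 0;
        for (size_t s = size; s > 1 && b < 47; s >>= 1) b++;  /* ~floor(log2(size)) */
        atomic_fetch_add(&hist[b], 1);
        return real_malloc(size);
    }

    void free(void *ptr)
    {
        if (!real_free)
            real_free = (void (*)(void *))dlsym(RTLD_NEXT, "free");
        if (ptr) atomic_fetch_add(&n_free, 1);
        real_free(ptr);
    }

    __attribute__((destructor))
    static void report(void)
    {
        fprintf(stderr, "mallocs=%llu frees=%llu\n",
                (unsigned long long)n_malloc, (unsigned long long)n_free);
        for (unsigned b = 0; b < 48; b++)
            if (hist[b])
                fprintf(stderr, "  [2^%u, 2^%u): %llu\n",
                        b, b + 1, (unsigned long long)hist[b]);
    }

A per-file report like the one described above additionally needs the call site (e.g. via a macro that passes __FILE__ to a recording function), which is where the "needed hacks" in the branch come in.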
Results: RSS as reported by top starts at about 200MB and then grows to 4GB by the end. The average size of an alloc is slowly shrinking, probably because reading the dict uses larger allocs (e.g. the entire file) while parsing uses smaller allocs. I should probably restart counting after dict open. Hmmm.

The fragmentation rate seems to be 0.1% -- we have to malloc & free almost 1TB before 1GB of RSS is lost.

While measuring this, I reorganized the code base so that most mallocs and the matching frees happen in the same file. Thus the memory report prints results per file.