Resolved: Aspell memleak; was: Memory fragmentation #1373
-
Gnuplot for raw data: Scroll down for the raw data, then scroll some more for the graphs.
-
I cannot understand how significant memory fragmentation can be created, as for each sentence the data structures are mostly of the same size, so free blocks can just be reused instead of splitting bigger free blocks. Some data structures are of "random" size and may be reallocated frequently while tokenizing/parsing (like the tokenizer alternatives and wordgraph arrays, or the link arrays), but these are relatively small. Is it possible to add a histogram of allocated sizes?

Regarding virtual memory size, there is a potential problem in the pool allocator when it is requested to allocate power-of-2 block sizes:

    sent->Disjunct_pool = pool_new(__func__, "Disjunct",
                                   /*num_elements*/2048, sizeof(Disjunct),
                                   /*zero_out*/false, /*align*/false, /*exact*/false);
    sent->Connector_pool = pool_new(__func__, "Connector",
                                    /*num_elements*/8192, sizeof(Connector),
                                    /*zero_out*/true, /*align*/false, /*exact*/false);

If malloc adds its own bookkeeping info to the allocated block, and itself allocates power-of-2 blocks, then it would need to allocate blocks that are double the requested size, most of which goes unused (a quick way to check this is sketched at the end of this comment). It may be interesting to change …

In any case, the current memory allocation can be improved in many places to prevent many small allocations that grow on each added element, e.g. in the tokenizer. This would also increase the total speed by a few percent.
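Here is that check: a minimal sketch, assuming glibc (malloc_usable_size() is a glibc extension, and the request sizes here are illustrative, not the library's actual pool-block sizes). It prints the slack malloc adds on top of power-of-2 requests:

    #include <stdio.h>
    #include <stdlib.h>
    #include <malloc.h>   /* malloc_usable_size() -- glibc extension */

    int main(void)
    {
        /* Request power-of-2 sizes, as the pool allocator may do, and
         * print how much usable space malloc actually hands back. */
        for (size_t req = (size_t)1 << 10; req <= (size_t)1 << 24; req <<= 1)
        {
            void *p = malloc(req);
            if (p == NULL) return 1;
            size_t usable = malloc_usable_size(p);
            printf("requested %10zu  usable %10zu  slack %6zu\n",
                   req, usable, usable - req);
            free(p);
        }
        return 0;
    }

If the slack column stays small (glibc rounds requests to a small multiple plus a header, rather than to the next power of 2), the double-size scenario does not apply to that allocator; other allocators may behave differently.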
-
FYI, a blog entry describing glibc memory fragmentation and the reasons for it: https://blog.arkey.fr/drafts/2021/01/22/native-memory-fragmentation-with-glibc/ It examines two alternatives, tcmalloc and jemalloc. Some key takeaways, quoting from the blog:

"An important difference in terms of thread management."

"There is also the problem that since jemalloc spins up new caches to accommodate new thread ids, having a sudden spike of threads will leave you with (mostly) empty caches in the subsequent calm phase. As a result, I would recommend tcmalloc in the general case, and reserve jemalloc for very specific usages (low variation on the number of threads during the lifetime of the application)."
-
It's the freakin' spell-guesser. Setting … This can't be our fault, at least not directly. The instrumented debug code, which overloads malloc and free, does not find any problems in …
-
I disabled aspell by default in #1376 -- I consider this issue closed. I'm happy that the resolution was this easy.
-
Issue #1366 caused me to take a good look at memory fragmentation in link-grammar. Basically, when running for a long time, RAM usage will slowly grow. This is not a memory leak (as far as I can tell); this is memory fragmentation. Here's a report of what I've found.

Test case is the tests/multi-thread binary, altered to run 50000 iterations.

There's no conventional memleak: valgrind --leak-check=full reports no leakage, i.e. all alloc'ed memory is freed at the end. It could still happen that some mempool grows without bound; however, this does not seem to happen. The size of alloc'ed memory stays constant. (Actually, it seems to shrink, slightly!??)
Measurement: redefine malloc, realloc, strdup and free, and count who is mallocing and freeing. This is done in malloc-dbg2.c. It is non-trivial to get this to compile, so the malloc-debug-installed branch in linas/link-grammar contains the needed hacks to get multi-thread to compile and run. The printout reports which files are mallocing and freeing, and how much; a sketch of the interposition idea is shown below.
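This is not the actual malloc-dbg2.c (which also wraps realloc and strdup); it is just a minimal sketch of the interposition technique, assuming glibc/Linux (dlsym() with RTLD_NEXT, loaded via LD_PRELOAD). It counts calls and also keeps the allocated-sizes histogram suggested earlier in the thread:

    /* Build (older glibc needs -ldl):  gcc -shared -fPIC -o mcount.so mcount.c
     * Run:  LD_PRELOAD=./mcount.so tests/multi-thread
     * Caveat: a production interposer must guard against recursion,
     * since dlsym() itself may allocate; this is only a sketch. */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>
    #include <stddef.h>
    #include <stdatomic.h>

    static void *(*real_malloc)(size_t);
    static void (*real_free)(void *);
    static atomic_ullong n_malloc, n_free;
    static atomic_ullong hist[48];  /* bucket b: requests of size in [2^b, 2^(b+1)) */

    void *malloc(size_t size)
    {
        if (!real_malloc)
            real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
        atomic_fetch_add(&n_malloc, 1);
        unsigned b = 0;
        for (size_t s = size; s > 1 && b < 47; s >>= 1) b++;  /* ~floor(log2(size)) */
        atomic_fetch_add(&hist[b], 1);
        return real_malloc(size);
    }

    void free(void *ptr)
    {
        if (!real_free)
            real_free = (void (*)(void *))dlsym(RTLD_NEXT, "free");
        if (ptr) atomic_fetch_add(&n_free, 1);
        real_free(ptr);
    }

    __attribute__((destructor))
    static void report(void)
    {
        fprintf(stderr, "mallocs=%llu frees=%llu\n",
                (unsigned long long)n_malloc, (unsigned long long)n_free);
        for (unsigned b = 0; b < 48; b++)
            if (hist[b])
                fprintf(stderr, "  [2^%u, 2^%u): %llu\n",
                        b, b + 1, (unsigned long long)hist[b]);
    }

A per-file report like the one described above additionally needs the call site (e.g. via a macro that passes __FILE__ to a recording function), which is where the "needed hacks" in the branch come in.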
Results: RSS as reported by top starts at about 200MB and then grows to 4GB by the end. The average size of an alloc is slowly shrinking, probably because reading the dict uses larger allocs (e.g. the entire file) while parsing uses smaller allocs. I should probably restart counting after dict open. Hmmm.

The fragmentation rate seems to be 0.1% -- we have to malloc & free almost 1TB before 1GB of RSS is lost.

While measuring this, I reorganized the code base so that most mallocs and the matching frees happen in the same file. Thus the memory report prints results per file.