Incremental saving approach #2
**Issue:** There is no direct functionality to update a pickle file in place. To update the data, the whole file must first be loaded into a variable and then updated with the new data, which reproduces the same RAM-overshoot problem.

**Alternative solution:** use a (key, value) database. New tests are running which worked successfully on small portions of the dataset (745 million triples). However, reading the whole dataset is very slow: a current test run on the full dataset has been going for 3 days and has still not read 5% of the data.
Usage of
Issues with
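The issue does not name the (key, value) database being tested, so the following is only a minimal sketch of the idea using Python's built-in `shelve` module as a stand-in: each triple is written to an on-disk store one record at a time, instead of building the whole dictionary in RAM.

```python
import shelve

def store_triples(triples, path):
    """Append (subject, predicate, object) triples to an on-disk
    (key, value) store without holding the full dataset in RAM."""
    with shelve.open(path) as db:
        start = len(db)                   # continue numbering if the store is reopened
        for i, triple in enumerate(triples, start):
            db[str(i)] = triple           # one on-disk record per triple

store_triples([("s1", "p1", "o1"), ("s2", "p2", "o2")], "triples_demo")
```

Any real key-value backend (the one under test here is not named) would follow the same pattern; the trade-off observed above is that per-record disk access makes full-dataset reads much slower than a single bulk load.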
From the comment on issue #1, this approach will use incremental saving with pickle files. It will build a dictionary in main memory up to a threshold number of triples, e.g., 10 million (1 chunk), then dump it all to a `pickle` file.