
How dictionary data and index files work

Erek Speed edited this page Dec 21, 2024 · 1 revision

Just notes about data files that I learned while cleaning up data.ts.

  • The data.dat file is just an alphabetical list of entries. It's read in as text, and entries are extracted using the string substring method.
  • The index file contains a single line for each head word, with a character offset for each of that word's entries in the dat file.
  • Words are found by doing a binary search in the index, getting the offset, and looking the offset up in the dat file (with deinflection).
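
The lookup described above can be sketched roughly as follows. This is only an illustration, not the code in data.ts: the index line format (`headword,offset[,offset...]`), the newline-terminated entries, the sample data, and the names `findIndexLine` and `lookup` are all assumptions made for the example. Deinflection is left out; presumably each deinflected candidate would go through the same lookup.

```typescript
// Assumed formats (hypothetical, for illustration only):
//   index: one line per head word, "<headword>,<offset>[,<offset>...]",
//          sorted by head word in plain string (code unit) order
//   dat:   one entry per line; each offset is the character position
//          where an entry starts

/** Binary-search the raw index text for the line belonging to `word`. */
function findIndexLine(index: string, word: string): string | null {
  let lo = 0; // always the start of a line
  let hi = index.length; // a line start, or the end of the text
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    // Snap back to the start of the line that contains `mid`.
    const start = index.lastIndexOf('\n', mid - 1) + 1;
    const nl = index.indexOf('\n', start);
    const end = nl === -1 ? index.length : nl;
    const line = index.substring(start, end);
    const headword = line.substring(0, line.indexOf(','));
    if (word === headword) return line;
    if (word < headword) {
      hi = start; // keep searching the lines before this one
    } else {
      lo = end + 1; // keep searching the lines after this one
    }
  }
  return null;
}

/** Return every dat entry whose head word equals `word`. */
function lookup(index: string, dat: string, word: string): string[] {
  const line = findIndexLine(index, word);
  if (line === null) return [];
  const offsets = line.split(',').slice(1).map(Number);
  return offsets.map((offset) => {
    const nl = dat.indexOf('\n', offset);
    return dat.substring(offset, nl === -1 ? dat.length : nl);
  });
}

// Tiny made-up dictionary; offsets are computed while building the text.
const entries = ['cat /a small feline/', 'dog /a canine/', 'doge /a meme dog/'];
let dat = '';
const indexLines: string[] = [];
for (const entry of entries) {
  const headword = entry.substring(0, entry.indexOf(' '));
  indexLines.push(`${headword},${dat.length}`);
  dat += entry + '\n';
}
const index = indexLines.join('\n');

console.log(lookup(index, dat, 'dog'));
console.log(lookup(index, dat, 'bird'));
```

Searching the raw text keeps the index as a single string, at the cost of the line-snapping bookkeeping inside the loop.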

Should consider just using a map for the index instead of the binary search, if the memory usage is the same.
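
For comparison, a map-based index might look something like the sketch below. It assumes the same hypothetical `headword,offset[,offset...]` line format as a stand-in for the real index file, and `buildIndexMap` is a made-up name. Whether memory usage actually stays comparable would need to be measured.

```typescript
/** Parse the index text once into a Map from head word to entry offsets. */
function buildIndexMap(index: string): Map<string, number[]> {
  const map = new Map<string, number[]>();
  for (const line of index.split('\n')) {
    const comma = line.indexOf(',');
    if (comma === -1) continue; // skip blank or malformed lines
    const headword = line.substring(0, comma);
    const offsets = line.substring(comma + 1).split(',').map(Number);
    map.set(headword, offsets);
  }
  return map;
}

// Made-up index text in the assumed format.
const indexMap = buildIndexMap('cat,0\ndog,21\ndoge,36\n');
console.log(indexMap.get('dog'));
console.log(indexMap.has('bird'));
```

Lookups become a single `get` call, but every head word and offset list is held as a separate object rather than inside one string.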
