Move query-independent tasks out of the postprocessing step #138

Willem3141 · 2024-05-20T20:35:28Z

How searches work: getResponsesFor()

preprocesses the query;
splits it into words;
iterates over words and looks them up individually;
postprocesses the results.
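The four steps above can be sketched roughly as follows. This is only an illustration of the flow, not Reykunyu's actual code: `sampleDictionary`, `preprocessQuery`, `splitIntoWords`, `lookUpWord` and `postprocessResults` are hypothetical names, and the trimming and lowercasing in the preprocessing step is an assumption.

```javascript
// Hypothetical stand-in for the dictionary; the real data is loaded from a JSON file.
const sampleDictionary = new Map([
  ['tsun', { word: 'tsun' }],
  ['fwew', { word: 'fwew' }],
]);

// Step 1: preprocess the query (assumed here: trim and lowercase).
function preprocessQuery(query) {
  return query.trim().toLowerCase();
}

// Step 2: split the query into words on whitespace.
function splitIntoWords(query) {
  return query.split(/\s+/).filter((word) => word.length > 0);
}

// Step 3: look up a single word; returns an array of matching entries.
function lookUpWord(word) {
  const entry = sampleDictionary.get(word);
  return entry ? [entry] : [];
}

// Step 4: postprocess the results. Each entry is copied so that
// query-specific data never ends up on the shared dictionary entry.
function postprocessResults(results, queryWord) {
  return results.map((entry) => ({ ...entry, matchedQuery: queryWord }));
}

// Sketch of the overall flow described above.
function getResponsesFor(query) {
  const words = splitIntoWords(preprocessQuery(query));
  return words.map((word) => ({
    query: word,
    results: postprocessResults(lookUpWord(word), word),
  }));
}
```

In this sketch, everything done in `postprocessResults` runs again on every query, which is why query-independent work placed there gets repeated needlessly.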

Postprocessing currently does two types of things:

things that are dependent on the specific query the user searched for, for example, the conjugation stuff;
things that are actually independent of the specific query.

The latter gets done in the postprocessing step because the design of Reykunyu prohibits us from editing the in-memory dictionary directly. The reason for this is that whenever the user edits a word in the editor, we save the change by serializing dictionary back to disk.

This is not a great design. Firstly, if we accidentally do edit the dictionary, then these changes ‘leak’ into the JSON file. (This type of bug has happened in the past.) Secondly, as mentioned before, because we cannot edit the dictionary, we have to do things in the postprocessing step that we could perfectly well do once for all words when Reykunyu starts. This includes things such as adding IPA for FN/RN, which is not very costly, but still conceptually annoying. It also includes searching for derived words and sentence search. These are very costly, so we do actually precompute them, but for that we separately maintain derivedWords and sentencesForWord, which is overly complex.
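To make the ‘leak’ concrete, here is a minimal sketch of how an accidental in-place edit ends up in the serialized output. The entry shape and the `matchedQuery` field are made up for illustration; only the mechanism (mutating a shared object that is later passed to `JSON.stringify`) is the point.

```javascript
// Hypothetical dictionary as it might look right after JSON.parse.
const loadedDictionary = [{ word: 'fwew' }];

// Careless postprocessing: writes query-specific data onto the shared entry.
function postprocessInPlace(entry, query) {
  entry.matchedQuery = query;
  return entry;
}

// Safer variant: works on a shallow copy and leaves the shared entry alone.
// (Nested objects would still be shared; a deep copy would be needed for those.)
function postprocessCopy(entry, query) {
  return { ...entry, matchedQuery: query };
}

const before = JSON.stringify(loadedDictionary);

postprocessCopy(loadedDictionary[0], 'fwew');
const afterCopy = JSON.stringify(loadedDictionary); // unchanged

postprocessInPlace(loadedDictionary[0], 'fwew');
const afterMutation = JSON.stringify(loadedDictionary); // now contains matchedQuery
```

If the editor then saves by serializing the in-memory dictionary, the stray field is written to disk along with the intended edit.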

Conclusion: we should do this differently. Read dictionary from JSON, then immediately iterate over all words and precompute everything we can, storing the results back in dictionary. Then for the editor, whenever the user would like to make an edit, read the thing straight from the JSON and save it back straight to JSON, without even going through reykunyu.js (server.js should handle this on its own).
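A minimal sketch of what this could look like, under some assumptions: the dictionary JSON is taken to be an array of entries, `precomputeEntry` stands in for all query-independent work (the substring-based `derived` field is a toy placeholder, not Reykunyu's real derivation logic), and `loadDictionary` and `applyEdit` are hypothetical names. File I/O is left out so the example stays self-contained; in practice the JSON text would be read from and written to the dictionary file.

```javascript
// Placeholder for query-independent work done once per entry at startup.
// The 'derived' field is a toy example: other words that contain this word.
function precomputeEntry(entry, allEntries) {
  return {
    ...entry,
    derived: allEntries
      .filter((other) => other !== entry && other.word.includes(entry.word))
      .map((other) => other.word),
  };
}

// Startup path: parse the JSON, then precompute everything for all words.
// The result is the in-memory dictionary and is never written back to disk.
function loadDictionary(jsonText) {
  const rawEntries = JSON.parse(jsonText);
  return rawEntries.map((entry) => precomputeEntry(entry, rawEntries));
}

// Editor path: read the raw JSON, replace one entry, serialize it again.
// It never touches the precomputed in-memory dictionary, so nothing can leak.
function applyEdit(jsonText, index, newEntry) {
  const rawEntries = JSON.parse(jsonText);
  rawEntries[index] = newEntry;
  return JSON.stringify(rawEntries);
}
```

With this split, the server would presumably call `loadDictionary` again (or recompute the affected entries) after `applyEdit`, so that the in-memory data reflects the edit.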

This issue came up specifically as a sub-issue of #105 / #133, because we now need to store the FN/RN/raw forms of each word, so precomputing these things is essential. Otherwise we'd again have to add this to the postprocessing step, but we also need these FN forms for resolving word links, so this would become a giant mess.

Word data that is independent of the user query is now processed when the dictionary file is loaded, on Reykunyu startup, instead of every time the user does a query. Fixes #138.

Willem3141 added the server (having to do with Reykunyu's server side: the word lookup code and API) and cleanup (something that should be improved) labels May 20, 2024

Willem3141 changed the title ~~Refactor the~~ Refactor: move away query-independent tasks from the postprocessing step May 20, 2024

Willem3141 changed the title ~~Refactor: move away query-independent tasks from the postprocessing step~~ Refactor: move query-independent tasks out of the postprocessing step May 20, 2024

Willem3141 changed the title ~~Refactor: move query-independent tasks out of the postprocessing step~~ Move query-independent tasks out of the postprocessing step May 20, 2024

Willem3141 mentioned this issue May 24, 2024

Move word data processing out of the postprocessing step #140

Merged

Willem3141 linked a pull request May 28, 2024 that will close this issue

Move word data processing out of the postprocessing step #140

Merged

Willem3141 mentioned this issue May 29, 2024

Finish and release Zeykerokyu #143

Open

6 tasks
