bmschmidt
Member

Once this is merged, we're done with MySQL. With bigrams restored, I think it's pretty close to being ready.

@organisciak
Member

I trust you to do this merge, since you have the freshest understanding of the code. Perhaps loop in HTRC people like @borice?

How does DuckDB perform?

@bmschmidt
Member Author

This is not yet completely ready for review, but close enough that I want to put it in tracking.

I'm still generally finding DuckDB to run about 1.5x faster than MySQL on standard queries against the Rate My Professor bookworm, and much faster on ingest. I just made a major change to the sort code, though: DuckDB now handles the word sorting (the stage that used to be index building in MySQL, which often took 6-12 hours).

DuckDB has also just added forms of numeric compression that significantly cut disk space requirements compared to MySQL. As a rough guess, databases should be about one-third the size they were with MySQL.
