Investigate using the CRDT data structure #1171

Open
symbiogenesis opened this issue Apr 12, 2024 · 3 comments

Comments

@symbiogenesis
Contributor

symbiogenesis commented Apr 12, 2024

Conflict-free replicated data types (CRDTs) are a relatively new family of data structures for distributed synchronization. They enable collaborative editing of the kind Google Docs offers.

In theory, using this as a back end would give excellent performance with minimal complexity, offloading most of the heavy lifting to the algorithm.
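To make the idea concrete, here is a minimal sketch of one of the simplest CRDTs, a grow-only counter. It is only an illustration of the merge property, not code from any of the libraries mentioned in this thread; the `GCounter` class and the replica names are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Grow-only counter: one slot per replica, merged by element-wise max."""
    replica_id: str
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # A replica only ever increases its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order or repetition.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

    @property
    def value(self) -> int:
        return sum(self.counts.values())


# Two hypothetical devices update independently while offline.
phone = GCounter("phone")
laptop = GCounter("laptop")
phone.increment(2)
laptop.increment(3)

# Exchange state in either order; both replicas end in the same state.
phone.merge(laptop)
laptop.merge(phone)
laptop.merge(phone)  # merging again changes nothing
```

Neither device needs a coordinator to resolve the concurrent updates, which is the "heavy lifting" the merge function takes over.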

This approach has already been implemented in Dart.

sql_crdt, an abstract implementation for using relational databases as a data storage backend.

sqlite_crdt, an implementation using SQLite for storage, useful for mobile or small projects.

postgres_crdt, a sql_crdt that benefits from PostgreSQL's performance and scalability, intended for backend applications.
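A rough sketch of how a relational table could behave like a CRDT: each row carries a logical timestamp, and a remote change is applied only when its timestamp is newer. This is my own simplification, not the actual schema or API of the Dart packages above. The `hlc` column, the `merge_row` helper, and the zero-padded `counter:node` timestamp strings are all assumptions for the example.

```python
import sqlite3


def make_replica() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " hlc TEXT NOT NULL)"  # hypothetical per-row logical timestamp
    )
    return conn


def merge_row(conn: sqlite3.Connection, row_id: int, name: str, hlc: str) -> None:
    """Apply a change only if its timestamp is newer than the stored one."""
    conn.execute(
        "INSERT INTO customers (id, name, hlc) VALUES (?, ?, ?) "
        "ON CONFLICT(id) DO UPDATE SET name = excluded.name, hlc = excluded.hlc "
        "WHERE excluded.hlc > customers.hlc",
        (row_id, name, hlc),
    )


def current_name(conn: sqlite3.Connection, row_id: int) -> str:
    return conn.execute(
        "SELECT name FROM customers WHERE id = ?", (row_id,)
    ).fetchone()[0]


# Two concurrent edits to the same row, stamped by different devices.
older = (1, "Acme", "0000000001:phone")
newer = (1, "Acme Ltd", "0000000002:laptop")

# The two replicas receive the edits in opposite orders.
replica_a = make_replica()
merge_row(replica_a, *older)
merge_row(replica_a, *newer)

replica_b = make_replica()
merge_row(replica_b, *newer)
merge_row(replica_b, *older)  # stale change is ignored

name_a = current_name(replica_a, 1)
name_b = current_name(replica_b, 1)
```

Both replicas settle on the same row regardless of arrival order. The upsert syntax needs SQLite 3.24 or later.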

The CRDT algorithm itself was also implemented in .NET over a SignalR websocket with Yjs, although it is less relevant to the needs of relational databases.

The notion of using a websocket is kind of nuts, but quite appealing. All changes could be streamed as they happen.

@VagueGit
Contributor

My understanding is that CRDT offers eventual consistency. This doesn't work for us. Our customers expect their data to be 'immediately' consistent across all devices in their company.

We even found, when we experimented with SQLite WAL, that our customers were unhappy because data written to the log wasn't immediately synced. We tried forcing checkpoints without success, so we abandoned WAL.
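For anyone reading along who hasn't used WAL, this is roughly what "forcing a checkpoint" looks like, shown with Python's built-in `sqlite3` module rather than our actual stack. The file name and the `orders` table are placeholders for the example.

```python
import os
import sqlite3
import tempfile

# WAL needs a real file; it is not available for in-memory databases.
db_path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(db_path)

# Switch the database to write-ahead logging; the pragma reports the new mode.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.execute("INSERT INTO orders (total) VALUES (19.99)")
conn.commit()

# The committed row may still live only in app.db-wal at this point.
# A TRUNCATE checkpoint copies the log into the main file and empties the log.
busy, log_frames, checkpointed = conn.execute(
    "PRAGMA wal_checkpoint(TRUNCATE)"
).fetchone()
wal_size = os.path.getsize(db_path + "-wal")
```

A non-zero `busy` value means the checkpoint could not complete, typically because another connection was still reading or writing.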

I mention that to flag how sensitive some are to having all their data now.

@symbiogenesis
Contributor Author

Interesting. Do you sync on a particular interval or during particular events?

A SignalR-based approach might be kind of nice by comparison, irrespective of CRDT. But conflicts in this kind of scenario are likely. The eventual consistency of CRDT, in the real world, may be no slower than the current approach.

I don't think any distributed synchronization scheme can offer anything better than eventual consistency. They can offer immediate inconsistency, and then become consistent or not, depending on the algorithm and network conditions.

Until faster-than-light communication via quantum entanglement is invented, that is.

I am only brainstorming.

@VagueGit
Contributor

Ours is a conventional business app. We sync when the user clicks Save ... save a customer, save an order etc. Each company that uses our app has their own db on our server. Each company may have many client devices syncing to that db.

The server db is the source of truth. Last write wins. So we have consistency between a client and the server when each client syncs with the server.
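A toy model of that flow, for illustration only. This is not how DMS is implemented; the `Server` and `Client` classes, the version counter, and the key names are all invented for the sketch.

```python
class Server:
    """Source of truth: whichever write arrives last replaces the earlier one."""

    def __init__(self):
        self.rows = {}     # key -> (value, version)
        self.version = 0   # bumped on every accepted write

    def push(self, key, value):
        self.version += 1
        self.rows[key] = (value, self.version)

    def pull(self, since):
        # Return every row changed after the caller's last known version.
        return {k: v for k, (v, ver) in self.rows.items() if ver > since}


class Client:
    def __init__(self, server):
        self.server = server
        self.local = {}
        self.last_seen = 0

    def save(self, key, value):
        # "Save" writes locally, pushes to the server, then pulls changes.
        self.local[key] = value
        self.server.push(key, value)
        self.sync()

    def sync(self):
        self.local.update(self.server.pull(self.last_seen))
        self.last_seen = self.server.version


server = Server()
device_a = Client(server)
device_b = Client(server)

device_a.save("customer:1", "Acme")
device_b.save("customer:1", "Acme Ltd")  # later write wins on the server

stale = device_a.local["customer:1"]     # device A has not synced since
device_a.sync()
fresh = device_a.local["customer:1"]
```

Device A keeps showing its own value until its next sync, which is the window in which users notice a difference between devices.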

With WAL, Save and Sync saves to the log, but syncs from the local db. Users complained they had saved on one device and the updates had not appeared on another device.

Back in the day, our customers were using our app on their LAN. They got used to consistent data across devices. When we moved away from that model we had to maintain as far as possible a similar appearance of consistency. DMS does that better than alternatives we explored.

Drifting off-topic but we exist and don't exist in a multiverse where all topics are relevant and irrelevant.

2 participants