Multiple webserver workers have different db state for FlatFile #32

Open
rewiaca opened this issue May 1, 2021 · 4 comments
Labels
resolved Issue has been resolved but remain open for reference

Comments

@rewiaca

rewiaca commented May 1, 2021

First of all, great lib for small prods and dev without installing and handling mongod or old-fashioned sqlite3!

I'm having a problem using supervisor with multiple workers, i.e. running several instances of the same Python script, each of which connects to the database and reads and writes it.
The problem is that every worker sees a different version of the database. My config:

from montydb import MontyClient
client = MontyClient("data")
client.cache_modified = 1
db = client.db

With cache_modified = 0 the problem is the same.
I assumed that montydb keeps the database in memory and treats the FlatFile as a cache, so setting cache_modified to 1 would help, but it doesn't. Maybe the cause is something else?

@davidlatwe
Owner

Hey @rewiaca , thanks for trying !

I think the problem is that the FlatFile storage engine doesn't use any file lock, so it's not safe for multiple processes.
Maybe you could try the SQLite engine or LMDB?
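
To illustrate what a file lock would provide here, below is a minimal sketch of a read-modify-write guarded by an advisory lock. It is not montydb code: `locked_update`, the JSON file layout, and the `.lock` side-car file are all hypothetical names made up for this example, and `fcntl` is only available on Unix-like systems.

```python
import fcntl
import json
import os


def locked_update(path, key, value):
    """Read-modify-write a JSON file while holding an exclusive advisory lock.

    Hypothetical helper for illustration only; montydb's FlatFile engine
    does not do this.
    """
    lock_path = path + ".lock"  # hypothetical side-car lock file
    with open(lock_path, "w") as lock_file:
        # Blocks until every other holder of the lock has released it.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            data = {}
            if os.path.exists(path):
                with open(path) as f:
                    data = json.load(f)  # always re-read the latest state
            data[key] = value
            tmp_path = path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(data, f)
            os.replace(tmp_path, path)  # swap the new file in atomically
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
    return data
```

Because each call re-reads the file under the lock before writing, two workers calling `locked_update` on the same path would not overwrite each other's keys. Engines such as SQLite and LMDB handle this kind of coordination internally, which is why switching engines is the simpler fix.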

@davidlatwe
Owner

Also, I don't think this line actually sets the config:

client.cache_modified = 1

@rewiaca
Author

rewiaca commented May 1, 2021

> Hey @rewiaca , thanks for trying !
>
> I think the problem is that the FlatFile storage engine doesn't use any file lock, so it's not safe for multiple processes.
> Maybe you could try the SQLite engine or LMDB?

Nice, LMDB works great! Thanks, I just installed lmdb through pip and that was all. As I understand it, each worker will no longer hold the whole database in memory but will load it from the file on every request. What is the difference?

> Also, I don't think this line actually sets the config:
>
> client.cache_modified = 1

How do I set the config properly then?
from montydb import cache_modified - doesn't work

@davidlatwe
Owner

Glad that the LMDB storage engine helps!

> As I understand it, each worker will no longer hold the whole database in memory but will load it from the file on every request. What is the difference?

Well, the FlatFile storage engine is dead simple: it rewrites the whole file once the number of changed documents reaches the cache_modified limit, and that write is not atomic at all. So if more than one worker can write its own results without any lock or sync, a race condition emerges.
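
The lost-update effect described above can be reproduced with a toy model. This is a sketch under stated assumptions, not montydb's implementation: `WholeFileStore`, its JSON file format, and `simulate_two_workers` are hypothetical, and the two "workers" are just two objects in one process that each hold a private in-memory copy.

```python
import json
import os
import tempfile


class WholeFileStore:
    """Toy store: keeps documents in memory, rewrites the whole file on flush."""

    def __init__(self, path, cache_modified=1):
        self.path = path
        self.cache_modified = cache_modified
        self.modified = 0
        # Private snapshot taken once; later writes by others are never seen.
        self.docs = {}
        if os.path.exists(path):
            with open(path) as f:
                self.docs = json.load(f)

    def insert(self, key, doc):
        self.docs[key] = doc
        self.modified += 1
        if self.modified >= self.cache_modified:
            self.flush()

    def flush(self):
        # Overwrites the file with this worker's view only, no lock, no merge.
        with open(self.path, "w") as f:
            json.dump(self.docs, f)
        self.modified = 0


def simulate_two_workers():
    path = os.path.join(tempfile.mkdtemp(), "collection.json")
    worker_a = WholeFileStore(path)
    worker_b = WholeFileStore(path)
    worker_a.insert("a", {"from": "worker A"})
    worker_b.insert("b", {"from": "worker B"})  # clobbers worker A's write
    with open(path) as f:
        on_disk = json.load(f)
    return worker_a.docs, worker_b.docs, on_disk
```

Calling `simulate_two_workers()` returns three different states: worker A only knows document `a`, worker B only knows `b`, and the file on disk contains only `b` because the last full rewrite wins. That matches the symptom reported in this issue, where every worker has its own version of the database.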

> How do I set the config properly then?

Ah, I thought the README provided that info, but it is not clear! (The README only lists the available config entries but doesn't say how to set them.)

The config should be set through set_storage, passed as keyword arguments.

For FlatFile it would look like this:

from montydb import set_storage, MontyClient
set_storage("/db/repo", storage="flatfile", cache_modified=1)
client = MontyClient("/db/repo")

For LMDB:

from montydb import set_storage, MontyClient
set_storage("/db/repo", storage="lightning", map_size=10485760)  # map_size: maximum size in bytes the database may grow to (10485760 = 10 MiB)
client = MontyClient("/db/repo")

You should then find a file named monty.storage.cfg saved in your db repository path, which is /db/repo in the examples above.
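
A quick way to confirm that the config was actually written is to check for that file before starting the workers. The helper names below (`storage_config_path`, `has_storage_config`) are hypothetical and not part of montydb; only the file name comes from the comment above.

```python
import os

# File name as given in the comment above.
CONFIG_FILE_NAME = "monty.storage.cfg"


def storage_config_path(repo_dir):
    """Return where the storage config is expected inside the repository."""
    return os.path.join(repo_dir, CONFIG_FILE_NAME)


def has_storage_config(repo_dir):
    """True if the repository directory already contains a storage config."""
    return os.path.isfile(storage_config_path(repo_dir))
```

For the examples above, `has_storage_config("/db/repo")` should be true once set_storage has run. If it is false, the storage settings were probably never saved.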

@davidlatwe davidlatwe added the resolved Issue has been resolved but remain open for reference label May 29, 2021