[BUG] Issues when multithreading with simstring #30

Closed
@youngbinkim0

Description

Bug Description
I saw that QuickUMLS supports multithreading by using unqlite rather than leveldb. However, when I run two instances of QuickUMLS that refer to the same database, the second instance returns no terms. The issue seems to come from SimstringDBReader, as shown in the example below. How can I ensure that multiple SimstringDBReader instances can read the same database? I want to multithread QuickUMLS without having to copy the database multiple times.
 
To Reproduce

import quickumls
simstring_reader = quickumls.toolbox.SimstringDBReader(path="your_path_here/umls-simstring.db", similarity_name="jaccard", threshold=0.9)
simstring_reader.get("diabetes")

When two instances of this code run concurrently, the first returns ('diabetes',) while the second returns ().
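
To make the concurrent case easier to reproduce from a single script, here is a minimal sketch of a harness that opens one reader per thread and issues the same lookup from all of them at once. The names `query_concurrently` and `FakeReader` are hypothetical and not part of QuickUMLS; `FakeReader` is only a stand-in so the script runs without a database. The sketch uses threads, which is an assumption about how the two instances are run.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def query_concurrently(make_reader, term, workers=2):
    """Open one reader per thread and look up `term` in all of them at once."""
    # The barrier holds every thread until all readers are open,
    # so the lookups overlap instead of running one after another.
    barrier = threading.Barrier(workers)

    def worker(_):
        reader = make_reader()  # each thread opens its own reader
        barrier.wait()
        return tuple(reader.get(term))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(worker, range(workers)))


class FakeReader:
    """Hypothetical stand-in exposing the same get() call as in the report."""

    def get(self, term):
        return (term,)


results = query_concurrently(FakeReader, "diabetes")
print(results)
```

To run it against the real database, `FakeReader` would be replaced with a factory such as `lambda: quickumls.toolbox.SimstringDBReader(path="your_path_here/umls-simstring.db", similarity_name="jaccard", threshold=0.9)`. If the behavior described above also occurs across threads, one of the returned tuples should come back empty.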

Environment

  • QuickUMLS version [e.g. 1.4]
  • UMLS version [e.g. 2024AA]

Additional Note
Also posted the bug report in the original repo here
