Reuse StringGazetteer Object #188

Open
luisenriqueramos1977 opened this issue Apr 16, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@luisenriqueramos1977

Is your feature request related to a problem? Please describe.

I can currently create a StringGazetteer from a text file containing a list of terms, and it works without any problem.
However, because this is part of a repetitive process, I would like to be able to store the StringGazetteer object and reuse it.
Is such a feature available, or is there another approach to reach this goal?

Luis Ramos

@luisenriqueramos1977 luisenriqueramos1977 added the enhancement New feature or request label Apr 16, 2023
@johann-petrak
Collaborator

Thank you for this feature request!

I think it would be a good idea to be able to store and load a gazetteer instance. However, this is not currently possible out of the box, and the code would need some work before an instance can be stored, e.g. by pickling it.

The main issue I see at the moment is that, for the object to be picklable, any lambdas must be replaced with callables that can be pickled, or we would have to hook into the pickle process by implementing our own __getstate__ and __setstate__ methods.
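To illustrate the two options, here is a minimal sketch using a hypothetical `TinyGazetteer` class. It is not the actual StringGazetteer code, and all names in it (`TinyGazetteer`, `_default_normalizer`, `_matcher`, `contains`) are made up for the example. It assumes the only unpicklable part of the object is a lambda stored as an attribute, which may not be the whole story for the real class.

```python
import pickle


def _default_normalizer(text):
    # Module-level function: pickle stores it by name, unlike a lambda.
    return text.lower()


class TinyGazetteer:
    """Hypothetical stand-in, NOT the real StringGazetteer."""

    def __init__(self, terms, normalizer=None):
        self.normalizer = normalizer or _default_normalizer
        self.terms = {self.normalizer(t) for t in terms}
        # A lambda stored on the instance is what makes plain pickle fail.
        self._matcher = lambda tok: self.normalizer(tok) in self.terms

    def __getstate__(self):
        # Drop the lambda before pickling; everything else is plain data.
        state = self.__dict__.copy()
        del state["_matcher"]
        return state

    def __setstate__(self, state):
        # Restore the data, then rebuild the lambda on the new instance.
        self.__dict__.update(state)
        self._matcher = lambda tok: self.normalizer(tok) in self.terms

    def contains(self, token):
        return self._matcher(token)


gaz = TinyGazetteer(["Berlin", "Vienna"])
restored = pickle.loads(pickle.dumps(gaz))
```

With this in place, `restored.contains("BERLIN")` behaves the same as on the original object. Replacing the default lambda with a named function covers the first option, and `__getstate__`/`__setstate__` cover the second.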

@johann-petrak
Collaborator

Actually, I think you could try doing this with the cloudpickle package.
Install the package into your environment, then do something like this to save and restore the gazetteer instance:

Save:

import cloudpickle
with open("gaz1.pkl", "wb") as outf:
    cloudpickle.dump(gaz1, outf)

Load:

import cloudpickle
with open("gaz1.pkl", "rb") as inf:
    gaz1 = cloudpickle.load(inf)

Does that work for you?

@johann-petrak
Collaborator

johann-petrak commented Apr 17, 2023

Alternatively, you could try using the dill package in a similar way.

@luisenriqueramos1977
Author

luisenriqueramos1977 commented May 30, 2023 via email

@johann-petrak
Collaborator

johann-petrak commented May 30, 2023

It seems there is a blank space in the list,

What do you mean by that? Would you be able to cut the list down to the shortest one that still produces that error and share it with me, either here or by private email?
