[Bug] Metadata's _extend missing when saving Metadata to disk (in Json) #136

Open
MooooCat opened this issue Feb 5, 2024 · 1 comment
Labels
bug Something isn't working help wanted Extra attention is needed

MooooCat commented Feb 5, 2024

Description

Metadata's _extend is missing when saving Metadata to disk (in Json).

Reproduce

# import packages 
import pandas as pd 
from pathlib import Path
from sdgx.data_models.metadata import Metadata
from sdgx.utils import download_demo_data

# build a Metadata object; I use the demo dataset here,
# but any dataset works
p = download_demo_data()
df = pd.read_csv(p)
m = Metadata.from_dataframe(df)

# add a key-value pair;
# this populates the `_extend` field
m.add('a', "something") 
# then save the model 
m.save(Path('here.json'))

print(m.get('a'))
"""The output is:
{'something'}
"""

# load the model from disk 
n = Metadata.load(Path("here.json"))
# the value "something" is missing
print(n.get('a'))
"""The output is:
set()
"""
# the `_extend` field is empty
print(n._extend)
"""The output is:
defaultdict(<class 'set'>, {})
"""

Expected behavior

# load the model from disk 
n = Metadata.load(Path("here.json"))
# the value "something" should still be present
print(n.get('a'))
"""The expected output should be:
{'something'}
"""

Context

  • Operating System and version: macOS 14.2.1 (23C71), Python 3.9
  • Browser and version(if necessary): -
  • Which version are you using: 0.1.5 / 0.1.6 dev0

My initial guess is that this bug is related to the model_dump_json() method of pydantic.BaseModel.

The JSON string produced by this method does not contain _extend.

Maybe this is because _extend is a private member of the class?
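
A minimal sketch of that hypothesis, assuming Pydantic v2 (where model_dump_json() is available). The `Demo` class and its `name` field are hypothetical stand-ins, not the actual sdgx `Metadata` definition; only `_extend` mirrors the attribute discussed here. It suggests that an underscore-prefixed private attribute is left out of the JSON dump and comes back empty after a round trip.

```python
from collections import defaultdict

from pydantic import BaseModel, PrivateAttr


class Demo(BaseModel):
    # Hypothetical stand-in for Metadata; `name` is an ordinary public field.
    name: str = "demo"
    # Underscore-prefixed attribute, declared as a Pydantic private attribute.
    _extend: defaultdict = PrivateAttr(default_factory=lambda: defaultdict(set))


d = Demo()
d._extend["a"].add("something")

# Private attributes are not part of the serialized output.
dumped = d.model_dump_json()
print(dumped)

# After a round trip, the private attribute is rebuilt from its default.
restored = Demo.model_validate_json(dumped)
print(restored._extend)
```

If this matches what Metadata does internally, the value is dropped at dump time rather than at load time.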

@MooooCat MooooCat added bug Something isn't working help wanted Extra attention is needed labels Feb 5, 2024
@MooooCat MooooCat self-assigned this Feb 5, 2024

Wh1isper commented Feb 7, 2024

Nice catch!

_extend can contain a lot of information, so perhaps we could add a new field that lets the user choose which attributes should be saved.

We could start by saving all attributes, given that they shouldn't be too big at the moment.
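
One way the "save all attributes" idea could look, as a standard-library sketch rather than the project's actual fix. The helper names `extend_to_jsonable` and `extend_from_jsonable` are hypothetical, and the sketch assumes the values stored in `_extend` are mutually comparable (for example, all strings) so they can be sorted into lists, since `json` cannot encode sets directly.

```python
import json
from collections import defaultdict


def extend_to_jsonable(extend):
    """Turn a mapping of sets into a plain dict of sorted lists."""
    # Assumption: values within each set can be compared with each other.
    return {key: sorted(values) for key, values in extend.items()}


def extend_from_jsonable(payload):
    """Rebuild a defaultdict(set) from the dict-of-lists form."""
    restored = defaultdict(set)
    for key, values in payload.items():
        restored[key].update(values)
    return restored


extend = defaultdict(set)
extend["a"].add("something")

# Store the converted mapping under an explicit key next to the other fields.
text = json.dumps({"_extend": extend_to_jsonable(extend)})

loaded = extend_from_jsonable(json.loads(text)["_extend"])
print(loaded["a"])  # {'something'}
```

A user-selectable subset, as suggested above, could then be a filter on the keys passed to the first helper.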

@MooooCat MooooCat changed the title Metadata's _extend is missing when saving Metadata to disk (in Json) [Bug] Metadata's _extend missing when saving Metadata to disk (in Json) Feb 26, 2024
c3kimball added a commit to sam3179/synthetic-data-generator that referenced this issue Apr 21, 2024