[bug] Datasets with CJK names & Evaluation comparisons with those datasets not displaying in UI #2567

Closed
TeoZosa opened this issue Oct 2, 2024 · 4 comments

TeoZosa commented Oct 2, 2024

This was a weird one I ran into. At first I thought it was caused by the datasets being too large [1], but the problem persisted even at smaller dataset sizes that work fine with English names. Hopefully the patch for this one isn't too much work.

Steps to reproduce

>>> import weave
>>> weave.init(project_name="my-project")
>>> for name in ["Deep Learning", "深度學習", "深層学習", "딥 러닝"]:
...     dataset = weave.Dataset(name=name, rows=[{"key": "value"}])
...     weave.publish(dataset)
📦 Published to https://wandb.ai/...
ObjectRef(entity='...', project='...', name='Deep-Learning', digest='...', extra=())
📦 Published to https://wandb.ai/...
ObjectRef(entity='...', project='...', name='深度學習', digest='...', extra=())
📦 Published to https://wandb.ai/...
ObjectRef(entity='...', project='...', name='深層学習', digest='...', extra=())
📦 Published to https://wandb.ai/...
ObjectRef(entity='...', project='...', name='딥-러닝', digest='...', extra=())
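
The `name=` fields in the output above show that only whitespace was rewritten (to a hyphen), while the CJK characters passed through unchanged. A minimal sketch of a sanitizer that reproduces that mapping follows. `sanitize_name_sketch` is a hypothetical stand-in written for illustration; it is an assumption, not weave's actual `sanitize_object_name` implementation.

```python
import re


def sanitize_name_sketch(name: str) -> str:
    """Hypothetical sanitizer for illustration only; NOT weave's real code."""
    # For str patterns in Python 3, \w is Unicode-aware, so Han and Hangul
    # characters count as word characters and are left untouched.
    cleaned = re.sub(r"[^\w.]+", "-", name)
    # Drop any hyphens left dangling at either end.
    return cleaned.strip("-")


names = ["Deep Learning", "深度學習", "深層学習", "딥 러닝"]
sanitized = {name: sanitize_name_sketch(name) for name in names}

for original, result in sanitized.items():
    print(f"{original!r} -> {result!r}")
```

Under this assumption, the published names are still non-ASCII after sanitization, which is consistent with the failure showing up later on the read path rather than at publish time.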

Behavior

Dataset

Datasets are confirmed loadable via the API

>>> import weave
>>> import weave.trace.weave_client
>>> weave.init(project_name="my-project")
>>> def fetch_dataset(dataset_ref: str) -> weave.Dataset:
...     dataset_ref_sanitized = weave.trace.weave_client.sanitize_object_name(dataset_ref)
...     dataset = weave.ref(dataset_ref_sanitized).get()
...     return dataset
>>> for name in ["Deep Learning", "深度學習", "深層学習", "딥 러닝"]:
...     print(f"{name}: {len(fetch_dataset(name).rows)}")
Deep Learning: 1
深度學習: 1
深層学習: 1
딥 러닝: 1

But not viewable in the UI

[Screenshots: 4 images captured 2024-10-02, not reproduced here]

Evaluation

The Evaluation comparison view tries to load and eventually errors out

[Screenshots: 2 images captured 2024-10-02, not reproduced here]

Footnotes

  1. [bug] "Large" Datasets can't be queried #2566

@jamie-rasmussen
Collaborator

Thank you very much for this and your other recent submissions; we are investigating.

@TeoZosa
Author

TeoZosa commented Oct 2, 2024

Sounds good; thanks for jumping on this so quickly, @jamie-rasmussen!

@jamie-rasmussen
Collaborator

Tracking internally as https://wandb.atlassian.net/browse/WB-21343

@tssweeney
Collaborator

Hello - we have identified a fix and will deploy this week: 483ae68. This is actually a read-only issue and previously logged data should be correct.
