🐛 AI: wrong unit in one of the epoch datasets #3319

Merged: 1 commit on Sep 28, 2024

veronikasamborska1994
Contributor

No description provided.

@owidbot
Contributor

owidbot commented Sep 20, 2024

Quick links (staging server):

Site Admin Wizard

Login: ssh owid@staging-site-epoch-bug

chart-diff: ✅ No charts for review.
data-diff: ❌ Found differences
= Dataset garden/artificial_intelligence/2024-02-15/epoch_llms
  = Table epoch_llms
    ~ Column dataset_size__tokens (changed metadata)
+       +   - |-
+       +     In the context of language models, this data size is often measured in tokens, which are chunks of text that the model processes. A 100 tokens is equivalent to around 75 words.
-       - unit: datapoints
+       + unit: tokens


Legend: +New  ~Modified  -Removed  =Identical  Details
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet

Automatically updated datasets matching weekly_wildfires|excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk are not included

Edited: 2024-09-20 10:59:57 UTC
Execution time: 15.25 seconds
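
For reviewers checking the new column description, the rule of thumb quoted in the data-diff (100 tokens is roughly 75 words) can be sketched as a small conversion helper. This is an illustration only: the function and constant names are hypothetical and not part of the ETL codebase, and the real ratio varies by tokenizer and language.

```python
# Rule of thumb from the column description: 100 tokens ~ 75 words.
# Hypothetical helper names; the ratio is approximate and tokenizer-dependent.
WORDS_PER_TOKEN = 75 / 100


def tokens_to_words(tokens: float) -> float:
    """Rough word-count estimate for a dataset size given in tokens."""
    if tokens < 0:
        raise ValueError("token count must be non-negative")
    return tokens * WORDS_PER_TOKEN


def words_to_tokens(words: float) -> float:
    """Inverse estimate: approximate token count for a given word count."""
    if words < 0:
        raise ValueError("word count must be non-negative")
    return words / WORDS_PER_TOKEN


if __name__ == "__main__":
    # A dataset of one trillion tokens, expressed as an approximate word count.
    print(f"{tokens_to_words(1e12):.3g} words")
```

This also shows why the unit fix matters: a value labelled `datapoints` gives no hint that it can be read as an approximate word count, whereas `tokens` does.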

@veronikasamborska1994 veronikasamborska1994 merged commit 4d05a0c into master Sep 28, 2024
8 checks passed

2 participants