This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Inquiries about utilizing Common Crawl snapshots collected in 2022 #40

Open
hyunmokky opened this issue Mar 14, 2023 · 0 comments

Comments

@hyunmokky

The paper states that the CCNet study was conducted on the February 2019 Common Crawl snapshot.
I want to use Common Crawl snapshots collected after 2022.
Can the CCNet GitHub code also be used to classify Common Crawl data collected after 2022 by language?
