Most obvious CSV data is two years out of date #343
Comments
- Fixes ossf#343 Signed-off-by: nathannaveen <[email protected]>
I've rearranged the objects in the bucket - does this help?
Are the files in "archive" the same 2020 files? It helps in that the old files are now in "archive", but now their dates are 2023, which is itself misleading. Is there a reason to keep the old files at all? Why not produce "top 200" files for current data?
I've put them in a folder roughly corresponding to the date they were originally created. As for producing "top 200" files for current data - I'm interested in how you might be using these. I had been leaning towards not producing top-200 sets for each language group, and just supplying a script for producing them locally. However, if the top-200 sets are providing value, I'm more than happy to work on getting these produced automatically.
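For illustration only, a local script of the kind mentioned above might look roughly like the sketch below. It is not the project's actual tooling. The column names `language` and `criticality_score`, the function name `top_n_by_language`, and the default of 200 are all assumptions made for the example; adjust them to whatever the real CSV contains.

```python
import csv
import heapq
from collections import defaultdict


def top_n_by_language(rows, n=200, language_field="language",
                      score_field="criticality_score"):
    """Group rows by language and keep the n highest-scoring rows per group.

    `rows` is any iterable of dicts, e.g. a csv.DictReader.
    The field names are hypothetical defaults, not confirmed column names.
    """
    groups = defaultdict(list)
    for row in rows:
        try:
            score = float(row[score_field])
        except (KeyError, TypeError, ValueError):
            # Skip rows with a missing or non-numeric score.
            continue
        language = row.get(language_field) or "unknown"
        groups[language].append((score, row))

    # nlargest with a key avoids comparing the row dicts themselves.
    return {
        language: [row for _, row in
                   heapq.nlargest(n, scored, key=lambda pair: pair[0])]
        for language, scored in groups.items()
    }


def load_rows(path):
    """Read a CSV file (hypothetical local path) into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Calling `top_n_by_language(load_rows("all.csv"))` on a locally downloaded file (the filename here is a placeholder) would return a dict mapping each language to its highest-scoring rows, which could then be written back out as one CSV per language.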
TBH, I'm new to this data set, and am not sure how I would use the data. I wrote this issue as feedback from a new user trying to understand the data set. The link from the README sounds enticing, but then I land on a raw web server page with old files. My suggestion is simply to present the data you value in a way that makes it easy for people to find and understand.
The home page says:
That page has handy per-language files, but they are dated 2020-12-30. Newer data should be made easier to find, or at least stale data should be removed as an attractive nuisance.