Problem about the number of questions of Algebra2005 dataset #161

Open
xiangxin-oss opened this issue Dec 29, 2023 · 1 comment

Comments

@xiangxin-oss

Hello, I read on https://pykt-toolkit.readthedocs.io/en/latest/datasets.html#algebra2005 that the Algebra2005 dataset has 210,710 questions, but the "num_q" value for Algebra2005 in the "data_config.json" file in the "configs" folder is 173113.
Don't these two numbers mean the same thing?
If the "num_q" value is correct, how is "num_q" calculated for the Algebra2005 dataset?

@Li-XYi
Collaborator

Li-XYi commented Oct 3, 2024

Thank you for your question! The number 210,710 is the number of questions in the original Algebra2005 dataset. After pykt's standard data processing pipeline is applied, only 173,113 valid questions remain, which is why the “num_q” value in the data_config.json file is different.
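As a rough illustration of why the two counts can differ, here is a minimal sketch on toy data. It is not pykt's actual preprocessing code: the `Interaction` record, the `is_valid` rule (drop rows with a missing skill or response), and the sample rows are all assumptions made up for this example. It only shows that counting unique questions after filtering can give a smaller number than counting them in the raw data.

```python
from typing import Iterable, NamedTuple, Optional


class Interaction(NamedTuple):
    """One student response (hypothetical schema, not pykt's)."""
    student: str
    question: str
    skill: Optional[str]    # None stands for a missing skill label
    correct: Optional[int]  # None stands for a missing response


def is_valid(row: Interaction) -> bool:
    # Assumed filtering rule for illustration only:
    # keep rows where both the skill and the response are present.
    return row.skill is not None and row.correct is not None


def count_questions(rows: Iterable[Interaction]) -> int:
    # Number of distinct question IDs among the given rows.
    return len({row.question for row in rows})


raw = [
    Interaction("s1", "q1", "k1", 1),
    Interaction("s1", "q2", None, 0),     # q2 appears only without a skill
    Interaction("s2", "q1", "k1", 0),
    Interaction("s2", "q3", "k2", None),  # q3 appears only without a response
    Interaction("s3", "q4", "k2", 1),
]

cleaned = [row for row in raw if is_valid(row)]

raw_count = count_questions(raw)          # distinct questions before filtering
cleaned_count = count_questions(cleaned)  # distinct questions after filtering

print(f"raw: {raw_count}, after filtering: {cleaned_count}")
```

In this toy case, questions that occur only in dropped rows disappear from the count, which is the same kind of effect that separates the documented figure from “num_q”.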

I hope this clears up the confusion! Let me know if you have any more questions.
