Improve memory dimensioning for SNB training #338

Closed
folmos-at-orange opened this issue Jul 23, 2024 · 0 comments · Fixed by #362
Labels
Priority/0 To do NOW Type/Enhancement New feature or request

folmos-at-orange commented Jul 23, 2024

Description

Currently, the memory dimensioning of the SNB training task is too conservative: it slices the training database for several datasets, including ones from the ChallengeAutoML suite. We suspect this could be improved by estimating the memory required for the task more accurately.

Questions/Ideas

  • To be studied in the ChallengeAutoML suite with a 2 GB memory limit.
- ChallengeAutoML/Dionis : Train database was sliced. Number of slices: 2 (needs extra 119.7 MB)
- ChallengeAutoML/Flora : Train database was sliced. Number of slices: 4 (needs extra 30.7 MB)
- ChallengeAutoML/Robert : Train database was sliced. Number of slices: 2 (needs extra 10.2 MB)
- ChallengeAutoML/Tania : Train database was sliced. Number of slices: 5 (needs extra 28.0 MB)
- ChallengeAutoML/Wallis : Train database was sliced. Number of slices: 2 (needs extra 1024.0 KB)
- MTClassification/Auslan : Train database was sliced. Number of slices: 2 (needs extra 40.8 MB)
- MTClassification/MTConnect4 : Train database was sliced. Number of slices: 3 (needs extra 789.1 MB)
- MTClassification/MTConnect4Extended : Train database was sliced. Number of slices: 2 (needs extra 2.5 GB)
- MTClassification/MTPokerHandExtended : Train database was sliced. Number of slices: 6 (needs extra 214.1 MB)
- SmallInstability/AIDS10000 : Train database was sliced. Number of slices: 4 (needs extra 1.3 GB)
- TextClassification/20newsgroups : Train database was sliced. Number of slices: 2 (needs extra 2.5 MB)
- TextClassification/20newsgroups : Train database was sliced. Number of slices: 12 (needs extra 273.0 MB)
- TextClassification/RegressionWineReviews : Train database was sliced. Number of slices: 7 (needs extra 536.3 MB)
- TextClassification/RegressionWineReviews : Train database was sliced. Number of slices: 2 (needs extra 4.6 MB)
- TextClassification/RegressionWineReviews : Train database was sliced. Number of slices: 8 (needs extra 1024.0 KB)
  • Run a small study of the memory required by the recoding class.
    • MB will implement a tool for this study.
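
To compare datasets, the slicing messages listed above can be tabulated and ranked by how much memory was missing. The following is a minimal sketch, not part of Khiops: the function names are hypothetical, the message format is inferred only from the lines quoted in this issue, and the unit conversion assumes binary units (1 GB = 1024 MB).

```python
import re

# Message format inferred from the log lines quoted in this issue (an assumption).
SLICE_RE = re.compile(
    r"^-?\s*(?P<dataset>\S+)\s*:\s*Train database was sliced\.\s*"
    r"Number of slices:\s*(?P<slices>\d+)\s*"
    r"\(needs extra (?P<value>[\d.]+) (?P<unit>KB|MB|GB)\)"
)

# Assumption: binary units, i.e. 1 GB = 1024 MB and 1 MB = 1024 KB.
UNIT_TO_MB = {"KB": 1 / 1024, "MB": 1.0, "GB": 1024.0}


def parse_slice_message(line):
    """Return (dataset, slices, extra_mb) for one message, or None if it does not match."""
    match = SLICE_RE.match(line.strip())
    if match is None:
        return None
    extra_mb = float(match.group("value")) * UNIT_TO_MB[match.group("unit")]
    return match.group("dataset"), int(match.group("slices")), extra_mb


def rank_by_extra_memory(lines):
    """Parse all messages and sort them by missing memory, largest first."""
    records = [r for r in map(parse_slice_message, lines) if r is not None]
    return sorted(records, key=lambda record: record[2], reverse=True)
```

Passing the messages from this issue to `rank_by_extra_memory` puts the datasets with the largest shortfall first, which separates them from the datasets that were sliced for want of only a few megabytes.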

Context

  • Khiops 10.5.2-b.0

folmos-at-orange added the Type/Enhancement (New feature or request) and Priority/0 (To do NOW) labels on Jul 23, 2024
folmos-at-orange self-assigned this on Aug 28, 2024
folmos-at-orange linked a pull request on Aug 30, 2024 that will close this issue