Contrib: added xgboost-optuna contrib #479

Merged
merged 1 commit into from
Oct 21, 2023

Conversation

@zilto (Collaborator) commented Oct 20, 2023

Dataflow that receives X_train, y_train, X_test, and y_test and trains an XGBoost model with hyperparameter tuning.

For new dataflows:

Do you have the following?

  • Added a directory named after my GitHub username in the contrib/hamilton/contrib/user directory.
    • If my author name contains hyphens, I have replaced them with underscores.
    • If my author name starts with a number, I have prefixed it with an underscore.
    • If your author name is a Python reserved keyword, reach out to the maintainers for help.
    • Added an author.md file under my username directory and filled it out.
    • Added an __init__.py file under my username directory.
  • Added a new folder for my dataflow under my username directory.
    • Added a README.md file under my dataflow directory that follows the standard headings and is filled out.
    • Added an __init__.py file under my dataflow directory that contains the Hamilton code.
    • Added a requirements.txt under my dataflow directory that contains the required packages outside of Hamilton.
    • Added tags.json under my dataflow directory to curate my dataflow.
    • Added valid_configs.jsonl under my dataflow directory to specify the valid configurations.
    • Added a dag.png that shows one possible configuration of my dataflow.
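
The naming rules and file list in the checklist above can be checked mechanically. The sketch below is only an illustration under stated assumptions: `contrib_dir_name` and `missing_contrib_files` are hypothetical helpers, not part of Hamilton, and the package file is assumed to be named `__init__.py`.

```python
import keyword
from pathlib import Path

# File names taken from the checklist; __init__.py is an assumed spelling.
USER_FILES = ["author.md", "__init__.py"]
DATAFLOW_FILES = [
    "README.md",
    "__init__.py",
    "requirements.txt",
    "tags.json",
    "valid_configs.jsonl",
    "dag.png",
]


def contrib_dir_name(github_username: str) -> str:
    """Map a GitHub username to a directory name following the checklist rules."""
    name = github_username.replace("-", "_")  # hyphens become underscores
    if name[:1].isdigit():
        name = "_" + name  # a leading digit gets an underscore prefix
    if keyword.iskeyword(name):
        # The checklist says to contact the maintainers in this case.
        raise ValueError(f"{name!r} is a Python keyword; ask the maintainers")
    return name


def missing_contrib_files(user_dir: Path, dataflow_name: str) -> list[str]:
    """Return the checklist files that do not exist yet, relative to user_dir."""
    missing = [f for f in USER_FILES if not (user_dir / f).is_file()]
    missing += [
        f"{dataflow_name}/{f}"
        for f in DATAFLOW_FILES
        if not (user_dir / dataflow_name / f).is_file()
    ]
    return missing
```

For example, `contrib_dir_name("my-user")` returns `"my_user"` and `contrib_dir_name("3d-dev")` returns `"_3d_dev"`.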

For existing dataflows -- what has changed?

How I tested this

There is a notebook that loads data using sklearn. It should run successfully from start to finish.

Notes

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Dataflow documentation has been updated if adding/changing functionality.

@zilto requested a review from skrawcz on October 20, 2023 19:04

sweep-ai bot (Contributor) commented Oct 20, 2023

Apply Sweep Rules to your PR?

  • Apply: Leftover TODOs in the code should be handled.
  • Apply: All new business logic should have corresponding unit tests in the tests/ directory.
  • Apply: Any clearly inefficient or repeated code should be optimized or refactored.

@skrawcz merged commit 20d7efa into main on Oct 21, 2023
2 checks passed
@skrawcz deleted the contrib/optuna branch on October 21, 2023 19:35