WIP: Various improvements to the CLI endpoints #441

Merged
merged 7 commits into from
Aug 29, 2023

Conversation

@cwognum cwognum commented Aug 22, 2023

Changelogs

  • Added fine-tuning through Optuna, using the Optuna Sweeper Plugin for Hydra.
  • Migrated the CLI from Click to Typer.
  • Made it possible to specify a pre-trained model with a .ckpt path, instead of always relying on the manually defined mapping in spaces.py.
  • Added some utilities for saving essential results in case there is no logger.
  • Removed the graphium-prepare-data command in favor of a new subcommand: graphium data prepare.
  • Added graphium finetune fingerprint to easily extract fingerprints from a model.
  • (WIP: Added a notebook to investigate the correlation between pre-training and fine-tuning performance.)
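
For readers following the .ckpt change above, here is a minimal sketch of how such a lookup could work. It is not the code from this PR: `PRETRAINED_MODELS`, its single entry, and `resolve_pretrained_model` are hypothetical names chosen for illustration, and the real mapping in spaces.py may be shaped differently.

```python
# Hypothetical stand-in for the manually defined mapping in spaces.py.
# The key and the URL are placeholders, not real Graphium entries.
PRETRAINED_MODELS = {
    "dummy-pretrained-model": "https://example.com/models/dummy.ckpt",
}


def resolve_pretrained_model(name_or_path: str) -> str:
    """Return a checkpoint location for a model name or a .ckpt path.

    Assumption for this sketch: anything ending in ".ckpt" is treated
    as a user-supplied checkpoint path and returned unchanged; anything
    else is looked up in the predefined mapping.
    """
    if name_or_path.endswith(".ckpt"):
        return name_or_path
    try:
        return PRETRAINED_MODELS[name_or_path]
    except KeyError:
        known = ", ".join(sorted(PRETRAINED_MODELS))
        raise ValueError(
            f"Unknown pre-trained model '{name_or_path}'. "
            f"Pass a .ckpt path or one of: {known}"
        ) from None
```

The point of the design is that a new checkpoint no longer needs a code change to the mapping before it can be fine-tuned.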

Checklist:

  • Was this PR discussed in an issue? It is recommended to first discuss a new feature in a GitHub issue before opening a PR.
  • Add tests to cover the fixed bug(s) or the new introduced feature(s) (if appropriate).
  • Update the API documentation if a new function is added or an existing one is deleted.
  • Write concise and explanatory changelogs above.
  • If possible, assign one of the following labels to the PR: feature, fix or test (or ask a maintainer to do it for you).

discussion related to that PR

@cwognum cwognum requested a review from DomInvivo as a code owner August 22, 2023 11:00
@cwognum cwognum marked this pull request as draft August 22, 2023 11:13
codecov bot commented Aug 22, 2023

Codecov Report

Merging #441 (3c9e327) into main (e424f21) will decrease coverage by 0.76%.
Report is 2 commits behind head on main.
The diff coverage is 25.25%.

@@            Coverage Diff             @@
##             main     #441      +/-   ##
==========================================
- Coverage   65.37%   64.61%   -0.76%     
==========================================
  Files          92       93       +1     
  Lines        8253     8404     +151     
==========================================
+ Hits         5395     5430      +35     
- Misses       2858     2974     +116     
Flag Coverage Δ
unittests 64.61% <25.25%> (-0.76%) ⬇️

Components Coverage Δ
ipu 49.14% <ø> (ø)

@cwognum cwognum mentioned this pull request Aug 23, 2023
5 tasks
README.md: review thread (outdated, resolved)
docs/cli_references.md: review thread (outdated, resolved)
expts/hydra-configs/hparam_search/optuna.yaml: review thread (outdated, resolved)
graphium/cli/train_finetune.py: review thread (outdated, resolved)
graphium/cli/train_finetune.py: review thread (outdated, resolved)
notebooks/compare-pretraining-finetuning-performance.ipynb: review thread (outdated, resolved)
@cwognum cwognum mentioned this pull request Aug 24, 2023
5 tasks
@cwognum cwognum changed the title WIP: Config for hyper-parameter tuning with Optuna WIP: Various improvements to the CLI endpoints Aug 24, 2023
cwognum and others added 2 commits August 24, 2023 21:10
Let me also try credit @WenkelF properly again.
Co-authored-by: WenkelF <[email protected]>
graphium/cli/finetune_utils.py: review thread (outdated, resolved)
graphium/cli/finetune_utils.py: review thread (outdated, resolved)
graphium/finetuning/fingerprinting.py: review thread (outdated, resolved)
@DomInvivo DomInvivo marked this pull request as ready for review August 29, 2023 14:48

@DomInvivo DomInvivo left a comment

Only a comment about the CLI for fingerprinting. It seems to be missing the ability to choose a dataset? Or does that come from the config?

* `--inclusive-filter / --no-inclusive-filter`: [default: inclusive-filter]
* `--help`: Show this message and exit.

### `graphium finetune fingerprint`

How do you specify which dataset you want to generate the fingerprints on?

@cwognum (PR author) replied:

Comes from the config! It uses the predict_dataloader() of the datamodule, which in Graphium has been implemented to fall back to the test_dataloader if not explicitly specified.
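
To illustrate the fallback described in this reply, here is a minimal, framework-free sketch. `SketchDataModule` is a hypothetical class for illustration only; it is not Graphium's datamodule and returns plain lists where a real datamodule would return data loaders.

```python
class SketchDataModule:
    """Toy datamodule showing a predict -> test fallback.

    Assumption for this sketch: the datasets are plain Python sequences
    and the "dataloaders" are lists.
    """

    def __init__(self, test_data, predict_data=None):
        self._test_data = test_data
        self._predict_data = predict_data

    def test_dataloader(self):
        return list(self._test_data)

    def predict_dataloader(self):
        # No explicit prediction set configured: reuse the test set.
        if self._predict_data is None:
            return self.test_dataloader()
        return list(self._predict_data)


# Without a prediction set, fingerprints would be computed on the test data.
dm_default = SketchDataModule(test_data=["mol_a", "mol_b"])
# With one, the explicitly configured data takes precedence.
dm_explicit = SketchDataModule(test_data=["mol_a"], predict_data=["mol_c"])
```

Under this pattern the dataset is chosen entirely in the config, which is why the fingerprint command itself needs no dataset option.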

@DomInvivo DomInvivo linked an issue Aug 29, 2023 that may be closed by this pull request
@DomInvivo DomInvivo merged commit 3aa098b into datamol-io:main Aug 29, 2023
@WenkelF WenkelF mentioned this pull request Aug 30, 2023
4 tasks
Successfully merging this pull request may close these issues.

Multi-level fine-tuning and "fingerprinting"