Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workaround for Typer optional default values with Python calls #10788

Merged
merged 17 commits into from
Jun 17, 2022

Conversation

rmitsch
Copy link
Contributor

@rmitsch rmitsch commented May 12, 2022

Goals

Allow calling init_config_cfg() from Python with optional arguments not being set. This currently fails due to the behaviour of Typer for handling optional arguments in Python calls of functions decorated with @init_cli.command. See #10727 for a more in-depth description.

Description

Modified init_config_cli() to force default values for optional arguments if they are still of types typer.models.OptionInfo or typer.models.ArgumentInfo. This seems to be done by Typer automatically on CLI calls, but not on Python calls (see also the corresponding discussion in the Typer repo).

If such a type is detected, the corresponding variable is set to the configured primitive default value.

Types of change

Bug fix/enhancement, new test.

Checklist

  • I confirm that I have the right to submit this contribution under the project's MIT license.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

@rmitsch rmitsch marked this pull request as ready for review May 12, 2022 09:16
@svlandeg svlandeg added enhancement Feature requests and improvements feat / cli Feature: Command-line interface labels May 12, 2022
@adrianeboyd adrianeboyd linked an issue May 17, 2022 that may be closed by this pull request
Copy link
Member

@svlandeg svlandeg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for digging into this! I might have a look at the typer side of things later this week, too, but either way it's a good idea to make sure our code base is robust against this type of failure.

spacy/cli/init_config.py Outdated Show resolved Hide resolved
spacy/tests/test_cli.py Outdated Show resolved Hide resolved
spacy/tests/test_cli.py Outdated Show resolved Hide resolved
spacy/tests/test_cli.py Outdated Show resolved Hide resolved
@svlandeg svlandeg added bug Bugs and behaviour differing from documentation and removed enhancement Feature requests and improvements labels May 17, 2022
rmitsch and others added 3 commits May 18, 2022 09:33
…ls: reverting some black formatting changes.

Co-authored-by: Sofie Van Landeghem <[email protected]>
…ls: removing return type hint.

Co-authored-by: Sofie Van Landeghem <[email protected]>
@rmitsch
Copy link
Contributor Author

rmitsch commented May 20, 2022

Comments/suggestions have been addressed.

Copy link
Member

@svlandeg svlandeg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking into this in a bit more detail, I'm not sure this is the path we want to take. For one, it makes the implementation of the _cli function quite ugly and error-prone. Eg. having to think about using optional_args instead of the parameters directly -> I could see this leading to errors in the future.

I've started wondering about the original issue as well - I'm not convinced that functions with _cli in their name should work from a Python call. We definitely don't advertise this API anywhere.

However, the problem remains that even if users call into init_config and save_config directly, as you'd probably want to recommend them to, then they don't get access to the default values that you'd get from the CLI.

The current implementation does not repeat the default values in e.g. init_config, as this also avoids them going out-of-sync across functions. But what if we define some of these default variables as a global on top? Then both functions can access them, they'd always be in sync, and we'd have functions that you can call directly from within Python (though not the _cli ones)

@rmitsch
Copy link
Contributor Author

rmitsch commented May 20, 2022

Looking into this in a bit more detail...

  • Ugliness and error proneness of optional_args: agreed. The original intention was to not do this for each argument manually. A possible alternative would be to wrap the arguments in a function forcing the default value where necessary:

        ...
        config = init_config(
            lang=get_typer_option_value(lang),
           ...
        )

    It's not much better, but at least avoids a parallel data store for argument values.

  • Support of _cli calls from Python: it's IMO totally legitimate not to support that.

  • Default values for init_config(), save_config(): agreed, we shouldn't repeat default values. I'm not a huge fan of setting the default values as globals, but to be fair, I don't have a better idea.

So tl;dr: I think your suggestion of making the default values global is the least bad way to go forward. Should we bundle them in an object so they don't clutter the module namespace too much, like this?

class InitDefaultValues:
    output_file = "-"
    pretraining = False
    ...
    

…ed forcing of default values for optional arguments in init_config_cli(). Added default values for init_config(). Synchronized default values for init_config_cli() and init_config().
@rmitsch
Copy link
Contributor Author

rmitsch commented May 23, 2022

Removed forcing of optional default values, added synchronized default values for init_config_cli() and init_cli(). I.e. calling init_config_cli() from Python is not fully supported in this version, since the original issue with optional arguments hasn't been resolved. We can alternatively recommend using init_cli() for calls with optional arguments though, since it includes default values now.

spacy/cli/init_config.py Outdated Show resolved Hide resolved
@svlandeg svlandeg self-requested a review May 31, 2022 22:46
Copy link
Member

@svlandeg svlandeg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Removed some of the introduced newlines and just had one more question on the test.

spacy/cli/init_config.py Outdated Show resolved Hide resolved
spacy/cli/init_config.py Outdated Show resolved Hide resolved
spacy/tests/test_cli.py Outdated Show resolved Hide resolved
spacy/tests/test_cli.py Outdated Show resolved Hide resolved
spacy/cli/init_config.py Outdated Show resolved Hide resolved
Copy link
Member

@svlandeg svlandeg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be good to merge once the CI goes green!

@rmitsch
Copy link
Contributor Author

rmitsch commented Jun 17, 2022

@svlandeg I think we can merge this?

@svlandeg svlandeg merged commit a7f6bc5 into explosion:master Jun 17, 2022
danieldk added a commit that referenced this pull request Jun 27, 2022
* Add "Aim-spaCy" to spaCy Universe (#10943)

* Add Aim-spaCy to spaCy universe

* Update Aim thumbnail

* Fix author links

Co-authored-by: Paul O'Leary McCann <[email protected]>

* Auto-format code with black (#10945)

Co-authored-by: explosion-bot <[email protected]>

* precomputable_biaffine: avoid concatenation (#10911)

The `forward` of `precomputable_biaffine` performs matrix multiplication
and then `vstack`s the result with padding. This creates a temporary
array used for the output of matrix concatenation.

This change avoids the temporary by pre-allocating an array that is
large enough for the output of matrix multiplication plus padding and
fills the array in-place.

This gave me a small speedup (a bit over 100 WPS) on de_core_news_lg on
M1 Max (after changing thinc-apple-ops to support in-place gemm as BLIS
does).

* Add failing test: `test_matcher_extension_in_set_predicate` (#10948)

* vectors: remove use of float as row number (#10955)

The float -1 was returned rather than the integer -1 as the row for
unknown keys. This doesn't introduce a realy bug, since such floats
cast (without issues) to int in the conversion to NumPy arrays. Still,
it's nice to to do the correct thing :).

* Update for CBlas changes in Thinc 8.1.0.dev2 (#10970)

* Workaround for Typer optional default values with Python calls (#10788)

* Workaround for Typer optional default values with Python calls: added test and workaround.

* @rmitsch Workaround for Typer optional default values with Python calls: reverting some black formatting changes.

Co-authored-by: Sofie Van Landeghem <[email protected]>

* @rmitsch Workaround for Typer optional default values with Python calls: removing return type hint.

Co-authored-by: Sofie Van Landeghem <[email protected]>

* Workaround for Typer optional default values with Python calls: fixed imports, added GitHub issue marker.

* Workaround for Typer optional default values with Python calls: removed forcing of default values for optional arguments in init_config_cli(). Added default values for init_config(). Synchronized default values for init_config_cli() and init_config().

* Workaround for Typer optional default values with Python calls: removed unused import.

* Workaround for Typer optional default values with Python calls: fixed usage of optimize in init_config_cli().

* Workaround for Typer optional default values with Pythhon calls: remove output_file from InitDefaultValues.

* Workaround for Typer optional default values with Python calls: rename class for default init values.

* Workaround for Typer optional default values with Python calls: remove newline.

* remove introduced newlines

* Remove test_init_config_from_python_without_optional_args().

* remove leftover import

* reformat import

* remove duplicate

Co-authored-by: Sofie Van Landeghem <[email protected]>

* Made _initialize_X() methods private. (#10978)

* Auto-format code with black (#10977)

Co-authored-by: explosion-bot <[email protected]>

* account for NER labels with a hyphen in the name (#10960)

* account for NER labels with a hyphen in the name

* cleanup

* fix docstring

* add return type to helper method

* shorter method and few more occurrences

* user helper method across repo

* fix circular import

* partial revert to avoid circular import

* `enable` argument for spacy.load() (#10784)

* Enable flag on spacy.load: foundation for include, enable arguments.

* Enable flag on spacy.load: fixed tests.

* Enable flag on spacy.load: switched from pretrained model to empty model with added pipes for tests.

* Enable flag on spacy.load: switched to more consistent error on misspecification of component activity. Test refactoring. Added  to default config.

* Enable flag on spacy.load: added support for fields not in pipeline.

* Enable flag on spacy.load: removed serialization fields from supported fields.

* Enable flag on spacy.load: removed 'enable' from config again.

* Enable flag on spacy.load: relaxed checks in _resolve_component_activation_status() to allow non-standard pipes.

* Enable flag on spacy.load: fixed relaxed checks for _resolve_component_activation_status() to allow non-standard pipes. Extended tests.

* Enable flag on spacy.load: comments w.r.t. resolution workarounds.

* Enable flag on spacy.load: remove include fields. Update website docs.

* Enable flag on spacy.load: updates w.r.t. changes in master.

* Implement Doc.from_json(): update docstrings.

Co-authored-by: Adriane Boyd <[email protected]>

* Implement Doc.from_json(): remove newline.

Co-authored-by: Adriane Boyd <[email protected]>

* Implement Doc.from_json(): change error message for E1038.

Co-authored-by: Adriane Boyd <[email protected]>

* Enable flag on spacy.load: wrapped docstring for _resolve_component_status() at 80 chars.

* Enable flag on spacy.load: changed exmples for enable flag.

* Remove newline.

Co-authored-by: Sofie Van Landeghem <[email protected]>

* Fix docstring for Language._resolve_component_status().

* Rename E1038 to E1042.

Co-authored-by: Adriane Boyd <[email protected]>
Co-authored-by: Sofie Van Landeghem <[email protected]>

* add counts to verbose list of NER labels (#10957)

* Update linguistic-features.md (#10993)

Change link for downloading fasttext word vectors

* Use thinc-apple-ops>=0.1.0.dev0 with `apple` extras (#10904)

* Use thinc-apple-ops>=0.1.0.dev0 with `apple` extras

Also test with thinc-apple-ops that is at least 0.1.0.dev0.

* Check thinc-apple-ops on macOS with Python 3.10

Co-authored-by: Adriane Boyd <[email protected]>

* Use `pip install --pre` for installing thinc-apple-ops in CI

Co-authored-by: Adriane Boyd <[email protected]>

Co-authored-by: Gor Arakelyan <[email protected]>
Co-authored-by: Paul O'Leary McCann <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: explosion-bot <[email protected]>
Co-authored-by: Madeesh Kannan <[email protected]>
Co-authored-by: Raphael Mitsch <[email protected]>
Co-authored-by: Sofie Van Landeghem <[email protected]>
Co-authored-by: Adriane Boyd <[email protected]>
Co-authored-by: Victoria <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bugs and behaviour differing from documentation feat / cli Feature: Command-line interface
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Init Config from Python: typer.Option default value not working
3 participants