✨ Set default command for each dataset #66

Open. Wants to merge 2 commits into `main`.
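
The decorator swap made in this PR, from `@app.command()` to `@app.callback(invoke_without_command=True)`, can be illustrated with a minimal, self-contained sketch. It assumes each dataset module exposes its own `typer.Typer()` instance that is attached to a top-level app with `add_typer`, which the diff does not show. The names `cli`, `old_app`, `new_app`, and `default`, and the `echo` bodies, are hypothetical and not taken from the repository. `CliRunner` is used only to exercise both layouts in-process.

```python
from pathlib import Path

import typer
from typer.testing import CliRunner

# Hypothetical top-level app standing in for the `seq2rel-ds` entry point.
cli = typer.Typer()

# Before: the dataset sub-app registers `main` as a named subcommand,
# so callers have to spell out `main` on the command line.
old_app = typer.Typer()


@old_app.command()
def main(
    output_dir: Path = typer.Argument(..., help="Directory path to save the preprocessed data."),
) -> None:
    typer.echo(f"old: output_dir={output_dir}")


# After: the function is registered as the sub-app's callback, and
# invoke_without_command=True lets it run when no subcommand name is given.
new_app = typer.Typer()


@new_app.callback(invoke_without_command=True)
def default(
    output_dir: Path = typer.Argument(..., help="Directory path to save the preprocessed data."),
    sort_rels: bool = typer.Option(
        True, help="Sort relations according to order of first appearance."
    ),
) -> None:
    typer.echo(f"new: output_dir={output_dir} sort_rels={sort_rels}")


cli.add_typer(old_app, name="old")
cli.add_typer(new_app, name="new")

runner = CliRunner()
# Old layout: the subcommand name `main` is required.
old_result = runner.invoke(cli, ["old", "main", "path/to/cdr"])
# New layout: the dataset name alone is enough.
new_result = runner.invoke(cli, ["new", "path/to/cdr"])
# In this sketch the option is placed before the positional argument.
flag_result = runner.invoke(cli, ["new", "--no-sort-rels", "path/to/cdr"])

print(old_result.output, new_result.output, flag_result.output, sep="")
```

Run as a script, the sketch prints one line per invocation, which makes it easy to compare the two layouts side by side.
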
4 changes: 1 addition & 3 deletions README.md
@@ -50,11 +50,9 @@ seq2rel-ds --help
 To preprocess a dataset (and in most cases, download it), call one of the commands, e.g.

 ```bash
-seq2rel-ds preprocess cdr main "path/to/cdr"
+seq2rel-ds cdr "path/to/cdr"
 ```

-> Note, you have to include `main` because [`typer`](https://typer.tiangolo.com/) does not support default commands.
-
 This will create the preprocessed `tsv` files under the specified output directory, e.g.

 ```
2 changes: 1 addition & 1 deletion seq2rel_ds/cdr.py
@@ -124,7 +124,7 @@ def _preprocess(
     return seq2rel_annotations


-@app.command()
+@app.callback(invoke_without_command=True)
 def main(
     output_dir: Path = typer.Argument(..., help="Directory path to save the preprocessed data."),
     sort_rels: bool = typer.Option(
3 changes: 2 additions & 1 deletion seq2rel_ds/dgm.py
@@ -87,7 +87,7 @@ def _preprocess(
     return seq2rel_annotations


-@app.command()
+@app.callback(invoke_without_command=True)
 def main(
     input_dir: Path = typer.Argument(..., help="Path to a local copy of the DGM corpus."),
     output_dir: Path = typer.Argument(..., help="Directory path to save the preprocessed data."),
@@ -111,6 +111,7 @@ def main(
     ),
 ) -> None:
     """Download and preprocess the DGM corpus for use with seq2rel.
+
     The corpus can be downloaded at: https://hanover.azurewebsites.net/downloads/naacl2019.aspx.
     Provide this path as argument `input_dir` to this command. More details about the corpus can be
     found here: https://arxiv.org/abs/1904.02347. Note that we use the paragraph-length text.
8 changes: 6 additions & 2 deletions seq2rel_ds/docred.py
@@ -83,14 +83,18 @@ def _preprocess(
     return seq2rel_annotations


-@app.command()
+@app.callback(invoke_without_command=True)
 def main(
     output_dir: Path = typer.Argument(..., help="Directory path to save the preprocessed data."),
     sort_rels: bool = typer.Option(
         True, help="Sort relations according to order of first appearance."
     ),
 ) -> None:
-    """Download and preprocess the DocRED corpus for use with seq2rel."""
+    """Download and preprocess the DocRED corpus for use with seq2rel.
+
+    This is the end-to-end split provided by https://arxiv.org/abs/2102.05980, which can be
+    accessed here: http://lavis.cs.hs-rm.de/storage/jerex/public/datasets/docred_joint/.
+    """
     msg.divider("Preprocessing DocRED")

     with msg.loading("Downloading corpus..."):
2 changes: 1 addition & 1 deletion seq2rel_ds/gda.py
@@ -106,7 +106,7 @@ def _preprocess(
     return seq2rel_annotations


-@app.command()
+@app.callback(invoke_without_command=True)
 def main(
     output_dir: Path = typer.Argument(..., help="Directory path to save the preprocessed data."),
     sort_rels: bool = typer.Option(