Download BC5CDR #20

Merged (5 commits) on May 4, 2021

@JohnGiorgi commented May 4, 2021

Overview

The main purpose of this PR is to download BC5CDR as part of the preprocessing step, rather than requiring the user to provide a local copy. This is a better user experience, and it also simplifies something I am working on now (computing stats on these corpora).
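For readers unfamiliar with the pattern, here is a minimal sketch of a download-if-missing step. It is not the code in this PR: `fetch_corpus`, its parameters, and the `.part` suffix are hypothetical names chosen for illustration, and the caller supplies the corpus URL (none is assumed here).

```python
import urllib.request
from pathlib import Path
from typing import Callable


def fetch_corpus(
    url: str,
    cache_dir: Path,
    filename: str,
    retrieve: Callable[[str, str], object] = urllib.request.urlretrieve,
) -> Path:
    """Return a local path to the corpus file, downloading it only if absent.

    Hypothetical helper: `url`, `cache_dir`, and `filename` are supplied by
    the caller; `retrieve` can be swapped out (e.g. in tests) to avoid
    touching the network.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / filename
    if not target.exists():
        # Download to a temporary name first so an interrupted transfer
        # never leaves a partial file at the final path.
        partial = target.with_name(target.name + ".part")
        retrieve(url, str(partial))
        partial.replace(target)
    return target
```

Calling the function a second time with the same `cache_dir` and `filename` returns the cached path without downloading again, which is what removes the need for a user-provided local copy.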

Other changes

  • ♻️ Factors the conversion of Dict[str, PubTatorAnnotation] to a seq2rel-compatible format out into its own function.
  • 🏷️ Fixes many type hints. It is still not ready to turn mypy back on (see "Re-enable MyPy type checking in CI", #3), but it is getting there.

@JohnGiorgi self-assigned this May 4, 2021
@JohnGiorgi added the enhancement (New feature or request) label May 4, 2021
@JohnGiorgi merged commit 4247b75 into main May 4, 2021
@JohnGiorgi deleted the download-bc5cdr branch May 4, 2021 21:38