From 22d180f88051472cae66ced177d16c5e91b90557 Mon Sep 17 00:00:00 2001
From: Ana Trisovic
Date: Mon, 13 Jun 2022 16:27:19 -0400
Subject: [PATCH] add docs and set publish to false

closes #6
---
 CITATION.cff    |   7 ---
 CONTRIBUTING.md |   3 +-
 README.md       | 115 +++++++++++++++++++++++++++---------------------
 dataverse.py    |   2 +-
 4 files changed, 69 insertions(+), 58 deletions(-)
 delete mode 100644 CITATION.cff

diff --git a/CITATION.cff b/CITATION.cff
deleted file mode 100644
index 8174430..0000000
--- a/CITATION.cff
+++ /dev/null
@@ -1,7 +0,0 @@
-cff-version: 1.2.0
-authors:
-  - family-names: Trisovic
-    given-names: Ana
-    orcid: https://orcid.org/0000-0003-1991-0533
-title: "Dataverse Uploader GitHub Action"
-date-released: 2021-11-13
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 27f5c5f..4a31f41 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,6 +1,7 @@
 ## Testing locally
 
-To test your contribution locally run the Python script in the following format:
+To test your contribution locally from the command line, run the Python script in the following format:
+
 ```
 python dataverse.py DATAVERSE_TOKEN DATAVERSE_SERVER DATASET_DOI REPO_NAME
 ```
diff --git a/README.md b/README.md
index 64365c4..baf3b17 100644
--- a/README.md
+++ b/README.md
@@ -1,23 +1,29 @@
 # Dataverse Uploader
 
-This action uploads the repository content to a Dataverse dataset.
+This action automatically uploads GitHub repository content to a Dataverse dataset.
+It can upload the entire repository or a subdirectory of it into an existing dataset on a target
+Dataverse installation. The action is customizable, allowing you to fully replace a dataset,
+add to the dataset, and publish it or leave it as a draft version on Dataverse.
+
+The action provides some additional metadata to the dataset, such as the origin GitHub repository,
+and it preserves the directory tree structure.
 
 ## Input parameters
 
 To use this action, you will need the following input parameters:
 
-| Parameter | Required | Description |
-| --------- | -------- | ------------|
+| Parameter | Required | Description |
+| --------- | -------- | ----------- |
 | `DATAVERSE_TOKEN` | **Yes** | This is your personal access token that you can create at your Dataverse instance (see [the Dataverse guide](https://guides.dataverse.org/en/latest/user/account.html#how-to-create-your-api-token)). Save your token as a secret variable called `DATAVERSE_TOKEN` in your GitHub repository that you want to upload to Dataverse (see [the GitHub guide](https://docs.github.com/en/actions/security-guides/encrypted-secrets#creating-encrypted-secrets-for-a-repository)). |
-| `DATAVERSE_SERVER` | **Yes** | The URL of your Dataverse installation, i.e., [https://dataverse.harvard.edu](https://dataverse.harvard.edu). |
-| `DATAVERSE_DATASET_DOI` | **Yes** | This action requires that a dataset (with a DOI) exists on the Dataverse server. Make sure to specify your DOI in this format: `doi:`, i.e., `doi:10.70122/FK2/LVUA`. |
-| `GITHUB_DIR` | No | Use `GITHUB_DIR` if you would like to upload files from only a specific subdirectory in your GitHub repository (i.e., just `data/`). |
-| `DELETE` | No | Can be `True` or `False` (by default `True`) depending on whether all files should be deleted in the dataset on Dataverse before upload. |
-| `PUBLISH` | No | Can be `True` or `False` (by default `True`) depending on whether you'd like to automatically create a new version of the dataset upon upload. If `False`, the uploaded dataset will be a `DRAFT`. |
+| `DATAVERSE_SERVER` | **Yes** | The URL of your Dataverse installation, e.g., [https://dataverse.harvard.edu](https://dataverse.harvard.edu). |
+| `DATAVERSE_DATASET_DOI` | **Yes** | This action requires that a dataset (with a DOI) exists on the Dataverse server. Make sure to specify your DOI in this format: `doi:`, e.g., `doi:10.70122/FK2/LVUA`. |
+| `GITHUB_DIR` | No | Use `GITHUB_DIR` if you would like to upload files from only a specific subdirectory in your GitHub repository (e.g., just `data/`). |
+| `DELETE` | No | Can be `True` or `False` (by default `True`) depending on whether all files should be deleted in the dataset on Dataverse before upload. |
+| `PUBLISH` | No | Can be `True` or `False` (by default `False`) depending on whether you'd like to automatically create a new version of the dataset upon upload. If `False`, the uploaded dataset will be a `DRAFT`. |
 
 ## Usage
 
-To use the action, create a new YML file (i.e., `workflow.yml`) in the directory `.github/workflows/` in your GitHub repository.
+To use the action, create a new YAML file (e.g., `workflow.yml`) and place it in the directory `.github/workflows/` in your GitHub repository.
 The action workflow can be executed at trigger events such as `push` and `release`. If you'd only
 like to run the workflow manually from the Actions tab, use the `workflow_dispatch` event option.
 
@@ -33,64 +39,71 @@ jobs:
   build:
     runs-on: ubuntu-latest
     steps:
       - name: Send repo to Dataverse
-        uses: IQSS/dataverse-uploader@v1.1
+        uses: IQSS/dataverse-uploader@v1.2
         with:
           DATAVERSE_TOKEN: ${{secrets.DATAVERSE_TOKEN}}
           DATAVERSE_SERVER: https://demo.dataverse.org
           DATAVERSE_DATASET_DOI: doi:10.70122/FK2/LVUA
 ```
 
-If you'd like to upload files from a specific subdirectory only, you should add the `GITHUB_DIR` argument in your workflow.
+If you'd like to upload files from a specific subdirectory only (for instance, a `data` folder),
+you should add the `GITHUB_DIR` argument in your workflow, as follows:
 
 ```
-jobs:
-  build:
-    runs-on: ubuntu-latest
-    steps:
-      - name: Send repo to Dataverse
-        uses: IQSS/dataverse-uploader@v1.1
-        with:
-          DATAVERSE_TOKEN: ${{secrets.DATAVERSE_TOKEN}}
-          DATAVERSE_SERVER: https://demo.dataverse.org
-          DATAVERSE_DATASET_DOI: doi:10.70122/FK2/LVUA
-          GITHUB_DIR: data
+steps:
+  - name: Send repo to Dataverse
+    uses: IQSS/dataverse-uploader@v1.2
+    with:
+      DATAVERSE_TOKEN: ${{secrets.DATAVERSE_TOKEN}}
+      DATAVERSE_SERVER: https://demo.dataverse.org
+      DATAVERSE_DATASET_DOI: doi:10.70122/FK2/LVUA
+      GITHUB_DIR: data
 ```
 
-If you wouldn't want the action to delete your dataset before upload (i.e., if you already have a Dataverse `DRAFT` dataset), set the `DELETE` argument to `False` like:
+By default, the action will sync the GitHub repository and the Dataverse dataset, meaning that it will
+delete the Dataverse content before uploading the content from GitHub. If you don't want the action to
+delete your dataset before upload (e.g., if you already have a Dataverse `DRAFT` dataset),
+set the `DELETE` argument to `False`, as follows:
 
 ```
-jobs:
-  build:
-    runs-on: ubuntu-latest
-    steps:
-      - name: Send repo to Dataverse
-        uses: IQSS/dataverse-uploader@v1.1
-        with:
-          DATAVERSE_TOKEN: ${{secrets.DATAVERSE_TOKEN}}
-          DATAVERSE_SERVER: https://demo.dataverse.org
-          DATAVERSE_DATASET_DOI: doi:10.70122/FK2/LVUA
-          GITHUB_DIR: data
-          DELETE: False
+steps:
+  - name: Send repo to Dataverse
+    uses: IQSS/dataverse-uploader@v1.2
+    with:
+      DATAVERSE_TOKEN: ${{secrets.DATAVERSE_TOKEN}}
+      DATAVERSE_SERVER: https://demo.dataverse.org
+      DATAVERSE_DATASET_DOI: doi:10.70122/FK2/LVUA
+      GITHUB_DIR: data
+      DELETE: False
 ```
 
-Upon upload, the action will automatically publish a new version of the Dataverse dataset by default. If you'd like to create a new version manually, set the `PUBLISH` argument to `False`.
+The action automatically uploads new content to a Dataverse dataset, but it will not publish it as a
+new version by default. If you'd like the action to publish a new dataset version in Dataverse,
+set the `PUBLISH` argument to `True`.
 
 ```
-jobs:
-  build:
-    runs-on: ubuntu-latest
-    steps:
-      - name: Send repo to Dataverse
-        uses: IQSS/dataverse-uploader@v1.1
-        with:
-          DATAVERSE_TOKEN: ${{secrets.DATAVERSE_TOKEN}}
-          DATAVERSE_SERVER: https://demo.dataverse.org
-          DATAVERSE_DATASET_DOI: doi:10.70122/FK2/LVUA
-          GITHUB_DIR: data
-          DELETE: False
-          PUBLISH: False
+steps:
+  - name: Send repo to Dataverse
+    uses: IQSS/dataverse-uploader@v1.2
+    with:
+      DATAVERSE_TOKEN: ${{secrets.DATAVERSE_TOKEN}}
+      DATAVERSE_SERVER: https://demo.dataverse.org
+      DATAVERSE_DATASET_DOI: doi:10.70122/FK2/LVUA
+      GITHUB_DIR: data
+      DELETE: False
+      PUBLISH: True
 ```
 
+## Q&A
+
+> If you change the content of your GitHub repository, are the changes synchronized in Dataverse?
+> Otherwise, is it possible to synchronize them automatically?
+
+Yes, the action is able to automatically update the Dataverse dataset. In other words, if the action
+is triggered with every `push` to the GitHub repository, it will automatically upload its content to
+Dataverse. You specify the action triggers in the workflow (`.yml`) file; in this case, it would
+contain an `on: push` line to execute the action on every push to the repository.
+
 ## Related projects
 
 Check out the following related projects:
@@ -105,6 +118,10 @@ Visit [this page](https://atrisovic.github.io/dataverse-badge/) to create a Data
 Looking for a stand-alone Dataverse uploader that will work from the command line? Check out
 [DVUploader](https://github.com/GlobalDataverseCommunityConsortium/dataverse-uploader).
 
+## References
+
+See the official Dataverse documentation [here](https://guides.dataverse.org/en/latest/admin/integrations.html#id10).
+
 ## Contact
 
 Don't hesitate to create an issue, a pull request, or contact us if you notice any problems with the action.
diff --git a/dataverse.py b/dataverse.py
index 4c695dc..643f39d 100644
--- a/dataverse.py
+++ b/dataverse.py
@@ -25,7 +25,7 @@ def parse_arguments():
     parser.add_argument(
         "-p", "--publish", help="Publish a new dataset version after upload.", \
         choices=('True', 'TRUE', 'true', 'False', 'FALSE', 'false'), \
-        default='true')
+        default='false')
     args_ = parser.parse_args()
     return args_
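
A note for reviewers of the `dataverse.py` change above: because GitHub Actions passes workflow inputs as strings, `--publish` takes string booleans (`'true'`, `'False'`, etc.) rather than a Python `bool`, and this patch flips its default to the string `'false'`. A minimal sketch of that pattern is below; the `build_parser` and `should_publish` helpers are illustrative only, not names from the repository:

```python
import argparse

def build_parser():
    # Mirrors the --publish flag touched by this patch: string-valued
    # booleans, now defaulting to 'false' (leave the dataset as a DRAFT).
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-p", "--publish",
        help="Publish a new dataset version after upload.",
        choices=('True', 'TRUE', 'true', 'False', 'FALSE', 'false'),
        default='false')
    return parser

def should_publish(argv):
    # Hypothetical helper: normalize the case-insensitive string to a bool.
    args = build_parser().parse_args(argv)
    return args.publish.lower() == 'true'

print(should_publish([]))                     # → False (default: draft only)
print(should_publish(['--publish', 'True']))  # → True
```

Restricting values with `choices` means a typo such as `--publish yes` fails fast with an argparse error instead of being silently treated as "don't publish".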