
Commit 22d180f

add docs and set publish to false

closes #6

atrisovic committed Jun 13, 2022
1 parent 88c38e4 commit 22d180f
Showing 4 changed files with 69 additions and 58 deletions.
7 changes: 0 additions & 7 deletions CITATION.cff

This file was deleted.

3 changes: 2 additions & 1 deletion CONTRIBUTING.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
## Testing locally

-To test your contribution locally run the Python script in the following format:
+To test your contribution locally from the command line, run the Python script in the following format:

```
python dataverse.py DATAVERSE_TOKEN DATAVERSE_SERVER DATASET_DOI REPO_NAME
```
115 changes: 66 additions & 49 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,23 +1,29 @@
# Dataverse Uploader

-This action uploads the repository content to a Dataverse dataset.
+This action automatically uploads GitHub repository content to a Dataverse dataset.
It can upload the entire repository or its subdirectory into an existing dataset on a target
Dataverse installation. The action is customizable, allowing you to fully replace a dataset,
add to the dataset, publish it or leave it as a draft version on Dataverse.

The action provides some additional metadata to the dataset, such as the origin GitHub repository,
and it preserves the directory tree structure.

## Input parameters

To use this action, you will need the following input parameters:

| Parameter | Required | Description |
| --------- | -------- | ----------- |
| `DATAVERSE_TOKEN` | **Yes** | This is your personal access token that you can create at your Dataverse instance (see [the Dataverse guide](https://guides.dataverse.org/en/latest/user/account.html#how-to-create-your-api-token)). Save your token as a secret variable called `DATAVERSE_TOKEN` in the GitHub repository that you want to upload to Dataverse (see [the GitHub guide](https://docs.github.com/en/actions/security-guides/encrypted-secrets#creating-encrypted-secrets-for-a-repository)). |
| `DATAVERSE_SERVER` | **Yes** | The URL of your Dataverse installation, e.g., [https://dataverse.harvard.edu](https://dataverse.harvard.edu). |
| `DATAVERSE_DATASET_DOI` | **Yes** | This action requires that a dataset (with a DOI) exists on the Dataverse server. Make sure to specify your DOI in this format: `doi:<doi>`, e.g., `doi:10.70122/FK2/LVUA`. |
| `GITHUB_DIR` | No | Use `GITHUB_DIR` if you would like to upload files from only a specific subdirectory in your GitHub repository (e.g., just `data/`). |
| `DELETE` | No | Can be `True` or `False` (`True` by default) depending on whether all files should be deleted in the dataset on Dataverse before upload. |
-| `PUBLISH` | No | Can be `True` or `False` (by default `True`) depending on whether you'd like to automatically create a new version of the dataset upon upload. If `False`, the uploaded dataset will be a `DRAFT`. |
+| `PUBLISH` | No | Can be `True` or `False` (by default `False`) depending on whether you'd like to automatically create a new version of the dataset upon upload. If `False`, the uploaded dataset will be a `DRAFT`. |

## Usage

-To use the action, create a new YML file (i.e., `workflow.yml`) in the directory `.github/workflows/` in your GitHub repository.
+To use the action, create a new YAML file (e.g., `workflow.yml`) and place it in the directory `.github/workflows/` of your GitHub repository.

The action workflow can be executed at trigger events such as `push` and `release`. If you'd only like to run the workflow manually from the Actions tab, use the `workflow_dispatch` event option.

@@ -33,64 +39,71 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - name: Send repo to Dataverse
-       uses: IQSS/dataverse-uploader@v1.1
+       uses: IQSS/dataverse-uploader@v1.2
        with:
          DATAVERSE_TOKEN: ${{secrets.DATAVERSE_TOKEN}}
          DATAVERSE_SERVER: https://demo.dataverse.org
          DATAVERSE_DATASET_DOI: doi:10.70122/FK2/LVUA
```

-If you'd like to upload files from a specific subdirectory only, you should add the `GITHUB_DIR` argument in your workflow.
+If you'd like to upload files from a specific subdirectory only (for instance, a `data` folder),
+you should add the `GITHUB_DIR` argument in your workflow, as follows:

```
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Send repo to Dataverse
        uses: IQSS/dataverse-uploader@v1.2
        with:
          DATAVERSE_TOKEN: ${{secrets.DATAVERSE_TOKEN}}
          DATAVERSE_SERVER: https://demo.dataverse.org
          DATAVERSE_DATASET_DOI: doi:10.70122/FK2/LVUA
          GITHUB_DIR: data
```

-If you wouldn't want the action to delete your dataset before upload (i.e., if you already have a Dataverse `DRAFT` dataset), set the `DELETE` argument to `False` like:
+By default, the action will sync the GitHub repository and the Dataverse dataset, meaning that it will
+delete the Dataverse content before uploading the content from GitHub. If you don't want the action to
+delete your dataset before upload (e.g., if you already have a Dataverse `DRAFT` dataset),
+set the `DELETE` argument to `False`, like this:

```
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Send repo to Dataverse
        uses: IQSS/dataverse-uploader@v1.2
        with:
          DATAVERSE_TOKEN: ${{secrets.DATAVERSE_TOKEN}}
          DATAVERSE_SERVER: https://demo.dataverse.org
          DATAVERSE_DATASET_DOI: doi:10.70122/FK2/LVUA
          GITHUB_DIR: data
          DELETE: False
```

-Upon upload, the action will automatically publish a new version of the Dataverse dataset by default. If you'd like to create a new version manually, set the `PUBLISH` argument to `False`.
+The action automatically uploads new content to a Dataverse dataset, but it will not publish it as a
+new version by default. If you'd like the action to publish a new dataset version in Dataverse,
+set the `PUBLISH` argument to `True`:

```
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Send repo to Dataverse
        uses: IQSS/dataverse-uploader@v1.2
        with:
          DATAVERSE_TOKEN: ${{secrets.DATAVERSE_TOKEN}}
          DATAVERSE_SERVER: https://demo.dataverse.org
          DATAVERSE_DATASET_DOI: doi:10.70122/FK2/LVUA
          GITHUB_DIR: data
          DELETE: False
-         PUBLISH: False
+         PUBLISH: True
```

## Q&A

> If you change the content of your GitHub repository, are the changes synchronized in Dataverse?
> Otherwise, is it possible to synchronize them automatically?
Yes, the action can automatically update the Dataverse dataset. In other words, if the action
is triggered on every `push` to the GitHub repository, it will automatically upload the repository
content to Dataverse. You specify the action triggers in the workflow (`.yml`) file; in this case, it
would contain the line `on: push` to execute the action on every push to the repository.
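As a sketch, a trigger block that runs the action on every push (and also allows manual runs from the Actions tab) might look like this; the exact events you choose are up to you:

```
# Hypothetical trigger section for .github/workflows/workflow.yml:
# run on every push, and allow manual runs via workflow_dispatch.
on:
  push:
  workflow_dispatch:
```

With `on: push` alone, every commit to any branch re-uploads the repository content to Dataverse.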

## Related projects

Check out the following related projects:
@@ -105,6 +118,10 @@ Visit [this page](https://atrisovic.github.io/dataverse-badge/) to create a Data

Looking for a stand-alone Dataverse uploader that will work from the command line? Check out [DVUploader](https://github.com/GlobalDataverseCommunityConsortium/dataverse-uploader).

+## References
+
+See the official Dataverse documentation [here](https://guides.dataverse.org/en/latest/admin/integrations.html#id10).

## Contact

Don't hesitate to create an issue, a pull request, or contact us if you notice any problems with the action.
2 changes: 1 addition & 1 deletion dataverse.py
Original file line number Diff line number Diff line change
@@ -25,7 +25,7 @@ def parse_arguments():
parser.add_argument(
"-p", "--publish", help="Publish a new dataset version after upload.", \
choices=('True', 'TRUE', 'true', 'False', 'FALSE', 'false'), \
-        default='true')
+        default='false')

args_ = parser.parse_args()
return args_
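Since argparse accepts the `--publish` flag as a string in any casing of `True`/`False`, downstream code presumably normalizes it to a boolean before deciding whether to publish. A minimal sketch of that conversion (the helper name is hypothetical, not part of the script shown above):

```python
def str_to_bool(value: str) -> bool:
    """Normalize a 'True'/'TRUE'/'true'-style string flag to a boolean."""
    return value.lower() == 'true'

# With the new default of 'false', publishing becomes opt-in:
print(str_to_bool('false'))  # False
print(str_to_bool('TRUE'))   # True
```

Accepting the casing variants in `choices` and lowercasing once keeps the comparison in a single place.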
