Merge pull request #15 from 4dn-dcic/ff_utils_docs
Changes from 4DN meeting and also docs
Showing 6 changed files with 305 additions and 30 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,9 @@ | ||
# utils | ||
Various utility modules shared amongst several projects in the 4DN-DCIC. | ||
This repository contains various utility modules shared amongst several projects in the 4DN-DCIC. It is meant to be used internally by the DCIC team and externally as a Python API to [Fourfront](https://data.4dnucleome.org), the 4DN data portal. | ||
|
||
pip installable with: `pip install dcicutils` | ||
pip installable as the `dcicutils` package with: `pip install dcicutils` | ||
|
||
See [this document](./docs/getting_started.md) for tips on getting started. [Go here](./docs/examples.md) for examples of some of the most useful functions. | ||
|
||
[![Build Status](https://travis-ci.org/4dn-dcic/utils.svg?branch=master)](https://travis-ci.org/4dn-dcic/utils) | ||
[![Coverage](https://coveralls.io/repos/github/4dn-dcic/utils/badge.svg?branch=master)](https://coveralls.io/github/4dn-dcic/utils?branch=master) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
"""Version information.""" | ||
|
||
# The following line *must* be the last in the module, exactly as formatted: | ||
__version__ = "0.2.5" | ||
__version__ = "0.2.6" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,96 @@ | ||
# Example usage of dcicutils functions | ||
|
||
See [getting started](./getting_started.md) for help with getting up and running with dcicutils. | ||
|
||
As a first step, we will import our modules from the dcicutils package. | ||
|
||
``` | ||
from dcicutils import ff_utils | ||
``` | ||
|
||
### <a name="key"></a>Making your key | ||
|
||
Authentication differs depending on whether you are an external user or an internal 4DN team member. If you are an external user, create a Python dictionary called `key` using your access key. This will be used in the examples below. | ||
|
||
``` | ||
key = {'key': <YOUR KEY>, 'secret': <YOUR SECRET>, 'server': 'https://data.4dnucleome.org/'} | ||
``` | ||
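
Before using a hand-built `key` dictionary, it can help to confirm that all three fields are present. The following is a minimal sketch in plain Python, not part of dcicutils: `REQUIRED_FIELDS`, `check_key`, and the placeholder credentials are hypothetical names and values chosen for illustration.

```python
# Hypothetical helper, not part of dcicutils: checks the shape of a key dictionary.
REQUIRED_FIELDS = ('key', 'secret', 'server')


def check_key(key):
    """Return the names of required fields that are missing or empty."""
    return [field for field in REQUIRED_FIELDS if not key.get(field)]


# Placeholder values for illustration only; substitute your own access key.
example_key = {
    'key': 'ABCDEFG',
    'secret': 'example-secret',
    'server': 'https://data.4dnucleome.org/',
}

missing = check_key(example_key)
if missing:
    print('Missing fields: ' + ', '.join(missing))
else:
    print('Key dictionary has all required fields')
```

An empty list from `check_key` only means the dictionary has the expected shape; it does not verify that the credentials are accepted by the server.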
|
||
If you are an internal user, you may simply pass the Fourfront environment name as a string to the metadata functions to get administrator access. For faster requests, or if you want to emulate another user, you can also pass in keys manually. The examples below use `key`, but could also use `ff_env`. The following snippet assumes you want to use the `data` Fourfront environment. | ||
|
||
``` | ||
key = ff_utils.get_authentication_with_server(ff_env='data') | ||
``` | ||
|
||
### <a name="metadata"></a>Examples for metadata functions | ||
|
||
You can use `get_metadata` to get the metadata for a single object. It returns a dictionary of metadata on a successful get request. In our example, we get a publicly available HEK293 biosource, which has an internal accession of 4DNSRVF4XB1F. | ||
|
||
``` | ||
metadata = ff_utils.get_metadata('4DNSRVF4XB1F', key=key) | ||
# the response is a python dictionary | ||
metadata['accession'] == '4DNSRVF4XB1F' | ||
>> True | ||
``` | ||
|
||
To post new data to the system, use the `post_metadata` function. You need to provide the body of data you want to post, as well as the schema name for the object. We want to post a fastq file. | ||
|
||
``` | ||
post_body = { | ||
'file_format': 'fastq', | ||
'lab': '/labs/4dn-dcic-lab/', | ||
'award': '/awards/1U01CA200059-01/' | ||
} | ||
response = ff_utils.post_metadata(post_body, 'file_fastq', key=key) | ||
# response is a dictionary containing info about your post | ||
response['status'] | ||
>> 'success' | ||
# the dictionary body of the metadata object created is in response['@graph'] | ||
metadata = response['@graph'][0] | ||
``` | ||
|
||
|
||
If you want to edit data, use the `patch_metadata` function. Let's say that the fastq file you just made has an accession of `4DNFIP74UWGW` and we want to add a description to it. | ||
|
||
``` | ||
patch_body = {'description': 'My cool fastq file'} | ||
# you can explicitly pass the object ID (in this case accession)... | ||
response = ff_utils.patch_metadata(patch_body, '4DNFIP74UWGW', key=key) | ||
# or you can include the ID in the data you patch | ||
patch_body['accession'] = '4DNFIP74UWGW' | ||
response = ff_utils.patch_metadata(patch_body, key=key) | ||
# the response has the same format as in post_metadata | ||
metadata = response['@graph'][0] | ||
``` | ||
|
||
Similar to `post_metadata` you can "UPSERT" metadata, which will perform a POST if the metadata doesn't yet exist within the system and will PATCH if it does. The `upsert_metadata` function takes the exact same arguments as `post_metadata` but will not raise an error on a metadata conflict. | ||
|
||
``` | ||
upsert_body = { | ||
'file_format': 'fastq', | ||
'lab': '/labs/4dn-dcic-lab/', | ||
'award': '/awards/1U01CA200059-01/', | ||
'accession': '4DNFIP74UWGW' | ||
} | ||
# this will POST if file 4DNFIP74UWGW does not exist and will PATCH if it does | ||
response = ff_utils.upsert_metadata(upsert_body, 'file_fastq', key=key) | ||
# the response has the same format as in post_metadata | ||
metadata = response['@graph'][0] | ||
``` | ||
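
The POST-or-PATCH behavior described above can be pictured with a small in-memory sketch. Nothing here calls Fourfront or dcicutils: `store` and `upsert` are hypothetical names, and the sketch assumes that objects are identified by their `accession` field.

```python
# Illustrative only: an in-memory stand-in for the portal, keyed by accession.
def upsert(store, body):
    """Create the object if its accession is unknown, otherwise update it.

    Returns 'POST' when a new object was created and 'PATCH' when an
    existing object was updated.
    """
    accession = body['accession']
    if accession in store:
        # The object already exists: merge the new fields into it.
        store[accession].update(body)
        return 'PATCH'
    # The object does not exist yet: store a copy of the body.
    store[accession] = dict(body)
    return 'POST'


store = {}
first = upsert(store, {'accession': '4DNFIP74UWGW', 'file_format': 'fastq'})
second = upsert(store, {'accession': '4DNFIP74UWGW',
                        'description': 'My cool fastq file'})
print(first, second)
```

The first call creates the object and the second updates it in place, which is why an upsert does not fail when the accession already exists.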
|
||
You can use `search_metadata` to search through metadata in Fourfront. This function takes a search URL string starting with 'search', as well as the same authorization information as the other metadata functions. It returns a list of metadata results. Optionally, the `page_limit` parameter can be used to adjust the page size used by the underlying generator that fetches search results. | ||
|
||
``` | ||
# let's search for all biosamples | ||
# hits is a list of metadata dictionaries | ||
hits = ff_utils.search_metadata('search/?type=Biosample', key=key) | ||
# you can also specify a limit on the number of results for your search | ||
# other valid query params are also allowed, including sorts and filters | ||
hits = ff_utils.search_metadata('search/?type=Biosample&limit=10', key=key) | ||
``` |
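
Search strings like the ones above can also be assembled programmatically, which avoids hand-escaping query values. The following is a minimal sketch using only the Python 3 standard library; `build_search` is a hypothetical helper, not a dcicutils function, and any filter names passed to it are assumed to be valid query parameters on the portal.

```python
from urllib.parse import urlencode


def build_search(item_type, limit=None, **filters):
    """Build a search string starting with 'search/' for the given item type."""
    params = [('type', item_type)]
    # Sort the extra filters so the resulting string is deterministic.
    params.extend(sorted(filters.items()))
    if limit is not None:
        params.append(('limit', limit))
    return 'search/?' + urlencode(params)


query = build_search('Biosample', limit=10)
print(query)  # search/?type=Biosample&limit=10
```

The resulting string could then be passed as the first argument to `search_metadata` in place of a hand-written query.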
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
# Getting started | ||
|
||
The dcicutils package contains a number of utility functions that are useful for internal use (both infrastructure and scripting) and for external users. Before getting into the functions themselves, we will go over how to set up your authentication as both an internal DCIC user and an external user. | ||
|
||
First, install dcicutils using pip. Python 2.7 and 3.x are supported. | ||
|
||
`pip install dcicutils` | ||
|
||
### Internal DCIC set up | ||
|
||
To fully utilize the utilities, you should have your AWS credentials set up. You should also set the `SECRET` environment variable, which is needed for decrypting the administrator access keys stored on Amazon S3. If you would rather not set these up, using a local administrator access key generated from Fourfront is also an option; see the instructions for external set up below. | ||
|
||
### External set up | ||
|
||
The utilities require an access key, which is generated using your user account on Fourfront. If you do not yet have an account, the first step is to [request one](https://data.4dnucleome.org/help/user-guide/account-creation). Once your account is set up and you are logged in, you can generate an access key on your [user information page](https://data.4dnucleome.org/me). Make sure to take note of the information generated when you make an access key. Store it in a safe place, because it will be needed when you make a request to Fourfront. | ||
|
||
The main format of the authorization used for the utilities is: | ||
|
||
`{'key': <YOUR KEY>, 'secret': <YOUR SECRET>, 'server': 'https://data.4dnucleome.org/'}` | ||
|
||
You can replace the `server` value with another Fourfront environment if you have an access key made on that environment. | ||
|
||
### Central metadata functions | ||
|
||
The most useful utility functions for most users are the metadata functions, which are generally used to access, create, or edit object metadata on the Fourfront portal. Since this utilities module is a pip-installable Python package, these functions can be leveraged as an API to the portal in your scripts. All of them are contained within the `dcicutils.ff_utils` module. | ||
|
||
See example usage of these functions [here](./examples.md#metadata). |