-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor interface #88
Conversation
formatter_class = argparse.ArgumentDefaultsHelpFormatter | ||
) | ||
|
||
parser.add_argument('path', help='One or more paths of files to validate', type=str, nargs='+') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a critical point, but for input paths I generally use something like this extant_file
type rather than str
.
def extant_file(x):
"""
'Type' for argparse - checks that file exists but does not open.
"""
if not os.path.exists(x):
# Argparse uses the ArgumentTypeError to give a rejection message like:
# error: argument input: x does not exist
raise argparse.ArgumentTypeError("{0} does not exist".format(x))
return x
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point; since the tool is intended to validate files, the existence check so far has been put into the actual validation rather than letting it happen through argparse. I'll add an issue for moving the existence validation to happen through the argument though!
Description
Refactoring interface from separate
validate
andgenerate-checksum
commands into a singlepipeval
command withvalidate
andgenerate-checksum
subcommandsTest Results
Checklist
File Commits
This PR does NOT contain Protected Health Information (PHI). A repo may need to be deleted if such data is uploaded.
Disclosing PHI is a major problem1 - Even a small leak can be costly2.
This PR does NOT contain germline genetic data3, RNA-Seq, DNA methylation, microbiome or other molecular data4.
.png
, .jpeg
),.pdf
,.RData
,.xlsx
,.doc
,.ppt
, or other output files.To automatically exclude such files using a .gitignore file, see here for example.
Code Review Best Practices
I have read the code review guidelines and the code review best practice on GitHub check-list.
I have set up or verified the
main
branch protection rule following the github standards before opening this pull request.The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].
I have added the major changes included in this pull request to the
CHANGELOG.md
under the next release version or unreleased, and updated the date.Testing
I have added unit tests for the new feature(s).
I modified the integration test(s) to include the new feature.
All new and previously existing tests passed locally and/or on the cluster.
The docker image built successfully on the cluster.
Footnotes
UCLA Health reaches $7.5m settlement over 2015 breach of 4.5m patient records ↩
The average healthcare data breach costs $2.2 million, despite the majority of breaches releasing fewer than 500 records. ↩
Genetic information is considered PHI.
Forensic assays can identify patients with as few as 21 SNPs ↩
RNA-Seq, DNA methylation, microbiome, or other molecular data can be used to predict genotypes (PHI) and reveal a patient's identity. ↩