Bind DVC data registry #880
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #880 +/- ##
==========================================
+ Coverage 69.46% 69.49% +0.03%
==========================================
Files 390 392 +2
Lines 20944 20966 +22
Branches 3205 3207 +2
==========================================
+ Hits 14548 14570 +22
Misses 5093 5093
Partials 1303 1303
Small comments. I would like @EmmaRenauld and/or @karanphil to give their feedback on the init.py first, before moving forward.
I found a bug in |
GTG
Quick description
Everything required in Scilpy to throw away gdrive and embrace DVC !
From discussions :
Looking at everything (go through scil_data), it seems like there is a lot of duplication and that disk space must be wasted! Not at all.
Everything contained in the repository is index files for DVC, kilobyte-sized and super light for Git. The data per se is all contained here. You won't be able to recognize anything, and that's the point. Every file indexed by DVC gets hashed, and the md5 is then used to create the file path.
Doing so, we can allow duplication in the Git repository, since the md5 always matches even when the names and timestamps differ, and there will never be duplication on the data server.
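To make the md5-to-path idea concrete, here is a minimal sketch in plain Python. It is not DVC's actual code: the `content_path` helper, the `store` root and the two-character directory prefix are assumptions made for illustration only. It shows that two files with different names but identical bytes resolve to the same storage path, so the data would be stored only once.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def content_path(store_root, md5_hex):
    """Hypothetical layout: <root>/<first 2 hex chars>/<remaining chars>."""
    return Path(store_root) / md5_hex[:2] / md5_hex[2:]


with tempfile.TemporaryDirectory() as tmp:
    # Two files with different names but the same content, plus a third
    # file with different content. File names are made up for the example.
    a = Path(tmp) / "fodf.nii.gz"
    b = Path(tmp) / "copy_of_fodf.nii.gz"
    c = Path(tmp) / "other.nii.gz"
    a.write_bytes(b"same bytes")
    b.write_bytes(b"same bytes")
    c.write_bytes(b"different bytes")

    md5_a = md5_of_file(a)
    path_a = content_path("store", md5_a)
    path_b = content_path("store", md5_of_file(b))
    path_c = content_path("store", md5_of_file(c))

# Same content gives the same path, whatever the file name.
print(path_a == path_b)
# Different content gives a different path.
print(path_a == path_c)
```

The file name and timestamp never enter the hash, which is why renamed or copied index files in Git still point at a single object on the data server.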
Using DVC is very similar to how we used the fetcher, with the addition that we now need to keep track of test package revisions. To create a new test data package:
Describe the package in data/test_descriptors.yml under your test case name.
Call the pull_test_case_package function from scilpy.io.dvc in your test script to make the data available when testing....
Type of change
Check the relevant options.
Provide data, screenshots, command line to test (if relevant)
The aodf_metrics test case is implemented as a POC:
pytest scripts/tests/test_aodf_metrics.py
Checklist