New presentation: ReproFlow & YODA: Structure your studies, observable
and reproducible they become
- Title slide and metadata updated for Feb 6, 2026 ReproNim webinar
- Abstract emphasizes observability and reproducibility themes
- QR code generated for slides URL
- Planning materials organized in YODA-compliant structure:
  - notes/act2-refinement-notes.md: Research on BEP028, BABS, Nipoppy,
    BIDS-flux, FAIRly big framework, and SciOps principles
  - planning/proposed-structure.md: 5-act narrative structure proposal
  - README.md: Overview and entry point for all materials
Theme: YODA principles + BIDS composition + ReproFlow/reprostim tooling
enable observable and reproducible neuroimaging workflows from
acquisition to publication. Emphasis on provenance (BEP028),
dashboard separation pattern, and AI as amplifier of structured data.
Elaborate on how modular composition creates "condensed frontiers" -
transformed, summarized, or extracted forms that are more appropriate
for downstream use while maintaining exact version-controlled links to
source materials.
Key insight: Each module/subdataset serves at once as:
- A stopping point ("stopping the bleeding" of data/complexity)
- A usable interface for next level (condensed/transformed)
- A versioned link back to full source (reproducible)
Examples across domains:
- Neuroscience: TB of ephys recordings → spike trains (1000x smaller)
- Software: Source code → compiled binaries (platform-appropriate)
- BIDS: Multi-stage cascade (DICOM → BIDS → derivatives → paper)
- Data analysis: Individual measurements → summary statistics
- AI/ML: Full corpora → embeddings/indices
- Meetings: Video recordings → minutes
- Genomics: Full genomes → variant calls (1000x smaller)
- Dashboards: .tsv data → interactive visualizations
Pattern enables:
- Cognitive load reduction (work at appropriate level)
- Performance (smaller, transformed data)
- Reproducibility (exact source association via git hexsha)
- Flexibility (multiple frontiers from same source)
- Evolvability (regenerate as methods improve)
Anti-pattern: Orphaned frontiers without source links
Best practice: Version control both source and frontier as modules
Visual metaphor: "Surface you create, depth you preserve"
This concept integrates throughout Acts II-IV of the presentation.
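A minimal sketch of the pattern described above, assuming a toy source repository and hypothetical file names (`measurements.tsv`, `summary.tsv`, `PROVENANCE.tsv`). It is not the presentation's own tooling. It only illustrates deriving a small frontier and stamping it with the git hexsha of the source it came from.

```shell
#!/bin/sh
# Sketch: derive a condensed "frontier" from a source repository and record
# the exact source commit. All paths and file names here are hypothetical.
set -eu

workdir=$(mktemp -d)
src="$workdir/source"
frontier="$workdir/frontier"
mkdir -p "$src" "$frontier"

# 1. A toy source dataset under version control.
git init -q "$src"
printf 'subject\tvalue\nsub-01\t3\nsub-02\t5\n' > "$src/measurements.tsv"
git -C "$src" add measurements.tsv
git -C "$src" -c user.name="Example" -c user.email="example@example.org" \
    commit -q -m "Add raw measurements"

# 2. The frontier: a summary that is cheaper to consume than the source.
awk -F '\t' 'NR > 1 { sum += $2; n++ } END { printf "n\t%d\nmean\t%.1f\n", n, sum / n }' \
    "$src/measurements.tsv" > "$frontier/summary.tsv"

# 3. The link back: which source commit the summary was computed from.
source_hexsha=$(git -C "$src" rev-parse HEAD)
printf 'source_commit\t%s\n' "$source_hexsha" > "$frontier/PROVENANCE.tsv"

cat "$frontier/summary.tsv" "$frontier/PROVENANCE.tsv"
```

Without step 3 the summary would be an orphaned frontier. When the source is registered as a git submodule or DataLad subdataset, the containing repository records that commit itself, so a hand-written provenance file like this one would likely be unnecessary.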
Software section:
- Add NeuroDebian as example of source → package transformation
- Add reproducible-builds.org for bit-identical binaries
- Add snapshot.debian.org (~20PB) as non-git archival approach
- Emphasize pattern is universal, not DataLad/git-specific
Literature section (complete rewrite):
- Replace generic example with DANDI Archive citation workflow
- Detail dandi-bib: metadata → BibTeX/RIS/Zotero (daily automation)
- Detail citations-collector: DOI → citation discovery (WiP)
- 8 citation types (Publication, Preprint, Protocol, etc.)
- 11 relationship types (Cites, Uses, IsDocumentedBy, etc.)
- Show multi-layer frontier condensation in action
- Zotero as "dashboard" - regenerable view of version-controlled data
Meetings section:
- Add real-world practice of maintaining local Zoom archive
- Emphasize reusable resource (decisions, training, quotes)
- Storage cheap, context priceless
Universal pattern section:
- New section: "The Pattern is Universal, Not Tool-Specific"
- Compare git/DataLad, snapshot.debian.org, container registries,
  data repositories, academic citations
- Emphasize principles over tools: explicit linking, retrievability,
  versioning, automation, modularity
- Message: "Pattern is ancient, tools evolve: embrace principles"
All examples now concrete, traceable projects with URLs.
Document how to weave frontier condensation throughout all 5 acts:
- Act I: Introduce concept with Principle 3 (modular composition)
- Act II: Show in practice (ReproFlow, tools, dashboards)
- Act III: BIDS as 4-stage frontier cascade
- Act IV: AI as frontier generator (structured vs unstructured)
- Act V: Universal pattern across domains
Key reframings:
- BIDS pipeline = cascade of frontiers (DICOM → derivatives → paper)
- Dashboards = visualization frontiers (consume, don't own data)
- AI summaries = version-controlled frontiers with source links
- Tools comparison = different condensation strategies
Visual motif: Two-layer diagrams (frontier ⇅ source)
Terminology: Surface/depth, frontier/source, condensation/link
New slides proposed:
- Frontier condensation pattern intro
- Software example: NeuroDebian + reproducible-builds + snapshot
- Literature example: dandi-bib workflow
- Meeting archives as resource
- Universal pattern comparison table
Narrative thread: 'Surface you create, depth you preserve'
Questions for refinement discussion tomorrow morning.
Comprehensive overview of current status and tomorrow's agenda:
Status:
- Presentation header updated and committed
- Materials organized in YODA-compliant structure
- Research complete on BEP028, BABS, Nipoppy, BIDS-flux, etc.
- Frontier condensation concept developed and documented
Key breakthrough:
- Frontier condensation = hierarchical transformation pattern
- Each module: stopping point, transformation, usable interface, linked source
- Tagline: 'Surface you create, depth you preserve'
- Unifies YODA, BIDS, ReproFlow, dashboards, AI under one framework
Tomorrow's agenda:
1. Review/refine frontier condensation concept
2. Decide presentation structure (explicit theme vs. woven throughout)
3. Prioritize new slides (high/medium/low)
4. Content decisions (keep/reduce/enhance)
5. Visual design (two-layer diagrams)
6. Time allocation (~45 min webinar)
Questions to resolve:
- Terminology: 'frontier condensation' or alternative?
- Emphasis: DataLad-specific vs. universal pattern?
- Depth: Tool details vs. conceptual overview?
- Personal anecdotes: Include Zoom archive practice?
- Slide count: Realistic for Feb 6 deadline?
Resources ready: 4 planning docs, all references documented
Timeline: 4 days to Feb 6 (realistic but tight)
Differentiator: YODA as transformation framework, not just organization
Perhaps some data was not added? I ran into a `datalad get` error (probably not related).
Tip of the trade: when doing collapsed, you need to add the https://datasets.datalad.org/centerforopenneuroscience/talks/.git remote, which would have the annexed content; here we have none.
Oops, a frequent error; fixing now.
Can you add that to the README? (That did work.)
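For the README, the tip above could be written down roughly as follows. This is a sketch under assumptions: the remote name `datalad-public` is a hypothetical choice, and the file path in the comments is just the image that appears in this diff. Only the URL comes from the comment itself.

```shell
#!/bin/sh
# Sketch: register the remote that holds the annexed content.
# REMOTE_NAME is a hypothetical name; ANNEX_URL is the URL quoted in the tip.
ANNEX_URL="https://datasets.datalad.org/centerforopenneuroscience/talks/.git"
REMOTE_NAME="datalad-public"

add_annex_remote() {
    # Add the remote only if it is not configured yet, so reruns are harmless.
    if ! git remote get-url "$REMOTE_NAME" >/dev/null 2>&1; then
        git remote add "$REMOTE_NAME" "$ANNEX_URL"
    fi
}

# Run from inside the clone. The last two steps need network access:
#   add_annex_remote
#   git fetch "$REMOTE_NAME"
#   datalad get pics/bids-nipoppy.png
```

Guarding the `git remote add` call keeps the snippet safe to paste more than once, which seems worthwhile given that this is described as a frequent error.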
|
|
||
| <div style="position: relative; width: 100%; height: 90vh;"> | ||
|
|
||
| <img src="pics/bids-nipoppy.png" class="" width="80%" style="position: absolute; top: 3%; left: 0%" /> |
Nipoppy BIDS study layout has been updated. Suggest a screenshot of nipoppy/nipoppy#687.
asmacdo left a comment
Overall, I think there's too much info, especially in the images: many contain more text than is readable or digestible during a live presentation, which can make the key point easy to miss (e.g., the tree structure for OpenNeuroDerivatives, or the container dataset composition for repronim-containers).
| - `git reset --hard && git clean -dfx` -- no evil was done | ||
| - `git reset --hard HEAD^` -- forget we did it | ||
| - `git reset --hard COMMITISH` -- get-the-hell-out-of-here |
I like the humor, but the intended audience of this may not get the joke.
| - `git reset --hard && git clean -dfx` -- no evil was done | |
| - `git reset --hard HEAD^` -- forget we did it | |
| - `git reset --hard COMMITISH` -- get-the-hell-out-of-here | |
| - `git reset --hard && git clean -dfx` -- discard all uncommitted changes and untracked files | |
| - `git reset --hard HEAD^` -- undo the last commit (and its changes) | |
| - `git reset --hard COMMITISH` -- reset to a specific earlier state |
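The suggested descriptions can be checked in a throwaway repository. The sketch below is only an illustration, with a hypothetical file `notes.txt` and a placeholder committer identity; it runs the three commands from the slide in order.

```shell
#!/bin/sh
# Sketch: exercise the three reset commands in a scratch repository.
set -eu
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
commit() { git -c user.name="Example" -c user.email="example@example.org" commit -q -m "$1"; }

echo one > notes.txt;    git add notes.txt; commit "first"
first=$(git rev-parse HEAD)
echo two >> notes.txt;   git add notes.txt; commit "second"
echo three >> notes.txt; git add notes.txt; commit "third"

# Discard all uncommitted changes and untracked files.
echo scratch >> notes.txt
touch untracked.tmp
git reset -q --hard && git clean -dfxq

# Undo the last commit (and its changes): back to "second".
git reset -q --hard HEAD^

# Reset to a specific earlier state: here the first commit.
git reset -q --hard "$first"

cat notes.txt
```

All three commands discard work irreversibly from the working tree's point of view, which is one more argument for the plain wording on the slide.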
|
|
||
| ---- | ||
|
|
||
| ### DataLad runs in the wild: [datalad-usage-registry](http://github.com/datalad/datalad-usage-dashboard) |
It's not obvious that this is a link.
| 14.33% | ||
| ``` | ||
|
|
||
| - and if we know the command on how to `get` the file ... |
I don't get where you're going with this.
|
|
||
| ### [datalad containers-run](https://docs.datalad.org/projects/container/en/stable/generated/man/datalad-containers-run.html#datalad-containers-run) it now! | ||
|
|
||
|  |
|
|
||
| <small> | ||
|
|
||
| N.B. Talk to Alex Waite about their work on guaranteeing reproducibility and network encapsulation of containers. |
Should this be in the slide? Maybe link to his GitHub?
The slides are still just a copy from the distribits talk, where Alex was in the audience, so it made sense there. For this one it indeed doesn't directly apply; I will remove it.
| ---- | ||
|
|
||
| ### We already deal with "global" layouts | ||
|
|
||
| #### defined "globally" while relying on "packages" to adhere to the specifications. | ||
|
|
||
| - Operating Systems layouts, e.g. | ||
| - [Filesystem Hierarchy Standard (FHS)](https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard) | ||
| - [XDG (Cross-Desktop Group)](https://specifications.freedesktop.org/basedir-spec/latest/) | ||
|
|
||
| <small> | ||
|
|
||
| N.B. YODA skill: Use Joey's [etckeeper](https://etckeeper.branchable.com) to keep your `/etc` under git | ||
|
|
||
| </small> | ||
|
|
||
| ---- | ||
|
|
||
| ### We already have "project" layouts | ||
|
|
||
| #### defined locally per project | ||
|
|
||
| - Programming language/platform specific | ||
| - e.g. think of a typical Python project | ||
| - Typically not nested (unless "vendoring") | ||
|
|
||
|
|
||
| ---- |
I think we could lead into the layouts needed for YODA in 1 slide; no need to get into the weeds.
The Mary Poppins bag is just wonderful for this indeed!
| <small> | ||
|
|
||
| N.B. Talk to Alex Waite about their work on guaranteeing reproducibility and network encapsulation of containers. | ||
|
|
||
| </small> | ||
|
|
| <small> | |
| N.B. Talk to Alex Waite about their work on guaranteeing reproducibility and network encapsulation of containers. | |
| </small> |
Co-authored-by: Austin Macdonald <[email protected]>


