Improve documentation (#290)
* Improve documentation.

* More improvements.

* Add glossary.

* Keep working on docs.

* Add API page.

* More work.

* Run isort.

* Fix more stuff.

* Fix module name.

* Update conf.py

* Reorg docs a bit.

* Fix import.

* Try this.

* Revert attempted link.

* Update example.rst

* Address review.

* Fix paths.
tsalo authored Feb 1, 2024
1 parent 0988440 commit 4d9704d
Showing 18 changed files with 815 additions and 463 deletions.
24 changes: 13 additions & 11 deletions README.rst
@@ -2,34 +2,36 @@
CuBIDS: Curation of BIDS
========================


.. image:: https://img.shields.io/pypi/v/cubids.svg
:target: https://pypi.python.org/pypi/cubids
:target: https://pypi.python.org/pypi/cubids

.. image:: https://circleci.com/gh/PennLINC/CuBIDS.svg?style=svg
:target: https://circleci.com/gh/PennLINC/CuBIDS
:target: https://circleci.com/gh/PennLINC/CuBIDS

.. image:: https://readthedocs.org/projects/cubids/badge/?version=latest
:target: https://cubids.readthedocs.io/en/latest/?badge=latest
:alt: Documentation Status
:target: https://cubids.readthedocs.io/en/latest/?badge=latest
:alt: Documentation Status


About
-----

Curation of BIDS, or ``CuBIDS``, is a workflow and software package designed to facilitate
``CuBIDS`` (Curation of BIDS) is a workflow and software package designed to facilitate
reproducible curation of neuroimaging `BIDS <https://bids-specification.readthedocs.io/>`_ datasets.
CuBIDS breaks down BIDS dataset curation into four main components and addresses each one using
various command line programs complete with version control capabilities.
These components are not necessarily linear but all are critical
in the process of preparing BIDS data for successful preprocessing and analysis pipeline runs.

1. CuBIDS facilitates the validation of BIDS data.
2. CuBIDS visualizes and summarizes the heterogeneity in a BIDS dataset.
3. CuBIDS helps users test pipelines on the entire parameter space of a BIDS dataset.
4. CuBIDS allows users to perform metadata-based quality control on their BIDS data.
1. CuBIDS facilitates the validation of BIDS data.
2. CuBIDS visualizes and summarizes the heterogeneity in a BIDS dataset.
3. CuBIDS helps users test pipelines on the entire parameter space of a BIDS dataset.
4. CuBIDS allows users to perform metadata-based quality control on their BIDS data.
5. CuBIDS helps users clean protected information in BIDS datasets,
in order to prepare them for public sharing.

.. image:: https://github.com/PennLINC/CuBIDS/raw/main/docs/_static/cubids_workflow.png
:width: 600

For full documentation, please visit our
`ReadTheDocs <https://cubids.readthedocs.io/en/latest/?badge=latest>`_
`ReadTheDocs <https://cubids.readthedocs.io/en/latest/?badge=latest>`_.
17 changes: 17 additions & 0 deletions cubids/__about__.py
@@ -0,0 +1,17 @@
# emacs: -*- mode: python; py-indent-offset: 4; indent-tabs-mode: nil -*-
# vi: set ft=python sts=4 ts=4 sw=4 et:
"""Base module variables."""
try:
    from cubids._version import __version__
except ImportError:
    __version__ = "0+unknown"

__packagename__ = "CuBIDS"
__copyright__ = "Copyright 2023, The CuBIDS Developers"
__credits__ = (
    "Contributors: please check the ``.zenodo.json`` file at the top-level folder "
    "of the repository."
)
__url__ = "https://github.com/PennLINC/CuBIDS"

DOWNLOAD_URL = f"https://github.com/PennLINC/{__packagename__}/archive/{__version__}.tar.gz"
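
The version fallback in ``__about__.py`` can be tried out without installing CuBIDS. The sketch below is only an illustration of the same try/except pattern, not code from the commit: ``resolve_version`` and ``download_url`` are hypothetical helper names, and ``hypothetical_missing_pkg`` is a deliberately nonexistent package used to trigger the fallback.

```python
import importlib


def resolve_version(module_name: str, fallback: str = "0+unknown") -> str:
    """Return ``<module_name>.__version__``, or ``fallback`` if the import fails."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Same situation as a missing ``cubids._version`` module.
        return fallback
    return getattr(module, "__version__", fallback)


def download_url(package: str, version: str) -> str:
    """Build an archive URL in the same shape as ``DOWNLOAD_URL`` above."""
    return f"https://github.com/PennLINC/{package}/archive/{version}.tar.gz"


# The package name is hypothetical and not installed, so the fallback is used.
version = resolve_version("hypothetical_missing_pkg._version")
print(download_url("CuBIDS", version))
```

When the version module cannot be imported, the placeholder string ends up in the URL as well, so a URL built this way is only meaningful for a properly installed or built package.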
29 changes: 23 additions & 6 deletions cubids/__init__.py
@@ -1,11 +1,28 @@
"""Top-level package for CuBIDS."""

__author__ = """PennLINC"""
__email__ = "[email protected]"
__version__ = "0.1.0"

from cubids.cubids import CuBIDS
from cubids import (
    cli,
    config,
    constants,
    cubids,
    metadata_merge,
    utils,
    validator,
    workflows,
)
from cubids.__about__ import __copyright__, __credits__, __packagename__, __version__

__all__ = [
    "CuBIDS",
    "__copyright__",
    "__credits__",
    "__packagename__",
    "__version__",
    "cli",
    "config",
    "constants",
    "cubids",
    "metadata_merge",
    "utils",
    "validator",
    "workflows",
]
2 changes: 1 addition & 1 deletion cubids/tests/test_bond.py
@@ -10,7 +10,7 @@
import pandas as pd
import pytest

from cubids import CuBIDS
from cubids.cubids import CuBIDS
from cubids.metadata_merge import merge_json_into_json, merge_without_overwrite
from cubids.tests.utils import (
_add_deletion,
2 changes: 1 addition & 1 deletion cubids/workflows.py
@@ -13,7 +13,7 @@
import pandas as pd
import tqdm

from cubids import CuBIDS
from cubids.cubids import CuBIDS
from cubids.metadata_merge import merge_json_into_json
from cubids.utils import _get_container_type
from cubids.validator import (
31 changes: 0 additions & 31 deletions docs/README.rst

This file was deleted.

101 changes: 44 additions & 57 deletions docs/about.rst
@@ -1,85 +1,72 @@
===================
==========
Background
===================
==========

Motivation
-------------
----------

The Brain Imaging Data Structure (BIDS) is a simple and intuitive way to
organize and describe MRI data [#f1]_. Because of its ease of use, a wide array of
preprocessing and analysis tools and pipelines have been developed specifically
to operate on data curated in BIDS [#f2]_. These tools are able to automatically
self-configure to the user's BIDS dataset, which saves time and effort on the
part of the user. However, as datasets increase in size and complexity, it
can be dangerous to blindly run these pipelines without a careful understanding of
what's really in your BIDS data. Having knowledge of this potential **heterogeneity**
ahead of time gives researchers the ability to **predict pipeline configurations**,
**predict potential errors**, avoid running **unwanted or unusable data**, and **budget
their computational time and resources** effectively.

``CuBIDS`` is designed to facilitate the curation of large, neuroimaging data so
that users can infer useful information from descriptive and accurate BIDS labels
before running pipelines *en masse*. ``CuBIDS`` accomplishes this by summarizing
BIDS data using :ref:`keygroup`, :ref:`paramgroup`, and :ref:`acquisitiongroup` categorizations in your data (we'll explain what these
are in more detail in the next section).
organize and describe MRI data [#f1]_.
Because of its ease of use, a wide array of preprocessing and analysis tools and
pipelines have been developed specifically to operate on data curated in BIDS [#f2]_.
These tools are able to automatically self-configure to the user's BIDS dataset,
which saves time and effort on the part of the user.

However, as datasets increase in size and complexity,
it can be dangerous to blindly run these pipelines without a careful understanding of
what's really in your BIDS data.
Having knowledge of this potential **heterogeneity** ahead of time gives researchers
the ability to **predict pipeline configurations**, **predict potential errors**,
avoid running **unwanted or unusable data**,
and **budget their computational time and resources** effectively.

``CuBIDS`` is designed to facilitate the curation of large,
neuroimaging datasets so that users can infer useful information from descriptive and
accurate BIDS labels before running pipelines *en masse*.
``CuBIDS`` accomplishes this by summarizing BIDS data using :ref:`keygroup`,
:ref:`paramgroup`, and :ref:`acquisitiongroup` categorizations in your data
(we'll explain what these are in more detail in the next section).

The image below demonstrates the ``CuBIDS`` workflow that we'll discuss on the next page.

.. image:: _static/cubids_workflow.png
:width: 600

``CuBIDS`` also incorporates ``DataLad`` as an optional dependency for maintaining data provenance, enhancing
reproducibility, and supporting collaboration [#f3]_.
``CuBIDS`` also incorporates ``DataLad`` as an optional dependency for maintaining data provenance,
enhancing reproducibility, and supporting collaboration [#f3]_.

Definitions
------------

What CuBIDS Is Not
------------------

``CuBIDS`` is not designed to convert raw data into BIDS format.
For that, we recommend using `conversion tools <https://bids.neuroimaging.io/benefits.html#converters>`_.
``CuBIDS`` then takes over once you have a valid BIDS dataset,
prior to running any preprocessing or analysis pipelines, or to sharing the dataset.

.. topic:: Key Group
.. note::

* A set of scans whose filenames share all `BIDS filename key-value pairs <https://bids-specification.readthedocs.io/en/stable/02-common-principles.html#file-name-structure>`_, excluding subject and session
* Derived from the BIDS Filename
* Example structure: ``acquisition-*_datatype-*_run-*_task-*_suffix``
CuBIDS _should_ work on BIDS-ish (not quite BIDS compliant, but in a similar format) datasets,
but this is by no means guaranteed.

.. topic:: Parameter (Param) Group

* The set of scans with identical metadata parameters in their sidecars
* Defined within a Key Group
* Numerically identified (each Key Group will have n Param Groups, where n is the number of unique sets of scanning parameters present in that Key Group. e.g. 1, 2, etc.)

.. topic:: Dominant Group

* The Param Group that contains the most scans in its Key Group

.. topic:: Variant Group

* Any Param Group that is non-dominant

.. topic:: Rename Key Group

* Auto-generated, recommended new Key Group name for Variant Groups
* Based on the metadata parameters that cause scans in Variant Groups to vary from those in their respective Dominant Groups

.. topic:: Acquisition Group

* A collection of sessions across participants that contains the exact same set of Key and Param Groups

Examples
""""""""

Dominant Group resting state BOLD:
* Example Filename: ``sub-01_ses-A_task-rest_acq-singleband_bold.nii.gz``
* Key Group: ``acquisition-singleband_datatype-func_suffix-bold_task-rest``
* Param Group: ``1`` (Dominaint Group)

* Example Filename: ``sub-01_ses-A_task-rest_acq-singleband_bold.nii.gz``
* Key Group: ``acquisition-singleband_datatype-func_suffix-bold_task-rest``
* Param Group: ``1`` (Dominant Group)

Variant Group resting state BOLD (all scans in this Param Group are missing a fieldmap)
* Example Filename: ``sub-02_ses-A_task-rest_acq-singleband_bold.nii.gz``
* Key Group: ``acquisition-singleband_datatype-func_suffix-bold_task-rest``
* Param Group: ``2`` (Variant Group)
* Rename Key Group: ``acquisition-singlebandVARIANTNoFmap_datatype-func_suffix-bold_task-rest``

In the next section, we'll discuss these definitions in more detail and demonstrate ``CuBIDS`` usage.
* Example Filename: ``sub-02_ses-A_task-rest_acq-singleband_bold.nii.gz``
* Key Group: ``acquisition-singleband_datatype-func_suffix-bold_task-rest``
* Param Group: ``2`` (Variant Group)
* Rename Key Group: ``acquisition-singlebandVARIANTNoFmap_datatype-func_suffix-bold_task-rest``

These definitions are described in more detail in :doc:`glossary` and :doc:`usage`.

.. rubric:: Footnotes

74 changes: 74 additions & 0 deletions docs/api.rst
@@ -0,0 +1,74 @@
.. include:: links.rst

===
API
===

*********************************
:mod:`cubids.cubids`: Main Module
*********************************

.. currentmodule:: cubids

.. autosummary::
   :toctree: generated/
   :template: class.rst

   cubids.cubids.CuBIDS


*******************************************
:mod:`cubids.workflows`: Workflow Functions
*******************************************

.. currentmodule:: cubids

.. autosummary::
   :toctree: generated/
   :template: function.rst

   cubids.workflows.validate
   cubids.workflows.bids_sidecar_merge
   cubids.workflows.group
   cubids.workflows.apply
   cubids.workflows.datalad_save
   cubids.workflows.undo
   cubids.workflows.copy_exemplars
   cubids.workflows.add_nifti_info
   cubids.workflows.purge
   cubids.workflows.remove_metadata_fields
   cubids.workflows.print_metadata_fields


**********************************************
:mod:`cubids.metadata_merge`: Merging Metadata
**********************************************

.. currentmodule:: cubids

.. autosummary::
   :toctree: generated/
   :template: function.rst

   cubids.metadata_merge.check_merging_operations
   cubids.metadata_merge.merge_without_overwrite
   cubids.metadata_merge.merge_json_into_json
   cubids.metadata_merge.get_acq_dictionary
   cubids.metadata_merge.group_by_acquisition_sets


***********************************
:mod:`cubids.validator`: Validation
***********************************

.. currentmodule:: cubids

.. autosummary::
   :toctree: generated/
   :template: function.rst

   cubids.validator.build_validator_call
   cubids.validator.build_subject_paths
   cubids.validator.run_validator
   cubids.validator.parse_validator_output
   cubids.validator.get_val_dictionary
14 changes: 14 additions & 0 deletions docs/cli.rst
@@ -0,0 +1,14 @@
======================
Command Line Interface
======================

.. code-block:: bash

   cubids --help

This will print the instructions for using the command line interface in your command line.

.. argparse::
   :ref: cubids.cli._get_parser
   :prog: cubids
   :func: _get_parser