Commit 3d0ec6b

moved tagging to single cell dir
1 parent 97ce34c commit 3d0ec6b

2 files changed

+343
-343
lines changed

GIt/README.md

Lines changed: 0 additions & 343 deletions
@@ -70,346 +70,3 @@ git push origin main --force
# Version‑tagging a single‑cell analysis project with SemVer (and slight tweaks)
89-
90-
Below is a lightweight **release playbook**—written for a computational‑biology codebase or pipeline that ingests raw single‑cell data, produces processed objects (e.g., `.h5ad`/`.rds`), and generates figures or downstream resources. Adjust the specifics to match your stack (Scanpy, Seurat, Nextflow, Snakemake, etc.).
91-
92-
---
93-
94-
## 1. Version‑string format
95-
96-
```
vMAJOR.MINOR.PATCH[-PRERELEASE][+BUILD]
```
99-
100-
| Component | When you bump it | Example for single‑cell work |
101-
| -------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------- |
102-
| **MAJOR** | Breaking changes in input/output contracts – e.g. switching the default reference genome, altering AnnData slot names, renaming pipeline parameters. | 2.0.0 |
103-
| **MINOR** | Backward‑compatible feature additions – e.g. adding doublet filtering module, supporting CITE‑seq alongside RNA. | 1.3.0 |
104-
| **PATCH** | Pure bug fixes or tiny, non‑breaking tweaks – e.g. fixing a gene‑name case bug, bumping a dependency without changing results. | 1.3.2 |
105-
| **PRERELEASE** | Unstable preview (`-alpha`, `-beta.2`, `-rc.1`). CI can publish Docker images like `myproj:1.4.0-rc.1`. | |
106-
| **BUILD** | Optional build metadata (`+nlp1`, `+20250611`). Ignored by SemVer precedence rules; handy if you must embed run date, dataset ID, or commit hash in the artifact’s filename. | |
107-
108-
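The bump rules in the table can be sketched as a small shell helper. `bump_version` is a hypothetical name, not part of any tool mentioned here; this minimal sketch assumes a plain `MAJOR.MINOR.PATCH` core and simply discards any pre-release or build suffix before bumping.

```bash
# Hypothetical helper: bump_version <version> <major|minor|patch>
# Runs in a subshell so its variables and IFS change do not leak.
bump_version() (
  version=${1#v}              # drop an optional leading "v"
  version=${version%%[-+]*}   # drop any -PRERELEASE / +BUILD suffix
  part=$2
  IFS=.
  set -- $version             # split "1.3.2" into $1=1 $2=3 $3=2
  major=$1 minor=$2 patch=$3
  case $part in
    major) major=$((major + 1)); minor=0; patch=0 ;;
    minor) minor=$((minor + 1)); patch=0 ;;
    patch) patch=$((patch + 1)) ;;
    *) echo "unknown part: $part" >&2; exit 1 ;;
  esac
  echo "v${major}.${minor}.${patch}"
)

bump_version v1.3.2 minor   # prints v1.4.0
```

Resetting the lower components to zero on a MAJOR or MINOR bump follows the SemVer specification.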
---
109-
110-
## 2. Recommended Git tag convention
111-
112-
* **Prefix with `v`** (`v1.4.0`, not just `1.4.0`) – many tooling ecosystems expect it.
113-
* **Always create an *annotated* tag**, never a lightweight one:
114-
115-
  ```bash
  git tag -a v1.4.0 -m "Release v1.4.0 – add CITE-seq support, bump Scanpy 1.11"
  git push origin v1.4.0
  ```
119-
120-
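To check that an existing tag really is annotated, one option is `git cat-file -t`, which reports `tag` for an annotated tag object and `commit` for a lightweight tag. The wrapper name `is_annotated_tag` below is hypothetical; treat this as a sketch for a pre-release check rather than a required step.

```bash
# Hypothetical helper: succeeds only if the given name is an annotated tag.
is_annotated_tag() {
  [ "$(git cat-file -t "$1" 2>/dev/null)" = "tag" ]
}

# Example use inside a repository:
#   is_annotated_tag v1.4.0 || echo "v1.4.0 is missing or lightweight" >&2
```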
---
121-
122-
## 3. Map versions to your project pieces
123-
124-
| Artifact | Versioning strategy |
125-
| ------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------- |
126-
| **Pipeline code / scripts** | Tag with SemVer exactly as above. |
127-
| **Frozen reference files** (gene annotations, barcodes) | Keep in `assets/` and bump **MINOR** when you update them without breaking downstream; **MAJOR** if formats change. |
128-
| **Processed datasets** | Embed both pipeline tag *and* dataset release date in the file (e.g. `pbmc10k_v1.2.1_2025‑06‑11.h5ad`). |
129-
| **Docker/Singularity image** | Same tag as the pipeline (`ghcr.io/...:v1.2.1`). |
130-
| **Figures/manuscript panels** | Put a small caption or legend note like “generated with pipeline v1.2.1”. |
131-
132-
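The naming pattern for processed datasets can be generated rather than typed by hand. The function name `dataset_filename` and its argument order are assumptions made for illustration; the output format follows the `pbmc10k_v1.2.1_2025-06-11.h5ad` example in the table.

```bash
# Hypothetical helper: dataset_filename <dataset> <pipeline-tag> <YYYY-MM-DD> [extension]
dataset_filename() {
  printf '%s_%s_%s.%s\n' "$1" "$2" "$3" "${4:-h5ad}"
}

dataset_filename pbmc10k v1.2.1 2025-06-11   # prints pbmc10k_v1.2.1_2025-06-11.h5ad

# In a tagged checkout the tag could come from git instead of being typed:
#   dataset_filename pbmc10k "$(git describe --tags --exact-match)" "$(date +%F)"
```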
---
133-
134-
## 4. Release checklist (single‑cell edition)
135-
136-
1. **Freeze parameters**
137-
Commit the exact `config.yaml` or `params.json` used for the release; bump in code if defaults changed.
138-
2. **Update `__version__` / `VERSION` file**
139-
Keep one source of truth inside the repo; unit tests can assert the tag matches.
140-
3. **Run pipeline end‑to‑end**
141-
Verify reproducibility on canonical dataset(s).
142-
4. **Generate/refresh CHANGELOG.md**
143-
Use a template: “Added”, “Changed”, “Fixed”, “Breaking”.
144-
5. **Tag & push**
145-
146-
   ```bash
   git tag -a v1.4.0 -m "Release v1.4.0"
   git push origin main --follow-tags
   ```
150-
6. **Publish on GitHub/GitLab**
151-
GitHub Releases can attach `.h5ad` or Docker digest; CI/CD can auto‑publish on `push: tags: ["v*"]`.
152-
7. **Archive in Zenodo (optional)**
153-
Once the repository is enabled in Zenodo, the Zenodo-GitHub integration archives each published GitHub Release and mints a DOI for it.
154-
155-
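Step 2 of the checklist (the in-repo version must match the tag) can be enforced with a short check. This is a sketch under assumptions: the repository keeps a plain-text `VERSION` file containing something like `1.4.0` or `v1.4.0`, and `version_matches_tag` is a hypothetical name.

```bash
# Hypothetical helper: version_matches_tag <path-to-VERSION-file> <tag>
# Succeeds when the file content equals the tag, ignoring a leading "v" in the file.
version_matches_tag() (
  file_version=$(tr -d '[:space:]' < "$1")
  [ "v${file_version#v}" = "$2" ]
)

# Example use in a release job, run on the tagged commit:
#   version_matches_tag VERSION "$(git describe --tags --exact-match)" || exit 1
```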
---
156-
157-
## 5. Handling inevitable mistakes
158-
159-
| Scenario | Remedy |
160-
| --------------------------------------- | --------------------------------------------------------------------------- |
161-
| Tagged wrong commit | `git tag -d v1.4.0 && git push origin :refs/tags/v1.4.0` → retag correctly. |
162-
| Forgot to bump in‑code version constant | Patch release: commit bump, tag `v1.4.1`. |
163-
| Need to hot‑fix an old major line | Create maintenance branch `release/1.x`, cherry‑pick fix, tag `v1.5.3`. |
164-
165-
---
166-
167-
## 6. Beyond classical SemVer
168-
169-
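If your project's *datasets* evolve independently of *code*, you can use **double versioning**. Note that SemVer-aware tools read a `-d20250611` suffix as a pre-release identifier, so the combined string sorts before plain `v2.1.0`; if that matters, carry the data version as build metadata (`+d20250611`) instead: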
170-
171-
```
PIPE_VER-DATA_VER
e.g. v2.1.0-d20250611
```
175-
176-
Or follow **CalVer** (calendar versioning) for data: `2025.06.11`.
177-
178-
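A combined string in the `PIPE_VER-DATA_VER` form can be taken apart again with plain parameter expansion. The helper name `split_double_version` is hypothetical, and the sketch assumes the data part is always introduced by the last `-d` in the string.

```bash
# Hypothetical helper: split "v2.1.0-d20250611" into pipeline and data versions.
split_double_version() (
  combined=$1
  pipeline=${combined%-d*}    # everything before the last "-d"
  data=${combined##*-d}       # everything after the last "-d"
  echo "pipeline=${pipeline} data=${data}"
)

split_double_version v2.1.0-d20250611   # prints pipeline=v2.1.0 data=20250611
```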
---
179-
180-
### TL;DR
181-
182-
1. Stick to **vMAJOR.MINOR.PATCH** annotated Git tags for the pipeline.
183-
2. Bump MAJOR = breaking, MINOR = new features, PATCH = bug fixes.
184-
3. Embed the tag in every downstream artifact (HDF5, Docker, figures).
185-
4. Automate the release through CI to guarantee reproducibility.
186-
187-
With this scheme, collaborators (and Future‑You) can always trace any figure or result back to the exact commit, parameters, and reference resources used. Happy versioning!
188-
189-
## 7. Versioning your **conda** environment files
190-
191-
When a single‑cell pipeline depends on a curated Conda environment, treat the *.yml* file as a **first‑class release artifact** and version it in lock‑step with the code.
192-
193-
| Goal                         | Recommended practice |
| ---------------------------- | -------------------- |
| **Human-readable file name** | `envs/sc-pipeline-v1.4.0.yml` – mirror the Git tag (`v1.4.0`). |
| **Self-identifying YAML**    | Inside the file set `name: sc-pipeline-v1.4.0` so `conda env create` produces an environment whose `conda env list` entry already tells you its lineage. |
| **Exact package set**        | Pin every dependency (`scanpy=1.11.0`, `python=3.11.*`) or, for complete reproducibility, export with `conda list --explicit > env-v1.4.0.txt` or generate a lock file via **conda-lock**. Commit these alongside the *.yml*. |
| **Git storage layout**       | Keep an immutable copy under `envs/`:<br/>`repo root/`<br/>`├── envs/`<br/>`│   ├── sc-pipeline-v1.3.2.yml`<br/>`│   └── sc-pipeline-v1.4.0.yml`<br/>`└── ...` |
| **Tagging & CI**             | In your release workflow, after adding `envs/sc-pipeline-vX.Y.Z.yml`, tag the commit as usual (`git tag -a vX.Y.Z …`). Your CI job can then run `mamba env create -f envs/sc-pipeline-v${TAG_NAME}.yml` (here `TAG_NAME` is assumed to be a CI variable holding the version without the leading `v`, e.g. `1.4.0`). |
| **User docs**                | Tell users: “For version *v1.4.0*, run `mamba env create -f envs/sc-pipeline-v1.4.0.yml` and then `conda activate sc-pipeline-v1.4.0`.” |
201-
202-
### Quick recipe for a new release
203-
204-
```bash
# 1. Update dependencies
conda env export --from-history | \
  sed '/prefix:/d' > envs/sc-pipeline-v1.5.0.yml   # remove machine-specific prefix

# 2. Bump the name field inside the YAML
#    name: sc-pipeline-v1.5.0

# 3. Commit & tag
git add envs/sc-pipeline-v1.5.0.yml
git commit -m "Add Conda env for v1.5.0"
git tag -a v1.5.0 -m "Release v1.5.0 – new env"
git push origin main --follow-tags
```
218-
219-
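Step 2 of the recipe (bumping the `name:` field) is easy to forget, so a release job could verify it. The sketch below assumes the convention from the table above, namely that the YAML `name:` equals the file name without `.yml`; `env_name_matches_file` is a hypothetical helper name.

```bash
# Hypothetical helper: env_name_matches_file <path/to/env.yml>
# Succeeds when the top-level "name:" field equals the file name minus ".yml".
env_name_matches_file() (
  file=$1
  expected=$(basename "$file" .yml)
  actual=$(sed -n 's/^name:[[:space:]]*//p' "$file" | head -n 1)
  [ "$actual" = "$expected" ]
)

# Example use before tagging:
#   env_name_matches_file envs/sc-pipeline-v1.5.0.yml || echo "name field not bumped" >&2
```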
### Pro tips
220-
221-
* **conda-lock** generates lock files that pin exact builds for each platform (a unified `conda-lock.yml` by default, or per-platform `conda-<platform>.lock` files with `--kind explicit`); include them for reproducible environments on CI/cloud.
* If your environment changes between **PATCH** releases only for bug-fix versions of packages, you can keep the YAML name stable (`sc-pipeline-1.4.x`) and just regenerate the lock file on each patch tag.
* To ship a self-contained tarball, run `conda-pack -n sc-pipeline-v1.4.0 -o sc_pipeline_v1.4.0.tar.gz` in CI and attach it to the GitHub Release.
224-
225-
With these conventions, anyone can reconstruct the **exact** software stack that produced a given single‑cell result—even years later—by simply checking out tag *vX.Y.Z* and creating the accompanying Conda environment.
# Tags vs Branches for “backup” snapshots
242-
243-
| Purpose | What it is | When strong teams use it |
244-
| ---------- | ------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------- |
245-
| **Tag** | An immutable pointer to a single commit.<br>Has metadata (author, date, message, GPG sig). | Version numbers (`v1.4.0`), publication checkpoints (“paper‑revision‑submitted”), dataset freezes. |
246-
| **Branch** | A *moving* pointer that can accept new commits. | - **Long-lived**: `main`, `develop`, `release/1.x` for maintenance.<br>- **Short-lived**: feature or bug-fix branches that disappear after merge. |
247-
248-
> **Rule of thumb:**
249-
> *Use a tag when you just need a bookmark; use a branch only if further work will continue on top of that snapshot.*
250-
251-
---
252-
253-
## Common “backup” patterns that stay tidy
254-
255-
| Pattern | How it works | Pros | Cons |
256-
| ---------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------- | ------------------------------------------------------------------------------ |
257-
| **Annotated tag + pushed to origin** | `git tag -a backup-2025-06-11 -m "Pre-refactor snapshot"`<br>`git push origin backup-2025-06-11` | Zero clutter in branch list, cannot be overwritten by accident. | Static: can't add more commits. |
258-
| **Archive branch kept *only* on remote** | `git checkout -b archive/2025‑06‑11 && git push -u origin archive/2025‑06‑11`<br>Then *delete it locally*: `git checkout main && git branch -D archive/...` | You can still cherry‑pick or hot‑fix on top of it later. | Shows up in remote branch list forever; can pile up if you’re not disciplined. |
259-
| **Release branch (Git Flow style)** | `release/2.0` branched off `main`, receives only bug fixes; each hot‑fix is tagged (`v2.0.1`). | Clear maintenance lane; production code stays shielded from new features. | More branches to protect/manage; overkill for very small teams. |
260-
| **Full mirror backup** | `git remote add backup git@bitbucket:lab/sc‑proj.git` then `git push backup --mirror`. | Disaster‑recovery copy of *all* refs, not just selective tags/branches. | Needs automation (cron/CI) and access to another server. |
261-
| **GitHub/GitLab Release** | Push a tag, click *“Draft release”*, attach artifacts (conda‑pack, h5ad). | Combines code snapshot *and* binary outputs in one place with DOI option (Zenodo). | Still relies on the tag mechanism; not ideal for WIP. |
262-
263-
---
264-
265-
## What productive teams actually do
266-
267-
1. **Work trunk‑based or Git Flow**
268-
269-
* Small teams: a single `main`/`dev` branch + short‑lived feature branches.
270-
* Bigger teams: `main` → release branches (`release/1.x`) + hot‑fix branches.
271-
272-
2. **Tag every production or analytical milestone**
273-
*Automated in CI:* when `main` is tagged `vX.Y.Z`, pipelines run, containers build, releases publish.
274-
275-
3. **Delete merged feature branches**
276-
Keeps the `git branch -r` list short (roughly ten entries or fewer). Historic commits + tags are enough to recover code.
277-
278-
4. **Protect critical branches, lock tags**
279-
GitHub branch protection blocks force-pushes to critical branches; tag protection rules or rulesets keep release tags from being moved or deleted by accident.
280-
281-
5. **Secondary off‑site mirror**
282-
Nightly `git push --mirror` to a second host (Bitbucket, GitLab, AWS CodeCommit) for an off-site backup.
283-
284-
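Point 3 (deleting merged feature branches) can be partly automated. The filter below is a sketch: `deletable_branches` is a hypothetical name, and the protected patterns (`main`, `develop`, `release/*`, `archive/*`) are assumptions taken from the branch names used in this document.

```bash
# Hypothetical filter: reads branch names on stdin, one per line,
# and prints only those that are not long-lived or archive branches.
deletable_branches() {
  grep -Ev '^(main|develop|release/.*|archive/.*)$'
}

# Example use: list local branches already merged into main that could be removed.
#   git branch --merged main --format='%(refname:short)' | deletable_branches
```

Review the printed list before piping it into `git branch -d`.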
---
285-
286-
## A pragmatic backup recipe for your single‑cell project
287-
288-
```bash
# 1. Before a risky refactor
git tag -a backup-pre-refactor-20250611 -m "Stable before new QC module"
git push origin backup-pre-refactor-20250611   # safe snapshot

# 2. Start work on a *temporary* feature branch
git checkout -b feat/new-qc

# 3. When the feature merges, delete the branch
git checkout main && git merge --no-ff feat/new-qc
git push origin main
git branch -d feat/new-qc              # remove the local branch
git push origin --delete feat/new-qc   # only needed if the branch was pushed
```
301-
302-
*Result:* your branch list stays clean, but the tag permanently records the pre‑refactor state.
303-
304-
---
305-
306-
## TL;DR
307-
308-
* **Need a quick, permanent snapshot?** → Use an **annotated tag** and push it.
309-
* **Need to continue editing that snapshot?** → Create a **branch**, but delete it locally when merged, or move it under an `archive/` namespace.
310-
* **Worried about losing everything?** → Automate a `--mirror` push to a second remote.
311-
312-
Combine these with protected branches and automated releases, and you’ll have both tidy history **and** rock‑solid backups—just like the big teams.
# Which reference type is fastest for **jumping back‑and‑forth** and **diffing**?
331-
332-
| Action                           | **Tag**                                                        | **Branch**                                           | Why it matters                                           |
| -------------------------------- | -------------------------------------------------------------- | ---------------------------------------------------- | -------------------------------------------------------- |
| **Checkout (read-only)**         | `git switch --detach v1.4.0` (*detached HEAD*)                 | `git switch backup/2025-06-11`                       | Both are one-line and instant; no real speed difference. |
| **Make edits / hot-fix**         | Must first make a branch: `git switch -c hotfix/v1.4.0 v1.4.0` | Already writable                                     | Branch wins for anything beyond inspection.              |
| **Diff against `main`**          | `git diff v1.4.0..main path/to/file.py`                        | `git diff backup/2025-06-11..main`                   | Same syntax; both easy.                                  |
| **List history graphically**     | `git log --decorate --graph --oneline --all` shows tags inline | Branch shows up in `git branch -a` list, tags do not | Too many *backup* branches can clutter the view.         |
| **Accidental edits on snapshot** | Not possible by committing: a tag never advances with new commits | Possible if you forget to protect the branch      | Tag is safer.                                            |
339-
340-
---
341-
342-
### Recommended workflow for **“switch, inspect, maybe patch”**
343-
344-
1. **Tag first – for a pristine snapshot**

   ```bash
   git tag -a snapshot-2025-06-11 -m "Before major refactor"
   git push origin snapshot-2025-06-11
   ```

2. **Create a *throw-away* branch only when you need to touch code**

   ```bash
   # jump to the snapshot in read-only mode (detached HEAD)
   git switch --detach snapshot-2025-06-11

   # realise you need to patch something?
   git switch -c hotfix/snapshot-2025-06-11   # now you're on a branch
   ```

3. **Compare specific files or directories**

   ```bash
   # Compare a single module
   git diff snapshot-2025-06-11..main src/qc/filtering.py

   # or interactive tool (requires difftool configured)
   git difftool snapshot-2025-06-11..HEAD -- src/qc/
   ```

4. **Merge or cherry-pick if the patch is valuable**

   ```bash
   git checkout main
   git merge --no-ff hotfix/snapshot-2025-06-11   # or cherry-pick one commit
   git push origin main
   ```

5. **Delete the temporary branch to keep things tidy**

   ```bash
   git branch -d hotfix/snapshot-2025-06-11
   git push origin --delete hotfix/snapshot-2025-06-11   # only needed if the branch was pushed
   ```

---

### Power tip: **`git worktree`** for *simultaneous* views

```bash
# Check out the tag in a sibling directory without touching your main working tree
git worktree add ../proj-snapshot snapshot-2025-06-11
```

You now have two folders:

* `sc-project/` – the normal `main`
* `proj-snapshot/` – frozen at the tag

Open both in your IDE and diff visually; no branch juggling required. When done:

```bash
git worktree remove ../proj-snapshot
```
405-
406-
---
407-
408-
## TL;DR
409-
410-
* **Tag** every backup point — it’s immutable and invisible in the branch list.
411-
* **Branch only as a scratchpad** when you discover you actually need to edit or test code on that snapshot.
412-
* Use `git diff TAG..main <file>` or `git worktree` for side‑by‑side comparison.
413-
414-
This hybrid (“tag + ad‑hoc branch”) keeps history clean **and** lets you jump, compare, or patch in seconds—exactly how mature teams stay productive without drowning in stale branches.