| 1 | +# scplotter - AI Coding Agent Instructions |
| 2 | + |
| 3 | +## Project Overview |
| 4 | + |
| 5 | +`scplotter` is an R package for publication-quality visualizations of single-cell sequencing data. It's built as a **wrapper layer** around [`plotthis`](https://github.com/pwwang/plotthis), extracting data from Seurat/Giotto objects and .h5ad files to pass to plotthis functions. |
| 6 | + |
| 7 | +**Key Architecture Principle**: This package does NOT implement core plotting logic. It transforms single-cell data structures into formats compatible with `plotthis`, which handles the actual visualization. |
| 8 | + |
| 9 | +## Core Components |
| 10 | + |
| 11 | +### Plot Function Categories |
| 12 | +1. **scRNA-seq**: `CellDimPlot`, `CellStatPlot`, `FeatureStatPlot`, `EnrichmentPlot`, `MarkersPlot`, `CCCPlot` (cell-cell communication) |
| 13 | +2. **scTCR/BCR-seq**: `Clonal*Plot` family (15+ functions for TCR/BCR repertoire analysis via `scRepertoire`) |
| 14 | +3. **Spatial**: `SpatDimPlot`, `SpatFeaturePlot` with multi-platform support (Visium, VisiumHD, Xenium, CosMx, CODEX, etc.) |
| 15 | +4. **LLM Integration**: `SCPlotterChat` R6 class using `tidyprompt` for natural language plot generation |
| 16 | + |
| 17 | +### Data Flow Pattern |
| 18 | +All plot functions follow this structure: |
| 19 | +```r |
| 20 | +Plot*() -> UseMethod() -> Plot*.Seurat / Plot*.giotto / Plot*.H5File -> plotthis::PlotFunction() |
| 21 | +``` |
| 22 | + |
| 23 | +Functions extract embeddings, metadata, graphs from objects, then delegate to `plotthis`. See `R/celldimplot.R` (lines 1-100) for the canonical pattern. |
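
The dispatch chain above can be sketched with a toy S3 generic. This is an illustration under stated assumptions, not code from the package: `ToyDimPlot`, the `toy_seurat` class, and `render_points()` (standing in for a `plotthis` call) are all hypothetical names.

```r
# Toy generic mirroring the Plot*() -> UseMethod() -> method pattern.
# ToyDimPlot, toy_seurat and render_points are hypothetical names.
ToyDimPlot <- function(object, ...) UseMethod("ToyDimPlot")

# Stand-in for the plotting backend: only reports what it would draw.
render_points <- function(data, x, y, group_by) {
  sprintf("%d points, %s vs %s, colored by %s", nrow(data), x, y, group_by)
}

ToyDimPlot.toy_seurat <- function(object, group_by, ...) {
  # Extract embeddings and metadata, then combine them into one data frame.
  data <- cbind(object$embeddings, object$meta[group_by])
  # Delegate the actual drawing to the backend.
  render_points(data, x = "UMAP_1", y = "UMAP_2", group_by = group_by)
}

ToyDimPlot.default <- function(object, ...) {
  stop("Unsupported object class: ", paste(class(object), collapse = "/"))
}

# A minimal fake object carrying embeddings and per-cell metadata.
obj <- structure(
  list(
    embeddings = data.frame(UMAP_1 = c(0.1, 0.5, 0.9), UMAP_2 = c(1.2, 0.3, 0.7)),
    meta = data.frame(cluster = c("a", "b", "a"))
  ),
  class = "toy_seurat"
)
result <- ToyDimPlot(obj, group_by = "cluster")
```

Each method owns only the extraction step for its object type, so supporting a new container means adding one method rather than touching the plotting code.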
| 24 | + |
| 25 | +## Critical Development Patterns |
| 26 | + |
| 27 | +### 1. Object Support via S3 Methods |
| 28 | +Functions must support multiple object types through S3 dispatch: |
| 29 | +- `Seurat` objects (via `SeuratObject` package) |
| 30 | +- `giotto` objects (via `GiottoClass`) |
| 31 | +- `.h5ad` files (paths or `hdf5r::H5File` objects) |
| 32 | + |
| 33 | +Example: `SpatPlot.Seurat`, `SpatPlot.giotto`, `SpatPlot.H5File` in `R/spatialplot.R` |
| 34 | + |
| 35 | +### 2. Import Strategy |
| 36 | +- Use `@importFrom` for specific functions (never `@import` for entire packages except R6) |
| 37 | +- Import from `plotthis` heavily: `@importFrom plotthis DimPlot ChordPlot Heatmap ...` |
| 38 | +- Import helper functions via `getFromNamespace()`: `check_columns <- getFromNamespace("check_columns", "plotthis")` (see `R/utils.R`) |
| 39 | +- Tidyverse imports: `@importFrom dplyr filter mutate group_by`, `@importFrom tidyr pivot_wider` |
| 40 | + |
| 41 | +### 3. Clone Selector DSL (scRepertoire) |
| 42 | +Unique feature: expression-based clone selection in `Clonal*Plot` functions |
| 43 | +- Users write: `clones = "top(3)"` or `clones = "shared() & gt(5)"` |
| 44 | +- Implementation: `rlang::parse_expr()` parses the selector string into an expression, which is then evaluated with the selector functions in scope |
| 45 | +- Selector functions: `top()`, `shared()`, `uniq()`, `gt()`, `ge()`, `lt()`, `le()`, `and()`, `or()`, `sel()` |
| 46 | +- See `R/clonalutils.R` (lines 360+) and test files in `tests/testthat/test-clone_selectors_*.R` |
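
The selector mechanism can be approximated as follows. This is a minimal sketch, not the code in `R/clonalutils.R`: `select_clones()`, the `clone_data` table, and the two simplified selectors are assumptions made for illustration, and the real selectors operate on `scRepertoire` output with more arguments.

```r
library(rlang)

# Hypothetical clone table; the real functions work on scRepertoire output.
clone_data <- data.frame(
  clone = c("c1", "c2", "c3", "c4"),
  count = c(12, 7, 7, 2)
)

select_clones <- function(selector, data) {
  # Toy selectors: each returns a logical vector over the rows of `data`.
  selectors <- list(
    top = function(n) rank(-data$count, ties.method = "first") <= n,
    gt  = function(n) data$count > n
  )
  # Turn the string into an unevaluated call.
  expr <- parse_expr(selector)
  # Evaluate it where only the selectors and base operators are visible.
  env <- list2env(selectors, parent = baseenv())
  keep <- eval(expr, envir = env)
  data$clone[keep]
}

top2 <- select_clones("top(2)", clone_data)
both <- select_clones("top(3) & gt(7)", clone_data)
```

Because every selector returns a logical vector, combinations such as `top(3) & gt(7)` need no special handling: the ordinary `&` operator does the work.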
| 47 | + |
| 48 | +### 4. H5AD File Support |
| 49 | +- Read .h5ad files using `hdf5r` package |
| 50 | +- Helpers: `h5group_to_dataframe()`, `h5group_to_matrix()` in `R/utils.R` |
| 51 | +- Handle sparse matrices (CSR format) and categorical data from AnnData |
| 52 | +- See vignette: `vignettes/Working_with_anndata_h5ad_files.Rmd` |
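
For the CSR handling mentioned above, the index arithmetic looks roughly like this. The sketch assumes the three CSR arrays (`data`, `indices`, `indptr`) have already been read from the file into a plain list; `csr_to_dense()` is a hypothetical helper, not `h5group_to_matrix()`, and real code would more likely build a sparse matrix than a dense one.

```r
# CSR components with 0-based indices, as AnnData stores them; toy values.
csr <- list(
  data    = c(5, 8, 3, 6),
  indices = c(0L, 1L, 2L, 1L),  # column index of each stored value
  indptr  = c(0L, 1L, 2L, 4L),  # row i owns entries indptr[i] .. indptr[i + 1] - 1
  shape   = c(3L, 3L)
)

csr_to_dense <- function(csr) {
  # Number of stored values per row, from consecutive pointer differences.
  n_per_row <- diff(csr$indptr)
  # Expand to one (1-based) row index per stored value.
  rows <- rep(seq_len(csr$shape[1]), times = n_per_row)
  # Shift 0-based column indices to R's 1-based convention.
  cols <- csr$indices + 1L
  m <- matrix(0, nrow = csr$shape[1], ncol = csr$shape[2])
  m[cbind(rows, cols)] <- csr$data
  m
}

dense <- csr_to_dense(csr)
```

The off-by-one shift on `indices` is the step most easily missed when moving between the Python and R conventions.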
| 53 | + |
| 54 | +### 5. Spatial Data Complexity |
| 55 | +Spatial functions (`R/spatialplot.R`) handle multiple coordinate systems: |
| 56 | +- **Visium**: `imagerow`/`imagecol` coordinates, scale_factor=1 |
| 57 | +- **VisiumHD**: Flexible binning, requires `scale_factor` parameter |
| 58 | +- **FOV-based** (Xenium, CosMx): `x`/`y` coordinates, optional molecule points (`nmols`) |
| 59 | +- **Giotto**: Uses `spat_unit`, `feat_type`, `spat_loc_name` parameters |
| 60 | + |
| 61 | +Check object type first: `class(object@images[[first_image]])` → dispatch to appropriate handler |
| 62 | + |
| 63 | +## Build & Test Workflow |
| 64 | + |
| 65 | +### Development Commands (Makefile) |
| 66 | +```fish |
| 67 | +make readme # Build README from README.Rmd |
| 68 | +make docs # Generate documentation + pkgdown site |
| 69 | +make install # Build and install package locally |
| 70 | +make test # Run testthat tests |
| 71 | +make notebooks # Convert notebooks to HTML (use EXECUTE=true to run) |
| 72 | +``` |
| 73 | + |
| 74 | +### Testing |
| 75 | +- Framework: `testthat` (see `tests/testthat/test-*.R`) |
| 76 | +- Focus on clone selector logic, not plot outputs |
| 77 | +- Run via `devtools::test()` or `make test` |
| 78 | + |
| 79 | +### Documentation |
| 80 | +- Roxygen2 for function docs (roxygen2 7.3.3) |
| 81 | +- pkgdown site config: `_pkgdown.yml` (Bootstrap 5, litera theme) |
| 82 | +- Spatial examples in `vignettes/articles/` (16 platform-specific tutorials) |
| 83 | + |
| 84 | +### CI/CD (.github/workflows/main.yml) |
| 85 | +- Runs on Ubuntu (R 4.4.1) |
| 86 | +- Installs system dep: `libglpk40` for igraph |
| 87 | +- Builds pkgdown site, deploys to gh-pages |
| 88 | +- Requires `OPENAI_API_KEY` secret for LLM vignette |
| 89 | + |
| 90 | +## Package Dependencies |
| 91 | + |
| 92 | +**Core**: `plotthis` (main plotting engine), `scRepertoire` (>= 2.0.8, < 2.3.2), `Seurat` (>= 5.0.0), `SeuratObject` |
| 93 | + |
| 94 | +**Spatial**: `GiottoClass`, `GiottoData`, `hdf5r`, `terra` |
| 95 | + |
| 96 | +**TCR/BCR**: `scRepertoire`, `iNEXT` (rarefaction), `metap` (>= 1.11, p-value combination) |
| 97 | + |
| 98 | +**LLM**: `tidyprompt` (from GitHub: KennispuntTwente/tidyprompt), `callr` |
| 99 | + |
| 100 | +**Remotes**: `plotthis`, `GiottoClass`, `GiottoData`, `tidyprompt` installed from GitHub via `Remotes:` field |
| 101 | + |
| 102 | +## Common Pitfalls |
| 103 | + |
| 104 | +1. **Don't implement plots**: Call `plotthis::*Plot()`, don't reinvent visualization logic |
| 105 | +2. **Version constraints**: `scRepertoire` locked to 2.0.8-2.3.1 (API changes), `ggVennDiagram >= 1.5.0` |
| 106 | +3. **Lazy data**: `LazyData: true` with `xz` compression for included datasets (`data/*.rda`) |
| 107 | +4. **Clone selector scope**: Selectors are evaluated in the caller's environment, so use `rlang::caller_env()` carefully |
| 108 | +5. **Spatial coords**: Always check platform-specific coordinate naming (x/y vs imagerow/imagecol) |
| 109 | + |
| 110 | +## When Adding New Plot Functions |
| 111 | + |
| 112 | +1. Create S3 generic + methods for Seurat/giotto/H5File |
| 113 | +2. Extract necessary data (embeddings, metadata, matrices) |
| 114 | +3. Transform to format expected by `plotthis::*Plot()` |
| 115 | +4. Add `@importFrom` statements for all dependencies |
| 116 | +5. Write examples using included datasets: `pancreas_sub`, `ifnb_sub`, `cellphonedb_res` |
| 117 | +6. Add vignette if introducing new data type or analysis workflow |
| 118 | + |
| 119 | +## LLM Chat Integration |
| 120 | + |
| 121 | +`SCPlotterChat` (R6 class in `R/chat.R`) enables: |
| 122 | +- Natural language plot requests: `chat$ask("Plot cell-cell communication as heatmap")` |
| 123 | +- Auto-detects datasets from `.GlobalEnv` and package data |
| 124 | +- Maintains conversation history for context |
| 125 | +- Tool discovery from package namespace |
| 126 | + |
| 127 | +Uses `tidyprompt` for provider abstraction (OpenAI, Anthropic, etc.) |