Design idea: pq – a general-purpose Rust-native jq pipeline processor #3574

@phlax

Motivation

We frequently use jq (and sometimes yq) across workflows—from ad-hoc Bash filters, to Bazel rules, to GitHub Actions. jq itself is robust, but yq is a pain to support and jq module portability (currently tied to GH Actions) is sorely lacking. Existing tools like aspect's jq/yq rules are brittle and hard to maintain hermetically.

Proposal: pq

A new tool, pq ("pipeline queries"), implemented as a Rust-native pipeline processor. The core ideas are:

  • Use jq syntax and its core semantics (jaq and/or shelling to jq binary)
  • Hermetic: single Rust binary, integrates cleanly with Bazel, CI, or CLI
  • Modular: borrow jq's module system, but stateless and portable
  • Pipeline-first: allow staged processing where each step's output can feed subsequent steps (json/yaml/toml to json, process, exec, branch, post-process, etc)
  • Inputs: paths, env vars, URLs, stdin, literals
  • State: expose pipeline state efficiently to steps
  • Multi-format inputs: parse yaml, json, toml natively, minimize format glue
  • Option for strict jq compat by finding and shelling out to jq binary for obscure/exotic cases
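To make the "Inputs" and "Multi-format inputs" bullets concrete, here is a minimal sketch of how an input declaration might be modelled. All type and function names (`Format`, `Source`, `InputSpec`, `classify`) are hypothetical, and so is the convention that `-` means stdin; the `$NAME` form for env vars follows the example pipeline in this issue. Literals are assumed to need an explicit key, since they cannot be told apart from paths by shape alone.

```rust
/// Formats the proposal says pq should parse natively.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Json,
    Yaml,
    Toml,
}

/// Where an input's content comes from (hypothetical model).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Source {
    Path(String),
    Env(String),
    Url(String),
    Stdin,
    /// Inline content; assumed to be declared explicitly, never guessed.
    Literal(String),
}

/// One named entry under `inputs:` in a pipeline file.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct InputSpec {
    pub format: Format,
    pub source: Source,
}

/// Classify the raw string of an input declaration.
/// Assumptions: `$NAME` is an env var, `-` is stdin, http(s) prefixes are
/// URLs, and anything else is treated as a path.
pub fn classify(raw: &str) -> Source {
    if raw == "-" {
        Source::Stdin
    } else if let Some(name) = raw.strip_prefix('$') {
        Source::Env(name.to_string())
    } else if raw.starts_with("http://") || raw.starts_with("https://") {
        Source::Url(raw.to_string())
    } else {
        Source::Path(raw.to_string())
    }
}
```

Under this sketch, `{yaml: $POLICY_PATH}` would become an `InputSpec` with `Format::Yaml` and `Source::Env("POLICY_PATH")`, and every format would be normalized to JSON before reaching a jq step.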

Example Pipeline (YAML)

inputs:
  manifest: {yaml: path/to/deployment.yaml}
  policy:   {yaml: $POLICY_PATH}
pipeline:
  - jq:      '. as $m | $policy | .items[] | select(.enabled)'
    inputs:  [manifest, policy]
    outputs:  filtered, ...
  - exec:    ["python", "-c", "import sys; print(sys.stdin.read().upper())"]
    inputs:   filtered, ...
    outputs:  transformed, ...
  - jq:      '{result: ., timestamp: now}'
    inputs:   transformed, ...
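The wiring in the example above (named inputs, named outputs, each step feeding later ones) could be sketched as a strictly serial runner over a shared state map. This is only an illustration under stated assumptions: `Step` and `run_pipeline` are hypothetical names, values are plain strings standing in for normalized JSON, each step writes a single output, and the `run` closure stands in for a jq or exec backend.

```rust
use std::collections::HashMap;

/// One pipeline step: reads named values from state, writes one named output.
pub struct Step {
    pub inputs: Vec<String>,
    pub output: String,
    /// Stand-in for a jq/exec backend; receives the input values in order.
    pub run: Box<dyn Fn(&[&str]) -> Result<String, String>>,
}

/// Run steps in order, storing each step's output under its declared name.
pub fn run_pipeline(
    mut state: HashMap<String, String>,
    steps: &[Step],
) -> Result<HashMap<String, String>, String> {
    for (index, step) in steps.iter().enumerate() {
        let out = {
            // Resolve every declared input, failing early on an unknown name.
            let args = step
                .inputs
                .iter()
                .map(|name| {
                    state
                        .get(name)
                        .map(String::as_str)
                        .ok_or_else(|| format!("step {}: unknown input `{}`", index, name))
                })
                .collect::<Result<Vec<&str>, String>>()?;
            (step.run)(&args)?
        };
        state.insert(step.output.clone(), out);
    }
    Ok(state)
}
```

A failed lookup names the step and the missing input, which is the kind of error a pipeline author would need when a step refers to an output that no earlier step produced.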

Design Notes

  • Default backend: jaq (pure Rust, no FFI)
  • "Strict" mode: pass args directly to jq binary for rare cases needing 100% compatibility
  • Designed for use in:
    • Bazel (hermetic, no Python/Go nonsense)
    • CI jobs (no need for GH-only Actions magic)
    • One-off CLI/data-wrangling
    • Everything keeps working exactly the same on a developer's laptop, CI machines, and remote runners
  • jq modules: allow portable module import path (project-local, per-user, per-step)
  • Steers clear of replicating every jq CLI quirk—API is clean, jq-mode is always there as fallback
  • Boring, reliable, documented, fast
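For the "strict" mode and portable module path notes above, one possible shape is a small function that builds the jq invocation without running it. This is a sketch, not a settled design: `strict_jq_command` and its parameters are hypothetical, the binary is assumed to be found as `jq` on `PATH`, and the only jq flag relied on is `-L`, which jq documents as adding a directory to the module search list. The order in which project-local, per-user, and per-step directories should be passed is left open here.

```rust
use std::path::PathBuf;
use std::process::Command;

/// Build (but do not run) a jq invocation for strict pass-through mode.
/// `module_dirs` are passed as `-L <dir>` in the order given;
/// `passthrough` holds user-supplied jq options, forwarded untouched.
pub fn strict_jq_command(
    filter: &str,
    module_dirs: &[PathBuf],
    passthrough: &[String],
) -> Command {
    // Assumption: jq is resolved via PATH; a hermetic setup would
    // pass an explicit path to the binary instead.
    let mut cmd = Command::new("jq");
    for dir in module_dirs {
        cmd.arg("-L").arg(dir);
    }
    cmd.args(passthrough);
    cmd.arg(filter);
    cmd
}
```

Keeping construction separate from execution makes the argument list easy to inspect in tests, and lets the caller decide how stdin and stdout are connected to the surrounding pipeline.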

Open Questions

  • Should the pipeline spec allow for conditional/branching/validation features?
  • How should jq args and the module search path be handled: mirror jq's own environment, or use a simpler system (env vars, CLI flags, project config)?
  • What's the best ergonomic structure for importing modules per-pipeline or per-step?
  • Should pipelines explicitly support parallel/concurrent steps or be strictly serial? (parallel/async!!!)
  • Should module system allow for sharing reusable jq snippets between unrelated pipelines?
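On the serial versus parallel question: if steps declare their inputs and outputs as in the example pipeline, the runner could derive which steps are independent without any extra syntax. The sketch below groups steps into "waves" where every step in a wave only needs values that are already available. It is one possible approach, not a decision; `StepDecl` and `schedule` are hypothetical names.

```rust
use std::collections::HashSet;

/// Declared data dependencies of one step (hypothetical model).
pub struct StepDecl {
    pub inputs: Vec<String>,
    pub outputs: Vec<String>,
}

/// Group step indices into waves. Steps in the same wave do not depend on
/// each other and could run concurrently; waves themselves run in order.
pub fn schedule(initial: &[&str], steps: &[StepDecl]) -> Result<Vec<Vec<usize>>, String> {
    let mut available: HashSet<&str> = initial.iter().copied().collect();
    let mut pending: Vec<usize> = (0..steps.len()).collect();
    let mut waves = Vec::new();

    while !pending.is_empty() {
        // A step is ready when all of its inputs have been produced.
        let (ready, blocked): (Vec<usize>, Vec<usize>) = pending
            .into_iter()
            .partition(|&i| steps[i].inputs.iter().all(|n| available.contains(n.as_str())));

        if ready.is_empty() {
            // Nothing can make progress: a missing input or a cycle.
            return Err(format!("steps {:?} have unsatisfied inputs", blocked));
        }
        for &i in &ready {
            for out in &steps[i].outputs {
                available.insert(out.as_str());
            }
        }
        waves.push(ready);
        pending = blocked;
    }
    Ok(waves)
}
```

A strictly serial runner is the special case where each wave is executed one step at a time, so the same declarations would support both modes.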

Next Steps

  • Sketch the high-level code structure and draft the initial CLI UX
  • Work out minimal viable pipeline runner
  • Integrate jaq with input normalization and output marshalling
  • (Optional) Implement native/jq/compat backend with strict pass-through mode


Labels: enhancement (New feature or request), rust (Pull requests that update rust code)
