DataDog · rachelyangdog · Jun 5, 2026 · platinummonkey · Jun 5, 2026 · platinummonkey
@@ -0,0 +1,103 @@
+# `pup install` — design
+
+This doc covers the design and rollout plan for `pup install <component> <platform>` — pup's unified entry point for installing Datadog components on a target. The first concrete leaf is `pup install ssi linux`, the rest follow in subsequent PRs.
+
+## Motivation
+
+Today, multiple agent-skills install Datadog components by embedding `bash -c "$(curl -L install.datadoghq.com/scripts/install_script_agent7.sh)"` in their SKILL.md. Security scanners (Snyk, etc.) correctly flag this as RCE-shaped — and the script itself lives outside any repo we control, with no pinned version or audit trail.
+
+Workarounds we evaluated:
+
+- **Pointing skills at docs** — breaks the "user does nothing" promise that SSI sells.
+- **Vendoring the install script in the agent-skills repo** — Datadog ships updates per release; we'd inherit a sync treadmill.
+- **Manual apt/yum install in the skill** — only works for the agent; the SSI components (apm-inject, language libs) aren't deb/rpm packages, they're OCI artifacts delivered by the `datadog-installer` binary.
+- **Snyk policy exception** — pragmatic but doesn't fix the underlying audit-trail issue.
+
+The clean answer is to move install logic out of markdown and into pup. The skill becomes a single line (`pup install ssi linux --host <X>`), and the install steps execute in compiled, reviewed Rust — using HTTP fetch + SHA-256 verification + signed package install rather than fetch-and-execute shell. This is the same shape every modern install CLI (helm, brew, AWS CLI) already uses.
+
+## Command surface
+
+```
+pup install <component> <platform> [args]
+```
+
+Two dimensions:
+
+- **Component** — what to install (`ssi`, `agent`, `dbm`, `llm-obs`, `csm`, …)
+- **Platform** — where to install (`linux`, `k8s`, `docker`, `windows`, …)
+
+Today, only `ssi linux` is wired up (and as scaffolding — see "Status" below). The matrix grows as future PRs add components/platforms.
+
+### Examples
+
+Today (after this scaffolding PR):
+
+```
+pup install ssi linux --host bastion.example.com --user ec2-user --key ~/.ssh/id_ed25519
+pup install ssi linux --host bastion.example.com --dry-run
+```
+
+Future (out of scope for this PR):
+
+```
+pup install agent  linux  --host <X>           # just the agent, no SSI
+pup install ssi    k8s    --context <CTX>      # SSI via helm + DatadogAgent CR
+pup install dbm    linux  --host <X> --db pg   # Database Monitoring
+pup install llm-obs local --runtime python     # LLM Obs SDK
+```
+
+## Linux SSI install flow
+
+This is what `pup install ssi linux --host <X>` will do once implemented:
+
+1. Open an SSH session to the target via the `russh` client (already an optional dep of pup; used today by the auth tunnel server).
+2. Detect the host's package manager by reading `/etc/os-release` (`$ID` + `$ID_LIKE`).
+3. Configure the Datadog package repo on the target:
+   - **Debian / Ubuntu**: write `/etc/apt/sources.list.d/datadog.list` (with `signed-by=`), download `keys.datadoghq.com/DATADOG_APT_KEY_*.public` keys, import via `gpg --no-default-keyring`.
+   - **RHEL / CentOS / Amazon / Rocky / Alma**: write `/etc/yum.repos.d/datadog.repo` referencing the same key URLs.
+4. Install signed packages via the host's package manager: `datadog-agent` (+ `datadog-signing-keys` on apt).
+5. Download the `datadog-installer` binary from the OCI registry (`gcr.io/datadoghq/installer-package/blobs/sha256:<pin>`), verify against a pinned SHA-256, then upload it to the remote host.
+6. Execute `datadog-installer setup` on the remote host to install `apm-inject` + the language libraries and to write `/etc/ld.so.preload`. (The exact subcommand and flags — `--flavor "APM SSI"`, `--apm-instrumentation-enabled=host`, etc. — need verification with whoever owns `datadog-installer`. See "Open questions" below.)
+7. Patch `/etc/datadog-agent/datadog.yaml` with `DD_API_KEY` and `DD_SITE` (sourced from pup's existing auth config).
+8. Restart the agent and confirm `datadog-agent status` reports healthy.
+
+The critical property: nowhere in this flow does pup execute a remote shell script. Every step is either (a) an `apt-get` / `yum` install against signed package repos, (b) an HTTP fetch of a key file or binary followed by signature/hash verification, or (c) executing a known binary with known arguments. The `curl | bash` pattern is gone.
+
+## Security model
+
+- **No fetch-and-execute.** No `curl | bash`, no `bash -c "$(curl ...)"`, no `wget | sh`. All downloads write to a file, then verify, then either parse or execute.
+- **Signed packages.** Agent + SSI components install via apt/yum with the Datadog signing keys imported into a keyring; signature verification is enforced by the package manager.
+- **Pinned binary.** The `datadog-installer` binary is downloaded by SHA-256 (pinned in pup's source) rather than "whatever the latest URL serves." This trades update friction for a real audit trail.
+- **Auth surface.** SSH credentials are passed via flags / config (same shape as existing ssh tooling); Datadog API credentials use pup's existing `auth` infrastructure.
+- **No new outbound endpoints beyond what the official install script already hits** — `apt.datadoghq.com`, `yum.datadoghq.com`, `keys.datadoghq.com`, `gcr.io` / `public.ecr.aws`.
+
+## Status (as of this PR)
+
+This is a **scaffolding PR**. What's in vs out:
+
+**In:**
+- The `pup install` command surface in `src/main.rs` (clap subcommands `Install` → `Ssi` → `Linux`).
+- A new `src/commands/install.rs` module with the `ssi_linux` dispatcher.
+- `--dry-run` prints the planned install steps for the given target.
+- Without `--dry-run`, the command bails with an actionable error that points at this doc.
+- Unit tests for both behaviors.
+
+**Out (follow-up PRs):**
+- The actual `russh` client session (connect, exec, file transfer).
+- The OCI registry manifest fetch + blob download + SHA-256 verification.
+- The repo file / keyring setup on the remote host.
+- The `datadog-installer setup` invocation.
+- The agent restart + health check.
+- Integration tests against a real Linux host.
+- Multi-distro test matrix (Debian 11/12, Ubuntu 22.04/24.04, RHEL 8/9, Amazon Linux 2/2023, Rocky 9).
+
+The intent is to land the *surface* now so the pup team can review the shape, and to land the implementation in small, reviewable follow-ups.
+
+## Open questions for the pup team
+
+1. **Where should the install module live?** This PR puts everything in `src/commands/install.rs`. Once the implementation lands the install logic might want its own top-level `src/install/` module with submodules (`ssh.rs`, `oci.rs`, `linux_ssi.rs`). Happy to refactor at that point.
+2. **SSH transport.** `russh` is already an optional dep but currently used only as a server (`src/tunnel.rs`). Should the install module reuse `russh` as a client, or shell out to the system `ssh` binary? `russh` keeps pup self-contained (especially on Windows); shelling out is simpler but adds a runtime dependency.
+3. **`datadog-installer` CLI surface.** The exact subcommand and flags used by `install-ssi.sh` (`setup --flavor "APM SSI"` per one read of the script) need to be confirmed with whoever owns that binary as a public, stable contract. If the contract isn't stable, we need a different integration point.
+4. **OCI version pinning.** Who owns bumping the `datadog-installer` SHA-256 in pup when Datadog ships a new release? Manual on each agent release, or auto-bumped from a manifest?
+5. **k8s parity.** `pup install ssi k8s` is the natural next leaf. The mechanics are entirely different (helm + DatadogAgent CR + admission webhook) — does it deserve its own command tree, or stay under `install ssi`?
+6. **Auth integration.** Pup's existing `auth` config gives access to Datadog API. The install flow needs `DD_API_KEY` + `DD_SITE` to write `/etc/datadog-agent/datadog.yaml` — should the install command reuse `cfg.api_key` / `cfg.site`, or take its own flags?
@@ -0,0 +1,150 @@
+//! `pup install <component> <platform>` — unified install entry point.
+//!
+//! **Status: scaffolding.** The CLI surface and module structure land in this PR;
+//! the actual install flow (russh client, OCI registry fetch, signed package
+//! install, `datadog-installer setup`) ships in follow-up PRs.
+//!
+//! Motivation lives in `docs/INSTALL.md` — short version: skills currently
+//! embed `bash -c "$(curl …install_script_agent7.sh)"` to install the agent +
+//! SSI on remote hosts, which scanners (correctly) flag as RCE-shaped and which
+//! references a script that lives outside any repo we control. Moving the
+//! install flow into pup turns the SKILL.md into a one-liner (`pup install …`)
+//! and shifts the install logic into compiled Rust that already gets reviewed.
+//!
+//! Today this returns a structured "not yet implemented" error that lists the
+//! planned steps. `--dry-run` prints the same plan without erroring so callers
+//! can wire up the surface end-to-end.
+
+use anyhow::{bail, Result};
+
+use crate::config::Config;
+
+/// Install Single Step Instrumentation on a remote Linux host.
+///
+/// The `--dry-run` flag returns successfully after printing the install plan.
+/// Without it, returns an error today — the actual install lands in a follow-up
+/// PR per the checklist in `docs/INSTALL.md`.
+pub async fn ssi_linux(
+    _cfg: &Config,
+    host: String,
+    user: String,
+    key: Option<String>,
+    port: u16,
+    dry_run: bool,
+) -> Result<()> {
+    let plan = LinuxSsiPlan {
+        host: &host,
+        user: &user,
+        key: key.as_deref(),
+        port,
+    };
+
+    if dry_run {
+        plan.print();
+        return Ok(());
+    }
+
+    bail!(
+        "`pup install ssi linux` is scaffolded but not yet executable in this build. \
+         Re-run with --dry-run to see the planned steps for {user}@{host}:{port}, \
+         or track the follow-up implementation work in docs/INSTALL.md.",
+    )
+}
+
+/// Captured arguments for a single Linux-SSI install. Kept narrow on purpose —
+/// the actual install code lives in follow-up PRs and will likely want a
+/// richer config struct (e.g. proxy settings, sudo strategy, distro override).
+struct LinuxSsiPlan<'a> {
+    host: &'a str,
+    user: &'a str,
+    key: Option<&'a str>,
+    port: u16,
+}
+
+impl LinuxSsiPlan<'_> {
+    fn print(&self) {
+        println!("# Planned `pup install ssi linux` steps");
+        println!("#");
+        println!(
+            "# Target: {user}@{host}:{port}{key}",
+            user = self.user,
+            host = self.host,
+            port = self.port,
+            key = match self.key {
+                Some(k) => format!("  (key: {k})"),
+                None => "  (key: SSH agent / default identity)".to_string(),
+            }
+        );
+        println!("#");
+        println!("# 1. Open an SSH session via russh client.");
+        println!("# 2. Detect distro family by reading /etc/os-release ($ID + $ID_LIKE).");
+        println!("# 3. Configure Datadog package repo for the host:");
+        println!("#      - Debian/Ubuntu → add apt repo signed by");
+        println!("#        keys.datadoghq.com/DATADOG_APT_KEY_CURRENT.public");
+        println!("#      - RHEL/CentOS/Amazon/Rocky/Alma → write /etc/yum.repos.d/datadog.repo");
+        println!("# 4. Install signed packages: datadog-agent (+ datadog-signing-keys on apt).");
+        println!("# 5. Download the `datadog-installer` binary from the OCI registry");
+        println!("#      (gcr.io/datadoghq/installer-package), verify its SHA-256 against a");
+        println!("#      pinned manifest, and execute it locally on the remote host.");
+        println!("# 6. Run `datadog-installer setup` to install apm-inject + language");
+        println!("#      libraries and to write /etc/ld.so.preload.");
+        println!("# 7. Patch /etc/datadog-agent/datadog.yaml with DD_API_KEY + DD_SITE.");
+        println!("# 8. Restart the agent and confirm `datadog-agent status` is healthy.");
+        println!("#");
+        println!("# None of these steps run today — this command is scaffolding so the pup team");
+        println!("# can review the shape before the implementation lands. See docs/INSTALL.md.");
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use crate::test_support::test_config;
+
+    #[tokio::test]
+    async fn dry_run_returns_ok_and_does_not_bail() {
+        // ssi_linux's scaffolding doesn't read from cfg yet, so we just need
+        // *any* Config to satisfy the signature.
+        let cfg = test_config("http://unused.invalid");
+        let result = ssi_linux(
+            &cfg,
+            "bastion.example.com".to_string(),
+            "ec2-user".to_string(),
+            Some("~/.ssh/id_ed25519".to_string()),
+            22,
+            /* dry_run */ true,
+        )
+        .await;
+        assert!(
+            result.is_ok(),
+            "dry-run should succeed; got {:?}",
+            result.err()
+        );
+    }
+
+    #[tokio::test]
+    async fn non_dry_run_bails_with_actionable_message() {
+        let cfg = test_config("http://unused.invalid");
+        let err = ssi_linux(
+            &cfg,
+            "host.example.com".to_string(),
+            "root".to_string(),
+            None,
+            22,
+            /* dry_run */ false,
+        )
+        .await
+        .expect_err("non-dry-run should bail until the real impl lands");
+        let msg = format!("{err:#}");
+        // The error message must tell the user what to do next, not just say
+        // "not implemented" with no context.
+        assert!(
+            msg.contains("--dry-run"),
+            "error should point at --dry-run as the working flow, got: {msg}"
+        );
+        assert!(
+            msg.contains("docs/INSTALL.md"),
+            "error should reference the design doc, got: {msg}"
+        );
+    }
+}
@@ -47,6 +47,8 @@ pub mod hamr;
 pub mod idp;
 pub mod incidents;
 pub mod infrastructure;
+#[cfg(not(target_arch = "wasm32"))]
+pub mod install;
 pub mod integrations;
 pub mod investigations;
 pub mod kafka;

@@ -1551,6 +1551,37 @@ enum Commands {
         #[command(subcommand)]
         action: InfraActions,
     },
+    /// Install Datadog components on a target (host, cluster, …)
+    ///
+    /// Install Datadog observability components on a target — the unified entry
+    /// point for the install flows historically delivered by curl-bash one-liners
+    /// (`install_script_agent7.sh`, `install-ssi.sh`, etc.).
+    ///
+    /// COMMAND SHAPE:
+    ///   pup install <component> <platform> [args]
+    ///
+    ///   <component>  what to install: ssi, agent, dbm, llm-obs, …
+    ///   <platform>   where to install: linux, k8s, …
+    ///
+    /// EXAMPLES (today):
+    ///   # Install Datadog Agent + Single Step Instrumentation on a remote Linux host
+    ///   pup install ssi linux --host bastion.example.com --user ec2-user --key ~/.ssh/id_ed25519
+    ///
+    /// EXAMPLES (planned, follow-up PRs):
+    ///   pup install agent  linux  --host <X>           # just the agent
+    ///   pup install ssi    k8s    --context <CTX>      # SSI via helm + DatadogAgent CR
+    ///   pup install dbm    linux  --host <X> --db pg   # Database Monitoring
+    ///   pup install llm-obs local --runtime python     # LLM Obs SDK
+    ///
+    /// STATUS: scaffolding. Command surface and module structure land in this PR;
+    /// the russh client + OCI registry client + installer-binary invocation are
+    /// follow-up work. See docs/INSTALL.md for the full plan.
+    #[cfg(not(target_arch = "wasm32"))]
+    #[command(verbatim_doc_comment)]
+    Install {
+        #[command(subcommand)]
+        action: InstallActions,
+    },
     /// Manage third-party integrations
     ///
     /// Manage third-party integrations with external services.
@@ -4431,6 +4462,48 @@ enum InfraHostActions {
     Get { hostname: String },
 }
 
+// ---- Install: Datadog component install on a target (host, cluster, …) ----
+//
+// Shape: `pup install <component> <platform> [args]`. Scaffolding only in this
+// PR — the working install flows ship in follow-up PRs. See docs/INSTALL.md.
+#[cfg(not(target_arch = "wasm32"))]
+#[derive(Subcommand)]
+enum InstallActions {
+    /// Install Single Step Instrumentation (Datadog Agent + APM auto-injection)
+    Ssi {
+        #[command(subcommand)]
+        action: InstallSsiActions,
+    },
+}
+
+#[cfg(not(target_arch = "wasm32"))]
+#[derive(Subcommand)]
+enum InstallSsiActions {
+    /// Install SSI on a remote Linux host over SSH
+    ///
+    /// Replaces the historical `bash -c "$(curl install_script_agent7.sh)"` flow
+    /// with a native Rust install: SSH client, signed package install (apt/yum),
+    /// SHA-256-verified OCI binary download for `datadog-installer`, then
+    /// `installer setup` to enable SSI. No remote shell scripts are executed.
+    Linux {
+        /// SSH host or `user@host` of the target machine
+        #[arg(long)]
+        host: String,
+        /// SSH user (default: root)
+        #[arg(long, default_value = "root")]
+        user: String,
+        /// Path to SSH private key
+        #[arg(long)]
+        key: Option<String>,
+        /// SSH port (default: 22)
+        #[arg(long, default_value_t = 22)]
+        port: u16,
+        /// Print the install plan without connecting or making changes
+        #[arg(long)]
+        dry_run: bool,
+    },
+}
+
 // ---- IDP (Internal Developer Portal) ----
 #[derive(Subcommand)]
 enum IdpActions {
@@ -11914,6 +11987,21 @@ async fn main_inner() -> anyhow::Result<()> {
                 },
             }
         }
+        // --- Install (Datadog component install on a target) ---
+        #[cfg(not(target_arch = "wasm32"))]
+        Commands::Install { action } => match action {
+            InstallActions::Ssi { action } => match action {
+                InstallSsiActions::Linux {
+                    host,
+                    user,
+                    key,
+                    port,
+                    dry_run,
+                } => {
+                    commands::install::ssi_linux(&cfg, host, user, key, port, dry_run).await?;
+                }
+            },
+        },
         // --- IDP (Internal Developer Portal) ---
         Commands::Idp { action } => {
             cfg.validate_auth()?;