two-node

Workflow automation for two-node OpenShift topologies, Two-Node with Arbiter (TNA) and Two-Node with Fencing (TNF), in edge deployments.

Installation

Install via Claude Code's plugin system:

/plugin marketplace add openshift-eng/edge-tooling
/plugin install two-node

Prerequisites

podman for running the MCP server container
JIRA_USERNAME environment variable set to your Red Hat email address
JIRA_API_TOKEN environment variable set to a Jira API token

The plugin includes an .mcp.json that automatically configures the mcp-atlassian MCP server.
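The prerequisites above can be checked before installing the plugin. The following is a minimal sketch, not something the plugin ships: the function name check_prereqs and its optional argument are hypothetical, and it only tests that the tool is on PATH and that the two variables are non-empty. It does not validate the token against Jira.

```shell
# Hypothetical preflight helper; check_prereqs is not part of the plugin.
# Usage: check_prereqs [container-tool]   (defaults to podman)
check_prereqs() {
  tool="${1:-podman}"
  missing=0
  # The MCP server runs in a container, so the container tool must be on PATH.
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "missing: $tool" >&2
    missing=1
  fi
  # Both Jira variables must be set and non-empty.
  if [ -z "${JIRA_USERNAME:-}" ]; then
    echo "missing: JIRA_USERNAME" >&2
    missing=1
  fi
  if [ -z "${JIRA_API_TOKEN:-}" ]; then
    echo "missing: JIRA_API_TOKEN" >&2
    missing=1
  fi
  # Returns 0 when everything is present, 1 otherwise.
  return "$missing"
}
```

Calling check_prereqs with no argument checks for podman; each missing item is reported on stderr.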

Additional prerequisites for `/two-node:bug-reproducer`

A Claude Code session opened in the Two-Node Toolbox (TNT) repo, at two-node-toolbox/deploy/ or two-node-toolbox/deploy/openshift-clusters/. Running the skill from any other directory results in an error.
EC2 instance running, with make inventory completed
EC2 instance configured (./configure) and reachable over SSH
Pull secret at deploy/openshift-clusters/roles/dev-scripts/install-dev/files/pull-secret.json (relative to repo root)

Skills

`/two-node:create-rhel-stories`

Create OCPEDGE stories for TNF RHEL verification tickets, link them to the RHEL bugs, and set the required components (Two Node Fencing, QE, RHEL-Verification).

# Auto-discover untested tickets
/two-node:create-rhel-stories

# Dry run (preview without changes)
/two-node:create-rhel-stories --dry-run

# Specific tickets
/two-node:create-rhel-stories RHEL-12345 RHEL-12346 RHEL-12347

# JQL query
/two-node:create-rhel-stories jql:project = RHEL AND component = "resource-agents" AND status != Closed

Features:

Auto-discovery of untested TNF resource-agents RHEL tickets
Clone expansion across the full clone tree
Dry-run mode for previewing without modifying Jira
Closed story handling (creates new stories for untested tickets)
Subtask creation (verification + automation)

`/two-node:verify-rhel-bugfix`

Verify a RHEL resource-agents bug fix on a TNF cluster. Given a Jira issue key or issue URL, the skill fetches the full bug context from Jira (title, z-stream, upstream PR, linked OCPEDGE tracking ticket, test instructions), then walks through the verification workflow.

/two-node:verify-rhel-bugfix RHEL-157145
/two-node:verify-rhel-bugfix https://issues.redhat.com/browse/RHEL-157145
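Both invocation forms above refer to the same ticket. As an illustration of how a key or a browse URL could be reduced to one issue key, here is a small sketch. The helper name issue_key and the key pattern (uppercase project, hyphen, digits) are assumptions made for this example; the skill's actual parsing is not documented here.

```shell
# Hypothetical helper: reduce "RHEL-157145" or a browse URL to the bare key.
issue_key() {
  # Drop everything up to the last "/", e.g. https://issues.redhat.com/browse/
  key="${1##*/}"
  # Assumed key shape: PROJECT-NUMBER, such as RHEL-157145 or OCPBUGS-66217.
  if printf '%s\n' "$key" | grep -Eq '^[A-Z][A-Z0-9]*-[0-9]+$'; then
    printf '%s\n' "$key"
  else
    echo "not a Jira issue key: $1" >&2
    return 1
  fi
}
```

In this sketch a URL with a trailing query string is rejected rather than guessed at.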

Workflow:

Gather info from Jira -- Fetches the RHEL ticket, follows its links to the upstream OCPBUGS bug (extracting the PR, commit, and author), and finds the OCPEDGE tracking ticket and sprint. The skill asks the user only for the RPM location and the verification type.
Check cluster -- Runs verify-cluster.sh to check the OCP version, RHCOS, pacemaker status, etcd health, and the current RPM version.
Patch nodes -- Runs patch-nodes.sh to distribute the RPM, apply rpm-ostree override replace -C, reboot, and verify.
Run test -- Runs either a code-only verification (grep for the fix code) or a functional test (shutdown/fence/standby scenarios).
Generate report -- Produces a Markdown Jira comment with the environment, fix details, test results, and conclusion. Optionally posts it to the RHEL ticket via MCP.

Scripts (in scripts/):

verify-cluster.sh — Cluster health check (OCP, nodes, pcs, etcd, RPM versions)
patch-nodes.sh — RPM patching with persistent override + reboot + verification
collect-logs.sh — Collect pacemaker/etcd logs from both nodes

Prerequisites:

SSH access to a hypervisor running a TNF cluster
RPM with the fix downloaded locally (typically ~/Downloads/)
HYPERVISOR env var or two-node-toolbox/ submodule available for auto-detection

`/two-node:cluster-diagnostic`

Diagnose TNF cluster issues (shutdown/recovery, etcd, fencing, network, operators). Gathers live cluster state via SSH, analyzes it against a knowledge base of 7 bare metal test scenarios, and reports findings with severity classification. Read-only — never modifies cluster state.

/two-node:cluster-diagnostic
/two-node:cluster-diagnostic validate "cordon, drain, then shutdown -h 1 on each node"
/two-node:cluster-diagnostic recovery-guide standby
/two-node:cluster-diagnostic recovery-guide full-shutdown

Modes:

diagnose (default) — SSH to cluster, gather pcs/etcd/corosync/OCP state, analyze, report
validate — Check a proposed procedure against known failure modes
recovery-guide — Step-by-step recovery for: standby, full-shutdown, single-node, network-partition, power-outage, rolling-restart, after-recovery, connectivity, etcd-nospace, pending-csr, split-brain, stale-data, partition-fencing-failure

Access patterns (auto-detected):

Dev-scripts: HYPERVISOR env var or two-node-toolbox/ submodule
Bare metal: NODE_0 and NODE_1 env vars (direct SSH to nodes)
Optional: BMC_0/BMC_1/BMC_USER/BMC_PASS for Redfish power state
Optional: KUBECONFIG for OCP-layer diagnostics (nodes, operators, CSRs, events)
Optional: SSH_KEY for SSH private key path (defaults to ~/.ssh/id_rsa)
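The access-pattern selection can be pictured as a simple check on the variables listed above. The sketch below is an assumption about how such detection might work, not the logic in diagnose-cluster.sh: the function name detect_access is hypothetical, bare metal is checked first only as an illustrative precedence, and the submodule is looked up relative to the current directory.

```shell
# Hypothetical sketch of access-pattern detection; the real script may differ.
detect_access() {
  if [ -n "${NODE_0:-}" ] && [ -n "${NODE_1:-}" ]; then
    # Both node addresses are set: SSH directly to the nodes.
    echo "bare-metal"
  elif [ -n "${HYPERVISOR:-}" ] || [ -d two-node-toolbox ]; then
    # A hypervisor host or the toolbox submodule is available.
    echo "dev-scripts"
  else
    echo "no access pattern detected: set NODE_0/NODE_1 or HYPERVISOR" >&2
    return 1
  fi
}
```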

Environment file: Create ~/.tnf-cluster.env to avoid exposing credentials in the conversation and to persist settings across sessions:

export NODE_0=10.1.155.141
export NODE_1=10.1.155.142
export BMC_0=<bmc-0-hostname>
export BMC_1=<bmc-1-hostname>
export BMC_USER=<bmc-username>
export BMC_PASS=<bmc-password>
export KUBECONFIG=<path-to-kubeconfig>

Then run diagnostics with: source ~/.tnf-cluster.env && bash scripts/diagnose-cluster.sh
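Because the environment file holds BMC_PASS, it may be worth keeping it private to your user (for example with chmod 600 ~/.tnf-cluster.env). The wrapper below is an optional sketch of that idea, assuming a hypothetical load_env helper that is not part of the plugin: it sources the file only when group and others have no permissions on it.

```shell
# Hypothetical wrapper; load_env is not shipped with the plugin.
# Usage: load_env [path]   (defaults to ~/.tnf-cluster.env)
load_env() {
  env_file="${1:-$HOME/.tnf-cluster.env}"
  if [ ! -f "$env_file" ]; then
    echo "no such file: $env_file" >&2
    return 1
  fi
  # Characters 5-10 of the ls -l mode string are the group and other bits.
  perms=$(ls -ld "$env_file" | cut -c5-10)
  if [ "$perms" != "------" ]; then
    echo "refusing to load $env_file: accessible by group or others (try chmod 600)" >&2
    return 1
  fi
  # Source the file into the current shell so the exports take effect.
  . "$env_file"
}
```

With this in place, load_env && bash scripts/diagnose-cluster.sh would replace the plain source call.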

Scripts: diagnose-cluster.sh (in scripts/)

Acknowledgements: etcd split-brain detection, CIB recovery attribute checks, and stale data recovery scenarios were adapted from etcd troubleshooting documentation authored by Fonta and Carlo in two-node-toolbox and validated on bare metal (2026-05-28).

`/two-node:bug-reproducer`

Automated OpenShift bug reproduction for Two-Node with Arbiter (TNA) and Two-Node with Fencing (TNF) topologies.

/two-node:bug-reproducer OCPBUGS-66217

One argument: a Jira issue key. The skill handles everything else:

Bug Analysis -- Fetches the bug from Jira (description + comments), detects topology (arbiter or fencing), classifies bug category, extracts reproduction steps, detects install method (IPI/agent/kcli), and determines the OCP version. Stops if the bug is a test issue (not a product bug) or if the dev-scripts environment cannot reproduce the conditions.
Cluster Deployment -- Updates the dev-scripts config, uploads day-0 manifests if needed, and runs the Ansible deployment playbook. Monitors deployment every 10 minutes for early failure detection. Cleans and retries on failure (with user approval).
Cluster Ready -- Waits until all nodes are Ready, all MachineConfigPools (MCPs) are updated, and all ClusterOperators (COs) are healthy. Detects during-install bugs. Applies day-1 manifests if needed.
Bug Reproduction -- Executes the reproduction steps extracted from the Jira bug on the healthy cluster. This is the core phase for most bugs (post-install steps like pcs commands, node reboots, backup/restore, oc apply, etc.).
Log Collection -- Collects category-targeted logs (etcd, fencing, MCO, NTO, networking, etc.), rsyncs locally, and generates a findings report.

The cluster is always left running after the skill completes so the user can SSH in and inspect.

Supported topologies:

arbiter -- Two-Node with Arbiter (TNA): 2 masters + 1 arbiter node
fencing -- Two-Node with Fencing (TNF): 2 masters with BMC-based fencing

Output:

Logs saved to /tmp/two-node-bug-reproduce-<BUG_ID>/
Findings report written to docs/<bug-id-lowercase>-findings.md in the TNT repo (e.g., docs/ocpbugs-66217-findings.md)
Cluster left running for manual inspection
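To locate the artifacts after a run, the two paths above can be derived from the bug ID. This is a small sketch with a hypothetical helper name, bug_paths. It assumes the log directory uses the ID exactly as passed to the skill, while the findings file uses the lowercased ID, as in the example above.

```shell
# Hypothetical helper: print the log directory and findings report path.
bug_paths() {
  bug_id="$1"
  # The findings file name uses the lowercased bug ID.
  lower=$(printf '%s' "$bug_id" | tr '[:upper:]' '[:lower:]')
  # Line 1: local log directory. Line 2: report path relative to the TNT repo.
  printf '%s\n' "/tmp/two-node-bug-reproduce-${bug_id}/" "docs/${lower}-findings.md"
}
```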

Authors

lucaconsalvi, nhamza
