design: experimentation UI screens (Phase 1 + Phase 2)

## Context

Design the UI screens for the warehouse-native experimentation feature across both phases. Designs are being built in Pencil. This is internal-first (dogfood on Flagsmith SaaS, behind feature flag).

## PRDs

- [Phase 1: Evaluation & Event Streaming](https://www.notion.so/flagsmith/PRD-Evaluation-Event-Streaming-Phase-1-33ba2b634bf980009f54ecf00e6ce651)
- [Phase 2: Experiment Primitives](https://www.notion.so/flagsmith/PRD-Experiment-Primitives-Phase-2-33ba2b634bf980f9a408cfa6881e47bb)

## Phase 1 Screens

### Warehouse Connection Settings (P0)
- Location: Organisation Settings → Data Warehouse (new section)
- States: empty → configuration form → testing → connected → error
- Warehouse type selector (Snowflake active, others coming soon)
- Credential fields, test connection, status indicator

### Streaming Status (P1)
- Events streamed (24h/7d), current status, last event timestamp, error log
- Mostly for engineering/debugging during dogfood

### Environment Streaming Filter (P2)
- Per-environment toggles: stream evaluations on/off, stream custom events on/off

## Phase 2 Screens

### Experiment Creation Flow (P0)
- 4-step wizard: Basics (name, hypothesis, flag) → Variants (control selection) → Metrics (attach primary metric) → Review & Create
- Entry from flag detail page or Experiments nav

### Metric Definition UI (P0)
- SQL editor with syntax highlighting
- Validate SQL button (dry run against warehouse)
- Column mapping (auto-detected, manually adjustable)
- Type selector: Conversion / Numeric / Count

### Experiment Detail Page (P0)
- Header with status badge and lifecycle actions (Start, Pause, Stop, Record Decision)
- Sections: Overview, Metrics, SRM Check, Results (Phase 3 placeholder), Decision

### Experiment List (P1)
- Filter by status, sort by date/name
- Experiment catalogue showing name, status, flag, primary metric, duration, decision

### Flag Detail — Experiment Link (P1)
- Show linked experiment card on flag detail page
- CTA to create experiment if none exists (only for multivariate flags)

## Key Design Principles

- The flag is always the source of truth for variants and traffic split
- Experiment is a layer of intent and measurement on top of the flag
- Hypothesis field should feel intentional, not like a chore
- SQL editor is the core of metric definition — needs to be good
- Status transitions should feel deliberate (confirmations on Start, Decision)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

design: experimentation UI screens (Phase 1 + Phase 2) #7181

Context

PRDs

Phase 1 Screens

Warehouse Connection Settings (P0)

Streaming Status (P1)

Environment Streaming Filter (P2)

Phase 2 Screens

Experiment Creation Flow (P0)

Metric Definition UI (P0)

Experiment Detail Page (P0)

Experiment List (P1)

Flag Detail — Experiment Link (P1)

Key Design Principles

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

design: experimentation UI screens (Phase 1 + Phase 2) #7181

Description

Context

PRDs

Phase 1 Screens

Warehouse Connection Settings (P0)

Streaming Status (P1)

Environment Streaming Filter (P2)

Phase 2 Screens

Experiment Creation Flow (P0)

Metric Definition UI (P0)

Experiment Detail Page (P0)

Experiment List (P1)

Flag Detail — Experiment Link (P1)

Key Design Principles

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions