Skip to content

Commit 6ed1eca

Browse files
committed
chore: add agentic document
1 parent 6cf5b45 commit 6ed1eca

File tree

1 file changed

+378
-0
lines changed

1 file changed

+378
-0
lines changed

agentic/port_eugene_rules.md

Lines changed: 378 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,378 @@
1+
# Porting Rules from Eugene to postgres-language-server
2+
3+
This document tracks the progress of porting lint rules from [Eugene](https://github.com/kaaveland/eugene) to the postgres-language-server analyzer.
4+
5+
## Overview
6+
7+
Eugene is a PostgreSQL migration linter that detects dangerous operations. We are porting its rules to the postgres-language-server to provide similar safety checks within the language server environment.
8+
9+
**Eugene source location**: `eugene/eugene/src/lints/rules.rs`
10+
**Hint metadata location**: `eugene/eugene/src/hint_data.rs`
11+
**Example SQL files**: `eugene/eugene/examples/*/`
12+
13+
## Step-by-Step Porting Process
14+
15+
### 1. Understand the Rule
16+
17+
1. **Read Eugene's implementation** in `eugene/eugene/src/lints/rules.rs`
18+
- Find the rule function (e.g., `added_serial_column`)
19+
- Understand the AST patterns it matches
20+
- Note any special logic (e.g., checking previous statements)
21+
22+
2. **Read the hint metadata** in `eugene/eugene/src/hint_data.rs`
23+
- Get the ID (e.g., "E11")
24+
- Get name, condition, effect, and workaround text
25+
- This provides documentation content
26+
27+
3. **Review example SQL** in `eugene/eugene/examples/<ID>/`
28+
- `bad.sql` - Invalid cases that should trigger the rule
29+
- `good.sql` - Valid cases that should NOT trigger
30+
31+
### 2. Create the Rule
32+
33+
```bash
34+
# Create rule with appropriate severity (error/warn)
35+
just new-lintrule safety <ruleName> <severity>
36+
37+
# Example:
38+
just new-lintrule safety addSerialColumn error
39+
```
40+
41+
This generates:
42+
- `crates/pgt_analyser/src/lint/safety/<rule_name>.rs`
43+
- `crates/pgt_analyser/tests/specs/safety/<ruleName>/basic.sql`
44+
- Updates to configuration files
45+
- Diagnostic category registration
46+
47+
### 3. Implement the Rule
48+
49+
**File**: `crates/pgt_analyser/src/lint/safety/<rule_name>.rs`
50+
51+
#### Key Components:
52+
53+
```rust
54+
use pgt_analyse::{context::RuleContext, declare_lint_rule, Rule, RuleDiagnostic, RuleSource};
55+
use pgt_console::markup;
56+
use pgt_diagnostics::Severity;
57+
58+
declare_lint_rule! {
59+
/// Brief one-line description (shown in lists).
60+
///
61+
/// Detailed explanation of what the rule detects and why it's problematic.
62+
/// Explain the PostgreSQL behavior and performance/safety implications.
63+
///
64+
/// ## Examples
65+
///
66+
/// ### Invalid
67+
///
68+
/// ```sql,expect_diagnostic
69+
/// -- SQL that should trigger the rule
70+
/// ALTER TABLE users ADD COLUMN id serial;
71+
/// ```
72+
///
73+
/// ### Valid
74+
///
75+
/// ```sql
76+
/// -- SQL that should NOT trigger
77+
/// CREATE TABLE users (id serial PRIMARY KEY);
78+
/// ```
79+
///
80+
pub RuleName {
81+
version: "next",
82+
name: "ruleName",
83+
severity: Severity::Error, // or Warning
84+
recommended: true, // or false
85+
sources: &[RuleSource::Eugene("<ID>")], // e.g., "E11"
86+
}
87+
}
88+
89+
impl Rule for RuleName {
90+
type Options = ();
91+
92+
fn run(ctx: &RuleContext<Self>) -> Vec<RuleDiagnostic> {
93+
let mut diagnostics = Vec::new();
94+
95+
// Pattern match on the statement type
96+
if let pgt_query::NodeEnum::AlterTableStmt(stmt) = &ctx.stmt() {
97+
// Rule logic here
98+
99+
if /* condition */ {
100+
diagnostics.push(
101+
RuleDiagnostic::new(
102+
rule_category!(),
103+
None,
104+
markup! {
105+
"Error message with "<Emphasis>"formatting"</Emphasis>"."
106+
},
107+
)
108+
.detail(None, "Additional context about the problem.")
109+
.note("Suggested fix or workaround."),
110+
);
111+
}
112+
}
113+
114+
diagnostics
115+
}
116+
}
117+
```
118+
119+
#### Important Patterns:
120+
121+
**Accessing previous statements** (for rules like `multipleAlterTable`):
122+
```rust
123+
let file_ctx = ctx.file_context();
124+
let previous = file_ctx.previous_stmts();
125+
```
126+
127+
**Schema normalization** (treating empty schema as "public"):
128+
```rust
129+
let schema_normalized = if schema.is_empty() {
130+
"public"
131+
} else {
132+
schema.as_str()
133+
};
134+
```
135+
136+
**Checking for specific ALTER TABLE actions**:
137+
```rust
138+
for cmd in &stmt.cmds {
139+
if let Some(pgt_query::NodeEnum::AlterTableCmd(cmd)) = &cmd.node {
140+
if cmd.subtype() == pgt_query::protobuf::AlterTableType::AtAddColumn {
141+
// Handle ADD COLUMN
142+
}
143+
}
144+
}
145+
```
146+
147+
**Extracting column type**:
148+
```rust
149+
if let Some(pgt_query::NodeEnum::ColumnDef(col_def)) = &cmd.def.as_ref().and_then(|d| d.node.as_ref()) {
150+
if let Some(type_name) = &col_def.type_name {
151+
let type_str = get_type_name(type_name);
152+
}
153+
}
154+
155+
fn get_type_name(type_name: &pgt_query::protobuf::TypeName) -> String {
156+
type_name
157+
.names
158+
.iter()
159+
.filter_map(|n| {
160+
if let Some(pgt_query::NodeEnum::String(s)) = &n.node {
161+
Some(s.sval.as_str())
162+
} else {
163+
None
164+
}
165+
})
166+
.collect::<Vec<_>>()
167+
.join(".")
168+
}
169+
```
170+
171+
### 4. Create Comprehensive Tests
172+
173+
**Directory**: `crates/pgt_analyser/tests/specs/safety/<ruleName>/`
174+
175+
Create multiple test files covering:
176+
177+
#### Invalid Cases (should trigger):
178+
```sql
179+
-- expect_lint/safety/<ruleName>
180+
-- Description of what this tests
181+
<SQL that should trigger>
182+
```
183+
184+
#### Valid Cases (should NOT trigger):
185+
```sql
186+
-- Description of what this tests
187+
-- expect_no_diagnostics
188+
<SQL that should NOT trigger>
189+
```
190+
191+
#### Example Test Structure:
192+
```
193+
tests/specs/safety/addSerialColumn/
194+
├── basic.sql # Basic case triggering the rule
195+
├── bigserial.sql # Variant (bigserial type)
196+
├── generated_stored.sql # Another variant (GENERATED)
197+
├── valid_regular_column.sql # Valid: regular column
198+
└── valid_create_table.sql # Valid: CREATE TABLE context
199+
```
200+
201+
**Run tests and accept snapshots**:
202+
```bash
203+
cargo insta test -p pgt_analyser --accept
204+
```
205+
206+
### 5. Verify and Generate Code
207+
208+
```bash
209+
# Check compilation
210+
cargo check
211+
212+
# Generate lint code and documentation
213+
just gen-lint
214+
215+
# Run all tests
216+
cargo test -p pgt_analyser --test rules_tests
217+
218+
# Final verification
219+
just ready
220+
```
221+
222+
### 6. Test the Rule Manually
223+
224+
Create a test SQL file:
225+
```sql
226+
-- test.sql
227+
ALTER TABLE users ADD COLUMN id serial;
228+
```
229+
230+
Run the CLI:
231+
```bash
232+
cargo run -p pgt_cli -- check /path/to/test.sql
233+
```
234+
235+
## Common Pitfalls and Solutions
236+
237+
### 1. Schema Matching
238+
239+
**Problem**: Need to match tables across statements with different schema notations.
240+
241+
**Solution**: Normalize schema names:
242+
```rust
243+
let schema_normalized = if schema.is_empty() { "public" } else { schema.as_str() };
244+
```
245+
246+
### 2. AST Navigation
247+
248+
**Problem**: Eugene uses a simplified AST (`StatementSummary`), but pgt uses the full PostgreSQL AST.
249+
250+
**Solution**: Use pattern matching and helper functions. Look at existing rules for examples.
251+
252+
### 3. File Context Rules
253+
254+
**Problem**: Rules that need to track state across statements (like `multipleAlterTable`).
255+
256+
**Solution**: Use `ctx.file_context()` to access `AnalysedFileContext`:
257+
```rust
258+
let file_ctx = ctx.file_context();
259+
let previous_stmts = file_ctx.previous_stmts();
260+
```
261+
262+
### 4. Transaction State
263+
264+
**Problem**: Some Eugene rules check transaction state (e.g., `RUNNING_STATEMENT_WHILE_HOLDING_ACCESS_EXCLUSIVE`).
265+
266+
**Solution**: This is more complex and may require extending the `AnalysedFileContext` to track transaction state. Consider implementing simpler rules first.
267+
268+
### 5. Test Expectations
269+
270+
**Problem**: `expect_lint` expects exactly ONE diagnostic, but rule generates multiple.
271+
272+
**Solution**: Either:
273+
- Split into separate test files (one diagnostic each)
274+
- Adjust the test to only trigger once
275+
- Use expect_diagnostic for each occurrence (check existing tests)
276+
277+
## Rule Mapping Considerations
278+
279+
### Overlapping Rules
280+
281+
Some Eugene rules may overlap with existing pgt rules:
282+
283+
| Eugene Rule | Potential PGT Overlap | Action |
284+
|-------------|----------------------|--------|
285+
| `SET_COLUMN_TYPE_TO_JSON` | `preferJsonb` | Review both, may enhance existing |
286+
| `CREATE_INDEX_NONCONCURRENTLY` | `requireConcurrentIndexCreation` | Review both |
287+
| `CHANGE_COLUMN_TYPE` | `changingColumnType` | Review both |
288+
| `ADD_NEW_UNIQUE_CONSTRAINT_WITHOUT_USING_INDEX` | `disallowUniqueConstraint` | Review both |
289+
290+
**When overlap exists**:
291+
1. Compare implementations
292+
2. If Eugene's is more comprehensive, consider updating the existing rule
293+
3. If they cover different aspects, keep both
294+
4. Document any differences
295+
296+
### Transaction-Aware Rules
297+
298+
These rules require tracking transaction state across multiple statements:
299+
300+
- `RUNNING_STATEMENT_WHILE_HOLDING_ACCESS_EXCLUSIVE` (E4)
301+
- `LOCKTIMEOUT_WARNING` (E9)
302+
303+
**Approach**:
304+
1. First implement simpler, statement-level rules
305+
2. Design transaction state tracking in `AnalysedFileContext`
306+
3. Add fields to track:
307+
- Current transaction state (BEGIN/COMMIT/ROLLBACK)
308+
- Locks held
309+
- Lock timeout settings
310+
4. Update state as statements are processed
311+
312+
## Eugene AST vs PostgreSQL AST
313+
314+
### Eugene's Simplified AST
315+
316+
Eugene uses `StatementSummary` enum with simplified representations:
317+
```rust
318+
enum StatementSummary {
319+
AlterTable { schema: String, name: String, actions: Vec<AlterTableAction> },
320+
CreateIndex { schema: String, idxname: String, concurrently: bool, target: String },
321+
// ...
322+
}
323+
324+
enum AlterTableAction {
325+
AddColumn { column: String, type_name: String, stored_generated: bool, ... },
326+
SetType { column: String, type_name: String },
327+
// ...
328+
}
329+
```
330+
331+
### PostgreSQL AST (pgt_query)
332+
333+
We use the full PostgreSQL protobuf AST:
334+
```rust
335+
pgt_query::NodeEnum::AlterTableStmt(stmt)
336+
-> stmt.cmds: Vec<Node>
337+
-> NodeEnum::AlterTableCmd(cmd)
338+
-> cmd.subtype: AlterTableType
339+
-> cmd.def: Option<Node>
340+
-> NodeEnum::ColumnDef(col_def)
341+
```
342+
343+
**Translation Strategy**:
344+
1. Look at Eugene's simplified logic
345+
2. Map to corresponding PostgreSQL AST nodes
346+
3. Use existing pgt rules as references
347+
4. Check `pgt_query::protobuf` for available types/enums
348+
349+
## Useful References
350+
351+
- **Eugene source**: `eugene/eugene/src/lints/rules.rs`
352+
- **Existing pgt rules**: `crates/pgt_analyser/src/lint/safety/`
353+
- **Contributing guide**: `crates/pgt_analyser/CONTRIBUTING.md`
354+
- **AST types**: `crates/pgt_query/src/lib.rs`
355+
- **PostgreSQL protobuf**: `pgt_query::protobuf` module
356+
357+
## Next Steps
358+
359+
1. **Priority**: Port high-priority safety rules (E1, E2, E6, E7, E9)
360+
2. **Review overlaps**: Check if E3, E5, E6, E7 overlap with existing rules
361+
3. **Transaction tracking**: Design transaction state tracking for E4, E9
362+
4. **Documentation**: Update Eugene source attribution in ported rules
363+
5. **Testing**: Ensure comprehensive test coverage for all ported rules
364+
365+
## Template Checklist
366+
367+
When porting a new rule, ensure:
368+
369+
- [ ] Rule implementation in `src/lint/safety/<rule>.rs`
370+
- [ ] Documentation with examples (invalid and valid cases)
371+
- [ ] `sources: &[RuleSource::Eugene("<ID>")]` attribution
372+
- [ ] At least 3-5 test files (mix of invalid and valid)
373+
- [ ] Snapshot tests accepted with `cargo insta test --accept`
374+
- [ ] All tests pass: `cargo test -p pgt_analyser --test rules_tests`
375+
- [ ] Compilation clean: `cargo check`
376+
- [ ] Code generation: `just gen-lint`
377+
- [ ] Manual CLI test with sample SQL
378+
- [ ] Update this document with completed rule

0 commit comments

Comments
 (0)