-
Notifications
You must be signed in to change notification settings - Fork 106
BE-187: Refactor SQL query compiler with Identifier types and improved JOIN handling #7951
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BE-187: Refactor SQL query compiler with Identifier types and improved JOIN handling #7951
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #7951 +/- ##
==========================================
+ Coverage 55.66% 55.77% +0.10%
==========================================
Files 1112 1115 +3
Lines 101195 101510 +315
Branches 4690 4700 +10
==========================================
+ Hits 56329 56615 +286
- Misses 44205 44230 +25
- Partials 661 665 +4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Graphite Automations"Request backend reviewers once CI passes" took an action on this PR • (10/30/25)1 reviewer was added to this PR based on Tim Diekmann's automation. "Request Rust reviewers once CI passes" took an action on this PR • (10/30/25)1 reviewer was added to this PR based on Tim Diekmann's automation. "Request DevOps reviewers once CI passes" took an action on this PR • (11/01/25)1 reviewer was added to this PR based on Tim Diekmann's automation. |
Introduce comprehensive meta-skill for creating and managing Claude Code skills in the HASH repository, following Anthropic best practices. Features: - Progressive disclosure with 6 detailed reference files - HASH-specific patterns from .github/labeler.yml - Complete trigger configuration guide - Testing and troubleshooting documentation - Pattern library with HASH repo structure Structure follows Anthropic best practices: - SKILL.md < 500 lines (453 lines) - Lowercase kebab-case for resource files - Comprehensive skill-rules.json configuration Based on claude-code-infrastructure-showcase/skill-developer with adaptations for HASH workspace structure and existing skills (rust-error-stack, cargo-dependencies).
Restructure cargo-dependencies skill following Anthropic best practices. Changes: - Add YAML frontmatter with trigger-rich description - Reduce SKILL.md from 236 to 163 lines (73 lines saved) - Split content into 3 detailed resource files: - workspace-setup.md: Workspace root management - package-dependencies.md: 4-section structure details - examples-reference.md: Real HASH codebase examples - Update to production-ready status Structure now follows 500-line rule with progressive disclosure pattern. All files use lowercase kebab-case naming convention.
Polish rust-error-stack skill following Anthropic best practices. Changes: - Enhance YAML frontmatter description with more trigger keywords - Standardize resource links (remove ./ prefix for consistency) - Add production-ready status footer - Line count: 129 (well under 500-line rule) All resource links now use consistent 'resources/' format without leading './'. Trigger description now includes key terms: Report types, change_context, attach, implementing Error trait.
Introduce comprehensive rust-documentation skill based on .cursor/rules/rust-documentation.mdc (388 lines). Features: - Function documentation patterns (simple vs complex) - Type documentation (structs, enums, traits) - Error documentation with intra-doc links - Examples and code snippets - Emphasis on avoiding obvious/redundant docs Structure: - SKILL.md: 221 lines (concise overview) - 4 resource files with detailed patterns: - function-documentation.md: Parameters, returns, async - type-documentation.md: Structs, enums, traits (avoid noise) - error-documentation.md: # Errors section format - examples-and-links.md: Writing examples, intra-doc links Key principle: Document WHY, not WHAT. Skip obvious documentation that just restates code (e.g., don't document "prints a line" for println). Follows Anthropic best practices with progressive disclosure and < 500 line rule per file.
Add trigger configurations for writing-skills and rust-documentation skills to skill-rules.json. Changes: - writing-skills: Meta-skill triggers (priority: high) - Keywords: skill system, triggers, hooks, SKILL.md - Intent patterns: create/modify/explain skills - File triggers: .claude/skills/**/SKILL.md - rust-documentation: Documentation practices (priority: high) - Keywords: rustdoc, doc comments, # Errors/Examples/Panics - Intent patterns: write/document rust code - File triggers: **/*.rs with doc comment patterns - Content patterns: detect existing doc comments Both skills use "suggest" enforcement and follow domain skill pattern.
…rformance improvements BREAKING: Removed fileTriggers (were unused dead code) Infrastructure: - Add path auto-detection for CLAUDE_PROJECT_DIR - Add --validate flag for health checks - Add SKILL_DEBUG=true for detailed matching logs - Improve error messages with project directory context Performance: - Cache regex compilation (50% faster) - Cache keyword toLowerCase() calls - Remove continue bug that skipped intent pattern checks - Extract getProjectDir() to eliminate code duplication Validation: - Validate regex patterns during --validate - Show helpful error messages for invalid patterns - Warn about missing promptTriggers - Verify SKILL.md files exist Configuration: - Remove unused fileTriggers from all skills - Add word boundaries to intent patterns (reduce false positives) - Add German keywords (fehler, fehlerbehandlung, etc.) Documentation: - Clean up writing-skills SKILL.md - Remove documentation for unimplemented features - Mark future features clearly Code Quality: - Disable ESLint no-console for hooks directory - Add proper error handling with stack traces - Improve debug output formatting
Add comprehensive future-enhancements.md resource documenting: - 14 potential skill system features - Implementation complexity estimates - Priority recommendations - Clear distinction between implemented and future features Organized into sections: - Skill System Features (file triggers, guardrails, session tracking) - Advanced Triggers (fuzzy matching, multi-language, context-aware) - Performance Optimizations (regex cache, parallel checking) - Developer Experience (templates, interactive builder) - Analytics & Monitoring (usage stats, A/B testing) Following progressive disclosure pattern - keeps main SKILL.md under 500 lines.
…matches Previously matchType was hardcoded to 'keyword' even when matched via intent pattern. Now properly tracks which trigger type matched: - Keywords take priority and set matchType to 'keyword' - Intent patterns only checked if no keyword match, sets 'intent' - Ensures accurate match type reporting for debugging and analytics
…iority Previously skipped intent pattern check if keyword matched, which meant: - Debug logs wouldn't show intent pattern matches - Couldn't validate that both trigger types work correctly Now both are checked independently: - Keywords set matchType to 'keyword' - Intent patterns use nullish coalescing (??=) to set 'intent' if no keyword match - Debug logs show all matches for better debugging - Keywords take priority when both match (more specific than regex) Also fixed shell script argument passing: - --validate flag now properly passed to TypeScript script - No longer tries to read stdin when using --validate flag - Uses conditional logic to handle validate vs normal mode Tested scenarios: - Both keyword + intent match → matchType='keyword' ✓ - Only intent pattern match → matchType='intent' ✓ - Only keyword match → matchType='keyword' ✓ - Validation mode → all skills validated ✓
…ectly Simplified skill hook architecture by removing unnecessary indirection: Changes: - Removed .claude/hooks/skill-activation-prompt.sh wrapper script - Claude Code hook now calls yarn workspace directly from settings.json - Renamed scripts to follow project conventions: - run:skill - executes skill activation (called by Claude Code hook) - lint:skill - validates skill configuration (called by CI & locally) - Added [email protected] as devDependency to @local/claude-hooks - Updated CI workflow to use yarn lint:skill - Added yarn lint:skill to root package.json Benefits: - Simpler architecture - one less layer of indirection - Consistent with project patterns (lint:*, run:* naming conventions) - No shell script maintenance needed - Clearer dependency management through package.json - TypeScript script's getProjectDir() fallback handles project dir resolution
a0d4230 to
4299928
Compare
Benchmark results
|
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 2002 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 1001 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: high, policies: 3314 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: medium, policies: 1526 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 2078 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 1033 | Flame Graph |
policy_resolution_medium
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 102 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 51 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: high, policies: 269 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: medium, policies: 107 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 133 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 63 | Flame Graph |
policy_resolution_none
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 2 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 8 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 3 | Flame Graph |
policy_resolution_small
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 52 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 25 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: high, policies: 94 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: medium, policies: 26 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 66 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 29 | Flame Graph |
read_scaling_complete
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_id;one_depth | 1 entities | Flame Graph | |
| entity_by_id;one_depth | 10 entities | Flame Graph | |
| entity_by_id;one_depth | 25 entities | Flame Graph | |
| entity_by_id;one_depth | 5 entities | Flame Graph | |
| entity_by_id;one_depth | 50 entities | Flame Graph | |
| entity_by_id;two_depth | 1 entities | Flame Graph | |
| entity_by_id;two_depth | 10 entities | Flame Graph | |
| entity_by_id;two_depth | 25 entities | Flame Graph | |
| entity_by_id;two_depth | 5 entities | Flame Graph | |
| entity_by_id;two_depth | 50 entities | Flame Graph | |
| entity_by_id;zero_depth | 1 entities | Flame Graph | |
| entity_by_id;zero_depth | 10 entities | Flame Graph | |
| entity_by_id;zero_depth | 25 entities | Flame Graph | |
| entity_by_id;zero_depth | 5 entities | Flame Graph | |
| entity_by_id;zero_depth | 50 entities | Flame Graph |
read_scaling_linkless
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_id | 1 entities | Flame Graph | |
| entity_by_id | 10 entities | Flame Graph | |
| entity_by_id | 100 entities | Flame Graph | |
| entity_by_id | 1000 entities | Flame Graph | |
| entity_by_id | 10000 entities | Flame Graph |
representative_read_entity
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/block/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/book/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/building/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/organization/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/page/v/2
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/person/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/playlist/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/song/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/uk-address/v/1
|
Flame Graph |
representative_read_entity_type
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| get_entity_type_by_id | Account ID: bf5a9ef5-dc3b-43cf-a291-6210c0321eba
|
Flame Graph |
representative_read_multiple_entities
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_property | traversal_paths=0 | 0 | |
| entity_by_property | traversal_paths=255 | 1,resolve_depths=inherit:1;values:255;properties:255;links:127;link_dests:126;type:true | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:0;link_dests:0;type:false | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:1;link_dests:0;type:true | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:2;links:1;link_dests:0;type:true | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:2;properties:2;links:1;link_dests:0;type:true | |
| link_by_source_by_property | traversal_paths=0 | 0 | |
| link_by_source_by_property | traversal_paths=255 | 1,resolve_depths=inherit:1;values:255;properties:255;links:127;link_dests:126;type:true | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:0;link_dests:0;type:false | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:1;link_dests:0;type:true | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:2;links:1;link_dests:0;type:true | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:2;properties:2;links:1;link_dests:0;type:true |
scenarios
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| full_test | query-limited | Flame Graph | |
| full_test | query-unlimited | Flame Graph | |
| linked_queries | query-limited | Flame Graph | |
| linked_queries | query-unlimited | Flame Graph |
4299928 to
4857cc4
Compare
Your organization requires reapproval when changes are made, so Graphite has dismissed approvals. See the output of git range-diff at https://github.com/hashintel/hash/actions/runs/18979805178
This stack of pull requests is managed by Graphite. Learn more about stacking. |

🌟 What is the purpose of this PR?
This PR refactors the SQL query compiler to improve flexibility and type safety by introducing new type abstractions that make SQL construction more robust and maintainable.
The key improvement is the introduction of
Identifiertypes that allow for flexible SQL identifier construction (table names, column names, etc.) instead of being hardcoded to specific enum variants. This makes the query compiler more adaptable to different use cases and future extensions.Additionally, the
add_join_statementsfunction was completely rewritten to use a more robust alias numbering pattern that correctly handles complex scenarios with parallel query chains.🔗 Related links
🔍 What does this change?
Core Refactoring
The main change is in
libs/@local/graph/postgres-store/src/store/postgres/query/compile.rs:Complete rewrite of
add_join_statements:TableReferencewithCow<>for efficient borrowed/owned semanticsJoinExpressionwith the newJoinFrom::Tablevariantmax_numberpattern for alias numberingJOINconditions after alias number updatesThis approach ensures
JOINconditions always reference the correct table aliases, even in complex scenarios with multiple parallel query chains requiring the same tables.Type Safety Refactoring
Introduced new type abstractions to replace stringly-typed code:
Identifier<'name>(new file:expression/identifier.rs):TableReference<'name>(new file:expression/table_reference.rs):AliasedTabletypeSchemaReference→TableName→TableReferencedatabase.schema.tableColumnReference<'name>(new file:expression/column_reference.rs):*) as a special caseJoinFromenum (updatedexpression/join_clause.rs):Table { table, alias }andSelectStatement { statement, alias }Code Quality Improvements
Expression::ColumnReference→Expression::AliasedColumnfor clarityColumnReferencetypeadd_conditionfunction now properly handles adding conditions to join statements vs. WHERE clauseJoinFrom::reference_table()method provides clean access to the table being joined without exposing internal structurememimport andConditionfromHashderivation (since it contains expressions which aren't always hashable)Files Changed
Core compiler logic (~350 lines modified):
compile.rs: Complete rewrite ofadd_join_statements, updates to use new types throughoutNew type abstractions (~600 lines added):
expression/identifier.rs: PostgreSQL identifier with proper quotingexpression/table_reference.rs: Schema and table reference typesexpression/column_reference.rs: Column reference typesUpdated expression types:
expression/join_clause.rs: IntroducedJoinFromenum, restructuredJoinExpressionexpression/select_clause.rs: UseAliasedColumninstead ofColumnReferenceexpression/where_clause.rs: Type updatesexpression/with_clause.rs: Type updatesexpression/order_clause.rs: Type updatesexpression/group_by_clause.rs: Type updatesexpression/conditional.rs: Type updatesStatement updates:
statement/select.rs: UseTableReferenceinstead ofAliasedTablestatement/insert.rs: Type updatesstatement/window.rs: Type updatesstatement/mod.rs: Export new typesSupporting changes:
condition.rs: RemovedHashderivation (correctness fix)table.rs: Added helper methods for new typesexpression/mod.rs: Export new types🛡 What tests cover this?
identifier.rs: 9 test cases covering quoting, escaping, keywords, unicodetable_reference.rs: 7 test cases covering qualified names and special characterscolumn_reference.rs: 8 test cases covering qualified columns and asteriskThe existing integration tests provide regression coverage for the bug fix. These tests exercise various query patterns including: