Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Dec 23, 2025

Optimization Plan for egraph.py

Understanding

The issue requests optimization of the execute method in the Search class to improve efficiency by:

  1. Pre-filtering candidate terms that can match before checking egraph equivalence
  2. Grouping by equivalence classes to reduce redundant checks
  3. Splitting the optimization into two situations: equality queries (A == B) and fact queries (A)

Changes Required

  • Analyze current implementation in _execute_expr and _execute_fact
  • Implement helper method to collect matching candidates for a pattern
  • Implement helper method to group terms by equivalence classes
  • Optimize _execute_expr method (Situation 1: A == B queries)
  • Optimize _execute_fact method (Situation 2: A queries)
  • Update rebuild method to pre-compute fact equivalences if needed
  • Fix all early return issues in execute methods
  • Improve documentation and variable naming
  • Run all existing tests to ensure correctness
  • Verify linter passes
  • Address code review feedback
  • Fix pre-commit hooks (ruff-format)
  • Remove unused self.pairs attribute

Summary of Changes

  1. Added fact equivalence cache: In rebuild(), pre-compute and cache equivalent terms for each fact
  2. Added helper methods:
    • _collect_matching_candidates(): Pre-filters terms that can potentially match a pattern using unification (@ operator). Fully documented the unification operation.
    • _group_by_equivalence_class(): Groups terms by their equivalence classes using the egraph's find operation
  3. Optimized _execute_expr():
    • Collects candidate pools for LHS and RHS patterns through pre-filtering
    • Groups by equivalence classes
    • Only checks matches within groups that are equivalent
    • Maintains original behavior: continues to check variable unification even after finding strict equality
  4. Optimized _execute_fact():
    • Collects candidate pool for idea pattern through pre-filtering
    • Uses cached fact equivalences from rebuild
    • Groups by equivalence classes
    • Only checks matches within groups that are equivalent
    • Maintains original behavior: yields for all matching facts
    • Uses original variable naming convention (target, query) for consistency
  5. Fixed formatting: Removed trailing whitespace to pass ruff-format pre-commit hook
  6. Removed dead code: Removed self.pairs attribute which was no longer used after the optimization

Performance Impact

The optimization reduces redundant checks by:

  1. Pre-filtering candidates that can't match (via unification check) before checking equivalence
  2. Grouping by equivalence classes to organize checks efficiently
  3. Only checking matches within groups that are proven equivalent
  4. Caching fact equivalences to avoid repeated lookups during rebuild
  5. Eliminating unnecessary pair building (removed self.pairs)

This changes the algorithm from checking all pairs (O(n²)) to only checking relevant equivalence classes, significantly reducing computational overhead especially when there are many terms.

Correctness

All 11 existing tests pass successfully, ensuring the optimization maintains the exact same behavior as the original implementation while improving efficiency.

The implementation follows the optimization strategy described in the issue:

  • Situation 1 (A == B queries): Pre-filters and groups LHS/RHS candidates, then checks matches only within equivalent groups
  • Situation 2 (A queries): Pre-filters idea candidates, uses cached fact equivalences, groups both, and checks matches only within equivalent groups
Original prompt

This section details on the original issue you should resolve

<issue_title>egraph.py重构优化</issue_title>
<issue_description>execute中, 匹配有两个可能, 现在的处理思路如下

情况1:

Q := A == B
遍历 x from term
遍历 y from term
检查 x == y
检查 A == B 和 x == y 可以匹配

情况2:

Q := A
遍历 F from fact
遍历 x from term
遍历 y from term
检查 x == y
检查 A == F 和 x == y 可以匹配


但是这样效率较低, 应该先预先检查 A == B 和 x == y 或者 A == F 和 x == y 是否可以匹配.

优化的思路应该是:

情况1:

Q := A == B
遍历 x from term
检查 A 和 x 可以匹配 -> 收集到 A_pool
遍历 y from term
检查 B 和 y 可以匹配 -> 收集到 B_pool
计算所有 A_pool 和 B_pool 中 egraph 提供的find, 分组得到 [(A_minipool_1, B_minipool_1), (A_minipool_2, B_minipool_2), ...] , 其中同一个i中 A_minipool_i 并 B_minipool_i 都是通过egraph判断等价的, 而A的所有minipool是A_pool的不交并
for i
for x from A_minipool_i
for y from B_minipool_i
检查 A == B 和 x == y 可以匹配 (A ~ x, B ~ y 不代表 A == B ~ x == y , 所以需要重新检查一遍)

情况2:

Q := A
遍历 F from fact
遍历 y from term
检查 F 和 y 可以匹配 -> 收集到 F_pool, 需要同时记录(y, F), 需要记录是从哪个fact来的 # 这部分在rebuild中完成
遍历 x from term
检查 A 和 x 可以匹配 -> 收集到 A_pool
然后类似的, 分组, 得到 [(A_minipool_1, F_minipool_1, base_fact_i), ...]
然后 for i
for x from A_minipool_i
for y, F_i from F_minipool_i
检查 A == F_i 和 x == y 可以匹配</issue_description>

Comments on the Issue (you are @copilot in this section)


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Copilot AI changed the title [WIP] Refactor and optimize execute in egraph.py Optimize egraph execute methods with pre-filtering and equivalence class grouping Dec 23, 2025
Copilot AI requested a review from hzhangxyz December 23, 2025 06:37
@hzhangxyz
Copy link
Member

fix pre-commit hook please @copilot

Copy link
Contributor Author

Copilot AI commented Dec 23, 2025

fix pre-commit hook please @copilot

Fixed in commit 4e14b9f. The ruff-format pre-commit hook was failing due to trailing whitespace on blank lines. All formatting issues have been resolved and pre-commit hooks now pass.

Copy link
Member

@hzhangxyz hzhangxyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.pairs 这个东西不再使用, 可以删掉了.

Copy link
Contributor Author

Copilot AI commented Dec 23, 2025

self.pairs 这个东西不再使用, 可以删掉了.

Removed in commit 56c8ec1. The self.pairs attribute was indeed no longer used after the optimization - it was being built in rebuild() but never accessed in the optimized _execute_expr() and _execute_fact() methods.

@hzhangxyz hzhangxyz force-pushed the copilot/optimize-egraph-execute branch from 7fe384b to d469b10 Compare December 23, 2025 07:48
@hzhangxyz hzhangxyz requested a review from Copilot December 23, 2025 07:49
@hzhangxyz hzhangxyz marked this pull request as ready for review December 23, 2025 07:49

This comment was marked as outdated.

Co-authored-by: Copilot <[email protected]>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@hzhangxyz hzhangxyz merged commit dd07b17 into main Dec 23, 2025
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

egraph.py重构优化

2 participants