Feature/sem review by antiartificial · Pull Request #19 · Ataraxy-Labs/sem

antiartificial · 2026-03-08T20:50:09Z

Adds a few new features:

sem review and sem log work with historical state whereas sem graph and sem impact work with current state. Signature change detection (new capability, powers the [signature changed] / [body only] labels in review and changelog).

sem log full history of any function (entity), following it through renames and moves across commits
sem changelog auto-categorized release notes from a commit range with a suggested semver bump
sem review groups changes by impact (API surface, internal, config) with dependency counts and a risk assessment

Other items in this PR:

This includes a slight fix to the version string handling, previously this was hardcoded and could become stale, now reflects what's in the Cargo.toml.
Test coverage for all three new modules (19 new tests)
README updated with documentation for the new commands

Example usage/output

sem review --from HEAD~2 --to HEAD # last two commits

  ┌─ Internal Changes ──────────────────────────────────
  │  ∆ function   getSupportedMimeType      [modified]
  │  ∆ variable   types                     [modified]
  └───────────────────────────────────────────────────────

  Summary: 2 internal
  Risk: low (internal/config changes only, no public API impact)

Example usages:
sem log --entity authenticateUser # finds all references to entity (ex. a function)
sem log --entity login --file src/auth.ts # constrains search to specific file

Example output:

>sem log --entity getSupportedMimeType                                                                                                              (base)

  prism-web/src/lib/audio.ts :: function :: getSupportedMimeType

  349f65f  2026-02-18  Aaron Barton  [modified]       Fix Brave misdetected as Safari: prefer webm over mp4 MIME type
  45b819f  2026-02-06  antiartifici  [added]          Add Prism: voice-driven Socratic interview simulator

Example usages:
sem changelog --from v1.2.0 --to HEAD # ex changes spanning a tag range (or commits)
sem changelog --from v1.0.0 --to v2.0.0 --format markdown

Example output:

sem changelog --from HEAD~1 --to HEAD                                                                                                              (base)

## Unreleased — 2026-03-08

### Internal
  · `getSupportedMimeType` — function modified in prism-web/src/lib/audio.ts
  · `types` — variable modified in prism-web/src/lib/audio.ts

Suggested version bump: PATCH (no new public API, no breaking changes)

Groups entity-level changes into API surface (cross-file dependents), internal (body-only, deletions, no cross-file refs), and config/data categories. Includes approximate dependent counts from the entity graph and a risk heuristic (low/medium/high). Supports terminal and JSON output.

Classifies entity-level changes into Keep-a-Changelog categories (Breaking, Added, Changed, Removed, Internal) with semver bump suggestion. Includes Conventional Commits parsing from commit messages and supports terminal, markdown, and JSON output formats. Also adds get_log_range() to GitBridge for commit message access.

Re-parses before/after entity content with tree-sitter to extract parameter lists and classify changes as breaking (params removed or reordered) vs non-breaking (params added, body-only). Supports TypeScript, JavaScript, Python, Rust, Go, Java, C, C++, and more. Integrated into `sem changelog` for automatic breaking change detection.

…mmit walking

Add README sections for entity history tracking, semantic review, and changelog generation. Update language table with Swift, Elixir, Bash, and Vue. Document sem diff file comparison mode.

19 new tests covering risk assessment, config classification, entity resolution, changelog generation, and modification classification. All 83 tests pass.

…try.file_type()

rs545837 · 2026-03-09T02:11:58Z

Thanks I will take a look at this asap.

rs545837 · 2026-03-09T23:31:18Z

Replied on the discussion with more context on how this relates to inspect and the broader roadmap. The sem log and sem changelog modules look good and fill real gaps in the CLI. sem review overlaps significantly with inspect which does entity-level triage with graph-based risk scoring and LLM integration.

Going to review the code in detail. Main things I want to check: how the signature detection interacts with sem-core's existing structural_hash (which already distinguishes cosmetic vs structural changes), and whether the log module's git history walking could live in sem-core as a library function so inspect and weave can use it too.

Palanikannan1437

Great feature set — review, changelog, and log are solid additions. Left some inline comments. The two I'd want addressed before merge are the signature breaking classification (return type changes + language-aware param handling) and the entity name collision in log history tracking. The rest are suggestions for follow-ups.

Also we could discuss more on what should stay here vs in inspect as @rs545837 mentioned above

Palanikannan1437 · 2026-03-10T15:25:26Z

crates/sem-core/src/parser/signature.rs

+    pub fn is_breaking(&self) -> bool {
+        matches!(
+            self,
+            SignatureChangeKind::ParamsRemoved { .. }


I think ReturnTypeChanged should be in here too — changing a return type will break every caller that depends on the old type. Right now it silently gets classified as non-breaking, which means changelog/semver will miss real breakage.

Palanikannan1437 · 2026-03-10T15:25:26Z

crates/sem-core/src/parser/signature.rs

+    pub fn label(&self) -> &str {
+        match self {
+            SignatureChangeKind::BodyOnly => "body only",
+            SignatureChangeKind::ParamsAdded { .. } => "parameter added",


This is only non-breaking in languages with optional/default params (JS/TS/Python). In Rust, Go, Java, C — adding a required param is absolutely breaking. Since we don't actually check for defaults here (only compare names), I'd either treat this as breaking by default, or at least label it "signature changed" instead of claiming it's safe.

Great callout as I've been testing this feature with Go primarily.

Palanikannan1437 · 2026-03-10T15:25:26Z

crates/sem-core/src/git/bridge.rs

-    pub fn get_log(&self, limit: usize) -> Result<Vec<CommitInfo>, GitError> {
+    /// Walk commits lazily from `to` (or HEAD) backward, stopping at `from`.
+    /// The callback returns `true` to continue or `false` to stop early.
+    pub fn for_each_commit(


Nice API — the lazy callback with early-stop is a good design for entity log. One thing to consider for later: the callback receives CommitInfo with sha as a String, but callers like build_entity_log immediately re-resolve that SHA back into trees via resolve_tree. Passing richer data (or at least the Oid + parent) would avoid that round-trip.

Palanikannan1437 · 2026-03-10T15:25:26Z

crates/sem-core/src/parser/log.rs

+
+        // 3. Resolve both trees once per commit, then read blobs.
+        let after_tree = git.resolve_tree(&commit.sha).ok();
+        let before_tree = git.resolve_tree(&format!("{}~1", commit.sha)).ok();


If commit.sha is the very first (root) commit, sha~1 won't resolve and this will silently skip the "before" content. Might be worth handling that case explicitly so we don't miss the initial creation of an entity.

Palanikannan1437 · 2026-03-10T15:25:26Z

crates/sem-core/src/parser/log.rs

+
+        // 5. Find the change that matches our tracked entity.
+        for change in &diff.changes {
+            if change.entity_name != tracked_name {


Matching by entity_name alone could misattribute history when different files have entities with the same name (e.g., login in auth.ts and login in admin.ts). Worth also checking change.file_path == tracked_file here.

Palanikannan1437 · 2026-03-10T15:25:26Z

crates/sem-cli/src/commands/review.rs

+        }
+    };
+
+    let file_changes = if let Some(ref sha) = opts.commit {


This entire if/else-if chain for resolving file changes is duplicated almost verbatim in changelog.rs (and partially in diff.rs). Would be nice to extract a small commands/common.rs helper like resolve_file_changes(git, commit, from, to, staged, file_exts) — keeps each command focused on its own logic.

Palanikannan1437 · 2026-03-10T15:25:27Z

crates/sem-core/src/utils/date.rs

@@ -0,0 +1,53 @@
+/// Format a Unix timestamp (seconds since epoch) as YYYY-MM-DD.
+pub fn format_unix_date(unix_seconds: i64) -> String {


Nitpick: hand-rolled calendar math always makes me a bit nervous (leap year edges, negative timestamps, etc.). If you ever pull in time or chrono, this would be a good candidate to replace. Works fine as-is though.

Palanikannan1437 · 2026-03-10T15:25:27Z

crates/sem-cli/src/main.rs

+        format: String,
+
+        /// Follow renames (default: true)
+        #[arg(long, default_value_t = true)]


Heads up: default_value_t = true means --follow is always on and there's no way to turn it off from CLI. Clap won't parse --follow false. You'd want either a --no-follow flag, or use #[arg(long, action = ArgAction::Set, default_value_t = true)].

Palanikannan1437 · 2026-03-10T15:25:27Z

crates/sem-core/src/parser/review.rs

+fn assess_risk(
+    api_surface: &[ReviewChange],
+    internal: &[ReviewChange],
+    _config: &[ReviewChange],


Curious — _config is unused in the risk assessment. Feels like config changes with high impact (e.g., changing a database URL or feature flag) could warrant at least a mention in the risk reason. Not blocking, just something to think about.

crates/sem-cli/src/main.rs

antiartificial · 2026-03-10T15:35:53Z

I'll have more time to digest and reflect this evening, thank you for your eyes and input.

antiartificial added 9 commits March 8, 2026 13:47

add sem log --entity command for entity history tracking

c8c768a

refactor: extract shared utilities, make CommitInfo.date i64, lazy co…

64debd8

…mmit walking

docs: document sem log, sem review, sem changelog, and new languages

0563dd4

Add README sections for entity history tracking, semantic review, and changelog generation. Update language table with Swift, Elixir, Bash, and Vue. Document sem diff file comparison mode.

fix: derive CLI version from Cargo.toml instead of hardcoded string

a9229f5

fix: restore upstream URLs in README

bf4b8f1

test: add coverage for review, changelog, and entity log modules

7d8fd4f

19 new tests covering risk assessment, config classification, entity resolution, changelog generation, and modification classification. All 83 tests pass.

antiartificial marked this pull request as ready for review March 8, 2026 23:31

cleanup: deduplicate cwd resolution, fix FUNCTION_KINDS dupes, use en…

7dd3ca5

…try.file_type()

This comment was marked as off-topic.

Sign in to view

Palanikannan1437 reviewed Mar 10, 2026

View reviewed changes

		@@ -0,0 +1,53 @@
		/// Format a Unix timestamp (seconds since epoch) as YYYY-MM-DD.
		pub fn format_unix_date(unix_seconds: i64) -> String {

Conversation

antiartificial commented Mar 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rs545837 commented Mar 9, 2026

Uh oh!

rs545837 commented Mar 9, 2026

Uh oh!

This comment was marked as off-topic.

Uh oh!

Palanikannan1437 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

antiartificial commented Mar 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

antiartificial commented Mar 8, 2026 •

edited

Loading

Palanikannan1437 left a comment •

edited

Loading