Conversation
Groups entity-level changes into API surface (cross-file dependents), internal (body-only, deletions, no cross-file refs), and config/data categories. Includes approximate dependent counts from the entity graph and a risk heuristic (low/medium/high). Supports terminal and JSON output.
Classifies entity-level changes into Keep-a-Changelog categories (Breaking, Added, Changed, Removed, Internal) with semver bump suggestion. Includes Conventional Commits parsing from commit messages and supports terminal, markdown, and JSON output formats. Also adds get_log_range() to GitBridge for commit message access.
Re-parses before/after entity content with tree-sitter to extract parameter lists and classify changes as breaking (params removed or reordered) vs non-breaking (params added, body-only). Supports TypeScript, JavaScript, Python, Rust, Go, Java, C, C++, and more. Integrated into `sem changelog` for automatic breaking change detection.
Add README sections for entity history tracking, semantic review, and changelog generation. Update language table with Swift, Elixir, Bash, and Vue. Document sem diff file comparison mode.
19 new tests covering risk assessment, config classification, entity resolution, changelog generation, and modification classification. All 83 tests pass.
|
Thanks I will take a look at this asap. |
|
Replied on the discussion with more context on how this relates to inspect and the broader roadmap. The Going to review the code in detail. Main things I want to check: how the signature detection interacts with sem-core's existing |
There was a problem hiding this comment.
Great feature set — review, changelog, and log are solid additions. Left some inline comments. The two I'd want addressed before merge are the signature breaking classification (return type changes + language-aware param handling) and the entity name collision in log history tracking. The rest are suggestions for follow-ups.
Also we could discuss more on what should stay here vs in inspect as @rs545837 mentioned above
| pub fn is_breaking(&self) -> bool { | ||
| matches!( | ||
| self, | ||
| SignatureChangeKind::ParamsRemoved { .. } |
There was a problem hiding this comment.
I think ReturnTypeChanged should be in here too — changing a return type will break every caller that depends on the old type. Right now it silently gets classified as non-breaking, which means changelog/semver will miss real breakage.
| pub fn label(&self) -> &str { | ||
| match self { | ||
| SignatureChangeKind::BodyOnly => "body only", | ||
| SignatureChangeKind::ParamsAdded { .. } => "parameter added", |
There was a problem hiding this comment.
This is only non-breaking in languages with optional/default params (JS/TS/Python). In Rust, Go, Java, C — adding a required param is absolutely breaking. Since we don't actually check for defaults here (only compare names), I'd either treat this as breaking by default, or at least label it "signature changed" instead of claiming it's safe.
There was a problem hiding this comment.
Great callout as I've been testing this feature with Go primarily.
| pub fn get_log(&self, limit: usize) -> Result<Vec<CommitInfo>, GitError> { | ||
| /// Walk commits lazily from `to` (or HEAD) backward, stopping at `from`. | ||
| /// The callback returns `true` to continue or `false` to stop early. | ||
| pub fn for_each_commit( |
There was a problem hiding this comment.
Nice API — the lazy callback with early-stop is a good design for entity log. One thing to consider for later: the callback receives CommitInfo with sha as a String, but callers like build_entity_log immediately re-resolve that SHA back into trees via resolve_tree. Passing richer data (or at least the Oid + parent) would avoid that round-trip.
|
|
||
| // 3. Resolve both trees once per commit, then read blobs. | ||
| let after_tree = git.resolve_tree(&commit.sha).ok(); | ||
| let before_tree = git.resolve_tree(&format!("{}~1", commit.sha)).ok(); |
There was a problem hiding this comment.
If commit.sha is the very first (root) commit, sha~1 won't resolve and this will silently skip the "before" content. Might be worth handling that case explicitly so we don't miss the initial creation of an entity.
|
|
||
| // 5. Find the change that matches our tracked entity. | ||
| for change in &diff.changes { | ||
| if change.entity_name != tracked_name { |
There was a problem hiding this comment.
Matching by entity_name alone could misattribute history when different files have entities with the same name (e.g., login in auth.ts and login in admin.ts). Worth also checking change.file_path == tracked_file here.
| } | ||
| }; | ||
|
|
||
| let file_changes = if let Some(ref sha) = opts.commit { |
There was a problem hiding this comment.
This entire if/else-if chain for resolving file changes is duplicated almost verbatim in changelog.rs (and partially in diff.rs). Would be nice to extract a small commands/common.rs helper like resolve_file_changes(git, commit, from, to, staged, file_exts) — keeps each command focused on its own logic.
| @@ -0,0 +1,53 @@ | |||
| /// Format a Unix timestamp (seconds since epoch) as YYYY-MM-DD. | |||
| pub fn format_unix_date(unix_seconds: i64) -> String { | |||
There was a problem hiding this comment.
Nitpick: hand-rolled calendar math always makes me a bit nervous (leap year edges, negative timestamps, etc.). If you ever pull in time or chrono, this would be a good candidate to replace. Works fine as-is though.
| format: String, | ||
|
|
||
| /// Follow renames (default: true) | ||
| #[arg(long, default_value_t = true)] |
There was a problem hiding this comment.
Heads up: default_value_t = true means --follow is always on and there's no way to turn it off from CLI. Clap won't parse --follow false. You'd want either a --no-follow flag, or use #[arg(long, action = ArgAction::Set, default_value_t = true)].
| fn assess_risk( | ||
| api_surface: &[ReviewChange], | ||
| internal: &[ReviewChange], | ||
| _config: &[ReviewChange], |
There was a problem hiding this comment.
Curious — _config is unused in the risk assessment. Feels like config changes with high impact (e.g., changing a database URL or feature flag) could warrant at least a mention in the risk reason. Not blocking, just something to think about.
|
I'll have more time to digest and reflect this evening, thank you for your eyes and input. |
Adds a few new features:
sem reviewandsem logwork with historical state whereassem graphandsem impactwork with current state. Signature change detection (new capability, powers the [signature changed] / [body only] labels in review and changelog).sem logfull history of any function (entity), following it through renames and moves across commitssem changelogauto-categorized release notes from a commit range with a suggested semver bumpsem reviewgroups changes by impact (API surface, internal, config) with dependency counts and a risk assessmentOther items in this PR:
Example usage/output
Example usages:
sem log --entity authenticateUser # finds all references to entity (ex. a function)
sem log --entity login --file src/auth.ts # constrains search to specific file
Example output:
Example usages:
sem changelog --from v1.2.0 --to HEAD # ex changes spanning a tag range (or commits)
sem changelog --from v1.0.0 --to v2.0.0 --format markdown
Example output: