Skip to content

Conversation

@codabrink
Copy link
Contributor

@codabrink codabrink commented Jan 8, 2026

Introduce log_parser crate and update xmtp_macro::build_logging_metadata to emit -prefixed logs with timestamp and Event discriminants

Add a new log_parser crate with a pest grammar for log lines, implement state::LogEvent::from to map lines to Event and typed context, and add event: Event to EventMetadata. Update macro output to include and timestamp in log objects.

📍Where to Start

Start with LogEvent::from in log_parser/src/state.rs and the grammar in log_parser/src/parser/defs/log.pest.


📊 Macroscope summarized b545658. 17 files reviewed, 17 issues evaluated, 10 issues filtered, 6 comments posted

🗂️ Filtered Issues

crates/xmtp_common/src/fmt.rs — 0 comments posted, 1 evaluated, 1 filtered
  • line 34: The short_hex function now returns the FIRST SHORT_LEN bytes instead of the LAST SHORT_LEN bytes as the old trait implementation did. The old ShortHex impl for &[u8] used &self[self.len().saturating_sub(SHORT_LEN)..] (taking from the end), while the new function uses &bytes[..(SHORT_LEN.min(bytes.len()))] (taking from the beginning). Any existing callers that depended on getting the trailing bytes of an identifier or hash will now receive different (incorrect) results, which could cause debugging issues or incorrect display output. [ Already posted ]
crates/xmtp_macro/src/logging.rs — 0 comments posted, 1 evaluated, 1 filtered
  • line 144: The diff shows that the code block which created an installation_id_field from the parsed installation_id expression has been removed. The installation_id is still parsed and stored in LogEventInput::installation_id (line 44), but no corresponding Field is created and pushed to fields. This means the installation_id passed to the log_event! macro (e.g., self.context.installation_id() in the consumer code) will be parsed but never included in the log output, causing logs to silently lose the installation_id field. [ Already posted ]
crates/xmtp_mls/src/tasks.rs — 0 comments posted, 1 evaluated, 1 filtered
  • line 207: The expected and got variable assignments are swapped in the hash mismatch warning message. expected is assigned the computed hash (data_hash) while got is assigned the stored hash (task.data_hash). This is backwards from the intuitive meaning where expected should represent the stored/reference value and got should represent the computed/received value. The error message will mislead developers during debugging. [ Low confidence ]
log_parser/src/state/event.rs — 2 comments posted, 5 evaluated, 2 filtered
  • line 98: At line 98, inbox.as_str()? uses error propagation to return early, but this happens after the line has already been consumed from the iterator at line 55. If this fails, the caller's iterator will be left in an inconsistent state - the line was consumed but no LogEvent was returned, and the caller cannot recover that line. [ Low confidence ]
  • line 103: When Self::from(lines) fails after line 55 has executed (e.g., at line 97 due to missing inst field), the line has already been consumed by lines.next() on line 55. However, the while body then calls lines.next() again on line 103, consuming the NEXT line instead of the line that failed to parse. This causes lines to be incorrectly attributed to intermediate or skipped entirely. [ Already posted ]
log_parser/src/state/mod.rs — 3 comments posted, 7 evaluated, 4 filtered
  • line 54: Type mismatch: client.groups is HashMap<String, Arc<RwLock<GroupState>>> but or_insert_with(|| GroupState::new()) inserts a plain GroupState instead of Arc<RwLock<GroupState>>. This should be or_insert_with(|| Arc::new(RwLock::new(GroupState::new()))). [ Low confidence ]
  • line 89: The ingest method pushes events to self.stream but never populates the groups HashMap. Since LogEvent has a group_id() method and ClientState has a groups: HashMap<String, Arc<RwLock<GroupState>>> field, this may be an incomplete implementation where group-related events should be tracked in the groups map but currently are not. [ Low confidence ]
  • line 115: The update method creates unbounded memory growth by storing a reference to the previous state in prev on every call. Each invocation of update() creates a new GroupState with prev: Some(self.clone()), forming an ever-growing linked list of historical states. Since each new state holds an Arc reference to all previous states, memory is never reclaimed, potentially leading to a memory leak over time if update() is called frequently. [ Low confidence ]
  • line 116: The update method uses unwrap() on self.read() (line 116) and self.write() (line 119), which will panic if the RwLock is poisoned (i.e., if a thread panicked while holding the lock). Depending on the concurrency model of the application, this could cause cascading panics. [ Low confidence ]
log_parser/src/state/value.rs — 0 comments posted, 1 evaluated, 1 filtered
  • line 49: In fmt_indented for Value::Array, the nested elements are formatted using v.to_string() which calls fmt_indented(f, 0) with indent=0. This means nested objects inside arrays will not respect the parent indentation level, producing incorrectly indented output when arrays contain nested objects. [ Already posted ]

@claude
Copy link

claude bot commented Jan 8, 2026

Claude finished @codabrink's task —— View job


Code Review - Log Parsing PR

Reviewing this PR for code quality, bugs, performance, security, and test coverage issues.

Todo:

  • Read and analyze core log parser files
  • Review macro changes
  • Check for issues not already flagged by Macroscope
  • Provide concise feedback

Review Feedback

🔴 Critical Issues

1. Iterator consumption bug in log_parser/src/state/event.rs:61-66

The code calls find() twice on the same iterator, which will consume elements. If object appears before event in the parsed line, the first find() will skip it, causing the second find() to fail. This matches Macroscope's latest comment.

let mut line_inner = line.into_inner();
let Some(event) = line_inner.find(|e| matches!(e.as_rule(), Rule::event)) else { ... };
let Some(object) = line_inner.find(|p| matches!(p.as_rule(), Rule::object)) else { ... };

Fix: Collect the iterator first or use clone():

let line_inner: Vec<_> = line.into_inner().collect();
let Some(event) = line_inner.iter().find(...) else { ... };
let Some(object) = line_inner.into_iter().find(...) else { ... };

2. extract_if is unstable in log_parser/src/state/event.rs:93-97

The code uses extract_if, which is a nightly-only experimental API. This will fail to compile on stable Rust:

let (_, inbox) = context
    .extract_if(.., |(k, _)| k == "inst")  // ⚠️ unstable
    .collect::<Vec<_>>()
    .pop()

Fix: Use stable APIs like iter().position() with swap_remove() or find() + remove().

3. Type mismatch in tests - context is Vec not HashMap

In log_parser/src/state/mod.rs:136 and elsewhere, tests call .get() on event.context, but context is defined as Vec<(String, Value)>, not a HashMap. This code won't compile:

let group_id = event.context.get("group_id");  // ❌ Vec doesn't have get(&str)

🟡 Moderate Issues

4. Missing installation_id in structured logging branch

In crates/xmtp_macro/src/lib.rs:280, the structured logging path is missing inst field despite the comment saying it should be included. This matches Macroscope's comment. Non-structured logging includes it on line 294.

5. Recursive parsing causes stack growth in log_parser/src/state/event.rs:102-108

The "intermediate lines" logic recursively calls Self::from() in a loop to detect unparseable lines:

while Self::from(lines).is_err() {
    // consume line
}

Each recursive call creates a full parser attempt with error context. For logs with many consecutive unparseable lines, this could cause excessive stack growth.

Fix: Check for parse success more cheaply (e.g., peek and try basic pattern match) rather than full recursive parsing.

🟢 Minor Issues

6. Test code coverage concern

Codecov reports 62.5% patch coverage with 72 lines missing coverage in log_parser/src/state.rs. The State::ingest() method and most of the state management logic appear untested. Consider adding tests for state transitions.

7. Potential panic in GroupState::update() at log_parser/src/state/mod.rs:116,119

Uses .unwrap() on RwLock operations which will panic if poisoned. Macroscope already flagged this. While poisoning is rare, consider using .expect() with descriptive messages for better debugging.


Summary: This is a good start on structured log parsing, but there are several compilation-blocking issues (extract_if unstable API, type mismatches) and a critical iterator consumption bug that need to be fixed before merge. The test coverage also needs improvement, especially for state management logic.

Review completed: 2026-01-22 22:13 UTC

@codecov
Copy link

codecov bot commented Jan 8, 2026

Codecov Report

❌ Patch coverage is 62.50000% with 72 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.33%. Comparing base (d633c2b) to head (405d142).

Files with missing lines Patch % Lines
log_parser/src/state.rs 58.15% 59 Missing ⚠️
xmtp_mls/src/groups/mls_sync.rs 10.00% 9 Missing ⚠️
log_parser/src/main.rs 86.36% 3 Missing ⚠️
xmtp_mls/src/builder.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2991      +/-   ##
==========================================
- Coverage   74.38%   74.33%   -0.05%     
==========================================
  Files         413      415       +2     
  Lines       53249    53435     +186     
==========================================
+ Hits        39610    39723     +113     
- Misses      13639    13712      +73     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@macroscopeapp macroscopeapp bot Jan 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generated inbox truncation assumes #inbox is &str, which will fail for String or other types. Consider using (#inbox).as_ref() to accept any AsRef<str>.

No longer relevant as of a9c0728

fn as_int(&self) -> Result<i64> {
match self {
Self::Int(int) => Ok(*int),
_ => bail!("{self:?} is not string"),
Copy link
Contributor

@macroscopeapp macroscopeapp bot Jan 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as_int() returns the wrong error text. It says "is not string" but should say "is not int". Consider updating the message to match the type so failures are clearer.

No longer relevant as of a9c0728

Comment on lines +31 to +36
let ctx = |key: &str| {
event
.context
.get(key)
.with_context(|| format!("Missing context field {key}"))
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

event.context is a Vec<(String, Value)>, not a HashMap, so .get(key) won't compile. Consider using event.context.iter().find(|(k, _)| k == key).map(|(_, v)| v) instead.

Suggested change
let ctx = |key: &str| {
event
.context
.get(key)
.with_context(|| format!("Missing context field {key}"))
};
let ctx = |key: &str| {
event
.context
.iter()
.find(|(k, _)| k == key)
.map(|(_, v)| v)
.with_context(|| format!("Missing context field {key}"))
};

🚀 Want me to fix this? Reply ex: "fix it for me".

.groups
.entry(group_id.to_string())
.or_insert_with(|| GroupState::new())
.update();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GroupState has no update() method defined. Consider removing this call or implementing the method.

Suggested change
.update();
;

🚀 Want me to fix this? Reply ex: "fix it for me".


// Collect up the intermediate lines that don't parse.
let mut intermediate = String::new();
while Self::from(lines).is_err() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parsing via recursive Self::from(lines) advances the iterator during probes, causing skipped lines and dropped successful parses. Consider peeking without consuming or refactoring so look-aheads don’t advance and successful events are preserved.

🚀 Want me to fix this? Reply ex: "fix it for me".

Comment on lines +27 to +28
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

build is missing a return statement. Consider adding state at the end of the function.

Suggested change
}
}
}
state
}

🚀 Want me to fix this? Reply ex: "fix it for me".

Comment on lines +20 to +22
pub fn inbox(&self) -> &str {
&self.installation
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is named inbox() but returns installation. Consider renaming to installation() to match the field and data source ("inst" key).

Suggested change
pub fn inbox(&self) -> &str {
&self.installation
}
pub fn installation(&self) -> &str {
&self.installation
}

🚀 Want me to fix this? Reply ex: "fix it for me".

open_log(handle, &log_file);
}

pub fn open_log(handle: Weak<AppWindow>, log_file: &str) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log_file is declared as &str but used as a struct with a .streams field on line 115. Consider parsing the string into a log structure before accessing .streams, or change the parameter type to accept an already-parsed struct.

🚀 Want me to fix this? Reply ex: "fix it for me".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants