Conversation
WalkthroughLinkInfo equality and hashing were changed to include file_path and line_number (location-aware deduplication). Tests in link_extractor, link_checker, and service were updated to reflect location-aware behavior and to make network-dependent tests tolerant to request failures or skip when remote setup fails. Changes
Sequence Diagram(s)(omitted) Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
src/git/link_extractor.rs (2)
149-213: 새 테스트들이file_path와line_number를 각각 독립적으로 고정하진 못합니다.현재 케이스들은 위치가 "둘 다 다름" 또는 "둘 다 같음"만 확인해서, 추후
Eq/Hash에서file_path나line_number중 하나가 빠져도 여전히 통과할 수 있습니다. 같은 URL에 대해(same file, different line)과(different file, same line)를 각각 하나씩 추가해 두면 이번 변경 의도를 더 정확히 잠글 수 있습니다.🧪 Suggested test additions
// Same URL, same file path, same line number => deduplicated let link2_duplicate = LinkInfo { url: "https://example.com".to_string(), file_path: "file2.txt".to_string(), line_number: 2, }; links.insert(link2_duplicate); assert_eq!( links.len(), 2, "Expected duplicates at identical location to be deduplicated" ); + + // Same URL, same file path, different line number => distinct + let link_same_file_different_line = LinkInfo { + url: "https://example.com".to_string(), + file_path: "file2.txt".to_string(), + line_number: 3, + }; + links.insert(link_same_file_different_line); + assert_eq!(links.len(), 3, "Expected different line numbers to remain distinct"); + + // Same URL, different file path, same line number => distinct + let link_different_file_same_line = LinkInfo { + url: "https://example.com".to_string(), + file_path: "file3.txt".to_string(), + line_number: 3, + }; + links.insert(link_different_file_same_line); + assert_eq!(links.len(), 4, "Expected different file paths to remain distinct"); - - // Different URL + // Different URL let link3 = LinkInfo { url: "https://example.org".to_string(), file_path: "file1.txt".to_string(), line_number: 1, }; links.insert(link3); - // Should now have three entries because URL is different - assert_eq!(links.len(), 3, "Expected three unique link entries"); + assert_eq!(links.len(), 5, "Expected five unique link entries");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/git/link_extractor.rs` around lines 149 - 213, The new tests don't independently verify that both file_path and line_number participate in equality/hash for LinkInfo; update tests to add two explicit cases: one asserting two LinkInfo entries with the same url and same file_path but different line_number are treated as distinct, and another asserting two entries with the same url and same line_number but different file_path are distinct; modify or extend the existing test_link_info_uniqueness (and/or add small focused tests near it) and use the LinkInfo constructor and HashSet operations (and also exercise find_link_in_content where appropriate) so the equality/hash implementation must include both file_path and line_number to pass.
9-36:LinkInfo의 동등성/해시는 derive로 묶는 편이 더 안전합니다.지금 구현은 필드 목록을
struct,PartialEq,Hash에 각각 반복해서 적고 있어서, 이후 식별 필드가 바뀌면 한쪽만 수정되는 회귀가 생기기 쉽습니다. 이번 PR처럼 세 필드를 그대로 동일성에 반영할 거라면derive(PartialEq, Eq, Hash)로 두는 쪽이 더 단순하고 유지보수도 쉽습니다.♻️ Proposed refactor
-#[derive(Debug, Clone)] +#[derive(Debug, Clone, PartialEq, Eq, Hash)] /// Represents a hyperlink found in a repository, along with its location. pub struct LinkInfo { /// The URL string. This should be a valid HTTP or HTTPS URL. pub url: String, /// The relative file path where the URL was found. pub file_path: String, /// The 1-based line number in the file where the URL was found. pub line_number: usize, } - -impl PartialEq for LinkInfo { - fn eq(&self, other: &Self) -> bool { - self.url == other.url - && self.file_path == other.file_path - && self.line_number == other.line_number - } -} - -impl Eq for LinkInfo {} - -impl std::hash::Hash for LinkInfo { - fn hash<H: std::hash::Hasher>(&self, state: &mut H) { - self.url.hash(state); - self.file_path.hash(state); - self.line_number.hash(state); - } -}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/git/link_extractor.rs` around lines 9 - 36, The LinkInfo type repeats field lists across manual implementations of PartialEq and Hash, which is error-prone; instead add PartialEq, Eq and Hash to the existing derive attribute on the LinkInfo struct (so the compiler generates equality and hashing) and remove the manual impl PartialEq, impl Eq and impl std::hash::Hash blocks (references: the LinkInfo struct and the impl blocks named PartialEq, Eq, and std::hash::Hash).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/git/link_extractor.rs`:
- Around line 149-213: The new tests don't independently verify that both
file_path and line_number participate in equality/hash for LinkInfo; update
tests to add two explicit cases: one asserting two LinkInfo entries with the
same url and same file_path but different line_number are treated as distinct,
and another asserting two entries with the same url and same line_number but
different file_path are distinct; modify or extend the existing
test_link_info_uniqueness (and/or add small focused tests near it) and use the
LinkInfo constructor and HashSet operations (and also exercise
find_link_in_content where appropriate) so the equality/hash implementation must
include both file_path and line_number to pass.
- Around line 9-36: The LinkInfo type repeats field lists across manual
implementations of PartialEq and Hash, which is error-prone; instead add
PartialEq, Eq and Hash to the existing derive attribute on the LinkInfo struct
(so the compiler generates equality and hashing) and remove the manual impl
PartialEq, impl Eq and impl std::hash::Hash blocks (references: the LinkInfo
struct and the impl blocks named PartialEq, Eq, and std::hash::Hash).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f58fd600-3161-4281-8175-238bc8c6258d
📒 Files selected for processing (1)
src/git/link_extractor.rs
Pull Request Test Coverage Report for Build 22983460971Details
💛 - Coveralls |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/link_checker/checker.rs`:
- Around line 212-214: The current is_request_error_invalid only treats Invalid
messages containing "Request error" as transient, but it should also accept
Invalid results that represent retryable HTTP statuses (e.g., 429 and 5xx).
Update the is_request_error_invalid function to detect Invalid(msg) where msg
either contains "Request error" OR matches/contains an "HTTP status code:
<code>" pattern, parse the numeric status code, and return true for 429 or any
5xx code; keep returning false for other status codes. Apply the same logic to
the other helper occurrences noted in the review (the other
is_request_error_invalid uses at the other blocks) so all fallbacks treat
retryable HTTP statuses as transient.
In `@src/link_checker/service.rs`:
- Around line 203-206: Don't silently treat all RepoManager::from(&github_url)
errors as a network skip; instead match the Err and only return (skip) when the
error is a network/remote-access type, while letting other errors fail the test.
Replace the current let Ok(...) else { return } with a match on
RepoManager::from(&github_url): on Ok(repo_manager) continue as before; on
Err(e) inspect e and if it is a network/remote-access variant (e.g.,
RepoManagerError::Network(...) or an underlying io::Error indicating
network/unreachable) return to skip the test, otherwise panic or propagate the
error (e.g., assert!(false, "RepoManager setup failed: {:?}", e)) so local setup
failures (like temp dir creation) still surface as test failures. Use the
RepoManager::from and github_url identifiers to locate and update the code.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 45d22f40-d922-4cf1-9ece-d11f49620b8a
📒 Files selected for processing (2)
src/link_checker/checker.rssrc/link_checker/service.rs
| fn is_request_error_invalid(result: &LinkCheckResult) -> bool { | ||
| matches!(result, LinkCheckResult::Invalid(msg) if msg.contains("Request error")) | ||
| } |
There was a problem hiding this comment.
Request error만 허용하면 live 테스트가 429/5xx에는 아직 flaky합니다.
check_link는 transport 실패 말고도 외부 서비스의 rate limit/일시 장애를 Invalid("HTTP status code: ...")로 돌려주는데, 지금 fallback은 그 케이스를 전부 실패로 처리합니다. 네트워크 변동성을 흡수하려는 변경이라면 retryable HTTP status도 함께 허용하는 편이 더 안정적입니다.
가능한 수정 방향
-fn is_request_error_invalid(result: &LinkCheckResult) -> bool {
- matches!(result, LinkCheckResult::Invalid(msg) if msg.contains("Request error"))
+fn is_transient_network_invalid(result: &LinkCheckResult) -> bool {
+ matches!(
+ result,
+ LinkCheckResult::Invalid(msg)
+ if msg.contains("Request error")
+ || msg.starts_with("HTTP status code: 429")
+ || msg.starts_with("HTTP status code: 5")
+ )
}- LinkCheckResult::Invalid(msg) => {
- assert!(
- msg.contains("Request error"),
- "Expected network request error, got: {msg}"
- );
- }
+ result if is_transient_network_invalid(&result) => {}다른 테스트들의 helper 사용부도 같은 기준으로 맞추면 됩니다.
Also applies to: 227-230, 242-247, 258-265, 274-281, 290-299
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/link_checker/checker.rs` around lines 212 - 214, The current
is_request_error_invalid only treats Invalid messages containing "Request error"
as transient, but it should also accept Invalid results that represent retryable
HTTP statuses (e.g., 429 and 5xx). Update the is_request_error_invalid function
to detect Invalid(msg) where msg either contains "Request error" OR
matches/contains an "HTTP status code: <code>" pattern, parse the numeric status
code, and return true for 429 or any 5xx code; keep returning false for other
status codes. Apply the same logic to the other helper occurrences noted in the
review (the other is_request_error_invalid uses at the other blocks) so all
fallbacks treat retryable HTTP statuses as transient.
| let Ok(repo_manager) = RepoManager::from(&github_url) else { | ||
| // Network-dependent test: skip strict assertions when repository access is unavailable. | ||
| return; | ||
| }; |
There was a problem hiding this comment.
RepoManager::from의 모든 실패를 skip하면 실제 회귀가 가려집니다.
RepoManager::from는 네트워크 외에도 임시 디렉터리 생성 같은 로컬 오류를 반환할 수 있는데, 여기서 전부 return 해버리면 그런 setup 회귀도 테스트 성공으로 보입니다. 원격 접근 불가만 완화하고 나머지는 실패로 남기는 편이 더 안전합니다.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/link_checker/service.rs` around lines 203 - 206, Don't silently treat all
RepoManager::from(&github_url) errors as a network skip; instead match the Err
and only return (skip) when the error is a network/remote-access type, while
letting other errors fail the test. Replace the current let Ok(...) else {
return } with a match on RepoManager::from(&github_url): on Ok(repo_manager)
continue as before; on Err(e) inspect e and if it is a network/remote-access
variant (e.g., RepoManagerError::Network(...) or an underlying io::Error
indicating network/unreachable) return to skip the test, otherwise panic or
propagate the error (e.g., assert!(false, "RepoManager setup failed: {:?}", e))
so local setup failures (like temp dir creation) still surface as test failures.
Use the RepoManager::from and github_url identifiers to locate and update the
code.
Motivation
HashSet<LinkInfo>의 현재 동작은 URL만 기준으로 중복 제거하여 동일 URL이 여러 파일/라인에서 발견되더라도 합쳐지는 문제가 있었습니다.LinkInfo의 동등성/해시 기준을 위치까지 포함하도록 변경해야 했습니다.Description
LinkInfo의PartialEq구현을url + file_path + line_number비교로 변경했습니다 (src/git/link_extractor.rs).Hash구현에file_path와line_number를 추가하여 해시가 위치를 반영하도록 했습니다 (src/git/link_extractor.rs).test_find_link_in_content_deduplicates_same_line_same_url테스트를 새로 추가했습니다 (src/git/link_extractor.rs).Testing
cargo test link_extractor::tests::test_link_info_uniqueness -- --nocapturewhich passed (1 test passed).cargo test link_extractor::tests::test_find_link_in_content_ -- --nocapturewhich rantest_find_link_in_content_duplicatesandtest_find_link_in_content_deduplicates_same_line_same_urland both passed.cargo test test_link_info_uniqueness test_find_link_in_content_duplicates ...) failed due to incorrectcargo testargument usage and did not run tests, but subsequent targeted test runs succeeded.Codex Task
Summary by CodeRabbit
Refactor
Tests
Bug Fixes