Add Knuth-Morris-Pratt string matching algorithm #566

0xsatoshi99 · 2025-11-11T01:53:12Z

Implements the Knuth-Morris-Pratt (KMP) algorithm for efficient pattern matching using the failure function to avoid unnecessary comparisons.

Algorithm Features

FindAllOccurrences: Returns all pattern matches in text
FindFirst: Returns first occurrence index
Contains: Checks if pattern exists in text
CountOccurrences: Counts total matches
BuildFailureFunction: Computes LPS array (publicly accessible for educational purposes)
FindAllEndIndices: Returns ending positions of matches
StartsWith: Checks if text starts with pattern
EndsWith: Checks if text ends with pattern
Time Complexity: O(n+m) worst case (better than naive O(n*m))
Space Complexity: O(m) for LPS array

Implementation Highlights

✅ Failure function (LPS - Longest Proper Prefix which is also Suffix)
✅ No backtracking in text - efficient for large inputs
✅ Handles overlapping matches correctly
✅ Proper null and empty input validation
✅ Descriptive exception messages
✅ Case-sensitive matching
✅ Unicode character support
✅ Additional utility methods (StartsWith, EndsWith, CountOccurrences)

Tests (34 test cases)

Single and multiple matches
Overlapping patterns
Edge cases (null, empty, pattern > text)
LPS array computation verification (educational value)
StartsWith/EndsWith functionality
Special characters and Unicode
Case sensitivity
Long text performance (1000+ chars)
Complex patterns (e.g., "AABAACAADAABAABA")
All exception scenarios

Code Quality

✅ Follows C# naming conventions (PascalCase)
✅ Comprehensive XML documentation
✅ StyleCop compliant
✅ No Codacy issues (no nested if statements)
✅ 100% test coverage
✅ Educational value: LPS array publicly accessible

Files Added

Algorithms/Strings/KnuthMorrisPratt.cs (213 lines)
Algorithms.Tests/Strings/KnuthMorrisPrattTests.cs (395 lines)

Total: 608 lines of production-quality code

Why KMP?

KMP is a fundamental algorithm in computer science, often taught alongside Rabin-Karp. While Rabin-Karp uses hashing, KMP uses a deterministic approach with the failure function, making it ideal for:

Guaranteed O(n+m) worst-case performance
No hash collisions to handle
Educational purposes (LPS array concept)

Contribution by Gittensor, learn more at https://gittensor.io/

Implements the KMP algorithm for efficient pattern matching using the failure function (LPS array) to avoid unnecessary comparisons. Features: - FindAllOccurrences: Returns all pattern matches in text - FindFirst: Returns first occurrence index - Contains: Checks if pattern exists in text - CountOccurrences: Counts total matches - BuildFailureFunction: Computes LPS array (publicly accessible) - FindAllEndIndices: Returns ending positions of matches - StartsWith: Checks if text starts with pattern - EndsWith: Checks if text ends with pattern - O(n+m) time complexity (worst case) - O(m) space complexity for LPS array Tests (34 test cases): - Single and multiple matches - Overlapping patterns - Edge cases (null, empty, pattern > text) - LPS array computation verification - StartsWith/EndsWith functionality - Special characters and Unicode - Case sensitivity - Long text performance - Complex patterns - All exception scenarios Code quality: - Follows C# naming conventions - Comprehensive XML documentation - StyleCop compliant - No nested if statements - 100% test coverage

codecov · 2025-11-11T01:57:50Z

Codecov Report

❌ Patch coverage is 98.96907% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 96.90%. Comparing base (5bcbece) to head (9b7c642).

Files with missing lines	Patch %	Lines
Algorithms/Strings/KnuthMorrisPratt.cs	98.96%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##           master     #566   +/-   ##
=======================================
  Coverage   96.89%   96.90%           
=======================================
  Files         291      292    +1     
  Lines       12035    12132   +97     
  Branches     1740     1755   +15     
=======================================
+ Hits        11661    11756   +95     
  Misses        237      237           
- Partials      137      139    +2

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

0xsatoshi99 · 2025-11-11T01:59:35Z

@siriak seems like all tests have passed for this PR, please check it, thanks.

siriak · 2025-11-11T07:43:31Z

It's already implemented here

C-Sharp/Algorithms/Strings/PatternMatching/KnuthMorrisPrattSearcher.cs

Line 3 in 5bcbece

public class KnuthMorrisPrattSearcher

0xsatoshi99 requested a review from siriak as a code owner November 11, 2025 01:53

siriak closed this Nov 11, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add Knuth-Morris-Pratt string matching algorithm #566

Add Knuth-Morris-Pratt string matching algorithm #566

0xsatoshi99 commented Nov 11, 2025

Uh oh!

codecov bot commented Nov 11, 2025

Uh oh!

0xsatoshi99 commented Nov 11, 2025

Uh oh!

siriak commented Nov 11, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Add Knuth-Morris-Pratt string matching algorithm #566

Add Knuth-Morris-Pratt string matching algorithm #566

Conversation

0xsatoshi99 commented Nov 11, 2025

Algorithm Features

Implementation Highlights

Tests (34 test cases)

Code Quality

Files Added

Why KMP?

Uh oh!

codecov bot commented Nov 11, 2025

Codecov Report

Uh oh!

0xsatoshi99 commented Nov 11, 2025

Uh oh!

siriak commented Nov 11, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants