Home
Welcome to the official documentation of CmpStr, a modern, modular TypeScript library for advanced string matching and similarity calculation, phonetic indexing, normalization and filtering, and basic text analysis.
CmpStr is aimed at both development and research environments and offers a centralized API for single, batch and pairwise comparisons. The library is lightweight and extensible, and supports both synchronous and asynchronous operation.
CmpStr provides a uniform interface for similarity calculations, distance measurements and comparison operations. The extensible metrics system includes numerous methods such as Levenshtein, Jaro-Winkler, Cosine, Dice, Hamming and Longest Common Subsequence (LCS).
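To illustrate how a distance measurement relates to a similarity score, here is a minimal standalone sketch of a normalized Levenshtein similarity. The function names `levenshteinDistance` and `levenshteinSimilarity` are hypothetical and chosen for this example only; this is not CmpStr's implementation or API, and the normalization by the longer string's length is one common convention that is assumed here.

```typescript
// Illustrative sketch only, not CmpStr's implementation or API.
// Computes the Levenshtein edit distance using two rolling rows.
function levenshteinDistance(a: string, b: string): number {
  // prev[j] is the distance between the processed prefix of a and b.slice(0, j)
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);

  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match when cost is 0)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed normalization: 1 means identical, 0 means nothing in common.
function levenshteinSimilarity(a: string, b: string): number {
  const maxLen = Math.max(a.length, b.length);
  return maxLen === 0 ? 1 : 1 - levenshteinDistance(a, b) / maxLen;
}
```

Under this convention, "kitten" and "sitting" have a distance of 3, and two identical strings have a similarity of 1.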
It also supports a range of phonetic methods such as Cologne Phonetics, Soundex and Metaphone. A customizable registration structure facilitates the management of phonetic mappings. A flexible normalization and filtering pipeline ensures consistent pre-processing steps.
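As a rough illustration of what a phonetic method does, the following standalone sketch implements the classic American Soundex scheme, in which similar-sounding consonants share a digit. The `soundex` function and the `SOUNDEX_CODES` table are hypothetical names for this example; CmpStr's own phonetic mappings and registration structure may differ, and the sketch assumes plain ASCII input.

```typescript
// Illustrative sketch only, not CmpStr's implementation or API.
// Consonant groups of American Soundex; vowels, h, w and y have no code.
const SOUNDEX_CODES: Record<string, string> = {
  b: "1", f: "1", p: "1", v: "1",
  c: "2", g: "2", j: "2", k: "2", q: "2", s: "2", x: "2", z: "2",
  d: "3", t: "3",
  l: "4",
  m: "5", n: "5",
  r: "6",
};

function soundex(word: string): string {
  // Assumption: non-ASCII letters are simply dropped.
  const letters = word.toLowerCase().replace(/[^a-z]/g, "");
  if (letters.length === 0) return "";

  let result = letters[0].toUpperCase();
  let lastCode = SOUNDEX_CODES[letters[0]] ?? "";

  for (let i = 1; i < letters.length && result.length < 4; i++) {
    const ch = letters[i];
    // h and w are skipped and do not separate letters with equal codes
    if (ch === "h" || ch === "w") continue;
    // vowels yield "" and reset lastCode, so a repeated code after a vowel counts again
    const code = SOUNDEX_CODES[ch] ?? "";
    if (code !== "" && code !== lastCode) result += code;
    lastCode = code;
  }

  // Pad with zeros to the fixed length of four characters
  return (result + "000").slice(0, 4);
}
```

With this sketch, "Robert" and "Rupert" both map to the same code, which is what makes phonetic matching useful for names that are spelled differently but sound alike.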
The API allows precise single, batch and pairwise comparisons, with either post-processed or raw output depending on the use case. Phonetic search and matching can also be used to find words that sound alike.
In addition, CmpStr includes tools for basic text analysis, such as word count, average sentence length, readability index, word histogram and more. The included diff tool is not a replacement for Diffutils or similar software, but it offers a quick and usually sufficient view of the differences between two texts in the well-known unified diff layout.
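The kind of figures mentioned above can be sketched in a few lines. The `analyzeText` function and the `TextStats` shape below are hypothetical and not part of CmpStr; the sketch assumes ASCII words and treats `.`, `!` and `?` as sentence terminators, which is a simplification.

```typescript
// Illustrative sketch only, not CmpStr's implementation or API.
interface TextStats {
  wordCount: number;
  sentenceCount: number;
  avgSentenceLength: number;       // words per sentence
  histogram: Map<string, number>;  // lowercased word -> occurrences
}

function analyzeText(text: string): TextStats {
  // Assumption: a word is a run of ASCII letters, digits or apostrophes
  const words: string[] = text.toLowerCase().match(/[a-z0-9']+/g) ?? [];

  // Assumption: sentences end with one or more of . ! ?
  const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0);

  const histogram = new Map<string, number>();
  for (const word of words) {
    histogram.set(word, (histogram.get(word) ?? 0) + 1);
  }

  return {
    wordCount: words.length,
    sentenceCount: sentences.length,
    avgSentenceLength: sentences.length === 0 ? 0 : words.length / sentences.length,
    histogram,
  };
}
```

For the input "The cat sat. The dog barked loudly!" this yields 7 words in 2 sentences, an average of 3.5 words per sentence, and a count of 2 for "the".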
CmpStr is designed for modern development requirements and offers complete TypeScript typing, extensibility and an asynchronous API for high-performance workflows.
CmpStr provides flexible and precise string comparisons and can be integrated into a wide range of contexts – whether for searching, detecting duplicates, data cleansing or language-related processing.
CmpStr can also be used from the terminal: install the CLI version to access many of its features directly through the cmpstr command. Its options and parameters make the command suitable for scripts and automated processing as well.
Integrating the library into your project is straightforward: Install the npm package, import CmpStr and use the provided functions. The package supports browser environments as well as ESM and CommonJS without relying on external dependencies.
The quick start guide shows how the library can be used in various scenarios. If you have questions, consult the FAQ. If you encounter bugs or problems, or have an idea for a new feature, open an Issue on GitHub. Information about the latest updates can be found in the Changelog.
CmpStr 3.0 / API Reference • FAQ • npm Package • jsDelivr
CmpStr is a lightweight, fast package for calculating string similarity.
Getting Started
Installation & Setup
Quick Start
API Reference
Documentation
Similarity Metrics
Phonetic Algorithms
Normalization & Filtering
Comparison Modes
Asynchronous API
Diff & Text Analysis
Extending CmpStr
Project Management