String Similarity
About String Similarity
String Similarity compares two strings using six classical algorithms and tells you when they disagree. Use it for typo detection, name matching, duplicate detection, or any time you need to ask "how close are these two strings — by letters, by edits, or by sound?"
The six algorithms
- Levenshtein distance — minimum number of single-character edits to turn one string into another.
- Damerau–Levenshtein distance — Levenshtein plus adjacent-character transpositions (good for "teh" vs "the").
- Sørensen–Dice coefficient — character-bigram overlap in [0, 1] (the workhorse of fuzzy duplicate detection).
- Soundex — classic American English phonetic code (one letter followed by three digits, e.g. S530), historically used to index US census records.
- Metaphone — sharper English phonetic code, designed by Lawrence Philips in 1990.
- Cologne phonetics — German phonetic code (Kölner Phonetik), the standard for German name matching.
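To make the first three definitions concrete, here is a minimal Python sketch of the edit-based and bigram-based measures. It is not the tool's own code: the function names are illustrative, the transposition-aware distance uses the restricted "optimal string alignment" variant of Damerau–Levenshtein, and the handling of case, whitespace, and very short strings is an assumption.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def osa_distance(a: str, b: str) -> int:
    """Levenshtein plus adjacent transpositions (optimal string alignment variant)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            # Swap of two adjacent characters counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def dice_coefficient(a: str, b: str) -> float:
    """Sørensen–Dice overlap of character bigrams, in [0, 1]."""
    if a == b:
        return 1.0
    bigrams_a = Counter(a[i:i + 2] for i in range(len(a) - 1))
    bigrams_b = Counter(b[i:i + 2] for i in range(len(b) - 1))
    total = sum(bigrams_a.values()) + sum(bigrams_b.values())
    if total == 0:  # both strings shorter than two characters
        return 0.0
    shared = sum((bigrams_a & bigrams_b).values())  # multiset intersection
    return 2 * shared / total


print(levenshtein("teh", "the"))          # 2: two substitutions
print(osa_distance("teh", "the"))         # 1: one adjacent transposition
print(dice_coefficient("night", "nacht"))  # 0.25: only "ht" is shared
```

The "teh" vs "the" pair shows why the two edit distances can disagree: plain Levenshtein needs two substitutions, while the transposition-aware variant needs a single swap.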
When the algorithms disagree, the tool tells you
Each result row is paired with a plain-language explanation, a "best for" hint, and a longer description. The insight panel surfaces interesting cases: "Sounds the same, spelled differently" (use a phonetic algorithm), "Close at both levels" (likely a true near-duplicate), "Phonetic match despite spelling differences" (a phonetic algorithm is the right choice), and "Phonetic codes may be uninformative for this input" (when the input is non-Latin).
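One way such an insight could be chosen is sketched below in Python. This is a guess at the general shape, not the tool's actual logic: the function name `pick_insight`, the 0.7 threshold, and the ASCII-letter test for "non-Latin" input are all hypothetical. The source does not say how "Sounds the same, spelled differently" differs from "Phonetic match despite spelling differences", so the sketch folds both into one branch.

```python
from typing import Optional

# Hypothetical cut-off: spelling similarity at or above this counts as "close".
CLOSE_THRESHOLD = 0.7


def pick_insight(a: str, b: str, spelling_similarity: float,
                 phonetic_match: bool) -> Optional[str]:
    """Pick an insight label from precomputed scores.

    spelling_similarity: a score in [0, 1] from an edit- or bigram-based measure.
    phonetic_match: True if the two strings share a phonetic code.
    """
    # Assumption: with no ASCII letters at all, the phonetic codes carry
    # little information, so that warning takes priority.
    if not any(ch.isascii() and ch.isalpha() for ch in a + b):
        return "Phonetic codes may be uninformative for this input"
    close_in_spelling = spelling_similarity >= CLOSE_THRESHOLD
    if phonetic_match and close_in_spelling:
        return "Close at both levels"
    if phonetic_match:
        return "Sounds the same, spelled differently"
    return None  # nothing noteworthy to surface


print(pick_insight("Smith", "Smyth", 0.8, True))
print(pick_insight("Knight", "Nite", 0.3, True))
```

Passing the scores in as arguments keeps the decision rule separate from whichever similarity measures feed it, which makes the thresholds easy to tune.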
Common use cases
- Detecting typos and misspellings in user input.
- Matching customer records by name (especially with the Soundex / Metaphone / Cologne phonetic codes).
- Identifying near-duplicate entries in a database.
- Choosing the right similarity algorithm for a downstream task.
- Verifying that two spellings of a German / English name map to the same phonetic code.
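The last use case can be illustrated with a small Python sketch of American Soundex. It follows the commonly documented rules (first letter kept, consonants mapped to digits, adjacent duplicates collapsed, H and W ignored between duplicates, result padded to four characters). Whether the tool implements exactly this variant is an assumption, and non-ASCII letters are simply dropped here.

```python
# Consonant groups and their Soundex digits.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the four-character Soundex code, or "" if there are no A-Z letters."""
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")  # the first letter's digit also blocks duplicates
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same digit
        code = _CODES.get(ch, "")  # vowels and Y have no digit
        if code and code != prev:
            result += code
            if len(result) == 4:
                break
        prev = code  # a vowel resets prev, so a repeated digit is kept
    return result.ljust(4, "0")


print(soundex("Smith"), soundex("Smyth"))    # S530 S530
print(soundex("Robert"), soundex("Rupert"))  # R163 R163
```

Two spellings map to the same phonetic code exactly when the returned strings are equal, so a name-matching check reduces to `soundex(a) == soundex(b)`.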
All six algorithms run in the browser. No string, result, or preset description ever leaves your machine.