A generic implementation of the Levenshtein distance that allows arbitrarily weighting operations for different elements.

bplevin36/weighted_levenshtein

Generic

This crate can work on slices of any kind. It can:

  • Compute a distance in characters between two strings:
assert_eq!(distance("abc", "aaxcc"), 3);
  • Compute a distance in words between two strings:
assert_eq!(
    distance(
        "The quick brown fox".split(' ').collect::<Vec<_>>(),
        "The very quick brown cat".split(' ').collect(),
    ),
    2
);
  • Or compute a distance between arbitrary sequences:
assert_eq!(distance(vec![1, 2, 3], vec![0, 1, 3, 3, 4]), 3);
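To make the three results above easier to check by hand, here is a minimal sketch of a plain (unweighted) Levenshtein distance over slices, where every insertion, deletion, and substitution costs 1. The function name `unit_distance` is hypothetical and this is not the crate's implementation; it only assumes that the elements can be compared with `PartialEq`.

```rust
/// Unweighted Levenshtein distance between two slices.
/// Illustrative sketch only: `unit_distance` is a hypothetical name,
/// not part of the crate's API.
fn unit_distance<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    // prev[j] is the distance between the prefix of `a` processed so far
    // and b[..j]. Initially that prefix is empty, so prev[j] = j insertions.
    let mut prev: Vec<usize> = (0..=b.len()).collect();

    for (i, x) in a.iter().enumerate() {
        let mut curr = Vec::with_capacity(b.len() + 1);
        // Turning a[..=i] into an empty sequence takes i + 1 deletions.
        curr.push(i + 1);

        for (j, y) in b.iter().enumerate() {
            // Substitution is free when the two elements are equal.
            let substitute = prev[j] + usize::from(x != y);
            let delete = prev[j + 1] + 1;
            let insert = curr[j] + 1;
            curr.push(substitute.min(delete).min(insert));
        }
        prev = curr;
    }
    prev[b.len()]
}
```

With this sketch, `unit_distance(&[1, 2, 3], &[0, 1, 3, 3, 4])` evaluates to 3: insert `0`, substitute `2` with `3`, and insert `4`. Keeping only two rows of the table holds memory proportional to the length of the second sequence.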

Weighting

This crate allows defining custom weights for each operation on each symbol. These weights can be specified for custom types by implementing the EditWeight trait.

For example:

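// PartialEq is needed because sub_cost compares self with other.
#[derive(PartialEq)]
enum MyType {
  A,
  B,
}

impl EditWeight for MyType {
  // Cost of inserting this symbol.
  fn add_cost(&self) -> usize {
    match *self {
      MyType::A => 1,
      MyType::B => 2,
    }
  }
  // Cost of removing this symbol.
  fn rm_cost(&self) -> usize {
    match *self {
      MyType::A => 1,
      MyType::B => 2,
    }
  }
  // Cost of replacing this symbol with another one.
  fn sub_cost(&self, other: &Self) -> usize {
    if self == other {
      0
    } else {
      3
    }
  }
}

// Substituting A with B (3) and adding the second B (2) costs 5,
// the same as removing A (1) and adding both Bs (2 + 2).
assert_eq!(distance(vec![MyType::A], vec![MyType::B, MyType::B]), 5);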
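The way per-symbol weights could enter the distance computation can be sketched with a standard dynamic-programming table. This is an assumption about the general technique, not the crate's actual code: the `Weight` trait below is a hypothetical stand-in that mirrors the three `EditWeight` methods, `weighted_distance` and `Sym` are illustrative names, and the sketch assumes that removals are charged to elements of the first sequence and additions to elements of the second.

```rust
/// Hypothetical stand-in mirroring the three methods of `EditWeight`.
trait Weight {
    fn add_cost(&self) -> usize;
    fn rm_cost(&self) -> usize;
    fn sub_cost(&self, other: &Self) -> usize;
}

/// Cheapest cost of turning `a` into `b` (illustrative sketch).
/// Assumption: elements of `a` are removed, elements of `b` are added.
fn weighted_distance<T: Weight>(a: &[T], b: &[T]) -> usize {
    // prev[j]: cost of turning the processed prefix of `a` into b[..j].
    // With an empty prefix, that means adding the first j elements of `b`.
    let mut prev = Vec::with_capacity(b.len() + 1);
    prev.push(0);
    for (j, y) in b.iter().enumerate() {
        let next = prev[j] + y.add_cost();
        prev.push(next);
    }

    for x in a {
        let mut curr = Vec::with_capacity(b.len() + 1);
        // Turning the prefix into an empty sequence: remove `x` as well.
        curr.push(prev[0] + x.rm_cost());
        for (j, y) in b.iter().enumerate() {
            let substitute = prev[j] + x.sub_cost(y);
            let remove = prev[j + 1] + x.rm_cost();
            let add = curr[j] + y.add_cost();
            curr.push(substitute.min(remove).min(add));
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Example symbol type with the same weights as `MyType` above.
#[derive(PartialEq)]
enum Sym {
    A,
    B,
}

impl Weight for Sym {
    fn add_cost(&self) -> usize {
        match self {
            Sym::A => 1,
            Sym::B => 2,
        }
    }
    fn rm_cost(&self) -> usize {
        match self {
            Sym::A => 1,
            Sym::B => 2,
        }
    }
    fn sub_cost(&self, other: &Self) -> usize {
        if self == other { 0 } else { 3 }
    }
}
```

Under these assumptions, `weighted_distance(&[Sym::A], &[Sym::B, Sym::B])` evaluates to 5, matching the assertion above. Because `sub_cost` returns 0 for equal symbols, a match needs no special case in the recurrence.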
