type: package
id: mogul:/lager/levenshtein
section: mogul:/lager
blurb: Two modules (one in C, one in pure Oz) for measuring edit distance between two strings
author: Torbjörn Lager
category: nlp
documentation: index.html
download:
  lager-levenshtein__1.2.5__source__0.pkg
  lager-levenshtein__1.3.0__source__0.pkg
provides:
  [nlp] x-ozlib://lager/levenshtein/Levenshtein.so{native}
  [nlp] x-ozlib://lager/levenshtein/Levenshtein.ozf
The modules in this package export functions that measure the so-called edit distance (also called Levenshtein distance) between two strings, a source and a target. The edit distance is defined as the minimum number of single-character deletions, insertions, or substitutions required to transform the source into the target. The greater the distance, the more the strings differ, and vice versa. Edit distance has been used for spell checking and speech recognition.
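To make the definition concrete, here is a minimal Python sketch that computes the distance directly from the three operations. It is only an illustration: the function name edit_distance_recursive is hypothetical and is not part of this package, whose modules are written in C and Oz.

```python
from functools import lru_cache


def edit_distance_recursive(source: str, target: str) -> int:
    """Edit distance written straight from the definition.

    Illustrative sketch only; this is NOT the package's API.
    """

    @lru_cache(maxsize=None)
    def dist(i: int, j: int) -> int:
        # dist(i, j) is the distance between the first i characters of
        # source and the first j characters of target.
        if i == 0:
            return j  # insert the j target characters
        if j == 0:
            return i  # delete the i source characters
        substitution_cost = 0 if source[i - 1] == target[j - 1] else 1
        return min(
            dist(i - 1, j) + 1,                      # deletion
            dist(i, j - 1) + 1,                      # insertion
            dist(i - 1, j - 1) + substitution_cost,  # substitution or match
        )

    return dist(len(source), len(target))
```

For example, turning "kitten" into "sitting" takes three edits: substitute k with s, substitute e with i, and insert g at the end.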
The distribution contains two functionally equivalent implementations: one in C, linked into Oz as a native module, and one in pure Oz. Both are straightforward implementations of Levenshtein's algorithm, a dynamic programming algorithm that computes the edit distance in time proportional to the length of the source times the length of the target. The C-based version is roughly eight times faster than the pure Oz version and is therefore recommended for serious use.
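The dynamic programming approach described above can be sketched in Python as follows. This is an assumption-laden illustration rather than the package's code: the name edit_distance is hypothetical, and keeping only two rows of the table is a common space-saving choice that the C and Oz modules may or may not make.

```python
def edit_distance(source: str, target: str) -> int:
    """Levenshtein distance by dynamic programming (illustrative sketch).

    Runs in time proportional to len(source) * len(target).
    This is NOT the package's API.
    """
    # previous[j] holds the distance between the source prefix handled
    # so far and the first j characters of target.
    previous = list(range(len(target) + 1))

    for i, source_char in enumerate(source, start=1):
        # Reducing i source characters to an empty target takes i deletions.
        current = [i]
        for j, target_char in enumerate(target, start=1):
            substitution_cost = 0 if source_char == target_char else 1
            current.append(
                min(
                    previous[j] + 1,                      # deletion
                    current[j - 1] + 1,                   # insertion
                    previous[j - 1] + substitution_cost,  # substitution or match
                )
            )
        previous = current

    return previous[-1]
```

Each cell is filled once from three neighbouring cells, which is where the source-length times target-length running time comes from.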