author: jose_at_monkey.org
license: 3-clause BSD.
$ jmatch -h
Usage: jmatch [-HLins] -d dist needle file1 [...]
search for matching text using fuzzy rules.
-H use the Hamming distance algorithm
-L use the Levenshtein distance algorithm
-i case insensitive search
-n print line numbers of matching files
-s print the final distance score
-d maximum distance allowed for match
the following example shows how it compares to grep:
$ jmatch -Lnis -d 5 "begin 664" /home/jose/monkey-spam*
/home/jose/monkey-spam-2.mbox:593107:5:beginning
/home/jose/monkey-spam-2.mbox:600298:5:beg
/home/jose/monkey-spam-2.mbox:994278:5:being,
/home/jose/monkey-spam-2.mbox:1169258:5:ve in
/home/jose/monkey-spam-2.mbox:1486415:5:erin
/home/jose/monkey-spam-2.mbox:1488703:5:been
some caveats when compared to grep(1):
- whitespace matters: the match is done against the whole line.
- you'll often wind up with more matches than you expected.
- be aware of how the algorithms work. the Levenshtein distance is
  computed with equal costs for conversions and insertions of characters. the
  Hamming distance can only be computed if the "needle" is the same length
  as the line being tested.
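to make the last caveat concrete, here is a minimal sketch of the two distances and of a whole-line match against a maximum distance. it is an illustration in Python, not jmatch's code and not the libdistance API: the names hamming, levenshtein and fuzzy_match are hypothetical, and the unit cost for deletions is an assumption (the text above only mentions conversions and insertions).

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        # The Hamming distance is undefined for strings of different length.
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Edit distance with a cost of 1 per insertion, deletion or substitution.

    Assumption: deletions cost the same as the other two operations.
    """
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


def fuzzy_match(needle, line, max_dist, algo=levenshtein, ignore_case=False):
    """Return the distance if the whole line is within max_dist, else None."""
    if ignore_case:
        needle, line = needle.lower(), line.lower()
    try:
        dist = algo(needle, line)
    except ValueError:
        return None  # e.g. Hamming with a line of a different length
    return dist if dist <= max_dist else None
```

for example, levenshtein("kitten", "sitting") is 3 and hamming("karolin", "kathrin") is 3, while fuzzy_match("abc", "abcd", 1, algo=hamming) returns None because the lengths differ. this is why a Hamming search only ever reports lines exactly as long as the needle.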
requirements:
jmatch requires libdistance to build.
to do:
- more algorithms: Needleman-Wunsch, Jaccard, etc.
installation:
- download
- unpack
- modify Makefile (or GNUmakefile) to point to the correct libdistance
  location (to find libdistance.a and distance.h)
- make (or gmake)
download:
jmatch-0.2.0.tar.gz (1 jan 2005)
preliminary release.