wdif: compare files based on words and approximate values

When testing numerical software, I often need to know whether a new version produces the same results as an older one. One can of course use 'diff' or 'gvimdiff' for that purpose, but these tools are less useful when the new version is allowed to produce slightly different results, or a different layout of the results.

For example:

old version                     new version
results: 1                      results:
2 3.13414                       1 2.0 3.1341453
errors:
0.0014                          errors: 0.00255

So I created a program, wdif, which compares files word by word (like wdiff) and, when the words look like numbers, compares the two numbers using a threshold. For the above example, wdif outputs:

word diff in lines 1,1: 'old' and 'new'
numerical diff in lines 5,4: '0.0014' and '0.00255' 
wdif: sumdif is 0.0011553 # difs is 2
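
The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the actual wdif source (which is built with bison and flex). The NUMBER pattern, the function names, and the choice to stop at the end of the shorter file are assumptions. That sumdif also accumulates differences below the threshold is inferred from the example, where 0.0011553 = 0.00115 + 0.0000053.

```python
import re

# Assumed pattern for "looks like a number"; wdif's real flex rules may differ.
NUMBER = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")


def tokens(text):
    """Yield (line_number, word) pairs; white space and commas separate words."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        for word in re.split(r"[\s,]+", line):
            if word:
                yield lineno, word


def compare(old_text, new_text, tol=0.001):
    """Compare two texts word by word; return (messages, sumdif, ndifs)."""
    messages, sumdif, ndifs = [], 0.0, 0
    # Simplification: zip() stops at the end of the shorter word stream.
    for (l1, w1), (l2, w2) in zip(tokens(old_text), tokens(new_text)):
        if NUMBER.fullmatch(w1) and NUMBER.fullmatch(w2):
            d = abs(float(w1) - float(w2))
            sumdif += d  # every numerical difference is added to the sum
            if d > tol:  # only differences above the threshold are reported
                ndifs += 1
                messages.append(
                    f"numerical diff in lines {l1},{l2}: '{w1}' and '{w2}'")
        elif w1 != w2:
            ndifs += 1
            messages.append(f"word diff in lines {l1},{l2}: '{w1}' and '{w2}'")
    return messages, sumdif, ndifs


OLD = "old version\nresults: 1\n2 3.13414\nerrors:\n0.0014\n"
NEW = "new version\nresults:\n1 2.0 3.1341453\nerrors: 0.00255\n"

if __name__ == "__main__":
    msgs, sumdif, ndifs = compare(OLD, NEW)
    print("\n".join(msgs))
    print(f"sumdif is {sumdif:.7f} # difs is {ndifs}")
```

Because line breaks only serve to number the words, the different layout of the two files does not matter: '2' on line 3 of the old file is simply paired with '2.0' on line 3 of the new file.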

The program is rather fast: thousands of lines are compared in the blink of an eye. Here is the source, accompanied by a Makefile. You need 'bison' and 'flex' to build wdif.

The output of 'wdif -h':

wdif compares two files on a word-by-word basis.
Differences in words are printed to stdout
If the words seem to be numbers, a conversion is done
to doubles, subsequently these doubles are compared.
White space, newlines and commas are ignored.
In order to facilitate comparing fortran output files,
things as 4*1.1 are expanded to 1.1 1.1 1.1 1.1
At the end, wdif prints the sum of absolute differences
and the number of differences found.

Usage: wdif [-n N] [-t tol] file1 file2
where:
   N is the maximum number of differences allowed before
     the program stops, default 20. 0 means: no stop.
   tol is the threshold for noticing differences between numbers
     default: 0.001
   file1, file2: files to be compared.

Example:
  wdif -n 0 -t 0.1 old new
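
The word splitting and the expansion of Fortran repeat counts mentioned in the help text could look roughly like the following Python sketch. It is an illustration under assumptions, not the flex scanner used by wdif: the REPEAT pattern and the function name expand_words are made up for this example.

```python
import re

# Assumed form of a Fortran list-directed repeat: <count>*<value>, e.g. 4*1.1
REPEAT = re.compile(r"(\d+)\*(\S+)")


def expand_words(text):
    """Split text on white space and commas, expanding n*value into n words."""
    words = []
    for word in re.split(r"[\s,]+", text):
        if not word:
            continue  # empty strings appear at leading/trailing separators
        m = REPEAT.fullmatch(word)
        if m:
            # Repeat the value as many times as the count says.
            words.extend([m.group(2)] * int(m.group(1)))
        else:
            words.append(word)
    return words


if __name__ == "__main__":
    print(expand_words("x = 4*1.1, 2.5"))
```

With this expansion done before the comparison, a file containing '4*1.1' and a file containing '1.1 1.1 1.1 1.1' yield the same sequence of words.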