GNU bug report logs - #16625
diff shown is not (locally) minimum

Package: diffutils

Reported by: Stephan Beyer <s-beyer <at> gmx.net>

Date: Sun, 2 Feb 2014 17:25:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 16625 in the body.
You can then email your comments to 16625 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-diffutils <at> gnu.org:
bug#16625; Package diffutils. (Sun, 02 Feb 2014 17:25:02 GMT)

Acknowledgement sent to Stephan Beyer <s-beyer <at> gmx.net>:
New bug report received and forwarded. Copy sent to bug-diffutils <at> gnu.org. (Sun, 02 Feb 2014 17:25:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Stephan Beyer <s-beyer <at> gmx.net>
To: bug-diffutils <at> gnu.org
Subject: diff shown is not (locally) minimum
Date: Sun, 02 Feb 2014 13:05:45 +0100
Hi,

Diffing millions of lines each day, I stumbled upon something I
consider to be a bug. The attached example files are big, so I illustrate
the bug with a short schematic example (the real "diff" tool does not
actually exhibit the bug on this short input).

The generated diff looks like

 A
 B
-C
-D
-E
+C
+D
 F
 G

instead of only

 C
 D
-E
 F
 G

If we generate the reverse diff, the bug does not occur; we just have

 C
 D
+E
 F
 G
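
To make the gap between the two outputs above concrete, the following sketch
counts the fewest deleted plus inserted lines for the toy example using a
longest-common-subsequence computation. It is only an illustration: the
helper names lcs_length and minimal_edit_count are made up here, the file
contents (A..G and A..D, F, G) are assumed from the toy diff, and the code is
a plain quadratic dynamic program, not the algorithm GNU diff uses.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two lists of lines."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def minimal_edit_count(old, new):
    """Fewest deleted plus inserted lines needed to turn old into new."""
    common = lcs_length(old, new)
    return (len(old) - common) + (len(new) - common)


# Toy files assumed from the example: new.txt is old.txt without line E.
old = list("ABCDEFG")
new = list("ABCDFG")

# The problematic diff deletes C, D, E and re-inserts C, D.
reported_edit_count = 3 + 2

print(minimal_edit_count(old, new), reported_edit_count)  # prints: 1 5
```

The minimal script touches one line, whereas the problematic output touches
five, two of which (C and D) are deleted and then inserted again unchanged.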

Interestingly, if we remove a line that does not even show up in the
affected section, the problem disappears. (See real example below.)

With diff -d (--minimal), the problem does not show up.


I know that the standard diff algorithm does not guarantee minimum
diffs, for example, if sections of a file are moved around.
However, I expected it to guarantee some kind of "local minimum",
that is, *unchanged* lines should not be deleted and inserted
if there are no unchanged lines in between.
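
One possible reading of this "local minimum" property can be checked
mechanically: within each run of consecutive "-" and "+" lines (no context
line in between), the deleted and inserted lines should have no line in
common in the same order. The sketch below applies that reading to the body
of a unified diff hunk. The function names are hypothetical, the input is
assumed to contain only hunk body lines (no "---"/"+++" file headers), and
this is not something diffutils itself provides.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two lists of lines."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def find_redundant_runs(hunk_lines):
    """Return (deleted, inserted, common) for every change run in which at
    least one line is deleted and then re-inserted unchanged."""
    redundant = []
    deleted, inserted = [], []
    # A trailing sentinel context line flushes the final run.
    for line in list(hunk_lines) + [" "]:
        if line.startswith("-"):
            deleted.append(line[1:])
        elif line.startswith("+"):
            inserted.append(line[1:])
        elif line.startswith("\\"):
            # "\ No newline at end of file" markers do not end a run.
            continue
        else:
            # Context line or "@@" hunk header: the current run is over.
            common = lcs_length(deleted, inserted)
            if common:
                redundant.append((deleted, inserted, common))
            deleted, inserted = [], []
    return redundant


# The problematic toy hunk: C and D are deleted and re-inserted.
bad_hunk = [" A", " B", "-C", "-D", "-E", "+C", "+D", " F", " G"]
# The expected toy hunk: only E is deleted.
good_hunk = [" C", " D", "-E", " F", " G"]

print(find_redundant_runs(bad_hunk))   # prints: [(['C', 'D', 'E'], ['C', 'D'], 2)]
print(find_redundant_runs(good_hunk))  # prints: []
```

A non-empty result means the run could have kept "common" more lines as
context, which is the situation described in this report.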

This is probably going to be a WONTFIX, but I wanted to report the bug
nonetheless.

To see the bug in action, save the attached files and do:

 diff -u old.txt new.txt | grep -A 32 zzz  # bug shows up
 tail -n +2 old.txt > old2.txt  # old2.txt = old.txt without 1st line
 diff -u old2.txt new.txt  | grep -A 23 zzz  # should-be diff

(The grep extracts the problem section.)

Btw: For these files, the bug also occurs in other diff tools like
"vimdiff" and "git diff" (with the standard diff algorithm) but, for
example, not using KDE's "kompare".
I have had related bigger files where the bug occurs with "diff" and
"vimdiff" only but not "git diff".

It seems to me that the bug is related to some kind of chunk size of the
diff algorithm. However, I did not investigate it further.

Stephan

-- 
With knowledge grows doubt. -- Goethe
[new.txt (text/plain, attachment)]
[old.txt (text/plain, attachment)]

Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Sun, 30 Mar 2014 05:16:02 GMT)

Notification sent to Stephan Beyer <s-beyer <at> gmx.net>:
bug acknowledged by developer. (Sun, 30 Mar 2014 05:16:03 GMT)

Message #10 received at 16625-done <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Stephan Beyer <s-beyer <at> gmx.net>
Cc: 16625-done <at> debbugs.gnu.org
Subject: Re: diff shown is not (locally) minimum
Date: Sat, 29 Mar 2014 22:15:15 -0700
Re http://bugs.gnu.org/16625

Thanks, this bug should be fixed by the following patch:

http://git.savannah.gnu.org/cgit/diffutils.git/commit/?id=9b48bf3d3ed002e32fad5de5f539745bc861a104

which should appear in the next diffutils release.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 27 Apr 2014 11:24:04 GMT)

This bug report was last modified 9 years and 359 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.