GNU bug report logs - #16625: diff shown is not (locally) minimum
Reported by: Stephan Beyer <s-beyer <at> gmx.net>
Date: Sun, 2 Feb 2014 17:25:01 UTC
Severity: normal
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 16625 in the body.
You can then email your comments to 16625 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-diffutils <at> gnu.org: bug#16625; Package diffutils. (Sun, 02 Feb 2014 17:25:02 GMT)
Acknowledgement sent to Stephan Beyer <s-beyer <at> gmx.net>: New bug report received and forwarded. Copy sent to bug-diffutils <at> gnu.org. (Sun, 02 Feb 2014 17:25:03 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Hi,
diffing millions of lines each day, I stumbled upon something I
consider to be a bug. The attached example files are big, so I illustrate
the bug with a short example (which does not actually trigger the bug in
the real "diff" tool).
The generated diff looks like
     A
     B
    -C
    -D
    -E
    +C
    +D
     F
     G
instead of only
     C
     D
    -E
     F
     G
If we generate the reverse diff, the bug does not occur; we just have
     C
     D
    +E
     F
     G
Interestingly, if we remove a line that does not even show up in the
affected section, the problem disappears. (See real example below.)
With diff -d, the problem does not show up.
I know that the standard diff algorithm does not guarantee minimum
diffs, for example, if sections of a file are moved around.
However, I expected it to guarantee some kind of "local minimum",
that is, *unchanged* lines should not be deleted and inserted
if there are no unchanged lines in between.
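The "local minimum" expectation above can be sketched as a small check. This is only an illustration, not part of diffutils: the helper names change_blocks and is_locally_minimal are hypothetical, and the input is assumed to be hunk body lines only (no "---"/"+++" file headers), each starting with " ", "-" or "+". A block of changes is treated as not locally minimal if some line is both deleted and inserted within it, since that line could have been kept as unchanged.

```python
def change_blocks(diff_lines):
    """Yield (deleted, inserted) line lists for each run of '-'/'+' lines
    that is not interrupted by an unchanged (context) line."""
    deleted, inserted = [], []
    for line in diff_lines:
        if line.startswith("-"):
            deleted.append(line[1:])
        elif line.startswith("+"):
            inserted.append(line[1:])
        else:
            # Any other line (context) ends the current block.
            if deleted or inserted:
                yield deleted, inserted
            deleted, inserted = [], []
    if deleted or inserted:
        yield deleted, inserted


def is_locally_minimal(diff_lines):
    """Return True if no block deletes and re-inserts an identical line."""
    return all(not (set(d) & set(i)) for d, i in change_blocks(diff_lines))


# The two diffs from the short example in this report.
reported = [" A", " B", "-C", "-D", "-E", "+C", "+D", " F", " G"]
expected = [" C", " D", "-E", " F", " G"]

print(is_locally_minimal(reported))  # False: C and D are deleted and re-inserted
print(is_locally_minimal(expected))  # True
```

The check only looks inside each uninterrupted block of changes, so it says nothing about global minimality (for example, moved sections).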
This is probably going to be a WONTFIX, but I wanted to report the bug
nonetheless.
To see the bug in action, save the attached files and run:
    diff -u old.txt new.txt | grep -A 32 zzz   # bug shows up
    tail -n +2 old.txt > old2.txt              # old2.txt = old.txt without 1st line
    diff -u old2.txt new.txt | grep -A 23 zzz  # should-be diff
(The grep extracts the problem section.)
Btw: For these files, the bug also occurs in other diff tools like
"vimdiff" and "git diff" (with the standard diff algorithm) but, for
example, not using KDE's "kompare".
I have had related bigger files where the bug occurs with "diff" and
"vimdiff" only but not "git diff".
It seems to me that the bug is related to some kind of chunk size of the
diff algorithm. However, I did not investigate it further.
Stephan
--
With knowledge grows doubt. -- Goethe
[new.txt (text/plain, attachment)]
[old.txt (text/plain, attachment)]
Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>: You have taken responsibility. (Sun, 30 Mar 2014 05:16:02 GMT)
Notification sent to Stephan Beyer <s-beyer <at> gmx.net>: bug acknowledged by developer. (Sun, 30 Mar 2014 05:16:03 GMT)
Message #10 received at 16625-done <at> debbugs.gnu.org (full text, mbox):
Re http://bugs.gnu.org/16625
Thanks, this bug should be fixed by the following patch:
http://git.savannah.gnu.org/cgit/diffutils.git/commit/?id=9b48bf3d3ed002e32fad5de5f539745bc861a104
which should appear in the next diffutils release.
Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 27 Apr 2014 11:24:04 GMT)