GNU bug report logs - #6557
du sometimes miscounts directories, and files whose link count equals 1

Package: coreutils

Reported by: Paul Eggert <eggert <at> CS.UCLA.EDU>

Date: Sat, 3 Jul 2010 06:42:01 UTC

Severity: normal

Fixed in version 8.6

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 6557 in the body.
You can then email your comments to 6557 AT debbugs.gnu.org in the normal way.


Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#6557; Package coreutils. (Sat, 03 Jul 2010 06:42:01 GMT)

Acknowledgement sent to Paul Eggert <eggert <at> CS.UCLA.EDU>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sat, 03 Jul 2010 06:42:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> CS.UCLA.EDU>
To: bug-coreutils <at> gnu.org
Subject: du sometimes miscounts directories, and files whose link count equals
	1
Date: Fri, 02 Jul 2010 23:41:08 -0700

(I found this bug by code inspection while doing the du performance
improvement reported in:
http://lists.gnu.org/archive/html/bug-coreutils/2010-07/msg00014.html
)

Unless -l is given, du is not supposed to count the same file more
than once.  It optimizes this test by not bothering to put a file into
the hash table if its link count is 1, or if it is a directory.  But
this optimization is not correct if -L is given (because the same
link-count-1 file, or directory, can be seen via symbolic links) or if
two or more arguments are given (because the same such file can be
seen under multiple arguments).  The optimization should be suppressed
if -L is given, or if multiple arguments are given.

Here is a patch, with a couple of test cases for it.  This patch
assumes the du performance fix, but I can prepare an independent
patch if you like.

-----

Don't miscount directories or link-count-1 files seen multiple times.
* NEWS: Mention this.
* src/du.c (hash_all): New static var.
(process_file): Use it.
(main): Set it.
* tests/du/hard-link: Add a couple of test cases to help make
sure this bug stays squashed.
diff --git a/NEWS b/NEWS
index 2493ef8..82190d9 100644
--- a/NEWS
+++ b/NEWS
@@ -42,6 +42,11 @@ GNU coreutils NEWS                                    -*- outline -*-
   Also errors are no longer suppressed for unsupported file types, and
   relative sizes are restricted to supported file types.
 
+** Bug fixes
+
+  du no longer multiply counts a file that is a directory or whose
+  link count is 1, even if the file is reached multiple times by
+  following symlinks or via multiple arguments.
 
 * Noteworthy changes in release 8.5 (2010-04-23) [stable]
 
diff --git a/src/du.c b/src/du.c
index bc24861..739be73 100644
--- a/src/du.c
+++ b/src/du.c
@@ -121,6 +121,9 @@ static bool apparent_size = false;
 /* If true, count each hard link of files with multiple links.  */
 static bool opt_count_all = false;
 
+/* If true, hash all files to look for hard links.  */
+static bool hash_all;
+
 /* If true, output the NUL byte instead of a newline at the end of each line. */
 static bool opt_nul_terminate_output = false;
 
@@ -457,8 +460,7 @@ process_file (FTS *fts, FTSENT *ent)
      via a hard link, then don't let it contribute to the sums.  */
   if (skip
       || (!opt_count_all
-          && ! S_ISDIR (sb->st_mode)
-          && 1 < sb->st_nlink
+          && (hash_all || (! S_ISDIR (sb->st_mode) && 1 < sb->st_nlink))
           && ! hash_ins (sb->st_ino, sb->st_dev)))
     {
       /* Note that we must not simply return here.
@@ -876,11 +878,20 @@ main (int argc, char **argv)
                quote (files_from));
 
       ai = argv_iter_init_stream (stdin);
+
+      /* It's not easy here to count the arguments, so assume the
+         worst.  */
+      hash_all = true;
     }
   else
     {
       char **files = (optind < argc ? argv + optind : cwd_only);
       ai = argv_iter_init_argv (files);
+
+      /* Hash all dev,ino pairs if there are multiple arguments, or if
+         following non-command-line symlinks, because in either case a
+         file with just one hard link might be seen more than once.  */
+      hash_all = (optind + 1 < argc || symlink_deref_bits == FTS_LOGICAL);
     }
 
   if (!ai)
diff --git a/tests/du/hard-link b/tests/du/hard-link
index 7e4f51a..e22320b 100755
--- a/tests/du/hard-link
+++ b/tests/du/hard-link
@@ -26,24 +26,40 @@ fi
 . $srcdir/test-lib.sh
 
 mkdir -p dir/sub
-( cd dir && { echo non-empty > f1; ln f1 f2; echo non-empty > sub/F; } )
-
-
-# Note that for this first test, we transform f1 or f2
-# (whichever name we find first) to f_.  That is necessary because,
-# depending on the type of file system, du could encounter either of those
-# two hard-linked files first, thus listing that one and not the other.
-du -a --exclude=sub dir \
-  | sed 's/^[0-9][0-9]*	//' | sed 's/f[12]/f_/' > out || fail=1
-echo === >> out
-du -a --exclude=sub --count-links dir \
-  | sed 's/^[0-9][0-9]*	//' | sort -r >> out || fail=1
+( cd dir &&
+  { echo non-empty > f1
+    ln f1 f2
+    ln -s f1 f3
+    echo non-empty > sub/F; } )
+
+du -a -L --exclude=sub --count-links dir \
+  | sed 's/^[0-9][0-9]*	//' | sort -r > out || fail=1
+
+# For these tests, transform f1 or f2 or f3 (whichever name is found
+# first) to f_.  That is necessary because, depending on the type of
+# file system, du could encounter any of those linked files first,
+# thus listing that one and not the others.
+for args in '-L' 'dir' '-L dir'
+do
+  echo === >> out
+  du -a --exclude=sub $args dir \
+    | sed 's/^[0-9][0-9]*	//' | sed 's/f[123]/f_/' >> out || fail=1
+done
+
 cat <<\EOF > exp
+dir/f3
+dir/f2
+dir/f1
+dir
+===
 dir/f_
 dir
 ===
-dir/f2
-dir/f1
+dir/f_
+dir/f_
+dir
+===
+dir/f_
 dir
 EOF
 




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#6557; Package coreutils. (Sat, 03 Jul 2010 08:19:01 GMT)

Message #8 received at 6557 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Paul Eggert <eggert <at> CS.UCLA.EDU>
Cc: 6557 <at> debbugs.gnu.org
Subject: Re: bug#6557: du sometimes miscounts directories,
	and files whose link count equals 1
Date: Sat, 03 Jul 2010 10:18:18 +0200

Paul Eggert wrote:
> (I found this bug by code inspection while doing the du performance
> improvement reported in:
> http://lists.gnu.org/archive/html/bug-coreutils/2010-07/msg00014.html
> )
>
> Unless -l is given, du is not supposed to count the same file more
> than once.  It optimizes this test by not bothering to put a file into
> the hash table if its link count is 1, or if it is a directory.  But
> this optimization is not correct if -L is given (because the same
> link-count-1 file, or directory, can be seen via symbolic links) or if
> two or more arguments are given (because the same such file can be
> seen under multiple arguments).  The optimization should be suppressed
> if -L is given, or if multiple arguments are given.
>
> Here is a patch, with a couple of test cases for it.  This patch
> assumes the du performance fix, but I can prepare an independent
> patch if you like.

Thanks!
Actually, that patch applies just fine, as-is.
However, it induces this new "make check" test failure:

    FAIL: du/files0-from (exit: 1)
    ==============================

    du (GNU coreutils) 8.5.75-569b2
    Copyright (C) 2010 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.

    Written by Torbjorn Granlund, David MacKenzie, Paul Eggert,
    and Jim Meyering.
    f-extra-arg...
    missing...
    minus-in-stdin...
    empty...
    empty-nonreg...
    nul-1...
    nul-2...
    1...
    1a...
    2...
    files0-from: test 2: stdout mismatch, comparing 2.O (actual) and 2.1 (expected)
    *** 2.O Sat Jul  3 09:28:08 2010
    --- 2.1 Sat Jul  3 09:28:08 2010
    ***************
    *** 1 ****
    --- 1,2 ----
      0     g
    + 0     g
    2a...
    files0-from: test 2a: stdout mismatch, comparing 2a.O (actual) and 2a.1 (expected)
    *** 2a.O        Sat Jul  3 09:28:08 2010
    --- 2a.1        Sat Jul  3 09:28:08 2010
    ***************
    *** 1 ****
    --- 1,2 ----
      0     g
    + 0     g
    zero-len...



That's because with the unpatched "du", a command like this, with
a duplicate argument, prints two lines, while the patched version
prints only one:

    $ seq 100 > g; du g g
    4       g
    4       g

    $ seq 100 > g; ./du g g
    4       g

Note that the vendor versions of "du" from at least Solaris 10,
OpenBSD, NetBSD and FreeBSD print both lines.
I prefer the new semantics, especially when using --total:

    $ seq 100 > g; du --total g g
    4       g
    4       g
    8       total

    $ seq 100 > g; ./du --total g g
    4       g
    4       total

You can get some of the old semantics by using -l:

    $ seq 100 > g; ./du -l --total g g
    4       g
    4       g
    8       total

What do you think of breaking with that tradition?  POSIX does appear
to say that for each "FILE" argument du must print a line, but it also
mentions how with linked files, the space must be counted only once.
You can definitely consider listing the same file twice as being
analogous to a file being hard-linked.

An alternative might be to do this,

    $ seq 100 > g; du --total g g
    4       g
    0       g
    4       total

but this is too prone to misinterpretation both by people and by code
that parses du output.  So I'm inclined to go with your approach.

-------------------------------------
This is the additional patch we'd need to make the failing
test accept your new output.  You're welcome to merge
it into yours.

diff --git a/tests/du/files0-from b/tests/du/files0-from
index 620246d..860fc6a 100755
--- a/tests/du/files0-from
+++ b/tests/du/files0-from
@@ -70,15 +70,15 @@ my @Tests =
     {IN=>{f=>"g\0"}}, {AUX=>{g=>''}},
     {OUT=>"0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ],

-   # two file names, no final NUL
+   # two identical file names, no final NUL
    ['2', '--files0-from=-', '<',
     {IN=>{f=>"g\0g"}}, {AUX=>{g=>''}},
-    {OUT=>"0\tg\n0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ],
+    {OUT=>"0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ],

-   # two file names, with final NUL
+   # two identical file names, with final NUL
    ['2a', '--files0-from=-', '<',
     {IN=>{f=>"g\0g\0"}}, {AUX=>{g=>''}},
-    {OUT=>"0\tg\n0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ],
+    {OUT=>"0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ],

    # Ensure that $prog processes FILEs following a zero-length name.
    ['zero-len', '--files0-from=-', '<',




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#6557; Package coreutils. (Sat, 03 Jul 2010 08:37:01 GMT)

Message #11 received at 6557 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Paul Eggert <eggert <at> CS.UCLA.EDU>
Cc: 6557 <at> debbugs.gnu.org
Subject: Re: bug#6557: du sometimes miscounts directories,
	and files whose link count equals 1
Date: Sat, 03 Jul 2010 10:36:00 +0200

Jim Meyering wrote:
> Paul Eggert wrote:
>> (I found this bug by code inspection while doing the du performance
>> improvement reported in:
>> http://lists.gnu.org/archive/html/bug-coreutils/2010-07/msg00014.html
>> )
>>
>> Unless -l is given, du is not supposed to count the same file more
>> than once.  It optimizes this test by not bothering to put a file into
>> the hash table if its link count is 1, or if it is a directory.  But
>> this optimization is not correct if -L is given (because the same
>> link-count-1 file, or directory, can be seen via symbolic links) or if
>> two or more arguments are given (because the same such file can be
>> seen under multiple arguments).  The optimization should be suppressed
>> if -L is given, or if multiple arguments are given.
>>
>> Here is a patch, with a couple of test cases for it.  This patch
>> assumes the du performance fix, but I can prepare an independent
>> patch if you like.
>
> Thanks!
> Actually, that patch applies just fine, as-is.
> However, it induces this new "make check" test failure:
...
> This is the additional patch we'd need to make the failing
> test accept your new output.  You're welcome to merge
> it into yours.

Actually I did that.
Here's the adjusted patch, for review.
Note the "du: " prefix on the one-line log summary -- that's
the part that goes into the Subject below.  Plus, I shortened it.
Also, I added a log line for the tests/du/files0-from change.
(BTW, the following is the output from "git format-patch --stdout -1".
It's easy to apply that by saving it in a FILE, then running "git am FILE")

From efe53cc72b599979ea292754ecfe8abf7c839d22 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert <at> CS.UCLA.EDU>
Date: Fri, 2 Jul 2010 23:41:08 -0700
Subject: [PATCH] du: don't miscount duplicate directories or link-count-1 files

* NEWS: Mention this.
* src/du.c (hash_all): New static var.
(process_file): Use it.
(main): Set it.
* tests/du/hard-link: Add a couple of test cases to help make
sure this bug stays squashed.
* tests/du/files0-from: Adjust existing tests to reflect
change in semantics with duplicate arguments.
---
 NEWS                 |    5 +++++
 src/du.c             |   15 +++++++++++++--
 tests/du/files0-from |    8 ++++----
 tests/du/hard-link   |   44 ++++++++++++++++++++++++++++++--------------
 4 files changed, 52 insertions(+), 20 deletions(-)

diff --git a/NEWS b/NEWS
index 3a24925..b02a223 100644
--- a/NEWS
+++ b/NEWS
@@ -38,6 +38,11 @@ GNU coreutils NEWS                                    -*- outline -*-
   Also errors are no longer suppressed for unsupported file types, and
   relative sizes are restricted to supported file types.

+** Bug fixes
+
+  du no longer multiply counts a file that is a directory or whose
+  link count is 1, even if the file is reached multiple times by
+  following symlinks or via multiple arguments.

 * Noteworthy changes in release 8.5 (2010-04-23) [stable]

diff --git a/src/du.c b/src/du.c
index a90568e..4d6e03a 100644
--- a/src/du.c
+++ b/src/du.c
@@ -132,6 +132,9 @@ static bool apparent_size = false;
 /* If true, count each hard link of files with multiple links.  */
 static bool opt_count_all = false;

+/* If true, hash all files to look for hard links.  */
+static bool hash_all;
+
 /* If true, output the NUL byte instead of a newline at the end of each line. */
 static bool opt_nul_terminate_output = false;

@@ -518,8 +521,7 @@ process_file (FTS *fts, FTSENT *ent)
      via a hard link, then don't let it contribute to the sums.  */
   if (skip
       || (!opt_count_all
-          && ! S_ISDIR (sb->st_mode)
-          && 1 < sb->st_nlink
+          && (hash_all || (! S_ISDIR (sb->st_mode) && 1 < sb->st_nlink))
           && ! hash_ins (sb->st_ino, sb->st_dev)))
     {
       /* Note that we must not simply return here.
@@ -937,11 +939,20 @@ main (int argc, char **argv)
                quote (files_from));

       ai = argv_iter_init_stream (stdin);
+
+      /* It's not easy here to count the arguments, so assume the
+         worst.  */
+      hash_all = true;
     }
   else
     {
       char **files = (optind < argc ? argv + optind : cwd_only);
       ai = argv_iter_init_argv (files);
+
+      /* Hash all dev,ino pairs if there are multiple arguments, or if
+         following non-command-line symlinks, because in either case a
+         file with just one hard link might be seen more than once.  */
+      hash_all = (optind + 1 < argc || symlink_deref_bits == FTS_LOGICAL);
     }

   if (!ai)
diff --git a/tests/du/files0-from b/tests/du/files0-from
index 620246d..860fc6a 100755
--- a/tests/du/files0-from
+++ b/tests/du/files0-from
@@ -70,15 +70,15 @@ my @Tests =
     {IN=>{f=>"g\0"}}, {AUX=>{g=>''}},
     {OUT=>"0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ],

-   # two file names, no final NUL
+   # two identical file names, no final NUL
    ['2', '--files0-from=-', '<',
     {IN=>{f=>"g\0g"}}, {AUX=>{g=>''}},
-    {OUT=>"0\tg\n0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ],
+    {OUT=>"0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ],

-   # two file names, with final NUL
+   # two identical file names, with final NUL
    ['2a', '--files0-from=-', '<',
     {IN=>{f=>"g\0g\0"}}, {AUX=>{g=>''}},
-    {OUT=>"0\tg\n0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ],
+    {OUT=>"0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ],

    # Ensure that $prog processes FILEs following a zero-length name.
    ['zero-len', '--files0-from=-', '<',
diff --git a/tests/du/hard-link b/tests/du/hard-link
index 7e4f51a..e22320b 100755
--- a/tests/du/hard-link
+++ b/tests/du/hard-link
@@ -26,24 +26,40 @@ fi
 . $srcdir/test-lib.sh

 mkdir -p dir/sub
-( cd dir && { echo non-empty > f1; ln f1 f2; echo non-empty > sub/F; } )
-
-
-# Note that for this first test, we transform f1 or f2
-# (whichever name we find first) to f_.  That is necessary because,
-# depending on the type of file system, du could encounter either of those
-# two hard-linked files first, thus listing that one and not the other.
-du -a --exclude=sub dir \
-  | sed 's/^[0-9][0-9]*	//' | sed 's/f[12]/f_/' > out || fail=1
-echo === >> out
-du -a --exclude=sub --count-links dir \
-  | sed 's/^[0-9][0-9]*	//' | sort -r >> out || fail=1
+( cd dir &&
+  { echo non-empty > f1
+    ln f1 f2
+    ln -s f1 f3
+    echo non-empty > sub/F; } )
+
+du -a -L --exclude=sub --count-links dir \
+  | sed 's/^[0-9][0-9]*	//' | sort -r > out || fail=1
+
+# For these tests, transform f1 or f2 or f3 (whichever name is found
+# first) to f_.  That is necessary because, depending on the type of
+# file system, du could encounter any of those linked files first,
+# thus listing that one and not the others.
+for args in '-L' 'dir' '-L dir'
+do
+  echo === >> out
+  du -a --exclude=sub $args dir \
+    | sed 's/^[0-9][0-9]*	//' | sed 's/f[123]/f_/' >> out || fail=1
+done
+
 cat <<\EOF > exp
+dir/f3
+dir/f2
+dir/f1
+dir
+===
 dir/f_
 dir
 ===
-dir/f2
-dir/f1
+dir/f_
+dir/f_
+dir
+===
+dir/f_
 dir
 EOF

--
1.7.2.rc1.192.g262ff




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#6557; Package coreutils. (Sun, 04 Jul 2010 01:49:02 GMT)

Message #14 received at 6557 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> CS.UCLA.EDU>
To: Jim Meyering <jim <at> meyering.net>
Cc: 6557 <at> debbugs.gnu.org
Subject: Re: bug#6557: du sometimes miscounts directories, and files whose
	link count equals 1
Date: Sat, 03 Jul 2010 18:48:40 -0700

On 07/03/10 01:36, Jim Meyering wrote:

> Here's the adjusted patch, for review.

Yes, thanks, that looks good and it works for me.

> Also, I added a log line for the tests/du/files0-from change.
> (BTW, the following is the output from "git format-patch --stdout -1".
> It's easy to apply that by saving it in a FILE, then running "git am FILE")

Yes, and here's a proposed change to README-hacking to try to record
this advice, along with some other good advice you've given me recently:

From ded44a4b21f50faf40aa70695bec20b3822cffd1 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert <at> cs.ucla.edu>
Date: Sat, 3 Jul 2010 18:44:16 -0700
Subject: [PATCH] Add advice about ChangeLogs and synchronizing submodules.

* README-hacking: Adjust accordingly.
---
 README-hacking |   29 +++++++++++++++++++++++++++++
 1 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/README-hacking b/README-hacking
index fecbf9e..02cb277 100644
--- a/README-hacking
+++ b/README-hacking
@@ -39,6 +39,12 @@ which are extracted from other source packages:
 
         $ ./bootstrap
 
+To use the most-recent gnulib (as opposed to the gnulib version that
+the package last synchronized to), do this next:
+
+        $ git submodule foreach git pull origin master
+        $ git commit -a -m 'build: update gnulib submodule to latest'
+
 And there you are!  Just
 
         $ ./configure --quiet #[--enable-gcc-warnings] [*]
@@ -60,6 +66,29 @@ to use recent system headers.  If you configure with this option,
 and spot a problem, please be sure to send the report to the bug
 reporting address of this package, and not to that of gnulib, even
 if the problem seems to originate in a gnulib-provided file.
+
+* Submitting patches
+
+If you develop a fix or a new feature, please send it to the
+appropriate bug-reporting address as reported by the --help option of
+each program.  One way to do this is to use vc-dwim
+(<http://www.gnu.org/software/vc-dwim/>), as follows.
+
+  Run the command "vc-dwim --help", copy its definition of the
+  "git-changelog-symlink-init" function into your shell, and then run
+  this function at the top-level directory of the package.
+
+  Edit the ChangeLog file that this command creates, creating a
+  properly-formatted entry according to the GNU coding standards
+  <http://www.gnu.org/prep/standards/html_node/Change-Logs.html>.
+
+  Run the command "vc-dwim" and make sure its output looks good.
+
+  Run "vc-dwim --commit".
+
+  Run the command "git format-patch --stdout -1", and email its output
+  in, using the the output's subject line.
+
 -----
 
 Copyright (C) 2002-2010 Free Software Foundation, Inc.
-- 
1.7.0.4





Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#6557; Package coreutils. (Sun, 04 Jul 2010 06:38:02 GMT)

Message #17 received at 6557 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Paul Eggert <eggert <at> CS.UCLA.EDU>
Cc: 6557 <at> debbugs.gnu.org
Subject: Re: bug#6557: du sometimes miscounts directories,
	and files whose link count equals 1
Date: Sun, 04 Jul 2010 08:36:55 +0200

Paul Eggert wrote:
> On 07/03/10 01:36, Jim Meyering wrote:
>
>> Here's the adjusted patch, for review.
>
> Yes, thanks, that looks good and it works for me.

I've pushed that du fix.

>> Also, I added a log line for the tests/du/files0-from change.
>> (BTW, the following is the output from "git format-patch --stdout -1".
>> It's easy to apply that by saving it in a FILE, then running "git am FILE")
>
> Yes, and here's a proposed change to README-hacking to try to record
> this advice, along with some other good advice you've given me recently:

Thanks!

> Subject: [PATCH] Add advice about ChangeLogs and synchronizing submodules.

I like to put a "doc: " at the beginning of such summary lines
and to omit the trailing ".":

  doc: add advice about ChangeLogs and synchronizing submodules

> * README-hacking: Adjust accordingly.
> ---
>  README-hacking |   29 +++++++++++++++++++++++++++++
>  1 files changed, 29 insertions(+), 0 deletions(-)
>
> diff --git a/README-hacking b/README-hacking
> index fecbf9e..02cb277 100644
> --- a/README-hacking
> +++ b/README-hacking
> @@ -39,6 +39,12 @@ which are extracted from other source packages:
>
>          $ ./bootstrap
>
> +To use the most-recent gnulib (as opposed to the gnulib version that
> +the package last synchronized to), do this next:
> +
> +        $ git submodule foreach git pull origin master
> +        $ git commit -a -m 'build: update gnulib submodule to latest'

In general, I try to ensure that each gnulib-updating change
remains in a commit all by itself[*], partly because
they are relatively likely to conflict -- esp. if I do the
update on a branch, later update to a different version
on the trunk and try to rebase.  If it's a commit by itself
it's trivial to avoid trouble: just remove the commit before rebasing
the branch.

So maybe this, instead?

       $ git commit -m 'build: update gnulib submodule to latest' gnulib

[*] However, when a gnulib change induces a matching change
in coreutils, the gnulib-updating part obviously belongs
with the coreutils-changing deltas.

>  And there you are!  Just
>
>          $ ./configure --quiet #[--enable-gcc-warnings] [*]
> @@ -60,6 +66,29 @@ to use recent system headers.  If you configure with this option,
>  and spot a problem, please be sure to send the report to the bug
>  reporting address of this package, and not to that of gnulib, even
>  if the problem seems to originate in a gnulib-provided file.
> +
> +* Submitting patches
> +
> +If you develop a fix or a new feature, please send it to the
> +appropriate bug-reporting address as reported by the --help option of
> +each program.  One way to do this is to use vc-dwim
> +(<http://www.gnu.org/software/vc-dwim/>), as follows.
> +
> +  Run the command "vc-dwim --help", copy its definition of the
> +  "git-changelog-symlink-init" function into your shell, and then run
> +  this function at the top-level directory of the package.

This (above and below) is precisely the process I use.
Thanks for documenting it.  It may sound a little tortuous,
but has some hidden benefits.

> +  Edit the ChangeLog file that this command creates, creating a
> +  properly-formatted entry according to the GNU coding standards
> +  <http://www.gnu.org/prep/standards/html_node/Change-Logs.html>.
> +
> +  Run the command "vc-dwim" and make sure its output looks good.
> +
> +  Run "vc-dwim --commit".
> +
> +  Run the command "git format-patch --stdout -1", and email its output
> +  in, using the the output's subject line.
---------------^^^ ^^^

"make syntax-check" spotted the doubled "the".

You're welcome to push the result.




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#6557; Package coreutils. (Sun, 04 Jul 2010 22:51:02 GMT)

Message #20 received at 6557 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> CS.UCLA.EDU>
To: Jim Meyering <jim <at> meyering.net>
Cc: 6557 <at> debbugs.gnu.org
Subject: Re: bug#6557: du sometimes miscounts directories,	and files whose
	link count equals 1
Date: Sun, 04 Jul 2010 15:50:24 -0700

On 07/03/10 23:36, Jim Meyering wrote:

> So maybe this, instead?
> 
>        $ git commit -m 'build: update gnulib submodule to latest' gnulib

Sure, that's better.

> "make syntax-check" spotted the doubled "the".
> 
> You're welcome to push the result.

Thanks, I did that, with the two fixes noted above.




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#6557; Package coreutils. (Sun, 04 Jul 2010 23:04:02 GMT)

Message #23 received at 6557 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> CS.UCLA.EDU>
To: Jim Meyering <jim <at> meyering.net>
Cc: 6557 <at> debbugs.gnu.org
Subject: Re: bug#6557: du sometimes miscounts directories, and files whose
	link count equals 1
Date: Sun, 04 Jul 2010 16:03:11 -0700

On 07/03/10 01:18, Jim Meyering wrote:

> Note that the vendor versions of "du" from at least Solaris 10,
> OpenBSD, NetBSD and FreeBSD print both lines.
> I prefer the new semantics, especially when using --total:

Yes, the new semantics make more sense.  If you prefer the
traditional semantics, you can still get them, by using
"du A; du B" rather than "du A B".  In contrast, there's no way
to get the new (and better) semantics if all you have is the
traditional behavior.  This is another argument for staying
with the new semantics.

GNU du had already diverged from the traditional semantics, in
that it kept track of hard links across argument boundaries, which
traditional du does not.  (This behavior is documented in the
coreutils manual.)

Solaris 10 du -L is clearly busted, by the way, in that it counts
files multiply if their link count is 1.  I wouldn't be surprised
if the BSD du implementations are busted too.  This behavior is not
reasonable (and clearly doesn't conform to POSIX).  So in some sense
that weakens the argument of following the precedent of these older
implementations in this area.

> What do you think of breaking with that tradition?  POSIX does appear
> to say that for each "FILE" argument du must print a line, but it also
> mentions how with linked files, the space must be counted only once.
> You can definitely consider listing the same file twice as being
> analogous to a file being hard-linked.

The POSIX requirements are contradictory, and clearly the authors
had not thought through the implications.  When they're contradictory
we should do the best we can, and perhaps get POSIX fixed at some point
to clearly allow the new GNU behavior (as well as clearly allowing the
traditional behavior of course; right now POSIX does neither).

Thanks for fixing the test cases to match the new behavior.
I had only run the test case that I had updated, and should
have run them all (my only defense being that I'm using a
circa-2003 desktop to test....).




bug marked as fixed in version 8.6, send any further explanations to Paul Eggert <eggert <at> CS.UCLA.EDU> Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Wed, 14 Jul 2010 16:20:03 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 12 Aug 2010 11:24:04 GMT)

This bug report was last modified 13 years and 265 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.