Package: coreutils;
Reported by: Samuel Thibault <samuel.thibault <at> gnu.org>
Date: Mon, 7 May 2012 00:59:02 UTC
Severity: normal
Done: Jim Meyering <jim <at> meyering.net>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 11424 in the body.
You can then email your comments to 11424 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
View this report as an mbox folder, status mbox, maintainer mbox
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Mon, 07 May 2012 00:59:02 GMT) Full text and rfc822 format available.Samuel Thibault <samuel.thibault <at> gnu.org>
:bug-coreutils <at> gnu.org
.
(Mon, 07 May 2012 00:59:02 GMT) Full text and rfc822 format available.Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
From: Samuel Thibault <samuel.thibault <at> gnu.org> To: bug-coreutils <at> gnu.org Subject: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Mon, 7 May 2012 02:55:51 +0200
Hello, Since some time, coreutils fails in split tests on GNU/Hurd. More precisely, these two: split/filter:55 split --filter="head -c1 >/dev/null" -n 1 /dev/zero split/l-chunk:39 split -n l/2 /dev/zero It seems that these two tests assume that split will stop by itself when given /dev/zero as input. It does so on Linux, because /dev/zero there has stat_buf.st_size equal to 0, and thus split does just one loop, but on GNU/Hurd /dev/zero has stat_buf.st_size equal to LONG_MAX, and thus split just loops for a very long time. Posix doesn't seem to talk much about what should be done here, but it seems to me that coreutils tests should not assume size being zero, and for instance use dd to fetch only the required bytes from /dev/zero. Samuel
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Mon, 07 May 2012 07:49:02 GMT) Full text and rfc822 format available.Message #8 received at 11424 <at> debbugs.gnu.org (full text, mbox):
From: Jim Meyering <jim <at> meyering.net> To: Samuel Thibault <samuel.thibault <at> gnu.org> Cc: 11424 <at> debbugs.gnu.org Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Mon, 07 May 2012 09:46:43 +0200
Samuel Thibault wrote: > Since some time, coreutils fails in split tests on GNU/Hurd. More > precisely, these two: > > split/filter:55 > split --filter="head -c1 >/dev/null" -n 1 /dev/zero > > split/l-chunk:39 > split -n l/2 /dev/zero > > It seems that these two tests assume that split will stop by itself when > given /dev/zero as input. It does so on Linux, because /dev/zero there > has stat_buf.st_size equal to 0, and thus split does just one loop, but > on GNU/Hurd /dev/zero has stat_buf.st_size equal to LONG_MAX, and thus > split just loops for a very long time. Posix doesn't seem to talk much > about what should be done here, but it seems to me that coreutils tests > should not assume size being zero, and for instance use dd to fetch only > the required bytes from /dev/zero. Hi Samuel, The real problem is that split was using stat.st_size from non-regular files: that is not defined. Here is a patch: From 0a63df4b669faf0585beab09f4b177c39d557b21 Mon Sep 17 00:00:00 2001 From: Jim Meyering <meyering <at> redhat.com> Date: Mon, 7 May 2012 09:32:00 +0200 Subject: [PATCH] split: avoid apparent infloop when splitting /dev/zero w/-n on the Hurd * src/split.c (main): Use stat.st_size only for regular files. Samuel Thibault reported in http://bugs.gnu.org/11424 that the /dev/zero-splitting tests would appear to infloop on GNU/Hurd, because /dev/zero's st_size is LONG_MAX. It was only a problem when using the --number (-n) option. * NEWS (Bug fixes): Mention it. This bug was introduced with the --number option, via commit v8.7-25-gbe10739 --- NEWS | 3 +++ src/split.c | 4 +++- 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/NEWS b/NEWS index fd563c0..7563646 100644 --- a/NEWS +++ b/NEWS @@ -19,6 +19,9 @@ GNU coreutils NEWS -*- outline -*- was particularly easy to trigger, since there, the removal of D could precede the initial stat. [This bug was present in "the beginning".] + split --number=C /dev/null no longer appears to infloop on GNU/Hurd + [bug introduced in coreutils-8.8] + ** New features fmt now accepts the --goal=WIDTH (-g) option. diff --git a/src/split.c b/src/split.c index 99f6390..062aede 100644 --- a/src/split.c +++ b/src/split.c @@ -1339,7 +1339,9 @@ main (int argc, char **argv) error (EXIT_FAILURE, errno, "%s", infile); if (in_blk_size == 0) in_blk_size = io_blksize (stat_buf); - file_size = stat_buf.st_size; + + /* stat.st_size is valid only for regular files. For others, use 0. */ + file_size = S_ISREG (stat_buf.st_mode) ? stat_buf.st_size : 0; if (split_type == type_chunk_bytes || split_type == type_chunk_lines) { -- 1.7.10.1.457.g8275905
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Mon, 07 May 2012 07:53:02 GMT) Full text and rfc822 format available.Message #11 received at 11424 <at> debbugs.gnu.org (full text, mbox):
From: Paul Eggert <eggert <at> cs.ucla.edu> To: Samuel Thibault <samuel.thibault <at> gnu.org> Cc: 11424 <at> debbugs.gnu.org Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Mon, 07 May 2012 00:50:00 -0700
Thanks, here's a quick cut at a patch to 'split' to fix the problem. Also, the test cases also need to be adjusted, so as not to attempt to split -n on a device. diff --git a/src/split.c b/src/split.c index 99f6390..8b966bc 100644 --- a/src/split.c +++ b/src/split.c @@ -1344,7 +1344,9 @@ main (int argc, char **argv) if (split_type == type_chunk_bytes || split_type == type_chunk_lines) { off_t input_offset = lseek (STDIN_FILENO, 0, SEEK_CUR); - if (input_offset < 0) + if (input_offset < 0 + || ! (S_ISREG (stat_buf.st_mode) + || S_TYPEISSHM (&stat_buf) || S_TYPEISTMO (&stat_buf))) error (EXIT_FAILURE, 0, _("%s: cannot determine file size"), quote (infile)); file_size -= input_offset; I audited coreutils for other misuses of st_size and found some; here are additional quick cuts at patches that look like they're needed. I didn't attempt to look for missed optimizations; just errors. diff --git a/src/dd.c b/src/dd.c index 4626de2..147b69d 100644 --- a/src/dd.c +++ b/src/dd.c @@ -1544,7 +1544,8 @@ skip (int fdesc, char const *file, uintmax_t records, size_t blocksize, struct stat st; if (fstat (STDIN_FILENO, &st) != 0) error (EXIT_FAILURE, errno, _("cannot fstat %s"), quote (file)); - if (S_ISREG (st.st_mode) && st.st_size < (input_offset + offset)) + if ((S_ISREG (st.st_mode) || S_TYPEISSHM (&st) || S_TYPEISTMO (&st)) + && st.st_size < input_offset + offset) { /* When skipping past EOF, return the number of _full_ blocks * that are not skipped, and set offset to EOF, so the caller @@ -2104,8 +2105,8 @@ dd_copy (void) } } - /* If the last write was converted to a seek, then for a regular file, - ftruncate to extend the size. */ + /* If the last write was converted to a seek, then for a regular file + or shared memory object, ftruncate to extend the size. */ if (final_op_was_seek) { struct stat stdout_stat; @@ -2114,7 +2115,7 @@ dd_copy (void) error (0, errno, _("cannot fstat %s"), quote (output_file)); return EXIT_FAILURE; } - if (S_ISREG (stdout_stat.st_mode)) + if (S_ISREG (stdout_stat.st_mode) || S_TYPEISSHM (&stdout_stat)) { off_t output_offset = lseek (STDOUT_FILENO, 0, SEEK_CUR); if (output_offset > stdout_stat.st_size) diff --git a/src/du.c b/src/du.c index 41c9535..7333941 100644 --- a/src/du.c +++ b/src/du.c @@ -99,7 +99,8 @@ duinfo_set (struct duinfo *a, uintmax_t size, struct timespec tmax) static inline void duinfo_add (struct duinfo *a, struct duinfo const *b) { - a->size += b->size; + uintmax_t sum = a->size + b->size; + a->size = a->size <= sum ? sum : UINTMAX_MAX; if (timespec_cmp (a->tmax, b->tmax) < 0) a->tmax = b->tmax; } @@ -370,8 +371,11 @@ static void print_only_size (uintmax_t n_bytes) { char buf[LONGEST_HUMAN_READABLE + 1]; - fputs (human_readable (n_bytes, buf, human_output_opts, - 1, output_block_size), stdout); + fputs ((n_bytes == UINTMAX_MAX + ? _("Infinity") + : human_readable (n_bytes, buf, human_output_opts, + 1, output_block_size)), + stdout); } /* Print size (and optionally time) indicated by *PDUI, followed by STRING. */ @@ -495,7 +499,7 @@ process_file (FTS *fts, FTSENT *ent) duinfo_set (&dui, (apparent_size - ? sb->st_size + ? MAX (0, sb->st_size) : (uintmax_t) ST_NBLOCKS (*sb) * ST_NBLOCKSIZE), (time_type == time_mtime ? get_stat_mtime (sb) : time_type == time_atime ? get_stat_atime (sb) diff --git a/src/od.c b/src/od.c index 7593796..a25f965 100644 --- a/src/od.c +++ b/src/od.c @@ -983,8 +983,7 @@ skip (uintmax_t n_skip) if (fstat (fileno (in_stream), &file_stats) == 0) { - /* The st_size field is valid only for regular files - (and for symbolic links, which cannot occur here). + /* The st_size field is valid for regular files. If the number of bytes left to skip is larger than the size of the current file, we can decrement n_skip and go on to the next file. Skip this optimization also diff --git a/src/stat.c b/src/stat.c index b2e1030..d001cda 100644 --- a/src/stat.c +++ b/src/stat.c @@ -954,7 +954,7 @@ print_stat (char *pformat, size_t prefix_len, unsigned int m, out_uint_x (pformat, prefix_len, minor (statbuf->st_rdev)); break; case 's': - out_uint (pformat, prefix_len, statbuf->st_size); + out_int (pformat, prefix_len, statbuf->st_size); break; case 'B': out_uint (pformat, prefix_len, ST_NBLOCKSIZE); diff --git a/src/truncate.c b/src/truncate.c index 9b847d2..eaef598 100644 --- a/src/truncate.c +++ b/src/truncate.c @@ -161,7 +161,8 @@ do_ftruncate (int fd, char const *fname, off_t ssize, off_t rsize, if (rsize < 0) /* fstat used above to get size. */ { - if (!S_ISREG (sb.st_mode) && !S_TYPEISSHM (&sb)) + if (!S_ISREG (sb.st_mode) + && !S_TYPEISSHM (&sb) && !S_TYPEISTMO (&sb)) { error (0, 0, _("cannot get the size of %s"), quote (fname)); return false; @@ -350,7 +351,7 @@ main (int argc, char **argv) struct stat sb; if (stat (ref_file, &sb) != 0) error (EXIT_FAILURE, errno, _("cannot stat %s"), quote (ref_file)); - if (!S_ISREG (sb.st_mode) && !S_TYPEISSHM (&sb)) + if (!S_ISREG (sb.st_mode) && !S_TYPEISSHM (&sb) && !S_TYPEISTMO (&sb)) error (EXIT_FAILURE, 0, _("cannot get the size of %s"), quote (ref_file)); if (!got_size)
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Mon, 07 May 2012 07:55:01 GMT) Full text and rfc822 format available.Message #14 received at 11424 <at> debbugs.gnu.org (full text, mbox):
From: Paul Eggert <eggert <at> cs.ucla.edu> To: Jim Meyering <jim <at> meyering.net> Cc: Samuel Thibault <samuel.thibault <at> gnu.org>, 11424 <at> debbugs.gnu.org Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Mon, 07 May 2012 00:52:11 -0700
On 05/07/2012 12:46 AM, Jim Meyering wrote: > + > + /* stat.st_size is valid only for regular files. For others, use 0. */ > + file_size = S_ISREG (stat_buf.st_mode) ? stat_buf.st_size : 0; Is it right to use 0 there, for non-regular files? Won't later code compute incorrect sizes in that case? Also, as a nit, stat.st_size is also valid for SHM and TMO files (this was in the patch I just sent).
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Mon, 07 May 2012 09:10:02 GMT) Full text and rfc822 format available.Message #17 received at 11424 <at> debbugs.gnu.org (full text, mbox):
From: Jim Meyering <jim <at> meyering.net> To: Paul Eggert <eggert <at> cs.ucla.edu> Cc: Samuel Thibault <samuel.thibault <at> gnu.org>, 11424 <at> debbugs.gnu.org Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Mon, 07 May 2012 11:06:58 +0200
Paul Eggert wrote: > On 05/07/2012 12:46 AM, Jim Meyering wrote: >> + >> + /* stat.st_size is valid only for regular files. For others, use 0. */ >> + file_size = S_ISREG (stat_buf.st_mode) ? stat_buf.st_size : 0; > > Is it right to use 0 there, for non-regular files? > Won't later code compute incorrect sizes in that case? Hi Paul, I agree that more change is required and do prefer the direction your patches suggest. However, to fix the Hurd/infloop with minimal impact elsewhere, I have a slight preference for my small change. I.e. continuing to operate on non-regular files with --number we don't have to change the split --number tests that operate on /dev/zero. Then, introducing the behavior change (with your follow-on patch) can be independent of the bug fix commit. I do admit that without being able to determine a size up front, there's little point in using that option, so your patch (reject files with unusable stat.st_size) is required. With or without my patch on Linux/GNU, if you split /dev/zero, it sets file_size = 0, so at least for the tested cases I don't think that patch introduces a regression. > Also, as a nit, stat.st_size is also valid for > SHM and TMO files (this was in the patch I just sent). Good point. Do you feel like adding something like this to system.h and completing your patch? /* Return a boolean indicating whether sb->st_size is defined. */ static inline bool usable_st_size (struct stat const *sb) { return S_ISREG (sb->st_mode) || S_TYPEISSHM (sb) || S_TYPEISTMO (sb); }
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Tue, 08 May 2012 08:42:01 GMT) Full text and rfc822 format available.Message #20 received at 11424 <at> debbugs.gnu.org (full text, mbox):
From: Jim Meyering <jim <at> meyering.net> To: Paul Eggert <eggert <at> cs.ucla.edu> Cc: Samuel Thibault <samuel.thibault <at> gnu.org>, 11424 <at> debbugs.gnu.org Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Tue, 08 May 2012 10:39:18 +0200
Jim Meyering wrote: > Paul Eggert wrote: >> On 05/07/2012 12:46 AM, Jim Meyering wrote: >>> + >>> + /* stat.st_size is valid only for regular files. For others, use 0. */ >>> + file_size = S_ISREG (stat_buf.st_mode) ? stat_buf.st_size : 0; >> >> Is it right to use 0 there, for non-regular files? >> Won't later code compute incorrect sizes in that case? > > Hi Paul, > > I agree that more change is required and do prefer the direction your > patches suggest. However, to fix the Hurd/infloop with minimal > impact elsewhere, I have a slight preference for my small change. > I.e. continuing to operate on non-regular files with --number we > don't have to change the split --number tests that operate on /dev/zero. > Then, introducing the behavior change (with your follow-on patch) can > be independent of the bug fix commit. > > I do admit that without being able to determine a size up front, there's > little point in using that option, so your patch (reject files with > unusable stat.st_size) is required. > > With or without my patch on Linux/GNU, if you split /dev/zero, > it sets file_size = 0, so at least for the tested cases > I don't think that patch introduces a regression. > >> Also, as a nit, stat.st_size is also valid for >> SHM and TMO files (this was in the patch I just sent). > > Good point. > Do you feel like adding something like this to system.h > and completing your patch? > > /* Return a boolean indicating whether sb->st_size is defined. */ > static inline bool > usable_st_size (struct stat const *sb) > { > return S_ISREG (sb->st_mode) || S_TYPEISSHM (sb) || S_TYPEISTMO (sb); > } I went ahead and pushed the less-invasive fix. Your behavior-changing one is more than welcome, too.
Jim Meyering <jim <at> meyering.net>
:Samuel Thibault <samuel.thibault <at> gnu.org>
:Message #25 received at 11424-done <at> debbugs.gnu.org (full text, mbox):
From: Jim Meyering <jim <at> meyering.net> To: 11424-done <at> debbugs.gnu.org Cc: Paul Eggert <eggert <at> cs.ucla.edu>, Samuel Thibault <samuel.thibault <at> gnu.org> Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Tue, 08 May 2012 11:19:50 +0200
Jim Meyering wrote: >> Paul Eggert wrote: ... > I went ahead and pushed the less-invasive fix. > Your behavior-changing one is more than welcome, too. Samuel confirmed that the fix solved his problem, so I've marked this as closed.
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Tue, 08 May 2012 16:30:02 GMT) Full text and rfc822 format available.Message #28 received at 11424 <at> debbugs.gnu.org (full text, mbox):
From: Paul Eggert <eggert <at> cs.ucla.edu> To: Jim Meyering <jim <at> meyering.net> Cc: Samuel Thibault <samuel.thibault <at> gnu.org>, 11424 <at> debbugs.gnu.org Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Tue, 08 May 2012 09:27:19 -0700
On 05/08/2012 01:39 AM, Jim Meyering wrote: > I went ahead and pushed the less-invasive fix. Hmm, I don't see this on Savannah; is this part of the problem where usable_st_size got pushed? > Your behavior-changing one is more than welcome, too. I came up with a better idea, and propose this patch instead. The idea is to fall back on lseek if st_size is not reliable. This allows the programs to work in more cases, including the case in question. One test case needs to be removed because it assumes a command must fail, where it now typically works. From bba6f199f621fb434c5df09a0b0278a081be87e2 Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert <at> cs.ucla.edu> Date: Tue, 8 May 2012 09:22:22 -0700 Subject: [PATCH] maint: handle file sizes more reliably Problem reported by Samuel Thibault in <http://debbugs.gnu.org/11424>. * NEWS: Document this. * src/dd.c (skip): * src/split.c (main): * src/truncate.c (do_ftruncate, main): On files where st_size is not portable, fall back on using lseek with SEEK_END to determine the size. Although strictly speaking POSIX says the behavior is implementation-defined, in practice if lseek returns a nonnegative value it's a reasonable one to use for the file size. * src/dd.c (dd_copy): It's OK to truncate shared memory objects. * src/du.c (duinfo_add): Check for overflow. (print_only_size): Report overflow. (process_file): Ignore negative file sizes in the --apparent-size case. * src/od.c (skip): Fix comment about st_size. * src/stat.c (print_stat): Don't report negative sizes as huge positive ones. * src/system.h (usable_st_size): Symlinks have reliable st_size too. * tests/misc/truncate-dir-fail: Don't assume that getting the size of a dir is not allowed, as it's now allowed on many platforms, e.g., GNU/Linux. --- NEWS | 3 ++ src/dd.c | 17 +++++++++--- src/du.c | 12 ++++++--- src/od.c | 3 +- src/split.c | 7 +++++ src/stat.c | 2 +- src/system.h | 3 +- src/truncate.c | 56 ++++++++++++++++++++++++++++++----------- tests/misc/truncate-dir-fail | 3 -- 9 files changed, 76 insertions(+), 30 deletions(-) diff --git a/NEWS b/NEWS index 2dc6531..c911d52 100644 --- a/NEWS +++ b/NEWS @@ -21,6 +21,9 @@ GNU coreutils NEWS -*- outline -*- ** New features + dd, split, and truncate now allow any seekable files in situations + where the file size is needed, instead of insisting on regular files. + fmt now accepts the --goal=WIDTH (-g) option. ** Changes in behavior diff --git a/src/dd.c b/src/dd.c index 4626de2..75109bc 100644 --- a/src/dd.c +++ b/src/dd.c @@ -1542,9 +1542,18 @@ skip (int fdesc, char const *file, uintmax_t records, size_t blocksize, if (fdesc == STDIN_FILENO) { struct stat st; + off_t file_size; if (fstat (STDIN_FILENO, &st) != 0) error (EXIT_FAILURE, errno, _("cannot fstat %s"), quote (file)); - if (S_ISREG (st.st_mode) && st.st_size < (input_offset + offset)) + if (usable_st_size (&st)) + file_size = st.st_size; + else + { + file_size = skip_via_lseek (file, fdesc, 0, SEEK_END); + if (skip_via_lseek (file, fdesc, offset, SEEK_CUR) < 0) + error (EXIT_FAILURE, errno, _("%s: cannot skip"), quote (file)); + } + if (0 <= file_size && file_size < input_offset + offset) { /* When skipping past EOF, return the number of _full_ blocks * that are not skipped, and set offset to EOF, so the caller @@ -2104,8 +2113,8 @@ dd_copy (void) } } - /* If the last write was converted to a seek, then for a regular file, - ftruncate to extend the size. */ + /* If the last write was converted to a seek, then for a regular file + or shared memory object, ftruncate to extend the size. */ if (final_op_was_seek) { struct stat stdout_stat; @@ -2114,7 +2123,7 @@ dd_copy (void) error (0, errno, _("cannot fstat %s"), quote (output_file)); return EXIT_FAILURE; } - if (S_ISREG (stdout_stat.st_mode)) + if (S_ISREG (stdout_stat.st_mode) || S_TYPEISSHM (&stdout_stat)) { off_t output_offset = lseek (STDOUT_FILENO, 0, SEEK_CUR); if (output_offset > stdout_stat.st_size) diff --git a/src/du.c b/src/du.c index 41c9535..7333941 100644 --- a/src/du.c +++ b/src/du.c @@ -99,7 +99,8 @@ duinfo_set (struct duinfo *a, uintmax_t size, struct timespec tmax) static inline void duinfo_add (struct duinfo *a, struct duinfo const *b) { - a->size += b->size; + uintmax_t sum = a->size + b->size; + a->size = a->size <= sum ? sum : UINTMAX_MAX; if (timespec_cmp (a->tmax, b->tmax) < 0) a->tmax = b->tmax; } @@ -370,8 +371,11 @@ static void print_only_size (uintmax_t n_bytes) { char buf[LONGEST_HUMAN_READABLE + 1]; - fputs (human_readable (n_bytes, buf, human_output_opts, - 1, output_block_size), stdout); + fputs ((n_bytes == UINTMAX_MAX + ? _("Infinity") + : human_readable (n_bytes, buf, human_output_opts, + 1, output_block_size)), + stdout); } /* Print size (and optionally time) indicated by *PDUI, followed by STRING. */ @@ -495,7 +499,7 @@ process_file (FTS *fts, FTSENT *ent) duinfo_set (&dui, (apparent_size - ? sb->st_size + ? MAX (0, sb->st_size) : (uintmax_t) ST_NBLOCKS (*sb) * ST_NBLOCKSIZE), (time_type == time_mtime ? get_stat_mtime (sb) : time_type == time_atime ? get_stat_atime (sb) diff --git a/src/od.c b/src/od.c index 7593796..a25f965 100644 --- a/src/od.c +++ b/src/od.c @@ -983,8 +983,7 @@ skip (uintmax_t n_skip) if (fstat (fileno (in_stream), &file_stats) == 0) { - /* The st_size field is valid only for regular files - (and for symbolic links, which cannot occur here). + /* The st_size field is valid for regular files. If the number of bytes left to skip is larger than the size of the current file, we can decrement n_skip and go on to the next file. Skip this optimization also diff --git a/src/split.c b/src/split.c index 99f6390..fe14d80 100644 --- a/src/split.c +++ b/src/split.c @@ -1344,6 +1344,13 @@ main (int argc, char **argv) if (split_type == type_chunk_bytes || split_type == type_chunk_lines) { off_t input_offset = lseek (STDIN_FILENO, 0, SEEK_CUR); + if (0 <= input_offset && ! usable_st_size (&stat_buf)) + { + file_size = lseek (STDIN_FILENO, 0, SEEK_END); + input_offset = (file_size < 0 + ? file_size + : lseek (STDIN_FILENO, input_offset, SEEK_SET)); + } if (input_offset < 0) error (EXIT_FAILURE, 0, _("%s: cannot determine file size"), quote (infile)); diff --git a/src/stat.c b/src/stat.c index b2e1030..d001cda 100644 --- a/src/stat.c +++ b/src/stat.c @@ -954,7 +954,7 @@ print_stat (char *pformat, size_t prefix_len, unsigned int m, out_uint_x (pformat, prefix_len, minor (statbuf->st_rdev)); break; case 's': - out_uint (pformat, prefix_len, statbuf->st_size); + out_int (pformat, prefix_len, statbuf->st_size); break; case 'B': out_uint (pformat, prefix_len, ST_NBLOCKSIZE); diff --git a/src/system.h b/src/system.h index e3d3156..06f09cb 100644 --- a/src/system.h +++ b/src/system.h @@ -605,7 +605,8 @@ bad_cast (char const *s) static inline bool usable_st_size (struct stat const *sb) { - return S_ISREG (sb->st_mode) || S_TYPEISSHM (sb) || S_TYPEISTMO (sb); + return (S_ISREG (sb->st_mode) || S_ISLNK (sb->st_mode) + || S_TYPEISSHM (sb) || S_TYPEISTMO (sb)); } void usage (int status) ATTRIBUTE_NORETURN; diff --git a/src/truncate.c b/src/truncate.c index 9b847d2..49d9732 100644 --- a/src/truncate.c +++ b/src/truncate.c @@ -157,23 +157,36 @@ do_ftruncate (int fd, char const *fname, off_t ssize, off_t rsize, } if (rel_mode) { - uintmax_t const fsize = rsize < 0 ? sb.st_size : rsize; + uintmax_t fsize; - if (rsize < 0) /* fstat used above to get size. */ + if (0 <= rsize) + fsize = rsize; + else { - if (!S_ISREG (sb.st_mode) && !S_TYPEISSHM (&sb)) + off_t file_size; + if (usable_st_size (&sb)) { - error (0, 0, _("cannot get the size of %s"), quote (fname)); - return false; + file_size = sb.st_size; + if (file_size < 0) + { + /* Sanity check. Overflow is the only reason I can think + this would ever go negative. */ + error (0, 0, _("%s has unusable, apparently negative size"), + quote (fname)); + return false; + } } - if (sb.st_size < 0) + else { - /* Sanity check. Overflow is the only reason I can think - this would ever go negative. */ - error (0, 0, _("%s has unusable, apparently negative size"), - quote (fname)); - return false; + file_size = lseek (fd, 0, SEEK_END); + if (file_size < 0) + { + error (0, errno, _("cannot get the size of %s"), + quote (fname)); + return false; + } } + fsize = file_size; } if (rel_mode == rm_min) @@ -348,15 +361,28 @@ main (int argc, char **argv) { /* FIXME: Maybe support getting size of block devices. */ struct stat sb; + off_t file_size = -1; if (stat (ref_file, &sb) != 0) error (EXIT_FAILURE, errno, _("cannot stat %s"), quote (ref_file)); - if (!S_ISREG (sb.st_mode) && !S_TYPEISSHM (&sb)) - error (EXIT_FAILURE, 0, _("cannot get the size of %s"), + if (usable_st_size (&sb)) + file_size = sb.st_size; + else + { + int ref_fd = open (ref_file, O_RDONLY); + if (0 <= ref_fd) + { + off_t file_end = lseek (ref_fd, 0, SEEK_END); + if (0 <= file_end && close (ref_fd) == 0) + file_size = file_end; + } + } + if (file_size < 0) + error (EXIT_FAILURE, errno, _("cannot get the size of %s"), quote (ref_file)); if (!got_size) - size = sb.st_size; + size = file_size; else - rsize = sb.st_size; + rsize = file_size; } oflags = O_WRONLY | (no_create ? 0 : O_CREAT) | O_NONBLOCK; diff --git a/tests/misc/truncate-dir-fail b/tests/misc/truncate-dir-fail index 1167352..54a3147 100755 --- a/tests/misc/truncate-dir-fail +++ b/tests/misc/truncate-dir-fail @@ -22,7 +22,4 @@ print_ver_ truncate # truncate on dir not allowed truncate -s+0 . && fail=1 -# getting the size of a dir is not allowed -truncate -r. file && fail=1 - Exit $fail -- 1.7.6.5
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Tue, 08 May 2012 16:42:03 GMT) Full text and rfc822 format available.Message #31 received at 11424 <at> debbugs.gnu.org (full text, mbox):
From: Jim Meyering <jim <at> meyering.net> To: Paul Eggert <eggert <at> cs.ucla.edu> Cc: Samuel Thibault <samuel.thibault <at> gnu.org>, 11424 <at> debbugs.gnu.org Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Tue, 08 May 2012 18:39:41 +0200
Paul Eggert wrote: > On 05/08/2012 01:39 AM, Jim Meyering wrote: >> I went ahead and pushed the less-invasive fix. > > Hmm, I don't see this on Savannah; is this part > of the problem where usable_st_size got pushed? Ahh... I think I know what happened. I had both the usable_st_size and split-hang-fix patches on a temporary branch and ran git rebase -i HEAD~2 intending to delete the usable_st_size change set just before pushing. Obviously I removed the other instead. I've just pushed the split-hang-fix patch, along with a gnulib-updating patch that also pulls in the latest init.sh and bootstrap scripts. >> Your behavior-changing one is more than welcome, too. > > I came up with a better idea, and propose this patch > instead. The idea is to fall back on lseek if > st_size is not reliable. This allows the programs > to work in more cases, including the case in question. > One test case needs to be removed because it assumes > a command must fail, where it now typically works. Thanks! I'll look at it this evening. ... > Subject: [PATCH] maint: handle file sizes more reliably > > Problem reported by Samuel Thibault in <http://debbugs.gnu.org/11424>. > * NEWS: Document this. > * src/dd.c (skip): > * src/split.c (main): > * src/truncate.c (do_ftruncate, main): > On files where st_size is not portable, fall back on using lseek > with SEEK_END to determine the size. Although strictly speaking > POSIX says the behavior is implementation-defined, in practice > if lseek returns a nonnegative value it's a reasonable one to > use for the file size. > * src/dd.c (dd_copy): It's OK to truncate shared memory objects. > * src/du.c (duinfo_add): Check for overflow. > (print_only_size): Report overflow. > (process_file): Ignore negative file sizes in the --apparent-size case. > * src/od.c (skip): Fix comment about st_size. > * src/stat.c (print_stat): Don't report negative sizes as huge > positive ones. > * src/system.h (usable_st_size): Symlinks have reliable st_size too. > * tests/misc/truncate-dir-fail: Don't assume that getting the size > of a dir is not allowed, as it's now allowed on many platforms, > e.g., GNU/Linux.
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Tue, 08 May 2012 23:00:02 GMT) Full text and rfc822 format available.Message #34 received at 11424 <at> debbugs.gnu.org (full text, mbox):
From: Pádraig Brady <P <at> draigBrady.com> To: Paul Eggert <eggert <at> cs.ucla.edu> Cc: 11424 <at> debbugs.gnu.org, Jim Meyering <jim <at> meyering.net>, Samuel Thibault <samuel.thibault <at> gnu.org> Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Tue, 08 May 2012 23:57:25 +0100
On 05/08/2012 05:27 PM, Paul Eggert wrote: > On 05/08/2012 01:39 AM, Jim Meyering wrote: >> I went ahead and pushed the less-invasive fix. > > Hmm, I don't see this on Savannah; is this part > of the problem where usable_st_size got pushed? > >> Your behavior-changing one is more than welcome, too. > > I came up with a better idea, and propose this patch > instead. The idea is to fall back on lseek if > st_size is not reliable. This allows the programs > to work in more cases, including the case in question. > One test case needs to be removed because it assumes > a command must fail, where it now typically works. > > >>From bba6f199f621fb434c5df09a0b0278a081be87e2 Mon Sep 17 00:00:00 2001 > From: Paul Eggert <eggert <at> cs.ucla.edu> > Date: Tue, 8 May 2012 09:22:22 -0700 > Subject: [PATCH] maint: handle file sizes more reliably > > Problem reported by Samuel Thibault in <http://debbugs.gnu.org/11424>. > * NEWS: Document this. > * src/dd.c (skip): > * src/split.c (main): > * src/truncate.c (do_ftruncate, main): > On files where st_size is not portable, fall back on using lseek > with SEEK_END to determine the size. Although strictly speaking > POSIX says the behavior is implementation-defined, in practice > if lseek returns a nonnegative value it's a reasonable one to > use for the file size. > * src/dd.c (dd_copy): It's OK to truncate shared memory objects. > * src/du.c (duinfo_add): Check for overflow. > (print_only_size): Report overflow. > (process_file): Ignore negative file sizes in the --apparent-size case. > * src/od.c (skip): Fix comment about st_size. > * src/stat.c (print_stat): Don't report negative sizes as huge > positive ones. > * src/system.h (usable_st_size): Symlinks have reliable st_size too. > * tests/misc/truncate-dir-fail: Don't assume that getting the size > of a dir is not allowed, as it's now allowed on many platforms, > e.g., GNU/Linux. > --- > NEWS | 3 ++ > src/dd.c | 17 +++++++++--- > src/du.c | 12 ++++++--- > src/od.c | 3 +- > src/split.c | 7 +++++ > src/stat.c | 2 +- > src/system.h | 3 +- > src/truncate.c | 56 ++++++++++++++++++++++++++++++----------- > tests/misc/truncate-dir-fail | 3 -- > 9 files changed, 76 insertions(+), 30 deletions(-) > > diff --git a/NEWS b/NEWS > index 2dc6531..c911d52 100644 > --- a/NEWS > +++ b/NEWS > @@ -21,6 +21,9 @@ GNU coreutils NEWS -*- outline -*- > > ** New features > > + dd, split, and truncate now allow any seekable files in situations > + where the file size is needed, instead of insisting on regular files. > + > fmt now accepts the --goal=WIDTH (-g) option. > > ** Changes in behavior > diff --git a/src/dd.c b/src/dd.c > index 4626de2..75109bc 100644 > --- a/src/dd.c > +++ b/src/dd.c > @@ -1542,9 +1542,18 @@ skip (int fdesc, char const *file, uintmax_t records, size_t blocksize, > if (fdesc == STDIN_FILENO) > { > struct stat st; > + off_t file_size; > if (fstat (STDIN_FILENO, &st) != 0) > error (EXIT_FAILURE, errno, _("cannot fstat %s"), quote (file)); > - if (S_ISREG (st.st_mode) && st.st_size < (input_offset + offset)) > + if (usable_st_size (&st)) > + file_size = st.st_size; > + else > + { > + file_size = skip_via_lseek (file, fdesc, 0, SEEK_END); > + if (skip_via_lseek (file, fdesc, offset, SEEK_CUR) < 0) > + error (EXIT_FAILURE, errno, _("%s: cannot skip"), quote (file)); > + } > + if (0 <= file_size && file_size < input_offset + offset) > { > /* When skipping past EOF, return the number of _full_ blocks > * that are not skipped, and set offset to EOF, so the caller Might SEEK_END have implications for tape devices, where we might go to the end even if only operating on the start of the device? This thread reminded me of an old one where I previously wondered if one could refactor out a stat_size() that would return unsigned and thus simplify many callers? http://lists.gnu.org/archive/html/bug-coreutils/2009-01/msg00069.html I think it was suggested in that thread that POSIX will return a positive st_size for those st_modes? Something like: uintmax_t stat_size (struct stat const *statp, int fd) { uintmax_t size = UINTMAX_MAX; /* Error. */ struct stat sb; if (! statp) { if (fd > 0) { if (fstat (fd, &sb) != 0) return size; statp = &sb; } else return size; } if (S_ISREG (statp->st_mode) || S_ISLNK (statp->st_mode) || S_TYPEISSHM (&stat_buf) || S_TYPEISTMO (&stat_buf)) size = statp->st_size; else if (fd > 0) ; /* Maybe do something with SEEK_END or BLKGETSIZE64 ? */ } cheers, Pádraig.
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Wed, 09 May 2012 10:18:01 GMT) Full text and rfc822 format available.Message #37 received at 11424 <at> debbugs.gnu.org (full text, mbox):
From: Jim Meyering <jim <at> meyering.net> To: Paul Eggert <eggert <at> cs.ucla.edu> Cc: Samuel Thibault <samuel.thibault <at> gnu.org>, 11424 <at> debbugs.gnu.org Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Wed, 09 May 2012 12:14:55 +0200
Paul Eggert wrote: > On 05/08/2012 01:39 AM, Jim Meyering wrote: >> I went ahead and pushed the less-invasive fix. > > Hmm, I don't see this on Savannah; is this part > of the problem where usable_st_size got pushed? > >> Your behavior-changing one is more than welcome, too. > > I came up with a better idea, and propose this patch > instead. The idea is to fall back on lseek if > st_size is not reliable. This allows the programs > to work in more cases, including the case in question. > One test case needs to be removed because it assumes > a command must fail, where it now typically works. > >>From bba6f199f621fb434c5df09a0b0278a081be87e2 Mon Sep 17 00:00:00 2001 > From: Paul Eggert <eggert <at> cs.ucla.edu> > Date: Tue, 8 May 2012 09:22:22 -0700 > Subject: [PATCH] maint: handle file sizes more reliably > > Problem reported by Samuel Thibault in <http://debbugs.gnu.org/11424>. > * NEWS: Document this. > * src/dd.c (skip): > * src/split.c (main): > * src/truncate.c (do_ftruncate, main): > On files where st_size is not portable, fall back on using lseek > with SEEK_END to determine the size. Although strictly speaking > POSIX says the behavior is implementation-defined, in practice > if lseek returns a nonnegative value it's a reasonable one to > use for the file size. > * src/dd.c (dd_copy): It's OK to truncate shared memory objects. > * src/du.c (duinfo_add): Check for overflow. > (print_only_size): Report overflow. > (process_file): Ignore negative file sizes in the --apparent-size case. > * src/od.c (skip): Fix comment about st_size. > * src/stat.c (print_stat): Don't report negative sizes as huge > positive ones. > * src/system.h (usable_st_size): Symlinks have reliable st_size too. > * tests/misc/truncate-dir-fail: Don't assume that getting the size > of a dir is not allowed, as it's now allowed on many platforms, > e.g., GNU/Linux. > --- ... > diff --git a/src/stat.c b/src/stat.c > index b2e1030..d001cda 100644 > --- a/src/stat.c > +++ b/src/stat.c > @@ -954,7 +954,7 @@ print_stat (char *pformat, size_t prefix_len, unsigned int m, > out_uint_x (pformat, prefix_len, minor (statbuf->st_rdev)); > break; > case 's': > - out_uint (pformat, prefix_len, statbuf->st_size); > + out_int (pformat, prefix_len, statbuf->st_size); > break; > case 'B': > out_uint (pformat, prefix_len, ST_NBLOCKSIZE); Thanks for all of that. I think Pádraig's question about dd's skip seeking to EOF on an actual tape device is the most important to address. Your stat.c change is actually a bug fix, so I'd prefer to put it in a separate commit. I did that for you. Let me know if the change below is ok and I'll push it -- or you're welcome to make any change you'd like and push it yourself. From 0b816e06d0b3d0cc9b7d2f92b095145bfe7c5476 Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert <at> cs.ucla.edu> Date: Wed, 9 May 2012 12:07:57 +0200 Subject: [PATCH] stat: don't report negative file size as huge positive number * src/stat.c (print_stat): Use out_int, not out_uint for stat.st_size. * NEWS (Bug fixes): Mention it. --- NEWS | 2 ++ src/stat.c | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/NEWS b/NEWS index eb95404..2935276 100644 --- a/NEWS +++ b/NEWS @@ -22,6 +22,8 @@ GNU coreutils NEWS -*- outline -*- split --number=C /dev/null no longer appears to infloop on GNU/Hurd [bug introduced in coreutils-8.8] + stat no longer reports a negative file size as a huge positive number + ** New features fmt now accepts the --goal=WIDTH (-g) option. diff --git a/src/stat.c b/src/stat.c index b2e1030..d001cda 100644 --- a/src/stat.c +++ b/src/stat.c @@ -954,7 +954,7 @@ print_stat (char *pformat, size_t prefix_len, unsigned int m, out_uint_x (pformat, prefix_len, minor (statbuf->st_rdev)); break; case 's': - out_uint (pformat, prefix_len, statbuf->st_size); + out_int (pformat, prefix_len, statbuf->st_size); break; case 'B': out_uint (pformat, prefix_len, ST_NBLOCKSIZE); -- 1.7.10.1.487.ga3935e6
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Wed, 09 May 2012 11:16:01 GMT) Full text and rfc822 format available.Message #40 received at 11424 <at> debbugs.gnu.org (full text, mbox):
From: Pádraig Brady <P <at> draigBrady.com> To: Jim Meyering <jim <at> meyering.net> Cc: Paul Eggert <eggert <at> cs.ucla.edu>, Samuel Thibault <samuel.thibault <at> gnu.org>, 11424 <at> debbugs.gnu.org Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Wed, 09 May 2012 12:12:56 +0100
On 05/09/2012 11:14 AM, Jim Meyering wrote: > Paul Eggert wrote: >> On 05/08/2012 01:39 AM, Jim Meyering wrote: >>> I went ahead and pushed the less-invasive fix. >> >> Hmm, I don't see this on Savannah; is this part >> of the problem where usable_st_size got pushed? >> >>> Your behavior-changing one is more than welcome, too. >> >> I came up with a better idea, and propose this patch >> instead. The idea is to fall back on lseek if >> st_size is not reliable. This allows the programs >> to work in more cases, including the case in question. >> One test case needs to be removed because it assumes >> a command must fail, where it now typically works. >> >> >From bba6f199f621fb434c5df09a0b0278a081be87e2 Mon Sep 17 00:00:00 2001 >> From: Paul Eggert <eggert <at> cs.ucla.edu> >> Date: Tue, 8 May 2012 09:22:22 -0700 >> Subject: [PATCH] maint: handle file sizes more reliably >> >> Problem reported by Samuel Thibault in <http://debbugs.gnu.org/11424>. >> * NEWS: Document this. >> * src/dd.c (skip): >> * src/split.c (main): >> * src/truncate.c (do_ftruncate, main): >> On files where st_size is not portable, fall back on using lseek >> with SEEK_END to determine the size. Although strictly speaking >> POSIX says the behavior is implementation-defined, in practice >> if lseek returns a nonnegative value it's a reasonable one to >> use for the file size. >> * src/dd.c (dd_copy): It's OK to truncate shared memory objects. >> * src/du.c (duinfo_add): Check for overflow. >> (print_only_size): Report overflow. >> (process_file): Ignore negative file sizes in the --apparent-size case. >> * src/od.c (skip): Fix comment about st_size. >> * src/stat.c (print_stat): Don't report negative sizes as huge >> positive ones. >> * src/system.h (usable_st_size): Symlinks have reliable st_size too. >> * tests/misc/truncate-dir-fail: Don't assume that getting the size >> of a dir is not allowed, as it's now allowed on many platforms, >> e.g., GNU/Linux. >> --- > ... >> diff --git a/src/stat.c b/src/stat.c >> index b2e1030..d001cda 100644 >> --- a/src/stat.c >> +++ b/src/stat.c >> @@ -954,7 +954,7 @@ print_stat (char *pformat, size_t prefix_len, unsigned int m, >> out_uint_x (pformat, prefix_len, minor (statbuf->st_rdev)); >> break; >> case 's': >> - out_uint (pformat, prefix_len, statbuf->st_size); >> + out_int (pformat, prefix_len, statbuf->st_size); >> break; >> case 'B': >> out_uint (pformat, prefix_len, ST_NBLOCKSIZE); > > Thanks for all of that. > I think Pádraig's question about dd's skip seeking to EOF on an > actual tape device is the most important to address. > > Your stat.c change is actually a bug fix, so I'd prefer to > put it in a separate commit. I did that for you. Let me know > if the change below is ok and I'll push it -- or you're welcome > to make any change you'd like and push it yourself. > >>From 0b816e06d0b3d0cc9b7d2f92b095145bfe7c5476 Mon Sep 17 00:00:00 2001 > From: Paul Eggert <eggert <at> cs.ucla.edu> > Date: Wed, 9 May 2012 12:07:57 +0200 > Subject: [PATCH] stat: don't report negative file size as huge positive > number > > * src/stat.c (print_stat): Use out_int, not out_uint for stat.st_size. > * NEWS (Bug fixes): Mention it. > --- > NEWS | 2 ++ > src/stat.c | 2 +- > 2 files changed, 3 insertions(+), 1 deletion(-) > > diff --git a/NEWS b/NEWS > index eb95404..2935276 100644 > --- a/NEWS > +++ b/NEWS > @@ -22,6 +22,8 @@ GNU coreutils NEWS -*- outline -*- > split --number=C /dev/null no longer appears to infloop on GNU/Hurd > [bug introduced in coreutils-8.8] > > + stat no longer reports a negative file size as a huge positive number > + > ** New features > > fmt now accepts the --goal=WIDTH (-g) option. > diff --git a/src/stat.c b/src/stat.c > index b2e1030..d001cda 100644 > --- a/src/stat.c > +++ b/src/stat.c > @@ -954,7 +954,7 @@ print_stat (char *pformat, size_t prefix_len, unsigned int m, > out_uint_x (pformat, prefix_len, minor (statbuf->st_rdev)); > break; > case 's': > - out_uint (pformat, prefix_len, statbuf->st_size); > + out_int (pformat, prefix_len, statbuf->st_size); > break; > case 'B': > out_uint (pformat, prefix_len, ST_NBLOCKSIZE); > -- > 1.7.10.1.487.ga3935e6 For the record, stat(1) has output st_size as unsigned since the initial implementation in fileutils-4.1.10. I noticed that st_size is unsigned for 32 bit linux kernels according to /usr/include/asm/stat.h, however my modern 32 kernel gives EOVERFLOW for files in the 2-4G range, and thus shouldn't put negative numbers in those fields. That used not be the case I think: http://lkml.indiana.edu/hypermail/linux/kernel/0004.1/0343.html Also other 32 bit environments might overflow here. So I think this change is a net improvement. cheers, Pádraig.
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Wed, 09 May 2012 11:25:02 GMT) Full text and rfc822 format available.Message #43 received at 11424 <at> debbugs.gnu.org (full text, mbox):
From: Jim Meyering <jim <at> meyering.net> To: Pádraig Brady <P <at> draigBrady.com> Cc: Paul Eggert <eggert <at> cs.ucla.edu>, Samuel Thibault <samuel.thibault <at> gnu.org>, 11424 <at> debbugs.gnu.org Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Wed, 09 May 2012 13:22:17 +0200
Pádraig Brady wrote: ... >>>From 0b816e06d0b3d0cc9b7d2f92b095145bfe7c5476 Mon Sep 17 00:00:00 2001 >> From: Paul Eggert <eggert <at> cs.ucla.edu> >> Date: Wed, 9 May 2012 12:07:57 +0200 >> Subject: [PATCH] stat: don't report negative file size as huge positive >> number >> >> * src/stat.c (print_stat): Use out_int, not out_uint for stat.st_size. >> * NEWS (Bug fixes): Mention it. ... > > For the record, stat(1) has output st_size as unsigned since the > initial implementation in fileutils-4.1.10. > > I noticed that st_size is unsigned for 32 bit linux kernels > according to /usr/include/asm/stat.h, however my modern 32 kernel > gives EOVERFLOW for files in the 2-4G range, and thus shouldn't > put negative numbers in those fields. That used not be the case I think: > http://lkml.indiana.edu/hypermail/linux/kernel/0004.1/0343.html > Also other 32 bit environments might overflow here. Thanks for investigating that. It is reassuring to know that interpreting it as unsigned used to be correct, and it's only in not adapting to the 32-bit kernel ABI change did we let the error sneak in.
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Wed, 09 May 2012 21:48:01 GMT) Full text and rfc822 format available.Message #46 received at 11424 <at> debbugs.gnu.org (full text, mbox):
From: Paul Eggert <eggert <at> cs.ucla.edu> To: Jim Meyering <jim <at> meyering.net> Cc: Samuel Thibault <samuel.thibault <at> gnu.org>, 11424 <at> debbugs.gnu.org Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Wed, 09 May 2012 14:45:19 -0700
On 05/09/2012 03:14 AM, Jim Meyering wrote: > I think Pádraig's question about dd's skip seeking to EOF on an > actual tape device is the most important to address. Yes indeed, I think that part of my change should be backed out at least for now (so close before a release). Part of the issue is that it's not clear what lseek (fd, N, SEEK_SET) does on tape devices where N exceeds the length of the file. My vague recollection is that such lseeks return L, where L is the file length, but I can't cite chapter and verse on that right now. This would suggest further fixes in this area -- but later. > Your stat.c change is actually a bug fix, so I'd prefer to > put it in a separate commit. I did that for you. Let me know > if the change below is ok and I'll push it -- or you're welcome > to make any change you'd like and push it yourself. Yes, please push that. I'll try to squeeze some time free to look at this in the next day or two.
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Thu, 10 May 2012 06:40:08 GMT) Full text and rfc822 format available.Message #49 received at 11424 <at> debbugs.gnu.org (full text, mbox):
From: Jim Meyering <jim <at> meyering.net> To: Paul Eggert <eggert <at> cs.ucla.edu> Cc: Samuel Thibault <samuel.thibault <at> gnu.org>, 11424 <at> debbugs.gnu.org Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Thu, 10 May 2012 08:36:38 +0200
Paul Eggert wrote: > On 05/09/2012 03:14 AM, Jim Meyering wrote: >> I think Pádraig's question about dd's skip seeking to EOF on an >> actual tape device is the most important to address. > > Yes indeed, I think that part of my change should be backed out > at least for now (so close before a release). > > Part of the issue is that it's not clear what lseek (fd, N, SEEK_SET) > does on tape devices where N exceeds the length of the file. My > vague recollection is that such lseeks return L, where L is the > file length, but I can't cite chapter and verse on that right now. > This would suggest further fixes in this area -- but later. > >> Your stat.c change is actually a bug fix, so I'd prefer to >> put it in a separate commit. I did that for you. Let me know >> if the change below is ok and I'll push it -- or you're welcome >> to make any change you'd like and push it yourself. > > Yes, please push that. I'll try to squeeze some time free > to look at this in the next day or two. I see you pushed it. Thanks for adding that line to NEWS: stat no longer reports a negative file size as a huge positive number. +[bug present since 'stat' was introduced in fileutils-4.1.9]
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Thu, 10 May 2012 06:59:01 GMT) Full text and rfc822 format available.Message #52 received at 11424 <at> debbugs.gnu.org (full text, mbox):
From: Paul Eggert <eggert <at> cs.ucla.edu> To: Jim Meyering <jim <at> meyering.net> Cc: Samuel Thibault <samuel.thibault <at> gnu.org>, 11424 <at> debbugs.gnu.org Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Wed, 09 May 2012 23:55:42 -0700
On 05/09/2012 11:36 PM, Jim Meyering wrote: > I see you pushed it. > Thanks for adding that line to NEWS: You're welcome. I shook free some time to adjust the st_size patch, and here's a new version that I think incorporates all the comments so far: From 0c9661af1b29e2999030cb93003d98673faabf83 Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert <at> cs.ucla.edu> Date: Wed, 9 May 2012 23:53:16 -0700 Subject: [PATCH] maint: handle file sizes more reliably Problem reported by Samuel Thibault in <http://debbugs.gnu.org/11424>. * NEWS: Document this. * src/dd.c (skip): Handle skipping past EOF on shared or typed memory objects the same way as with regular files. (dd_copy): It's OK to truncate shared memory objects. * src/du.c (duinfo_add): Check for overflow. (print_only_size): Report overflow. (process_file): Ignore negative file sizes in the --apparent-size case. * src/od.c (skip): Fix comment about st_size. * src/split.c (main): * src/truncate.c (do_ftruncate, main): On files where st_size is not portable, fall back on using lseek with SEEK_END to determine the size. Although strictly speaking POSIX says the behavior is implementation-defined, in practice if lseek returns a nonnegative value it's a reasonable one to use for the file size. * src/system.h (usable_st_size): Symlinks have reliable st_size too. * tests/misc/truncate-dir-fail: Don't assume that getting the size of a dir is not allowed, as it's now allowed on many platforms, e.g., GNU/Linux. --- NEWS | 3 ++ src/dd.c | 8 +++--- src/du.c | 12 ++++++--- src/od.c | 3 +- src/split.c | 14 +++++++--- src/system.h | 3 +- src/truncate.c | 56 ++++++++++++++++++++++++++++++----------- tests/misc/truncate-dir-fail | 3 -- 8 files changed, 69 insertions(+), 33 deletions(-) diff --git a/NEWS b/NEWS index 7ef2f54..e56f8fa 100644 --- a/NEWS +++ b/NEWS @@ -27,6 +27,9 @@ GNU coreutils NEWS -*- outline -*- ** New features + split and truncate now allow any seekable files in situations where + the file size is needed, instead of insisting on regular files. + fmt now accepts the --goal=WIDTH (-g) option. ** Changes in behavior diff --git a/src/dd.c b/src/dd.c index 4626de2..163d514 100644 --- a/src/dd.c +++ b/src/dd.c @@ -1544,7 +1544,7 @@ skip (int fdesc, char const *file, uintmax_t records, size_t blocksize, struct stat st; if (fstat (STDIN_FILENO, &st) != 0) error (EXIT_FAILURE, errno, _("cannot fstat %s"), quote (file)); - if (S_ISREG (st.st_mode) && st.st_size < (input_offset + offset)) + if (usable_st_size (&st) && st.st_size < input_offset + offset) { /* When skipping past EOF, return the number of _full_ blocks * that are not skipped, and set offset to EOF, so the caller @@ -2104,8 +2104,8 @@ dd_copy (void) } } - /* If the last write was converted to a seek, then for a regular file, - ftruncate to extend the size. */ + /* If the last write was converted to a seek, then for a regular file + or shared memory object, ftruncate to extend the size. */ if (final_op_was_seek) { struct stat stdout_stat; @@ -2114,7 +2114,7 @@ dd_copy (void) error (0, errno, _("cannot fstat %s"), quote (output_file)); return EXIT_FAILURE; } - if (S_ISREG (stdout_stat.st_mode)) + if (S_ISREG (stdout_stat.st_mode) || S_TYPEISSHM (&stdout_stat)) { off_t output_offset = lseek (STDOUT_FILENO, 0, SEEK_CUR); if (output_offset > stdout_stat.st_size) diff --git a/src/du.c b/src/du.c index 41c9535..7333941 100644 --- a/src/du.c +++ b/src/du.c @@ -99,7 +99,8 @@ duinfo_set (struct duinfo *a, uintmax_t size, struct timespec tmax) static inline void duinfo_add (struct duinfo *a, struct duinfo const *b) { - a->size += b->size; + uintmax_t sum = a->size + b->size; + a->size = a->size <= sum ? sum : UINTMAX_MAX; if (timespec_cmp (a->tmax, b->tmax) < 0) a->tmax = b->tmax; } @@ -370,8 +371,11 @@ static void print_only_size (uintmax_t n_bytes) { char buf[LONGEST_HUMAN_READABLE + 1]; - fputs (human_readable (n_bytes, buf, human_output_opts, - 1, output_block_size), stdout); + fputs ((n_bytes == UINTMAX_MAX + ? _("Infinity") + : human_readable (n_bytes, buf, human_output_opts, + 1, output_block_size)), + stdout); } /* Print size (and optionally time) indicated by *PDUI, followed by STRING. */ @@ -495,7 +499,7 @@ process_file (FTS *fts, FTSENT *ent) duinfo_set (&dui, (apparent_size - ? sb->st_size + ? MAX (0, sb->st_size) : (uintmax_t) ST_NBLOCKS (*sb) * ST_NBLOCKSIZE), (time_type == time_mtime ? get_stat_mtime (sb) : time_type == time_atime ? get_stat_atime (sb) diff --git a/src/od.c b/src/od.c index 7593796..a25f965 100644 --- a/src/od.c +++ b/src/od.c @@ -983,8 +983,7 @@ skip (uintmax_t n_skip) if (fstat (fileno (in_stream), &file_stats) == 0) { - /* The st_size field is valid only for regular files - (and for symbolic links, which cannot occur here). + /* The st_size field is valid for regular files. If the number of bytes left to skip is larger than the size of the current file, we can decrement n_skip and go on to the next file. Skip this optimization also diff --git a/src/split.c b/src/split.c index 062aede..53ee271 100644 --- a/src/split.c +++ b/src/split.c @@ -1069,7 +1069,7 @@ main (int argc, char **argv) static char const multipliers[] = "bEGKkMmPTYZ0"; int c; int digits_optind = 0; - off_t file_size; + off_t file_size IF_LINT (= 0); initialize_main (&argc, &argv); set_program_name (argv[0]); @@ -1340,12 +1340,18 @@ main (int argc, char **argv) if (in_blk_size == 0) in_blk_size = io_blksize (stat_buf); - /* stat.st_size is valid only for regular files. For others, use 0. */ - file_size = S_ISREG (stat_buf.st_mode) ? stat_buf.st_size : 0; - if (split_type == type_chunk_bytes || split_type == type_chunk_lines) { off_t input_offset = lseek (STDIN_FILENO, 0, SEEK_CUR); + if (usable_st_size (&stat_buf)) + file_size = stat_buf.st_size; + else if (0 <= input_offset) + { + file_size = lseek (STDIN_FILENO, 0, SEEK_END); + input_offset = (file_size < 0 + ? file_size + : lseek (STDIN_FILENO, input_offset, SEEK_SET)); + } if (input_offset < 0) error (EXIT_FAILURE, 0, _("%s: cannot determine file size"), quote (infile)); diff --git a/src/system.h b/src/system.h index e3d3156..06f09cb 100644 --- a/src/system.h +++ b/src/system.h @@ -605,7 +605,8 @@ bad_cast (char const *s) static inline bool usable_st_size (struct stat const *sb) { - return S_ISREG (sb->st_mode) || S_TYPEISSHM (sb) || S_TYPEISTMO (sb); + return (S_ISREG (sb->st_mode) || S_ISLNK (sb->st_mode) + || S_TYPEISSHM (sb) || S_TYPEISTMO (sb)); } void usage (int status) ATTRIBUTE_NORETURN; diff --git a/src/truncate.c b/src/truncate.c index 9b847d2..49d9732 100644 --- a/src/truncate.c +++ b/src/truncate.c @@ -157,23 +157,36 @@ do_ftruncate (int fd, char const *fname, off_t ssize, off_t rsize, } if (rel_mode) { - uintmax_t const fsize = rsize < 0 ? sb.st_size : rsize; + uintmax_t fsize; - if (rsize < 0) /* fstat used above to get size. */ + if (0 <= rsize) + fsize = rsize; + else { - if (!S_ISREG (sb.st_mode) && !S_TYPEISSHM (&sb)) + off_t file_size; + if (usable_st_size (&sb)) { - error (0, 0, _("cannot get the size of %s"), quote (fname)); - return false; + file_size = sb.st_size; + if (file_size < 0) + { + /* Sanity check. Overflow is the only reason I can think + this would ever go negative. */ + error (0, 0, _("%s has unusable, apparently negative size"), + quote (fname)); + return false; + } } - if (sb.st_size < 0) + else { - /* Sanity check. Overflow is the only reason I can think - this would ever go negative. */ - error (0, 0, _("%s has unusable, apparently negative size"), - quote (fname)); - return false; + file_size = lseek (fd, 0, SEEK_END); + if (file_size < 0) + { + error (0, errno, _("cannot get the size of %s"), + quote (fname)); + return false; + } } + fsize = file_size; } if (rel_mode == rm_min) @@ -348,15 +361,28 @@ main (int argc, char **argv) { /* FIXME: Maybe support getting size of block devices. */ struct stat sb; + off_t file_size = -1; if (stat (ref_file, &sb) != 0) error (EXIT_FAILURE, errno, _("cannot stat %s"), quote (ref_file)); - if (!S_ISREG (sb.st_mode) && !S_TYPEISSHM (&sb)) - error (EXIT_FAILURE, 0, _("cannot get the size of %s"), + if (usable_st_size (&sb)) + file_size = sb.st_size; + else + { + int ref_fd = open (ref_file, O_RDONLY); + if (0 <= ref_fd) + { + off_t file_end = lseek (ref_fd, 0, SEEK_END); + if (0 <= file_end && close (ref_fd) == 0) + file_size = file_end; + } + } + if (file_size < 0) + error (EXIT_FAILURE, errno, _("cannot get the size of %s"), quote (ref_file)); if (!got_size) - size = sb.st_size; + size = file_size; else - rsize = sb.st_size; + rsize = file_size; } oflags = O_WRONLY | (no_create ? 0 : O_CREAT) | O_NONBLOCK; diff --git a/tests/misc/truncate-dir-fail b/tests/misc/truncate-dir-fail index 1167352..54a3147 100755 --- a/tests/misc/truncate-dir-fail +++ b/tests/misc/truncate-dir-fail @@ -22,7 +22,4 @@ print_ver_ truncate # truncate on dir not allowed truncate -s+0 . && fail=1 -# getting the size of a dir is not allowed -truncate -r. file && fail=1 - Exit $fail -- 1.7.6.5
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Thu, 10 May 2012 07:50:01 GMT) Full text and rfc822 format available.Message #55 received at 11424 <at> debbugs.gnu.org (full text, mbox):
From: Jim Meyering <jim <at> meyering.net> To: Paul Eggert <eggert <at> cs.ucla.edu> Cc: Samuel Thibault <samuel.thibault <at> gnu.org>, 11424 <at> debbugs.gnu.org Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Thu, 10 May 2012 09:47:31 +0200
Paul Eggert wrote: > On 05/09/2012 11:36 PM, Jim Meyering wrote: >> I see you pushed it. >> Thanks for adding that line to NEWS: > > You're welcome. I shook free some time to adjust the st_size patch, > and here's a new version that I think incorporates all the comments > so far: > > From 0c9661af1b29e2999030cb93003d98673faabf83 Mon Sep 17 00:00:00 2001 > From: Paul Eggert <eggert <at> cs.ucla.edu> > Date: Wed, 9 May 2012 23:53:16 -0700 > Subject: [PATCH] maint: handle file sizes more reliably ... Thanks for sending that. Unfortunately, I am unable to apply it because something has mangled the patch. Note how blank lines of context have been omitted and lead to a space being inserted before the "+" or "-" on the following line: > diff --git a/NEWS b/NEWS > index 7ef2f54..e56f8fa 100644 > --- a/NEWS > +++ b/NEWS > @@ -27,6 +27,9 @@ GNU coreutils NEWS -*- outline -*- > > ** New features Omitted blank line of context. > + split and truncate now allow any seekable files in situations where > + the file size is needed, instead of insisting on regular files. > + > fmt now accepts the --goal=WIDTH (-g) option. > ** Changes in behavior > diff --git a/src/dd.c b/src/dd.c > index 4626de2..163d514 100644 > --- a/src/dd.c > +++ b/src/dd.c > @@ -1544,7 +1544,7 @@ skip (int fdesc, char const *file, uintmax_t records, size_t blocksize, > struct stat st; > if (fstat (STDIN_FILENO, &st) != 0) > error (EXIT_FAILURE, errno, _("cannot fstat %s"), quote (file)); > - if (S_ISREG (st.st_mode) && st.st_size < (input_offset + offset)) > + if (usable_st_size (&st) && st.st_size < input_offset + offset) > { > /* When skipping past EOF, return the number of _full_ blocks > * that are not skipped, and set offset to EOF, so the caller > @@ -2104,8 +2104,8 @@ dd_copy (void) > } > } There should be a blank line of context here. > - /* If the last write was converted to a seek, then for a regular file, > - ftruncate to extend the size. */ > + /* If the last write was converted to a seek, then for a regular file > + or shared memory object, ftruncate to extend the size. */ Normally I can apply mangled change sets using patch, but not this time. Every single hunk is rejected.
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Thu, 10 May 2012 08:13:02 GMT) Full text and rfc822 format available.Message #58 received at 11424 <at> debbugs.gnu.org (full text, mbox):
From: Jim Meyering <jim <at> meyering.net> To: Paul Eggert <eggert <at> cs.ucla.edu> Cc: Samuel Thibault <samuel.thibault <at> gnu.org>, 11424 <at> debbugs.gnu.org Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Thu, 10 May 2012 10:10:38 +0200
Jim Meyering wrote: > Paul Eggert wrote: >> On 05/09/2012 11:36 PM, Jim Meyering wrote: >>> I see you pushed it. >>> Thanks for adding that line to NEWS: >> >> You're welcome. I shook free some time to adjust the st_size patch, >> and here's a new version that I think incorporates all the comments >> so far: >> >> From 0c9661af1b29e2999030cb93003d98673faabf83 Mon Sep 17 00:00:00 2001 >> From: Paul Eggert <eggert <at> cs.ucla.edu> >> Date: Wed, 9 May 2012 23:53:16 -0700 >> Subject: [PATCH] maint: handle file sizes more reliably > ... > > Thanks for sending that. > Unfortunately, I am unable to apply it because > something has mangled the patch. No need to resend. I figured it out. Filtering your patch through sed 's/^ / /' was enough to undo the offending transformation. Then it applied with "git am FILE".
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Thu, 10 May 2012 08:51:01 GMT) Full text and rfc822 format available.Message #61 received at 11424 <at> debbugs.gnu.org (full text, mbox):
From: Pádraig Brady <P <at> draigBrady.com> To: Paul Eggert <eggert <at> cs.ucla.edu> Cc: 11424 <at> debbugs.gnu.org, Jim Meyering <jim <at> meyering.net>, Samuel Thibault <samuel.thibault <at> gnu.org> Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Thu, 10 May 2012 09:48:19 +0100
On 05/10/2012 07:55 AM, Paul Eggert wrote: > diff --git a/src/truncate.c b/src/truncate.c > @@ -348,15 +361,28 @@ main (int argc, char **argv) > { > /* FIXME: Maybe support getting size of block devices. */ Can the above be removed, as I think SEEK_END should handle devices. Otherwise looks good. thanks, Pádraig.
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Thu, 10 May 2012 08:58:02 GMT) Full text and rfc822 format available.Message #64 received at 11424 <at> debbugs.gnu.org (full text, mbox):
From: Jim Meyering <jim <at> meyering.net> To: Pádraig Brady <P <at> draigBrady.com> Cc: Paul Eggert <eggert <at> cs.ucla.edu>, Samuel Thibault <samuel.thibault <at> gnu.org>, 11424 <at> debbugs.gnu.org Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Thu, 10 May 2012 10:54:46 +0200
Pádraig Brady wrote: > On 05/10/2012 07:55 AM, Paul Eggert wrote: >> diff --git a/src/truncate.c b/src/truncate.c > >> @@ -348,15 +361,28 @@ main (int argc, char **argv) >> { >> /* FIXME: Maybe support getting size of block devices. */ > > Can the above be removed, as I think SEEK_END should handle devices. Thanks. I'll amend to remove that comment. > Otherwise looks good. I reached the same conclusion. The only other change is to the bug URL in the log message: s/debbugs/bugs/
bug-coreutils <at> gnu.org
:bug#11424
; Package coreutils
.
(Thu, 10 May 2012 15:38:01 GMT) Full text and rfc822 format available.Message #67 received at submit <at> debbugs.gnu.org (full text, mbox):
From: Paul Eggert <eggert <at> cs.ucla.edu> To: bug-coreutils <at> gnu.org Subject: Re: bug#11424: coreutils: split tests hang on /dev/zero on GNU/Hurd Date: Thu, 10 May 2012 08:34:43 -0700
On 05/10/2012 01:10 AM, Jim Meyering wrote: > Filtering your patch through sed 's/^ / /' > was enough to undo the offending transformation. Sorry about that. It's probably the recent upgrade to Thunderbird 12.0.1 on my home desktop. I'll have to look into why it's doing that, or just give up and use text attachments (urk -- why do they make email so hard?).
Debbugs Internal Request <help-debbugs <at> gnu.org>
to internal_control <at> debbugs.gnu.org
.
(Fri, 08 Jun 2012 11:24:02 GMT) Full text and rfc822 format available.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.