GNU bug report logs - #17145
head fails with implicit stdin on darwin


Package: coreutils;

Reported by: Denis Excoffier <gcc <at> Denis-Excoffier.org>

Date: Sun, 30 Mar 2014 21:44:02 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 17145 in the body.
You can then email your comments to 17145 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-coreutils <at> gnu.org:
bug#17145; Package coreutils. (Sun, 30 Mar 2014 21:44:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Denis Excoffier <gcc <at> Denis-Excoffier.org>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sun, 30 Mar 2014 21:44:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Denis Excoffier <gcc <at> Denis-Excoffier.org>
To: bug-coreutils <at> gnu.org
Subject: head fails with implicit stdin on darwin
Date: Sun, 30 Mar 2014 22:40:18 +0200
Hello,

head -n -1 -- -
or equivalently
head -n -1
returns immediately (i.e. does not wait for further stdin) and prints nothing.

I use coreutils 8.22 compiled (with gcc-4.8.2) on top of darwin 13.1.0 (Mavericks).

However the following seem to work perfectly:
head -n 1
head -c -1
cat | head -n -1
head -n -1 ---presume-input-pipe
on cygwin: head -n -1

What is weird on my system is lseek() at the beginning of elide_tail_lines_file():
lseek(fd, 0, SEEK_CUR) returns a (random?) number, something like 6735, 539 etc.
lseek(fd, 0, SEEK_END) returns 0

Hope this helps,

Regards,

Denis Excoffier.
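
A minimal sketch of how the lseek() values described in this report could be captured for any file descriptor. The names `struct offsets` and `probe_offsets` are hypothetical and are not taken from coreutils; the helper only records what the two calls return and does not read any input.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not taken from coreutils: record what lseek()
   reports for FD without reading any data from it.  */
struct offsets
{
  off_t cur;      /* result of lseek (fd, 0, SEEK_CUR), or -1 */
  int cur_errno;  /* errno from that call if it failed, else 0 */
  off_t end;      /* result of lseek (fd, 0, SEEK_END), or -1 */
  int end_errno;  /* errno from that call if it failed, else 0 */
};

static struct offsets
probe_offsets (int fd)
{
  struct offsets o;

  errno = 0;
  o.cur = lseek (fd, 0, SEEK_CUR);
  o.cur_errno = o.cur < 0 ? errno : 0;

  errno = 0;
  o.end = lseek (fd, 0, SEEK_END);
  o.end_errno = o.end < 0 ? errno : 0;

  /* Put the offset back where it was, if both calls succeeded.  */
  if (0 <= o.cur && 0 <= o.end)
    lseek (fd, o.cur, SEEK_SET);

  return o;
}
```

Calling `probe_offsets (STDIN_FILENO)` with stdin attached to a terminal and printing the four fields should show whether a given platform fails with ESPIPE or returns some other value; the result is platform-dependent, which is the point of the report.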



Information forwarded to bug-coreutils <at> gnu.org:
bug#17145; Package coreutils. (Mon, 31 Mar 2014 07:42:02 GMT) Full text and rfc822 format available.

Message #8 received at 17145 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Denis Excoffier <gcc <at> Denis-Excoffier.org>, 17145 <at> debbugs.gnu.org
Subject: Re: bug#17145: head fails with implicit stdin on darwin
Date: Mon, 31 Mar 2014 00:41:19 -0700
Denis Excoffier wrote:
> I use coreutils 8.22 compiled (with gcc-4.8.2) on top of darwin 13.1.0 (Mavericks).

Presumably it's a darwin-specific problem, since it doesn't occur on 
GNU/Linux or on Cygwin.  I'm afraid that means you'll have to investigate 
it more.  What happens when you use the darwin equivalent of 'strace', 
whatever that is these days?




Information forwarded to bug-coreutils <at> gnu.org:
bug#17145; Package coreutils. (Mon, 31 Mar 2014 12:33:02 GMT) Full text and rfc822 format available.

Message #11 received at 17145 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Denis Excoffier <gcc <at> Denis-Excoffier.org>
Cc: 17145 <at> debbugs.gnu.org
Subject: Re: bug#17145: head fails with implicit stdin on darwin
Date: Mon, 31 Mar 2014 13:32:50 +0100
On 03/30/2014 09:40 PM, Denis Excoffier wrote:
> Hello,
> 
> head -n -1 -- -
> or equivalently
> head -n -1
> returns immediately (i.e. does not wait for further stdin) and prints nothing.
> 
> I use coreutils 8.22 compiled (with gcc-4.8.2) on top of darwin 13.1.0 (Mavericks).
> 
> However the following seem to work perfectly:
> head -n 1
> head -c -1
> cat | head -n -1
> head -n -1 ---presume-input-pipe
> on cygwin: head -n -1
> 
> What is weird on my system is lseek() at the beginning of elide_tail_lines_file():
> lseek(fd, 0, SEEK_CUR) returns a (random?) number, something like 6735, 539 etc.
> lseek(fd, 0, SEEK_END) returns 0

So:
  head -n -1 # returns immediately
while:
  cat | head -n -1 # waits as expected

It seems we might be using non-portable code here. POSIX says:

  "The behavior of lseek() on devices which are incapable of seeking is implementation-defined.
  The value of the file offset associated with such a device is undefined."

and also:

  "The lseek() function shall fail [with ESPIPE if] the fildes argument is associated with a pipe, FIFO, or socket"

So tty devices fall outside the scope of this POSIX requirement.
Furthermore, the FreeBSD lseek man page states:

  "Some devices are incapable of seeking and POSIX does not specify which
   devices must support it.

   Linux specific restrictions: using lseek on a tty device returns
   ESPIPE. Other systems return the number of written characters, using
   SEEK_SET to set the counter. Some devices, e.g. /dev/null do not cause
   the error ESPIPE, but return a pointer which value is undefined."

Now head(1) isn't the only place we use this logic. In dd we have:

  offset = lseek (STDIN_FILENO, 0, SEEK_CUR);
  input_seekable = (0 <= offset);

I wonder whether we should be using something like:

  bool seekable (int fd)
  {
    return ! isatty (fd) && lseek (fd, 0, SEEK_CUR) >= 0;
  }

Though this only handles the tty case, and there
could be other devices for which this could be an issue.
So the general question is, is there a way we can robustly
determine if we have a seekable device or not?
Perhaps by using SEEK_SET in combination with SEEK_CUR,
but notice the BSD lseek man page above says that tty devices
support SEEK_SET also :/ Anyway...

Note the original head(1) code to detect seekable input was introduced with:
  http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commit;h=61ba51a6
and that was changed recently, due to a Coverity-identified logic issue, to:
  http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commit;h=5fdb5082
However, that now logically consistent code will return immediately in your case.

I also notice the related `head -c -1` check is more conservative
in that it only uses the more efficient lseek() code for regular files,
which would mean we don't operate as efficiently as we could on a disk device
for example. But that's much better than undefined operation of course.
If we were to do the same for lines then we would also introduce a change
in behavior with devices like /dev/zero. Currently on Linux, this will return immediately:
  head -n -1 /dev/zero
I.e., we currently treat such devices as empty and return immediately with success status,
whereas treating them as a stream of NULs would result in memory exhaustion while buffering
and waiting for a complete line. That is probably the more consistent operation at least.

So the attached uses this more conservative test for the --lines=-N case.

thanks,
Pádraig.
[head-tty-bsd.patch (text/x-patch, attachment)]
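
The alternatives weighed in this message can be put side by side in a short sketch. All three function names below are hypothetical labels for the options discussed; none of this is the attached patch, whose contents are not shown in this log.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* All three names are hypothetical and only label the alternatives
   discussed in the message above.  */

/* "lseek succeeds": the kind of test the message calls non-portable,
   since a device may return a nonnegative but meaningless offset.  */
static bool
lseek_succeeds (int fd)
{
  return 0 <= lseek (fd, 0, SEEK_CUR);
}

/* The tty-guarded variant floated above; it excludes terminals, but
   other devices may still report a meaningless offset.  */
static bool
seekable_unless_tty (int fd)
{
  return ! isatty (fd) && 0 <= lseek (fd, 0, SEEK_CUR);
}

/* The conservative variant: trust lseek only on regular files.  */
static bool
is_regular_file (int fd)
{
  struct stat st;
  return fstat (fd, &st) == 0 && S_ISREG (st.st_mode);
}
```

For a pipe all three return false and for a regular file all three return true. They can differ only on devices: `is_regular_file` rejects every device, at the cost of not using the faster lseek path on something like a disk device, which is the trade-off described above.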

Information forwarded to bug-coreutils <at> gnu.org:
bug#17145; Package coreutils. (Mon, 31 Mar 2014 17:59:03 GMT) Full text and rfc822 format available.

Message #14 received at 17145 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>, 
 Denis Excoffier <gcc <at> Denis-Excoffier.org>
Cc: 17145 <at> debbugs.gnu.org
Subject: Re: bug#17145: head fails with implicit stdin on darwin
Date: Mon, 31 Mar 2014 10:57:59 -0700
On 03/31/2014 05:32 AM, Pádraig Brady wrote:
> I also notice the related `head -c -1` check is more conservative
I just now noticed that head -c -1 is obviously buggy in some cases, 
and fixed that in commit d08381bc261d95502a205f7214686f80383c9692.




Information forwarded to bug-coreutils <at> gnu.org:
bug#17145; Package coreutils. (Mon, 31 Mar 2014 18:20:01 GMT) Full text and rfc822 format available.

Message #17 received at 17145 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 17145 <at> debbugs.gnu.org, Denis Excoffier <gcc <at> Denis-Excoffier.org>
Subject: Re: bug#17145: head fails with implicit stdin on darwin
Date: Mon, 31 Mar 2014 19:19:54 +0100
On 03/31/2014 06:57 PM, Paul Eggert wrote:
> On 03/31/2014 05:32 AM, Pádraig Brady wrote:
>> I also notice the related `head -c -1` check is more conservative
> I just now noticed that head -c -1 is obviously buggy in some cases, and fixed that in commit d08381bc261d95502a205f7214686f80383c9692.

http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=d08381bc

+1

thanks,
Pádraig.




Information forwarded to bug-coreutils <at> gnu.org:
bug#17145; Package coreutils. (Mon, 31 Mar 2014 19:44:02 GMT) Full text and rfc822 format available.

Message #20 received at 17145 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>, 
 Denis Excoffier <gcc <at> Denis-Excoffier.org>
Cc: 17145 <at> debbugs.gnu.org
Subject: Re: bug#17145: head fails with implicit stdin on darwin
Date: Mon, 31 Mar 2014 12:43:48 -0700
On 03/31/2014 05:32 AM, Pádraig Brady wrote:
> It seems we might be using non portable code here.

Yes.

> In dd we have:

I haven't looked at dd carefully, but I assume it's OK. It's a low-level 
program intended for people who want to exercise system calls pretty 
much directly.  If someone wants to lseek on a device, dd should 
probably do just that.

> is there a way we can robustly determine if we have a seekable device 
> or not?

Not as far as I know.  In fact, I expect much of the coreutils code uses 
the word "seekable" to mean "lseek returns a nonnegative value", which 
isn't the same thing that POSIX means. Quite possibly we should clean up 
coreutils' terminology here.

> head -n -1 /dev/zero I.E. we currently treat such devices as empty, 
> and return immediately with success status, whereas treating as a 
> stream of NULs, would result in memory exhaustion while buffering 
> waiting for a complete line. That is probably the more consistent 
> operation at least.

Yes, that sounds right.

> So the attached uses this more conservative test for the --lines=-N case.

I noticed other instances of the problem, and in general the whole lseek 
business needs a cleanup; the code's too complicated.  Attached is a 
proposed cleanup+fix patch.
[0001-head-port-to-Darwin-and-use-simpler-seeks.patch (text/x-patch, attachment)]

Added tag(s) patch. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Mon, 31 Mar 2014 19:45:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-coreutils <at> gnu.org:
bug#17145; Package coreutils. (Mon, 31 Mar 2014 23:15:03 GMT) Full text and rfc822 format available.

Message #25 received at 17145 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 17145 <at> debbugs.gnu.org, Denis Excoffier <gcc <at> Denis-Excoffier.org>
Subject: Re: bug#17145: head fails with implicit stdin on darwin
Date: Tue, 01 Apr 2014 00:14:04 +0100
Very nice cleanup. Comments below...

On 03/31/2014 08:43 PM, Paul Eggert wrote:
> diff --git a/NEWS b/NEWS

> +  head no longer assumes that lseek fails on unseekable devices.
> +  [bug introduced with the --bytes=-N feature in coreutils-5.0.1]

I slightly prefer my NEWS entry since it details the consequences
rather than the mechanism, and so could be more meaningful to end users.

> @@ -833,14 +802,24 @@ head (const char *filename, int fd, uintmax_t n_units, bool count_lines,
>  
>    if (elide_from_end)
>      {
> -      if (count_lines)
> +      off_t current_pos = -1, size = -1;
> +      if (! presume_input_pipe)
>          {
> -          return elide_tail_lines_file (filename, fd, n_units);
> +          struct stat st;
> +          if (fstat (fd, &st) != 0)
> +            error (0, errno, _("cannot fstat %s"), quotearg_colon (filename));
> +          if (S_ISREG (st.st_mode))

s/if/else if/

Could you also update Denis' current email address in THANKS.in?

Otherwise it all looks good.

thanks!
Pádraig.




Information forwarded to bug-coreutils <at> gnu.org:
bug#17145; Package coreutils. (Tue, 01 Apr 2014 07:24:01 GMT) Full text and rfc822 format available.

Message #28 received at 17145 <at> debbugs.gnu.org (full text, mbox):

From: Bernhard Voelker <mail <at> bernhard-voelker.de>
To: Paul Eggert <eggert <at> cs.ucla.edu>, Pádraig Brady
 <P <at> draigBrady.com>, Denis Excoffier <gcc <at> Denis-Excoffier.org>
Cc: 17145 <at> debbugs.gnu.org
Subject: Re: bug#17145: head fails with implicit stdin on darwin
Date: Tue, 01 Apr 2014 09:23:19 +0200
On 03/31/2014 09:43 PM, Paul Eggert wrote:

> Subject: [PATCH] head: port to Darwin and use simpler seeks
>
> This removes an unportable assumption that if lseek succeeds, the
> file is capable of seeking.  See Bug#17145.

Minor note: please
  s,Bug#17145,http://bugs.gnu.org/17145,

Thanks & have a nice day,
Berny




Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Wed, 02 Apr 2014 14:48:02 GMT) Full text and rfc822 format available.

Notification sent to Denis Excoffier <gcc <at> Denis-Excoffier.org>:
bug acknowledged by developer. (Wed, 02 Apr 2014 14:48:03 GMT) Full text and rfc822 format available.

Message #33 received at 17145-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 17145-done <at> debbugs.gnu.org, Denis Excoffier <gcc <at> Denis-Excoffier.org>
Subject: Re: bug#17145: head fails with implicit stdin on darwin
Date: Wed, 02 Apr 2014 07:47:11 -0700
Pádraig Brady wrote:
> s/if/else if/

Thanks.  Or better yet, return false if fstat fails.  I did that, plus 
the other changes you and Bernhard suggested, and installed the patch.
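
A rough sketch of the control flow agreed on in this exchange: stop when fstat fails, and take a position and size only from regular files. The function name `regular_file_extent` and its signature are hypothetical, and plain `fprintf` stands in for the coreutils `error` call quoted in the review; this is not the installed patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the start of head's elide-from-end path.
   On success, *CURRENT_POS and *SIZE are set only for regular files and
   stay -1 otherwise, which tells the caller to read sequentially.  */
static bool
regular_file_extent (const char *filename, int fd,
                     off_t *current_pos, off_t *size)
{
  struct stat st;

  *current_pos = -1;
  *size = -1;

  if (fstat (fd, &st) != 0)
    {
      /* Report the failure and stop, rather than going on to inspect
         a struct stat that was never filled in.  */
      fprintf (stderr, "cannot fstat %s: %s\n", filename, strerror (errno));
      return false;
    }

  if (S_ISREG (st.st_mode))
    {
      off_t pos = lseek (fd, 0, SEEK_CUR);
      if (pos < 0)
        {
          fprintf (stderr, "cannot lseek %s: %s\n", filename,
                   strerror (errno));
          return false;
        }
      *current_pos = pos;
      *size = st.st_size;
    }

  return true;
}
```

With this shape a terminal, pipe, or other device is never asked for an offset at all, so whatever value lseek would have returned for it no longer matters.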




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 01 May 2014 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 10 years and 20 days ago.


