GNU bug report logs -
#17145
head fails with implicit stdin on darwin
Report forwarded to bug-coreutils <at> gnu.org: bug#17145; Package coreutils. (Sun, 30 Mar 2014 21:44:02 GMT)
Acknowledgement sent to Denis Excoffier <gcc <at> Denis-Excoffier.org>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sun, 30 Mar 2014 21:44:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
Hello,
head -n -1 -- -
or equivalently
head -n -1
returns immediately (i.e. does not wait for further stdin) and prints nothing.
I use coreutils 8.22 compiled (with gcc-4.8.2) on top of darwin 13.1.0 (Mavericks).
However the following seem to work perfectly:
head -n 1
head -c -1
cat | head -n -1
head -n -1 ---presume-input-pipe
on cygwin: head -n -1
What is weird on my system is lseek() at the beginning of elide_tail_lines_file():
lseek(fd, 0, SEEK_CUR) returns a (random?) number, something like 6735, 539 etc.
lseek(fd, 0, SEEK_END) returns 0
Hope this helps,
Regards,
Denis Excoffier.
Information forwarded to bug-coreutils <at> gnu.org: bug#17145; Package coreutils. (Mon, 31 Mar 2014 07:42:02 GMT)
Message #8 received at 17145 <at> debbugs.gnu.org:
Denis Excoffier wrote:
> I use coreutils 8.22 compiled (with gcc-4.8.2) on top of darwin 13.1.0 (Mavericks).
Presumably it's a darwin-specific problem, since it doesn't occur on
GNU/Linux or on Cygwin. I'm afraid that means you'll have to investigate
it more. What happens when you use the darwin equivalent of 'strace',
whatever that is these days?
Information forwarded to bug-coreutils <at> gnu.org: bug#17145; Package coreutils. (Mon, 31 Mar 2014 12:33:02 GMT)
Message #11 received at 17145 <at> debbugs.gnu.org:
On 03/30/2014 09:40 PM, Denis Excoffier wrote:
> Hello,
>
> head -n -1 -- -
> or equivalently
> head -n -1
> returns immediately (ie does not wait for further stdin) and prints nothing.
>
> I use coreutils 8.22 compiled (with gcc-4.8.2) on top of darwin 13.1.0 (Mavericks).
>
> However the following seem to work perfectly:
> head -n 1
> head -c -1
> cat | head -n -1
> head -n -1 ---presume-input-pipe
> on cygwin: head -n -1
>
> What is weird on my system is lseek() at the beginning of elide_tail_lines_file():
> lseek(fd, 0, SEEK_CUR) returns a (random?) number, something like 6735, 539 etc.
> lseek(fd, 0, SEEK_END) returns 0
So:
head -n -1 # returns immediately
while:
cat | head -n -1 # waits as expected
It seems we might be using non portable code here. POSIX says:
"The behavior of lseek() on devices which are incapable of seeking is implementation-defined.
The value of the file offset associated with such a device is undefined."
and also:
"The lseek() function shall fail [with ESPIPE if] the fildes argument is associated with a pipe, FIFO, or socket"
So tty devices would fall outside of this POSIX scope.
Furthermore the FreeBSD lseek man page states:
"Some devices are incapable of seeking and POSIX does not specify which
devices must support it.
Linux specific restrictions: using lseek on a tty device returns
ESPIPE. Other systems return the number of written characters, using
SEEK_SET to set the counter. Some devices, e.g. /dev/null do not cause
the error ESPIPE, but return a pointer which value is undefined."
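The difference between these behaviors can be observed directly. The following is a minimal sketch in C, not code from coreutils; probe_seek and struct seek_probe are hypothetical names introduced here. It records whether a descriptor is a tty and what lseek (fd, 0, SEEK_CUR) reports. On a pipe the call should fail with ESPIPE as POSIX requires, whereas on a tty the result depends on the system.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Result of probing a descriptor.  Illustrative type, not from coreutils.  */
struct seek_probe
{
  bool is_tty;    /* what isatty (fd) reported */
  off_t cur;      /* lseek (fd, 0, SEEK_CUR), or -1 on failure */
  int cur_errno;  /* errno from that lseek call, 0 if it succeeded */
};

/* Report what lseek says about FD.  A zero-offset SEEK_CUR does not
   change the file offset, so the probe has no side effect on FD.  */
static struct seek_probe
probe_seek (int fd)
{
  struct seek_probe p;

  assert (0 <= fd);
  /* Call isatty first: it sets errno when FD is not a tty.  */
  p.is_tty = isatty (fd) != 0;
  errno = 0;
  p.cur = lseek (fd, 0, SEEK_CUR);
  p.cur_errno = p.cur < 0 ? errno : 0;
  return p;
}
```

Calling probe_seek (STDIN_FILENO) once with a terminal on stdin and once with `cat |` in front would presumably show the two cases from the report: a nonnegative but meaningless offset on the Darwin tty, and -1 with ESPIPE on the pipe.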
Now head(1) isn't the only place we use this logic. In dd we have:
offset = lseek (STDIN_FILENO, 0, SEEK_CUR);
input_seekable = (0 <= offset);
I wonder whether we should be using something like:
  bool seekable (int fd)
  {
    return ! isatty (fd) && lseek (fd, 0, SEEK_CUR) >= 0;
  }
Though this only handles the tty case, and there
could be other devices for which this could be an issue.
So the general question is, is there a way we can robustly
determine if we have a seekable device or not?
Perhaps by using SEEK_SET in combination with SEEK_CUR,
but notice the BSD lseek man page above says that tty devices
support SEEK_SET also :/ Anyway...
Note the original head(1) code to detect seekable input was introduced with:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commit;h=61ba51a6
and that was changed recently due to a coverity identified logic issue, to:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commit;h=5fdb5082
However, that now logically consistent code will return immediately in your case.
I also notice the related `head -c -1` check is more conservative
in that it only uses the more efficient lseek() code for regular files,
which would mean we don't operate as efficiently as we could on a disk device
for example. But that's much better than undefined operation of course.
If we were to do the same for lines then we would also introduce a change
in behavior with devices like /dev/zero. Currently on Linux, this will return immediately:
head -n -1 /dev/zero
I.e., we currently treat such devices as empty, and return immediately with success status,
whereas treating them as a stream of NULs would result in memory exhaustion while buffering
and waiting for a complete line. That is probably the more consistent operation at least.
So the attached uses this more conservative test for the --lines=-N case.
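A conservative test of this kind might look like the sketch below. It is an illustration only, not the attached patch: may_use_lseek is a hypothetical name, and the choice to report false when fstat fails is an assumption made here. The lseek shortcut is allowed only for regular files, so ttys, pipes and devices such as /dev/zero all take the slower streaming path.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Return true only if FD refers to a regular file whose current offset
   can be queried.  Hypothetical helper, not taken from coreutils.  */
static bool
may_use_lseek (int fd)
{
  struct stat st;

  assert (0 <= fd);
  /* Assumption: if the file type cannot be determined, be conservative.  */
  if (fstat (fd, &st) != 0)
    return false;
  /* Regular files have well-defined offsets; for other file types a
     successful lseek does not prove the offset is meaningful.  */
  return S_ISREG (st.st_mode) && 0 <= lseek (fd, 0, SEEK_CUR);
}
```

The cost of this design is the one noted above: a seekable disk device is treated like a stream and is therefore processed less efficiently than it could be.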
thanks,
Pádraig.
[head-tty-bsd.patch (text/x-patch, attachment)]
Information forwarded to bug-coreutils <at> gnu.org: bug#17145; Package coreutils. (Mon, 31 Mar 2014 17:59:03 GMT)
Message #14 received at 17145 <at> debbugs.gnu.org:
On 03/31/2014 05:32 AM, Pádraig Brady wrote:
> I also notice the related `head -c -1` check is more conservative
I just now noticed that head -c -1 is obviously buggy in some cases,
and fixed that in commit d08381bc261d95502a205f7214686f80383c9692.
Information forwarded to bug-coreutils <at> gnu.org: bug#17145; Package coreutils. (Mon, 31 Mar 2014 18:20:01 GMT)
Message #17 received at 17145 <at> debbugs.gnu.org:
On 03/31/2014 06:57 PM, Paul Eggert wrote:
> On 03/31/2014 05:32 AM, Pádraig Brady wrote:
>> I also notice the related `head -c -1` check is more conservative
> I just now noticed that head -c -1 is obviously buggy in some cases, and fixed that in commit d08381bc261d95502a205f7214686f80383c9692.
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=d08381bc
+1
thanks,
Pádraig.
Information forwarded to bug-coreutils <at> gnu.org: bug#17145; Package coreutils. (Mon, 31 Mar 2014 19:44:02 GMT)
Message #20 received at 17145 <at> debbugs.gnu.org:
On 03/31/2014 05:32 AM, Pádraig Brady wrote:
> It seems we might be using non portable code here.
Yes.
> In dd we have:
I haven't looked at dd carefully, but I assume it's OK. It's a low-level
program intended for people who want to exercise system calls pretty
much directly. If someone wants to lseek on a device, dd should
probably do just that.
> is there a way we can robustly determine if we have a seekable device
> or not?
Not as far as I know. In fact, I expect much of the coreutils code uses
the word "seekable" to mean "lseek returns a nonnegative value", which
isn't the same thing that POSIX means. Quite possibly we should clean up
coreutils' terminology here.
> head -n -1 /dev/zero I.E. we currently treat such devices as empty,
> and return immediately with success status, whereas treating as a
> stream of NULs, would result in memory exhaustion while buffering
> waiting for a complete line. That is probably the more consistent
> operation at least.
Yes, that sounds right.
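To make the buffering concern concrete, here is a rough sketch of the streaming approach in C. It is not the coreutils implementation; elide_tail_lines and struct saved_line are names made up for this illustration. It holds the most recent N lines in a ring and emits a line only once N newer lines have arrived. Because getline must see a newline (or end of input) before it returns, an endless input without newlines such as /dev/zero would make its buffer grow until memory runs out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* One buffered line.  Illustrative type, not from coreutils.  */
struct saved_line
{
  char *text;
  size_t len;
};

/* Copy IN to OUT, omitting the last N lines.
   Return 0 on success, -1 on allocation or I/O failure.  */
static int
elide_tail_lines (FILE *in, FILE *out, size_t n)
{
  assert (in != NULL && out != NULL);
  struct saved_line *ring = calloc (n ? n : 1, sizeof *ring);
  if (ring == NULL)
    return -1;

  int status = 0;
  size_t head = 0;   /* next slot to fill; also the oldest slot when full */
  size_t count = 0;  /* number of lines currently held */
  char *line = NULL;
  size_t cap = 0;
  ssize_t len;

  while (status == 0 && (len = getline (&line, &cap, in)) != -1)
    {
      if (n == 0)
        {
          /* Nothing to elide: pass the line straight through.  */
          if (fwrite (line, 1, (size_t) len, out) != (size_t) len)
            status = -1;
          continue;
        }
      if (count == n)
        {
          /* The oldest held line can no longer be among the last N.  */
          struct saved_line *old = &ring[head];
          if (fwrite (old->text, 1, old->len, out) != old->len)
            status = -1;
          free (old->text);
          count--;
        }
      /* The ring takes ownership; getline allocates afresh next time.  */
      ring[head].text = line;
      ring[head].len = (size_t) len;
      head = (head + 1) % n;
      count++;
      line = NULL;
      cap = 0;
    }

  if (ferror (in))
    status = -1;
  free (line);
  for (size_t i = 0; i < count; i++)
    free (ring[i].text);
  free (ring);
  return status;
}
```

With regular-file input the lseek-based path avoids this buffering entirely, which is why the choice of when to trust lseek matters.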
> So the attached uses this more conservative test for the --lines=-N case.
I noticed other instances of the problem, and in general the whole lseek
business needs a cleanup; the code's too complicated. Attached is a
proposed cleanup+fix patch.
[0001-head-port-to-Darwin-and-use-simpler-seeks.patch (text/x-patch, attachment)]
Added tag(s) patch. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Mon, 31 Mar 2014 19:45:02 GMT)
Information forwarded to bug-coreutils <at> gnu.org: bug#17145; Package coreutils. (Mon, 31 Mar 2014 23:15:03 GMT)
Message #25 received at 17145 <at> debbugs.gnu.org:
Very nice cleanup. Comments below...
On 03/31/2014 08:43 PM, Paul Eggert wrote:
> diff --git a/NEWS b/NEWS
> + head no longer assumes that lseek fails on unseekable devices.
> + [bug introduced with the --bytes=-N feature in coreutils-5.0.1]
I slightly prefer my NEWS entry since it details the consequences
rather than the mechanism, and so could be more meaningful to end users.
> @@ -833,14 +802,24 @@ head (const char *filename, int fd, uintmax_t n_units, bool count_lines,
>
> if (elide_from_end)
> {
> - if (count_lines)
> + off_t current_pos = -1, size = -1;
> + if (! presume_input_pipe)
> {
> - return elide_tail_lines_file (filename, fd, n_units);
> + struct stat st;
> + if (fstat (fd, &st) != 0)
> + error (0, errno, _("cannot fstat %s"), quotearg_colon (filename));
> + if (S_ISREG (st.st_mode))
s/if/else if/
Could you also update Denis' current email address in THANKS.in?
Otherwise it all looks good.
thanks!
Pádraig.
Information forwarded to bug-coreutils <at> gnu.org: bug#17145; Package coreutils. (Tue, 01 Apr 2014 07:24:01 GMT)
Message #28 received at 17145 <at> debbugs.gnu.org:
On 03/31/2014 09:43 PM, Paul Eggert wrote:
> Subject: [PATCH] head: port to Darwin and use simpler seeks
>
> This removes an unportable assumption that if lseek succeeds, the
> file is capable of seeking. See Bug#17145.
Minor note: please
s,Bug#17145,http://bugs.gnu.org/17145,
Thanks & have a nice day,
Berny
Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>: You have taken responsibility. (Wed, 02 Apr 2014 14:48:02 GMT)
Notification sent to Denis Excoffier <gcc <at> Denis-Excoffier.org>: bug acknowledged by developer. (Wed, 02 Apr 2014 14:48:03 GMT)
Message #33 received at 17145-done <at> debbugs.gnu.org:
Pádraig Brady wrote:
> s/if/else if/
Thanks. Or better yet, return false if fstat fails. I did that, plus
the other changes you and Bernhard suggested, and installed the patch.
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 01 May 2014 11:24:04 GMT)
This bug report was last modified 10 years and 20 days ago.