GNU bug report logs - #12216
peek-char incorrectly *CONSUMES* eof

Previous Next

Package: guile;

Reported by: dwheeler <at> dwheeler.com

Date: Fri, 17 Aug 2012 02:03:01 UTC

Severity: normal

Done: Mark H Weaver <mhw <at> netris.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 12216 in the body.
You can then email your comments to 12216 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Fri, 17 Aug 2012 02:03:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to dwheeler <at> dwheeler.com:
New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Fri, 17 Aug 2012 02:03:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: "David A. Wheeler" <dwheeler <at> dwheeler.com>
To: bug-guile <at> gnu.org
Cc: almkglor <at> gmail.com
Subject: peek-char incorrectly *CONSUMES* eof
Date: Thu, 16 Aug 2012 21:53:23 -0400 (EDT)
All:

Guile's peek-char has a bug; it incorrectly *consumes* eof instead of just reporting it.

According to R5RS, "The value returned by a call to peek-char is the same as the value that would have been returned by a call to read-char with the same port. The only difference is that the very next call to read-char or peek-char on that port will return the value returned by the preceding call to peek-char."

However, if the value returned is #eof, guile does *not* meet the spec; you *can* get two successive peek-chars with different results.

This doesn't matter for files, but it *does* matter for interactive use.  It means that successive peek-char calls will actually *READ* characters, and they can even different in results.  Which means that code that does several peek-chars can act oddly when someone tries to end it with control-D.

We've confirmed this bug is true for guile 2.0, 1.8, and 1.6.

You can confirm this by placing this in "bug-demo":
===================
(write (peek-char))
(write (peek-char))
======================

Then run "guile bug-demo".  Press control-D, then newline.  You can see that two successive calls to peek-char are reporting different results (eof, then newline), which is NEVER supposed to happen!

I just had to write some code to work around this, but it'd be nice for this to work "correctly" in the future for interactive use.  (This was code for the "readable" project, http://readable.sourceforge.net, where we do a lot with guile.)

Thanks for your time... and thanks for guile!

--- David A. Wheeler




Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Tue, 05 Mar 2013 18:55:04 GMT) Full text and rfc822 format available.

Message #8 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: Andy Wingo <wingo <at> pobox.com>
To: dwheeler <at> dwheeler.com
Cc: almkglor <at> gmail.com, 12216 <at> debbugs.gnu.org
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Tue, 05 Mar 2013 17:53:45 +0100
Hi,

On Fri 17 Aug 2012 03:53, "David A. Wheeler" <dwheeler <at> dwheeler.com> writes:

> Guile's peek-char has a bug; it incorrectly *consumes* eof instead of
> just reporting it.

I'm not sure there is a sane way to deal with this.  Guile's ports are
fundamentally binary, underneath.  The backing store is a byte buffer
that is decoded and encoded to Scheme characters at need, along with
vtable to fill and flush that buffer.  You can do strange things like
unread characters, switch the encoding, and read them back out in a
different encoding.  There is no way to represent EOF in that buffer.

I have the feeling that for interactive use, if you expect to read a EOF
from a port and then continue, you have to not use peek-char.  You need
to handle your own lookahead buffer.  I know it's not a great answer,
but I can't think of anything else that makes sense.

What do you think?

Andy
-- 
http://wingolog.org/




Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Tue, 05 Mar 2013 19:18:02 GMT) Full text and rfc822 format available.

Message #11 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: "David A. Wheeler" <dwheeler <at> dwheeler.com>
To: "wingo" <wingo <at> pobox.com>
Cc: almkglor <almkglor <at> gmail.com>, 12216 <12216 <at> debbugs.gnu.org>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Tue, 05 Mar 2013 14:17:31 -0500 (EST)
I reported:
> > Guile's peek-char has a bug; it incorrectly *consumes* eof instead of
> > just reporting it.

Andy Wingo replied:
> I have the feeling that for interactive use, if you expect to read a EOF
> from a port and then continue, you have to not use peek-char.  You need
> to handle your own lookahead buffer.  I know it's not a great answer,
> but I can't think of anything else that makes sense.

We don't want to read an EOF and then continue.
We peek-char to not *CONSUME* an interactive EOF.
Once peek-char returns EOF, every following peek-char
should *also* return EOF until the EOF is read.
That isn't what currently happens.

E.G.:
(define (demo) (write (peek-char)) (write (peek-char)))

(demo)

; Now type control-D ENTER.
; It SHOULD produce #<eof> twice, after peeking EOF,
; but instead it produces #<eof> #\newline
#<eof>#<eof>guile> (demo)

--- David A. Wheeler




Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Thu, 07 Mar 2013 21:34:02 GMT) Full text and rfc822 format available.

Message #14 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: Andy Wingo <wingo <at> pobox.com>
To: dwheeler <at> dwheeler.com
Cc: almkglor <almkglor <at> gmail.com>, 12216 <12216 <at> debbugs.gnu.org>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Thu, 07 Mar 2013 22:32:38 +0100
On Tue 05 Mar 2013 20:17, "David A. Wheeler" <dwheeler <at> dwheeler.com> writes:

> I reported:
>> > Guile's peek-char has a bug; it incorrectly *consumes* eof instead of
>> > just reporting it.
>
> Andy Wingo replied:
>> I have the feeling that for interactive use, if you expect to read a EOF
>> from a port and then continue, you have to not use peek-char.  You need
>> to handle your own lookahead buffer.  I know it's not a great answer,
>> but I can't think of anything else that makes sense.
>
> We don't want to read an EOF and then continue.
> We peek-char to not *CONSUME* an interactive EOF.

I understand what it is you want.  But I don't know of any sane way to
implement it.

Given that it is an edge case -- peek-char on an interactive port -- I
think the thing to do is to document this inconsistency, and for you (in
your code) to implement some sort of abstraction that does not use
peek-char.

That is to say, use read-char, and do unread-char as necessary.  Or keep
your own buffer around.

I know it isn't nice, but I don't know how to do anything better.

Andy
-- 
http://wingolog.org/




Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Sat, 09 Mar 2013 02:27:01 GMT) Full text and rfc822 format available.

Message #17 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Hartwig <mandyke <at> gmail.com>
To: Andy Wingo <wingo <at> pobox.com>
Cc: almkglor <almkglor <at> gmail.com>, dwheeler <at> dwheeler.com,
	12216 <12216 <at> debbugs.gnu.org>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Sat, 9 Mar 2013 10:25:30 +0800
On 8 March 2013 05:32, Andy Wingo <wingo <at> pobox.com> wrote:
> On Tue 05 Mar 2013 20:17, "David A. Wheeler" <dwheeler <at> dwheeler.com> writes:
>
>> I reported:
>>> > Guile's peek-char has a bug; it incorrectly *consumes* eof instead of
>>> > just reporting it.
>>
>> Andy Wingo replied:
>>> I have the feeling that for interactive use, if you expect to read a EOF
>>> from a port and then continue, you have to not use peek-char.  You need
>>> to handle your own lookahead buffer.  I know it's not a great answer,
>>> but I can't think of anything else that makes sense.
>>
>> We don't want to read an EOF and then continue.
>> We peek-char to not *CONSUME* an interactive EOF.
>
> I understand what it is you want.  But I don't know of any sane way to
> implement it.

Indeed.  There is a distinction between the end-of-file _object_
(Scheme) and end-of-transmission _character_ (ASCII ^D).  The two are
not equivalent, and when read-char and peek-char return the former
value it is only to signal a _current_ lack of characters and should
not be considered part of the character stream read from the port.

This is the same reason why, e.g. the end-of-file object can not be
passed to unread-char: it is not a character.

IIRC some other Schemes do handle the test case from OP, but I do not
agree that those semantics are sane.




Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Sat, 09 Mar 2013 17:13:02 GMT) Full text and rfc822 format available.

Message #20 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: "David A. Wheeler" <dwheeler <at> dwheeler.com>
To: "mandyke" <mandyke <at> gmail.com>
Cc: wingo <wingo <at> pobox.com>, almkglor <almkglor <at> gmail.com>,
	12216 <12216 <at> debbugs.gnu.org>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Sat, 09 Mar 2013 12:11:12 -0500 (EST)
Daniel Hartwig:
> Indeed.  There is a distinction between the end-of-file _object_
> (Scheme) and end-of-transmission _character_ (ASCII ^D).  The two are
> not equivalent, ...

Sure.

> and when read-char and peek-char return the former
> value it is only to signal a _current_ lack of characters and should
> not be considered part of the character stream read from the port.

If I understand you correctly, that sounds more like "char-ready?".

In any case, I believe guile fails to meet the semantics
of R5RS, and I believe other versions of the Scheme spec,
in this area.  R5RS says:

> (peek-char) procedure
> (peek-char port) procedure
> Returns the next character available from the input port,
> without updating the port to point to the following character.
> If no more characters are available, an end of le
> object is returned. Port may be omitted, in which case it
> defaults to the value returned by current-input-port.
> Note: The value returned by a call to peek-char is the same as
> the value that would have been returned by a call to read-char
> with the same port. The only difference is that the very next call
> to read-char or peek-char on that port will return the value
> returned by the preceding call to peek-char. In particular, a
> call to peek-char on an interactive port will hang waiting for
> input whenever a call to read-char would have hung.

Note that the spec says that the very next call
"to read-char or peek-char on that port will return the value
returned by the preceding call to peek-char."  It does not say
the same *CHARACTER*, it says the same *VALUE*, and the
eof object is a possible value from peek-char and read-char.

I'd be okay if, after a read-char of EOF, it could return something else
as some sort of extension.  But (peek-char) (peek-char) (peek-char)
should return the same values 3 times, no matter what... even
if it's EOF.

--- David A. Wheeler




Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Sun, 10 Mar 2013 00:25:02 GMT) Full text and rfc822 format available.

Message #23 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Hartwig <mandyke <at> gmail.com>
To: dwheeler <at> dwheeler.com
Cc: wingo <wingo <at> pobox.com>, almkglor <almkglor <at> gmail.com>,
	12216 <12216 <at> debbugs.gnu.org>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Sun, 10 Mar 2013 08:23:41 +0800
On 10 March 2013 01:11, David A. Wheeler <dwheeler <at> dwheeler.com> wrote:
> Daniel Hartwig:
>> and when read-char and peek-char return the former
>> value it is only to signal a _current_ lack of characters and should
>> not be considered part of the character stream read from the port.
>
> If I understand you correctly, that sounds more like "char-ready?".

Erm yes, my mistake.




Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Wed, 13 Mar 2013 11:04:02 GMT) Full text and rfc822 format available.

Message #26 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: Andy Wingo <wingo <at> pobox.com>
To: dwheeler <at> dwheeler.com
Cc: 12216 <at> debbugs.gnu.org, almkglor <almkglor <at> gmail.com>,
	mandyke <mandyke <at> gmail.com>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Wed, 13 Mar 2013 12:02:47 +0100
On Sat 09 Mar 2013 18:11, "David A. Wheeler" <dwheeler <at> dwheeler.com> writes:

> (peek-char) (peek-char) (peek-char) should return the same values 3
> times, no matter what... even if it's EOF.

So, we are repeating ourselves here :)  I agree with you but I can't see
a good way of implementing this.  For that reason I am inclined to
document this inconsistency and close as a wontfix.

Andy
-- 
http://wingolog.org/




Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Wed, 13 Mar 2013 13:11:01 GMT) Full text and rfc822 format available.

Message #29 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: "David A. Wheeler" <dwheeler <at> dwheeler.com>
To: "wingo" <wingo <at> pobox.com>
Cc: 12216 <12216 <at> debbugs.gnu.org>, almkglor <almkglor <at> gmail.com>,
	mandyke <mandyke <at> gmail.com>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Wed, 13 Mar 2013 09:09:23 -0400 (EDT)
Andy Wingo:

> So, we are repeating ourselves here :)  I agree with you but I can't see
> a good way of implementing this.

Would the per-port reader options be reasonable place to store the info
about EOF?

--- David A. Wheeler




Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Wed, 13 Mar 2013 14:35:02 GMT) Full text and rfc822 format available.

Message #32 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: Andy Wingo <wingo <at> pobox.com>
To: dwheeler <at> dwheeler.com
Cc: 12216 <12216 <at> debbugs.gnu.org>, almkglor <almkglor <at> gmail.com>,
	mandyke <mandyke <at> gmail.com>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Wed, 13 Mar 2013 15:33:33 +0100
On Wed 13 Mar 2013 14:09, "David A. Wheeler" <dwheeler <at> dwheeler.com> writes:

> Andy Wingo:
>
>> So, we are repeating ourselves here :)  I agree with you but I can't see
>> a good way of implementing this.
>
> Would the per-port reader options be reasonable place to store the info
> about EOF?

For your own purposes that would be fine.  But it cannot affect
read-char / peek-char / etc for everyone, because it would have bad
global effects on performance and correctness.  That's why I'm pushing
back on fixing this in Guile itself.

Andy
-- 
http://wingolog.org/




Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Wed, 13 Mar 2013 18:12:01 GMT) Full text and rfc822 format available.

Message #35 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: Andy Wingo <wingo <at> pobox.com>
Cc: almkglor <almkglor <at> gmail.com>, dwheeler <at> dwheeler.com,
	12216 <12216 <at> debbugs.gnu.org>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Wed, 13 Mar 2013 14:10:15 -0400
Andy Wingo <wingo <at> pobox.com> writes:

> On Wed 13 Mar 2013 14:09, "David A. Wheeler" <dwheeler <at> dwheeler.com> writes:
>
>> Andy Wingo:
>>
>>> So, we are repeating ourselves here :)  I agree with you but I can't see
>>> a good way of implementing this.
>>
>> Would the per-port reader options be reasonable place to store the info
>> about EOF?
>
> For your own purposes that would be fine.  But it cannot affect
> read-char / peek-char / etc for everyone, because it would have bad
> global effects on performance and correctness.  That's why I'm pushing
> back on fixing this in Guile itself.

I don't know, it might not be that bad, now that we've agreed on a way
to extend the port structure in 2.0.  Maybe we could just have a "last
peek-char returned EOF" flag that would be consulted by the other read
primitives.

I agree that we should not allow EOF to be unread.

What do you think?

      Mark




Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Wed, 13 Mar 2013 18:24:01 GMT) Full text and rfc822 format available.

Message #38 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: Andy Wingo <wingo <at> pobox.com>
To: Mark H Weaver <mhw <at> netris.org>
Cc: almkglor <almkglor <at> gmail.com>, dwheeler <at> dwheeler.com,
	12216 <12216 <at> debbugs.gnu.org>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Wed, 13 Mar 2013 19:22:41 +0100
On Wed 13 Mar 2013 19:10, Mark H Weaver <mhw <at> netris.org> writes:

> Andy Wingo <wingo <at> pobox.com> writes:
>
>> On Wed 13 Mar 2013 14:09, "David A. Wheeler" <dwheeler <at> dwheeler.com> writes:
>>
>>> Andy Wingo:
>>>
>>>> So, we are repeating ourselves here :)  I agree with you but I can't see
>>>> a good way of implementing this.
>>>
>>> Would the per-port reader options be reasonable place to store the info
>>> about EOF?
>>
>> For your own purposes that would be fine.  But it cannot affect
>> read-char / peek-char / etc for everyone, because it would have bad
>> global effects on performance and correctness.  That's why I'm pushing
>> back on fixing this in Guile itself.
>
> I don't know, it might not be that bad, now that we've agreed on a way
> to extend the port structure in 2.0.  Maybe we could just have a "last
> peek-char returned EOF" flag that would be consulted by the other read
> primitives.
>
> I agree that we should not allow EOF to be unread.
>
> What do you think?

I really doubt our ability to get it right.  Consider that we have code
that accesses the buffer directly, binary and textual ports, etc
etc... I don't think we're going to get this right.  Fixing this would
also have complexity and performance costs as well.

Maybe if it is somehow confined to scm_peek_char and scm_fill_input it
could be doable.

Andy
-- 
http://wingolog.org/




Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Thu, 14 Mar 2013 17:16:01 GMT) Full text and rfc822 format available.

Message #41 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: Andy Wingo <wingo <at> pobox.com>
Cc: almkglor <almkglor <at> gmail.com>, dwheeler <at> dwheeler.com,
	12216 <12216 <at> debbugs.gnu.org>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Thu, 14 Mar 2013 13:13:39 -0400
Andy Wingo <wingo <at> pobox.com> writes:

> On Wed 13 Mar 2013 19:10, Mark H Weaver <mhw <at> netris.org> writes:
>
>> I don't know, it might not be that bad, now that we've agreed on a way
>> to extend the port structure in 2.0.  Maybe we could just have a "last
>> peek-char returned EOF" flag that would be consulted by the other read
>> primitives.
>>
>> I agree that we should not allow EOF to be unread.
>>
>> What do you think?
>
> I really doubt our ability to get it right.  Consider that we have code
> that accesses the buffer directly, binary and textual ports, etc
> etc... I don't think we're going to get this right.

I think you're being overly pessimistic, but for the sake of argument,
suppose you're right.  Then we do the best we can and fix any remaining
problems as they're discovered later.  It's better to aim for
correctness and fail in our first attempt than to give up without
trying.

IMO, this is a bug, pure and simple.  I think we can do better than
"it's too hard, let's just punt on this".  We are implementors of
Scheme, which has lots of details that are hard to get right, yet we
rise to the challenge, do we not?

> Fixing this would also have complexity and performance costs as well.

One extra check per read primitive is hardly enough of a cost to
consider abandoning correctness, IMO.

     Mark




Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Thu, 14 Mar 2013 17:45:02 GMT) Full text and rfc822 format available.

Message #44 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: "David A. Wheeler" <dwheeler <at> dwheeler.com>
To: "mhw" <mhw <at> netris.org>
Cc: wingo <wingo <at> pobox.com>, almkglor <almkglor <at> gmail.com>,
	12216 <12216 <at> debbugs.gnu.org>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Thu, 14 Mar 2013 13:43:01 -0400 (EDT)
On Thu, 14 Mar 2013 13:13:39 -0400, Mark H Weaver <mhw <at> netris.org> wrote:
> One extra check per read primitive is hardly enough of a cost to
> consider abandoning correctness, IMO.

Though I've not reviewed this code, I *think* you'd only need to check when the buffer is empty.  When the buffer is empty, it'll take a while for the OS to fill it anyway, far more than checking a boolean value.

--- David A. Wheeler




Reply sent to Mark H Weaver <mhw <at> netris.org>:
You have taken responsibility. (Sun, 17 Mar 2013 23:50:02 GMT) Full text and rfc822 format available.

Notification sent to dwheeler <at> dwheeler.com:
bug acknowledged by developer. (Sun, 17 Mar 2013 23:50:02 GMT) Full text and rfc822 format available.

Message #49 received at 12216-done <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: Andy Wingo <wingo <at> pobox.com>
Cc: 12216-done <at> debbugs.gnu.org, Daniel Llorens <daniel.llorens <at> bluewin.ch>
Subject: Re: Problems with Guile's number printer
Date: Sun, 17 Mar 2013 19:47:35 -0400
I've pushed 1ea37620c2c1794f7685b312d2530676a078ada7 to stable-2.0,
which fixes our number printer.  Closing this bug.

   Thanks,
     Mark




Did not alter fixed versions and reopened. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 18 Mar 2013 01:17:01 GMT) Full text and rfc822 format available.

Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Mon, 18 Mar 2013 01:31:01 GMT) Full text and rfc822 format available.

Message #54 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: Andy Wingo <wingo <at> pobox.com>
Cc: Daniel Llorens <daniel.llorens <at> bluewin.ch>, 12216 <at> debbugs.gnu.org
Subject: Re: Problems with Guile's number printer
Date: Sun, 17 Mar 2013 21:28:15 -0400
I wrote:
> I've pushed 1ea37620c2c1794f7685b312d2530676a078ada7 to stable-2.0,
> which fixes our number printer.  Closing this bug.

Sorry, I sent this to the wrong bug.  Oops :)

    Mark




Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Sat, 30 Mar 2013 22:41:07 GMT) Full text and rfc822 format available.

Message #57 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: Andy Wingo <wingo <at> pobox.com>
Cc: almkglor <almkglor <at> gmail.com>, dwheeler <at> dwheeler.com,
	12216 <12216 <at> debbugs.gnu.org>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Sat, 30 Mar 2013 18:37:06 -0400
Andy Wingo <wingo <at> pobox.com> writes:
> Maybe if it is somehow confined to scm_peek_char and scm_fill_input it
> could be doable.

I think I now see a reasonable way to fix this.

First of all, all interfaces that do something like 'peek' would, when
returning an EOF, set a 'pending_eof' flag in the port structure.  Then
'scm_fill_input' would start by checking this flag; if it is set, then
it would clear the flag and return EOF.

We would also need to clear the 'pending_eof' flag in a few other
places, most notably in 'scm_end_input'.  In theory, that might be
enough, but to be on the safe side we should also clear the flag in any
procedure that writes or seeks.

Note that there is one inlined 'peek' function: 'scm_peek_byte_or_eof'
defined in inline.h.  We can change it to set the 'pending_eof' flag,
but we must consider that some code will be compiled against the old
version.  This is not a serious problem.  It just means that if someone
calls the old 'scm_peek_byte_or_eof' and gets EOF, the EOF will be
swallowed and not properly repeated by the next 'read'.  Oh well.

I plan to cook up a patch along these lines soon, as part of my ports
work for 2.0.8.

   Thoughts?
     Mark




Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Sat, 30 Mar 2013 23:06:01 GMT) Full text and rfc822 format available.

Message #60 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: "David A. Wheeler" <dwheeler <at> dwheeler.com>
To: "mhw" <mhw <at> netris.org>
Cc: wingo <wingo <at> pobox.com>, almkglor <almkglor <at> gmail.com>,
	12216 <12216 <at> debbugs.gnu.org>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Sat, 30 Mar 2013 19:02:31 -0400 (EDT)
Mark H Weaver:
> I think I now see a reasonable way to fix this.
> First of all, all interfaces that do something like 'peek' would, when
> returning an EOF, set a 'pending_eof' flag in the port structure...

That's awesome, and it sounds like it'd work well.

Thanks for looking into this.

--- David A. Wheeler




Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Mon, 01 Apr 2013 02:10:01 GMT) Full text and rfc822 format available.

Message #63 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: Andy Wingo <wingo <at> pobox.com>
Cc: almkglor <almkglor <at> gmail.com>, dwheeler <at> dwheeler.com,
	12216 <12216 <at> debbugs.gnu.org>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Sun, 31 Mar 2013 22:05:45 -0400
[Message part 1 (text/plain, inline)]
Mark H Weaver <mhw <at> netris.org> writes:

> Andy Wingo <wingo <at> pobox.com> writes:
>> Maybe if it is somehow confined to scm_peek_char and scm_fill_input it
>> could be doable.
>
> I think I now see a reasonable way to fix this.
>
> First of all, all interfaces that do something like 'peek' would, when
> returning an EOF, set a 'pending_eof' flag in the port structure.  Then
> 'scm_fill_input' would start by checking this flag; if it is set, then
> it would clear the flag and return EOF.
>
> We would also need to clear the 'pending_eof' flag in a few other
> places, most notably in 'scm_end_input'.  In theory, that might be
> enough, but to be on the safe side we should also clear the flag in any
> procedure that writes or seeks.

Upon further thought, I decided that we shouldn't clear it on writes.
If 'rw_random' is set, then 'scm_end_input' will be called anyway.  If
'rw_random' is not set, then we should *not* clear the flag on writes,
because it indicates that the read and write streams are independent,
e.g. a terminal or socket.

On the other hand, the 'pending_eof' flag should of course be cleared
when putting characters back (unget).

I've attached a proposed patch.  It depends on the three preliminary
patches posted here:

  http://lists.gnu.org/archive/html/guile-devel/2013-03/msg00221.html

Thoughts?

   Regards,
     Mark


[0004-Peeks-do-not-consume-EOFs.patch (text/x-diff, inline)]
From e732d035b86a990ea0cfb56a710ee9224d8dbd31 Mon Sep 17 00:00:00 2001
From: Mark H Weaver <mhw <at> netris.org>
Date: Sun, 31 Mar 2013 19:06:51 -0400
Subject: [PATCH 4/5] Peeks do not consume EOFs.

* libguile/ports-internal.h (struct scm_port_internal): Add
  'pending_eof' flag.

* libguile/inline.h (scm_peek_byte_or_eof): Set 'pending_eof' flag
  before returning EOF.

* libguile/ports.c (scm_i_set_pending_eof): New function.
  (scm_i_clear_pending_eof): New static function.
  (scm_new_port_table_entry): Initialize 'pending_eof'.
  (scm_fill_input): Check for 'pending_eof'.
  (scm_end_input, scm_unget_byte, scm_seek): Clear 'pending_eof'.
  (scm_peek_char): Set 'pending_eof' flag before returning EOF.

* libguile/ports.h (scm_i_set_pending_eof): Add prototype.
---
 libguile/inline.h         |    5 ++++-
 libguile/ports-internal.h |    1 +
 libguile/ports.c          |   33 +++++++++++++++++++++++++++++++--
 libguile/ports.h          |    1 +
 4 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/libguile/inline.h b/libguile/inline.h
index 88ba7f7..fffe101 100644
--- a/libguile/inline.h
+++ b/libguile/inline.h
@@ -134,7 +134,10 @@ scm_peek_byte_or_eof (SCM port)
   if (pt->read_pos >= pt->read_end)
     {
       if (SCM_UNLIKELY (scm_fill_input (port) == EOF))
-	return EOF;
+        {
+          scm_i_set_pending_eof (port);
+          return EOF;
+        }
     }
 
   c = *pt->read_pos;
diff --git a/libguile/ports-internal.h b/libguile/ports-internal.h
index d52eab2..5bde9d0 100644
--- a/libguile/ports-internal.h
+++ b/libguile/ports-internal.h
@@ -48,6 +48,7 @@ struct scm_port_internal
 {
   scm_t_port_encoding_mode encoding_mode;
   scm_t_iconv_descriptors *iconv_descriptors;
+  int pending_eof;
 };
 
 typedef struct scm_port_internal scm_t_port_internal;
diff --git a/libguile/ports.c b/libguile/ports.c
index 1ac9106..89e1714 100644
--- a/libguile/ports.c
+++ b/libguile/ports.c
@@ -241,6 +241,18 @@ scm_set_port_input_waiting (scm_t_bits tc, int (*input_waiting) (SCM))
   scm_ptobs[SCM_TC2PTOBNUM (tc)].input_waiting = input_waiting;
 }
 
+void
+scm_i_set_pending_eof (SCM port)
+{
+  SCM_INTERNAL_PTAB_ENTRY (port)->pending_eof = 1;
+}
+
+static void
+scm_i_clear_pending_eof (SCM port)
+{
+  SCM_INTERNAL_PTAB_ENTRY (port)->pending_eof = 0;
+}
+
 
 
 SCM_DEFINE (scm_char_ready_p, "char-ready?", 0, 1, 0, 
@@ -634,6 +646,8 @@ scm_new_port_table_entry (scm_t_bits tag)
   entry->input_cd = pti;   /* XXX pointer to the internal port structure */
   entry->output_cd = NULL; /* XXX unused */
 
+  pti->pending_eof = 0;
+
   SCM_SET_CELL_TYPE (z, tag);
   SCM_SETPTAB_ENTRY (z, entry);
 
@@ -1410,9 +1424,16 @@ int
 scm_fill_input (SCM port)
 {
   scm_t_port *pt = SCM_PTAB_ENTRY (port);
+  scm_t_port_internal *pti = SCM_INTERNAL_PTAB_ENTRY (port);
 
   assert (pt->read_pos == pt->read_end);
 
+  if (pti->pending_eof)
+    {
+      pti->pending_eof = 0;
+      return EOF;
+    }
+
   if (pt->read_buf == pt->putback_buf)
     {
       /* finished reading put-back chars.  */
@@ -1652,6 +1673,7 @@ scm_end_input (SCM port)
   long offset;
   scm_t_port *pt = SCM_PTAB_ENTRY (port);
 
+  scm_i_clear_pending_eof (port);
   if (pt->read_buf == pt->putback_buf)
     {
       offset = pt->read_end - pt->read_pos;
@@ -1675,6 +1697,7 @@ scm_unget_byte (int c, SCM port)
 {
   scm_t_port *pt = SCM_PTAB_ENTRY (port);
 
+  scm_i_clear_pending_eof (port);
   if (pt->read_buf == pt->putback_buf)
     /* already using the put-back buffer.  */
     {
@@ -1846,7 +1869,10 @@ SCM_DEFINE (scm_peek_char, "peek-char", 0, 1, 0,
       result = SCM_BOOL_F;
     }
   else if (c == EOF)
-    result = SCM_EOF_VAL;
+    {
+      scm_i_set_pending_eof (port);
+      result = SCM_EOF_VAL;
+    }
   else
     result = SCM_MAKE_CHAR (c);
 
@@ -1945,7 +1971,10 @@ SCM_DEFINE (scm_seek, "seek", 3, 0, 0,
 	SCM_MISC_ERROR ("port is not seekable", 
                         scm_cons (fd_port, SCM_EOL));
       else
-	rv = ptob->seek (fd_port, off, how);
+        {
+          scm_i_clear_pending_eof (fd_port);
+          rv = ptob->seek (fd_port, off, how);
+        }
       return scm_from_off_t_or_off64_t (rv);
     }
   else /* file descriptor?.  */
diff --git a/libguile/ports.h b/libguile/ports.h
index 95545cd..5aca26d 100644
--- a/libguile/ports.h
+++ b/libguile/ports.h
@@ -319,6 +319,7 @@ SCM_API SCM scm_set_port_filename_x (SCM port, SCM filename);
 SCM_INTERNAL const char *scm_i_default_port_encoding (void);
 SCM_INTERNAL void scm_i_set_default_port_encoding (const char *);
 SCM_INTERNAL void scm_i_set_port_encoding_x (SCM port, const char *str);
+SCM_INTERNAL void scm_i_set_pending_eof (SCM port);
 SCM_API SCM scm_port_encoding (SCM port);
 SCM_API SCM scm_set_port_encoding_x (SCM port, SCM encoding);
 SCM_INTERNAL scm_t_string_failed_conversion_handler
-- 
1.7.10.4


Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Mon, 01 Apr 2013 21:31:01 GMT) Full text and rfc822 format available.

Message #66 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: Andy Wingo <wingo <at> pobox.com>
Cc: almkglor <almkglor <at> gmail.com>, dwheeler <at> dwheeler.com,
	12216 <12216 <at> debbugs.gnu.org>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Mon, 01 Apr 2013 17:27:14 -0400
[Message part 1 (text/plain, inline)]
The prerequisite patches are now in stable-2.0.  Here's an updated patch
that applies against current stable-2.0.  Otherwise it has not changed.

      Mark


[0001-Peeks-do-not-consume-EOFs.patch (text/x-diff, inline)]
From 7d9ebc691519af7236e1fec8dfefd5fc8784f875 Mon Sep 17 00:00:00 2001
From: Mark H Weaver <mhw <at> netris.org>
Date: Sun, 31 Mar 2013 19:06:51 -0400
Subject: [PATCH] Peeks do not consume EOFs.

Fixes <http://bugs.gnu.org/12216>.

* libguile/ports-internal.h (struct scm_port_internal): Add
  'pending_eof' flag.

* libguile/inline.h (scm_peek_byte_or_eof): Set 'pending_eof' flag
  before returning EOF.

* libguile/ports.c (scm_i_set_pending_eof): New function.
  (scm_i_clear_pending_eof): New static function.
  (scm_new_port_table_entry): Initialize 'pending_eof'.
  (scm_fill_input): Check for 'pending_eof'.
  (scm_end_input, scm_unget_byte, scm_seek): Clear 'pending_eof'.
  (scm_peek_char): Set 'pending_eof' flag before returning EOF.

* libguile/ports.h (scm_i_set_pending_eof): Add prototype.
---
 libguile/inline.h         |    5 ++++-
 libguile/ports-internal.h |    1 +
 libguile/ports.c          |   32 ++++++++++++++++++++++++++++++--
 libguile/ports.h          |    1 +
 4 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/libguile/inline.h b/libguile/inline.h
index 88ba7f7..fffe101 100644
--- a/libguile/inline.h
+++ b/libguile/inline.h
@@ -134,7 +134,10 @@ scm_peek_byte_or_eof (SCM port)
   if (pt->read_pos >= pt->read_end)
     {
       if (SCM_UNLIKELY (scm_fill_input (port) == EOF))
-	return EOF;
+        {
+          scm_i_set_pending_eof (port);
+          return EOF;
+        }
     }
 
   c = *pt->read_pos;
diff --git a/libguile/ports-internal.h b/libguile/ports-internal.h
index 73a788f..333d4fb 100644
--- a/libguile/ports-internal.h
+++ b/libguile/ports-internal.h
@@ -48,6 +48,7 @@ struct scm_port_internal
 {
   scm_t_port_encoding_mode encoding_mode;
   scm_t_iconv_descriptors *iconv_descriptors;
+  int pending_eof;
   SCM alist;
 };
 
diff --git a/libguile/ports.c b/libguile/ports.c
index becdbed..65e2e6f 100644
--- a/libguile/ports.c
+++ b/libguile/ports.c
@@ -241,6 +241,18 @@ scm_set_port_input_waiting (scm_t_bits tc, int (*input_waiting) (SCM))
   scm_ptobs[SCM_TC2PTOBNUM (tc)].input_waiting = input_waiting;
 }
 
+void
+scm_i_set_pending_eof (SCM port)
+{
+  SCM_PORT_GET_INTERNAL (port)->pending_eof = 1;
+}
+
+static void
+scm_i_clear_pending_eof (SCM port)
+{
+  SCM_PORT_GET_INTERNAL (port)->pending_eof = 0;
+}
+
 SCM
 scm_i_port_alist (SCM port)
 {
@@ -645,6 +657,7 @@ scm_new_port_table_entry (scm_t_bits tag)
   entry->input_cd = pti;   /* XXX pointer to the internal port structure */
   entry->output_cd = NULL; /* XXX unused */
 
+  pti->pending_eof = 0;
   pti->alist = SCM_EOL;
 
   SCM_SET_CELL_TYPE (z, tag);
@@ -1423,9 +1436,16 @@ int
 scm_fill_input (SCM port)
 {
   scm_t_port *pt = SCM_PTAB_ENTRY (port);
+  scm_t_port_internal *pti = SCM_PORT_GET_INTERNAL (port);
 
   assert (pt->read_pos == pt->read_end);
 
+  if (pti->pending_eof)
+    {
+      pti->pending_eof = 0;
+      return EOF;
+    }
+
   if (pt->read_buf == pt->putback_buf)
     {
       /* finished reading put-back chars.  */
@@ -1665,6 +1685,7 @@ scm_end_input (SCM port)
   long offset;
   scm_t_port *pt = SCM_PTAB_ENTRY (port);
 
+  scm_i_clear_pending_eof (port);
   if (pt->read_buf == pt->putback_buf)
     {
       offset = pt->read_end - pt->read_pos;
@@ -1688,6 +1709,7 @@ scm_unget_byte (int c, SCM port)
 {
   scm_t_port *pt = SCM_PTAB_ENTRY (port);
 
+  scm_i_clear_pending_eof (port);
   if (pt->read_buf == pt->putback_buf)
     /* already using the put-back buffer.  */
     {
@@ -1859,7 +1881,10 @@ SCM_DEFINE (scm_peek_char, "peek-char", 0, 1, 0,
       result = SCM_BOOL_F;
     }
   else if (c == EOF)
-    result = SCM_EOF_VAL;
+    {
+      scm_i_set_pending_eof (port);
+      result = SCM_EOF_VAL;
+    }
   else
     result = SCM_MAKE_CHAR (c);
 
@@ -1958,7 +1983,10 @@ SCM_DEFINE (scm_seek, "seek", 3, 0, 0,
 	SCM_MISC_ERROR ("port is not seekable", 
                         scm_cons (fd_port, SCM_EOL));
       else
-	rv = ptob->seek (fd_port, off, how);
+        {
+          scm_i_clear_pending_eof (fd_port);
+          rv = ptob->seek (fd_port, off, how);
+        }
       return scm_from_off_t_or_off64_t (rv);
     }
   else /* file descriptor?.  */
diff --git a/libguile/ports.h b/libguile/ports.h
index 53d5081..5c9334a 100644
--- a/libguile/ports.h
+++ b/libguile/ports.h
@@ -316,6 +316,7 @@ SCM_API SCM scm_port_column (SCM port);
 SCM_API SCM scm_set_port_column_x (SCM port, SCM line);
 SCM_API SCM scm_port_filename (SCM port);
 SCM_API SCM scm_set_port_filename_x (SCM port, SCM filename);
+SCM_INTERNAL void scm_i_set_pending_eof (SCM port);
 SCM_INTERNAL SCM scm_i_port_alist (SCM port);
 SCM_INTERNAL void scm_i_set_port_alist_x (SCM port, SCM alist);
 SCM_INTERNAL const char *scm_i_default_port_encoding (void);
-- 
1.7.10.4


Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Mon, 01 Apr 2013 21:37:01 GMT) Full text and rfc822 format available.

Message #69 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: "David A. Wheeler" <dwheeler <at> dwheeler.com>
To: "mhw" <mhw <at> netris.org>
Cc: wingo <wingo <at> pobox.com>, almkglor <almkglor <at> gmail.com>,
	12216 <12216 <at> debbugs.gnu.org>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Mon, 01 Apr 2013 17:33:50 -0400 (EDT)
On Mon, 01 Apr 2013 17:27:14 -0400, Mark H Weaver <mhw <at> netris.org> wrote:

> The prerequisite patches are now in stable-2.0.  Here's an updated patch
> that applies against current stable-2.0.  Otherwise it has not changed.

Awesome!! Thanks for tracking that down and finding a solution.

--- David A. Wheeler




Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Tue, 02 Apr 2013 08:02:01 GMT) Full text and rfc822 format available.

Message #72 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: Andy Wingo <wingo <at> pobox.com>
To: Mark H Weaver <mhw <at> netris.org>
Cc: almkglor <almkglor <at> gmail.com>, dwheeler <at> dwheeler.com,
	12216 <12216 <at> debbugs.gnu.org>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Tue, 02 Apr 2013 09:58:43 +0200
Looks like a good start.  Two comments:

On Mon 01 Apr 2013 23:27, Mark H Weaver <mhw <at> netris.org> writes:

> --- a/libguile/inline.h
> +++ b/libguile/inline.h
> @@ -134,7 +134,10 @@ scm_peek_byte_or_eof (SCM port)
>    if (pt->read_pos >= pt->read_end)
>      {
>        if (SCM_UNLIKELY (scm_fill_input (port) == EOF))
> -	return EOF;
> +        {
> +          scm_i_set_pending_eof (port);
> +          return EOF;
> +        }
>      }
>  

1. I don't much like the addition of this call to this inline function.
Can we move all the non-fast-path code into some other function?  That
would include the flush call as well.

2. I think we probably need some tests.

Thanks for looking at this :-)

Andy
-- 
http://wingolog.org/




Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Tue, 02 Apr 2013 19:45:01 GMT) Full text and rfc822 format available.

Message #75 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: Andy Wingo <wingo <at> pobox.com>
Cc: almkglor <almkglor <at> gmail.com>, dwheeler <at> dwheeler.com,
	12216 <12216 <at> debbugs.gnu.org>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Tue, 02 Apr 2013 15:41:22 -0400
[Message part 1 (text/plain, inline)]
Hi Andy,

Andy Wingo <wingo <at> pobox.com> writes:
> Looks like a good start.  Two comments:
>
> On Mon 01 Apr 2013 23:27, Mark H Weaver <mhw <at> netris.org> writes:
>
> 1. I don't much like the addition of this call to this inline function.
> Can we move all the non-fast-path code into some other function?  That
> would include the flush call as well.

Good idea!  I've posted separate patches to do that here:

  "[PATCH] Move slow path out of 'scm_get_byte_or_eof' et al"
  http://lists.gnu.org/archive/html/guile-devel/2013-04/msg00032.html

> 2. I think we probably need some tests.

I've attached a new patch which adds tests.  Note that this patch
depends upon the "Move slow path out" patches referenced above.

More thoughts?

   Thanks!
     Mark


[0001-Peeks-do-not-consume-EOFs.patch (text/x-diff, inline)]
From 688dd519d7be102b9d5ce2193883c33d82f8df7b Mon Sep 17 00:00:00 2001
From: Mark H Weaver <mhw <at> netris.org>
Date: Sun, 31 Mar 2013 19:06:51 -0400
Subject: [PATCH] Peeks do not consume EOFs.

Fixes <http://bugs.gnu.org/12216>.

* libguile/ports-internal.h (struct scm_port_internal): Add
  'pending_eof' flag.

* libguile/ports.c (scm_i_set_pending_eof, scm_i_clear_pending_eof): New
  static functions.
  (scm_new_port_table_entry): Initialize 'pending_eof'.
  (scm_i_fill_input): Check for 'pending_eof'.
  (scm_i_peek_byte_or_eof): Set 'pending_eof' flag before returning EOF.
  (scm_end_input, scm_unget_byte, scm_seek): Clear 'pending_eof'.
  (scm_peek_char): Set 'pending_eof' flag before returning EOF.

* test-suite/tests/ports.test ("pending EOF behavior"): Add tests.
---
 libguile/ports-internal.h   |    1 +
 libguile/ports.c            |   37 +++++++++++++++++--
 test-suite/tests/ports.test |   84 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 119 insertions(+), 3 deletions(-)

diff --git a/libguile/ports-internal.h b/libguile/ports-internal.h
index 73a788f..333d4fb 100644
--- a/libguile/ports-internal.h
+++ b/libguile/ports-internal.h
@@ -48,6 +48,7 @@ struct scm_port_internal
 {
   scm_t_port_encoding_mode encoding_mode;
   scm_t_iconv_descriptors *iconv_descriptors;
+  int pending_eof;
   SCM alist;
 };
 
diff --git a/libguile/ports.c b/libguile/ports.c
index ee14ca5..61bfc72 100644
--- a/libguile/ports.c
+++ b/libguile/ports.c
@@ -241,6 +241,18 @@ scm_set_port_input_waiting (scm_t_bits tc, int (*input_waiting) (SCM))
   scm_ptobs[SCM_TC2PTOBNUM (tc)].input_waiting = input_waiting;
 }
 
+static void
+scm_i_set_pending_eof (SCM port)
+{
+  SCM_PORT_GET_INTERNAL (port)->pending_eof = 1;
+}
+
+static void
+scm_i_clear_pending_eof (SCM port)
+{
+  SCM_PORT_GET_INTERNAL (port)->pending_eof = 0;
+}
+
 SCM
 scm_i_port_alist (SCM port)
 {
@@ -645,6 +657,7 @@ scm_new_port_table_entry (scm_t_bits tag)
   entry->input_cd = pti;   /* XXX pointer to the internal port structure */
   entry->output_cd = NULL; /* XXX unused */
 
+  pti->pending_eof = 0;
   pti->alist = SCM_EOL;
 
   SCM_SET_CELL_TYPE (z, tag);
@@ -1423,9 +1436,16 @@ static int
 scm_i_fill_input (SCM port)
 {
   scm_t_port *pt = SCM_PTAB_ENTRY (port);
+  scm_t_port_internal *pti = SCM_PORT_GET_INTERNAL (port);
 
   assert (pt->read_pos == pt->read_end);
 
+  if (pti->pending_eof)
+    {
+      pti->pending_eof = 0;
+      return EOF;
+    }
+
   if (pt->read_buf == pt->putback_buf)
     {
       /* finished reading put-back chars.  */
@@ -1481,7 +1501,10 @@ scm_i_peek_byte_or_eof (SCM port)
   if (pt->read_pos >= pt->read_end)
     {
       if (SCM_UNLIKELY (scm_i_fill_input (port) == EOF))
-	return EOF;
+        {
+          scm_i_set_pending_eof (port);
+          return EOF;
+        }
     }
 
   return *pt->read_pos;
@@ -1713,6 +1736,7 @@ scm_end_input (SCM port)
   long offset;
   scm_t_port *pt = SCM_PTAB_ENTRY (port);
 
+  scm_i_clear_pending_eof (port);
   if (pt->read_buf == pt->putback_buf)
     {
       offset = pt->read_end - pt->read_pos;
@@ -1736,6 +1760,7 @@ scm_unget_byte (int c, SCM port)
 {
   scm_t_port *pt = SCM_PTAB_ENTRY (port);
 
+  scm_i_clear_pending_eof (port);
   if (pt->read_buf == pt->putback_buf)
     /* already using the put-back buffer.  */
     {
@@ -1907,7 +1932,10 @@ SCM_DEFINE (scm_peek_char, "peek-char", 0, 1, 0,
       result = SCM_BOOL_F;
     }
   else if (c == EOF)
-    result = SCM_EOF_VAL;
+    {
+      scm_i_set_pending_eof (port);
+      result = SCM_EOF_VAL;
+    }
   else
     result = SCM_MAKE_CHAR (c);
 
@@ -2006,7 +2034,10 @@ SCM_DEFINE (scm_seek, "seek", 3, 0, 0,
 	SCM_MISC_ERROR ("port is not seekable", 
                         scm_cons (fd_port, SCM_EOL));
       else
-	rv = ptob->seek (fd_port, off, how);
+        {
+          scm_i_clear_pending_eof (fd_port);
+          rv = ptob->seek (fd_port, off, how);
+        }
       return scm_from_off_t_or_off64_t (rv);
     }
   else /* file descriptor?.  */
diff --git a/test-suite/tests/ports.test b/test-suite/tests/ports.test
index 886ab24..7b6ee22 100644
--- a/test-suite/tests/ports.test
+++ b/test-suite/tests/ports.test
@@ -1110,6 +1110,90 @@
 		(char-ready?))))))
 
 
+;;;; pending-eof behavior
+
+(with-test-prefix "pending EOF behavior"
+  ;; Make a test port that will produce the given sequence.  Each
+  ;; element of 'lst' may be either a character or #f (which means EOF).
+  (define (test-soft-port . lst)
+    (make-soft-port
+     (vector (lambda (c) #f)            ; write char
+             (lambda (s) #f)            ; write string
+             (lambda () #f)             ; flush
+             (lambda ()                 ; read char
+               (let ((c (car lst)))
+                 (set! lst (cdr lst))
+                 c))
+             (lambda () #f))            ; close
+     "rw"))
+
+  (define (call-with-port p proc)
+    (dynamic-wind
+      (lambda () #f)
+      (lambda () (proc p))
+      (lambda () (close-port p))))
+
+  (define (call-with-test-file str proc)
+    (let ((filename (test-file)))
+      (dynamic-wind
+        (lambda () (call-with-output-file filename
+                     (lambda (p) (display str p))))
+        (lambda () (call-with-input-file filename proc))
+        (lambda () (delete-file (test-file))))))
+
+  (pass-if "peek-char does not swallow EOF (soft port)"
+    (call-with-port (test-soft-port #\a #f #\b)
+      (lambda (p)
+        (and (char=? #\a  (peek-char p))
+             (char=? #\a  (read-char p))
+             (eof-object? (peek-char p))
+             (eof-object? (read-char p))
+             (char=? #\b  (peek-char p))
+             (char=? #\b  (read-char p))))))
+
+  (pass-if "unread clears pending EOF (soft port)"
+    (call-with-port (test-soft-port #\a #f #\b)
+      (lambda (p)
+        (and (char=? #\a  (read-char p))
+             (eof-object? (peek-char p))
+             (begin (unread-char #\u p)
+                    (char=? #\u  (read-char p)))))))
+
+  (pass-if "unread clears pending EOF (string port)"
+    (call-with-input-string "a"
+      (lambda (p)
+        (and (char=? #\a  (read-char p))
+             (eof-object? (peek-char p))
+             (begin (unread-char #\u p)
+                    (char=? #\u  (read-char p)))))))
+
+  (pass-if "unread clears pending EOF (file port)"
+    (call-with-test-file
+     "a"
+     (lambda (p)
+       (and (char=? #\a  (read-char p))
+            (eof-object? (peek-char p))
+            (begin (unread-char #\u p)
+                   (char=? #\u  (read-char p)))))))
+
+  (pass-if "seek clears pending EOF (string port)"
+    (call-with-input-string "a"
+      (lambda (p)
+        (and (char=? #\a  (read-char p))
+             (eof-object? (peek-char p))
+             (begin (seek p 0 SEEK_SET)
+                    (char=? #\a (read-char p)))))))
+
+  (pass-if "seek clears pending EOF (file port)"
+    (call-with-test-file
+     "a"
+     (lambda (p)
+       (and (char=? #\a  (read-char p))
+            (eof-object? (peek-char p))
+            (begin (seek p 0 SEEK_SET)
+                   (char=? #\a (read-char p))))))))
+
+
 ;;;; Close current-input-port, and make sure everyone can handle it.
 
 (with-test-prefix "closing current-input-port"
-- 
1.7.10.4


Information forwarded to bug-guile <at> gnu.org:
bug#12216; Package guile. (Thu, 04 Apr 2013 20:04:02 GMT) Full text and rfc822 format available.

Message #78 received at 12216 <at> debbugs.gnu.org (full text, mbox):

From: Andy Wingo <wingo <at> pobox.com>
To: Mark H Weaver <mhw <at> netris.org>
Cc: almkglor <almkglor <at> gmail.com>, dwheeler <at> dwheeler.com,
	12216 <12216 <at> debbugs.gnu.org>
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Thu, 04 Apr 2013 22:00:35 +0200
On Tue 02 Apr 2013 21:41, Mark H Weaver <mhw <at> netris.org> writes:

> Good idea!  I've posted separate patches to do that here:
>
>   "[PATCH] Move slow path out of 'scm_get_byte_or_eof' et al"
>   http://lists.gnu.org/archive/html/guile-devel/2013-04/msg00032.html

Cool, and thanks to Ludo for review there.  Just to note, if you end up
doing any more work on these functions in the future, you should move
them to ports.h.  See 18cd9aff9429c99ffae34448507f9b468e20e06f for an
example of what I'm talking about.

>> 2. I think we probably need some tests.
>
> I've attached a new patch which adds tests.  Note that this patch
> depends upon the "Move slow path out" patches referenced above.

Looks great to me.  Thanks for following up on this!

Andy
-- 
http://wingolog.org/




Reply sent to Mark H Weaver <mhw <at> netris.org>:
You have taken responsibility. (Thu, 04 Apr 2013 21:58:01 GMT) Full text and rfc822 format available.

Notification sent to dwheeler <at> dwheeler.com:
bug acknowledged by developer. (Thu, 04 Apr 2013 21:58:02 GMT) Full text and rfc822 format available.

Message #83 received at 12216-done <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: Andy Wingo <wingo <at> pobox.com>
Cc: almkglor <almkglor <at> gmail.com>, dwheeler <at> dwheeler.com,
	12216-done <at> debbugs.gnu.org
Subject: Re: bug#12216: peek-char incorrectly *CONSUMES* eof
Date: Thu, 04 Apr 2013 17:53:50 -0400
Andy Wingo <wingo <at> pobox.com> writes:
> Looks great to me.  Thanks for following up on this!

Excellent!  I pushed this to stable-2.0, so it will be in Guile 2.0.8.
Closing this bug now.

    Thanks!
      Mark




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 03 May 2013 11:24:03 GMT) Full text and rfc822 format available.

This bug report was last modified 10 years and 360 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.