GNU bug report logs -
#12216
peek-char incorrectly *CONSUMES* eof
Previous Next
Reported by: dwheeler <at> dwheeler.com
Date: Fri, 17 Aug 2012 02:03:01 UTC
Severity: normal
Done: Mark H Weaver <mhw <at> netris.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 12216 in the body.
You can then email your comments to 12216 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Fri, 17 Aug 2012 02:03:01 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
dwheeler <at> dwheeler.com
:
New bug report received and forwarded. Copy sent to
bug-guile <at> gnu.org
.
(Fri, 17 Aug 2012 02:03:01 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
All:
Guile's peek-char has a bug; it incorrectly *consumes* eof instead of just reporting it.
According to R5RS, "The value returned by a call to peek-char is the same as the value that would have been returned by a call to read-char with the same port. The only difference is that the very next call to read-char or peek-char on that port will return the value returned by the preceding call to peek-char."
However, if the value returned is #eof, guile does *not* meet the spec; you *can* get two successive peek-chars with different results.
This doesn't matter for files, but it *does* matter for interactive use. It means that successive peek-char calls will actually *READ* characters, and they can even different in results. Which means that code that does several peek-chars can act oddly when someone tries to end it with control-D.
We've confirmed this bug is true for guile 2.0, 1.8, and 1.6.
You can confirm this by placing this in "bug-demo":
===================
(write (peek-char))
(write (peek-char))
======================
Then run "guile bug-demo". Press control-D, then newline. You can see that two successive calls to peek-char are reporting different results (eof, then newline), which is NEVER supposed to happen!
I just had to write some code to work around this, but it'd be nice for this to work "correctly" in the future for interactive use. (This was code for the "readable" project, http://readable.sourceforge.net, where we do a lot with guile.)
Thanks for your time... and thanks for guile!
--- David A. Wheeler
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Tue, 05 Mar 2013 18:55:04 GMT)
Full text and
rfc822 format available.
Message #8 received at 12216 <at> debbugs.gnu.org (full text, mbox):
Hi,
On Fri 17 Aug 2012 03:53, "David A. Wheeler" <dwheeler <at> dwheeler.com> writes:
> Guile's peek-char has a bug; it incorrectly *consumes* eof instead of
> just reporting it.
I'm not sure there is a sane way to deal with this. Guile's ports are
fundamentally binary, underneath. The backing store is a byte buffer
that is decoded and encoded to Scheme characters at need, along with
vtable to fill and flush that buffer. You can do strange things like
unread characters, switch the encoding, and read them back out in a
different encoding. There is no way to represent EOF in that buffer.
I have the feeling that for interactive use, if you expect to read a EOF
from a port and then continue, you have to not use peek-char. You need
to handle your own lookahead buffer. I know it's not a great answer,
but I can't think of anything else that makes sense.
What do you think?
Andy
--
http://wingolog.org/
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Tue, 05 Mar 2013 19:18:02 GMT)
Full text and
rfc822 format available.
Message #11 received at 12216 <at> debbugs.gnu.org (full text, mbox):
I reported:
> > Guile's peek-char has a bug; it incorrectly *consumes* eof instead of
> > just reporting it.
Andy Wingo replied:
> I have the feeling that for interactive use, if you expect to read a EOF
> from a port and then continue, you have to not use peek-char. You need
> to handle your own lookahead buffer. I know it's not a great answer,
> but I can't think of anything else that makes sense.
We don't want to read an EOF and then continue.
We peek-char to not *CONSUME* an interactive EOF.
Once peek-char returns EOF, every following peek-char
should *also* return EOF until the EOF is read.
That isn't what currently happens.
E.G.:
(define (demo) (write (peek-char)) (write (peek-char)))
(demo)
; Now type control-D ENTER.
; It SHOULD produce #<eof> twice, after peeking EOF,
; but instead it produces #<eof> #\newline
#<eof>#<eof>guile> (demo)
--- David A. Wheeler
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Thu, 07 Mar 2013 21:34:02 GMT)
Full text and
rfc822 format available.
Message #14 received at 12216 <at> debbugs.gnu.org (full text, mbox):
On Tue 05 Mar 2013 20:17, "David A. Wheeler" <dwheeler <at> dwheeler.com> writes:
> I reported:
>> > Guile's peek-char has a bug; it incorrectly *consumes* eof instead of
>> > just reporting it.
>
> Andy Wingo replied:
>> I have the feeling that for interactive use, if you expect to read a EOF
>> from a port and then continue, you have to not use peek-char. You need
>> to handle your own lookahead buffer. I know it's not a great answer,
>> but I can't think of anything else that makes sense.
>
> We don't want to read an EOF and then continue.
> We peek-char to not *CONSUME* an interactive EOF.
I understand what it is you want. But I don't know of any sane way to
implement it.
Given that it is an edge case -- peek-char on an interactive port -- I
think the thing to do is to document this inconsistency, and for you (in
your code) to implement some sort of abstraction that does not use
peek-char.
That is to say, use read-char, and do unread-char as necessary. Or keep
your own buffer around.
I know it isn't nice, but I don't know how to do anything better.
Andy
--
http://wingolog.org/
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Sat, 09 Mar 2013 02:27:01 GMT)
Full text and
rfc822 format available.
Message #17 received at 12216 <at> debbugs.gnu.org (full text, mbox):
On 8 March 2013 05:32, Andy Wingo <wingo <at> pobox.com> wrote:
> On Tue 05 Mar 2013 20:17, "David A. Wheeler" <dwheeler <at> dwheeler.com> writes:
>
>> I reported:
>>> > Guile's peek-char has a bug; it incorrectly *consumes* eof instead of
>>> > just reporting it.
>>
>> Andy Wingo replied:
>>> I have the feeling that for interactive use, if you expect to read a EOF
>>> from a port and then continue, you have to not use peek-char. You need
>>> to handle your own lookahead buffer. I know it's not a great answer,
>>> but I can't think of anything else that makes sense.
>>
>> We don't want to read an EOF and then continue.
>> We peek-char to not *CONSUME* an interactive EOF.
>
> I understand what it is you want. But I don't know of any sane way to
> implement it.
Indeed. There is a distinction between the end-of-file _object_
(Scheme) and end-of-transmission _character_ (ASCII ^D). The two are
not equivalent, and when read-char and peek-char return the former
value it is only to signal a _current_ lack of characters and should
not be considered part of the character stream read from the port.
This is the same reason why, e.g. the end-of-file object can not be
passed to unread-char: it is not a character.
IIRC some other Schemes do handle the test case from OP, but I do not
agree that those semantics are sane.
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Sat, 09 Mar 2013 17:13:02 GMT)
Full text and
rfc822 format available.
Message #20 received at 12216 <at> debbugs.gnu.org (full text, mbox):
Daniel Hartwig:
> Indeed. There is a distinction between the end-of-file _object_
> (Scheme) and end-of-transmission _character_ (ASCII ^D). The two are
> not equivalent, ...
Sure.
> and when read-char and peek-char return the former
> value it is only to signal a _current_ lack of characters and should
> not be considered part of the character stream read from the port.
If I understand you correctly, that sounds more like "char-ready?".
In any case, I believe guile fails to meet the semantics
of R5RS, and I believe other versions of the Scheme spec,
in this area. R5RS says:
> (peek-char) procedure
> (peek-char port) procedure
> Returns the next character available from the input port,
> without updating the port to point to the following character.
> If no more characters are available, an end of le
> object is returned. Port may be omitted, in which case it
> defaults to the value returned by current-input-port.
> Note: The value returned by a call to peek-char is the same as
> the value that would have been returned by a call to read-char
> with the same port. The only difference is that the very next call
> to read-char or peek-char on that port will return the value
> returned by the preceding call to peek-char. In particular, a
> call to peek-char on an interactive port will hang waiting for
> input whenever a call to read-char would have hung.
Note that the spec says that the very next call
"to read-char or peek-char on that port will return the value
returned by the preceding call to peek-char." It does not say
the same *CHARACTER*, it says the same *VALUE*, and the
eof object is a possible value from peek-char and read-char.
I'd be okay if, after a read-char of EOF, it could return something else
as some sort of extension. But (peek-char) (peek-char) (peek-char)
should return the same values 3 times, no matter what... even
if it's EOF.
--- David A. Wheeler
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Sun, 10 Mar 2013 00:25:02 GMT)
Full text and
rfc822 format available.
Message #23 received at 12216 <at> debbugs.gnu.org (full text, mbox):
On 10 March 2013 01:11, David A. Wheeler <dwheeler <at> dwheeler.com> wrote:
> Daniel Hartwig:
>> and when read-char and peek-char return the former
>> value it is only to signal a _current_ lack of characters and should
>> not be considered part of the character stream read from the port.
>
> If I understand you correctly, that sounds more like "char-ready?".
Erm yes, my mistake.
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Wed, 13 Mar 2013 11:04:02 GMT)
Full text and
rfc822 format available.
Message #26 received at 12216 <at> debbugs.gnu.org (full text, mbox):
On Sat 09 Mar 2013 18:11, "David A. Wheeler" <dwheeler <at> dwheeler.com> writes:
> (peek-char) (peek-char) (peek-char) should return the same values 3
> times, no matter what... even if it's EOF.
So, we are repeating ourselves here :) I agree with you but I can't see
a good way of implementing this. For that reason I am inclined to
document this inconsistency and close as a wontfix.
Andy
--
http://wingolog.org/
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Wed, 13 Mar 2013 13:11:01 GMT)
Full text and
rfc822 format available.
Message #29 received at 12216 <at> debbugs.gnu.org (full text, mbox):
Andy Wingo:
> So, we are repeating ourselves here :) I agree with you but I can't see
> a good way of implementing this.
Would the per-port reader options be reasonable place to store the info
about EOF?
--- David A. Wheeler
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Wed, 13 Mar 2013 14:35:02 GMT)
Full text and
rfc822 format available.
Message #32 received at 12216 <at> debbugs.gnu.org (full text, mbox):
On Wed 13 Mar 2013 14:09, "David A. Wheeler" <dwheeler <at> dwheeler.com> writes:
> Andy Wingo:
>
>> So, we are repeating ourselves here :) I agree with you but I can't see
>> a good way of implementing this.
>
> Would the per-port reader options be reasonable place to store the info
> about EOF?
For your own purposes that would be fine. But it cannot affect
read-char / peek-char / etc for everyone, because it would have bad
global effects on performance and correctness. That's why I'm pushing
back on fixing this in Guile itself.
Andy
--
http://wingolog.org/
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Wed, 13 Mar 2013 18:12:01 GMT)
Full text and
rfc822 format available.
Message #35 received at 12216 <at> debbugs.gnu.org (full text, mbox):
Andy Wingo <wingo <at> pobox.com> writes:
> On Wed 13 Mar 2013 14:09, "David A. Wheeler" <dwheeler <at> dwheeler.com> writes:
>
>> Andy Wingo:
>>
>>> So, we are repeating ourselves here :) I agree with you but I can't see
>>> a good way of implementing this.
>>
>> Would the per-port reader options be reasonable place to store the info
>> about EOF?
>
> For your own purposes that would be fine. But it cannot affect
> read-char / peek-char / etc for everyone, because it would have bad
> global effects on performance and correctness. That's why I'm pushing
> back on fixing this in Guile itself.
I don't know, it might not be that bad, now that we've agreed on a way
to extend the port structure in 2.0. Maybe we could just have a "last
peek-char returned EOF" flag that would be consulted by the other read
primitives.
I agree that we should not allow EOF to be unread.
What do you think?
Mark
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Wed, 13 Mar 2013 18:24:01 GMT)
Full text and
rfc822 format available.
Message #38 received at 12216 <at> debbugs.gnu.org (full text, mbox):
On Wed 13 Mar 2013 19:10, Mark H Weaver <mhw <at> netris.org> writes:
> Andy Wingo <wingo <at> pobox.com> writes:
>
>> On Wed 13 Mar 2013 14:09, "David A. Wheeler" <dwheeler <at> dwheeler.com> writes:
>>
>>> Andy Wingo:
>>>
>>>> So, we are repeating ourselves here :) I agree with you but I can't see
>>>> a good way of implementing this.
>>>
>>> Would the per-port reader options be reasonable place to store the info
>>> about EOF?
>>
>> For your own purposes that would be fine. But it cannot affect
>> read-char / peek-char / etc for everyone, because it would have bad
>> global effects on performance and correctness. That's why I'm pushing
>> back on fixing this in Guile itself.
>
> I don't know, it might not be that bad, now that we've agreed on a way
> to extend the port structure in 2.0. Maybe we could just have a "last
> peek-char returned EOF" flag that would be consulted by the other read
> primitives.
>
> I agree that we should not allow EOF to be unread.
>
> What do you think?
I really doubt our ability to get it right. Consider that we have code
that accesses the buffer directly, binary and textual ports, etc
etc... I don't think we're going to get this right. Fixing this would
also have complexity and performance costs as well.
Maybe if it is somehow confined to scm_peek_char and scm_fill_input it
could be doable.
Andy
--
http://wingolog.org/
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Thu, 14 Mar 2013 17:16:01 GMT)
Full text and
rfc822 format available.
Message #41 received at 12216 <at> debbugs.gnu.org (full text, mbox):
Andy Wingo <wingo <at> pobox.com> writes:
> On Wed 13 Mar 2013 19:10, Mark H Weaver <mhw <at> netris.org> writes:
>
>> I don't know, it might not be that bad, now that we've agreed on a way
>> to extend the port structure in 2.0. Maybe we could just have a "last
>> peek-char returned EOF" flag that would be consulted by the other read
>> primitives.
>>
>> I agree that we should not allow EOF to be unread.
>>
>> What do you think?
>
> I really doubt our ability to get it right. Consider that we have code
> that accesses the buffer directly, binary and textual ports, etc
> etc... I don't think we're going to get this right.
I think you're being overly pessimistic, but for the sake of argument,
suppose you're right. Then we do the best we can and fix any remaining
problems as they're discovered later. It's better to aim for
correctness and fail in our first attempt than to give up without
trying.
IMO, this is a bug, pure and simple. I think we can do better than
"it's too hard, let's just punt on this". We are implementors of
Scheme, which has lots of details that are hard to get right, yet we
rise to the challenge, do we not?
> Fixing this would also have complexity and performance costs as well.
One extra check per read primitive is hardly enough of a cost to
consider abandoning correctness, IMO.
Mark
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Thu, 14 Mar 2013 17:45:02 GMT)
Full text and
rfc822 format available.
Message #44 received at 12216 <at> debbugs.gnu.org (full text, mbox):
On Thu, 14 Mar 2013 13:13:39 -0400, Mark H Weaver <mhw <at> netris.org> wrote:
> One extra check per read primitive is hardly enough of a cost to
> consider abandoning correctness, IMO.
Though I've not reviewed this code, I *think* you'd only need to check when the buffer is empty. When the buffer is empty, it'll take a while for the OS to fill it anyway, far more than checking a boolean value.
--- David A. Wheeler
Reply sent
to
Mark H Weaver <mhw <at> netris.org>
:
You have taken responsibility.
(Sun, 17 Mar 2013 23:50:02 GMT)
Full text and
rfc822 format available.
Notification sent
to
dwheeler <at> dwheeler.com
:
bug acknowledged by developer.
(Sun, 17 Mar 2013 23:50:02 GMT)
Full text and
rfc822 format available.
Message #49 received at 12216-done <at> debbugs.gnu.org (full text, mbox):
I've pushed 1ea37620c2c1794f7685b312d2530676a078ada7 to stable-2.0,
which fixes our number printer. Closing this bug.
Thanks,
Mark
Did not alter fixed versions and reopened.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Mon, 18 Mar 2013 01:17:01 GMT)
Full text and
rfc822 format available.
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Mon, 18 Mar 2013 01:31:01 GMT)
Full text and
rfc822 format available.
Message #54 received at 12216 <at> debbugs.gnu.org (full text, mbox):
I wrote:
> I've pushed 1ea37620c2c1794f7685b312d2530676a078ada7 to stable-2.0,
> which fixes our number printer. Closing this bug.
Sorry, I sent this to the wrong bug. Oops :)
Mark
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Sat, 30 Mar 2013 22:41:07 GMT)
Full text and
rfc822 format available.
Message #57 received at 12216 <at> debbugs.gnu.org (full text, mbox):
Andy Wingo <wingo <at> pobox.com> writes:
> Maybe if it is somehow confined to scm_peek_char and scm_fill_input it
> could be doable.
I think I now see a reasonable way to fix this.
First of all, all interfaces that do something like 'peek' would, when
returning an EOF, set a 'pending_eof' flag in the port structure. Then
'scm_fill_input' would start by checking this flag; if it is set, then
it would clear the flag and return EOF.
We would also need to clear the 'pending_eof' flag in a few other
places, most notably in 'scm_end_input'. In theory, that might be
enough, but to be on the safe side we should also clear the flag in any
procedure that writes or seeks.
Note that there is one inlined 'peek' function: 'scm_peek_byte_or_eof'
defined in inline.h. We can change it to set the 'pending_eof' flag,
but we must consider that some code will be compiled against the old
version. This is not a serious problem. It just means that if someone
calls the old 'scm_peek_byte_or_eof' and gets EOF, the EOF will be
swallowed and not properly repeated by the next 'read'. Oh well.
I plan to cook up a patch along these lines soon, as part of my ports
work for 2.0.8.
Thoughts?
Mark
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Sat, 30 Mar 2013 23:06:01 GMT)
Full text and
rfc822 format available.
Message #60 received at 12216 <at> debbugs.gnu.org (full text, mbox):
Mark H Weaver:
> I think I now see a reasonable way to fix this.
> First of all, all interfaces that do something like 'peek' would, when
> returning an EOF, set a 'pending_eof' flag in the port structure...
That's awesome, and it sounds like it'd work well.
Thanks for looking into this.
--- David A. Wheeler
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Mon, 01 Apr 2013 02:10:01 GMT)
Full text and
rfc822 format available.
Message #63 received at 12216 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Mark H Weaver <mhw <at> netris.org> writes:
> Andy Wingo <wingo <at> pobox.com> writes:
>> Maybe if it is somehow confined to scm_peek_char and scm_fill_input it
>> could be doable.
>
> I think I now see a reasonable way to fix this.
>
> First of all, all interfaces that do something like 'peek' would, when
> returning an EOF, set a 'pending_eof' flag in the port structure. Then
> 'scm_fill_input' would start by checking this flag; if it is set, then
> it would clear the flag and return EOF.
>
> We would also need to clear the 'pending_eof' flag in a few other
> places, most notably in 'scm_end_input'. In theory, that might be
> enough, but to be on the safe side we should also clear the flag in any
> procedure that writes or seeks.
Upon further thought, I decided that we shouldn't clear it on writes.
If 'rw_random' is set, then 'scm_end_input' will be called anyway. If
'rw_random' is not set, then we should *not* clear the flag on writes,
because it indicates that the read and write streams are independent,
e.g. a terminal or socket.
On the other hand, the 'pending_eof' flag should of course be cleared
when putting characters back (unget).
I've attached a proposed patch. It depends on the three preliminary
patches posted here:
http://lists.gnu.org/archive/html/guile-devel/2013-03/msg00221.html
Thoughts?
Regards,
Mark
[0004-Peeks-do-not-consume-EOFs.patch (text/x-diff, inline)]
From e732d035b86a990ea0cfb56a710ee9224d8dbd31 Mon Sep 17 00:00:00 2001
From: Mark H Weaver <mhw <at> netris.org>
Date: Sun, 31 Mar 2013 19:06:51 -0400
Subject: [PATCH 4/5] Peeks do not consume EOFs.
* libguile/ports-internal.h (struct scm_port_internal): Add
'pending_eof' flag.
* libguile/inline.h (scm_peek_byte_or_eof): Set 'pending_eof' flag
before returning EOF.
* libguile/ports.c (scm_i_set_pending_eof): New function.
(scm_i_clear_pending_eof): New static function.
(scm_new_port_table_entry): Initialize 'pending_eof'.
(scm_fill_input): Check for 'pending_eof'.
(scm_end_input, scm_unget_byte, scm_seek): Clear 'pending_eof'.
(scm_peek_char): Set 'pending_eof' flag before returning EOF.
* libguile/ports.h (scm_i_set_pending_eof): Add prototype.
---
libguile/inline.h | 5 ++++-
libguile/ports-internal.h | 1 +
libguile/ports.c | 33 +++++++++++++++++++++++++++++++--
libguile/ports.h | 1 +
4 files changed, 37 insertions(+), 3 deletions(-)
diff --git a/libguile/inline.h b/libguile/inline.h
index 88ba7f7..fffe101 100644
--- a/libguile/inline.h
+++ b/libguile/inline.h
@@ -134,7 +134,10 @@ scm_peek_byte_or_eof (SCM port)
if (pt->read_pos >= pt->read_end)
{
if (SCM_UNLIKELY (scm_fill_input (port) == EOF))
- return EOF;
+ {
+ scm_i_set_pending_eof (port);
+ return EOF;
+ }
}
c = *pt->read_pos;
diff --git a/libguile/ports-internal.h b/libguile/ports-internal.h
index d52eab2..5bde9d0 100644
--- a/libguile/ports-internal.h
+++ b/libguile/ports-internal.h
@@ -48,6 +48,7 @@ struct scm_port_internal
{
scm_t_port_encoding_mode encoding_mode;
scm_t_iconv_descriptors *iconv_descriptors;
+ int pending_eof;
};
typedef struct scm_port_internal scm_t_port_internal;
diff --git a/libguile/ports.c b/libguile/ports.c
index 1ac9106..89e1714 100644
--- a/libguile/ports.c
+++ b/libguile/ports.c
@@ -241,6 +241,18 @@ scm_set_port_input_waiting (scm_t_bits tc, int (*input_waiting) (SCM))
scm_ptobs[SCM_TC2PTOBNUM (tc)].input_waiting = input_waiting;
}
+void
+scm_i_set_pending_eof (SCM port)
+{
+ SCM_INTERNAL_PTAB_ENTRY (port)->pending_eof = 1;
+}
+
+static void
+scm_i_clear_pending_eof (SCM port)
+{
+ SCM_INTERNAL_PTAB_ENTRY (port)->pending_eof = 0;
+}
+
SCM_DEFINE (scm_char_ready_p, "char-ready?", 0, 1, 0,
@@ -634,6 +646,8 @@ scm_new_port_table_entry (scm_t_bits tag)
entry->input_cd = pti; /* XXX pointer to the internal port structure */
entry->output_cd = NULL; /* XXX unused */
+ pti->pending_eof = 0;
+
SCM_SET_CELL_TYPE (z, tag);
SCM_SETPTAB_ENTRY (z, entry);
@@ -1410,9 +1424,16 @@ int
scm_fill_input (SCM port)
{
scm_t_port *pt = SCM_PTAB_ENTRY (port);
+ scm_t_port_internal *pti = SCM_INTERNAL_PTAB_ENTRY (port);
assert (pt->read_pos == pt->read_end);
+ if (pti->pending_eof)
+ {
+ pti->pending_eof = 0;
+ return EOF;
+ }
+
if (pt->read_buf == pt->putback_buf)
{
/* finished reading put-back chars. */
@@ -1652,6 +1673,7 @@ scm_end_input (SCM port)
long offset;
scm_t_port *pt = SCM_PTAB_ENTRY (port);
+ scm_i_clear_pending_eof (port);
if (pt->read_buf == pt->putback_buf)
{
offset = pt->read_end - pt->read_pos;
@@ -1675,6 +1697,7 @@ scm_unget_byte (int c, SCM port)
{
scm_t_port *pt = SCM_PTAB_ENTRY (port);
+ scm_i_clear_pending_eof (port);
if (pt->read_buf == pt->putback_buf)
/* already using the put-back buffer. */
{
@@ -1846,7 +1869,10 @@ SCM_DEFINE (scm_peek_char, "peek-char", 0, 1, 0,
result = SCM_BOOL_F;
}
else if (c == EOF)
- result = SCM_EOF_VAL;
+ {
+ scm_i_set_pending_eof (port);
+ result = SCM_EOF_VAL;
+ }
else
result = SCM_MAKE_CHAR (c);
@@ -1945,7 +1971,10 @@ SCM_DEFINE (scm_seek, "seek", 3, 0, 0,
SCM_MISC_ERROR ("port is not seekable",
scm_cons (fd_port, SCM_EOL));
else
- rv = ptob->seek (fd_port, off, how);
+ {
+ scm_i_clear_pending_eof (fd_port);
+ rv = ptob->seek (fd_port, off, how);
+ }
return scm_from_off_t_or_off64_t (rv);
}
else /* file descriptor?. */
diff --git a/libguile/ports.h b/libguile/ports.h
index 95545cd..5aca26d 100644
--- a/libguile/ports.h
+++ b/libguile/ports.h
@@ -319,6 +319,7 @@ SCM_API SCM scm_set_port_filename_x (SCM port, SCM filename);
SCM_INTERNAL const char *scm_i_default_port_encoding (void);
SCM_INTERNAL void scm_i_set_default_port_encoding (const char *);
SCM_INTERNAL void scm_i_set_port_encoding_x (SCM port, const char *str);
+SCM_INTERNAL void scm_i_set_pending_eof (SCM port);
SCM_API SCM scm_port_encoding (SCM port);
SCM_API SCM scm_set_port_encoding_x (SCM port, SCM encoding);
SCM_INTERNAL scm_t_string_failed_conversion_handler
--
1.7.10.4
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Mon, 01 Apr 2013 21:31:01 GMT)
Full text and
rfc822 format available.
Message #66 received at 12216 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
The prerequisite patches are now in stable-2.0. Here's an updated patch
that applies against current stable-2.0. Otherwise it has not changed.
Mark
[0001-Peeks-do-not-consume-EOFs.patch (text/x-diff, inline)]
From 7d9ebc691519af7236e1fec8dfefd5fc8784f875 Mon Sep 17 00:00:00 2001
From: Mark H Weaver <mhw <at> netris.org>
Date: Sun, 31 Mar 2013 19:06:51 -0400
Subject: [PATCH] Peeks do not consume EOFs.
Fixes <http://bugs.gnu.org/12216>.
* libguile/ports-internal.h (struct scm_port_internal): Add
'pending_eof' flag.
* libguile/inline.h (scm_peek_byte_or_eof): Set 'pending_eof' flag
before returning EOF.
* libguile/ports.c (scm_i_set_pending_eof): New function.
(scm_i_clear_pending_eof): New static function.
(scm_new_port_table_entry): Initialize 'pending_eof'.
(scm_fill_input): Check for 'pending_eof'.
(scm_end_input, scm_unget_byte, scm_seek): Clear 'pending_eof'.
(scm_peek_char): Set 'pending_eof' flag before returning EOF.
* libguile/ports.h (scm_i_set_pending_eof): Add prototype.
---
libguile/inline.h | 5 ++++-
libguile/ports-internal.h | 1 +
libguile/ports.c | 32 ++++++++++++++++++++++++++++++--
libguile/ports.h | 1 +
4 files changed, 36 insertions(+), 3 deletions(-)
diff --git a/libguile/inline.h b/libguile/inline.h
index 88ba7f7..fffe101 100644
--- a/libguile/inline.h
+++ b/libguile/inline.h
@@ -134,7 +134,10 @@ scm_peek_byte_or_eof (SCM port)
if (pt->read_pos >= pt->read_end)
{
if (SCM_UNLIKELY (scm_fill_input (port) == EOF))
- return EOF;
+ {
+ scm_i_set_pending_eof (port);
+ return EOF;
+ }
}
c = *pt->read_pos;
diff --git a/libguile/ports-internal.h b/libguile/ports-internal.h
index 73a788f..333d4fb 100644
--- a/libguile/ports-internal.h
+++ b/libguile/ports-internal.h
@@ -48,6 +48,7 @@ struct scm_port_internal
{
scm_t_port_encoding_mode encoding_mode;
scm_t_iconv_descriptors *iconv_descriptors;
+ int pending_eof;
SCM alist;
};
diff --git a/libguile/ports.c b/libguile/ports.c
index becdbed..65e2e6f 100644
--- a/libguile/ports.c
+++ b/libguile/ports.c
@@ -241,6 +241,18 @@ scm_set_port_input_waiting (scm_t_bits tc, int (*input_waiting) (SCM))
scm_ptobs[SCM_TC2PTOBNUM (tc)].input_waiting = input_waiting;
}
+void
+scm_i_set_pending_eof (SCM port)
+{
+ SCM_PORT_GET_INTERNAL (port)->pending_eof = 1;
+}
+
+static void
+scm_i_clear_pending_eof (SCM port)
+{
+ SCM_PORT_GET_INTERNAL (port)->pending_eof = 0;
+}
+
SCM
scm_i_port_alist (SCM port)
{
@@ -645,6 +657,7 @@ scm_new_port_table_entry (scm_t_bits tag)
entry->input_cd = pti; /* XXX pointer to the internal port structure */
entry->output_cd = NULL; /* XXX unused */
+ pti->pending_eof = 0;
pti->alist = SCM_EOL;
SCM_SET_CELL_TYPE (z, tag);
@@ -1423,9 +1436,16 @@ int
scm_fill_input (SCM port)
{
scm_t_port *pt = SCM_PTAB_ENTRY (port);
+ scm_t_port_internal *pti = SCM_PORT_GET_INTERNAL (port);
assert (pt->read_pos == pt->read_end);
+ if (pti->pending_eof)
+ {
+ pti->pending_eof = 0;
+ return EOF;
+ }
+
if (pt->read_buf == pt->putback_buf)
{
/* finished reading put-back chars. */
@@ -1665,6 +1685,7 @@ scm_end_input (SCM port)
long offset;
scm_t_port *pt = SCM_PTAB_ENTRY (port);
+ scm_i_clear_pending_eof (port);
if (pt->read_buf == pt->putback_buf)
{
offset = pt->read_end - pt->read_pos;
@@ -1688,6 +1709,7 @@ scm_unget_byte (int c, SCM port)
{
scm_t_port *pt = SCM_PTAB_ENTRY (port);
+ scm_i_clear_pending_eof (port);
if (pt->read_buf == pt->putback_buf)
/* already using the put-back buffer. */
{
@@ -1859,7 +1881,10 @@ SCM_DEFINE (scm_peek_char, "peek-char", 0, 1, 0,
result = SCM_BOOL_F;
}
else if (c == EOF)
- result = SCM_EOF_VAL;
+ {
+ scm_i_set_pending_eof (port);
+ result = SCM_EOF_VAL;
+ }
else
result = SCM_MAKE_CHAR (c);
@@ -1958,7 +1983,10 @@ SCM_DEFINE (scm_seek, "seek", 3, 0, 0,
SCM_MISC_ERROR ("port is not seekable",
scm_cons (fd_port, SCM_EOL));
else
- rv = ptob->seek (fd_port, off, how);
+ {
+ scm_i_clear_pending_eof (fd_port);
+ rv = ptob->seek (fd_port, off, how);
+ }
return scm_from_off_t_or_off64_t (rv);
}
else /* file descriptor?. */
diff --git a/libguile/ports.h b/libguile/ports.h
index 53d5081..5c9334a 100644
--- a/libguile/ports.h
+++ b/libguile/ports.h
@@ -316,6 +316,7 @@ SCM_API SCM scm_port_column (SCM port);
SCM_API SCM scm_set_port_column_x (SCM port, SCM line);
SCM_API SCM scm_port_filename (SCM port);
SCM_API SCM scm_set_port_filename_x (SCM port, SCM filename);
+SCM_INTERNAL void scm_i_set_pending_eof (SCM port);
SCM_INTERNAL SCM scm_i_port_alist (SCM port);
SCM_INTERNAL void scm_i_set_port_alist_x (SCM port, SCM alist);
SCM_INTERNAL const char *scm_i_default_port_encoding (void);
--
1.7.10.4
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Mon, 01 Apr 2013 21:37:01 GMT)
Full text and
rfc822 format available.
Message #69 received at 12216 <at> debbugs.gnu.org (full text, mbox):
On Mon, 01 Apr 2013 17:27:14 -0400, Mark H Weaver <mhw <at> netris.org> wrote:
> The prerequisite patches are now in stable-2.0. Here's an updated patch
> that applies against current stable-2.0. Otherwise it has not changed.
Awesome!! Thanks for tracking that down and finding a solution.
--- David A. Wheeler
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Tue, 02 Apr 2013 08:02:01 GMT)
Full text and
rfc822 format available.
Message #72 received at 12216 <at> debbugs.gnu.org (full text, mbox):
Looks like a good start. Two comments:
On Mon 01 Apr 2013 23:27, Mark H Weaver <mhw <at> netris.org> writes:
> --- a/libguile/inline.h
> +++ b/libguile/inline.h
> @@ -134,7 +134,10 @@ scm_peek_byte_or_eof (SCM port)
> if (pt->read_pos >= pt->read_end)
> {
> if (SCM_UNLIKELY (scm_fill_input (port) == EOF))
> - return EOF;
> + {
> + scm_i_set_pending_eof (port);
> + return EOF;
> + }
> }
>
1. I don't much like the addition of this call to this inline function.
Can we move all the non-fast-path code into some other function? That
would include the flush call as well.
2. I think we probably need some tests.
Thanks for looking at this :-)
Andy
--
http://wingolog.org/
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Tue, 02 Apr 2013 19:45:01 GMT)
Full text and
rfc822 format available.
Message #75 received at 12216 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Hi Andy,
Andy Wingo <wingo <at> pobox.com> writes:
> Looks like a good start. Two comments:
>
> On Mon 01 Apr 2013 23:27, Mark H Weaver <mhw <at> netris.org> writes:
>
> 1. I don't much like the addition of this call to this inline function.
> Can we move all the non-fast-path code into some other function? That
> would include the flush call as well.
Good idea! I've posted separate patches to do that here:
"[PATCH] Move slow path out of 'scm_get_byte_or_eof' et al"
http://lists.gnu.org/archive/html/guile-devel/2013-04/msg00032.html
> 2. I think we probably need some tests.
I've attached a new patch which adds tests. Note that this patch
depends upon the "Move slow path out" patches referenced above.
More thoughts?
Thanks!
Mark
[0001-Peeks-do-not-consume-EOFs.patch (text/x-diff, inline)]
From 688dd519d7be102b9d5ce2193883c33d82f8df7b Mon Sep 17 00:00:00 2001
From: Mark H Weaver <mhw <at> netris.org>
Date: Sun, 31 Mar 2013 19:06:51 -0400
Subject: [PATCH] Peeks do not consume EOFs.
Fixes <http://bugs.gnu.org/12216>.
* libguile/ports-internal.h (struct scm_port_internal): Add
'pending_eof' flag.
* libguile/ports.c (scm_i_set_pending_eof, scm_i_clear_pending_eof): New
static functions.
(scm_new_port_table_entry): Initialize 'pending_eof'.
(scm_i_fill_input): Check for 'pending_eof'.
(scm_i_peek_byte_or_eof): Set 'pending_eof' flag before returning EOF.
(scm_end_input, scm_unget_byte, scm_seek): Clear 'pending_eof'.
(scm_peek_char): Set 'pending_eof' flag before returning EOF.
* test-suite/tests/ports.test ("pending EOF behavior"): Add tests.
---
libguile/ports-internal.h | 1 +
libguile/ports.c | 37 +++++++++++++++++--
test-suite/tests/ports.test | 84 +++++++++++++++++++++++++++++++++++++++++++
3 files changed, 119 insertions(+), 3 deletions(-)
diff --git a/libguile/ports-internal.h b/libguile/ports-internal.h
index 73a788f..333d4fb 100644
--- a/libguile/ports-internal.h
+++ b/libguile/ports-internal.h
@@ -48,6 +48,7 @@ struct scm_port_internal
{
scm_t_port_encoding_mode encoding_mode;
scm_t_iconv_descriptors *iconv_descriptors;
+ int pending_eof;
SCM alist;
};
diff --git a/libguile/ports.c b/libguile/ports.c
index ee14ca5..61bfc72 100644
--- a/libguile/ports.c
+++ b/libguile/ports.c
@@ -241,6 +241,18 @@ scm_set_port_input_waiting (scm_t_bits tc, int (*input_waiting) (SCM))
scm_ptobs[SCM_TC2PTOBNUM (tc)].input_waiting = input_waiting;
}
+static void
+scm_i_set_pending_eof (SCM port)
+{
+ SCM_PORT_GET_INTERNAL (port)->pending_eof = 1;
+}
+
+static void
+scm_i_clear_pending_eof (SCM port)
+{
+ SCM_PORT_GET_INTERNAL (port)->pending_eof = 0;
+}
+
SCM
scm_i_port_alist (SCM port)
{
@@ -645,6 +657,7 @@ scm_new_port_table_entry (scm_t_bits tag)
entry->input_cd = pti; /* XXX pointer to the internal port structure */
entry->output_cd = NULL; /* XXX unused */
+ pti->pending_eof = 0;
pti->alist = SCM_EOL;
SCM_SET_CELL_TYPE (z, tag);
@@ -1423,9 +1436,16 @@ static int
scm_i_fill_input (SCM port)
{
scm_t_port *pt = SCM_PTAB_ENTRY (port);
+ scm_t_port_internal *pti = SCM_PORT_GET_INTERNAL (port);
assert (pt->read_pos == pt->read_end);
+ if (pti->pending_eof)
+ {
+ pti->pending_eof = 0;
+ return EOF;
+ }
+
if (pt->read_buf == pt->putback_buf)
{
/* finished reading put-back chars. */
@@ -1481,7 +1501,10 @@ scm_i_peek_byte_or_eof (SCM port)
if (pt->read_pos >= pt->read_end)
{
if (SCM_UNLIKELY (scm_i_fill_input (port) == EOF))
- return EOF;
+ {
+ scm_i_set_pending_eof (port);
+ return EOF;
+ }
}
return *pt->read_pos;
@@ -1713,6 +1736,7 @@ scm_end_input (SCM port)
long offset;
scm_t_port *pt = SCM_PTAB_ENTRY (port);
+ scm_i_clear_pending_eof (port);
if (pt->read_buf == pt->putback_buf)
{
offset = pt->read_end - pt->read_pos;
@@ -1736,6 +1760,7 @@ scm_unget_byte (int c, SCM port)
{
scm_t_port *pt = SCM_PTAB_ENTRY (port);
+ scm_i_clear_pending_eof (port);
if (pt->read_buf == pt->putback_buf)
/* already using the put-back buffer. */
{
@@ -1907,7 +1932,10 @@ SCM_DEFINE (scm_peek_char, "peek-char", 0, 1, 0,
result = SCM_BOOL_F;
}
else if (c == EOF)
- result = SCM_EOF_VAL;
+ {
+ scm_i_set_pending_eof (port);
+ result = SCM_EOF_VAL;
+ }
else
result = SCM_MAKE_CHAR (c);
@@ -2006,7 +2034,10 @@ SCM_DEFINE (scm_seek, "seek", 3, 0, 0,
SCM_MISC_ERROR ("port is not seekable",
scm_cons (fd_port, SCM_EOL));
else
- rv = ptob->seek (fd_port, off, how);
+ {
+ scm_i_clear_pending_eof (fd_port);
+ rv = ptob->seek (fd_port, off, how);
+ }
return scm_from_off_t_or_off64_t (rv);
}
else /* file descriptor?. */
diff --git a/test-suite/tests/ports.test b/test-suite/tests/ports.test
index 886ab24..7b6ee22 100644
--- a/test-suite/tests/ports.test
+++ b/test-suite/tests/ports.test
@@ -1110,6 +1110,90 @@
(char-ready?))))))
+;;;; pending-eof behavior
+
+(with-test-prefix "pending EOF behavior"
+ ;; Make a test port that will produce the given sequence. Each
+ ;; element of 'lst' may be either a character or #f (which means EOF).
+ (define (test-soft-port . lst)
+ (make-soft-port
+ (vector (lambda (c) #f) ; write char
+ (lambda (s) #f) ; write string
+ (lambda () #f) ; flush
+ (lambda () ; read char
+ (let ((c (car lst)))
+ (set! lst (cdr lst))
+ c))
+ (lambda () #f)) ; close
+ "rw"))
+
+ (define (call-with-port p proc)
+ (dynamic-wind
+ (lambda () #f)
+ (lambda () (proc p))
+ (lambda () (close-port p))))
+
+ (define (call-with-test-file str proc)
+ (let ((filename (test-file)))
+ (dynamic-wind
+ (lambda () (call-with-output-file filename
+ (lambda (p) (display str p))))
+ (lambda () (call-with-input-file filename proc))
+ (lambda () (delete-file (test-file))))))
+
+ (pass-if "peek-char does not swallow EOF (soft port)"
+ (call-with-port (test-soft-port #\a #f #\b)
+ (lambda (p)
+ (and (char=? #\a (peek-char p))
+ (char=? #\a (read-char p))
+ (eof-object? (peek-char p))
+ (eof-object? (read-char p))
+ (char=? #\b (peek-char p))
+ (char=? #\b (read-char p))))))
+
+ (pass-if "unread clears pending EOF (soft port)"
+ (call-with-port (test-soft-port #\a #f #\b)
+ (lambda (p)
+ (and (char=? #\a (read-char p))
+ (eof-object? (peek-char p))
+ (begin (unread-char #\u p)
+ (char=? #\u (read-char p)))))))
+
+ (pass-if "unread clears pending EOF (string port)"
+ (call-with-input-string "a"
+ (lambda (p)
+ (and (char=? #\a (read-char p))
+ (eof-object? (peek-char p))
+ (begin (unread-char #\u p)
+ (char=? #\u (read-char p)))))))
+
+ (pass-if "unread clears pending EOF (file port)"
+ (call-with-test-file
+ "a"
+ (lambda (p)
+ (and (char=? #\a (read-char p))
+ (eof-object? (peek-char p))
+ (begin (unread-char #\u p)
+ (char=? #\u (read-char p)))))))
+
+ (pass-if "seek clears pending EOF (string port)"
+ (call-with-input-string "a"
+ (lambda (p)
+ (and (char=? #\a (read-char p))
+ (eof-object? (peek-char p))
+ (begin (seek p 0 SEEK_SET)
+ (char=? #\a (read-char p)))))))
+
+ (pass-if "seek clears pending EOF (file port)"
+ (call-with-test-file
+ "a"
+ (lambda (p)
+ (and (char=? #\a (read-char p))
+ (eof-object? (peek-char p))
+ (begin (seek p 0 SEEK_SET)
+ (char=? #\a (read-char p))))))))
+
+
;;;; Close current-input-port, and make sure everyone can handle it.
(with-test-prefix "closing current-input-port"
--
1.7.10.4
Information forwarded
to
bug-guile <at> gnu.org
:
bug#12216
; Package
guile
.
(Thu, 04 Apr 2013 20:04:02 GMT)
Full text and
rfc822 format available.
Message #78 received at 12216 <at> debbugs.gnu.org (full text, mbox):
On Tue 02 Apr 2013 21:41, Mark H Weaver <mhw <at> netris.org> writes:
> Good idea! I've posted separate patches to do that here:
>
> "[PATCH] Move slow path out of 'scm_get_byte_or_eof' et al"
> http://lists.gnu.org/archive/html/guile-devel/2013-04/msg00032.html
Cool, and thanks to Ludo for review there. Just to note, if you end up
doing any more work on these functions in the future, you should move
them to ports.h. See 18cd9aff9429c99ffae34448507f9b468e20e06f for an
example of what I'm talking about.
>> 2. I think we probably need some tests.
>
> I've attached a new patch which adds tests. Note that this patch
> depends upon the "Move slow path out" patches referenced above.
Looks great to me. Thanks for following up on this!
Andy
--
http://wingolog.org/
Reply sent
to
Mark H Weaver <mhw <at> netris.org>
:
You have taken responsibility.
(Thu, 04 Apr 2013 21:58:01 GMT)
Full text and
rfc822 format available.
Notification sent
to
dwheeler <at> dwheeler.com
:
bug acknowledged by developer.
(Thu, 04 Apr 2013 21:58:02 GMT)
Full text and
rfc822 format available.
Message #83 received at 12216-done <at> debbugs.gnu.org (full text, mbox):
Andy Wingo <wingo <at> pobox.com> writes:
> Looks great to me. Thanks for following up on this!
Excellent! I pushed this to stable-2.0, so it will be in Guile 2.0.8.
Closing this bug now.
Thanks!
Mark
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Fri, 03 May 2013 11:24:03 GMT)
Full text and
rfc822 format available.
This bug report was last modified 10 years and 360 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.