GNU bug report logs - #20200
GUILE 2.0.11: open-bytevector-input-port fails to open in binary mode

Package: guile;

Reported by: David Kastrup <dak <at> gnu.org>

Date: Wed, 25 Mar 2015 14:33:02 UTC

Severity: normal

Done: Mark H Weaver <mhw <at> netris.org>

Bug is archived. No further changes may be made.


Report forwarded to bug-guile <at> gnu.org:
bug#20200; Package guile. (Wed, 25 Mar 2015 14:33:02 GMT)

Acknowledgement sent to David Kastrup <dak <at> gnu.org>:
New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Wed, 25 Mar 2015 14:33:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: David Kastrup <dak <at> gnu.org>
To: bug-guile <at> gnu.org
Subject: GUILE 2.0.11: open-bytevector-input-port fails to open in binary mode
Date: Wed, 25 Mar 2015 15:31:32 +0100
Run the following code in a UTF-8 capable locale:

[bad.scm (text/plain, inline)]
(setlocale LC_ALL "")
(use-modules (rnrs io ports) (rnrs bytevectors) (ice-9 format))
(let ((p (open-bytevector-input-port
	  (u8-list->bytevector '(#xc3 #x9f #xc3 #X9f)))))
  (format #t "~a ~a\n" (port-encoding p) (binary-port? p))
  (format #t "#x~x\n" (char->integer (read-char p)))
  (format #t "~a ~a\n" (port-encoding p) (binary-port? p))
  (set-port-encoding! p "ISO-8859-1")
  (format #t "~a ~a\n" (port-encoding p) (binary-port? p))
  (format #t "#x~x\n" (char->integer (read-char p)))
  (format #t "~a ~a\n" (port-encoding p) (binary-port? p)))
This results in the output
#f #t
#xdf
#f #t
ISO-8859-1 #f
#xc3
ISO-8859-1 #f

The manual, however, states:

 -- Scheme Procedure: port-encoding port
 -- C Function: scm_port_encoding (port)
     Returns, as a string, the character encoding that PORT uses to
     interpret its input and output.  The value ‘#f’ is equivalent to
     ‘"ISO-8859-1"’.

That would appear to be false since the value #f here is treated as
equivalent to "UTF-8" rather than "ISO-8859-1".
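
The two values can be reproduced without any port. A minimal sketch, not part of the original report: the bytes #xc3 #x9f form a single character (U+00DF) when decoded as UTF-8, while ISO-8859-1 maps every byte to the code point of the same value, so the first character is U+00C3. The names bv, as-utf8 and as-latin1 are illustrative only.

```scheme
(use-modules (rnrs bytevectors))

;; First two bytes of the bytevector used in the report.
(define bv (u8-list->bytevector '(#xc3 #x9f)))

;; UTF-8 decoding: both bytes together encode one character.
(define as-utf8 (utf8->string bv))

;; ISO-8859-1 decoding done by hand: one character per byte,
;; with the code point equal to the byte value.
(define as-latin1
  (list->string (map integer->char (bytevector->u8-list bv))))

(display (number->string (char->integer (string-ref as-utf8 0)) 16))
(newline)   ; prints df
(display (number->string (char->integer (string-ref as-latin1 0)) 16))
(newline)   ; prints c3
```

A first read-char result of #xdf therefore indicates UTF-8 decoding, and #xc3 indicates ISO-8859-1 decoding.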

In addition, the manual states

 -- Scheme Procedure: binary-port? port
     Return ‘#t’ if PORT is a "binary port", suitable for binary data
     input/output.

     Note that internally Guile does not differentiate between binary
     and textual ports, unlike the R6RS. Thus, this procedure returns
     true when PORT does not have an associated encoding—i.e., when
     ‘(port-encoding PORT)’ is ‘#f’ (*note port-encoding: Ports.).  This
     is the case for ports returned by R6RS procedures such as
     ‘open-bytevector-input-port’ and ‘make-custom-binary-output-port’.

     However, Guile currently does not prevent use of textual I/O
     procedures such as ‘display’ or ‘read-char’ with binary ports.
     Doing so “upgrades” the port from binary to textual, under the
     ISO-8859-1 encoding.  Likewise, Guile does not prevent use of
     ‘set-port-encoding!’ on a binary port, which also turns it into a
     “textual” port.

But it would appear that the only way to actually get binary-encoded
read-char behavior is to switch the port to textual.  While the port is
in "binary" mode, it will decode as UTF-8 rather than deliver binary
data.  Also, it will not automatically switch itself away from the
nominal #f encoding, which is not the encoding actually in effect.

Putting (with-fluids ((%default-port-encoding #f)) ...) around the
open-bytevector-input-port call results in the output
#f #t
#xc3
ISO-8859-1 #f
ISO-8859-1 #f
#x9f
ISO-8859-1 #f
which actually corresponds to the documentation.
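
Spelled out, the wrapping described above might look like the following sketch. Only the call that creates the port is placed inside with-fluids; the variable name p follows the original script, first-char is illustrative, and the expected value is taken from the output quoted above rather than from a fresh run.

```scheme
(use-modules (rnrs io ports) (rnrs bytevectors))

;; Create the port while the default port encoding fluid is #f,
;; so the new port does not inherit the locale's UTF-8 encoding.
(define p
  (with-fluids ((%default-port-encoding #f))
    (open-bytevector-input-port
     (u8-list->bytevector '(#xc3 #x9f #xc3 #x9f)))))

;; According to the output above, this reads a single byte: #xc3.
(define first-char (read-char p))
(display (number->string (char->integer first-char) 16))
(newline)
```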

-- 
David Kastrup

Information forwarded to bug-guile <at> gnu.org:
bug#20200; Package guile. (Thu, 26 Mar 2015 22:58:01 GMT)

Message #8 received at 20200 <at> debbugs.gnu.org:

From: Mark H Weaver <mhw <at> netris.org>
To: David Kastrup <dak <at> gnu.org>
Cc: 20200 <at> debbugs.gnu.org
Subject: Re: bug#20200: GUILE 2.0.11: open-bytevector-input-port fails to open in binary mode
Date: Thu, 26 Mar 2015 18:57:53 -0400
David Kastrup <dak <at> gnu.org> writes:

> Run the following code in a UTF-8 capable locale:
>
> (setlocale LC_ALL "")
> (use-modules (rnrs io ports) (rnrs bytevectors) (ice-9 format))
> (let ((p (open-bytevector-input-port
> 	  (u8-list->bytevector '(#xc3 #x9f #xc3 #X9f)))))
>   (format #t "~a ~a\n" (port-encoding p) (binary-port? p))
>   (format #t "#x~x\n" (char->integer (read-char p)))
>   (format #t "~a ~a\n" (port-encoding p) (binary-port? p))
>   (set-port-encoding! p "ISO-8859-1")
>   (format #t "~a ~a\n" (port-encoding p) (binary-port? p))
>   (format #t "#x~x\n" (char->integer (read-char p)))
>   (format #t "~a ~a\n" (port-encoding p) (binary-port? p)))
>
> This results in the output
> #f #t
> #xdf
> #f #t
> ISO-8859-1 #f
> #xc3
> ISO-8859-1 #f
>
> The manual, however, states:
>
>  -- Scheme Procedure: port-encoding port
>  -- C Function: scm_port_encoding (port)
>      Returns, as a string, the character encoding that PORT uses to
>      interpret its input and output.  The value ‘#f’ is equivalent to
>      ‘"ISO-8859-1"’.
>
> That would appear to be false since the value #f here is treated as
> equivalent to "UTF-8" rather than "ISO-8859-1".

This is indeed a bug, introduced in Guile 2.0.9.  The workaround is to
explicitly set the encoding to "ISO-8859-1".
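
A minimal sketch of that workaround, assuming the port is created with open-bytevector-input-port as in the report. The helper name open-latin1-bytevector-port is hypothetical and not a Guile procedure; c1 and c2 are illustrative names.

```scheme
(use-modules (rnrs io ports) (rnrs bytevectors))

;; Hypothetical helper: open a bytevector port and immediately
;; set its encoding explicitly instead of relying on the #f default.
(define (open-latin1-bytevector-port bv)
  (let ((p (open-bytevector-input-port bv)))
    (set-port-encoding! p "ISO-8859-1")
    p))

(define p
  (open-latin1-bytevector-port (u8-list->bytevector '(#xc3 #x9f))))

;; Under ISO-8859-1 each read-char consumes exactly one byte.
(define c1 (read-char p))
(define c2 (read-char p))
(display (list (char->integer c1) (char->integer c2)))
(newline)   ; prints (195 159), i.e. #xc3 and #x9f
```

Note that, as the report's output shows, binary-port? returns #f once the encoding has been set explicitly.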

      Mark




Reply sent to Mark H Weaver <mhw <at> netris.org>:
You have taken responsibility. (Sat, 28 Mar 2015 20:14:01 GMT)

Notification sent to David Kastrup <dak <at> gnu.org>:
bug acknowledged by developer. (Sat, 28 Mar 2015 20:14:02 GMT)

Message #13 received at 20200-done <at> debbugs.gnu.org:

From: Mark H Weaver <mhw <at> netris.org>
To: David Kastrup <dak <at> gnu.org>
Cc: 20200-done <at> debbugs.gnu.org
Subject: Re: bug#20200: GUILE 2.0.11: open-bytevector-input-port fails to open in binary mode
Date: Sat, 28 Mar 2015 16:13:56 -0400
Fixed in d574d96f879c147c6c14df43f2e4ff9e8a6876b9, which will be in
Guile 2.0.12.  I'm closing this bug now.

    Thanks,
      Mark




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 26 Apr 2015 11:24:04 GMT)
