GNU bug report logs - #19388
grep 2.21-1 identifies iso encoded text files as binary
Reported by: Martin Hoch <hoch <at> fidion.de>
Date: Mon, 15 Dec 2014 16:49:01 UTC
Severity: normal
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 19388 in the body.
You can then email your comments to 19388 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-grep <at> gnu.org: bug#19388; Package grep. (Mon, 15 Dec 2014 16:49:02 GMT)
Acknowledgement sent to Martin Hoch <hoch <at> fidion.de>: New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Mon, 15 Dec 2014 16:49:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Hi,
I noticed that grep 2.21-1 regards ISO-8859-15 encoded files as binary if
LC_ALL is set to en_US.UTF8.
I am not sure whether this is a bug or an expected behaviour change in 2.21-1, but
since I could not find anything in the changelog that directly mentions it, I am
reporting it. (I could not find anything on http://debbugs.gnu.org either.)
How to reproduce:
Create an ISO-8859-15 encoded test file named testfile containing: test ä ö ü
export LC_ALL=en_US.UTF8
grep test testfile
Binary file testfile matches
export LC_ALL=en_US
(grep works as expected)
The behaviour for LC_ALL=en_US.UTF8 was changed in 2.21-1 and worked correctly
in 2.20-1.
I am testing this on arch with glibc 2.20-4 (if that is relevant).
Please let me know if you need more information.
Regards,
Martin
--
Martin Hoch Friedrich-Bergius-Ring 15
fidion GmbH 97076 Würzburg
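The test file from the reproduction steps above can be created deterministically with a short script. This is only a sketch: the file name testfile comes from the grep command in the report, while the helper name write_latin9_file and the use of a temporary directory are assumptions made for illustration.

```python
import os
import tempfile

# Content taken from the report; the trailing newline is an assumption.
TEXT = "test ä ö ü\n"


def write_latin9_file(path: str) -> bytes:
    """Write TEXT encoded as ISO-8859-15 and return the raw bytes on disk."""
    with open(path, "w", encoding="iso-8859-15", newline="\n") as f:
        f.write(TEXT)
    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    # Hypothetical location: a fresh temporary directory.
    path = os.path.join(tempfile.mkdtemp(), "testfile")
    raw = write_latin9_file(path)
    # Each umlaut is stored as a single byte >= 0x80.
    print(path, raw)
```

In ISO-8859-15 each umlaut occupies one byte with the high bit set, and those bytes do not form valid UTF-8 sequences here, which is the property the report hinges on.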
Information forwarded to bug-grep <at> gnu.org: bug#19388; Package grep. (Mon, 15 Dec 2014 16:50:02 GMT)
Message #8 received at 19388 <at> debbugs.gnu.org (full text, mbox):
GNU bug Tracking System writes:
> Thank you for filing a new bug report with debbugs.gnu.org.
>
> This is an automatically generated reply to let you know your message
> has been received.
>
> Your message is being forwarded to the package maintainers and other
> interested parties for their attention; they will reply in due course.
>
> Your message has been sent to the package maintainer(s):
> bug-grep <at> gnu.org
>
> If you wish to submit further information on this problem, please
> send it to 19388 <at> debbugs.gnu.org.
>
> Please do not send mail to help-debbugs <at> gnu.org unless you wish
> to report a problem with the Bug-tracking system.
>
> --
> 19388: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=19388
> GNU Bug Tracking System
> Contact help-debbugs <at> gnu.org with problems
Thank you for your email. I am currently out sick. Your email
will not be forwarded. In urgent cases, please contact
support <at> fidion.de.
Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>: You have taken responsibility. (Tue, 16 Dec 2014 07:13:02 GMT)
Notification sent to Martin Hoch <hoch <at> fidion.de>: bug acknowledged by developer. (Tue, 16 Dec 2014 07:13:02 GMT)
Message #13 received at 19388-done <at> debbugs.gnu.org (full text, mbox):
Martin Hoch wrote:
> I noticed that grep 2.21-1 regards ISO-8859-15 encoded files as binary if
> LC_ALL is set to en_US.UTF8.
>
> I am not sure if this is a bug or an expected behaviour change in 2.21-1
It's an expected change. Although this was documented in NEWS:
    If a file contains data improperly encoded for the current locale,
    and this is discovered before any of the file's contents are output,
    grep now treats the file as binary.
the grep manual is not so clear about it. I installed the attached patch to try
to fix that.
[0001-doc-document-binary-data-heuristic-better.patch (text/x-diff, attachment)]
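The NEWS entry quoted above can be illustrated with a rough sketch. It is not grep's implementation: the function name looks_binary is hypothetical, and the check only asks whether the bytes decode cleanly in a given encoding. It ignores the "before any of the file's contents are output" condition and any other checks grep performs.

```python
def looks_binary(data: bytes, locale_encoding: str) -> bool:
    """Return True if data is not validly encoded for locale_encoding.

    Hypothetical approximation of the rule in the NEWS entry; the real
    logic lives inside grep and differs in detail.
    """
    try:
        data.decode(locale_encoding)
    except UnicodeDecodeError:
        return True
    return False


# The test line from the report, stored as ISO-8859-15 bytes.
sample = "test ä ö ü\n".encode("iso-8859-15")

print(looks_binary(sample, "utf-8"))        # True: invalid in a UTF-8 locale
print(looks_binary(sample, "iso-8859-15"))  # False: valid in a single-byte locale
```

Under this sketch the same bytes count as text in a single-byte locale and as binary in a UTF-8 locale, which matches the difference the reporter saw between LC_ALL=en_US and LC_ALL=en_US.UTF8.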
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 13 Jan 2015 12:24:04 GMT)
This bug report was last modified 9 years and 77 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.