GNU bug report logs - #24530
tests: revamp multibyte-white-space test to be more permissive

Package: grep

Reported by: Jim Meyering <jim <at> meyering.net>

Date: Sat, 24 Sep 2016 23:04:02 UTC

Severity: normal

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.


Report forwarded to bug-grep <at> gnu.org:
bug#24530; Package grep. (Sat, 24 Sep 2016 23:04:02 GMT)

Acknowledgement sent to Jim Meyering <jim <at> meyering.net>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Sat, 24 Sep 2016 23:04:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: bug-grep <at> gnu.org
Subject: tests: revamp multibyte-white-space test to be more permissive
Date: Sat, 24 Sep 2016 16:02:55 -0700
grep's multibyte-white-space test failed too often.
Its failures mainly reflected the system's poor locale support, so the
test gave little signal on whether one would be well-advised to install
the resulting grep binary.

I've done this:

        tests: revamp multibyte-white-space test to be more permissive
        This test elicits too many failures. Whether a system has accurate
        Unicode "whitespace" attributes should not influence whether grep's
        test suite passes.  In many cases, now you will see a warning that
        some multibyte characters do not pass whitespace-related tests, but
        this test no longer fails.  However, if you run this test on a modern
        enough system, it does require that \s and \S do work properly with
        most of the listed characters.
        * tests/multibyte-white-space: Confirm that Fedora 24's locale
        tables still declare those four Unicode code points *not* whitespace.
        Honor a new column telling how to handle failure.  Provide more
        information in each diagnostic.
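
The "new column telling how to handle failure" can be pictured with a
small sketch. The Python below is not the test itself: the real test is
a shell script whose table is not shown in this report. The table rows,
the "warn"/"fail" policy names, and the functions run_checks and
python_re_matcher are all hypothetical. Python's re module stands in for
grep here, and its idea of whitespace does not depend on the system
locale, so results may differ from what grep reports.

```python
import re

# Hypothetical table: (UTF-8 bytes of one character, policy on failure).
# The real test's rows and column values are not shown in the report.
TABLE = [
    (b" ", "fail"),             # plain ASCII space: a mismatch is fatal
    (b"\xe2\x80\x87", "warn"),  # mismatch only produces a warning
    (b"\xe2\x80\x8b", "warn"),
    (b"\xe2\x80\xaf", "warn"),
]


def python_re_matcher(pattern, raw):
    """Stand-in for grep: does `pattern` match the whole decoded character?"""
    return re.fullmatch(pattern, raw.decode("utf-8")) is not None


def run_checks(table, matcher, locale_name):
    """Check \\s and \\S against each row; sort diagnostics by policy."""
    warnings, failures = [], []
    for raw, on_failure in table:
        shown = "".join(f"\\x{b:02x}" for b in raw)
        problems = []
        if not matcher(r"\s", raw):
            problems.append(
                f"\\s failed to match {shown} in the {locale_name} locale")
        if matcher(r"\S", raw):
            problems.append(
                f"\\S mistakenly matched {shown} in the {locale_name} locale")
        # The policy column decides whether a problem is fatal.
        bucket = failures if on_failure == "fail" else warnings
        bucket.extend(problems)
    return warnings, failures


if __name__ == "__main__":
    warnings, failures = run_checks(TABLE, python_re_matcher, "en_US.UTF-8")
    for line in warnings:
        print("warning:", line)
    for line in failures:
        print("FAIL:", line)
    raise SystemExit(1 if failures else 0)
```

Only rows marked "fail" affect the exit status; rows marked "warn" are
still checked and reported, which matches the behavior described above.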

With the attached patch, even on Fedora 24, we see new warnings like
these (previously, those characters were not even checked), and the
test passes as it did before:

 warning: \s failed to match \xe2\x80\x87 in the en_US.UTF-8 locale
 warning: \S mistakenly matched \xe2\x80\x87 in the en_US.UTF-8 locale
 warning: \s failed to match \xe2\x80\x8b in the en_US.UTF-8 locale
 warning: \S mistakenly matched \xe2\x80\x8b in the en_US.UTF-8 locale
 warning: \s failed to match \xe2\x80\xaf in the en_US.UTF-8 locale
 warning: \S mistakenly matched \xe2\x80\xaf in the en_US.UTF-8 locale

More importantly, on less modern systems, where this test used to
fail, it will now merely emit warnings like those above.
[tests--revamp-multibyte-white-space.diff (application/octet-stream, attachment)]
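
The warnings above show the characters only as escaped UTF-8 bytes. As a
reading aid, the following Python sketch decodes those three byte
sequences and looks up their Unicode names. The names SEQUENCES and
describe are illustrative and not part of the patch; the byte sequences
are copied from the warnings.

```python
import unicodedata

# Byte sequences copied from the warnings in this report.
SEQUENCES = [b"\xe2\x80\x87", b"\xe2\x80\x8b", b"\xe2\x80\xaf"]


def describe(raw):
    """Decode one UTF-8 encoded character; return its code point and name."""
    ch = raw.decode("utf-8")
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for raw in SEQUENCES:
    print(describe(raw))
# U+2007 FIGURE SPACE
# U+200B ZERO WIDTH SPACE
# U+202F NARROW NO-BREAK SPACE
```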

Reply sent to Jim Meyering <jim <at> meyering.net>:
You have taken responsibility. (Sun, 25 Sep 2016 00:26:02 GMT)

Notification sent to Jim Meyering <jim <at> meyering.net>:
bug acknowledged by developer. (Sun, 25 Sep 2016 00:26:02 GMT)

Message #10 received at 24530-done <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: 24530-done <at> debbugs.gnu.org, "Nelson H. F. Beebe" <beebe <at> math.utah.edu>
Subject: Re: bug#24530: tests: revamp multibyte-white-space test to be more permissive
Date: Sat, 24 Sep 2016 17:25:23 -0700
Pushed: http://git.sv.gnu.org/cgit/grep.git/commit/?id=7c4c69400c6ab




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 23 Oct 2016 11:24:03 GMT)

