GNU bug report logs - #24516


Package: grep

Reported by: Jim Meyering <jim <at> meyering.net>

Date: Fri, 23 Sep 2016 06:02:01 UTC

Severity: normal

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 24516 in the body.
You can then email your comments to 24516 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-grep <at> gnu.org:
bug#24516; Package grep. (Fri, 23 Sep 2016 06:02:02 GMT)

Acknowledgement sent to Jim Meyering <jim <at> meyering.net>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Fri, 23 Sep 2016 06:02:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: bug-grep <at> gnu.org, "Nelson H. F. Beebe" <beebe <at> math.utah.edu>
Date: Thu, 22 Sep 2016 23:00:38 -0700
Nelson Beebe reported that the backref-multibyte-slow test was
failing. His log showed max_seconds=1, and the "timeout 1s ..."
command that followed timed out.

I concluded that something (load on the underlying system, disk/kernel
latency) caused that final grep to take far more wall-clock time than
the preceding one. This would be expected if some other build/test were
running concurrently.

To work around that, I've written the attached: when the timeout would
be just 1s, increase it to 3. In coreutils tests (because there are so
many, and some were so abusive, and running them in parallel...), we
ended up increasing most timeouts to 10s, but for now, that seems like
it'd be overkill here.
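
A minimal sketch of that workaround, written in Python purely for
illustration (the actual patch is the attached diff against the shell
test and is not reproduced here). The function name effective_timeout
and the floor parameter are hypothetical; only the name max_seconds and
the values 1 and 3 come from the message above.

```python
def effective_timeout(max_seconds: int, floor: int = 3) -> int:
    """Return the timeout (in seconds) to hand to the timed command.

    A computed limit of 1s leaves no headroom on a loaded machine, so it
    is raised to `floor`. Treating values below 1 the same way is an
    assumption of this sketch, not something stated in the report.
    """
    if max_seconds <= 1:
        return floor
    return max_seconds
```

Larger computed limits pass through unchanged, so the adjustment only
affects fast machines where the reference run finished almost instantly.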

I have so far been unable to reproduce that failure, so the above is
just an educated guess. If someone can reproduce the failure and test
the patch, that'd be great.
[grep-unwarranted-failure-of-backref-multibyte-slow.diff (application/octet-stream, attachment)]

Information forwarded to bug-grep <at> gnu.org:
bug#24516; Package grep. (Fri, 23 Sep 2016 15:09:01 GMT)

Message #8 received at submit <at> debbugs.gnu.org:

From: "Nelson H. F. Beebe" <beebe <at> math.utah.edu>
To: Jim Meyering <jim <at> meyering.net>
Cc: bug-grep <at> gnu.org, "Nelson H. F. Beebe" <beebe <at> math.utah.edu>
Subject: Re:
Date: Fri, 23 Sep 2016 07:55:54 -0600
Thanks, Jim, for the patch to backref-multibyte-slow; I've applied it
to a new directory grep-2.25.92-f3e9.p1, and have builds in progress
now on the Fedora systems.

I think that any kind of wallclock-time-dependent test in a package
test suite is seriously prone to failure.  Although some developers
may have the luxury of a fast personal machine on which nothing else
is running when a package is built, many of us live in multiuser
environments where the load on the build machines is completely
unpredictable, and beyond our control.

Also, for someone like me with numerous VMs running on a single
desktop, parallel builds will certainly drive up the load.  With the
large number of test systems that I have, there is no reasonable way
to do the builds sequentially, because such a process could take hours
to days to complete.

My desktop has 16 cores (as far as Linux and the top utility are
concerned: it has a hyperthreaded 8-core 3GHz Xeon E5-1660v3 with 64GB
DDR-4 RAM).  Even during builds, it is rare for more than half the
cores to be more than 75% busy.  Here is a snapshot from top taken
during the Fedora grep builds:

    top - 07:46:43 up 71 days, 17:23,  7 users,  load average: 18.57, 18.60, 12.64
    Tasks: 1202 total,   3 running, 1187 sleeping,  11 stopped,   1 zombie
    %Cpu0  :  59.9/2.4    62[||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||                                      ]
    %Cpu1  :  18.9/3.2    22[||||||||||||||||||||||                                                                              ]
    %Cpu2  : 100.0/0.0   100[||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||]
    %Cpu3  :  28.0/5.0    33[|||||||||||||||||||||||||||||||||                                                                   ]
    %Cpu4  : 100.0/0.0   100[||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||]
    %Cpu5  :  28.4/6.0    34[||||||||||||||||||||||||||||||||||                                                                  ]
    %Cpu6  :  27.7/7.0    35[|||||||||||||||||||||||||||||||||||                                                                 ]
    %Cpu7  :  66.4/1.7    68[||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||                                ]
    %Cpu8  : 100.0/0.0   100[||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||]
    %Cpu9  :  64.5/1.0    66[||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||                                  ]
    %Cpu10 :  31.9/2.3    34[||||||||||||||||||||||||||||||||||                                                                  ]
    %Cpu11 :  28.2/4.1    32[||||||||||||||||||||||||||||||||                                                                    ]
    %Cpu12 :   2.4/14.0   16[||||||||||||||||                                                                                    ]
    %Cpu13 :  26.5/4.5    31[|||||||||||||||||||||||||||||||                                                                     ]
    %Cpu14 :  19.2/3.8    23[|||||||||||||||||||||||                                                                             ]
    %Cpu15 :  13.5/47.6   61[||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||                                      ]
    KiB Mem : 65698984 total,  2975420 free, 54425204 used,  8298360 buff/cache
    KiB Swap: 98304000 total, 80390640 free, 17913356 used.  9423384 avail Mem 

Even with a load of 18 on the 16 cores, my desktop remains as
responsive as ever, so the 60 VMs that are also running there don't
bother me at all.

By the time I finished this letter, eight builds on Fedora 23, 24, 25,
and Rawhide with both cc and c99 had completed; all passed the
validation suite.

-------------------------------------------------------------------------------
- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- University of Utah                    FAX: +1 801 581 4148                  -
- Department of Mathematics, 110 LCB    Internet e-mail: beebe <at> math.utah.edu  -
- 155 S 1400 E RM 233                       beebe <at> acm.org  beebe <at> computer.org -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------




Information forwarded to bug-grep <at> gnu.org:
bug#24516; Package grep. (Fri, 23 Sep 2016 15:18:01 GMT)

Message #11 received at submit <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: "Nelson H. F. Beebe" <beebe <at> math.utah.edu>
Cc: bug-grep <at> gnu.org
Subject: Re:
Date: Fri, 23 Sep 2016 08:16:52 -0700
On Fri, Sep 23, 2016 at 6:55 AM, Nelson H. F. Beebe <beebe <at> math.utah.edu> wrote:
> Thanks, Jim, for the patch to backref-multibyte-slow; I've applied it
> to a new directory grep-2.25.92-f3e9.p1, and have builds in progress
> now on the Fedora systems.
>
> I think that any kind of wallclock-time-dependent test in a package
> test suite is seriously prone to failure.

Writing portable perf-measuring tests is hard, indeed. I agree they're
prone to failure, *if* we're considering tests with naive absolute
limits. This however was a carefully written test to ensure that the
matcher no longer has O(N^2) behavior, using only relative duration
comparisons. The problem is that I don't want to make the test
duration unnecessarily long. If I didn't mind a longer-duration
initial test (to get a reference time), then there'd be no problem
requiring that the potentially much-longer-in-case-of-regression one
terminates in < 10x that duration.
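
The relative-comparison idea described above can be sketched as
follows. This is an illustrative Python outline, not the code of the
backref-multibyte-slow test: the helper names, the stand-in NOOP
command, and the default floor are assumptions, and the 10x factor is
the figure mentioned in the paragraph above.

```python
import math
import subprocess
import sys
import time

# Stand-in command for illustration; a real test would run the matcher.
NOOP = [sys.executable, "-c", "pass"]


def run_timed(cmd):
    """Run cmd to completion and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def passes_relative_limit(reference_cmd, scaled_cmd, factor=10, floor_seconds=3):
    """Check that scaled_cmd finishes within factor x the reference duration.

    The limit is derived from a measured reference run rather than a fixed
    absolute number, and is never allowed to drop below floor_seconds.
    """
    reference = run_timed(reference_cmd)
    limit = max(floor_seconds, math.ceil(factor * reference))
    try:
        subprocess.run(scaled_cmd, stdout=subprocess.DEVNULL, timeout=limit)
    except subprocess.TimeoutExpired:
        return False
    return True
```

The trade-off is the one named above: a longer reference run gives a
more stable limit but lengthens the test, while a very short one needs
a floor to absorb scheduling noise.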

Anyway, thanks for confirming that this simple fix is probably enough.

> Although some developers
> may have the luxury of a fast personal machine on which nothing else
> is running when a package is built, many of us live in multiuser

These tests are designed not just to work on my systems, but also on
bottom-end molasses-slow m68k simulators.

> environments where the load on the build machines is completely
> unpredictable, and beyond our control.
>
> Also, for someone like me with numerous VMs running on a single
> desktop, parallel builds will certainly drive up the load.  With the
> large number of test systems that I have, there is no reasonable way
> to do the builds sequentially, because such a process could take hours
> to days to complete.

Hah. I would never suggest that you or anyone else run builds/tests
sequentially. I have been militant about making things work in
parallel ever since Roland added job support to GNU make.




Reply sent to Jim Meyering <jim <at> meyering.net>:
You have taken responsibility. (Fri, 23 Sep 2016 18:40:02 GMT)

Notification sent to Jim Meyering <jim <at> meyering.net>:
bug acknowledged by developer. (Fri, 23 Sep 2016 18:40:02 GMT)

Message #16 received at 24516-done <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: "Nelson H. F. Beebe" <beebe <at> math.utah.edu>
Cc: 24516-done <at> debbugs.gnu.org
Subject: Re: bug#24516:
Date: Fri, 23 Sep 2016 11:39:12 -0700
On Fri, Sep 23, 2016 at 8:16 AM, Jim Meyering <jim <at> meyering.net> wrote:
> Anyway, thanks for confirming that this simple fix is probably enough.

Adjusted s/3/5/ to be a little more consistent with the above
default-to-5s case, and pushed.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 22 Oct 2016 11:24:03 GMT)



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.