GNU bug report logs - #22144: --exclude no longer works against arguments with a directory name
Report forwarded to bug-grep <at> gnu.org: bug#22144; Package grep. (Fri, 11 Dec 2015 18:38:02 GMT)
Acknowledgement sent to Vincent Lefevre <vincent <at> vinc17.net>: New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Fri, 11 Dec 2015 18:38:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
In grep 2.22, --exclude no longer works in some cases:
$ cd /usr/share/doc/grep
$ grep e --exclude README README
is OK, but not:
$ grep e --exclude README /usr/share/doc/grep/README
Copyright (C) 1992, 1997-2002, 2004-2015 Free Software Foundation, Inc.
[...]
This breaks at least one of my scripts, where --exclude is used to
exclude filenames generated with globbing.
After reverting to grep 2.21, this problem disappeared.
My Debian bug report:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=807641
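The difference between the two invocations above can be modeled outside grep. The sketch below is only an illustration: `matches_whole` and `matches_base` are hypothetical helper names, and shell `case` patterns stand in for grep's wildcard matching, which may differ in details.

```shell
#!/bin/sh
# Hypothetical helpers, not grep's code: shell 'case' patterns stand in
# for grep's wildcard matching.

# Succeed if GLOB ($1) matches the whole file name ($2) exactly as given.
matches_whole() {
  case $2 in
    $1) return 0 ;;
  esac
  return 1
}

# Succeed if GLOB ($1) matches the part of the name ($2) after the last '/'.
matches_base() {
  matches_whole "$1" "${2##*/}"
}

for name in README /usr/share/doc/grep/README; do
  if matches_whole README "$name"; then whole=match; else whole=no-match; fi
  if matches_base README "$name"; then base=match; else base=no-match; fi
  printf '%s: whole=%s base=%s\n' "$name" "$whole" "$base"
done
```

Under these assumptions, only base-name matching treats both spellings of the file alike, which is the behavior the report says its script relied on.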
--
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
Information forwarded to bug-grep <at> gnu.org: bug#22144; Package grep. (Fri, 11 Dec 2015 21:38:02 GMT)
Message #8 received at 22144 <at> debbugs.gnu.org:
The change in grep 2.22 is due to an earlier bug report:
http://bugs.gnu.org/21027
and was implemented by this patch:
http://git.savannah.gnu.org/cgit/grep.git/commit/?id=c5c70eae261133d71a9436557d998a48aaf0a5fe
Although I can see arguments either way, the grep 2.22 behavior is
consistent with grep 2.6 and earlier, so in some sense it's
more conservative.
Information forwarded to bug-grep <at> gnu.org: bug#22144; Package grep. (Sat, 12 Dec 2015 01:57:01 GMT)
Message #11 received at 22144 <at> debbugs.gnu.org:
On 2015-12-11 13:37:46 -0800, Paul Eggert wrote:
> The change in grep 2.22 is due to an earlier bug report:
>
> http://bugs.gnu.org/21027
This one was about --exclude-dir, whose description in grep 2.21 is
very unclear, and which was already broken anyway:
zira:~> grep -rl e --exclude-dir='usr*' /usr/include
zira:~> grep -rl e --exclude-dir='usr*' /usr/include/stdio.h
/usr/include/stdio.h
But for --exclude, the description is clear:
    --exclude=GLOB
           Skip files whose base name matches GLOB (using wildcard
                            ^^^^^^^^^
           matching). A file-name glob can use *, ?, and [...]
           as wildcards, and \ to quote a wildcard or backslash
           character literally.
"base name" means base name, not the full path!
I suggest that you revert the behavior for --exclude so that existing
scripts are not broken, and possibly add --exclude-path to match the
full path.
Now, --exclude was already broken:
$ grep -l . /usr/share/doc/grep/*
/usr/share/doc/grep/AUTHORS
/usr/share/doc/grep/NEWS.gz
/usr/share/doc/grep/README
/usr/share/doc/grep/THANKS.gz
/usr/share/doc/grep/TODO.gz
/usr/share/doc/grep/changelog.Debian.gz
/usr/share/doc/grep/changelog.gz
/usr/share/doc/grep/copyright
$ grep -l --exclude='*e*' . /usr/share/doc/grep/*
outputs nothing, while one should get:
/usr/share/doc/grep/AUTHORS
/usr/share/doc/grep/NEWS.gz
/usr/share/doc/grep/README
/usr/share/doc/grep/THANKS.gz
/usr/share/doc/grep/TODO.gz
/usr/share/doc/grep/copyright
excluding files whose base name contains a letter "e".
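One possible explanation for the empty output is that the glob is also tried against the whole argument: every argument contains the directory part "/usr/share/doc/grep/", which itself contains an "e". The sketch below contrasts the two interpretations. `survivors` is a hypothetical helper, and shell `case` patterns are an assumed stand-in for grep's wildcard matching, not its actual implementation.

```shell
#!/bin/sh
# Hypothetical filter, not grep's code: print the paths that survive an
# exclusion glob, matching either the whole path or only the base name.
# Usage: survivors whole|base GLOB PATH...
survivors() {
  mode=$1 glob=$2
  shift 2
  for path in "$@"; do
    case $mode in
      base) subject=${path##*/} ;;   # part after the last '/'
      *)    subject=$path ;;         # the argument exactly as given
    esac
    case $subject in
      $glob) ;;                      # excluded: print nothing
      *) printf '%s\n' "$path" ;;
    esac
  done
}

d=/usr/share/doc/grep
set -- "$d/AUTHORS" "$d/NEWS.gz" "$d/README" "$d/THANKS.gz" "$d/TODO.gz" \
       "$d/changelog.Debian.gz" "$d/changelog.gz" "$d/copyright"

echo "whole-path matching keeps:"
survivors whole '*e*' "$@"
echo "base-name matching keeps:"
survivors base '*e*' "$@"
```

With whole-path matching nothing survives, while base-name matching keeps the six files listed above. Such a filter could also be applied to a glob-generated list before calling grep, so that a script does not depend on how a particular grep version interprets --exclude; it assumes file names without newlines.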
And with grep 2.22, this is still inconsistent:
ypig:~> grep -l --exclude=AUTHORS . /usr/share/doc/grep/*
/usr/share/doc/grep/AUTHORS
/usr/share/doc/grep/NEWS.gz
/usr/share/doc/grep/README
/usr/share/doc/grep/THANKS.gz
/usr/share/doc/grep/TODO.gz
/usr/share/doc/grep/changelog.Debian.gz
/usr/share/doc/grep/changelog.gz
/usr/share/doc/grep/copyright
ypig:~> grep -rl --exclude=AUTHORS . /usr/share/doc/grep
/usr/share/doc/grep/TODO.gz
/usr/share/doc/grep/changelog.gz
/usr/share/doc/grep/changelog.Debian.gz
/usr/share/doc/grep/THANKS.gz
/usr/share/doc/grep/NEWS.gz
/usr/share/doc/grep/copyright
/usr/share/doc/grep/README
--
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
Information forwarded to bug-grep <at> gnu.org: bug#22144; Package grep. (Sat, 12 Dec 2015 03:32:02 GMT)
Message #14 received at 22144 <at> debbugs.gnu.org:
On 12/11/2015 05:56 PM, Vincent Lefevre wrote:
> But for --exclude, the description is clear:
The description changed in grep 2.22, to match the 2.22 (also,
2.6-and-earlier) behavior.
As you say, the 2.22 behavior does not seem ideal. However, the 2.7
through 2.21 behavior wasn't ideal either. It's not clear which behavior
is better overall, nor is it clear whether we could unify parts of the
two behaviors to get the best of both worlds. I'd rather not add yet
another option in this area, if that can be avoided.
Information forwarded to bug-grep <at> gnu.org: bug#22144; Package grep. (Tue, 15 Dec 2015 09:00:02 GMT)
Message #17 received at 22144 <at> debbugs.gnu.org:
On 2015-12-11 19:31:46 -0800, Paul Eggert wrote:
> On 12/11/2015 05:56 PM, Vincent Lefevre wrote:
> > But for --exclude, the description is clear:
>
> The description changed in grep 2.22, to match the 2.22 (also,
> 2.6-and-earlier) behavior.
My quote was from the grep 2.22 description (grep 2.22-1 Debian
package). It seems that this has changed later:
http://www.gnu.org/software/grep/manual/grep.html
which is different:
Skip files whose name matches the pattern glob, using wildcard
matching. When searching recursively, skip any subfile whose base
name matches glob; the base name is the part after the last ‘/’.
A pattern can use ‘*’, ‘?’, and ‘[’...‘]’ as wildcards, and \ to
quote a wildcard or backslash character literally.
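One reading of the manual text quoted above can be written down as a small model, taking "name" to mean the argument string exactly as typed. That reading is an assumption, as is the helper name `is_excluded`; shell `case` patterns stand in for grep's wildcard matching.

```shell
#!/bin/sh
# Hypothetical model of the quoted manual text, not grep's code.
# Usage: is_excluded arg|subfile GLOB NAME
#   arg     - NAME was given on the command line: match the whole name
#   subfile - NAME was found by a recursive search: match the base name
is_excluded() {
  case $1 in
    subfile) subject=${3##*/} ;;
    *)       subject=$3 ;;
  esac
  case $subject in
    $2) return 0 ;;
  esac
  return 1
}

f=/usr/share/doc/grep/AUTHORS
for origin in arg subfile; do
  if is_excluded "$origin" AUTHORS "$f"; then
    echo "$origin: excluded"
  else
    echo "$origin: searched"
  fi
done
```

In this model the same file is searched when named on the command line but skipped when reached through a recursive search, which mirrors the two grep 2.22 transcripts earlier in the thread.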
The documentation is still ambiguous. For the "main case", is this
the canonical name as returned by realpath?
> As you say, the 2.22 behavior does not seem ideal.
Treating the subfiles of a recursive search differently makes this
even worse!
--
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
Information forwarded to bug-grep <at> gnu.org: bug#22144; Package grep. (Tue, 15 Dec 2015 23:28:01 GMT)
Message #20 received at 22144 <at> debbugs.gnu.org:
Vincent Lefevre wrote:
> For the "main case", is this
> the canonical name as returned by realpath?
I don't see why. grep doesn't need to compute anything's realpath.
Information forwarded to bug-grep <at> gnu.org: bug#22144; Package grep. (Wed, 16 Dec 2015 01:01:02 GMT)
Message #23 received at 22144 <at> debbugs.gnu.org:
On 2015-12-15 15:27:27 -0800, Paul Eggert wrote:
> Vincent Lefevre wrote:
> >For the "main case", is this
> >the canonical name as returned by realpath?
>
> I don't see why. grep doesn't need to compute anything's realpath.
How is the file name defined, then?
--
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
Information forwarded to bug-grep <at> gnu.org: bug#22144; Package grep. (Wed, 16 Dec 2015 06:25:01 GMT)
Message #26 received at 22144 <at> debbugs.gnu.org:
Vincent Lefevre wrote:
> How is the file name defined, then?
It's built as a string, which is passed to 'open' without worrying about realpath.
By the way, in case it's not already clear: I agree with you that the current
behavior is not good. It's just that we can't simply revert (as that was also
not good); we need to make it better.
Information forwarded to bug-grep <at> gnu.org: bug#22144; Package grep. (Wed, 16 Dec 2015 10:27:02 GMT)
Message #29 received at 22144 <at> debbugs.gnu.org:
On 2015-12-15 22:24:25 -0800, Paul Eggert wrote:
> Vincent Lefevre wrote:
> >How is the file name defined, then?
>
> It's built as a string, which is passed to 'open' without worrying
> about realpath.
So, for instance, "foo" and "./foo" are regarded as different?
How files are determined to be the same needs to be clarified.
For instance, some utilities consider the device & inode numbers
(which have their own problems with broken FS implementations).
That's the case of cp (if this has not changed since 2004). That's
also the case of diff (if this has not changed since 2005), but you
know that. :)
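The two notions of "same file" discussed here can be contrasted in a few lines. This is a sketch under assumptions: `compare_names` is a hypothetical helper, and it relies on the `-ef` operator of `test`, which is supported by common shells such as bash but may be missing from a strictly minimal `sh`.

```shell
#!/bin/sh
# Contrast two notions of "the same file": identical name strings versus
# identical device and inode numbers.
# Usage: compare_names NAME1 NAME2
compare_names() {
  if [ "$1" = "$2" ]; then
    echo "strings: same"
  else
    echo "strings: different"
  fi
  # -ef is true when both names resolve to the same device and inode.
  if [ "$1" -ef "$2" ]; then
    echo "files: same"
  else
    echo "files: different"
  fi
}

tmpdir=$(mktemp -d) || exit 1
(
  cd "$tmpdir" || exit 1
  : > foo                 # create an empty file named foo
  compare_names foo ./foo
)
rm -rf "$tmpdir"
```

For "foo" and "./foo" the strings differ while the device and inode are the same, so a tool that compares only name strings and a tool that compares inodes will disagree about them.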
--
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
Information forwarded to bug-grep <at> gnu.org: bug#22144; Package grep. (Wed, 16 Dec 2015 23:13:01 GMT)
Message #32 received at 22144 <at> debbugs.gnu.org:
Vincent Lefevre wrote:
> So, for instance, "foo" and "./foo" are regarded as different?
Yes. Grep does not need to worry about inodes or realpath or anything like that,
so it doesn't.
Information forwarded to bug-grep <at> gnu.org: bug#22144; Package grep. (Mon, 28 Dec 2015 09:10:02 GMT)
Message #35 received at 22144 <at> debbugs.gnu.org:
Vincent Lefevre wrote:
> The documentation is still ambiguous. For the "main case", is this
> the canonical name as returned by realpath?
>
>> >As you say, the 2.22 behavior does not seem ideal.
> Treating the subfiles of a recursive search differently makes this
> even worse!
Please try the attached patch, which I've installed into the savannah
repository. It attempts to fix the behavior, and to clarify the documentation.
It is a tricky area. Hope this helps.
[0001-grep-exclude-matches-trailing-parts-of-args.patch (text/x-diff, attachment)]
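The patch itself is not reproduced in this log. Going only by the attachment's name, one plausible reading is that a command-line argument is excluded when the glob matches either the whole argument or a trailing part of it that begins right after a "/". The sketch below models that reading. It is a guess at the intent, not the patch's code; `matches_name_suffix` is a hypothetical helper, and in shell `case` patterns a `*` also matches "/", which may differ from grep.

```shell
#!/bin/sh
# Hypothetical model of "exclude matches trailing parts of args";
# not taken from the patch.
# Usage: matches_name_suffix GLOB NAME
matches_name_suffix() {
  glob=$1
  rest=$2
  # The whole name is always a candidate.
  case $rest in
    $glob) return 0 ;;
  esac
  while :; do
    case $rest in
      */*) rest=${rest#*/} ;;   # drop the leading component and its '/'
      *)   return 1 ;;          # no '/' left: nothing more to try
    esac
    case $rest in
      '' | /*) continue ;;      # a trailing part must not be empty or start with '/'
      $glob) return 0 ;;
    esac
  done
}

for glob in README grep/README doc/README; do
  if matches_name_suffix "$glob" /usr/share/doc/grep/README; then
    echo "$glob: excluded"
  else
    echo "$glob: searched"
  fi
done
```

Under this model both `README` and `grep/README` exclude /usr/share/doc/grep/README, while `doc/README` does not, because no trailing part of the argument equals it.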
Information forwarded to bug-grep <at> gnu.org: bug#22144; Package grep. (Wed, 30 Dec 2015 14:25:02 GMT)
Message #38 received at 22144 <at> debbugs.gnu.org:
On 2015-12-28 01:09:40 -0800, Paul Eggert wrote:
> Vincent Lefevre wrote:
> > The documentation is still ambiguous. For the "main case", is this
> > the canonical name as returned by realpath?
> >
> > > >As you say, the 2.22 behavior does not seem ideal.
> > Treating the subfiles of a recursive search differently makes this
> > even worse!
>
> Please try the attached patch, which I've installed into the savannah
> repository. It attempts to fix the behavior, and to clarify the
> documentation. It is a tricky area. Hope this helps.
I've done various tests, and it seems fine. Thanks.
--
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>: You have taken responsibility. (Thu, 31 Dec 2015 06:38:02 GMT)
Notification sent to Vincent Lefevre <vincent <at> vinc17.net>: bug acknowledged by developer. (Thu, 31 Dec 2015 06:38:02 GMT)
Message #43 received at 22144-done <at> debbugs.gnu.org:
Vincent Lefevre wrote:
> I've done various tests, and it seems fine. Thanks.
You're welcome; closing the bug report.
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 28 Jan 2016 12:24:04 GMT)
This bug report was last modified 8 years and 111 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.