GNU bug report logs - #19899
deleting lines of a file with sed - unexpected behaviour

Package: sed;

Reported by: Ethan Kaufman <ethan.kaufman <at> gmail.com>

Date: Thu, 19 Feb 2015 00:59:02 UTC

Severity: normal

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.

Report forwarded to bug-sed <at> gnu.org:
bug#19899; Package sed. (Thu, 19 Feb 2015 00:59:03 GMT)

Acknowledgement sent to Ethan Kaufman <ethan.kaufman <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-sed <at> gnu.org. (Thu, 19 Feb 2015 00:59:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Ethan Kaufman <ethan.kaufman <at> gmail.com>
To: bug-sed <at> gnu.org
Subject: deleting lines of a file with sed - unexpected behaviour
Date: Wed, 18 Feb 2015 19:56:58 -0500
To whom it may concern,

I noticed something odd while fooling around with sed.  If you try to
remove multiple line intervals (by number) from a file, and an interval
specified later in the list is a subset of an interval specified earlier,
then one additional line is removed after the earlier (larger) interval.

seq 10 > foo.txt

sed '2,7d;3,6d' foo.txt
1
9
10

Expected output is:

1
8
9
10

Additional tests:

For each additional redundant interval, another line is removed:
sed '2,7d;3,6d;4,5d' foo.txt
1
10

Reversing the order of the intervals produces the expected result!
sed '3,6d;2,7d' foo.txt
1
8
9
10

Specifying the intervals with '-e' produces the same result:
sed -e '2,7d' -e '3,6d' foo.txt
1
9
10

Using different interval syntax has mixed results:
sed -e '/2/,/7/d' -e '/3/,/6/d' foo.txt
1
8
9
10
sed -e '2,7d' -e '/3/,/6/d' foo.txt
1
8
9
10
sed -e '/2/,/7/d' -e '3,6d' foo.txt
1
9
10

Trailing list must be a subset for the additional line to be removed:

sed '2,5d;1,5d'
1
8
9
10
sed '2,5d;2,6d'
1
8
9
10
sed '2,5d;2,5d'
1
9
10

Versions:

Breakage appears to have occurred in the 4.1 release.  The expected
output is produced for all cases by GNU sed 3.02 and 4.0.9 (as well as by
BSD sed on Mac OS X 10.10 Yosemite and /bin/sed on Solaris), but not by
4.1.5 and 4.2.1.

This issue and the information above have been discussed on Stack Overflow:
stackoverflow.com/questions/28595574/deleting-lines-of-a-file-with-sed-unexpected-behaviour

Cheers,
Ethan
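
The result expected in the report above (the input minus every listed
interval) can be computed without sed, which gives a reference to compare
against.  What follows is a minimal sketch, not part of the original
report: drop_intervals is a hypothetical helper name, and it assumes the
intervals are given as FIRST LAST pairs of line numbers.

```shell
#!/bin/sh
# Reference filter that does not use sed: print the input lines whose
# line number falls outside every FIRST LAST pair given as arguments.
# drop_intervals is a hypothetical helper name used for illustration.
drop_intervals() {
    awk -v pairs="$*" '
        BEGIN { n = split(pairs, p, " ") }
        {
            keep = 1
            # Walk the pairs: p[1],p[2] is the first interval, and so on.
            for (i = 1; i < n; i += 2)
                if (NR >= p[i] + 0 && NR <= p[i + 1] + 0)
                    keep = 0
            if (keep) print
        }'
}

# Same intervals as sed '2,7d;3,6d'; prints 1, 8, 9 and 10, one per line.
seq 10 | drop_intervals 2 7 3 6
```

Because the filter tests each line number against every interval
independently, the order and nesting of the intervals cannot change its
output.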

Information forwarded to bug-sed <at> gnu.org:
bug#19899; Package sed. (Thu, 19 Feb 2015 17:15:02 GMT)

Message #8 received at 19899 <at> debbugs.gnu.org:

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: Ethan Kaufman <ethan.kaufman <at> gmail.com>
Cc: 19899 <at> debbugs.gnu.org
Subject: Re: bug#19899: deleting lines of a file with sed - unexpected
 behaviour
Date: Fri, 20 Feb 2015 02:13:58 +0900
Hi Ethan,

I agree that this behavior is a bug.  I tested the following command with
sed-4.2.2, an older GNU sed, sed on Solaris 11 and sed on HP-UX; only
sed-4.2.2 fails to print '8'.

  $ sed '2,7d;3,6d' foo.txt

When the line-number address ranges of two editing commands overlap, sed
always executes the second editing command for at least one line, even if
that line lies outside the second command's address range.  For example,
with '2,7d;3,6d', sed executes the second command on line 8, which it
should not.

Thanks,
Norihiro
[0001-sed-fix-mishandle-with-overwrapped-line-number-addre.patch (text/plain, attachment)]
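
The behavior described in the message above can be illustrated with a toy
model.  This is a sketch under stated assumptions, not sed's source code
and not the attached patch: model_delete and its state names are
hypothetical, it handles only numeric ranges with the d command, and it
assumes that a range whose start line was never seen fires once when a
later line is already past its end (buggy mode) or does not fire there at
all (fixed mode).

```shell
#!/bin/sh
# Toy model (NOT sed's source) of numeric 'FIRST,LASTd' commands.
# usage: model_delete BUGGY COUNT FIRST1 LAST1 [FIRST2 LAST2 ...]
#   BUGGY=1 models the reported behavior, BUGGY=0 the expected one.
model_delete() {
    buggy=$1; count=$2; shift 2
    awk -v buggy="$buggy" -v count="$count" -v pairs="$*" '
    BEGIN {
        cmds = split(pairs, p, " ") / 2
        for (c = 1; c <= cmds; c++) {
            first[c] = p[2 * c - 1] + 0
            last[c] = p[2 * c] + 0
            state[c] = "inactive"
        }
        for (line = 1; line <= count; line++) {
            deleted = 0
            # A deleted line ends the cycle, so later commands never see it.
            for (c = 1; c <= cmds && !deleted; c++) {
                if (state[c] == "inactive" && line >= first[c]) {
                    if (line > last[c]) {
                        # Already past the end of the range: close it.
                        state[c] = "closed"
                        deleted = buggy ? 1 : (line == first[c])
                    } else {
                        state[c] = (line == last[c]) ? "closed" : "active"
                        deleted = 1
                    }
                } else if (state[c] == "active") {
                    if (line >= last[c]) state[c] = "closed"
                    deleted = (line <= last[c])
                }
            }
            if (!deleted) print line
        }
    }'
}

model_delete 1 10 2 7 3 6   # buggy mode: prints 1, 9, 10
model_delete 0 10 2 7 3 6   # fixed mode: prints 1, 8, 9, 10
```

In buggy mode the second command is still inactive when line 8 arrives,
because lines 3 to 7 were deleted by the first command before the second
one could see them; the model then lets it fire once, which is the extra
deletion from the report.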

Information forwarded to bug-sed <at> gnu.org:
bug#19899; Package sed. (Thu, 19 Feb 2015 17:21:02 GMT)

Message #11 received at 19899 <at> debbugs.gnu.org:

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: Ethan Kaufman <ethan.kaufman <at> gmail.com>
Cc: 19899 <at> debbugs.gnu.org
Subject: Re: bug#19899: deleting lines of a file with sed - unexpected
 behaviour
Date: Fri, 20 Feb 2015 02:20:37 +0900
Hi,

Sorry, I forgot to 'git add' the new test.  I am resending a fixed version
of the patch.

Thanks,
Norihiro
[0001-sed-fix-mishandle-with-overwrapped-line-number-addre.patch (text/plain, attachment)]

Information forwarded to bug-sed <at> gnu.org:
bug#19899; Package sed. (Tue, 05 May 2015 01:12:02 GMT)

Message #14 received at 19899 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: 19899 <at> debbugs.gnu.org, Ethan Kaufman <ethan.kaufman <at> gmail.com>, 
 Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Subject: Re: sed: fix mishandling of overlapping address ranges
Date: Mon, 4 May 2015 18:11:00 -0700
[resending, to @debbugs.gnu.org, not @bugs.gnu.org]

Thank you Ethan for the report, and Norihiro for the patch.
I've made adjustments to the patch, primarily to use the
init.sh-based style of test case (permitting to add just one
file for each test case, rather than 3 or more) and rewriting
the commit log text and NEWS entry.

Norihiro, please sanity-check before I push this.
[0001-sed-fix-mishandling-of-overlapping-address-ranges.patch (application/octet-stream, attachment)]
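
The idea of one file per test case can be sketched as a standalone shell
script.  This is only an illustration and not the test from the attached
patch: it does not source init.sh, and run_case, the exp and out file
names, and the choice of cases (taken from the report) are assumptions.

```shell
#!/bin/sh
# Standalone sketch of a one-file regression test; it does NOT source
# init.sh.  run_case and the exp/out file names are hypothetical.
fail=0
tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT

# run_case SCRIPT EXPECTED: feed the lines 1..10 to sed SCRIPT and
# compare the result with EXPECTED.
run_case() {
    printf '%s\n' "$2" > "$tmpdir/exp"
    seq 10 | sed "$1" > "$tmpdir/out"
    if cmp -s "$tmpdir/exp" "$tmpdir/out"; then
        echo "PASS: sed '$1'"
    else
        echo "FAIL: sed '$1'"
        fail=1
    fi
}

nl='
'
want="1${nl}8${nl}9${nl}10"

# Nested and reordered intervals from the report; all should remove
# exactly lines 2 through 7.
run_case '2,7d;3,6d' "$want"
run_case '3,6d;2,7d' "$want"
run_case '2,7d;3,6d;4,5d' "$want"

echo "fail=$fail"
```

On a sed that has the bug, the first and third cases would be expected to
report FAIL, while the reordered case passes either way.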

Information forwarded to bug-sed <at> gnu.org:
bug#19899; Package sed. (Tue, 05 May 2015 23:41:02 GMT)

Message #17 received at 19899 <at> debbugs.gnu.org:

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: Jim Meyering <jim <at> meyering.net>
Cc: Ethan Kaufman <ethan.kaufman <at> gmail.com>, 19899 <at> debbugs.gnu.org
Subject: Re: bug#19899: sed: fix mishandling of overlapping address ranges
Date: Wed, 06 May 2015 08:39:56 +0900
On Mon, 4 May 2015 18:11:00 -0700
Jim Meyering <jim <at> meyering.net> wrote:

> [resending, to @debbugs.gnu.org, not @bugs.gnu.org]
> 
> Thank you Ethan for the report, and Norihiro for the patch.
> I've made adjustments to the patch, primarily to use the
> init.sh-based style of test case (permitting to add just one
> file for each test case, rather than 3 or more) and rewriting
> the commit log text and NEWS entry.
> 
> Norihiro, please sanity-check before I push this.

Thanks for the review and adjustments.  I checked them and found nothing
missing.  The test also works as expected.





Reply sent to Jim Meyering <jim <at> meyering.net>:
You have taken responsibility. (Thu, 07 May 2015 02:22:02 GMT)

Notification sent to Ethan Kaufman <ethan.kaufman <at> gmail.com>:
bug acknowledged by developer. (Thu, 07 May 2015 02:22:02 GMT)

Message #22 received at 19899-done <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: 19899-done <at> debbugs.gnu.org, Ethan Kaufman <ethan.kaufman <at> gmail.com>
Subject: Re: bug#19899: sed: fix mishandling of overlapping address ranges
Date: Wed, 6 May 2015 19:20:47 -0700
On Tue, May 5, 2015 at 4:39 PM, Norihiro Tanaka <noritnk <at> kcn.ne.jp> wrote:
> On Mon, 4 May 2015 18:11:00 -0700
> Jim Meyering <jim <at> meyering.net> wrote:
>
>> [resending, to @debbugs.gnu.org, not @bugs.gnu.org]
>>
>> Thank you Ethan for the report, and Norihiro for the patch.
>> I've made adjustments to the patch, primarily to use the
>> init.sh-based style of test case (permitting to add just one
>> file for each test case, rather than 3 or more) and rewriting
>> the commit log text and NEWS entry.
>>
>> Norihiro, please sanity-check before I push this.
>
> Thanks for the review and adjustments.  I checked them and found nothing
> missing.  The test also works as expected.

Thanks. Pushed.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 04 Jun 2015 11:24:05 GMT)
