GNU bug report logs - #19842
sed bug: using -e instead of a literal newline in s replacement fails

Package: sed

Reported by: Evan Gates <evan.gates <at> gmail.com>

Date: Thu, 12 Feb 2015 01:52:01 UTC

Severity: wishlist

Tags: moreinfo, notabug

To reply to this bug, email your comments to 19842 AT debbugs.gnu.org.


Report forwarded to bug-sed <at> gnu.org:
bug#19842; Package sed. (Thu, 12 Feb 2015 01:52:02 GMT)

Acknowledgement sent to Evan Gates <evan.gates <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-sed <at> gnu.org. (Thu, 12 Feb 2015 01:52:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Evan Gates <evan.gates <at> gmail.com>
To: bug-sed <at> gnu.org
Subject: sed bug: using -e instead of a literal newline in s replacement fails
Date: Wed, 11 Feb 2015 17:20:24 -0800
Hello,

sed 's/foo/bar\
baz/'

works as expected. But using multiple -e instead of a literal newline fails:

$ sed -e 's/foo/bar\' -e baz/
sed: -e expression #1, char 10: unterminated `s' command

The following is from POSIX[0]:

If any -e or -f options are specified, the script of editing commands
shall initially be empty. The commands specified by each -e or -f
option shall be added to the script in the order specified. When each
addition is made, if the previous addition (if any) was from a -e
option, a <newline> shall be inserted before the new addition. The
resulting script shall have the same properties as the script operand,
described in the OPERANDS section.

My reading of that leads me to believe that the two commands should
create identical scripts, but GNU sed seems to be interpreting the
script before the addition of the newline and second -e's argument.
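
A minimal sketch of that reading, assuming only a POSIX shell: it assembles
the script text the way the quoted paragraph describes (fragments joined by
a <newline>) and compares it with the single-argument form. The variable
names are illustrative, and the check says nothing about how any particular
sed parses its -e options; it only shows that the two script texts match.

```shell
# Script given as one argument with a backslash-escaped literal newline.
single='s/foo/bar\
baz/'

# Script assembled as the POSIX text describes: each -e fragment is
# appended, with a <newline> inserted between consecutive fragments.
# printf's %s does not interpret the backslash in the argument.
joined=$(printf '%s\n' 's/foo/bar\' 'baz/')

# Command substitution strips the trailing newline printf emits, so the
# two strings can be compared directly.
if [ "$single" = "$joined" ]; then
    same=yes
else
    same=no
fi
printf 'identical script text: %s\n' "$same"
```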

Is this a bug or desired behavior?

Thanks,
Evan

$ sed --version
sed (GNU sed) 4.2.2
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Jay Fenlason, Tom Lord, Ken Pizzini,
and Paolo Bonzini.
GNU sed home page: <http://www.gnu.org/software/sed/>.
General help using GNU software: <http://www.gnu.org/gethelp/>.
E-mail bug reports to: <bug-sed <at> gnu.org>.
Be sure to include the word ``sed'' somewhere in the ``Subject:'' field.

[0] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html#tag_20_116_04




Information forwarded to bug-sed <at> gnu.org:
bug#19842; Package sed. (Tue, 17 Feb 2015 23:49:01 GMT)

Message #8 received at 19842 <at> debbugs.gnu.org:

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: Evan Gates <evan.gates <at> gmail.com>
Cc: 19842 <at> debbugs.gnu.org
Subject: Re: bug#19842: sed bug: using -e instead of a literal newline in s
 replacement fails
Date: Wed, 18 Feb 2015 08:48:50 +0900
On Wed, 11 Feb 2015 17:20:24 -0800
Evan Gates <evan.gates <at> gmail.com> wrote:

> Hello,
> 
> sed 's/foo/bar\
> baz/'
> 
> works as expected. But using multiple -e instead of a literal newline fails:
> 
> $ sed -e 's/foo/bar\' -e baz/
> sed: -e expression #1, char 10: unterminated `s' command
> 
> The following is from POSIX[0]:
> 
> If any -e or -f options are specified, the script of editing commands
> shall initially be empty. The commands specified by each -e or -f
> option shall be added to the script in the order specified. When each
> addition is made, if the previous addition (if any) was from a -e
> option, a <newline> shall be inserted before the new addition. The
> resulting script shall have the same properties as the script operand,
> described in the OPERANDS section.
> 
> My reading of that leads me to believe that the two commands should
> create identical scripts, but GNU sed seems to be interpreting the
> script before the addition of the newline and second -e's argument.
> 
> Is this a bug or desired behavior?
> 
> Thanks,
> Evan
> 
> $ sed --version
> sed (GNU sed) 4.2.2
> Copyright (C) 2012 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
> 
> Written by Jay Fenlason, Tom Lord, Ken Pizzini,
> and Paolo Bonzini.
> GNU sed home page: <http://www.gnu.org/software/sed/>.
> General help using GNU software: <http://www.gnu.org/gethelp/>.
> E-mail bug reports to: <bug-sed <at> gnu.org>.
> Be sure to include the word ``sed'' somewhere in the ``Subject:'' field.
> 
> [0] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html#tag_20_116_04

Hi,

I interpret the following as meaning that multiple `-e' options do not
merge fragments of two commands.

  # If any -e or -f options are specified, the script of editing commands
  # shall initially be empty.

i.e. the command given by the first -e option is parsed as it stands,
without being completed, and the buffer is reset to empty before the
next -e option.

Thanks,
Norihiro





Information forwarded to bug-sed <at> gnu.org:
bug#19842; Package sed. (Wed, 18 Feb 2015 00:25:01 GMT)

Message #11 received at 19842 <at> debbugs.gnu.org:

From: Evan Gates <evan.gates <at> gmail.com>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: 19842 <at> debbugs.gnu.org
Subject: Re: bug#19842: sed bug: using -e instead of a literal newline in s
 replacement fails
Date: Tue, 17 Feb 2015 16:23:54 -0800
On Tue, Feb 17, 2015 at 3:48 PM, Norihiro Tanaka <noritnk <at> kcn.ne.jp> wrote:
> Hi,
>
> I interpret the following as meaning that multiple `-e' options do not
> merge fragments of two commands.
>
>   # If any -e or -f options are specified, the script of editing commands
>   # shall initially be empty.
>
> i.e. the command given by the first -e option is parsed as it stands,
> without being completed, and the buffer is reset to empty before the
> next -e option.
>
> Thanks,
> Norihiro
>

Hi Norihiro,

Thanks for replying. I respectfully disagree with your interpretation.

The -e and -f options talk about "the script of editing commands" as
the entire script/program that will run once sed starts reading input.
Both -e and -f add commands to "the end of the script of editing
commands." The line you quoted uses the exact same phrase:

1) -e  script
Add the editing commands specified by the script option-argument to
the end of the script of editing commands.
2) -f  script_file
Add the editing commands in the file script_file to the end of the
script of editing commands.
3) If any -e or -f options are specified, the script of editing
commands shall initially be empty.

I posit that "the script of editing commands" means the same thing in
all three places. Therefore (3) means that the script/program that sed
will run is empty before the first -e or -f.

Your interpretation would cause "the script of editing commands" to
mean something different in (3) than it means in (1) and (2).

Thank you,
Evan




Information forwarded to bug-sed <at> gnu.org:
bug#19842; Package sed. (Wed, 18 Feb 2015 13:25:02 GMT)

Message #14 received at 19842 <at> debbugs.gnu.org:

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: Evan Gates <evan.gates <at> gmail.com>
Cc: 19842 <at> debbugs.gnu.org
Subject: Re: bug#19842: sed bug: using -e instead of a literal newline in s
 replacement fails
Date: Wed, 18 Feb 2015 22:24:46 +0900
On Tue, 17 Feb 2015 16:23:54 -0800
Evan Gates <evan.gates <at> gmail.com> wrote:

> Hi Norihiro,
> 
> Thanks for replying. I respectfully disagree with your interpretation.
> 
> The -e and -f options talk about "the script of editing commands" as
> the entire script/program that will run once sed starts reading input.
> Both -e and -f add commands to "the end of the script of editing
> commands." The line you quoted uses the exact same phrase:
> 
> 1) -e  script
> Add the editing commands specified by the script option-argument to
> the end of the script of editing commands.
> 2) -f  script_file
> Add the editing commands in the file script_file to the end of the
> script of editing commands.
> 3) If any -e or -f options are specified, the script of editing
> commands shall initially be empty.
> 
> I posit that "the script of editing commands" means the same thing in
> all three places. Therefore (3) means that the script/program that sed
> will run is empty before the first -e or -f.
> 
> Your interpretation would cause "the script of editing commands" to
> mean something different in (3) than it means in (1) and (2).
> 
> Thank you,
> Evan

Hi Evan,

Sorry, I was wrong.  However, it is not written anywhere that multiple
-e and/or -f options must be analyzed after concatenation.  They may be
concatenated after each -e and/or -f option has been parsed.

By the way, /usr/bin/sed and /usr/xpg4/bin/sed on Solaris 11 and
/usr/bin/sed on HP-UX 11.23 also behave the same as GNU sed.  In other
words, they also return an error code for the following.

  $ sed -e 's/foo/bar\' -e baz/

Thanks,
Norihiro





Information forwarded to bug-sed <at> gnu.org:
bug#19842; Package sed. (Wed, 18 Feb 2015 18:55:01 GMT)

Message #17 received at 19842 <at> debbugs.gnu.org:

From: Evan Gates <evan.gates <at> gmail.com>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: 19842 <at> debbugs.gnu.org
Subject: Re: bug#19842: sed bug: using -e instead of a literal newline in s
 replacement fails
Date: Wed, 18 Feb 2015 10:54:32 -0800
On Wed, Feb 18, 2015 at 5:24 AM, Norihiro Tanaka <noritnk <at> kcn.ne.jp> wrote:
> Hi Evan,
>
> Sorry, I was wrong.  However, it is not written anywhere that multiple
> -e and/or -f options must be analyzed after concatenation.  They may be
> concatenated after each -e and/or -f option has been parsed.
>
> By the way, /usr/bin/sed and /usr/xpg4/bin/sed on Solaris 11 and
> /usr/bin/sed on HP-UX 11.23 also behave the same as GNU sed.  In other
> words, they also return an error code for the following.
>
>   $ sed -e 's/foo/bar\' -e baz/
>
> Thanks,
> Norihiro
>

Hi Norihiro,

The POSIX description of sed doesn't make any assumptions about the
inner workings of sed. It says nothing about analyzing, compiling, or
building internal structures; it only talks about adding to "the
script of editing commands" which will then be run. "The script of
editing commands" refers to a representation of the editing commands
in accord with the descriptions that follow on the page, i.e. text
commands, a sed script. This is supported by the following line:

When each addition is made, if the previous addition (if any) was from
a -e option, a <newline> shall be inserted before the new addition.

The newline is mentioned because it is referring to the text
representation of the script (the only representation mentioned on the
page). It explains that all the parts from -e and -f should in effect
be concatenated to a single sed script, with newlines after the -e
option arguments. e.g.:

sed -e 's/foo/bar\' -e 'baz/'

and

printf %s\\n 's/foo/bar\' 'baz/' > script
sed -f script

should be identical, due to that newline.

Thanks,
Evan




Information forwarded to bug-sed <at> gnu.org:
bug#19842; Package sed. (Wed, 22 Jul 2015 13:58:01 GMT)

Message #20 received at 19842 <at> debbugs.gnu.org:

From: Hans-Bernhard Bröker <HBBroeker <at> t-online.de>
To: 19842 <at> debbugs.gnu.org
Subject: sed bug: using -e instead of a literal newline in s replacement fails
Date: Wed, 22 Jul 2015 15:56:43 +0200
> The following is from POSIX[0]:

> If any -e or -f options are specified, the script of editing commands
> shall initially be empty. The commands specified by each -e or -f
> option shall be added to the script in the order specified.

I think the solution to this mystery might be that the above statement 
is a good deal more strict than people have taken it.  It speaks of 
"commands specified by each -e".  Well, the example case's -e options 
_do_not_ each specify commands.  They each only specify part of one command.

So as I read this, this report is invalid by way of its expectations not 
being backed by the POSIX specification.




Information forwarded to bug-sed <at> gnu.org:
bug#19842; Package sed. (Wed, 25 Jan 2017 04:07:02 GMT)

Message #23 received at 19842 <at> debbugs.gnu.org:

From: Assaf Gordon <assafgordon <at> gmail.com>
To: 19842 <at> debbugs.gnu.org
Subject: re: sed bug: using -e instead of a literal newline in s replacement
 fails
Date: Tue, 24 Jan 2017 23:05:58 -0500
tag 19842 notabug
close 19842
stop


Hello Evan and all,

I'm triaging old sed bugs.
(for past discussion, see https://bugs.gnu.org/19842 ).


First,
for completeness, this is the behavior with various sed implementations:

$ echo foo | sed-gnu-4.3 -e 's/foo/bar\' -e 'baz/'
sed: -e expression #1, char 10: unterminated `s' command

$ echo foo | sed-netbsd-7.0 -e 's/foo/bar\' -e 'baz/'
bar
baz

$ echo foo | sed-freebsd-10 -e 's/foo/bar\' -e 'baz/'
bar
baz

$ echo foo | sed-openbsd-5.9 -e 's/foo/bar\' -e 'baz/'
barbaz

$ echo foo | sed-heirloom -e 's/foo/bar\' -e 'baz/'
Undefined label: az/

$ echo foo | sed-busybox -e 's/foo/bar\' -e 'baz/'
bar
baz


Second,
Notice the backslash plays a role, indicating continuation for some
implementations. Without backslash, '-e' are not always concatenated:

$ echo foo | sed-netbsd-7.0 -e 's/foo/bar' -e 'baz/'
sed-netbsd-7.0: 1: "s/foo/bar
": unescaped newline inside substitute pattern

$ echo foo | sed-freebsd-10 -e 's/foo/bar' -e 'baz/'
sed-freebsd-10: 1: "s/foo/bar
": unescaped newline inside substitute pattern

$ echo foo | sed-openbsd-5.9 -e 's/foo/bar' -e 'baz/'
barbaz

$ echo foo | sed-busybox -e 's/foo/bar' -e 'baz/'
sed: unmatched '/'



Third,
even in OpenBSD's sed which accepts this construct,
it seems this is limited to 's'. It doesn't "just work"
in all commands:

$ echo a | sed-openbsd-5.9 -e 'y/abc/123/'
1

$ echo a | sed-openbsd-5.9 -e 'y/abc/1\' -e '23/'
sed: 1: "y/abc/1\": unterminated transform target string

$ echo a | sed-openbsd-5.9 -e 'y/abc/1' -e '23/'
sed: 1: "y/abc/1": unterminated transform target string

as opposed to Busybox where it does work:

$ echo a | sed-busybox -e 'y/abc\' -e '/123/'
1



Lastly,
GNU sed does have one special case where trailing backslash
plays a role: in a/c/i commands. This is specifically done
to facilitate programs such as:

$ echo a | sed-gnu-4.3 -e '1i\' -e 'foobar'
foobar
a

$ echo a | sed-openbsd-5.9 -e '1i\' -e 'foobar'
foobara

$ echo a | sed-netbsd-7.0 -e '1i\' -e 'foobar'
sed-netbsd-7.0: 1: "foobar
": invalid command code f
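
For comparison, the two spellings of the insert command can be run side by
side. This sketch assumes that plain `sed` on the test machine is GNU sed;
as the transcripts above show, other implementations may reject the
two-option form or produce different output. The variable names are
illustrative only.

```shell
# Assumes `sed` is GNU sed. The first form relies on the GNU special case
# for a trailing backslash in a/c/i; the second embeds the newline in a
# single script argument.
split_form=$(echo a | sed -e '1i\' -e 'foobar')
single_form=$(echo a | sed -e '1i\
foobar')

if [ "$split_form" = "$single_form" ]; then
    echo "both forms agree"
else
    echo "forms differ"
fi
```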


As such,
I would say that this is not a bug per se in GNU sed.
It is not clear what the correct behavior is, and depending
on one's interpretation of POSIX it might even be undefined.

I'm marking this as 'done', but discussion can continue by replying to
this thread. Better yet - if you have a patch that adds this
functionality without causing regressions, we can look into
incorporating it.


regards,
- assaf










Added tag(s) notabug. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 25 Jan 2017 04:07:02 GMT)

Bug closed; send any further explanations to 19842 <at> debbugs.gnu.org and Evan Gates <evan.gates <at> gmail.com>. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 25 Jan 2017 04:07:02 GMT)

Information forwarded to bug-sed <at> gnu.org:
bug#19842; Package sed. (Wed, 25 Jan 2017 17:07:02 GMT)

Message #30 received at 19842 <at> debbugs.gnu.org:

From: Evan Gates <evan.gates <at> gmail.com>
To: Assaf Gordon <assafgordon <at> gmail.com>
Cc: 19842 <at> debbugs.gnu.org
Subject: Re: bug#19842: sed bug: using -e instead of a literal newline in s
 replacement fails
Date: Wed, 25 Jan 2017 09:06:21 -0800
On Tue, Jan 24, 2017 at 8:05 PM, Assaf Gordon <assafgordon <at> gmail.com> wrote:
> Hello Evan and all,
>
> I'm triaging old sed bugs.
> (for past discussion, see https://bugs.gnu.org/19842 ).

Hi Assaf, thanks for taking the time to dig through this and respond.

> Second,
> Notice the backslash plays a role, indicating continuation for some
> implementations. Without backslash, '-e' are not always concatenated:
>
> Third,
> even in OpenBSD's sed which accepts this construct,
> it seems this is limited to 's'. It doesn't "just work"
> in all commands:

Yes, the backslash is important and specific to the s command. In the
description of the s command POSIX says:

A line can be split by substituting a <newline> into it. The
application shall escape the <newline> in the replacement by preceding
it by a <backslash>.

So the examples without the backslash are irrelevant as are the
examples with the y command.

As for concatenation, POSIX says:

-e script
Add the editing commands specified by the script option-argument to
the end of the script of editing commands.

And more importantly:

If any -e or -f options are specified, the script of editing commands
shall initially be empty. The commands specified by each -e or -f
option shall be added to the script in the order specified. When each
addition is made, if the previous addition (if any) was from a -e
option, a <newline> shall be inserted before the new addition. The
resulting script shall have the same properties as the script operand,
described in the OPERANDS section.

This is a case where the previous addition was from a -e option, so a
newline should be inserted. After the newline is inserted the script
we have is

s/foo/bar\
baz/

And that "resulting script shall have the same properties..." means
that it should work the same as if it were in a file or in a single
argument with a literal newline.
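
Until the question is settled, one possible workaround is to do the
concatenation in the shell and hand sed a single script, which is the form
the original report says works. This is only a sketch: the helper name
sed_joined is hypothetical and not part of any sed implementation.

```shell
# sed_joined: hypothetical helper. Joins its arguments with newlines and
# passes sed one script, mimicking the concatenation the POSIX text
# describes for consecutive -e options. Input is read from stdin.
sed_joined() {
    # Command substitution drops the trailing newline printf adds.
    sed -e "$(printf '%s\n' "$@")"
}

out=$(printf 'foo\n' | sed_joined 's/foo/bar\' 'baz/')
printf '%s\n' "$out"
```

With a sed that accepts the literal-newline form, this should print bar and
baz on separate lines.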

> I'm marking this as 'done', but discussion can continue by replying to
> this thread. Better yet - if you have a patch that adds this
> functionality without causing regressions, we can look into
> incorporating it.

I do not have a patch, but if you are interested I can ask on the
Austin group list and see if there is any consensus and if so whether
they would change the wording of the standard.

Thank you for your time and work,
Evan




Information forwarded to bug-sed <at> gnu.org:
bug#19842; Package sed. (Thu, 26 Jan 2017 02:38:01 GMT)

Message #33 received at 19842 <at> debbugs.gnu.org:

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Evan Gates <evan.gates <at> gmail.com>
Cc: 19842 <at> debbugs.gnu.org
Subject: Re: bug#19842: sed bug: using -e instead of a literal newline in s
 replacement fails
Date: Thu, 26 Jan 2017 02:37:16 +0000
reopen 19842
tags 19842 +moreinfo
thanks

Hello Evan,

On Wed, Jan 25, 2017 at 09:06:21AM -0800, Evan Gates wrote:
>I do not have a patch, but if you are interested I can ask on the
>Austin group list and see if there is any consensus and if so whether
>they would change the wording of the standard.

So let's re-open this for now, especially if you're going to raise this 
issue further on the Austin group. When you do, please send a note with 
the URL so we can keep track as well.

The behavior already differs between implementations, so it'll be 
interesting to see what's decided.

I assume that a discussion about this will cover all commands,
e.g. this as well:

 $ echo xyz | sed -e 'y/xyz/1\n3/'
 1
 3

versus:

 $ echo xyz | sed -e 'y/xyz/1\' -e '3/'
 sed: -e expression #1, char 8: unterminated `y' command


thanks,
- assaf





Did not alter fixed versions and reopened. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 26 Jan 2017 02:38:02 GMT)

Severity set to 'wishlist' from 'normal'. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Tue, 09 Oct 2018 12:05:02 GMT)

Added tag(s) moreinfo. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Tue, 09 Oct 2018 12:14:01 GMT)

This bug report was last modified 5 years and 198 days ago.
