GNU bug report logs -
#19842
sed bug: using -e instead of a literal newline in s replacement fails
To reply to this bug, email your comments to 19842 AT debbugs.gnu.org.
Report forwarded to bug-sed <at> gnu.org: bug#19842; Package sed. (Thu, 12 Feb 2015 01:52:02 GMT)
Acknowledgement sent to Evan Gates <evan.gates <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-sed <at> gnu.org. (Thu, 12 Feb 2015 01:52:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Hello,
sed 's/foo/bar\
baz/'
works as expected. But using multiple -e instead of a literal newline fails:
$ sed -e 's/foo/bar\' -e baz/
sed: -e expression #1, char 10: unterminated `s' command
The following is from POSIX[0]:
If any -e or -f options are specified, the script of editing commands
shall initially be empty. The commands specified by each -e or -f
option shall be added to the script in the order specified. When each
addition is made, if the previous addition (if any) was from a -e
option, a <newline> shall be inserted before the new addition. The
resulting script shall have the same properties as the script operand,
described in the OPERANDS section.
My reading of that leads me to believe that the two commands should
create identical scripts, but GNU sed seems to be interpreting the
script before the addition of the newline and second -e's argument.
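To make that reading concrete, here is a minimal sketch that assembles the script text the way the quoted POSIX paragraph describes (a newline inserted between two consecutive -e additions) and compares it with the single-argument form. It only builds and compares strings; it says nothing about how any particular sed parses them. The variable names are illustrative.

```sh
# Fragments as they would be passed to two -e options.
frag1='s/foo/bar\'
frag2='baz/'

# POSIX: a <newline> is inserted before an addition that follows a -e addition.
assembled=$(printf '%s\n%s' "$frag1" "$frag2")

# The same script written as one argument with a literal newline.
literal='s/foo/bar\
baz/'

if [ "$assembled" = "$literal" ]; then
    result=identical
else
    result=different
fi
echo "$result"
```

Under that reading the two texts are byte-for-byte the same, which is why I expected the same behavior from both invocations.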
Is this a bug or desired behavior?
Thanks,
Evan
$ sed --version
sed (GNU sed) 4.2.2
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Jay Fenlason, Tom Lord, Ken Pizzini,
and Paolo Bonzini.
GNU sed home page: <http://www.gnu.org/software/sed/>.
General help using GNU software: <http://www.gnu.org/gethelp/>.
E-mail bug reports to: <bug-sed <at> gnu.org>.
Be sure to include the word ``sed'' somewhere in the ``Subject:'' field.
[0] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html#tag_20_116_04
Information forwarded to bug-sed <at> gnu.org: bug#19842; Package sed. (Tue, 17 Feb 2015 23:49:01 GMT)
Message #8 received at 19842 <at> debbugs.gnu.org (full text, mbox):
On Wed, 11 Feb 2015 17:20:24 -0800
Evan Gates <evan.gates <at> gmail.com> wrote:
> Hello,
>
> sed 's/foo/bar\
> baz/'
>
> works as expected. But using multiple -e instead of a literal newline fails:
>
> $ sed -e 's/foo/bar\' -e baz/
> sed: -e expression #1, char 10: unterminated `s' command
>
> The following is from POSIX[0]:
>
> If any -e or -f options are specified, the script of editing commands
> shall initially be empty. The commands specified by each -e or -f
> option shall be added to the script in the order specified. When each
> addition is made, if the previous addition (if any) was from a -e
> option, a <newline> shall be inserted before the new addition. The
> resulting script shall have the same properties as the script operand,
> described in the OPERANDS section.
>
> My reading of that leads me to believe that the two commands should
> create identical scripts, but GNU sed seems to be interpreting the
> script before the addition of the newline and second -e's argument.
>
> Is this a bug or desired behavior?
>
> Thanks,
> Evan
>
> $ sed --version
> sed (GNU sed) 4.2.2
> Copyright (C) 2012 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
>
> Written by Jay Fenlason, Tom Lord, Ken Pizzini,
> and Paolo Bonzini.
> GNU sed home page: <http://www.gnu.org/software/sed/>.
> General help using GNU software: <http://www.gnu.org/gethelp/>.
> E-mail bug reports to: <bug-sed <at> gnu.org>.
> Be sure to include the word ``sed'' somewhere in the ``Subject:'' field.
>
> [0] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html#tag_20_116_04
Hi,
I interpret the following as meaning that multiple `-e' options do not
merge fragments of two commands.
# If any -e or -f options are specified, the script of editing commands
# shall initially be empty.
i.e. the command given by the first -e option is parsed without its
completion, and the buffer is reset to empty before the next -e option.
Thanks,
Norihiro
Information forwarded to bug-sed <at> gnu.org: bug#19842; Package sed. (Wed, 18 Feb 2015 00:25:01 GMT)
Message #11 received at 19842 <at> debbugs.gnu.org (full text, mbox):
On Tue, Feb 17, 2015 at 3:48 PM, Norihiro Tanaka <noritnk <at> kcn.ne.jp> wrote:
> Hi,
>
> I interpret the following as meaning that multiple `-e' options do not
> merge fragments of two commands.
>
> # If any -e or -f options are specified, the script of editing commands
> # shall initially be empty.
>
> i.e. the command given by the first -e option is parsed without its
> completion, and the buffer is reset to empty before the next -e option.
>
> Thanks,
> Norihiro
>
Hi Norihiro,
Thanks for replying. I respectfully disagree with your interpretation.
The -e and -f options talk about "the script of editing commands" as
the entire script/program that will run once sed starts reading input.
Both -e and -f add commands to "the end of the script of editing
commands." The line you quoted uses the exact same phrase:
1) -e script
Add the editing commands specified by the script option-argument to
the end of the script of editing commands.
2) -f script_file
Add the editing commands in the file script_file to the end of the
script of editing commands.
3) If any -e or -f options are specified, the script of editing
commands shall initially be empty.
I posit that "the script of editing commands" means the same thing in
all three places. Therefore (3) means that the script/program that sed
will run is empty before the first -e or -f.
Your interpretation would cause "the script of editing commands" to
mean something different in (3) than it means in (1) and (2).
Thank you,
Evan
Information forwarded to bug-sed <at> gnu.org: bug#19842; Package sed. (Wed, 18 Feb 2015 13:25:02 GMT)
Message #14 received at 19842 <at> debbugs.gnu.org (full text, mbox):
On Tue, 17 Feb 2015 16:23:54 -0800
Evan Gates <evan.gates <at> gmail.com> wrote:
> Hi Norihiro,
>
> Thanks for replying. I respectfully disagree with your interpretation.
>
> The -e and -f options talk about "the script of editing commands" as
> the entire script/program that will run once sed starts reading input.
> Both -e and -f add commands to "the end of the script of editing
> commands." The line you quoted uses the exact same phrase:
>
> 1) -e script
> Add the editing commands specified by the script option-argument to
> the end of the script of editing commands.
> 2) -f script_file
> Add the editing commands in the file script_file to the end of the
> script of editing commands.
> 3) If any -e or -f options are specified, the script of editing
> commands shall initially be empty.
>
> I posit that "the script of editing commands" means the same thing in
> all three places. Therefore (3) means that the script/program that sed
> will run is empty before the first -e or -f.
>
> Your interpretation would cause "the script of editing commands" to
> mean something different in (3) than it means in (1) and (2).
>
> Thank you,
> Evan
Hi Evan,
Sorry, I was wrong. However, it is not written anywhere that multiple
-e and/or -f options must be analyzed after the concatenation. They may
be concatenated after each -e and/or -f option is parsed.
By the way, /usr/bin/sed and /usr/xpg4/bin/sed on Solaris 11 and
/usr/bin/sed on HP-UX 11.23 also behave the same as GNU sed. In other
words, they also return an error for the following.
$ sed -e 's/foo/bar\' -e baz/
Thanks,
Norihiro
Information forwarded to bug-sed <at> gnu.org: bug#19842; Package sed. (Wed, 18 Feb 2015 18:55:01 GMT)
Message #17 received at 19842 <at> debbugs.gnu.org (full text, mbox):
On Wed, Feb 18, 2015 at 5:24 AM, Norihiro Tanaka <noritnk <at> kcn.ne.jp> wrote:
> Hi Evan,
>
> Sorry, I was wrong. However, it is not written anywhere that multiple
> -e and/or -f options must be analyzed after the concatenation. They may
> be concatenated after each -e and/or -f option is parsed.
>
> By the way, /usr/bin/sed and /usr/xpg4/bin/sed on Solaris 11 and
> /usr/bin/sed on HP-UX 11.23 also behave the same as GNU sed. In other
> words, they also return an error for the following.
>
> $ sed -e 's/foo/bar\' -e baz/
>
> Thanks,
> Norihiro
>
Hi Norihiro,
The POSIX description of sed doesn't make any assumptions about the
inner workings of sed. It says nothing about analyzing, compiling, or
building the internal structures; it only talks about adding to "the
script of editing commands" which will then be run. "The script of
editing commands" refers to a representation of the editing commands
in accord with the descriptions that follow on the page, i.e. text
commands, a sed script. This is supported by the following line:
When each addition is made, if the previous addition (if any) was from
a -e option, a <newline> shall be inserted before the new addition.
The newline is mentioned because it is referring to the text
representation of the script (the only representation mentioned on the
page). It explains that all the parts from -e and -f should in effect
be concatenated to a single sed script, with newlines after the -e
option arguments. e.g.:
sed -e 's/foo/bar\' -e 'baz/'
and
printf %s\\n 's/foo/bar\' 'baz/' > script
sed -f script
should be identical, due to that newline.
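As a runnable sketch of the two forms that are not in dispute, the following builds the script file as above and compares `sed -f` against the single-argument form with a literal newline. The -e/-e form is deliberately left out, since its behavior is the open question. The scratch directory and variable names are illustrative, and `mktemp -d` is assumed to be available.

```sh
# Work in a scratch directory so nothing is left behind.
tmpdir=$(mktemp -d)
script="$tmpdir/script"

# Build the script file: one fragment per line, newline after each.
printf '%s\n' 's/foo/bar\' 'baz/' > "$script"

# Same script, once from a file and once as a single argument.
from_file=$(echo foo | sed -f "$script")
from_literal=$(echo foo | sed 's/foo/bar\
baz/')

rm -rf "$tmpdir"

printf '%s\n' "$from_file"
```

Both variables should hold the two lines `bar` and `baz`.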
Thanks,
Evan
Information forwarded to bug-sed <at> gnu.org: bug#19842; Package sed. (Wed, 22 Jul 2015 13:58:01 GMT)
Message #20 received at 19842 <at> debbugs.gnu.org (full text, mbox):
> The following is from POSIX[0]:
> If any -e or -f options are specified, the script of editing commands
> shall initially be empty. The commands specified by each -e or -f
> option shall be added to the script in the order specified.
I think the solution to this mystery might be that the above statement
is a good deal more strict than people have taken it. It speaks of
"commands specified by each -e". Well, the example case's -e options
_do_not_ each specify commands. They each only specify part of one command.
So as I read this, this report is invalid by way of its expectations not
being backed by the POSIX specification.
Information forwarded to bug-sed <at> gnu.org: bug#19842; Package sed. (Wed, 25 Jan 2017 04:07:02 GMT)
Message #23 received at 19842 <at> debbugs.gnu.org (full text, mbox):
tag 19842 notabug
close 19842
stop
Hello Evan and all,
I'm triaging old sed bugs.
(for past discussion, see https://bugs.gnu.org/19842 ).
First,
for completeness, this is the behavior with various sed implementations:
$ echo foo | sed-gnu-4.3 -e 's/foo/bar\' -e 'baz/'
sed: -e expression #1, char 10: unterminated `s' command
$ echo foo | sed-netbsd-7.0 -e 's/foo/bar\' -e 'baz/'
bar
baz
$ echo foo | sed-freebsd-10 -e 's/foo/bar\' -e 'baz/'
bar
baz
$ echo foo | sed-openbsd-5.9 -e 's/foo/bar\' -e 'baz/'
barbaz
$ echo foo | sed-heirloom -e 's/foo/bar\' -e 'baz/'
Undefined label: az/
$ echo foo | sed-busybox -e 's/foo/bar\' -e 'baz/'
bar
baz
Second,
Notice the backslash plays a role, indicating continuation for some
implementations. Without backslash, '-e' are not always concatenated:
$ echo foo | sed-netbsd-7.0 -e 's/foo/bar' -e 'baz/'
sed-netbsd-7.0: 1: "s/foo/bar
": unescaped newline inside substitute pattern
$ echo foo | sed-freebsd-10 -e 's/foo/bar' -e 'baz/'
sed-freebsd-10: 1: "s/foo/bar
": unescaped newline inside substitute pattern
$ echo foo | sed-openbsd-5.9 -e 's/foo/bar' -e 'baz/'
barbaz
$ echo foo | sed-busybox -e 's/foo/bar' -e 'baz/'
sed: unmatched '/'
Third,
even in OpenBSD's sed which accepts this construct,
it seems this is limited to 's'. It doesn't "just work"
in all commands:
$ echo a | sed-openbsd-5.9 -e 'y/abc/123/'
1
$ echo a | sed-openbsd-5.9 -e 'y/abc/1\' -e '23/'
sed: 1: "y/abc/1\": unterminated transform target string
$ echo a | sed-openbsd-5.9 -e 'y/abc/1' -e '23/'
sed: 1: "y/abc/1": unterminated transform target string
as opposed to Busybox where it does work:
$ echo a | sed-busybox -e 'y/abc\' -e '/123/'
1
Lastly,
GNU sed does have one special case where trailing backslash
plays a role: in a/c/i commands. This is specifically done
to facilitate programs such as:
$ echo a | sed-gnu-4.3 -e '1i\' -e 'foobar'
foobar
a
$ echo a | sed-openbsd-5.9 -e '1i\' -e 'foobar'
foobara
$ echo a | sed-netbsd-7.0 -e '1i\' -e 'foobar'
sed-netbsd-7.0: 1: "foobar
": invalid command code f
As such,
I would say that this is not a bug per se in GNU sed.
It is not clear what the correct behavior is, and depending
on one's interpretation of POSIX it might even be undefined.
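For the a/c/i case mentioned above, a sketch of the spelling that does not depend on how multiple -e options are joined: put the text on the line after `i\` inside a single script argument. The variable name is illustrative.

```sh
# Insert a line before line 1, using a literal newline after "i\".
out=$(echo a | sed '1i\
foobar')
printf '%s\n' "$out"
```

This should print `foobar` followed by `a`, matching the GNU sed result for the two -e form shown above.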
I'm marking this as 'done', but discussion can continue by replying to
this thread. Better yet - if you have a patch that adds this
functionality without causing regressions, we can look into
incorporating it.
regards,
- assaf
Added tag(s) notabug. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 25 Jan 2017 04:07:02 GMT)
bug closed, send any further explanations to 19842 <at> debbugs.gnu.org and Evan Gates <evan.gates <at> gmail.com>. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 25 Jan 2017 04:07:02 GMT)
Information forwarded to bug-sed <at> gnu.org: bug#19842; Package sed. (Wed, 25 Jan 2017 17:07:02 GMT)
Message #30 received at 19842 <at> debbugs.gnu.org (full text, mbox):
On Tue, Jan 24, 2017 at 8:05 PM, Assaf Gordon <assafgordon <at> gmail.com> wrote:
> Hello Evan and all,
>
> I'm triaging old sed bugs.
> (for past discussion, see https://bugs.gnu.org/19842 ).
Hi Assaf, thanks for taking the time to dig through this and respond.
> Second,
> Notice the backslash plays a role, indicating continuation for some
> implementations. Without backslash, '-e' are not always concatenated:
>
> Third,
> even in OpenBSD's sed which accepts this construct,
> it seems this is limited to 's'. It doesn't "just work"
> in all commands:
Yes the backslash is important and specific to the s command. In the
description of the s command POSIX says:
A line can be split by substituting a <newline> into it. The
application shall escape the <newline> in the replacement by preceding
it by a <backslash>.
So the examples without the backslash are irrelevant as are the
examples with the y command.
As for concatenation, POSIX says:
-e script
Add the editing commands specified by the script option-argument to
the end of the script of editing commands.
And more importantly:
If any -e or -f options are specified, the script of editing commands
shall initially be empty. The commands specified by each -e or -f
option shall be added to the script in the order specified. When each
addition is made, if the previous addition (if any) was from a -e
option, a <newline> shall be inserted before the new addition. The
resulting script shall have the same properties as the script operand,
described in the OPERANDS section.
This is a case where the previous addition was from a -e option, so a
newline should be inserted. After the newline is inserted the script
we have is
s/foo/bar\
baz/
And that "resulting script shall have the same properties..." means
that it should work the same as if it were in a file or in a single
argument with a literal newline.
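Until the behavior is settled, one workaround is to do the joining yourself. The sketch below defines `sed_join`, a hypothetical helper (not part of any sed) that concatenates its fragment arguments with newlines, as the POSIX text describes, and passes the result to sed as a single script. Fragments end at a `--` separator; anything after it is passed through as file operands.

```sh
# sed_join is a hypothetical helper, not part of any sed.
# Usage: sed_join FRAGMENT... -- [FILE...]
sed_join() {
    script=''
    sep=''
    # Collect fragments up to the "--" separator, newline-joined.
    while [ "$#" -gt 0 ] && [ "$1" != "--" ]; do
        script="$script$sep$1"
        sep='
'
        shift
    done
    # Drop the "--" separator if one was given.
    if [ "$#" -gt 0 ]; then
        shift
    fi
    sed -e "$script" "$@"
}

joined=$(echo foo | sed_join 's/foo/bar\' 'baz/' --)
printf '%s\n' "$joined"
```

Because sed only ever sees one script argument, this should not depend on how an implementation handles multiple -e options. A fragment consisting of exactly `--` cannot be passed this way.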
> I'm marking this as 'done', but discussion can continue by replying to
> this thread. Better yet - if you have a patch that adds this
> functionality without causing regressions, we can look into
> incorporating it.
I do not have a patch, but if you are interested I can ask on the
Austin group list and see if there is any consensus and if so whether
they would change the wording of the standard.
Thank you for your time and work,
Evan
Information forwarded to bug-sed <at> gnu.org: bug#19842; Package sed. (Thu, 26 Jan 2017 02:38:01 GMT)
Message #33 received at 19842 <at> debbugs.gnu.org (full text, mbox):
reopen 19842
tags 19842 +moreinfo
thanks
Hello Evan,
On Wed, Jan 25, 2017 at 09:06:21AM -0800, Evan Gates wrote:
>I do not have a patch, but if you are interested I can ask on the
>Austin group list and see if there is any consensus and if so whether
>they would change the wording of the standard.
So let's re-open this for now, especially if you're going to raise this
issue further on the Austin group. When you do, please send a note with
the URL so we can keep track as well.
The behavior already differs between implementations, so it'll be
interesting to see what's decided.
I assume that a discussion about this will cover all commands,
e.g. this as well:
$ echo xyz | sed -e 'y/xyz/1\n3/'
1
3
versus:
$ echo xyz | sed -e 'y/xyz/1\' -e '3/'
sed: -e expression #1, char 8: unterminated `y' command
thanks,
- assaf
Did not alter fixed versions and reopened. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 26 Jan 2017 02:38:02 GMT)
Severity set to 'wishlist' from 'normal'. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Tue, 09 Oct 2018 12:05:02 GMT)
Added tag(s) moreinfo. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Tue, 09 Oct 2018 12:14:01 GMT)
This bug report was last modified 5 years and 198 days ago.