Paul Eggert | 3 Jan 2008 01:16

Re: [PATCH 3/3] Be nice with file systems that don't handle unusual characters.

Thanks for those three patches (in
<http://lists.gnu.org/archive/html/autoconf-patches/2007-12/msg00070.html>,
<http://lists.gnu.org/archive/html/autoconf-patches/2007-12/msg00069.html>,
<http://lists.gnu.org/archive/html/autoconf-patches/2007-12/msg00071.html>).
I assume the 3rd patch is still experimental and not meant to be installed
yet.  In reviewing the first two I see a minor issue:

+# Neutralize special characters interpreted by sed in replacement strings.
+case $configure_input in #(
+  *'&'*) ac_sed_conf_input=`AS_ECHO(["$configure_input"]) |
+                             sed 's/\\\\/\\\\\\\\/g;s/&/\\\\\\&/g'`;; #(

I'm a bit lost here, but shouldn't this check for \ in
$configure_input as well?  Also, wouldn't the last line be a bit
simpler as:

                              sed 's/[[\\\\&]]/\\\\&/g'`;; #(
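For readers following the quoting, here is a minimal sketch that compares the two forms at the plain shell level. It assumes one level of backslash halving between the M4 source quoted above and the sed command that finally runs; that reduction is an assumption, and the function names two_step and one_step are made up for illustration. Under that assumption, the single bracket expression appears to escape the same characters as the two chained substitutions.

```shell
#!/bin/sh
# Hypothetical helpers; neither name comes from the patch.

# Two chained substitutions: double each backslash, then escape each '&'.
two_step() {
  printf '%s\n' "$1" | sed 's/\\/\\\\/g;s/&/\\\&/g'
}

# One substitution: prefix every backslash or '&' with a backslash.
one_step() {
  printf '%s\n' "$1" | sed 's/[\\&]/\\&/g'
}

sample='a&b\c'
two_step "$sample"
one_step "$sample"
```

Both calls should print the same escaped string for the sample input, which is the sense in which the shorter form is "a bit simpler".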

Benoit Sigoure | 3 Jan 2008 11:55

Re: [PATCH 3/3] Be nice with file systems that don't handle unusual characters.

Hi Paul,
Happy new year :)

On Jan 3, 2008, at 1:16 AM, Paul Eggert wrote:

> Thanks for those three patches (in
> <http://lists.gnu.org/archive/html/autoconf-patches/2007-12/msg00070.html>,
> <http://lists.gnu.org/archive/html/autoconf-patches/2007-12/msg00069.html>,
> <http://lists.gnu.org/archive/html/autoconf-patches/2007-12/msg00071.html>).
> I assume the 3rd patch is still experimental and not meant to be installed
> yet.  In reviewing the first two I see a minor issue:
>
> +# Neutralize special characters interpreted by sed in replacement strings.
> +case $configure_input in #(
> +  *'&'*) ac_sed_conf_input=`AS_ECHO(["$configure_input"]) |
> +                             sed 's/\\\\/\\\\\\\\/g;s/&/\\\\\\&/g'`;; #(
>
> I'm a bit lost here, but shouldn't this check for \ in
> $configure_input as well?

Nope, since $configure_input is used to expand @configure_input@,
which would lead to a file containing a message saying
"<file-name-escaped>.  Generated from <file-name-escaped>.in" rather than
"<file-name>.  Generated from <file-name>.in".  It's not really a bug but

Ralf Wildenhues | 5 Jan 2008 04:09

Re: [PATCH 1/3] Properly expand @configure_input@ in config.status.

Hello Benoit,

Thanks for working on this.

* Benoit Sigoure wrote on Tue, Dec 18, 2007 at 01:54:06PM CET:
> 	Fixes test 119.
> 	* lib/autoconf/status.m4 (_AC_OUTPUT_FILE): Escape the
> 	backslashes and ampersands in $configure_input before using
> 	it in the sed replacement string to expand @configure_input@.

This is missing a test to expose the failure, and as such, it's no
wonder the fix doesn't work.  ;-)

The other patch to add to the testsuite should include @configure_input@
somewhere in the *.in file, and the test should check for that.

> --- a/lib/autoconf/status.m4
> +++ b/lib/autoconf/status.m4
> @@ -624,6 +624,13 @@ esac
>  _ACEOF
>  ])dnl
>  
> +# Neutralize special characters interpreted by sed in replacement strings.
> +case $configure_input in #(
> +  *'&'*) ac_sed_conf_input=`AS_ECHO(["$configure_input"]) |
> +			      sed 's/\\\\/\\\\\\\\/g;s/&/\\\\\\&/g'`;; #(
> +  *) ac_sed_conf_input=$configure_input;;
> +esac

This bit of code is executed at configure time, but needs to be run at

Paul Eggert | 5 Jan 2008 09:24

Re: [PATCH 1/3] Properly expand @configure_input@ in config.status.

Ralf Wildenhues <Ralf.Wildenhues@gmx.de> writes:

> If & is used as sed s delimiter, then escaping & in the RHS right is
> tricky (portably), as it's now both delimiter and replacement
> operator, and literal.

I don't see why it's tricky to do portably.  If the RHS contains '&',
escape it with a backslash.  That's simple and portable, no?

The advantage of using & as a delimiter is that any other choice of a
delimiter means one more character to escape (namely, the delimiter).

(Admittedly these are all minor points.)
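To illustrate the "one more character to escape" point, here is a small sketch that uses `|` as the delimiter, so the escaping step has to cover `|` in addition to backslash and `&`. The helper name escape_for_rhs, the choice of `|`, and the sample file name are assumptions made for this example, not something taken from the patch.

```shell
#!/bin/sh
# Hypothetical helper: escape text for use as a sed replacement
# when the s command is delimited by '|'.
escape_for_rhs() {
  printf '%s\n' "$1" | sed 's/[\\&|]/\\&/g'
}

# Sample value containing the delimiter, an ampersand, and a backslash.
name='a|b&c\d'
rhs=$(escape_for_rhs "$name")

# Substitute the escaped value for the placeholder.
printf '%s\n' '@configure_input@' | sed "s|@configure_input@|$rhs|"
```

With `&` as the delimiter the bracket expression would shrink to `[\\&]`, which is the saving being described; whether an escaped `&` delimiter then behaves as a literal is a separate question.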

Ralf Wildenhues | 5 Jan 2008 11:28

Re: [PATCH 1/3] Properly expand @configure_input@ in config.status.

Hello Paul,

* Paul Eggert wrote on Sat, Jan 05, 2008 at 09:24:46AM CET:
> Ralf Wildenhues <Ralf.Wildenhues@gmx.de> writes:
> 
> > If & is used as sed s delimiter, then escaping & in the RHS right is
> > tricky (portably), as it's now both delimiter and replacement
> > operator, and literal.
> 
> I don't see why it's tricky to do portably.  If the RHS contains '&',
> escape it with a backslash.  That's simple and portable, no?

But it doesn't work the way it should.  Sorry, I should have given an
example:  With GNU sed 4.1.5,
  echo x | sed 's&x&ab\&c&'

results in
  abxc

rather than
  ab&c

which is what the seds on AIX, the BSDs, Solaris, IRIX, and Tru64 return.
Reading SUSv3, I think it does not specify which is right.
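A quick way to see which behavior a given sed has is sketched below. It runs the command from this message and classifies the result, then repeats the substitution with a delimiter that has no special meaning in the replacement. The variable names probe and safe are illustrative only, and no particular result for the first command is assumed.

```shell
#!/bin/sh
# Probe: '&' is both the delimiter and the replacement operator here.
probe=$(echo x | sed 's&x&ab\&c&')

case $probe in
  'ab&c') echo "this sed treats the escaped delimiter as a literal &" ;;
  'abxc') echo "this sed treats the escaped delimiter as the & operator" ;;
  *)      echo "unexpected result: $probe" ;;
esac

# Same substitution with '|' as the delimiter: '\&' is simply an
# escaped ampersand, so the ambiguity does not arise.
safe=$(echo x | sed 's|x|ab\&c|')
echo "$safe"
```

Choosing a delimiter other than `&` sidesteps the difference between implementations, at the cost of having to escape that delimiter instead.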

> The advantage of using & as a delimiter is that any other choice of a
> delimiter means one more character to escape (namely, the delimiter).
> 
> (Admittedly these are all minor points.)


Paul Eggert | 5 Jan 2008 18:41

Re: [PATCH 1/3] Properly expand @configure_input@ in config.status.

Ralf Wildenhues <Ralf.Wildenhues@gmx.de> writes:

> example:  With GNU sed 4.1.5,
>   echo x | sed 's&x&ab\&c&'

Thanks for explaining.  I emailed a bug report to bonzini@gnu.org.

Paolo Bonzini | 5 Jan 2008 19:36

Re: [PATCH 1/3] Properly expand @configure_input@ in config.status.


> But it doesn't work the way it should.  Sorry, I should have given an
> example:  With GNU sed 4.1.5,
>   echo x | sed 's&x&ab\&c&'
> 
> results in
>   abxc
> 
> rather than
>   ab&c
> 
> which is what the seds on AIX, the BSDs, Solaris, IRIX, and Tru64 return.
> Reading SUSv3, I think it does not specify which is right.

Yes, it is undefined and the current behavior is more coherent with what 
is done on the LHS of the `s' command.  There, you want to strip slashes 
or the following command

    s/a\/b//

will rely on the regex matcher matching a literal slash for the escape 
\/.  This behavior changed in 4.1.x as part of a general reorganization 
of the sed parser to better support 7-bit multibyte character sets (e.g. 
ISO-2022) in the sed commands.  In the command above, sed sees

    s&x&ab\&c&

as if it was:

    s/x/ab&c/

Paul Eggert | 6 Jan 2008 07:25

Re: [PATCH 1/3] Properly expand @configure_input@ in config.status.

Paolo Bonzini <bonzini@gnu.org> writes:

> Yes, it is undefined

I don't see why 's&foo&\&&' is undefined.  The POSIX standard says:

   Any character other than backslash or <newline> can be used instead
   of a slash to delimit the BRE and the replacement. Within the BRE
   and the replacement, the BRE delimiter itself can be used as a
   literal character if it is preceded by a backslash.

Therefore, preceding & by a backslash makes it a "literal character",
i.e., a character that is not special.  Where's the ambiguity?

Even if the standard were ambiguous (which I don't yet see), there is
a practical advantage to behaving compatibly with other 'sed'
implementations in this area.

Anyway, if you like, I can file an interpretation request with the
POSIX folks about this.

Paolo Bonzini | 6 Jan 2008 10:20

Re: [PATCH 1/3] Properly expand @configure_input@ in config.status.

Paul Eggert wrote:
> Paolo Bonzini <bonzini@gnu.org> writes:
> 
>> Yes, it is undefined
> 
> I don't see why 's&foo&\&&' is undefined.  The POSIX standard says:
> 
>    Any character other than backslash or <newline> can be used instead
>    of a slash to delimit the BRE and the replacement. Within the BRE
>    and the replacement, the BRE delimiter itself can be used as a
>    literal character if it is preceded by a backslash.
> 
> Therefore, preceding & by a backslash makes it a "literal character",
> i.e., a character that is not special.  Where's the ambiguity?

I'm interested in how other seds behave for say

s{a\{1,2\}{b{

since the current way GNU sed works for the RHS depends on a change in 
the interpretation of the RHS.

Paolo

Paolo Bonzini | 6 Jan 2008 10:26

Re: [PATCH 1/3] Properly expand @configure_input@ in config.status.


> I'm interested in how other seds behave for say
> 
> s{a\{1,2\}{b{

and also

   s.a\.c.xyz.

though I already suspect that the two will be different in GNU sed and 
other seds.  Thanks!

Paolo
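The two commands asked about in this message can be tried on any sed with a sketch like the following. The input strings 'aa' and 'abc' are chosen for this example and are not from the thread, and the interpretation printed for each result is a reading of what that result would imply, not a statement about any specific implementation.

```shell
#!/bin/sh
# '{' is the delimiter: does '\{1,2\}' still act as a BRE interval?
r1=$(echo 'aa' | sed 's{a\{1,2\}{b{')
case $r1 in
  b)  echo "interval kept: a\\{1,2\\} matched 'aa'" ;;
  aa) echo "no match: the braces were not treated as an interval" ;;
  *)  echo "unexpected result: $r1" ;;
esac

# '.' is the delimiter: is '\.' a literal dot or the any-character dot?
r2=$(echo 'abc' | sed 's.a\.c.xyz.')
case $r2 in
  xyz) echo "escaped delimiter acted as the any-character dot" ;;
  abc) echo "escaped delimiter acted as a literal dot" ;;
  *)   echo "unexpected result: $r2" ;;
esac
```

Running this on several platforms would give the comparison requested here without relying on recollection of how each sed parses the LHS.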

