Bruno Haible | 1 Jan 2009 20:35

Re: [PATCH 3/4] gnulib-tool: abort loops early where possible.

Hello Ralf,

> * gnulib-tool (func_modules_add_dummy, func_emit_lib_Makefile_am)
> (func_emit_tests_Makefile_am, func_import): Abort loops early if
> we already know the answer.

Thanks Ralf. According to your timings, this patch provides a noticeable
speedup and the code remains pretty. I've applied it like this:

--- gnulib-tool.orig	2009-01-01 20:32:57.000000000 +0100
+++ gnulib-tool	2009-01-01 20:32:19.000000000 +0100
 <at>  <at>  -1,6 +1,6  <at>  <at> 
 #! /bin/sh
 #
-# Copyright (C) 2002-2008 Free Software Foundation, Inc.
+# Copyright (C) 2002-2009 Free Software Foundation, Inc.
 #
 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
 <at>  <at>  -1590,6 +1590,7  <at>  <at> 
 # - modules         list of modules, including 'dummy' if needed
 func_modules_add_dummy ()
 {
+  # Determine whether any module provides a lib_SOURCES augmentation.
   have_lib_SOURCES=
   sed_remove_backslash_newline=':a
 /\\$/{
 <at>  <at>  -1606,7 +1607,10  <at>  <at> 
         # Ignore .h files since they are not compiled.
         case "$file" in
(Continue reading)

Bruno Haible | 1 Jan 2009 20:47

Re: sed, SIGPIPE, cmp -s [Re: [PATCH 3/4] gnulib-tool: abort loops early where possible.

Hi Jim,

>    ./bootstrap: Copying ._bootmp/gnulib-tests/Makefile.am to gnulib-tests/gnulib.mk...
>   -sed: couldn't write 60 items to stdout: Broken pipe
>   +sed: couldn't write 56 items to stdout: Broken pipe
> 
> I see that it's technically harmless and comes from here:
> 
>         sed "$remove_intl" $1/$dir/$file | cmp -s - $dir/$gnulib_mk || {
>               echo "$0: Copying $1/$dir/$file to $dir/$gnulib_mk ..." &&
> 
> because using GNU cmp's -s (--silent/ exit-status-only) option
> makes it refrain from reading all input, so sed's write evokes SIGPIPE.
> 
> I may take this as encouragement not to use -s,
> and just redirect to /dev/null instead.

The purpose of SIGPIPE is to speed up pipes of commands. By piping
cmp's result to /dev/null you renounce this optimization. In order to
exploit this optimization, without getting bash's error message, all
you need is func_reset_sigpipe from gnulib-tool, and

  (func_reset_sigpipe
   sed "$remove_intl" $1/$dir/$file | cmp -s - $dir/$gnulib_mk || {
               echo "$0: Copying $1/$dir/$file to $dir/$gnulib_mk ..." && ... } )

Let me add one more comment:

2009-01-01  Bruno Haible  <bruno <at> clisp.org>

(Continue reading)

Tom G. Christensen | 1 Jan 2009 21:20

Conflicting types for mblen on Solaris 2.6

The current daily snapshot is failing to build fprintftime:
depbase=`echo fprintftime.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
        gcc -std=gnu99 -DHAVE_CONFIG_H -DEXEEXT=\"\" -DEXEEXT=\"\" -DNO_XMALLOC -DEXEEXT=\"\" -I. -I.. 
-I../intl -I/usr/tgcware/include -D_REENTRANT  -g -O2 -MT fprintftime.o -MD -MP -MF $depbase.Tpo -c -o
fprintftime.o fprintftime.c &&\
        mv -f $depbase.Tpo $depbase.Po
In file included from ./stdint.h:482,
                 from ./stdlib.h:46,
                 from strftime.c:67,
                 from fprintftime.c:2:
./wchar.h:197: error: conflicting types for 'mblen'
/usr/tgcware/gcc-4.3.2/lib/gcc/sparc-sun-solaris2.6/4.3.2/include-fixed/stdlib.h:146:
error: previous declaration of 'mblen' was here
make[4]: *** [fprintftime.o] Error 1

stdlib.h:146 is:
extern int mblen(const char *, size_t);

wchar.h:197 is:
extern size_t mbrlen (const char *s, size_t n, mbstate_t *ps);

The type conflict I guess is caused by strftime.c:58:
#  define mbrlen(s, n, ps) mblen (s, n)

-tgc

Bruno Haible | 1 Jan 2009 21:56

gnulib-tool: fix security bugs

This fixes a couple of security bugs. A user of gnulib-tool could cause damage
to the fellow developers of his project by storing in gnulib-cache.m4 or
gnulib-comp.m4 text like:

gl_VC_FILES(`rm -rf /tmp/*`)

AC_DEFUN([gl_FILE_LIST], [
foo
bar
`rm -rf /tmp/*`
])

2009-01-01  Bruno Haible  <bruno <at> clisp.org>

	Fix a security bug.
	* gnulib-tool (func_import, import, update): Don't allow the characters
	'"', '$', '`', '\' in macro arguments that become part of commands that
	are evaluated.

--- gnulib-tool.orig	2009-01-01 21:55:17.000000000 +0100
+++ gnulib-tool	2009-01-01 21:48:32.000000000 +0100
 <at>  <at>  -2314,7 +2314,7  <at>  <at> 
       s,^dnl .*$,,
       s, dnl .*$,,
       /gl_LOCAL_DIR(/ {
-        s,^.*gl_LOCAL_DIR([[ ]*\([^])]*\).*$,cached_local_gnulib_dir="\1",p
+        s,^.*gl_LOCAL_DIR([[ ]*\([^]"$`\\)]*\).*$,cached_local_gnulib_dir="\1",p
       }
       /gl_MODULES(/ {
         ta
(Continue reading)

Jim Meyering | 1 Jan 2009 21:57
Gravatar

Re: sed, SIGPIPE, cmp -s [Re: [PATCH 3/4] gnulib-tool: abort loops early where possible.

Bruno Haible <bruno <at> clisp.org> wrote:
>>    ./bootstrap: Copying ._bootmp/gnulib-tests/Makefile.am to gnulib-tests/gnulib.mk...
>>   -sed: couldn't write 60 items to stdout: Broken pipe
>>   +sed: couldn't write 56 items to stdout: Broken pipe
>>
>> I see that it's technically harmless and comes from here:
>>
>>         sed "$remove_intl" $1/$dir/$file | cmp -s - $dir/$gnulib_mk || {
>>               echo "$0: Copying $1/$dir/$file to $dir/$gnulib_mk ..." &&
>>
>> because using GNU cmp's -s (--silent/ exit-status-only) option
>> makes it refrain from reading all input, so sed's write evokes SIGPIPE.
>>
>> I may take this as encouragement not to use -s,
>> and just redirect to /dev/null instead.
>
> The purpose of SIGPIPE is to speed up pipes of commands. By piping
> cmp's result to /dev/null you renounce this optimization. In order to
> exploit this optimization, without getting bash's error message, all
> you need is func_reset_sigpipe from gnulib-tool, and
>
>   (func_reset_sigpipe
>    sed "$remove_intl" $1/$dir/$file | cmp -s - $dir/$gnulib_mk || {
>                echo "$0: Copying $1/$dir/$file to $dir/$gnulib_mk ..." && ... } )

Hi Bruno,

Thanks, but this is independent of which shell I use.
(the diagnostic doesn't come from bash)
I.e., I can reproduce it with zsh:
(Continue reading)

Jim Meyering | 1 Jan 2009 22:11
Gravatar

Re: Conflicting types for mblen on Solaris 2.6

"Tom G. Christensen" <tgc <at> jupiterrise.com> wrote:
> The current daily snapshot is failing to build fprintftime:
> depbase=`echo fprintftime.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
>         gcc -std=gnu99 -DHAVE_CONFIG_H -DEXEEXT=\"\" -DEXEEXT=\"\" -DNO_XMALLOC -DEXEEXT=\"\" -I. -I.. 
-I../intl -I/usr/tgcware/include -D_REENTRANT  -g -O2 -MT fprintftime.o -MD -MP -MF $depbase.Tpo -c -o
fprintftime.o fprintftime.c &&\
>         mv -f $depbase.Tpo $depbase.Po
> In file included from ./stdint.h:482,
>                  from ./stdlib.h:46,
>                  from strftime.c:67,
>                  from fprintftime.c:2:
> ./wchar.h:197: error: conflicting types for 'mblen'
> /usr/tgcware/gcc-4.3.2/lib/gcc/sparc-sun-solaris2.6/4.3.2/include-fixed/stdlib.h:146:
error: previous declaration of 'mblen' was here
> make[4]: *** [fprintftime.o] Error 1
>
> stdlib.h:146 is:
> extern int mblen(const char *, size_t);
>
> wchar.h:197 is:
> extern size_t mbrlen (const char *s, size_t n, mbstate_t *ps);
>
> The type conflict I guess is caused by strftime.c:58:
> #  define mbrlen(s, n, ps) mblen (s, n)
>
> -tgc

Thanks for the report.
Here's a totally untested knee-jerk patch:

(Continue reading)

Bruno Haible | 2 Jan 2009 01:09

Re: bash, sed, SIGPIPE

Hi Jim,

> Also, even if I do reset the shell's sigpipe handler,
> it doesn't change the fact that sed complains:
> 
>     $ (trap - SIGPIPE; seq 1000000|sed s/1/2/|cmp -s - /dev/null )
>     sed: couldn't write 4 items to stdout: Broken pipe
>     [Exit 1]

Now this looks like a bug either in 'sed' or in your shell. I don't reproduce it
with bash 3.2.39 and sed 4.1.5.

After 'trap - SIGPIPE', sed should get a fatal SIGPIPE signal in these conditions.
Quoting <http://www.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#trap>

  "If action is '-' , the shell shall reset each condition to the default value."

Whereas the bash manual page says:

  "If arg is absent (and  there  is  a
   single  sigspec)  or  -,  each  specified signal is reset to its
   original disposition (the value it  had  upon  entrance  to  the
   shell)."

Does this last parenthesized sentence mean that the 'trap' command from
build-aux/bootstrap line 263 has an effect on the 'trap' commands in all its
subshells??

Bruno

(Continue reading)

Eric Blake | 2 Jan 2009 01:25
Gravatar

Re: bash, sed, SIGPIPE


According to Bruno Haible on 1/1/2009 5:09 PM:
> After 'trap - SIGPIPE', sed should get a fatal SIGPIPE signal in these conditions.
> Quoting <http://www.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#trap>
> 
>   "If action is '-' , the shell shall reset each condition to the default value."

You're missing one other piece.

"Signals that were ignored on entry to a non-interactive shell cannot be
trapped or reset, although no error need be reported when attempting to do
so."

Therefore, if a bash script is started while SIGPIPE is already ignored,
there is nothing the script can do to turn it off ('trap - SIGPIPE' only
re-enables default behavior if your script was the one that disabled it in
the first place, but not if your script started with it already disabled).

--
Don't work too hard, make some time for fun as well!

Eric Blake             ebb9 <at> byu.net
Bruno Haible | 2 Jan 2009 02:17

Re: [PATCH 0/4] faster gnulib-tool

Hello Ralf,

Thank you for your speedups to gnulib-tool. At first I was, of course,
excited about the 2x speedup. But when looking at the maintainability
of the code that you propose, I'm not fine with all of it any more.

My four objections are:

1) You observe that forking programs in a shell script is slow, and
   therefore propose to use more shell built-ins. The problem with it
   is that I chose to implement gnulib-tool in shell (for the control
   structure) and sed (for the text processing). The shell also has
   some inferior commands for string processing, but sed is vastly
   superior for this purpose. I want to stick with 'sed' for the
   text processing, otherwise we write some parts of the code to use
   shell built-ins, and when we notice that a little more text processing
   is required, we have to rewrite the code to use 'sed'. So the use
   of shell built-ins for text processing turns out to be a
   "premature optimization" (in the sense of Knuth) and hampers
   maintainability.

   If you want to achieve good speedups for scripts that use 'sed':
   can you work towards making 'sed' a bash built-in? This is challenging,
   but if you are after performance, that would be promising.

2) Your patches change the generation of code so that it goes through
   intermediate shell variables. The problem with this is that the
   transformation from string to standard output is not simple:
     echo $string
   outputs the string plus a newline, and 'echo -n' is not portable.
(Continue reading)

Bruno Haible | 2 Jan 2009 02:26

Re: [PATCH 4/4] gnulib-tool: faster string handling for Posix shells.

Hello Ralf,

20% is a good speedup that you achieve here by use of shell built-ins instead of
subprocesses and sed invocations.

> * gnulib-tool (func_strip): New function.

You only ever need to strip the prefix _or_ the suffix, never both
simultaneously. Therefore it will be faster to define 2 functions.

> +# func_strip name prefix suffix

This function is insufficiently documented: What restrictions apply to
'name', 'prefix', 'suffix'?

> +if ( eval 'foo=bar; test "${foo%r}" = ba' ) >/dev/null 2>&1; then
> +  func_strip ()
> +  {
> +    stripped=$1
> +    stripped=${stripped#$2}
> +    stripped=${stripped%$3}
> +  }

I don't much like functions which assign to a particular variable: it makes
it hard to remember how to use the function.

>  <at>  <at>  -1495,11 +1516,15  <at>  <at>  func_get_automake_snippet ()
>          | sed -n -e "$sed_extract_mentioned_files" | sed -e 's/#.*//'`
>        func_get_filelist "$mymodule"
>        all_files=$module_files
(Continue reading)


Gmane