fche at redhat dot com | 28 Jun 17:13 2016

[Bug runtime/11308] aggregate operations for @variance, @skew, @kurtosis

https://sourceware.org/bugzilla/show_bug.cgi?id=11308

--- Comment #5 from Frank Ch. Eigler <fche at redhat dot com> ---
Excellent progress!

> This gives somewhat reasonable results for a "few" "small" integers with
> normal distribution, but almost any other set of values makes it behave
> crazily because of the integer arithmetic being used for the division.

I suspect the loss of precision is occurring in the per-cpu sd->variance =
stp_div64() ... code rather than the cross-cpu agg->variance one.  Your test
case (probe oneshot) runs only on one CPU, so a lot of the cross-cpu terms (S1)
should be zero.

> At first glance, floating point arithmetic inside the linux kernel
> doesn't look like something usual or straightforward.  But an attempt to
> implement it might be an interesting one.  Not sure about this though.

We'll eventually get -some- FP capabilities for stap code, but let's see how
much farther we can get without.

If you have any more time/interest in this problem, I'd suggest investigating
whether we could track the sd->avg / sd->_M2 / sd->variance values in a scaled
form.  We continuously track max/min, so have a good idea about the dynamic
range of the <<<'d values.  The code could track a "shift" parameter that
scales those numbers (via << shift) during the online update phase so as to
preserve as much precision as possible.
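
As a rough illustration of the scaling idea, here is a minimal sketch in Python (used only for convenience; the runtime itself is C). It assumes a fixed `shift` passed in by the caller, whereas the suggestion above is to derive it from the tracked max/min. The name `integer_variance` is hypothetical and not part of the stap runtime. Python integers never overflow and `//` floors rather than truncates, so a 64-bit kernel version would additionally have to bound the shift and handle negative deltas its own way.

```python
import statistics

def integer_variance(values, shift=0):
    """Welford-style online variance using only integer arithmetic.

    The running mean is kept scaled by 2**shift and the running sum of
    squared deviations by 2**(2*shift), so the integer divisions lose
    fewer significant bits.  shift=0 is the plain unscaled version.
    """
    n = 0
    mean_s = 0  # running mean, scaled by 2**shift
    m2_s = 0    # sum of squared deviations, scaled by 2**(2*shift)
    for x in values:
        n += 1
        x_s = x << shift
        delta = x_s - mean_s
        mean_s += delta // n            # integer division: the lossy step
        m2_s += delta * (x_s - mean_s)
    if n < 2:
        return 0
    # Undo the scaling only once, at the very end.
    return (m2_s // (n - 1)) >> (2 * shift)

data = [1, 2, 4, 7, 11, 16]
print("unscaled :", integer_variance(data, 0))
print("shift=16 :", integer_variance(data, 16))
print("reference:", statistics.variance(data))
```

On this small input the unscaled version drifts away from the reference, while the shifted version lands on its integer part.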

Another thing we should consider afterwards is bug #10234.  For a stap script
not interested in @variance, this code should not be run at all.

dsmith at redhat dot com | 28 Jun 16:05 2016

[Bug translator/20307] New: tapset/linux/rpc.stp error (possible because of 'private' keyword)

https://sourceware.org/bugzilla/show_bug.cgi?id=20307

            Bug ID: 20307
           Summary: tapset/linux/rpc.stp error (possible because of
                    'private' keyword)
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: translator
          Assignee: systemtap at sourceware dot org
          Reporter: dsmith at redhat dot com
  Target Milestone: ---

I'm seeing something odd with rpc.stp:

====
# stap -vp4 ../src/testsuite/buildok/rpc-all-probes.stp 
Pass 1: parsed user script and 115 library scripts using
159296virt/42368res/8576shr/29248data kb, in 490usr/0sys/489real ms.
WARNING: cannot probe .return of 1 inlined functions  rpc_release_task
WARNING: cannot probe .return of 1 inlined functions  rpc_release_task
WARNING: cannot probe .return of 1 inlined functions  rpc_release_task
WARNING: cannot probe .return of 1 inlined functions  rpc_release_task
WARNING: cannot probe .return of 1 inlined functions  rpc_release_task
semantic error: unresolved arity-1 global array __rpc_create_args, missing
global declaration?: identifier '__rpc_create_args' at
/usr/local/share/systemtap/tapset/linux/rpc.stp:182:12
        source:                 __args = __rpc_create_args[tid()]

mcermak at redhat dot com | 28 Jun 13:58 2016

[Bug runtime/11308] aggregate operations for @variance, @skew, @kurtosis

https://sourceware.org/bugzilla/show_bug.cgi?id=11308

--- Comment #4 from Martin Cermak <mcermak at redhat dot com> ---
Created attachment 9370
  --> https://sourceware.org/bugzilla/attachment.cgi?id=9370&action=edit
a little testing script

-- 
You are receiving this mail because:
You are the assignee for the bug.

mcermak at redhat dot com | 28 Jun 13:57 2016

[Bug runtime/11308] aggregate operations for @variance, @skew, @kurtosis

https://sourceware.org/bugzilla/show_bug.cgi?id=11308

--- Comment #3 from Martin Cermak <mcermak at redhat dot com> ---
Created attachment 9369
  --> https://sourceware.org/bugzilla/attachment.cgi?id=9369&action=edit
working version of a patch

The attached patch computes the individual per-CPU variances using Knuth's
algorithm from Comment #1. Based on that, the aggregated variance over all the
CPUs is computed using the "Total Variance" formula from the above paper.
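
For orientation, the two-stage scheme can be sketched in floating point as follows. This is not the attached patch: the function names are hypothetical, and the merge step uses the standard pairwise combination formula (Chan et al.), which is assumed here to be equivalent to the "Total Variance" formula the comment refers to.

```python
import statistics

def welford(values):
    """Summarise one CPU's samples as (count, mean, M2) via Welford's online update."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return n, mean, m2

def combine(a, b):
    """Merge two (count, mean, M2) summaries into one."""
    na, mean_a, m2a = a
    nb, mean_b, m2b = b
    n = na + nb
    if n == 0:
        return 0, 0.0, 0.0
    delta = mean_b - mean_a
    mean = mean_a + delta * nb / n
    # The cross term vanishes when one side is empty, e.g. on a single CPU.
    m2 = m2a + m2b + delta * delta * na * nb / n
    return n, mean, m2

def total_variance(per_cpu_samples):
    """Sample variance over all CPUs; needs at least two samples in total."""
    total = (0, 0.0, 0.0)
    for samples in per_cpu_samples:
        total = combine(total, welford(samples))
    n, _, m2 = total
    return m2 / (n - 1)

per_cpu = [[1, 2, 3, 4], [10, 20, 30], [7]]   # made-up samples for three CPUs
flat = [x for cpu in per_cpu for x in cpu]
print(total_variance(per_cpu), statistics.variance(flat))
```

With real division the merged result agrees with `statistics.variance()` over the flattened samples up to rounding; the in-kernel version has to do the same steps with integer division only.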

This gives somewhat reasonable results for a "few" "small" integers with normal
distribution, but almost any other set of values makes it behave crazily
because of the integer arithmetic being used for the division.  Below I am
going to attach a little python script that helps compare this stap variance
implementation with python's statistics.variance().

At first glance, floating point arithmetic inside the linux kernel
doesn't look like something usual or straightforward.  But an attempt to
implement it might be an interesting one.  Not sure about this though.

Nikolay Borisov | 27 Jun 18:46 2016

[PATCH] Add the '-p4' options when exemplifying the module compilation

Currently the example command which supposedly should compile the
instrumentation module is missing the '-p4' option, meaning that
upon running it will compile and run the module. In order to make
the command more in sync with what this particular chapter is about,
add the -p4 options so that after running the command the user gets
to copy the resulting module.

Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
---
 doc/SystemTap_Beginners_Guide/en-US/CrossInstrumenting.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/SystemTap_Beginners_Guide/en-US/CrossInstrumenting.xml b/doc/SystemTap_Beginners_Guide/en-US/CrossInstrumenting.xml
index 3e5a4855ec60..c840a969014f 100644
--- a/doc/SystemTap_Beginners_Guide/en-US/CrossInstrumenting.xml
+++ b/doc/SystemTap_Beginners_Guide/en-US/CrossInstrumenting.xml
@@ -258,7 +258,7 @@
     appropriate values):
   </para>

-<screen><command>stap -r <replaceable>kernel_version</replaceable> <replaceable>script</replaceable> -m <replaceable>module_name</replaceable></command></screen>
+<screen><command>stap -p4 -r <replaceable>kernel_version</replaceable> <replaceable>script</replaceable> -m <replaceable>module_name</replaceable></command></screen>

   <para>
     Here, <replaceable>kernel_version</replaceable> refers to
-- 
2.7.4


dsmith at redhat dot com | 24 Jun 20:09 2016

[Bug testsuite/20298] New: the unprivileged_embedded_C.exp testcase needs updating

https://sourceware.org/bugzilla/show_bug.cgi?id=20298

            Bug ID: 20298
           Summary: the unprivileged_embedded_C.exp testcase needs
                    updating
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: testsuite
          Assignee: systemtap at sourceware dot org
          Reporter: dsmith at redhat dot com
  Target Milestone: ---

There are at least 2 problems I see with
testsuite/systemtap.stress/unprivileged_embedded_C.exp:

1) It only searches tapset functions in tapset/ and tapset/${ARCH}/. It doesn't
search tapsets in tapset/linux/ and tapset/linux/${ARCH}/. It needs to be
updated for the dyninst/linux tapset directory rearrangement.

2) It gets spurious failures. It assumes that if a tapset function doesn't have
any embedded C (i.e. it is a pure stap script function), an unprivileged user
can call it. That isn't the case if that pure stap script function calls a
function that is privileged.
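
To make point 2 concrete, the check has to follow the call graph rather than look at each function in isolation. The following is only a sketch of that idea in Python, not the testcase's actual (expect/Tcl) logic; the `functions` table layout and all function names in it are hypothetical.

```python
def callable_unprivileged(name, functions, visiting=None):
    """Return True if `name` and everything it transitively calls is unprivileged.

    `functions` maps a function name to a dict with:
      "privileged": True if the function itself requires privilege
      "calls":      list of function names it calls
    """
    visiting = set() if visiting is None else visiting
    if name in visiting:
        # Already being examined further up the call chain (recursion):
        # this edge adds no new information.
        return True
    info = functions[name]
    if info["privileged"]:
        return False
    visiting.add(name)
    ok = all(callable_unprivileged(callee, functions, visiting)
             for callee in info["calls"])
    visiting.discard(name)
    return ok

# Hypothetical tapset: pure_wrapper has no embedded C of its own,
# but it calls a privileged helper.
functions = {
    "guarded_helper": {"privileged": True,  "calls": []},
    "pure_wrapper":   {"privileged": False, "calls": ["guarded_helper"]},
    "pure_leaf":      {"privileged": False, "calls": []},
}
for fn in sorted(functions):
    print(fn, callable_unprivileged(fn, functions))
```

A testcase built this way would expect a failure, not a pass, when an unprivileged user calls `pure_wrapper`.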

For example, note the following 2 functions from tapset/linux/conversions.exp:

====

mcermak at redhat dot com | 24 Jun 15:11 2016

[Bug runtime/20297] New: parse error: array size out of range on i686

https://sourceware.org/bugzilla/show_bug.cgi?id=20297

            Bug ID: 20297
           Summary: parse error: array size out of range on i686
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: runtime
          Assignee: systemtap at sourceware dot org
          Reporter: mcermak at redhat dot com
  Target Milestone: ---

On i686 kernels I seem to be hitting an unexpected problem: an array of size N
is not sufficient to accommodate N elements.  This seems to work as expected
on all other arches AFAICT (incl. single-abi 32-bit arm).

=======
 6.8 S i686 # stap -p2 --poison-cache -ue 'global arr[1] probe oneshot{
arr["a"]=3  }' >/dev/null
parse error: array size out of range
        saw: /usr/local/share/systemtap/tapset/linux/i386/syscall_num.stp EOF

2 parse errors.
WARNING: tapset "/usr/local/share/systemtap/tapset/linux/i386/syscall_num.stp"
has errors, and will be skipped
Number of similar error messages suppressed: 1.
Rerun with -v to see them.
 6.8 S i686 # 

fche at redhat dot com | 21 Jun 23:38 2016

[Bug translator/20288] New: dwfl/elfutils problem when gathering line-record data for *symfile/*symline functions

https://sourceware.org/bugzilla/show_bug.cgi?id=20288

            Bug ID: 20288
           Summary: dwfl/elfutils problem when gathering line-record data
                    for *symfile/*symline functions
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: translator
          Assignee: systemtap at sourceware dot org
          Reporter: fche at redhat dot com
  Target Milestone: ---

random git stap build running against f22

% ./stap -V
Systemtap translator/driver (version 3.1/0.166, commit
release-3.0-110-g713029398d38 + changes)

% ./stap -p4 -e 'probe process("/bin/ls").function("main") { log(usymfile(0))
}' --vp 0044
[...]
dump_unwindsyms /usr/bin/ls index=2 base=0x400000
Found build-id in /usr/bin/ls, length 20, start at 0x400284
WARNING: No debug line data for /usr/bin/ls, no error
[...]

wait, wha?

dsmith at redhat dot com | 21 Jun 22:39 2016

[Bug runtime/20286] New: probe handlers using hrtimers taking too long

https://sourceware.org/bugzilla/show_bug.cgi?id=20286

            Bug ID: 20286
           Summary: probe handlers using hrtimers taking too long
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: runtime
          Assignee: systemtap at sourceware dot org
          Reporter: dsmith at redhat dot com
  Target Milestone: ---

When running the full testsuite (in parallel mode, but I'm not sure it
matters), I see the following on the console (3.10.0-442.el7.ppc64):

[ 6881.740026] hrtimer: interrupt took 3721 ns

While that doesn't look too alarming, what it means is that the next timer has
already expired because the current timer callback took too long. I'm looking
at the source of hrtimer_interrupt() in kernel/time/hrtimer.c. Here's a comment
from there:

         * The next timer was already expired due to:
         * - tracing
         * - long lasting callbacks
         * - being scheduled away when running in a VM
         *
         * We need to prevent that we loop forever in the hrtimer

mcermak at redhat dot com | 21 Jun 12:02 2016

[Bug runtime/20282] New: implicit declaration of function ‘__get_user_bad’ on recent aarch64 kernel

https://sourceware.org/bugzilla/show_bug.cgi?id=20282

            Bug ID: 20282
           Summary: implicit declaration of function ‘__get_user_bad’ on
                    recent aarch64 kernel
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: runtime
          Assignee: systemtap at sourceware dot org
          Reporter: mcermak at redhat dot com
  Target Milestone: ---

With kernel-4.5.0-0.40.el7.aarch64 kbuild fails thusly:

=======
Pass 3: translated to C into
"/tmp/stapCJl99c/stap_355af5cf9305976cecf5dce2a13ff16c_53860_src.c" using
191104virt/82880res/6208shr/76096data kb, in 20usr/150sys/178real ms.
In file included from /usr/local/share/systemtap/runtime/linux/runtime.h:214:0,
                 from /usr/local/share/systemtap/runtime/runtime.h:26,
                 from
/tmp/stapCJl99c/stap_355af5cf9305976cecf5dce2a13ff16c_53860_src.c:25:
/usr/local/share/systemtap/runtime/stp_string.c: In function
‘_stp_decode_utf8’:
/usr/local/share/systemtap/runtime/stp_string.c:73:2: error: implicit
declaration of function ‘__get_user_bad’
[-Werror=implicit-function-declaration]

mcermak at redhat dot com | 21 Jun 11:58 2016

[Bug translator/20281] New: probe process("") kills stap with SIGABRT

https://sourceware.org/bugzilla/show_bug.cgi?id=20281

            Bug ID: 20281
           Summary: probe process("")  kills stap with SIGABRT
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: translator
          Assignee: systemtap at sourceware dot org
          Reporter: mcermak at redhat dot com
  Target Milestone: ---

The following script makes stap die with a not very user-friendly error message:

=======
work #  stap -p3 -e 'probe process("").syscall { println("hey")}' -c /bin/ls
stap: ../../home/mcermak/stap/src/translate.cxx:7424: void
emit_symbol_data(systemtap_session&): Assertion `modname.length() != 0' failed.
Aborted (core dumped)
work #
=======

This happens at translation time, at 'assert (modname.length() != 0);', when
processing user modules (files).  It could perhaps be caught earlier, possibly
at elaboration time within match_node::find_and_build() (?), so that a nicer
error message is returned.
