Masami Hiramatsu | 1 Sep 2006 04:55

Re: [RFC] Proposal of marker implementation

Hi Frank,

I discussed this idea and your question at Hitachi, and I have
decided to shelve it, because implementing this idea is not easy
on other architectures, and there is no proof of a big advantage.

I was just concerned about the overhead of the current approach, which
comes from accessing variables. For now I'd like to prioritize other
things that need to be done, for example the flight recorder,
kprobe-booster on other architectures, integrated tracing scripts,
and non-marker-based djprobe.

Thanks,

Frank Ch. Eigler wrote:
> Masami Hiramatsu <masami.hiramatsu.pt <at> hitachi.com> writes:
> 
>> I'd like to suggest my marker idea, which I spoke about at OLS.
>> My idea is based on ELF binary sections and djprobe.
>> [...]
> 
> Could this approach be made "pluggable" in the sense of
> interchangeable with the other type at the call site? Can you make a
> version of these macros that adds reliable parameter passing?  Can you
> outline a proof-of-concept of the probe that would use these hooks?
> Is live activation/deactivation of the probes a problem on complex
> hosts (smp / preempt)?
> 
> - FChE
> 

Masami Hiramatsu | 1 Sep 2006 04:55

Re: stptracer-20060828 has released.

Hi,

Li Guanglei wrote:
> The first thing I want to figure out about STPTracer is how it
> performs compared with LKET. STPTracer uses an interface named gBTI
> which can only print a fixed number of integers into a pre-reserved relayfs
> buffer, while LKET uses _stp_printf with binary print support.

It's interesting. Thank you for the benchmarking.

> <1> run app_getsid without being probed:
> cpu 0: loops=5000000, average=442 ns
> 
> <2> run app_getsid with an stp script with empty probe handlers, i.e.,
>        probe syscall.getsid {}
> cpu 0: loops=5000000, average=1523 ns
> 
> <3> probe using lket_getsid.stp:
> cpu 0: loops=5000000, average=3079 ns
> 
> <4> probe using lkst_getsid.stp:
> cpu 0: loops=5000000, average=2341 ns
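
The per-call cost implied by the figures quoted above can be worked out by subtraction. The short Python sketch below assumes that run <1> is the unprobed baseline and that run <2> isolates the fixed cost of firing a probe; the variable names are illustrative only and do not come from either tracer.

```python
# Average latencies (ns per call) copied from the quoted benchmark.
BASELINE_NS = 442  # <1>, assumed to be the unprobed baseline
measured_ns = {
    "empty probe handler": 1523,  # <2>
    "lket_getsid.stp": 3079,      # <3>
    "lkst_getsid.stp": 2341,      # <4>
}

# Extra time per call relative to the unprobed run.
overhead_ns = {name: t - BASELINE_NS for name, t in measured_ns.items()}

# Time attributable to the handler body alone, i.e. beyond an empty probe.
EMPTY_NS = measured_ns["empty probe handler"]
handler_ns = {
    name: t - EMPTY_NS
    for name, t in measured_ns.items()
    if name != "empty probe handler"
}

for name, extra in overhead_ns.items():
    print(f"{name}: +{extra} ns per call over baseline")
for name, extra in handler_ns.items():
    print(f"{name}: +{extra} ns per call over an empty probe")
```

On these assumptions, most of the difference between the two tracers lies in the handler body rather than in the probe dispatch itself.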

I checked that with a similar benchmark (I used gettimeofday instead of getsid):

<1> no probe handler
148 ns
<2> empty probe handler
406 ns
<3> lket probe handler
[...]

Masami Hiramatsu | 1 Sep 2006 05:07

Re: stptracer-20060828 has released.

Hi Frank,

Frank Ch. Eigler wrote:
> Li Guanglei <guanglei <at> cn.ibm.com> writes:
> 
>> [...]  So we use _stp_printf() for its fancy printing format, in
>> exchange for slower speed compared with gBTI.  But an interface
>> like gBTI imposes too many restrictions on the trace data format and the
>> number of data items to be traced. Maybe we should find some places
>> inside _stp_printf() for further performance improvement while still
>> keeping the ability to print data freely.
> 
> To avoid the overhead inherent in dynamic interpretation of formatting
> strings, we would need to gradually adopt a compiled approach.  The
> translator is already parsing formatting strings.  It could emit
> low-level equivalent code to write binary chunks directly.  The
> runtime would only need to provide buffer-reservation/commit routines.

It's a good idea.
I suggest that stpd should handle each binary chunk correctly.

In the gBTI approach, each binary entry carries its length in
the head of the entry. So the enhanced merging routine (the gbti_merge
command) can separate those entries correctly even if the routine
doesn't know the format of the binary data.

SystemTap currently can't correctly merge temporary files that include
binary data. I think if each binary chunk carried its
length information, we could merge them correctly.
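
A minimal sketch of that idea in Python, for illustration only. The header layout here (a 16-bit payload length followed by a 64-bit timestamp, little-endian) is an assumption made for the example and is not the actual gBTI entry format; the function names are likewise hypothetical. The point is that the merger only reads the header and never interprets the payload.

```python
import heapq
import struct
from typing import Iterator, List, Tuple

# Assumed header: payload length (u16) then timestamp (u64), little-endian.
HEADER = struct.Struct("<HQ")


def pack_entry(timestamp: int, payload: bytes) -> bytes:
    """Prefix an opaque payload with its length and timestamp."""
    return HEADER.pack(len(payload), timestamp) + payload


def split_entries(buf: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (timestamp, raw_entry) pairs using only the header fields."""
    offset = 0
    while offset < len(buf):
        length, timestamp = HEADER.unpack_from(buf, offset)
        end = offset + HEADER.size + length
        yield timestamp, buf[offset:end]
        offset = end


def merge_buffers(buffers: List[bytes]) -> bytes:
    """Merge per-CPU buffers, each already ordered by timestamp."""
    streams = [split_entries(b) for b in buffers]
    ordered = heapq.merge(*streams, key=lambda entry: entry[0])
    return b"".join(raw for _, raw in ordered)
```

For example, `merge_buffers([cpu0_buf, cpu1_buf])` interleaves the entries of two per-CPU buffers by timestamp without knowing what the payload bytes mean.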


Li Guanglei | 1 Sep 2006 05:47

Re: stptracer-20060828 has released.

Masami Hiramatsu wrote:
> Hi Frank,
> 
> Frank Ch. Eigler wrote:
>> Li Guanglei <guanglei <at> cn.ibm.com> writes:
>>
>>> [...]  So we use _stp_printf() for its fancy printing format, in
>>> exchange for slower speed compared with gBTI.  But an interface
>>> like gBTI imposes too many restrictions on the trace data format and the
>>> number of data items to be traced. Maybe we should find some places
>>> inside _stp_printf() for further performance improvement while still
>>> keeping the ability to print data freely.
>> To avoid the overhead inherent in dynamic interpretation of formatting
>> strings, we would need to gradually adopt a compiled approach.  The
>> translator is already parsing formatting strings.  It could emit
>> low-level equivalent code to write binary chunks directly.  The
>> runtime would only need to provide buffer-reservation/commit routines.
> 
> It's a good idea.
> I suggest that stpd should handle each binary chunk correctly.
> 
> In the gBTI approach, each binary entry carries its length in
> the head of the entry. So the enhanced merging routine (the gbti_merge
> command) can separate those entries correctly even if the routine
> doesn't know the format of the binary data.
> 
> SystemTap currently can't correctly merge temporary files that include
> binary data. I think if each binary chunk carried its
> length information, we could merge them correctly.
> 

Frank Ch. Eigler | 1 Sep 2006 15:53

Re: [RFC] Proposal of marker implementation

Hi -

> I discussed this idea and your question at Hitachi, and I have
> decided to shelve it, because implementing this idea is not easy
> on other architectures, and there is no proof of a big advantage.
> 
> I was just concerned about the overhead of the current approach [...]

I understand.  Thank you for thinking of these other methods.
In time, we may include several of the ideas in the code.

- FChE

Frank Ch. Eigler | 1 Sep 2006 22:37

Re: stptracer-20060828 has released.

Hi -

On Fri, Sep 01, 2006 at 11:47:05AM +0800, Li Guanglei wrote:

> [...]  But I have some questions about the compiled approach. stap will
> treat all integer data as 64-bit, but most binary trace integer data
> items need only 1 or 2 bytes. [...] And will it use a fixed length for
> all string items? Some string trace data items have only a few
> chars.

The translator could apply obvious policies to this question.  If the
formatting string includes "%1b", it would write just the low-order
byte of the integer.  If the formatting string includes "%64s", this
would result in a fixed-width substring field.  For plain "%s", it
would be a dynamic-width field, which alone would not be ideal for
binary format streams.

The idea here is to make this compiled tracing a pure optimization:
not to change the script input nor data output, but just to produce it
quicker.
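
The directive policy described above can be illustrated with a small Python sketch that parses a format string once and returns a function writing binary fields directly. This is only a model of the idea, not the SystemTap translator (which would emit C). The little-endian byte order, the NUL padding of fixed-width strings, the restriction to widths 1, 2, 4 and 8 for "%Nb", and the names `compile_format` and `emit` are all assumptions of the example. Plain "%s" is rejected here because a dynamic-width field would need extra framing.

```python
import re
from typing import Callable, List

_DIRECTIVE = re.compile(r"%(\d+)([bs])")


def _int_writer(width: int) -> Callable[[int], bytes]:
    """Write only the low-order `width` bytes of an integer."""
    mask = (1 << (8 * width)) - 1
    return lambda value: (value & mask).to_bytes(width, "little")


def _str_writer(width: int) -> Callable[[str], bytes]:
    """Write a fixed-width field, truncated or NUL-padded to `width`."""
    return lambda text: text.encode("utf-8")[:width].ljust(width, b"\0")


def compile_format(fmt: str) -> Callable[..., bytes]:
    """Parse `fmt` once; return a function that packs its arguments."""
    if not re.fullmatch(r"(?:%\d+[bs])+", fmt):
        raise ValueError("only %Nb and %Ns directives are supported here")
    writers: List[Callable] = []
    for width_text, kind in _DIRECTIVE.findall(fmt):
        width = int(width_text)
        if kind == "b":
            if width not in (1, 2, 4, 8):
                raise ValueError(f"unsupported integer width: {width}")
            writers.append(_int_writer(width))
        else:
            writers.append(_str_writer(width))

    def emit(*args) -> bytes:
        if len(args) != len(writers):
            raise TypeError("argument count does not match format")
        return b"".join(w(a) for w, a in zip(writers, args))

    return emit
```

With this sketch, `compile_format("%1b%64s")` is parsed once, and each later call to the returned function only masks, copies and pads, with no format-string interpretation on the hot path.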

> [...] But in some situations I will put the print/trace statement in
> embedded C code to avoid the calls to function__dwarf_tvar_get*.

If you don't call those tvar_get functions, how are you safely
extracting target data?

> One example is the struct scsi_cmnd in the SCSI trace hooks in LKET. I
> need to retrieve 10 arguments from this struct. In embedded C
> code these are only 10 assignments, while in a stap script they will be
[...]

fche | 2 Sep 2006 16:27

new systemtap snapshot available

A new automated systemtap CVS snapshot is available.
ftp://sources.redhat.com/pub/systemtap/snapshots/systemtap-20060902.tar.bz2
555074 bytes

Li Guanglei | 5 Sep 2006 04:30

Re: stptracer-20060828 has released.

Frank Ch. Eigler wrote:
> Hi -
> 
> On Fri, Sep 01, 2006 at 11:47:05AM +0800, Li Guanglei wrote:
> 
>> [...]  But I have some questions about the compiled approach. stap will
>> treat all integer data as 64-bit, but most binary trace integer data
>> items need only 1 or 2 bytes. [...] And will it use a fixed length for
>> all string items? Some string trace data items have only a few
>> chars.
> 
> The translator could apply obvious policies to this question.  If the
> formatting string includes "%1b", it would write just the low-order
> byte of the integer.  If the formatting string includes "%64s", this
> would result in a fixed-width substring field.  For plain "%s", it
> would be a dynamic-width field, which alone would not be ideal for
> binary format streams.
> 
> The idea here is to make this compiled tracing a pure optimization:
> not to change the script input nor data output, but just to produce it
> quicker.
> 

Thanks. I see.

> 
>> [...] But in some situations I will put the print/trace statement in
>> embedded C code to avoid the calls to function__dwarf_tvar_get*.
> 
> If you don't call those tvar_get functions, how are you safely
> [...]

Li Guanglei | 5 Sep 2006 04:36

Re: stptracer-20060828 has released.

Li Guanglei wrote:
> Frank Ch. Eigler wrote:
>>
>> The idea here is to make this compiled tracing a pure optimization:
>> not to change the script input nor data output, but just to produce it
>> quicker.

BTW, should we open a new bug for this compiled approach to binary tracing?

- Guanglei

Gui,Jian | 5 Sep 2006 10:45

draft RPC tapset

Hi folks,

I am working on RPC trace hooks for Systemtap/LKET. These trace
hooks can help dynamically trace the activities on both RPC
clients and servers.

The functions in the sunrpc module (see net/sunrpc/sunrpc_syms.c
and others) are roughly categorized into several groups:
* for RPC scheduler
* for RPC client
* for RPC client transport
* for RPC client credential cache
* for RPC server
* for RPC statistics
* for RPC caching
* for generic XDR

As a starting point, I picked the trace hooks mainly from the RPC client,
scheduler and server side. I am sure they will not always be enough, so
anyone can extend this tapset whenever necessary.

I also want to make sure the trace hooks I chose are in the right places
and the parameters are correctly handled. It would be much appreciated
if you could take a look at it. Please feel free to let me know if
you have any questions/suggestions/comments.

Thanks.

Gui,Jian
