Ben Pearre | 1 May 04:16 2006

Measuring hyperthreading slowdown?

Hi!

    On a Xeon processor, how much does running A slow down B if A and
    B run simultaneously on the two hyperthreaded cores?

I'm working on a hack to the scheduler that tries to figure this out,
and then to choose tasks that cooperate reasonably nicely.  This means
that I need to get some (very rough) measure of slowdown of one task
when another runs on the other htcore.

The part that's stumping me is measuring the performance hit.  It
should be sufficient for me to measure, say, resource stalls; it
should not even matter if I counted resource stalls for both processes
in the same register (enough data would identify asymmetries).  If I
can get the numbers per hthread, of course such things as instructions
retired wouldn't be a bad addition, but that is inessential.

I was just playing with papi/perfctr, and found that it is really not
suited to this: it disables hyperthreading since it can't share the
HPCs between simultaneous tasks.  They suggested I check out oprofile.

Can oprofile do what I describe?  The manual (section 3.4) says that
ht is not supported under 2.4; will 2.6 let me do what I want?  I'm
currently working on 2.6.16, but that is not set in stone.

Any other suggestions?

Many thanks for reading!!
-Ben
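
A rough slowdown figure of the kind asked about above can be computed from per-task counter readings. The following is only a minimal sketch, not oprofile or perfctr code: the function names and the sample numbers are hypothetical, and it assumes cycles and instructions retired can be obtained for a task both when it runs alone and when another task occupies the sibling hyperthread.

```python
def ipc(instructions_retired, cycles):
    """Instructions per cycle for one measurement interval."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions_retired / cycles


def slowdown(solo_ipc, corun_ipc):
    """Factor by which a task slows down when co-scheduled.

    1.0 means no slowdown; 2.0 means the task progresses at half its
    solo rate while the sibling hyperthread is busy.
    """
    if corun_ipc <= 0:
        raise ValueError("co-run IPC must be positive")
    return solo_ipc / corun_ipc


# Hypothetical readings for task B: alone, then with task A on the sibling.
b_solo = ipc(instructions_retired=8_000_000, cycles=10_000_000)
b_with_a = ipc(instructions_retired=5_000_000, cycles=10_000_000)
b_slowdown = slowdown(b_solo, b_with_a)
```

Computing the same ratio for A while B runs would expose the asymmetries mentioned above, since A may hurt B more than B hurts A.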

SourceForge.net | 1 May 08:10 2006

[ oprofile-Bugs-1479211 ] Unable to use Oprofile examples on Alpha platform

Bugs item #1479211, was opened at 2006-04-30 00:19
Message generated for change (Comment added) made by nobody
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=116191&aid=1479211&group_id=16191

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: Unable to use Oprofile examples on Alpha platform

Initial Comment:
E-mail:  Mike(at)flyingpenguins.org

I am unable to use some of the OProfile examples on the
alpha platform using OProfile 0.9.1.

Steps to reproduce:
1) opcontrol --vmlinux=/usr/src/linux-2.6.16.1/vmlinux
2) opcontrol --start
3a) opreport --demangle=smart --symbols `which xmms`
Segmentation fault
3b) opannotate --source --assembly `which xmms`
Segmentation fault

John Levon | 1 May 17:14 2006

Re: Measuring hyperthreading slowdown?

On Sun, Apr 30, 2006 at 08:16:03PM -0600, Ben Pearre wrote:

> Can oprofile do what I describe?  The manual (section 3.4) says that
> ht is not supported under 2.4; will 2.6 let me do what I want?  I'm
> currently working on 2.6.16, but that is not set in stone.

You can use CPU separation and the right events to profile each thread
with the 2.6 kernel

john
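
CPU separation reports samples per logical CPU, so comparing the two hyperthreads of one core requires knowing which logical CPUs are siblings. A minimal sketch of one way to find out, assuming a kernel that exposes `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list` (present on recent Linux kernels; it may not exist on 2.6.16). The helper names `parse_cpu_list` and `thread_siblings` are hypothetical and are not part of oprofile.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a kernel CPU list such as "0,4" or "0-1,8-9" into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate an empty list
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def thread_siblings(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Return the logical CPUs that share a physical core with `cpu`.

    Assumes the sysfs topology file exists on the running kernel.
    """
    path = Path(sysfs_root) / f"cpu{cpu}" / "topology" / "thread_siblings_list"
    return parse_cpu_list(path.read_text())
```

With the sibling pairs known, per-CPU sample columns can be read side by side for the two hyperthreads of each core.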

-------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
pak333 | 2 May 06:31 2006

Oprofile for Java

I am using oprofile to profile a Java app. Although top shows "java" using 99% of the CPU, oprofile does not collect any samples for Java. I am using the Sun JVM and have also tried JRockit.
 
Why is oprofile not able to assign any samples to the "java" binary? I do not see any jitted code samples.
 
 
Thanks
- Padma
 
Andi Kleen | 3 May 11:40 2006

Re: oprofile results in Kernel panic

John Kacur <jkacur <at> ca.ibm.com> writes:

> I believe that OProfile is incorrectly identifying your processor as a
> 32-bit Xeon instead of a 64-bit Xeon. In function nmi_init, it matches
> case X86_VENDOR_INTEL for the vendor, and case 0xf for the family and
> then calls p4_init which just looks if you have a model number > 4. I'll
> see if I can come up with a patch to recognize this in p4_init and
> return 0. This would effectively put OProfile in timer mode. (A better
> patch for the long run would figure out what's needed for your processor
> to work with the nmi code.) As a work around you can try running
> OProfile in timer mode. The OProfile manual says "You can force use of
> the timer interrupt by using the timer=1 module parameter (or
> oprofile.timer=1 on the boot command line if OProfile is built-in)."

The standard P4 code should actually work. Why do you think it shouldn't?
I have used it many times on 64bit.  I don't think forcing timer
mode is a good idea.

-Andi

Andi Kleen | 3 May 11:43 2006

Re: oprofile results in Kernel panic

Olaf Bachmann <olaf.bachmann <at> is-teledata.com> writes:

> Hi,
> a run of oprofile on one of our production-hosts has resulted in a
> kernel panic (see attached screenshot).
> Can anyone help?

The png is unfortunately useless because the interesting backtrace of
the actual panic has scrolled away. You can increase the resolution
with vga=0x0f07 when video mode switching is compiled in.

-Andi

William Cohen | 3 May 15:51 2006

Re: oprofile results in Kernel panic

Andi Kleen wrote:
> Olaf Bachmann <olaf.bachmann <at> is-teledata.com> writes:
> 
> 
>>Hi,
>>a run of oprofile on one of our production-hosts has resulted in a
>>kernel panic (see attached screenshot).
>>Can anyone help?
> 
> 
> The png is unfortunately useless because the interesting backtrace of
> the actual panic has scrolled away. You can increase the resolution
> with vga=0x0f07 when video mode switching is compiled in.
> 
> -Andi

Would it be possible to set up a serial console on the machine and log 
the serial output on another machine? That would allow one to get the 
complete oops message in nice ascii format.

-Will

John Kacur | 3 May 16:04 2006

Re: oprofile results in Kernel panic

On Wed, 2006-05-03 at 11:40 +0200, Andi Kleen wrote:
> John Kacur <jkacur <at> ca.ibm.com> writes:
> 
> > I believe that OProfile is incorrectly identifying your processor as a
> > 32-bit Xeon instead of a 64-bit Xeon. In function nmi_init, it matches
> > case X86_VENDOR_INTEL for the vendor, and case 0xf for the family and
> > then calls p4_init which just looks if you have a model number > 4. I'll
> > see if I can come up with a patch to recognize this in p4_init and
> > return 0. This would effectively put OProfile in timer mode. (A better
> > patch for the long run would figure out what's needed for your processor
> > to work with the nmi code.) As a work around you can try running
> > OProfile in timer mode. The OProfile manual says "You can force use of
> > the timer interrupt by using the timer=1 module parameter (or
> > oprofile.timer=1 on the boot command line if OProfile is built-in)."
> 
> The standard P4 code should actually work. Why do you think it shouldn't?
> I have used it many times on 64bit.  I don't think forcing timer
> mode is a good idea.
>  
> -Andi
> 
My mistake - it should work.

Ben Pearre | 4 May 03:10 2006

Re: Measuring hyperthreading slowdown?

Thanks!

This is a good start, but I'm wading through drivers/oprofile/*
etc. and the "internals" documentation and something is still unclear:

Is there a kernel data structure, readable from the running kernel
while profiling is turned on, that contains the total number of
counter events (or interrupts) credited to a given process/thread?

I can use opcontrol to add some events to be profiled (at fairly high
resolution).  I will get NMIs and lots of add_sample()s.  But it looks
like the only thing that's logged is the PC, and that it is not until
the post-gathering user code runs that counter totals are credited to
particular processes.  Is that true?

If not, any hints on which kernel structure I can read?

I'd like to stress that I don't need exact numbers (I know I won't get
those), but just _some_ numbers.  I'm still in proof-of-concept mode
for now, although if the approach shows promise, the ultimate
objective is a learning scheduler :)

I really appreciate any comments, including ones along the lines of
"You're MAD!!"

Cheers :)
-Ben

On 2006-05-01 16:14, John Levon wrote:
> On Sun, Apr 30, 2006 at 08:16:03PM -0600, Ben Pearre wrote:
> 
> > Can oprofile do what I describe?  The manual (section 3.4) says that
> > ht is not supported under 2.4; will 2.6 let me do what I want?  I'm
> > currently working on 2.6.16, but that is not set in stone.
> 
> You can use CPU separation and the right events to profile each thread
> with the 2.6 kernel
> 
> john

--
Ben Pearre     http://koryukanboulder.com/ben      PGP: CFDA6CDA
Don't let Bush read your email!             http://www.gnupg.org

John Levon | 4 May 03:35 2006

Re: Measuring hyperthreading slowdown?

On Wed, May 03, 2006 at 07:10:27PM -0600, Ben Pearre wrote:

> I can use opcontrol to add some events to be profiled (at fairly high
> resolution).  I will get NMIs and lots of add_sample()s.  But it looks
> like the only thing that's logged is the PC, and that not until the
> post-gathering user code runs do I figure out which counter totals are
> credited to which processes.  Is that true?

Yes.

> If not, any hints on which kernel structure I can read?

There is no such structure.

> I'd like to stress that I don't need exact numbers (I know I won't get
> those), but just _some_ numbers.  I'm still in proof-of-concept mode
> for now, although if the approach shows promise, the ultimate
> objective is a learning scheduler :)

You need perfmon or some such. There is no kernel API for oprofile.

regards
john
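
The exchange above confirms that per-process totals only emerge when user-space code post-processes the logged samples. The idea of that step can be sketched as follows. This is not oprofile's sample format or code: the `(pid, event)` records, the event name, and the function name are all hypothetical, chosen only to show how per-task estimates arise from a flat log in which each sample stands for one counter overflow.

```python
from collections import Counter


def tally_samples(samples, events_per_sample):
    """Estimate per-task event totals from a flat list of samples.

    samples: iterable of (pid, event_name) tuples, one per counter overflow.
    events_per_sample: dict mapping event_name to the overflow interval,
        i.e. how many events each sample represents.
    """
    totals = Counter()
    for pid, event in samples:
        totals[(pid, event)] += events_per_sample[event]
    return totals


# Hypothetical log: three stall samples for pid 101, one for pid 202.
log = [(101, "stalls"), (202, "stalls"), (101, "stalls"), (101, "stalls")]
totals = tally_samples(log, {"stalls": 10_000})
```

The totals are estimates with a resolution of one overflow interval, which fits the "just _some_ numbers" requirement stated above.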
