Sharyathi Nagesh | 5 May 14:03 2009

[PATCH 2/2] Provide stack unwinding feature

gmane.linux.kernel.crash-dump.crash-utility
Sharyathi Nagesh | 5 May 14:01 2009

[PATCH 1/2] Display local variables and function parameters

Sharyathi Nagesh | 5 May 13:58 2009

[PATCH 0/2] Display local variables & function parameters from stack frames

Hi
	Mohan, Sachin and myself have implemented this feature in crash to 
display local variables and arguments from vmcore dumps. This feature
introduces a new command 'local' in crash utility which provides
interface for stack unwinding along with option to display local
variables and arguments. This patch is based on crash utility
crash-4.0-8.9. It has dependency on libdw/libelf libraries provided by
elfutils package.
	This has been tested on dumps taken on ppc64 machine. We were able to 
unwind the stack as well as display local variables, arguments. It 
currently displays values for non-optimized variables only (this follows
gdb's convention).

TODO Items:
	1. Support on x86_64 and x86 needs to be implemented/tested
	2. Makefile needs to be updated to help packaging this feature
	
Regards
Sharyathi Nagesh

Dave Anderson | 6 May 17:10 2009

Re: [PATCH 0/2] Display local variables & function parameters from stack frames


----- "Sharyathi Nagesh" <sharyath <at> in.ibm.com> wrote:

> Hi
> 	Mohan, Sachin and myself have implemented this feature in crash to 
> display local variables and arguments from vmcore dumps. This feature
> introduces a new command 'local' in crash utility which provides
> interface for stack unwinding along with option to display local
> variables and arguments. This patch is based on crash utility
> crash-4.0-8.9. It has dependency on libdw/libelf libraries provided by
> elfutils package.
> 	This has been tested on dumps taken on ppc64 machine. We were able to 
> unwind the stack as well as display local variables, arguments. It 
> currently displays values for non-optimized variables only (this follows
> gdb's convention).
> 
> TODO Items:
> 	1. Support on x86_64 and x86 needs to be implemented/tested
> 	2. Makefile needs to be updated to help packaging this feature
> 	
> Regards
> Sharyathi Nagesh

A couple suggestions -- move the get_netdump_arch() and get_regs_from_elf_notes()
prototypes to defs.h under the others listed for netdump.c.

Then remove this from local.c:

  + #include <../netdump.h>

Shahar Luxenberg | 7 May 11:53 2009

x86_64 bt

Hi,

I've bumped into two issues while using crash's bt command on the x86_64 architecture:

  1. Incomplete disassembly in gdb: gdb's x/i command was unable to decode the nopl machine instruction (opcode 0x0f); the output was "(bad)". This resulted in an incorrect stack backtrace, since the frame size couldn't be calculated correctly. I did a quick test, replacing some gdb files with newer versions taken from binutils (i386-dis.c, for example), which solved the problem. Is there a plan to update the gdb version, or part of it?

  2. x86_64_get_framesize() is very naïve: it bails out as soon as the 'retq' instruction is seen. Is this issue going to be addressed?

Thanks,

Shahar.

Dave Anderson | 7 May 15:18 2009

Re: x86_64 bt


----- "Shahar Luxenberg" <shahar <at> checkpoint.com> wrote:

> Hi,
> 
> 
> 
> I've bumped into two issues while using crash' bt command on x86_64
> architecture:
> 
>     1. Incomplete disassembly of gdb: gdb's x/i command was unable to
> detect the nopl machine instruction (opcode 0x0f) – output was
> "(bad)". This resulted in an incorrect stack back trace since the
> frame size couldn't be calculated correctly. I've done a quick test,
> replacing some gdb files with a newer version taken from binutils
> (i386-dis.c for example) which solved the problem. Is there a plan of
> updating gdb version or part of it?

No, not at this time.  If the existing gdb code can be safely patched so
that it recognizes the new instruction, that sounds doable.  If you can
pare the change down to what is required, please forward a patch.

BTW, the wholesale replacement of the embedded gdb code is a massive
undertaking.  And since its primary purpose is for gathering structure
data type information and text disassembly, a patch to the existing
version is preferable. 

>     2. x86_64_get_framesize() is very naïve. It is bailing out once
> the 'retq' instruction is seen. Is this issue going to be addressed?

Well, continuing on from that point would most likely end up calculating
a framesize that is too large, so it's bailing out on the "short" side.
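
The trade-off can be illustrated with a minimal sketch. This is not crash's x86_64_get_framesize(); estimate_framesize() is a hypothetical helper that assumes each input line holds only an AT&T-syntax mnemonic and its operands. It sums the stack adjustments it sees and stops at the first 'retq', so anything after the return (which belongs to another code path) is never counted:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration, not crash code):
 * estimate a frame size from disassembly lines of the form
 * "mnemonic operands" by summing stack adjustments up to the first return.
 */
long estimate_framesize(const char *const *insns, size_t count)
{
    long framesize = 0;
    size_t i;

    assert(insns != NULL || count == 0);

    for (i = 0; i < count; i++) {
        const char *p;

        /* Bail out at the first return: later text is a different path. */
        if (strstr(insns[i], "retq") != NULL)
            break;

        /* Each push moves %rsp down by one 8-byte slot. */
        if (strstr(insns[i], "push") != NULL) {
            framesize += 8;
            continue;
        }

        /* "sub $0x28,%rsp" reserves space for locals. */
        p = strstr(insns[i], "sub");
        if (p != NULL && strstr(p, ",%rsp") != NULL) {
            const char *imm = strchr(p, '$');

            if (imm != NULL)
                framesize += strtol(imm + 1, NULL, 0);
        }
    }
    return framesize;
}
```

If the loop kept going past the 'retq', every later push or sub would inflate the total, which is the "too large" case described above.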

Dave

Sharyathi Nagesh | 11 May 08:16 2009

Re: [PATCH 0/2] Display local variables & function parameters from stack frames

Dave
	Excuse me for the late response. My laptop hardware crashed and I had some
difficulty accessing mail.
	Packaging and x86_64/x86 support are still work in progress. We are
hitting some unwinding issues with the x86_64 code and are trying to fix them
along with the local.mk changes. More help on the packaging front or with
rewriting local.mk would be an added favor :)

Dave Anderson wrote:
> ----- "Sharyathi Nagesh" <sharyath <at> in.ibm.com> wrote:
> 
>> Hi
>> 	Mohan, Sachin and myself have implemented this feature in crash to 
>> display local variables and arguments from vmcore dumps. This feature
>> introduces a new command 'local' in crash utility which provides
>> interface for stack unwinding along with option to display local
>> variables and arguments. This patch is based on crash utility
>> crash-4.0-8.9. It has dependency on libdw/libelf libraries provided by
>> elfutils package.
>> 	This has been tested on dumps taken on ppc64 machine. We were able to 
>> unwind the stack as well as display local variables, arguments. It 
>> currently displays values for non-optimized variables only (this follows
>> gdb's convention).
>>
>> TODO Items:
>> 	1. Support on x86_64 and x86 needs to be implemented/tested
>> 	2. Makefile needs to be updated to help packaging this feature
>> 	
>> Regards
>> Sharyathi Nagesh
> 
> A couple suggestions -- move the get_netdump_arch() and get_regs_from_elf_notes()
> prototypes to defs.h under the others listed for netdump.c.
> 
> Then remove this from local.c:
> 
>   + #include <../netdump.h>
> 
> By removing the netdump.h inclusion, you can build your package with just
> the "defs.h" file like so:
Sure, that can be done.
> 
>   # make -f local.mk TARGET=X86_64
>   gcc -nostartfiles -shared -g -rdynamic -o local.so local.c unwind_dw.c -fPIC \
>       -ldw -L ../../elfutils-0.137/libdw -I ../../elfutils-0.137/libdw \
>       -I ../../elfutils-0.137/libelf/ -DX86_64   -Wall;
>   #
> 
> Also, the TARGET_FLAGS setting you have in your local.mk doesn't do anything.
> I see that you copied it from the sial.mk file, where it's used as a replacement
> in sial.mk to replace the suggested "-D$(TARGET) $(TARGET_CFLAGS)" part of the
> compile line.  For x86_64 nothing is needed in TARGET_CFLAGS -- these are what
> the supported arches need:

Oops, my apologies for overlooking this. Yes, we will correct this.

>   #define TARGET_CFLAGS_X86    "TARGET_CFLAGS=-D_FILE_OFFSET_BITS=64"
>   #define TARGET_CFLAGS_ALPHA  "TARGET_CFLAGS="
>   #define TARGET_CFLAGS_PPC    "TARGET_CFLAGS=-D_FILE_OFFSET_BITS=64"
>   #define TARGET_CFLAGS_IA64   "TARGET_CFLAGS="
>   #define TARGET_CFLAGS_S390   "TARGET_CFLAGS=-D_FILE_OFFSET_BITS=64"
>   #define TARGET_CFLAGS_S390X  "TARGET_CFLAGS="
>   #define TARGET_CFLAGS_PPC64  "TARGET_CFLAGS=-m64"
>   #define TARGET_CFLAGS_X86_64 "TARGET_CFLAGS="

OK, we will include this after verifying.
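
The per-architecture settings quoted above could be captured in a small lookup table. A minimal sketch, assuming hypothetical names (struct target_cflags, lookup_target_cflags) that are not part of crash; only the architecture names and flag values come from the list above:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table type: one entry per supported TARGET value. */
struct target_cflags {
    const char *target;
    const char *cflags;
};

/* Values copied from the TARGET_CFLAGS_* definitions quoted above. */
static const struct target_cflags target_cflags_table[] = {
    { "X86",    "-D_FILE_OFFSET_BITS=64" },
    { "ALPHA",  "" },
    { "PPC",    "-D_FILE_OFFSET_BITS=64" },
    { "IA64",   "" },
    { "S390",   "-D_FILE_OFFSET_BITS=64" },
    { "S390X",  "" },
    { "PPC64",  "-m64" },
    { "X86_64", "" },
};

/* Return the extra compile flags for a target, or NULL if it is unknown. */
const char *lookup_target_cflags(const char *target)
{
    size_t i;
    size_t n = sizeof(target_cflags_table) / sizeof(target_cflags_table[0]);

    assert(target != NULL);

    for (i = 0; i < n; i++) {
        if (strcmp(target_cflags_table[i].target, target) == 0)
            return target_cflags_table[i].cflags;
    }
    return NULL;
}
```

With such a table, a ppc64 build would pick up -m64 while an x86_64 build would add nothing, matching the point made about local.mk.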
> 
> So I presume you do need the -m64 for ppc64, but I don't see how your
> local.mk file would pick it up?  I also don't understand where your extra
> $ADD_CFLAGS is supposed to get set up?
This was again copied from some other code and needs to be removed.

> 
> For that matter, the additional -L and -I for the elfutils stuff you've added
> seem to be unnecessary, just -ldw seems to suffice:
> 
>   # make -f local2.mk TARGET=X86_64
>   gcc -nostartfiles -shared -g -rdynamic -o local.so local.c unwind_dw.c -fPIC -ldw -DX86_64 -Wall; 
>   # 
Yes, you are right. In addition, the code has some requirements: libdw/libelf
must already be installed, and the installed elfutils must be newer than
version 0.125, since that version of the library has a bug that breaks the
code. This needs to be checked.
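
One way such a check might look is sketched below. elfutils_version_ok() is a hypothetical helper, not an elfutils or crash function, and it assumes the version is available as a "major.minor" string such as "0.137"; how that string is obtained is left out:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (an assumption for illustration): return 1 if the
 * given "major.minor" elfutils version string is newer than 0.125,
 * and 0 if it is not or cannot be parsed.
 */
int elfutils_version_ok(const char *version)
{
    int major = 0;
    int minor = 0;

    assert(version != NULL);

    /* Both fields must be present, e.g. "0.137". */
    if (sscanf(version, "%d.%d", &major, &minor) != 2)
        return 0;

    return major > 0 || minor > 125;
}
```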

Thanks for the information, we will try our best to fix some issues we 
are facing and incorporate these changes

Thanks
Yeehaw

urgrue | 11 May 22:29 2009

Re: crash without namelist?

----- "Jun Koi" <junkoi2004 gmail com> wrote:
>> But I think sometimes we only have the System.map file in hand,
>> without the namelist. Can we do anything with that? Because in many
>> cases only some restricted operations are enough to debug the crashed
>> dump.
>
>The basic design of the crash utility is that it is essentially
>a huge wrapper around the embedded gdb module, taking advantage 
>of gdb's allowance of an alternative user interface.  So during

But is it really so that one can not get _any_ useful info from the dump
without the namelist?
Most of the time just the trace is all I need (i.e. that which is
displayed on the console in a panic). That info is not accessible
conveniently because a) the console can only show the last bits of it
and b) I'd rather autoreboot on a panic.
I know I can use netconsole, but it would just be more convenient if I
could read this from the core file easily.

Dave Anderson | 11 May 22:55 2009

Re: crash without namelist?


----- "urgrue" <urgrue <at> bulbous.org> wrote:

> ----- "Jun Koi" <junkoi2004 gmail com> wrote:
> >> But I think sometimes we only have the System.map file in hand,
> >> without the namelist. Can we do anything with that? Because in many
> >> cases only some restricted operations are enough to debug the crashed
> >> dump.
> >
> >The basic design of the crash utility is that it is essentially
> >a huge wrapper around the embedded gdb module, taking advantage 
> >of gdb's allowance of an alternative user interface.  So during
> 
> But is it really so that one can not get _any_ useful info from the dump
> without the namelist?
> Most of the time just the trace is all I need (i.e. that which is
> displayed on the console in a panic). That info is not accessible
> conveniently because a) the console can only show the last bits of it
> and b) I'd rather autoreboot on a panic.
> I know I can use netconsole, but it would just be more convenient if I
> could read this from the core file easily.

I understand, but that's not how the crash utility is designed.
You basically want to roll your own utility that doesn't have all
the gdb dependencies that crash has.

If I'm not mistaken, wasn't there a proposed makedumpfile feature that
would just pull the log buffer from a vmcore?  Or is that something
from the diskdumputils package?  I know I've heard talk of it, but
I can't recall where.

Dave

Dave Anderson | 12 May 15:41 2009

Fwd: crash without namelist?


Thanks Cai -- the capability will be in RHEL5.4 and it
looks to have been put in place in the upstream kernel
and sourceforge makedumpfile already.

The bugzillas referenced below are restricted access so I
removed the numbers, but the option allows you to extract
the log buffer from a kdump vmcore into an output file 
like this:

  # makedumpfile --dump-dmesg /proc/vmcore outputfile

Dave

----- Forwarded Message -----
From: "CAI Qian" <caiqian <at> redhat.com>
To: anderson <at> redhat.com
Cc: caiqian <at> redhat.com
Sent: Monday, May 11, 2009 7:56:36 PM GMT -05:00 US/Canada Eastern
Subject: Re: [Crash-utility] crash without namelist?

Hi Dave,

From: Dave Anderson <anderson <at> redhat.com>
Subject: Re: [Crash-utility] crash without namelist?
Date: Mon, 11 May 2009 16:55:00 -0400 (EDT)

> 
> ----- "urgrue" <urgrue <at> bulbous.org> wrote:
> 
>> ----- "Jun Koi" <junkoi2004 gmail com> wrote:
>> >> But I think sometimes we only have the System.map file in hand,
>> >> without the namelist. Can we do anything with that? Because in many
>> >> cases only some restricted operations are enough to debug the crashed
>> >> dump.
>> >
>> >The basic design of the crash utility is that it is essentially
>> >a huge wrapper around the embedded gdb module, taking advantage 
>> >of gdb's allowance of an alternative user interface.  So during
>> 
>> But is it really so that one can not get _any_ useful info from the dump
>> without the namelist?
>> Most of the time just the trace is all I need (i.e. that which is
>> displayed on the console in a panic). That info is not accessible
>> conveniently because a) the console can only show the last bits of it
>> and b) I'd rather autoreboot on a panic.
>> I know I can use netconsole, but it would just be more convenient if I
>> could read this from the core file easily.
> 
> I understand, but that's not how the crash utility is designed.
> You basically want to roll your own utility that doesn't have all
> the gdb dependencies that crash has.
> 
> If I'm not mistaken, wasn't there a proposed makedumpfile feature that
> would just pull the log buffer from a vmcore?  Or is that something
> from the diskdumputils package?  I know I've heard talk of it, but
> I can't recall where.
> 

Yes, it is in RHEL5.4.

This is the kexec-tools part of changes,
https://bugzilla.redhat.com/show_bug.cgi?id=xxxxxx

This is the kernel part of changes,
https://bugzilla.redhat.com/show_bug.cgi?id=xxxxxx

Thanks,
CAI Qian