Aaron Tomlin | 21 Nov 16:10 2014

[PATCH] crash: Show kernel tainted status in display_sys_stats()

The sys command displays essential system information and it is often
shown by default when crash is started in non-silent mode. It might be
considered helpful to report if the kernel is tainted or not when
sys is run without any arguments. This patch makes this change.
The intended output is as follows:

      KERNEL: /usr/lib/debug/lib/modules/3.16.4-200.fc20.x86_64/vmlinux
    DUMPFILE: /dev/crash
        CPUS: 4
        DATE: Fri Nov 21 15:02:56 2014
      UPTIME: 6 days, 07:41:22
LOAD AVERAGE: 0.29, 0.20, 0.15
       TASKS: 397
    NODENAME: atomlin.usersys.redhat.com
     RELEASE: 3.16.4-200.fc20.x86_64
     VERSION: #1 SMP Mon Oct 6 12:57:00 UTC 2014
     TAINTED: YES
     MACHINE: x86_64  (2693 Mhz)
      MEMORY: 7.7 GB
         PID: 12172
     COMMAND: "crash"
        TASK: ffff8801b0aacf00  [THREAD_INFO: ffff8801b1e1c000]
         CPU: 0
       STATE: TASK_RUNNING (ACTIVE)

If the tainted_mask or tainted symbol does not exist then nothing is
displayed.

Signed-off-by: Aaron Tomlin <atomlin <at> redhat.com>
---

Aaron Tomlin | 21 Nov 13:13 2014

[RFC PATCH 0/2] crash: Show memory overcommit data in dump_kmeminfo()

Hi,

The first patch changes dump_kmeminfo() to report overcommit information
similar to that displayed in the /proc/meminfo file. It may be useful for
spotting memory overcommit abuse, for example in forced vmcores from
system hangs due to a shortage of memory. The intended output is as follows:

  crash> kmem -i
		   PAGES        TOTAL      PERCENTAGE
      TOTAL MEM  1965332       7.5 GB         ----
	   FREE    78080       305 MB    3% of TOTAL MEM
	   USED  1887252       7.2 GB   96% of TOTAL MEM
	 SHARED   789954         3 GB   40% of TOTAL MEM
	BUFFERS   110606     432.1 MB    5% of TOTAL MEM
	 CACHED  1212645       4.6 GB   61% of TOTAL MEM
	   SLAB   146563     572.5 MB    7% of TOTAL MEM

     TOTAL SWAP  1970175       7.5 GB         ----
      SWAP USED        5        20 KB    0% of TOTAL SWAP
      SWAP FREE  1970170       7.5 GB   99% of TOTAL SWAP

   COMMIT LIMIT  2952841      11.3 GB         ----
      COMMITTED  1150595       4.4 GB   38% of TOTAL LIMIT

The second patch simply removes dump_zone_page_usage().

Tested under 3.16.4-200.fc20.x86_64 only, though this should work under
RHEL5 (2.6.18) and above.

Aaron Tomlin (2):

Sebastian Ott | 14 Nov 14:56 2014

[PATCH] s390: support irq command via generic_dump_irq

Hi,

here is a simple patch that adds rudimentary support for the irq command on s390.
Nothing special like irq statistics, just the plain list of irqs. Also
this will only work on recent kernels. Old kernels (without
GENERIC_HARDIRQ support) will print "cannot determine number of IRQs".

Regards,
Sebastian

From aa13aff5450686ac4438d771596e0faa041aa454 Mon Sep 17 00:00:00 2001
From: Sebastian Ott <sebott <at> linux.vnet.ibm.com>
Date: Fri, 14 Nov 2014 13:52:54 +0100
Subject: [PATCH] s390: support irq command via generic_dump_irq

Signed-off-by: Sebastian Ott <sebott <at> linux.vnet.ibm.com>
---
 kernel.c |  6 ------
 s390.c   | 25 ++++++++++++-------------
 s390x.c  | 24 +++++++++++-------------
 3 files changed, 23 insertions(+), 32 deletions(-)

diff --git a/kernel.c b/kernel.c
index 1cb0967..da1e48e 100644
--- a/kernel.c
+++ b/kernel.c
@@ -5575,9 +5575,6 @@ cmd_irq(void)
 			return;

 		case 'u':

Dave Anderson | 13 Nov 22:41 2014

[ANNOUNCE] crash version 7.0.9 is available


Download from: http://people.redhat.com/anderson
                 or
               https://github.com/crash-utility/crash/releases

The master branch serves as a development branch that will contain all 
patches that are queued for the next release:

  $ git clone git://github.com/crash-utility/crash.git

Changelog:

 - Fix the CPU timer and clock comparator output for the "bt -a" command
   on S390X machines.  The output of CPU timer and clock comparator has
   always been incorrect because:
     - We added S390X_WORD_SIZE (8) instead of 4 to get the second word
     - We did not left shift the clock comparator by 8
   The fix reads the complete 64-bit values and shifts the clock
   comparator correctly.
   (holzheu <at> linux.vnet.ibm.com)

 - Add "/lib/modules/<version>/build" to the list of directories that
   are searched for the currently-running kernel on live systems.  This
   will automatically locate the vmlinux namelist for kernels that were
   locally installed with "make modules_install install".
   (lrintel <at> redhat.com)

 - Addressed 3 Coverity Scan issues:
     (1) task.c: initialize the "curr" and "curr_my_q" variables in the
         dump_tasks_in_task_group_cfs_rq() function.

HATAYAMA Daisuke | 6 Nov 11:15 2014

[ANNOUNCE] crash gcore command, version 1.3.1 is released

This is the release of crash gcore command, version 1.3.1.

This release only fixes a build failure on x86 that I overlooked
when releasing version 1.3.0.

ChangeLog:

[bugfixes]

 - Fix a build failure on x86 caused by a static reference to type
   struct user_i387_struct, which is used on x86_64 only. This
   reference was introduced in v1.3.0 by the fix for the segfault
   caused by a buffer overwrite of NT_FPREGSET. The correct type on
   x86 is struct user_i387_ia32_struct, and we use it now.
   (d.hatayama <at> jp.fujitsu.com)

MD5 CheckSum:

$ md5sum ./crash-gcore-command-1.3.1.tar.gz 
b89be347111c0d26f3c0882e7ad09953  ./crash-gcore-command-1.3.1.tar.gz

--
Thanks.
HATAYAMA, Daisuke
Attachment (crash-gcore-command-1.3.1.tar.gz): application/octet-stream, 67 KiB

HATAYAMA Daisuke | 4 Nov 06:51 2014

[ANNOUNCE] crash gcore command, version 1.3.0 is released

This is the release of crash gcore command, version 1.3.0.

This release adds ARM64 and PPC64 support, thanks to the respective
maintainers for developing the patch sets and verifying each rc
release.

The remaining changes are all bugfixes.

# The ChangeLog includes those that appeared at each rc release.

ChangeLog:

[new features]

 - Add ARM64 support. In addition to a native ARM64 build, we can
   build an x86_64 executable of the crash gcore command for ARM64
   crash dumps with make target=ARM64, just like the crash utility.
   (anderson <at> redhat.com)

 - Add ARM64 compat mode support. This allows gcore to create
   corefiles for tasks running in 32-bit compatible mode on ARM64.
   (weishu <at> marvell.com)

 - Add PPC64 support. This includes both big-endian and little-endian
   formats.
   (mtoman <at> redhat.com, anderson <at> redhat.com)

[bugfixes]

 - Correct a read buffer size for NT_FPREGSET as sizeof(struct

qiaonuohan | 29 Oct 09:30 2014

[PATCH] kdump: fix to get the correct page at the edge of split files

Hello Dave,

I found a bug when analyzing split vmcore in kdump-compressed format.
Please check the patch.

-- 
Regards
Qiao Nuohan

Subramanian Karunanithi | 25 Oct 18:36 2014

Regarding crash-gcore-command

Hi,

I am using crash-gcore-command 1.2.0.
I am trying to cross compile this tool for the PPC architecture. However, it looks like gcore_defs.h only supports x86, x86_64 and ARM.

Is there any plan to support PPC in this tool?

Regards,
Subramanian. K

HATAYAMA Daisuke | 24 Oct 12:12 2014

[ANNOUNCE] crash gcore command, version 1.3.0-rc2 is released

This is the release of crash gcore command, version 1.3.0-rc2.

Version 1.3.0 is going to add ARM64 support, including compat mode,
and PPC64 support. The purpose of this series of rc releases is
verification by the other architecture maintainers. Please send me
your verification results as a reply to this mail.

The remaining changes are all bugfixes.

# The changes include those that appeared in v1.3.0-rc.

ChangeLog:

[new features]

 - Add ARM64 support. In addition to a native ARM64 build, we can
   build an x86_64 executable of the crash gcore command for ARM64
   crash dumps with make target=ARM64, just like the crash utility.
   (anderson <at> redhat.com)

 - Add ARM64 compat mode support. This allows gcore to create
   corefiles for tasks running in 32-bit compatible mode on ARM64.
   (weishu <at> marvell.com)

 - Add PPC64 support. This includes both big-endian and little-endian
   formats.
   (mtoman <at> redhat.com, anderson <at> redhat.com)

[bugfixes]

 - Correct a read buffer size for NT_FPREGSET as sizeof(struct
   user_i387_struct). So far we had used sizeof(union thread_xstate)
   falsely as a read buffer size but it had accidentally been equal to
   sizeof(struct user_i387_struct). However, the following patch
   extended union thread_xstate and sizeof(union thread_xstate) became
   larger than sizeof(struct user_i387_struct):

    commit e7d820a5e549b3eb6c3f9467507566565646a669
    Author: Qiaowei Ren <qiaowei.ren <at> intel.com>
    Date:   Thu Dec 5 17:15:34 2013 +0800

        x86, xsave: Support eager-only xsave features, add MPX support

        Some features, like Intel MPX, work only if the kernel uses eagerfpu
        model.  So we should force eagerfpu on unless the user has explicitly
        disabled it.

        Add definitions for Intel MPX and add it to the supported list.

        [ hpa: renamed XSTATE_FLEXIBLE to XSTATE_LAZY and added comments ]

        Signed-off-by: Qiaowei Ren <qiaowei.ren <at> intel.com>
        Link: http://lkml.kernel.org/r/9E0BE1322F2F2246BD820DA9FC397ADE014A6115 <at> SHSMSX102.ccr.corp.intel.com
        Signed-off-by: H. Peter Anvin <hpa <at> linux.intel.com>

   Without this patch, for vmcores whose kernel versions are v3.14 or
   later, gcore results in a segmentation fault due to a buffer
   overwrite of NT_FPREGSET.
   (d.hatayama <at> jp.fujitsu.com)

 - Although ELF_DATA is defined in gcore_defs.h, ELFDATA2LSB is used
   directly at elf{64,32}_fill_elf_header(). There has so far been no
   problem since the existing supported architectures are all
   little-endian systems. Fix this to support PPC64, which can also
   use big-endian format.
   (anderson <at> redhat.com)

 - Fix a bug where the register values in the NT_PRSTATUS note
   information are broken. This has been the case since v1.2.2, when
   O(1) note information collection was added. Without this fix, we
   can never get reliable register values for failure analysis.
   (weishu <at> marvell.com)

 - Fix a bug where NT_386_IOPERM note information is not collected.
   So far, ioperm_get() had always returned 1. As a result,
   NT_386_IOPERM note information had never been included in a
   generated core file even if it was available for a given task on a
   given crash dump.
   (d.hatayama <at> jp.fujitsu.com)

 - Add new member offset initialization for struct
   nsproxy::pid_ns_for_children. In upstream, the following patch
   renamed struct nsproxy::pid_ns into struct
   nsproxy::pid_ns_for_children.

    $ git log -1 c2b1df2e
    commit c2b1df2eb42978073ec27c99cc199d20ae48b849
    Author: Andy Lutomirski <luto <at> amacapital.net>
    Date:   Thu Aug 22 11:39:16 2013 -0700

        Rename nsproxy.pid_ns to nsproxy.pid_ns_for_children

        nsproxy.pid_ns is *not* the task's pid namespace. The name
        should clarify that.

        This makes it more obvious that setns on a pid namespace is weird --
        it won't change the pid namespace shown in procfs.

        Signed-off-by: Andy Lutomirski <luto <at> amacapital.net>
        Reviewed-by: "Eric W. Biederman" <ebiederm <at> xmission.com>
        Signed-off-by: David S. Miller <davem <at> davemloft.net>

   Without this fix, gcore exits abnormally during initialization,
   so a core file is never generated.
   (d.hatayama <at> jp.fujitsu.com)

 - Fix an incorrect check of the return value of fopen(). fopen()
   returns NULL in case of error, but gcore had treated it as
   returning a negative integer. As a result, gcore continued
   execution after the check even in case of error and then exited
   abnormally at the first call of fwrite() with the invalid file
   pointer.

   From the user's viewpoint, this bug appears when trying to
   overwrite an existing corefile that has more privileged
   permissions, which results in an EPERM failure.
   (d.hatayama <at> jp.fujitsu.com)

MD5 CheckSum:

$ md5sum ./crash-gcore-command-1.3.0-rc2.tar.gz
07757d2ee044b19cac6b652de0d757fc  ./crash-gcore-command-1.3.0-rc2.tar.gz

--
Thanks.
HATAYAMA, Daisuke
Attachment (crash-gcore-command-1.3.0-rc2.tar.gz): application/octet-stream, 79 KiB

Petr Tesarik | 24 Oct 11:09 2014

Kernel dump file access library

Hi all,

during this year's SUSE HackWeek, my colleague started work on enabling
kernel core files in gdb. I realized that there would be at least four
different programs implementing read access to kernel dump files:

  1. the crash utility
  2. makedumpfile (when re-filtering)
  3. kdumpid (my project to get kernel version from a dump file)
  4. gdb-kdump (started by my colleague during HackWeek)

At this point, I felt that's too much re-inventing the wheel again and
again, so I took my current code from kdumpid and adapted it as a
library that can be used by everybody:

    https://github.com/ptesarik/libkdumpfile

In its current shape, it's usable, but far from complete.

Things that work already:
   - identify kdump file format
   - parse meta-information from the header
   - open ELF, diskdump, makedumpfile, LKCD
   - read data by physical address (incl. Xen Dom0)
   - read data by Xen machine address

Things still on my TODO list:
   - more formats: sadump, kvmdump, libvirt, xc_core, xc_save
   - determine phys_base in ELF files
   - determine kernel release if not found in headers

Ideally, I would like to replace all current implementations with this
library, so if a new file format appears, or a new feature is added to
one of the files, it can be immediately used by all kdump-related tools.

Please let me know what you think.
Oh, and if you're developing such a tool, let me know which features
should be added.

Regards,
Petr Tesarik

"Zhou, Wenjian/周文剑" | 22 Oct 08:32 2014

add support for incomplete elf dump file

An incomplete dump file produced by an ENOSPC error cannot currently be
analysed by the crash utility. Such a file may still contain important
information, and the panic may not be reproducible, so we first came up with
the idea of modifying the existing data of the incomplete dump file to make it
analysable by the crash utility.
However, we found it more suitable to have crash check for incomplete data
than to modify the data in makedumpfile.
So, when the incomplete flag is set, we change the p_filesz of the PT_LOAD
header, and the zero_fill and phys_end of the PT_LOAD segments, so that crash
can analyse an incomplete ELF dump file.

The issue was discussed at:
	http://lists.infradead.org/pipermail/kexec/2014-October/012669.html

--- a/netdump.c
+++ b/netdump.c
@@ -52,6 +52,7 @@ static char *vmcoreinfo_read_string(const char *);

  #define MIN_PAGE_SIZE (4096)

+#define DUMP_ELF_INCOMPLETE 0x1
  /*
   * Architectures that have configurable page sizes,
   * can differ from the host machine's page size.
@@ -488,6 +489,10 @@ check_dumpfile_size(char *file)
         if (stat64(file, &stat) < 0)
                 return;

+       Elf64_Phdr *load64 = nd->load64;
+       Elf32_Phdr *load32 = nd->load32;
+       unsigned int e_flag = (NULL == nd->elf64) ? (nd->elf32)->e_flags : (nd->elf64)->e_flags;
+       int status = e_flag & DUMP_ELF_INCOMPLETE;
         for (i = 0; i < nd->num_pt_load_segments; i++) {
                 pls = &nd->pt_load_segments[i];

@@ -495,7 +500,19 @@ check_dumpfile_size(char *file)
                         (pls->phys_end - pls->phys_start);

                 if (segment_end > stat.st_size) {
-                       error(WARNING, "%s: may be truncated or incomplete\n"
+                       if (!status){
+                               error(WARNING, "%s: may be truncated\n"
+                                       "         PT_LOAD p_offset: %lld\n"
+                                       "                 p_filesz: %lld\n"
+                                       "           bytes required: %lld\n"
+                                       "            dumpfile size: %lld\n\n",
+                                       file, pls->file_offset,
+                                       pls->phys_end - pls->phys_start,
+                                       segment_end, stat.st_size);
+                               return;
+                       }
+                       else{
+                               error(WARNING, "%s: may be incomplete\n"
                                 "         PT_LOAD p_offset: %lld\n"
                                 "                 p_filesz: %lld\n"
                                 "           bytes required: %lld\n"
@@ -503,8 +520,25 @@ check_dumpfile_size(char *file)
                                 file, pls->file_offset,
                                 pls->phys_end - pls->phys_start,
                                 segment_end, stat.st_size);
-                       return;
+                       }
+                       if (pls->file_offset > stat.st_size){
+                                pls->file_offset = 0;
+                                pls->phys_start = 0;
+                                pls->phys_end = 0;
+                        }
+                        else {
+                               if (NULL == load32)
+                                       load64->p_filesz = stat.st_size - pls->file_offset;
+                               else
+                                       load32->p_filesz = stat.st_size - pls->file_offset;
+                               pls->zero_fill = pls->phys_end;
+                                pls->phys_end = stat.st_size - pls->file_offset + pls->phys_start;
+                       }
                 }
+               if (NULL == load32)
+                       load64++;
+               else
+                       load32++;
         }
  }

