Eugene Surovegin | 1 Nov 2011 06:36

[PATCH] kexec: change phys_to_virt() to always use 'unsigned long long'.

This fixes physical address truncation when a 32-bit binary prepares ELF
headers for a 64-bit crash kernel, e.g. 32-bit kexec-tools running on
x86_64.

Signed-off-by: Eugene Surovegin <ebs@...>
---
 kexec/arch/arm/phys_to_virt.c |    4 ++--
 kexec/crashdump.h             |    4 ++--
 kexec/phys_to_virt.c          |    4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/kexec/arch/arm/phys_to_virt.c b/kexec/arch/arm/phys_to_virt.c
index bcced52..ed96cf3 100644
--- a/kexec/arch/arm/phys_to_virt.c
+++ b/kexec/arch/arm/phys_to_virt.c
@@ -13,8 +13,8 @@
  * See also:
  * http://lists.arm.linux.org.uk/lurker/message/20010723.185051.94ce743c.en.html
  */
-unsigned long
-phys_to_virt(struct crash_elf_info *elf_info, unsigned long paddr)
+unsigned long long
+phys_to_virt(struct crash_elf_info *elf_info, unsigned long long paddr)
 {
 	return paddr + elf_info->page_offset - phys_offset;
 }
diff --git a/kexec/crashdump.h b/kexec/crashdump.h
index 0f7c2ea..dab5dc1 100644
--- a/kexec/crashdump.h
+++ b/kexec/crashdump.h

Dave Young | 1 Nov 2011 10:19

[PATCH retry] intel-iommu:make identity_map default for crash dump

The kdump kernel sometimes gets DMAR faults, which are
caused by random in-flight DMA from the first kernel.

Make identity mapping the default in this case.

Signed-off-by: Dave Young <dyoung@...>
---
 drivers/iommu/intel-iommu.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

--- linux-2.6.orig/drivers/iommu/intel-iommu.c	2011-11-01 13:06:18.667505962 +0800
+++ linux-2.6/drivers/iommu/intel-iommu.c	2011-11-01 17:07:50.789137864 +0800
@@ -40,6 +40,7 @@
 #include <linux/tboot.h>
 #include <linux/dmi.h>
 #include <linux/pci-ats.h>
+#include <linux/crash_dump.h>
 #include <asm/cacheflush.h>
 #include <asm/iommu.h>

@@ -2488,7 +2489,7 @@ static int __init init_dmars(void)
 		}
 	}

-	if (iommu_pass_through)
+	if (iommu_pass_through || is_kdump_kernel())
 		iommu_identity_mapping |= IDENTMAP_ALL;

 #ifdef CONFIG_INTEL_IOMMU_BROKEN_GFX_WA

Dave Young | 1 Nov 2011 10:52

Re: [PATCH retry] intel-iommu:make identity_map default for crash dump

On 11/01/2011 05:34 PM, David Woodhouse wrote:

> On Tue, 2011-11-01 at 17:19 +0800, Dave Young wrote:
>> kdump kernel sometimes will get DMAR faults which
>> is caused by random in-flight dma from 1st kernel
>>
>> Here make the identity_mapping as default for this case
> 
> So you want to *allow* the random in-flight DMA? And with an identity
> mapping it's really going to random addresses, since it'll be
> untranslated and won't even go to the physical addresses which it was
> originally intended for?
> 
> This seems entirely broken to me.
> 
> If there is "random in-flight dma" from the first kernel, the correct
> thing to do is *block* it. Which is what we do.
> 

This patch works for me in several cases. For kdump it can be seen as a
workaround. But yes, blocking the first kernel's DMA is ideal.

Glad to hear that you are working on this.

-- 
Thanks
Dave

Michael Holzheu | 2 Nov 2011 11:03

Re: [PATCH v2] kdump: Fix crash_kexec - smp_send_stop race in panic

On Tue, 2011-11-01 at 16:04 -0400, Don Zickus wrote:
> On Mon, Oct 31, 2011 at 01:34:19PM +0100, Michael Holzheu wrote:
> > Hello Andrew, hello linux-arch,
> > 
> > > Well OK.  Maybe some architectures do have this problem - who would
> > > notice?  If that is the case, we just made the failure cases much more
> > > common.  Could you check, please?
> > 
> > @linux-arch:
> > 
> > This patch introduces a spinlock to prevent parallel execution of the
> > panic code. Andrew pointed out that this might be a problem for
> > architectures that can't do smp_send_stop() on remote CPUs that have
> > interrupts disabled. When irq-disabled CPUs execute panic() in parallel,
> > we then would have looping CPUs.
> 
> x86 has such problem and I posted a patch recently to fix it
> 
> https://lkml.org/lkml/2011/10/13/426

Ok good, so with this patch x86 has no problem with the panic spinlock.
Anybody else?

Instead of introducing the panic lock, as an alternative we could move
smp_send_stop() to the beginning of panic(). Eric told me that the
function is currently "insufficiently reliable" for that, but perhaps we
could make it more reliable.

Michael

Dave Young | 4 Nov 2011 08:05

ppc64 kexec -p failed

Hi,

When using crashkernel=128M@256M on a ppc64 machine, kexec -p vmlinuz
failed with:
Could not find a free area of memory of faa448 bytes...
Could not find a free area of memory of 142721d bytes...

Is this a known issue, or is there a limitation on the crashkernel base address?

Detailed info below (built with DEBUG defined):

0000000008000000-000000000c000000 : 0
000000000c000000-0000000010000000 : 0
0000000010000000-0000000014000000 : 0
0000000014000000-0000000018000000 : 0
0000000018000000-000000001c000000 : 0
000000001c000000-0000000020000000 : 0
0000000020000000-0000000024000000 : 0
0000000024000000-0000000028000000 : 0
0000000028000000-000000002c000000 : 0
000000002c000000-0000000030000000 : 0
0000000030000000-0000000034000000 : 0
0000000034000000-0000000038000000 : 0
0000000038000000-000000003c000000 : 0
000000003c000000-0000000040000000 : 0
0000000040000000-0000000044000000 : 0
0000000044000000-0000000048000000 : 0
0000000048000000-000000004c000000 : 0
000000004c000000-0000000050000000 : 0
0000000050000000-0000000054000000 : 0

Dave Young | 8 Nov 2011 03:22

Re: ppc64 kexec -p failed

On 11/04/2011 03:05 PM, Dave Young wrote:

> Hi,
> 
> When use crashkernel=128M@256M at a ppc64 machine, kexec -p vmlinuz
> failed with:
> Could not find a free area of memory of faa448 bytes...
> Could not find a free area of memory of 142721d bytes...
> 
> Is this a know issue or Is there limitation of the crashkernel base addr?

Vivek, do you know something about this?

>

> Detail info as below (define DEBUG when building):
> 
> 0000000008000000-000000000c000000 : 0
> 000000000c000000-0000000010000000 : 0
> 0000000010000000-0000000014000000 : 0
> 0000000014000000-0000000018000000 : 0
> 0000000018000000-000000001c000000 : 0
> 000000001c000000-0000000020000000 : 0
> 0000000020000000-0000000024000000 : 0
> 0000000024000000-0000000028000000 : 0
> 0000000028000000-000000002c000000 : 0
> 000000002c000000-0000000030000000 : 0
> 0000000030000000-0000000034000000 : 0
> 0000000034000000-0000000038000000 : 0
> 0000000038000000-000000003c000000 : 0

Dave Young | 8 Nov 2011 06:26

[PATCH] remove unnecessary check code for hole_align

The hole_align == 0 check is not necessary: zero is smaller than the page
size, so the check that follows already raises it to the page size.
Remove it.

Signed-off-by: Dave Young <dyoung@...>
---
 kexec/kexec.c |    4 ----
 1 file changed, 4 deletions(-)

Index: kexec-tools/kexec/kexec.c
===================================================================
--- kexec-tools.orig/kexec/kexec.c
+++ kexec-tools/kexec/kexec.c
@@ -203,10 +203,6 @@ unsigned long locate_hole(struct kexec_i
 	/* Set an intial invalid value for the hole base */
 	hole_base = ULONG_MAX;

-	/* Ensure I have a sane alignment value */
-	if (hole_align == 0) {
-		hole_align = 1;
-	}
 	/* Align everything to at least a page size boundary */
 	if (hole_align < (unsigned long)getpagesize()) {
 		hole_align = getpagesize();

Re: ppc64 kexec -p failed

On 11/04/2011 12:35 PM, Dave Young wrote:
> Hi,
> 
> When use crashkernel=128M@256M at a ppc64 machine, kexec -p vmlinuz
> failed with:
> Could not find a free area of memory of faa448 bytes...
> Could not find a free area of memory of 142721d bytes...
> 
> Is this a know issue or Is there limitation of the crashkernel base addr?

On Power, the crashkernel base address must fall inside the RMO region.
This is because the ppc64 kernel needs some of its memory in the RMO
region. The memory ranges below show that the system has an RMO region of
128M, hence a crashkernel base address of @64M should work just fine.

> 
> Detail info as below (define DEBUG when building):
> 
...
...
> 0000000000000000-0000000008000000 : 0
...
...

Thanks,
-Mahesh.
