Benjamin Herrenschmidt | 24 Nov 06:08 2014

linuxppc-dev list <linuxppc-dev <at> ozlabs.org>

Hi Linus !

This series fixes a nasty issue with radeon adapters on powerpc servers.
It's all CC'ed to stable and has the relevant maintainers' acks/reviews.

Basically, some (radeon) adapters have issues with MSI addresses above
1T (they only support 40 bits). We had a powerpc-specific quirk, but it
only listed a specific revision of an adapter that we shipped with our
machines, and it didn't properly handle the audio function, which some
distros enable nowadays.

So we made the quirk generic and fixed both the graphics and audio drivers
to use it properly.

Without that, ppc64 server machines will crash at boot with a radeon adapter.
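
To make the 40-bit limit concrete, here is a minimal userspace sketch of
the kind of check such a quirk implies. The helper names are hypothetical
and are not taken from the series; the only fact used from the mail is
that the affected adapters cannot reach MSI addresses at or above 1T (2^40).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if addr is representable in `bits` address bits. */
static inline bool addr_fits(uint64_t addr, unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	if (bits == 64)
		return true;	/* avoid an undefined 64-bit shift */
	return (addr >> bits) == 0;
}

/*
 * Hypothetical policy check: an MSI target address is only usable if the
 * device can actually generate it.
 */
static inline bool can_use_msi_addr(uint64_t msi_addr, unsigned int dev_addr_bits)
{
	return addr_fits(msi_addr, dev_addr_bits);
}
```

With a 40-bit device, the last address below 1T passes and 1T itself
does not, so a platform that places its 64-bit MSI window above 1T has
to fall back to a lower window for these adapters.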

Note: This has been brewing for a while, it just needed a last respin which
got delayed due to us moving ozlabs to a new location in town and other such
things taking priority.

Cheers,
Ben.

The following changes since commit 5d01410fe4d92081f349b013a2e7a95429e4f2c9:

  Linux 3.18-rc6 (2014-11-23 15:25:20 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc.git 

Wei Yang | 22 Nov 03:52 2014

[PATCH] PCI: Refresh offset/stride after NumVFs is written

According to SR-IOV spec sec 3.3.9, 3.3.10, changing the NumVFs setting
affects the offset and stride. The current implementation doesn't refresh
the offset/stride cached in the pci_sriov structure.

This patch introduces a wrapper, pci_iov_set_numvfs(), which refreshes
these two values after NumVFs is written.

Signed-off-by: Wei Yang <weiyang <at> linux.vnet.ibm.com>
---
 drivers/pci/iov.c |   17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 4d109c0..c7010c5 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -31,6 +31,15 @@ static inline u8 virtfn_devfn(struct pci_dev *dev, int id)
 		dev->sriov->stride * id) & 0xff;
 }

+static inline void pci_iov_set_numvfs(struct pci_dev *dev, int nr_virtfn)
+{
+	struct pci_sriov *iov = dev->sriov;
+
+	pci_write_config_word(dev, iov->pos + PCI_SRIOV_NUM_VF, nr_virtfn);
+	pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_OFFSET, &iov->offset);
+	pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_STRIDE, &iov->stride);
+}
+
 static struct pci_bus *virtfn_add_bus(struct pci_bus *bus, int busnr)
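
Why the cached values matter can be shown with a small userspace model.
This is only a sketch: struct sriov_cache and the vf_*() helpers are
hypothetical stand-ins for struct pci_sriov and the virtfn_bus() /
virtfn_devfn() helpers, and the offset/stride numbers are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two values cached in struct pci_sriov. */
struct sriov_cache {
	uint16_t offset;	/* First VF Offset */
	uint16_t stride;	/* VF Stride */
};

/* Routing ID (bus << 8 | devfn) of VF number `id` behind the given PF. */
static inline uint16_t vf_rid(uint8_t pf_bus, uint8_t pf_devfn,
			      const struct sriov_cache *c, unsigned int id)
{
	unsigned int rid = ((unsigned int)pf_bus << 8) + pf_devfn +
			   c->offset + c->stride * id;

	return (uint16_t)rid;
}

/* Bus number part of the VF routing ID. */
static inline uint8_t vf_bus(uint8_t pf_bus, uint8_t pf_devfn,
			     const struct sriov_cache *c, unsigned int id)
{
	return (uint8_t)(vf_rid(pf_bus, pf_devfn, c, id) >> 8);
}

/* Device/function part of the VF routing ID. */
static inline uint8_t vf_devfn(uint8_t pf_bus, uint8_t pf_devfn,
			       const struct sriov_cache *c, unsigned int id)
{
	return (uint8_t)(vf_rid(pf_bus, pf_devfn, c, id) & 0xff);
}
```

If the device reports a different offset or stride once NumVFs is
written, every VF address computed from the stale cache is wrong, which
is why the wrapper re-reads both registers right after the write.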

Konrad Rzeszutek Wilk | 21 Nov 23:17 2014

[PATCH v4] Fixes for PCI backend for 3.19


Hey,

The last time I posted these patches: https://lkml.org/lkml/2014/7/8/533
was so long ago that I don't even remember most comments. The only
one that stuck in mind was David's recommendation to add a new PCI API
call and use that. See patch #6 and #7.

The original posting (v3) had an extra patch that would do slot and
bus reset using do_flr SysFS attribute. I will revisit that once I am
done with this patchset.

I also seem to have had a bunch of 'SoB' lines from David - which
makes no sense - unless I pulled from his tree. Anyhow, wherever
I saw them I removed them.

Please take a look and comment. If I missed some comment from months
ago hopefully the new version has clarified them.

 drivers/pci/pci.c                     |  5 +--
 drivers/xen/xen-pciback/passthrough.c | 14 ++++++--
 drivers/xen/xen-pciback/pci_stub.c    | 60 ++++++++++++++++++++++-------------
 drivers/xen/xen-pciback/pciback.h     |  7 ++--
 drivers/xen/xen-pciback/vpci.c        | 14 ++++++--
 drivers/xen/xen-pciback/xenbus.c      |  4 +--
 include/linux/device.h                |  5 +++
 include/linux/pci.h                   |  2 ++
 8 files changed, 76 insertions(+), 35 deletions(-)

Konrad Rzeszutek Wilk (7):

Alex Williamson | 21 Nov 23:08 2014

[PATCH 0/3] PCI/x86: Interface for testing multivector MSI support

I'd like to make vfio-pci capable of manipulating the device exposed
to the user such that if the host can only support a single MSI
vector then we hide the fact that the device itself may actually be
able to support more.  When we virtualize PCI config space and
interrupt setup there's no PCI protocol for the device failing to
allocate the number of vectors that it said were available.  If the
userspace driver is a guest operating system, it certainly doesn't
expect this to fail.  I don't think we can ever guarantee that a
multi-vector request will succeed, but we can certainly guarantee
that it will fail if the platform doesn't support it.

An example device is the Atheros AR93xxx running in a Windows 7 VM.
Both the device and the guest OS support multiple MSI vectors.  With
interrupt remapping, such that the host supports multivector, the
device works well in the guest.  With interrupt remapping disabled,
the device is far less reliable because of the mismatch in MSI
programming vs driver configuration and often fails.  If vfio-pci
can test whether multiple vectors are supported, then we can make it
work reliably in both cases by adjusting the exposed MSI capability,
like in this patch that would follow this series:

https://github.com/awilliam/linux-vfio/commit/9ace67515680

With this series, only x86 w/ interrupt remapping will advertise
support for multiple MSI vectors.  In surveying the code, I couldn't
find any other archs that allowed it, but I'll take corrections if
that's untrue.  Thanks,

Alex
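
As a rough illustration of "adjusting the exposed MSI capability", the
sketch below clears the Multiple Message Capable field (bits 3:1 of the
MSI Message Control register) when the host cannot do multivector MSI.
It is not the code from the commit linked above; the function names are
hypothetical, and only the bit layout of the register is taken as given.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Multiple Message Capable field of MSI Message Control (bits 3:1). */
#define MSI_FLAGS_QMASK 0x000e

/* Number of vectors the capability advertises: 2^MMC. */
static inline unsigned int msi_flags_max_vectors(uint16_t flags)
{
	return 1u << ((flags & MSI_FLAGS_QMASK) >> 1);
}

/*
 * Hypothetical virtualization step: if the host cannot support
 * multivector MSI, advertise a single vector to the user/guest.
 */
static inline uint16_t msi_flags_for_user(uint16_t flags, bool host_multivector)
{
	if (!host_multivector)
		flags &= (uint16_t)~MSI_FLAGS_QMASK;
	return flags;
}
```

A guest driver that sees only one vector advertised never asks for
more, so the allocation cannot fail in a way the PCI protocol has no
means to report.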


Dean A. | 21 Nov 20:51 2014

Hardware error from APEI question?

We're receiving hardware kernel panics on kernel 3.13.  Has anyone seen 
similar hardware errors on the Intel Ivytown/C600 architecture from 
their PCI/PCIe cards?  These errors do not occur with the same card on 
other PCI chipsets (eg. 82G33/P35/P31, ICH9 family, 82801 PCI bridge).

{1} {Hardware Error}: Hardware error from APEI Generic Hardware Error 
Source: 32993
{1} {Hardware Error}: event severity: fatal
{1} {Hardware Error}:  Error 0, type: fatal
{1} {Hardware Error}:   section_type: PCIe error
{1} {Hardware Error}:   port_type: 1, legacy PCI end point
{1} {Hardware Error}:   version: 1.0
{1} {Hardware Error}:   command: 0x0007, status: 0x0018
{1} {Hardware Error}:   device_id: 0000:06:00.0
{1} {Hardware Error}:   slot: 7
{1} {Hardware Error}:   secondary_bus: 0x00
{1} {Hardware Error}:   vendor_id: 0x1797, device_id: 0x6869
{1} {Hardware Error}:   class_code: 000000
Kernel panic - not syncing: Fatal hardware error!
Shutting down cpus with NMI
Rebooting in 30 seconds
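
For readers decoding the "command: 0x0007, status: 0x0018" line, here is
a small sketch that names the bits using the standard PCI command and
status register layout. The macro and function names are made up for
this example, and the decode says nothing about the root cause; it only
shows that the legacy status register carries no error bits in this
report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PCI command register bits. */
#define CMD_IO		0x0001	/* I/O space enable */
#define CMD_MEMORY	0x0002	/* memory space enable */
#define CMD_MASTER	0x0004	/* bus master enable */

/* PCI status register bits. */
#define STS_INTERRUPT	0x0008	/* interrupt status */
#define STS_CAP_LIST	0x0010	/* capability list present */
#define STS_PARITY	0x0100	/* master data parity error */
#define STS_SIG_TABORT	0x0800	/* signaled target abort */
#define STS_REC_TABORT	0x1000	/* received target abort */
#define STS_REC_MABORT	0x2000	/* received master abort */
#define STS_SIG_SERR	0x4000	/* signaled system error */
#define STS_DET_PARITY	0x8000	/* detected parity error */

#define STS_ERROR_BITS	(STS_PARITY | STS_SIG_TABORT | STS_REC_TABORT | \
			 STS_REC_MABORT | STS_SIG_SERR | STS_DET_PARITY)

/* True if any of the legacy error bits is set in the status register. */
static inline bool status_reports_error(uint16_t status)
{
	return (status & STS_ERROR_BITS) != 0;
}
```

So 0x0007 is I/O + memory + bus master enabled, and 0x0018 is interrupt
status plus capability list.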

Yinghai Lu | 21 Nov 20:37 2014

[PATCH] x86, PCI: support mmio more than 44 bits on 32bit/PAE mode

Aaron reported that 32bit/PAE mode has a problem with 64bit resources.

[    6.610012] pci 0000:03:00.0: reg 0x10: [mem 0x383fffc00000-0x383fffdfffff 64bit pref]
[    6.622195] pci 0000:03:00.0: reg 0x20: [mem 0x383fffe04000-0x383fffe07fff 64bit pref]
[    6.656112] pci 0000:03:00.1: reg 0x10: [mem 0x383fffa00000-0x383fffbfffff 64bit pref]
[    6.668293] pci 0000:03:00.1: reg 0x20: [mem 0x383fffe00000-0x383fffe03fff 64bit pref]
...
[   12.374143] calling  ixgbe_init_module+0x0/0x51 @ 1
[   12.378130] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 3.19.1-k
[   12.385318] ixgbe: Copyright (c) 1999-2014 Intel Corporation.
[   12.390578] ixgbe 0000:03:00.0: Adapter removed
[   12.394247] ixgbe: probe of 0000:03:00.0 failed with error -5
[   12.399369] ixgbe 0000:03:00.1: Adapter removed
[   12.403036] ixgbe: probe of 0000:03:00.1 failed with error -5
[   12.408017] initcall ixgbe_init_module+0x0/0x51 returned 0 after 29200 usecs

Root cause: ioremap cannot handle an mmio range above 44 bits in 32bit
PAE mode.

We pass pfns as unsigned long, as in pfn_pte(), so an address such as
0x383fffc00000 overflows when converted to a pfn (unsigned long is 32 bits
in a 32bit x86 kernel, so a pfn can only cover 44 bits of physical address).

| static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
| {
|        return __pte(((phys_addr_t)page_nr << PAGE_SHIFT) |
|                     massage_pgprot(pgprot));
| }

We could limit iomem to 44 bits so that larger ranges are rejected early,
at the root bus.
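
The overflow described above can be reproduced in userspace by modelling
the 32-bit unsigned long with uint32_t. This is only a sketch of the
arithmetic: the helper names are hypothetical, and PAGE_SHIFT is assumed
to be 12 as on x86.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* pfn as it would be carried in a 32-bit unsigned long (high bits lost). */
static inline uint32_t phys_to_pfn32(uint64_t phys)
{
	return (uint32_t)(phys >> PAGE_SHIFT);
}

/* Physical address rebuilt from a 32-bit pfn, as pfn_pte() would do. */
static inline uint64_t pfn32_to_phys(uint32_t pfn)
{
	return (uint64_t)pfn << PAGE_SHIFT;
}

/* True if the address survives the round trip, i.e. fits in 44 bits. */
static inline bool phys_fits_pfn32(uint64_t phys)
{
	return (phys >> PAGE_SHIFT) <= UINT32_MAX;
}
```

For the BAR at 0x383fffc00000 the pfn is 0x383fffc00, which needs 34
bits; truncated to 32 bits it becomes 0x83fffc00, so the mapping ends up
pointing at 0x83fffc00000 instead of the device.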

Lorenzo Pieralisi | 21 Nov 12:29 2014

[PATCH v3 0/2] arm: pcibios: remove pci_sys_data domain

This patchset is a v3 of a previous posting:

http://www.spinics.net/lists/linux-pci/msg36502.html

v2 => v3

- Rebased on top of this patch dependency
  http://www.spinics.net/lists/linux-pci/msg36631.html

v1 => v2 changelog

- Removed stale hw_pci domain member
- Reworked pci-mvebu domain handling according to review
- Rebased against 3.18-rc3 and updated the logs
- Dropped RFC status

Original cover letter:
----
This patchset is a first RFC stab at removing the dependency on the
pci_sys_data domain field on ARM platforms and replacing it with generic
code that stashes the domain value in the pci_bus control structure,
introduced in

commit 41e5c0f81d3e676d671d96a0a1fafb27abfbd9
("of/pci: Add pci_get_new_domain_nr() and of_get_pci_domain_nr()")

commit 670ba0c8883b576d0aec28bd7a838358a4be1
("PCI: Add generic domain handling")

All the drivers converted (apart from PCIe designware, tested on iMX6SL)
were only compile tested for lack of HW, so along some comments, testing

Jiang Liu | 21 Nov 04:23 2014

[Patch] irqdomain: Enhance irq_domain_free_irqs_common() to support parentless irqdomain

Originally irq_domain_free_irqs_common() was designed to be used by
irqdomains with a parent. But there is a desire to reuse it for parentless
irqdomains to reduce code.
So check domain->parent before invoking irq_domain_free_irqs_parent().

Signed-off-by: Jiang Liu <jiang.liu <at> linux.intel.com>
---
Hi Thomas,
	This patch applies to tip/irq/irqdomain and helps to reduce code
size on ARM. It seems we still have a chance to merge it into
tip/irq/irqdomain :)
Regards!
Gerry
---
 kernel/irq/irqdomain.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 029acf11efed..0449d2869e17 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -975,7 +975,8 @@ void irq_domain_free_irqs_common(struct irq_domain *domain, unsigned int virq,
 		if (irq_data)
 			irq_domain_reset_irq_data(irq_data);
 	}
-	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+	if (domain->parent)
+		irq_domain_free_irqs_parent(domain, virq, nr_irqs);
 }

 /**
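
The effect of the added domain->parent check can be modelled in a few
lines of userspace C. This is a toy sketch, not kernel code: struct
toy_domain and toy_free_irqs_common() are hypothetical and only mimic
the "free at this level, then go to the parent if there is one" shape of
the change.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct irq_domain in a hierarchy. */
struct toy_domain {
	struct toy_domain *parent;	/* NULL for a parentless (root) domain */
	int freed;			/* how often this level was freed */
};

/*
 * Free at this level, then walk up only if a parent exists.
 * Returns the number of hierarchy levels that were visited.
 */
static inline int toy_free_irqs_common(struct toy_domain *d)
{
	int levels = 1;

	d->freed++;
	if (d->parent)	/* without this check the root would follow NULL */
		levels += toy_free_irqs_common(d->parent);
	return levels;
}
```

Calling the helper on a root domain simply frees one level, which is
what lets parentless irqdomains share the common free path.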

suravee.suthikulpanit | 21 Nov 02:02 2014

[PATCH] irqdomain: Fix NULL pointer dereference in irq_domain_free_irqs_parent

From: Suravee Suthikulpanit <Suravee.Suthikulpanit <at> amd.com>

This patch checks if the parent domain is NULL before recursively freeing
irqs in the parent domains.

In this case, GICv2m is freeing irqs in parent (GIC), which calls
irq_domain_free_irqs_top. This fixes the crash below:

Unable to handle kernel NULL pointer dereference at virtual address 00000018
pgd = fffffe03c78c0000
[00000018] *pgd=00000083c8700003, *pud=00000083c8700003, *pmd=00000083c8700003, *pte=0000000000000000
Internal error: Oops: 96000007 [#1] SMP
Modules linked in: mlx4_core(-) rtc_efi efivarfs [last unloaded: mlx4_en]
CPU: 5 PID: 985 Comm: modprobe Not tainted 3.18.0-rc4-marc-v2m+ #223
task: fffffe03c20c0000 ti: fffffe03c1fb8000 task.ti: fffffe03c1fb8000
PC is at irq_domain_free_irqs_recursive+0x10/0x84
LR is at irq_domain_free_irqs_common+0x8c/0xa0
pc : [<fffffe00000efb2c>] lr : [<fffffe00000f028c>] pstate: 60000145
sp : fffffe03c1fbb9a0
x29: fffffe03c1fbb9a0 x28: fffffe03c1fb8000
x27: fffffe000092f000 x26: fffffe03c10eba00
...
Call trace:
[<fffffe00000efb2c>] irq_domain_free_irqs_recursive+0x10/0x84
[<fffffe00000f0288>] irq_domain_free_irqs_common+0x88/0xa0
[<fffffe00000f030c>] irq_domain_free_irqs_top+0x6c/0x84
[<fffffe00000efb40>] irq_domain_free_irqs_recursive+0x24/0x84
[<fffffe00000f0954>] irq_domain_free_irqs_parent+0x14/0x20
[<fffffe000042c4fc>] gicv2m_irq_domain_free+0x48/0x88
[<fffffe00000efb40>] irq_domain_free_irqs_recursive+0x24/0x84

Rajat Jain | 20 Nov 23:33 2014

[PATCH] PCI: pciehp: Check link state before accessing device during removal

While removing a card, we can't assume that presence means access to the
card is OK. That is because the cause of removal may be a link down
event, and the card may still be physically present. Thus, instead of
presence, use the link state to decide whether or not it is OK to access
the card's devices.

Here are the problem symptoms:
During the removal of a card due to link down, sometimes the following
error is seen (because pciehp_unconfigure_device() reads 0xFF from the
bridge control register as the link is down, which causes it to assume
that the VGA bit is set):

pciehp 0000:21:05.0:pcie24: pcie_isr: intr_loc 100
pciehp 0000:21:05.0:pcie24: Data Link Layer State change
pciehp 0000:21:05.0:pcie24: slot(5): Link Down event
pciehp 0000:21:05.0:pcie24: Disabling domain:bus:device=0000:60:00
pciehp 0000:21:05.0:pcie24: pciehp_unconfigure_device: domain:bus:dev = 0000:60:00
pciehp 0000:21:05.0:pcie24: Cannot remove display device 0000:60:00.0

Of course, when the link comes back up, the device addition fails too:

pciehp 0000:21:05.0:pcie24: pcie_isr: intr_loc 100
pciehp 0000:21:05.0:pcie24: Data Link Layer State change
pciehp 0000:21:05.0:pcie24: pciehp_check_link_active: lnk_status = 6011
pciehp 0000:21:05.0:pcie24: slot(5): Link Up event
pciehp 0000:21:05.0:pcie24: Enabling domain:bus:device=0000:60:00
pciehp 0000:21:05.0:pcie24: pciehp_check_link_active: lnk_status = 6011
pciehp 0000:21:05.0:pcie24: pciehp_check_link_status: lnk_status = 6011
pciehp 0000:21:05.0:pcie24: Device 0000:60:00.0 already exists at 0000:60:00, cannot hot-add
pciehp 0000:21:05.0:pcie24: Cannot add device at 0000:60:00
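
The failure mode in the description (an all-ones read being taken for a
set VGA bit) can be sketched as below. The helper names are hypothetical
and this is not the patch itself; the only values assumed are the 0xFF
read mentioned above and the VGA enable bit (0x08) of the bridge control
register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BRIDGE_CTL_VGA 0x08	/* VGA enable bit in bridge control */

/* Naive test: trusts whatever the config read returned. */
static inline bool bridge_ctl_claims_vga(uint8_t bctl)
{
	return (bctl & BRIDGE_CTL_VGA) != 0;
}

/*
 * Hypothetical decision helper: only believe the register when the link
 * is up, since reads over a downed link come back as all ones.
 */
static inline bool refuse_removal_as_display(bool link_active, uint8_t bctl)
{
	if (!link_active)
		return false;	/* value is not trustworthy, do not block */
	return bridge_ctl_claims_vga(bctl);
}
```

With the link down, the naive test sees 0xFF and reports a display
device, which matches the "Cannot remove display device" message in the
first log; gating on link state avoids touching the device at all.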

Bjorn Helgaas | 20 Nov 22:10 2014

Re: [GIT PULL] PCI fixes for v3.18

[+cc linux-pci, linux-kernel, Yinghai, Lucas, Duc]

On Thu, Nov 20, 2014 at 12:44 PM, Bjorn Helgaas <bhelgaas <at> google.com> wrote:
> Hi Linus,
>
> These are fixes for an issue with 64-bit PCI bus addresses on 32-bit PAE
> kernels, an APM X-Gene problem (it depended on a generic change we removed
> before merging), a fix for my hotplug device configuration changes, and a
> devicetree documentation update.
>
> Bjorn
>
>
> The following changes since commit 32f638fc11db0526c706454d9ab4339d55ac89f3:
>
>   PCI: Don't oops on virtual buses in acpi_pci_get_bridge_handle() (2014-11-05 13:06:16 -0700)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git tags/pci-v3.18-fixes-3
>
> for you to fetch changes up to 7fc986d8a9727e5d40da3c2c1c343da6142e82a9:
>
>   PCI: Support 64-bit bridge windows if we have 64-bit dma_addr_t (2014-11-19 14:30:32 -0700)
>
> ----------------------------------------------------------------
> PCI updates for v3.18:
>
>   Resource management
>     - Support 64-bit bridge windows if we have 64-bit dma_addr_t (Yinghai Lu)