Kees Cook | 26 Nov 00:31 2015

[PATCH v2 0/4] introduce post-init read-only memory

One of the easiest ways to protect the kernel from attack is to reduce
the internal attack surface exposed when a "write" flaw is available. By
making as much of the kernel read-only as possible, we reduce the
attack surface.

Many things are written to only during __init, and never changed
again. These cannot be made "const" since the compiler will do the wrong
thing (we do actually need to write to them). Instead, move these items
into a memory region that will be made read-only during mark_rodata_ro()
which happens after all kernel __init code has finished.

This introduces __ro_after_init as a way to mark such memory, and uses
it on the x86 vDSO to kill an extant kernel exploitation method. Also
adds a new kernel parameter to help debug future use and adds an lkdtm
test to check the results.
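
The write-during-init, read-only-afterwards idea can be sketched in userspace. This is only an analogy, not the kernel mechanism: it assumes a POSIX system with mmap()/mprotect(), and the names boot_config, boot_config_init(), boot_config_seal() and boot_config_get() are hypothetical. The seal step plays the role that mark_rodata_ro() plays for __ro_after_init data.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical data that is written only during start-up. */
struct boot_config {
	int max_workers;
	char name[32];
};

static struct boot_config *cfg;	/* lives in its own page(s) */
static size_t cfg_len;

/* "init" phase: map a private, zero-filled region and fill it in. */
int boot_config_init(int max_workers, const char *name)
{
	long page = sysconf(_SC_PAGESIZE);

	if (page <= 0)
		return -1;
	/* Round the size up to whole pages so mprotect() covers it exactly. */
	cfg_len = (sizeof(*cfg) + (size_t)page - 1) / (size_t)page * (size_t)page;
	cfg = mmap(NULL, cfg_len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (cfg == MAP_FAILED) {
		cfg = NULL;
		return -1;
	}
	cfg->max_workers = max_workers;
	/* The mapping is zero-filled, so the copy stays NUL-terminated. */
	strncpy(cfg->name, name, sizeof(cfg->name) - 1);
	return 0;
}

/* End of "init": drop write permission on the region. */
int boot_config_seal(void)
{
	return mprotect(cfg, cfg_len, PROT_READ);
}

/* After sealing, callers only ever get a read-only view. */
const struct boot_config *boot_config_get(void)
{
	return cfg;
}
```

After boot_config_seal() succeeds, a stray write to the region faults instead of silently changing the data, which is the property the series is after for kernel memory.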


Dan Williams | 24 Nov 01:05 2015

[PATCH v2 0/2] restrict /dev/mem to idle io memory ranges

Changes since v1 [1]:

1/ Introduce ARCH_HAS_DEVMEM_IS_ALLOWED to flag archs where
   CONFIG_STRICT_DEVMEM will compile (Ingo)

2/ Drop "default y" for s390 (Heiko)

3/ Fix iomem_is_exclusive() return value in the



Dan Williams (2):
      arch: consolidate CONFIG_STRICT_DEVMEM in lib/Kconfig.debug
      restrict /dev/mem to idle io memory ranges

 arch/arm/Kconfig             |    1 +
 arch/arm/Kconfig.debug       |   14 --------------
 arch/arm64/Kconfig           |    1 +
 arch/arm64/Kconfig.debug     |   14 --------------
 arch/frv/Kconfig             |    1 +
 arch/m32r/Kconfig            |    1 +
 arch/powerpc/Kconfig         |    1 +
 arch/powerpc/Kconfig.debug   |   12 ------------
 arch/s390/Kconfig            |    1 +
 arch/s390/Kconfig.debug      |   12 ------------
 arch/tile/Kconfig            |    4 +---
 arch/unicore32/Kconfig       |    1 +

Arnd Bergmann | 23 Nov 17:25 2015

[RFC] asm-generic: default BUG_ON(x) to "if(x) BUG()"

When CONFIG_BUG is disabled, BUG_ON() will only evaluate the condition,
but will not actually stop the current thread. GCC warns about a couple
of BUG_ON() users where this actually leads to further undefined behavior:

include/linux/ceph/osdmap.h: In function 'ceph_can_shift_osds':
include/linux/ceph/osdmap.h:54:1: warning: control reaches end of non-void function
fs/ext4/inode.c: In function 'ext4_map_blocks':
fs/ext4/inode.c:548:5: warning: 'retval' may be used uninitialized in this function
drivers/mfd/db8500-prcmu.c: In function 'prcmu_config_clkout':
drivers/mfd/db8500-prcmu.c:762:10: warning: 'div_mask' may be used uninitialized in this function
drivers/mfd/db8500-prcmu.c:769:13: warning: 'mask' may be used uninitialized in this function
drivers/mfd/db8500-prcmu.c:757:7: warning: 'bits' may be used uninitialized in this function
drivers/tty/serial/8250/8250_core.c: In function 'univ8250_release_irq':
drivers/tty/serial/8250/8250_core.c:252:18: warning: 'i' may be used uninitialized in this function
drivers/tty/serial/8250/8250_core.c:235:19: note: 'i' was declared here

There is an obvious conflict of interest here: on the one hand, someone
who disables CONFIG_BUG will want the kernel to be as small as possible
and doesn't care about printing error messages to a console that nobody
looks at. On the other hand, running into a BUG_ON() condition means that
something has gone wrong, and we probably want to also stop doing things
that might cause data corruption.

This patch picks the second choice, and changes the NOP to BUG(), which
normally stops the execution of the current thread in some form (endless
loop or a trap). This follows the logic we applied in a4b5d580e078 ("bug:
Make BUG() always stop the machine").
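
The difference between the two behaviours can be sketched in plain C. This is a userspace illustration, not the asm-generic code: BUG() is stood in for by abort(), BUG_ON_NOP() is a hypothetical name for the old CONFIG_BUG=n fallback, and the pool types and can_shift() are made up to mirror the shape of the ceph warning quoted above.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the kernel's BUG(): report and never return. */
#define BUG() do { \
	fprintf(stderr, "BUG at %s:%d\n", __FILE__, __LINE__); \
	abort(); \
} while (0)

/* Old fallback: evaluate the condition, then carry on regardless. */
#define BUG_ON_NOP(cond) do { if (cond) { } } while (0)

/* Proposed fallback: stop when the condition is true. */
#define BUG_ON(cond) do { if (cond) BUG(); } while (0)

/* Hypothetical pool types, for illustration only. */
enum pool_type { POOL_REP = 1, POOL_EC = 2 };

/*
 * With BUG_ON_NOP() in the default case, control would fall off the end
 * of this non-void function, which is what the compiler warns about.
 * With BUG_ON() the default case never returns.
 */
int can_shift(enum pool_type type)
{
	switch (type) {
	case POOL_REP:
		return 1;
	case POOL_EC:
		return 0;
	default:
		BUG_ON(1);
	}
}
```

The cost is the extra branch and call at each BUG_ON() site, which is the size increase the posting goes on to measure.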

For ARM multi_v7_defconfig, the size slightly increases:

Richard Weinberger | 19 Nov 23:50 2015

[RFC] Limiting linker scope


UML recently had an interesting bug[1] where the host side of UML
tried to call sigsuspend() but as the kernel itself offers a function
with the same name it called sigsuspend() on
the UML kernel side and funny things happened.

The root cause of the problem is that the UML links userspace
code like glibc, libpcap, etc. to the kernel image and symbols can
clash. Especially if one side is a shared library it will not be noticed
at compile time.

As this is not the first bug of this kind I'm facing on UML I've
started to think how to deal with that.

Is it somehow possible to limit the linker scope?
Such that we can force LD to not blindly link every symbol of
vmlinux into another object? Maybe using a white list?
I have to admit I've never used LD scripts nor GNU export maps,
maybe they can help. Currently I'm reading their docs and hope
to find a way to implement my idea.

The problem is also not specific to UML, the emerging Linux Kernel
Library will suffer from the same issue.
As random programs will link the kernel as library the chance is
high to face similar symbol conflicts.

Maybe we can also find a nice generic solution to limit the linker
scope within the kernel. Such that it does not hurt when a random device
driver exports a symbol like "i".

Will Deacon | 19 Nov 19:11 2015


As illustrated by a3afe70b83fd ("[S390] latencytop s390 support."),
HAVE_LATENCYTOP_SUPPORT is defined by an architecture to advertise an
implementation of save_stack_trace_tsk.

However, as of 9212ddb5eada ("stacktrace: provide save_stack_trace_tsk()
weak alias") a dummy implementation is provided if STACKTRACE=y.
Given that LATENCYTOP already depends on STACKTRACE_SUPPORT and selects

Cc: Vineet Gupta <vgupta <at>>
Cc: Russell King <linux <at>>
Cc: James Hogan <james.hogan <at>>
Cc: Michal Simek <monstr <at>>
Cc: Helge Deller <deller <at>>
Cc: Michael Ellerman <mpe <at>>
Cc: "David S. Miller" <davem <at>>
Cc: Guan Xuetao <gxt <at>>
Cc: Ingo Molnar <mingo <at>>
Cc: Andrew Morton <akpm <at>>
Acked-by: Heiko Carstens <heiko.carstens <at>>
Signed-off-by: Will Deacon <will.deacon <at>>
 arch/arc/Kconfig        | 3 ---
 arch/arm/Kconfig        | 5 -----
 arch/metag/Kconfig      | 3 ---
 arch/microblaze/Kconfig | 3 ---
 arch/parisc/Kconfig     | 3 ---
 arch/powerpc/Kconfig    | 3 ---
 arch/s390/Kconfig       | 3 ---
 arch/sh/Kconfig         | 3 ---

Jia He | 19 Nov 07:48 2015

[PATCH v2 0/3] Improve bitmap_empty and bitmap_full

find_first_{zero_}bit are too heavy for bitmap_{full,empty}: we don't
need to calculate and compare the position of the bit. This patch set
introduces a lightweight API and replaces the heavy one.

v2: Move the declarations to linux/bitops.h for compilation

Jia He (3):
  Move 2 mask macro from bitmap.h to bitops.h
  Introduce 2 bit ops api: all_is_bit_{one,zero}
  Replace find_fisrt_{zero_}bit with the new lightweight api

 include/linux/bitmap.h |  7 ++-----
 include/linux/bitops.h |  7 +++++++
 lib/find_bit.c         | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 59 insertions(+), 5 deletions(-)
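
A lightweight emptiness/fullness check of the kind the cover letter describes could look roughly like the sketch below. It is not the code from the series: the function names all_bits_zero() and all_bits_one() are hypothetical, and the last-word mask is written out locally rather than taken from the kernel headers. The point is that the check only needs a yes/no answer per word and never computes a bit position.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Mask of the valid low bits in the final, partially used word. */
#define LAST_WORD_MASK(nbits) \
	(~0UL >> (-(size_t)(nbits) & (BITS_PER_LONG - 1)))

/* True if none of the first nbits bits of map is set. */
bool all_bits_zero(const unsigned long *map, size_t nbits)
{
	size_t full = nbits / BITS_PER_LONG;
	size_t i;

	for (i = 0; i < full; i++)
		if (map[i])
			return false;
	/* Only look at the tail word if nbits is not word-aligned. */
	if (nbits % BITS_PER_LONG)
		return !(map[full] & LAST_WORD_MASK(nbits));
	return true;
}

/* True if all of the first nbits bits of map are set. */
bool all_bits_one(const unsigned long *map, size_t nbits)
{
	size_t full = nbits / BITS_PER_LONG;
	size_t i;

	for (i = 0; i < full; i++)
		if (~map[i])
			return false;
	if (nbits % BITS_PER_LONG)
		return !(~map[full] & LAST_WORD_MASK(nbits));
	return true;
}
```

Bits beyond nbits in the last word are ignored by both helpers, so stale padding bits do not affect the result.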



Jia He | 19 Nov 03:31 2015

[PATCH 0/3] Improve bitmap_empty and bitmap_full

find_first_{zero_}bit are too heavy for bitmap_{full,empty}: we don't
need to calculate and compare the position of the bit. This patch set
introduces a lightweight API and replaces the heavy one.

Jia He (3):
  linux/bitmap: Move 2 mask macro from bitmap.h to bitops.h
  lib: Introduce 2 find bit api: all_is_bit_{one,zero}
  linux/bitmap: Replace find_fisrt_{zero_}bit with the new lightweight api 

 include/asm-generic/bitops/find.h |  3 +++
 include/linux/bitmap.h            |  7 ++----
 include/linux/bitops.h            |  4 ++++
 lib/find_bit.c                    | 50 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 59 insertions(+), 5 deletions(-)



Will Deacon | 18 Nov 14:11 2015

spin_lock() ordering on ia64

Hi Tony,

I recently took a look at the ia64 spinlock implementation in hope of
some inspiration regarding spin_unlock_wait, but I'm actually having
trouble understanding how spin_lock() guarantees ordering between taking
the lock and reads of some shared data in the critical section.

In particular, the loop where the locker spins awaiting its turn in the

for (;;) {
	asm volatile (" %0=[%1]" : "=r"(serve) : "r"(p) : "memory");

	if (!(((serve >> TICKET_SHIFT) ^ ticket) & TICKET_MASK))

AFAIU, doesn't provide any ordering semantics, so a load to an
unrelated address from within the following critical section could be
speculated before we've actually acquired the lock.
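
For comparison, here is what the ordering in question looks like when spelled out with C11 atomics. This is a generic ticket lock sketch, not the ia64 implementation, and the struct and function names are hypothetical. The load that observes "it is our turn" carries acquire semantics, so loads in the critical section cannot be satisfied before it.

```c
#include <stdatomic.h>

struct ticket_lock {
	atomic_uint next;	/* next ticket to hand out */
	atomic_uint serve;	/* ticket currently being served */
};

void ticket_lock_acquire(struct ticket_lock *l)
{
	/* Taking a ticket needs atomicity but no ordering by itself. */
	unsigned int ticket =
		atomic_fetch_add_explicit(&l->next, 1, memory_order_relaxed);

	/*
	 * Acquire load: once this observes our ticket, accesses that
	 * follow in program order cannot be reordered before it.
	 */
	while (atomic_load_explicit(&l->serve, memory_order_acquire) != ticket)
		;
}

void ticket_lock_release(struct ticket_lock *l)
{
	/* Only the lock holder writes serve, so a relaxed read is enough. */
	unsigned int cur =
		atomic_load_explicit(&l->serve, memory_order_relaxed);

	/* Release store: publishes the critical section to the next owner. */
	atomic_store_explicit(&l->serve, cur + 1, memory_order_release);
}
```

If the spinning load were relaxed instead, nothing in the language model would stop a later load from being satisfied early, which is the concern raised about the loop quoted above.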

What am I missing? Does the ALAT provide guarantees against other loads
somehow or does provide ordering guarantees that I didn't spot
in the manuals?



Michael Schmitz | 17 Nov 05:42 2015

Re: [PATCH 1/4] m68k/mm: motorola - Add missing initialization of max_pfn

Hi Geert,

On 16.11.2015 at 00:04, Geert Uytterhoeven wrote:
> If max_pfn is not initialized, the various /proc/kpage* files are empty,
> and selftests/vm/mlock2-tests will fail. max_pfn is also used by the
> block layer to calculate DMA masks.

What about platforms where the DMA can't address all available physical



> Signed-off-by: Geert Uytterhoeven <geert <at>>
> ---
>  arch/m68k/mm/motorola.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> diff --git a/arch/m68k/mm/motorola.c b/arch/m68k/mm/motorola.c
> index b958916e5eac96b2..8f37fdd80be9e9cc 100644
> --- a/arch/m68k/mm/motorola.c
> +++ b/arch/m68k/mm/motorola.c
> @@ -250,7 +250,7 @@ void __init paging_init(void)
>  	high_memory = phys_to_virt(max_addr);
>  	min_low_pfn = availmem >> PAGE_SHIFT;
> -	max_low_pfn = max_addr >> PAGE_SHIFT;
> +	max_pfn = max_low_pfn = max_addr >> PAGE_SHIFT;

Dave Hansen | 17 Nov 04:35 2015

[PATCH 00/37] x86: Memory Protection Keys

Memory Protection Keys for User pages is a CPU feature which will
first appear on Skylake Servers, but will also be supported on
future non-server parts.  It provides a mechanism for enforcing
page-based protections, but without requiring modification of the
page tables when an application changes protection domains.  See
the Documentation/ patch for more details.

Changes from RFCv3:

 * Added 'current' and 'foreign' variants of get_user_pages() to
   help indicate whether protection keys should be enforced.
   Thanks to Jerome Glisse for pointing out this issue.
 * Added "allocation" and set/get system calls so that we can do
   management of protection keys in the kernel.  This opens the
   door to use of specific protection keys for kernel use in the
   future, such as for execute-only memory.
 * Removed the kselftest code for the moment.  It will be
   submitted separately.

Changes from RFCv2 (Thanks Ingo and Thomas for most of these):

 * few minor compile warnings
 * changed 'nopku' interaction with cpuid bits.  Now, we do not
   clear the PKU cpuid bit, we just skip enabling it.
 * changed __pkru_allows_write() to also check access disable bit
 * removed the unused write_pkru()
 * made si_pkey a u64 and added some patch description details.
   Also made it share space in siginfo with MPX and clarified
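
The change to the write check mentioned in the list above can be illustrated with a small sketch. It assumes the PKRU layout of two bits per key, access-disable in the low bit and write-disable in the high bit; the macro and function names here are illustrative and only loosely mirror the __pkru_allows_write() named in the changelog. Access-disable forbids writes as well as reads, so the write check has to test both bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed PKRU layout: 2 bits per key, AD then WD. */
#define PKRU_AD_BIT		0x1u
#define PKRU_WD_BIT		0x2u
#define PKRU_BITS_PER_PKEY	2

/* Reads are allowed unless access-disable is set for this key. */
bool pkru_allows_read(uint32_t pkru, unsigned int pkey)
{
	unsigned int shift = pkey * PKRU_BITS_PER_PKEY;

	return !(pkru & (PKRU_AD_BIT << shift));
}

/* Writes are blocked by either access-disable or write-disable. */
bool pkru_allows_write(uint32_t pkru, unsigned int pkey)
{
	unsigned int shift = pkey * PKRU_BITS_PER_PKEY;

	return !(pkru & ((PKRU_AD_BIT | PKRU_WD_BIT) << shift));
}
```

With 16 keys the largest shift is 30, so the masks stay within the 32-bit register value.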

Greg Ungerer | 17 Nov 00:54 2015

Re: [PATCH 4/4] m68knommu: Add missing initialization of max_pfn and {min,max}_low_pfn

Hi Geert,

On 15/11/15 21:04, Geert Uytterhoeven wrote:
> If max_pfn is not initialized, the block layer may use wrong DMA masks.
> Replace open-coded shifts by PFN_DOWN() while we're at it.
> Signed-off-by: Geert Uytterhoeven <geert <at>>

Tested and working fine on m68knommu. So:

Tested-By: Greg Ungerer <gerg <at>>

If you respin this patch for any reason I wouldn't object
to removing the "/* 0 on coldfire */" comment...
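
For readers unfamiliar with the helper, PFN_DOWN() is just the open-coded shift with a name. The sketch below shows the kind of initialization the patch description talks about; it is not the patch itself. PAGE_SHIFT is assumed to be 12 here, and struct pfn_limits and compute_pfn_limits() are hypothetical stand-ins for the kernel's global variables and setup code.

```c
/* Assumption for this sketch: 4 KiB pages. */
#define PAGE_SHIFT	12
#define PFN_DOWN(x)	((x) >> PAGE_SHIFT)

/* Hypothetical container for what are globals in the kernel. */
struct pfn_limits {
	unsigned long min_low_pfn;
	unsigned long max_low_pfn;
	unsigned long max_pfn;
};

/* Derive the page frame limits from the physical memory range. */
struct pfn_limits compute_pfn_limits(unsigned long mem_start,
				     unsigned long mem_end)
{
	struct pfn_limits l;

	l.min_low_pfn = PFN_DOWN(mem_start);
	l.max_low_pfn = PFN_DOWN(mem_end);
	/* Without highmem, max_pfn is the same as max_low_pfn. */
	l.max_pfn = l.max_low_pfn;
	return l;
}
```

Leaving max_pfn at zero is what lets consumers such as the block layer derive a wrong DMA mask, per the quoted description.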


> ---
> Compile-tested only.
> ---
>  arch/m68k/kernel/setup_no.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> diff --git a/arch/m68k/kernel/setup_no.c b/arch/m68k/kernel/setup_no.c
> index 88c27d94a7214c95..29b44e69f0f47375 100644
> --- a/arch/m68k/kernel/setup_no.c
> +++ b/arch/m68k/kernel/setup_no.c
> @@ -238,11 +238,14 @@ void __init setup_arch(char **cmdline_p)