Prarit Bhargava | 24 Jul 19:54 2014

[PATCH] cpufreq, store_scaling_governor requires policy->rwsem to be held for duration of changing governors

A while ago we added a test to mimic some of our users' userspace governor
programs which monitor system behaviour and will switch governors on the
fly.  The decision process for this includes looking at time of day,
expected system load, etc.  For some time now we have had reports of
system panics in the cpufreq code when using the userspace governor.

The userspace utility writes
/sys/devices/system/cpu/cpuX/cpufreq/scaling_governor and sets the
governor.  In some cases this can happen rapidly, and under heavy load
there are occasions where the changes are delayed.  This can mean that
several governor changes may occur within a short period of time.

This has exposed a bug in the store_scaling_governor path.  When the sysfs
file is written to,

		-> cpufreq_set_policy()
			__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);

The release of the policy->rwsem results in a situation where another
write to the scaling_governor file can now start and overwrite pointers
and cause corruption.
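
The window described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: struct policy_model, try_concurrent_write() and switch_governor() are hypothetical names, the rwsem is reduced to a flag, and the second sysfs write is replayed at a fixed point rather than running on another CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the relevant policy fields. */
struct policy_model {
	int rwsem_held;        /* 1 while a writer "holds" policy->rwsem */
	const char *governor;  /* name of the current governor */
	int gov_data_valid;    /* cleared by the EXIT step, set by the init step */
};

/* A second write to scaling_governor; it only proceeds if the lock is free. */
static int try_concurrent_write(struct policy_model *p, const char *name)
{
	if (p->rwsem_held)
		return 0;          /* blocked: the first writer still owns the lock */
	p->governor = name;    /* overwrites state the first writer is changing */
	p->gov_data_valid = 1;
	return 1;
}

/*
 * Switch governors. With hold_across_exit == 0 the lock is dropped around
 * the EXIT step, as in the buggy path; with 1 it is held throughout.
 * Returns 1 if the concurrent writer got in during the switch.
 */
static int switch_governor(struct policy_model *p, const char *new_gov,
			   const char *racer, int hold_across_exit)
{
	int raced;

	assert(p != NULL && !p->rwsem_held);
	p->rwsem_held = 1;                      /* models down_write() */
	if (!hold_across_exit)
		p->rwsem_held = 0;              /* lock released before EXIT */
	p->gov_data_valid = 0;                  /* old governor torn down */
	raced = try_concurrent_write(p, racer); /* second write arrives here */
	if (!hold_across_exit)
		p->rwsem_held = 1;              /* lock taken again */
	p->governor = new_gov;                  /* new governor installed */
	p->gov_data_valid = 1;
	p->rwsem_held = 0;                      /* models up_write() */
	return raced;
}
```

In this model the second writer gets in only when the lock is dropped around the EXIT step; holding it for the whole switch keeps the second writer out until the first has finished.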




[Query] cpuidle functions cpuidle_install_idle_handler/cpuidle_uninstall_idle_handler


In drivers/cpuidle/cpuidle.c, there are two functions
cpuidle_install_idle_handler & cpuidle_uninstall_idle_handler.
The names seem confusing to me as they don't install any handler;
rather, they set the 'initialized' variable to 1/0.

In kernel version 3.0, these functions used to look as below where
they installed and uninstalled some handler function (pm_idle) -

void cpuidle_install_idle_handler(void)
{
        if (enabled_devices && (pm_idle != cpuidle_idle_call)) {
                /* Make sure all changes finished before we switch to new idle */
                pm_idle = cpuidle_idle_call;
        }
}

void cpuidle_uninstall_idle_handler(void)
{
        if (enabled_devices && pm_idle_old && (pm_idle != pm_idle_old)) {
                pm_idle = pm_idle_old;
        }
}

In a recent kernel (3.16.0-rc6), the code for the two mentioned
functions looks as below -


[PATCH] cpuidle: menu governor - remove unused macro STDDEV_THRESH

STDDEV_THRESH was once defined and used in the menu governor, but it is no longer
used anywhere, so remove the define.

Signed-off-by: Mohammad Merajul Islam Molla <meraj.enigma <at>>
 drivers/cpuidle/governors/menu.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index c4f80c1..c3732fa 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -35,7 +35,6 @@
 #define RESOLUTION 1024
 #define DECAY 8
 #define MAX_INTERESTING 50000
-#define STDDEV_THRESH 400



To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo <at>
More majordomo info at

Krzysztof Kozlowski | 24 Jul 11:26 2014

[PATCH] cpuidle: coupled: Enable interrupts when early returning on invalid driver

cpuidle_enter_state is expected to return with interrupts enabled.
However cpuidle_enter_state_coupled returned with interrupts disabled if
the cpuidle driver was registered without a mask of coupled CPUs.

This could be observed as a warning:
[    1.613132] ------------[ cut here ]------------
[    1.613244] WARNING: CPU: 0 PID: 0 at kernel/sched/idle.c:175 cpu_idle_loop+0x2dc/0x6d0()
[    1.620268] Modules linked in:
[    1.623311] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-rc4-00102-g7669ccdbf8af-dirty #81
[    1.623619] dwmmc_exynos 12510000.mshc: 1 slots initialized
[    1.624890] logger: created 256K log 'log_main'
[    1.625483] logger: created 256K log 'log_events'
[    1.626084] logger: created 256K log 'log_radio'
[    1.626699] logger: created 256K log 'log_system'
[    1.655960] [<c00167cc>] (unwind_backtrace) from [<c0012c28>] (show_stack+0x10/0x14)
[    1.663679] [<c0012c28>] (show_stack) from [<c050ab00>] (dump_stack+0x70/0xbc)
[    1.670883] [<c050ab00>] (dump_stack) from [<c0023ac8>] (warn_slowpath_common+0x68/0x8c)
[    1.678954] [<c0023ac8>] (warn_slowpath_common) from [<c0023b08>] (warn_slowpath_null+0x1c/0x24)
[    1.687720] [<c0023b08>] (warn_slowpath_null) from [<c006ddd4>] (cpu_idle_loop+0x2dc/0x6d0)
[    1.696052] [<c006ddd4>] (cpu_idle_loop) from [<c006e1d4>] (cpupri_find+0x0/0xd4)
[    1.703518] [<c006e1d4>] (cpupri_find) from [<c07cdd14>] (processor_id+0x0/0x2c)
[    1.710917] ---[ end trace a85327313857296e ]---

Enable the interrupts also when early returning from
cpuidle_enter_state_coupled due to invalid coupled configuration.
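
The rule being restored here (every return path must leave interrupts enabled) can be sketched in a user-space model. This is an illustration under stated assumptions, not the patch itself: the irqs_enabled flag, struct coupled_model and the model_* functions are hypothetical, and the -EINVAL return value for the invalid configuration is assumed rather than taken from the message above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the CPU's interrupt-enable state. */
static bool irqs_enabled = true;

static void model_local_irq_disable(void) { irqs_enabled = false; }
static void model_local_irq_enable(void)  { irqs_enabled = true; }
static bool model_irqs_enabled(void)      { return irqs_enabled; }

/* Hypothetical stand-in for the coupled-CPU description of a device. */
struct coupled_model {
	int online_count;
};

/*
 * Models an enter-state function that is entered with interrupts
 * disabled (an assumption of this sketch) and must return with them
 * enabled on every path, including the early error return.
 */
static int enter_state_coupled_model(const struct coupled_model *coupled,
				     int next_state)
{
	assert(!irqs_enabled);

	if (!coupled) {
		/* Invalid configuration: re-enable before bailing out. */
		model_local_irq_enable();
		return -EINVAL;
	}

	/* ... the actual idle entry would happen here ... */

	model_local_irq_enable();
	return next_state;
}
```

Without the model_local_irq_enable() call in the early-return branch, a caller checking model_irqs_enabled() afterwards would see interrupts still disabled, which corresponds to the condition the warning above reports.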

Signed-off-by: Krzysztof Kozlowski <k.kozlowski <at>>
Fixes: 4126c0197bc8 ("cpuidle: add support for states that affect multiple cpus")
Cc: <stable <at>>

Prarit Bhargava | 23 Jul 22:45 2014

[Question] Why is there a restriction on the policy->rwsem & CPUFREQ_GOV_POLICY_EXIT ?

I'm debugging a race/locking issue in the store_scaling_governor path and
came across the following restriction on using a semaphore in the
cpufreq_policy struct defined in include/linux/cpufreq.h :

         * Additional rules:
         * - Lock should not be held across
         *     __cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);

	struct rw_semaphore	rwsem;

I'm not completely familiar with this code and am wondering why the
restriction is in place?  Is there a worry about the module_put() in



Rafael J. Wysocki | 23 Jul 22:46 2014

[PATCH] PCI / PM: Make PCIe PME interrupts wake up from "freeze" sleep state

From: Rafael J. Wysocki <rafael.j.wysocki <at>>

The "freeze" sleep state, also known as suspend-to-idle, is entered
without taking nonboot CPUs offline, right after devices have been
suspended.  It works by waiting for at least one wakeup source object
to become "active" as a result of handling a hardware interrupt.

Of course, interrupts supposed to be able to wake up the system from
suspend-to-idle cannot be disabled by suspend_device_irqs() and their
interrupt handlers must be able to cope with interrupts coming after
all devices have been suspended.  In that case, they only need to
call __pm_wakeup_event() for a single wakeup source object without
trying to access hardware (that will be resumed later as part of
the subsequent system resume).

Make PCIe PME interrupts work this way.

Register an additional wakeup source object for each PCIe PME
service device.  That object will be used to generate wakeups from

Add IRQF_NO_SUSPEND to PME interrupt flags.  This will make
suspend_device_irqs() ignore PME interrupts, but that's OK,
because the PME interrupt handler is suspend-aware anyway and
can cope with interrupts coming during system suspend-resume.
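
The handler behaviour described above can be sketched as a user-space model. It is a hedged illustration only: struct pme_model and pme_irq_model() are hypothetical names, the wakeup_events counter merely stands in for the __pm_wakeup_event() call mentioned above, and no real PCIe register access is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a PME service device. */
struct pme_model {
	bool suspended;     /* true once the device has been suspended */
	int wakeup_events;  /* times the wakeup source was signaled */
	int hw_accesses;    /* times the handler touched the hardware */
};

/*
 * Suspend-aware interrupt handler: after suspend it only reports a
 * wakeup event and leaves the hardware alone; otherwise it takes the
 * normal path and services the device.
 */
static void pme_irq_model(struct pme_model *d)
{
	assert(d != NULL);

	if (d->suspended) {
		d->wakeup_events++;  /* stands in for __pm_wakeup_event() */
		return;
	}

	d->hw_accesses++;            /* normal path: service the hardware */
}
```

An interrupt arriving while the suspended flag is set increments only wakeup_events; the hardware is left to be handled after resume.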

For each PCIe port with PME service during the "prepare" phase of
system suspend walk the bus below it and see if any devices on that
bus are configured for wakeup.  If so, mark the port as one that can
be used for system wakeup signaling and handle it differently going

Daniel Lezcano | 23 Jul 19:02 2014

[PATCH 1/2] cpuidle: Remove manual selection of the multiple driver support

Like the coupled idle state, it is not up to the user to set this option
but up to the driver to select it.

Remove the interactive selection of this option.

Signed-off-by: Daniel Lezcano <daniel.lezcano <at>>
 drivers/cpuidle/Kconfig |    7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
index 1b96fb9..32748c3 100644
--- a/drivers/cpuidle/Kconfig
+++ b/drivers/cpuidle/Kconfig
@@ -15,12 +15,7 @@ config CPU_IDLE

-        bool "Support multiple cpuidle drivers"
-        default n
-        help
-         Allows the cpuidle framework to use different drivers for each CPU.
-         This is useful if you have a system with different CPU latencies and
-         states. If unsure say N.
+        bool

 	bool "Ladder governor (for periodic timer tick)"


Thomas Petazzoni | 23 Jul 15:00 2014

[PATCHv3 00/16] cpuidle for Marvell Armada 370 and 38x


Here comes the third version of the cpuidle support for Armada 370 and
Armada 38x.

We are hoping to see this patch series merged for 3.17.

Most patches are touching only arch/arm/mach-mvebu/ code so they
should be handled by the mvebu maintainers. However, patches 11-13 are
touching the mvebu cpuidle driver, with a possible issue on patch 11,
which touches both the cpuidle driver and the mach-mvebu code in order
to rename the driver without breaking functionality (if needed, we can
decide to split the commits, it would break functionality temporarily,
but not buildability).

Changes since v2

 * According to the discussion with Daniel Lezcano (cpuidle
   maintainer) and Arnd Bergmann, changed the cpuidle-mvebu-v7 driver
   to actually register three separate cpuidle platform drivers, one
   per-SoC. This way, we don't need special platform data to convey
   the SoC type being used, as this information is already available
   by looking at the driver name.

   This change impacts the patches "cpuidle: mvebu: rename the driver
   from armada-370-xp to mvebu-v7", "cpuidle: mvebu: add Armada 370
   support", "cpuidle: mvebu: add Armada 38x support", "ARM: mvebu:
   add cpuidle support for Armada 370" and "ARM: mvebu: add cpuidle
   support for Armada 38x". Other patches are unchanged. The patch

Brian Norris | 23 Jul 01:48 2014

[PATCH v9] power: reset: Add reboot driver for brcmstb

From: Marc Carino <marc.ceeeee <at>>

Add support for reboot functionality on boards with ARM-based
Broadcom STB chipsets. Make it built-in by default for ARCH_BRCMSTB,
but allow it to be configurable under COMPILE_TEST.

Signed-off-by: Marc Carino <marc.ceeeee <at>>
Signed-off-by: Brian Norris <computersforpeace <at>>
Cc: Sebastian Reichel <sre <at>>
Cc: Dmitry Eremin-Solenikov <dbaryshkov <at>>
Cc: David Woodhouse <dwmw2 <at>>
Cc: linux-pm <at>
Hi Sebastian, can you merge this, or else defer this to Matt?

This driver has been mostly unchanged for a while, but it hasn't gotten an Ack from
any PM guys. It is now separated from the rest of its accompanying series:

The DT bindings have gotten review somewhere in the course of 8 versions.

 drivers/power/reset/Kconfig          |  11 ++++
 drivers/power/reset/Makefile         |   1 +
 drivers/power/reset/brcmstb-reboot.c | 120 +++++++++++++++++++++++++++++++++++
 3 files changed, 132 insertions(+)
 create mode 100644 drivers/power/reset/brcmstb-reboot.c


Mark Brown | 22 Jul 16:43 2014


From: Mark Brown <broonie <at>>

Since the OPP layer is a kernel library which has been converted to be
directly selectable by its callers rather than user selectable and
requiring architectures to enable it explicitly the ARCH_HAS_OPP symbol
has become redundant and can be removed. Do so.

Signed-off-by: Mark Brown <broonie <at>>
Reviewed-by: Viresh Kumar <viresh.kumar <at>>
Acked-by: Nishanth Menon <nm <at>>
Acked-by: Rob Herring <robh <at>>
Acked-by: Shawn Guo <shawn.guo <at>>
Acked-by: Simon Horman <horms+renesas <at>>

Rafael, IIRC you said that you'd applied this but it seems not to have
appeared in -next.  A few more references have crept in; I'll try to
get them removed at source.

 Documentation/power/opp.txt    | 3 ---
 arch/arm/mach-exynos/Kconfig   | 1 -
 arch/arm/mach-highbank/Kconfig | 1 -
 arch/arm/mach-imx/Kconfig      | 1 -
 arch/arm/mach-omap2/Kconfig    | 1 -
 arch/arm/mach-shmobile/Kconfig | 2 --
 arch/arm/mach-vexpress/Kconfig | 1 -
 arch/arm/mach-zynq/Kconfig     | 1 -
 drivers/devfreq/Kconfig        | 1 -
 kernel/power/Kconfig           | 3 ---
 10 files changed, 15 deletions(-)

Thomas Renninger | 22 Jul 16:05 2014

[PATCH 0/4] tools/power/cpupower latest fixes

These are fixes for the cpupower utility I received lately.

Rafael: Would be great if you can queue them in your pm tree.



Himangi Saraogi (1):
  cpupower: mperf monitor: Correct use of ! and &

Peter Senna Tschudin (1):
  cpupower: Remove redundant error check

Rickard Strandqvist (1):
  tools: power: cpupower: bench: parse.c: Fix several minor errors

Thomas Renninger (1):
  cpupower: Adjust MAINTAINERS file

 MAINTAINERS                                        |    2 +-
 tools/power/cpupower/bench/parse.c                 |   39 +++++++++++---------
 tools/power/cpupower/utils/cpufreq-set.c           |   11 +++---
 .../cpupower/utils/idle_monitor/mperf_monitor.c    |    2 +-
 4 files changed, 28 insertions(+), 26 deletions(-)


