Re: [PATCH 3/4] powernv: cpuidle: Redesign idle states management

From: Preeti U Murthy
Date: Wed Nov 12 2014 - 01:51:49 EST


Hi Shreyas,

On 11/03/2014 09:38 PM, Shreyas B. Prabhu wrote:
> diff --git a/arch/powerpc/kernel/idle_power7.S b/arch/powerpc/kernel/idle_power7.S
> index 283c603..df11acb 100644
> --- a/arch/powerpc/kernel/idle_power7.S
> +++ b/arch/powerpc/kernel/idle_power7.S
> _GLOBAL(power7_idle)
> /* Now check if user or arch enabled NAP mode */
> @@ -141,49 +192,16 @@ _GLOBAL(power7_idle)
>
> _GLOBAL(power7_nap)
> mr r4,r3
> - li r3,0
> + li r3,1

The comment at the top of this file states 0 for nap and 1 for sleep.
You will need to change that. As an alternative I would suggest using
the macros that you have already defined:PNV_THREAD_NAP and
PNV_THREAD_SLEEP to write to r3 above and remove the lines that say 0
for nap and 1 for sleep in the comments.

> b power7_powersave_common
> /* No return */
>

<snip>

> @@ -210,12 +226,91 @@ _GLOBAL(power7_wakeup_tb_loss)
> BEGIN_FTR_SECTION
> CHECK_HMI_INTERRUPT
> END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
> +
> + li r7,1
> + mfspr r8,SPRN_PIR
> + /*
> + * The last 3 bits of PIR represents the thread id of a cpu
> + * in power8. This will need adjusting for power7.
> + */
> + andi. r8,r8,0x07 /* Get thread id into r8 */
> + rotld r7,r7,r8
> + /* r7 now has 'thread_id'th bit set */
> +
> + ld r14,PACA_CORE_IDLE_STATE_PTR(r13)
> +lwarx_loop2:
> + lwarx r15,0,r14
> + andi. r9,r15,PNV_CORE_IDLE_LOCK_BIT
> + /*
> + * Lock bit is set in one of the 2 cases-
> + * a. In the sleep/winkle enter path, the last thread is executing
> + * fastsleep workaround code.
> + * b. In the wake up path, another thread is executing fastsleep
> + * workaround undo code or resyncing timebase or restoring context
> + * In either case loop until the lock bit is cleared.
> + */
> + bne lwarx_loop2
> +
> + cmpwi cr2,r15,0
> + or r15,r15,r7 /* Set thread bit */
> +
> + beq cr2,first_thread
> +
> + /* Not first thread in core to wake up */
> + stwcx. r15,0,r14
> + bne- lwarx_loop2
> + b common_exit
> +
> +first_thread:
> + /* First thread in core to wakeup */
> + ori r15,r15,PNV_CORE_IDLE_LOCK_BIT
> + stwcx. r15,0,r14
> + bne- lwarx_loop2
> +
> + LOAD_REG_ADDR(r3, pnv_need_fastsleep_workaround)
> + lbz r3,0(r3)
> + cmpwi r3,1
> + /* skip fastsleep workaround if its not needed */
> + bne timebase_resync
> +
> + /* Undo fast sleep workaround */
> + mfcr r16 /* Backup CR into a non-volatile register */

Don't you want to do this ^^ before calling opal_call_realmode for
timebase resync below also?

> + li r3,1
> + li r4,0
> + li r0,OPAL_CONFIG_CPU_IDLE_STATE
> + bl opal_call_realmode
> + mtcr r16 /* Restore CR */
> +
> + /* Do timebase resync if we are waking up from sleep. Use cr1 value
> + * set in exceptions-64s.S */
> + ble cr1,clear_lock
> +
> +timebase_resync:
> /* Time base re-sync */
> - li r3,OPAL_RESYNC_TIMEBASE
> + li r0,OPAL_RESYNC_TIMEBASE
> bl opal_call_realmode;
> -
> diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
> index 34c6665..980c964 100644
> --- a/arch/powerpc/platforms/powernv/setup.c
> +++ b/arch/powerpc/platforms/powernv/setup.c
> @@ -36,6 +36,8 @@
> #include <asm/opal.h>
> #include <asm/kexec.h>
> #include <asm/smp.h>
> +#include <asm/cputhreads.h>
> +#include <asm/cpuidle.h>
>
> #include "powernv.h"
>
> @@ -292,11 +294,55 @@ static void __init pnv_setup_machdep_rtas(void)
>
> static u32 supported_cpuidle_states;
>
> +static void pnv_alloc_idle_core_states(void)
> +{
> + int i, j;
> + int nr_cores = cpu_nr_cores();
> + u32 *core_idle_state;
> +
> + /*
> + * Deep idle states like sleep and winkle are per core idle states.
> + * A core enters these states only when all the threads enter either
> + * the particular idle state or a deeper one. There are tasks like
> + * fastsleep hardware bug workaround and hypervisor core state save
> + * which have to be done only by the last thread of the core entering
> + * deep idle state and similarly tasks like timebase resync, hypervisor
> + * core register restore that have to be done only by the first thread
> + * waking up from these states. Introducing core_idle_state, a per core
> + * structure which will keep track threads entering idle states deeper
> + * than sleep.

Since you already have explained ^^ in the changelog, you do not need to
elaborate it here.

> + * core_idle_state - First 8 bits track the idle state of each thread
> + * of the core. The 8th bit is the lock bit. Initially all thread bits
> + * are set. They are cleared when the thread enters deep idle state
> + * like sleep and winkle. Initially the lock bit is cleared.

you can simply have the comment about the bits of core_idle_state
without having to mention about when they are cleared etc..

> + * The lock bit has 2 purposes
> + * a. While the first thread is restoring core state, it prevents
> + * from other threads in the core from switching to prcoess context.
> + * b. While the last thread in the core is saving the core state, it
> + * prevent a different thread from waking up.

The above two points are useful. As far as I see besides explaining the
bits of core_idle_state structure and the purpose of lock bit the rest
of the comments is redundant. A git-blame will let people know why all
this is needed. The comment section should not be used up for this
purpose IMO.

Regards
Preeti U Murthy

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/