Re: [PATCH] intel_idle: replace conditionals with static_cpu_has(X86_FEATURE_ARAT)

From: Jacob Pan
Date: Fri Oct 06 2017 - 14:34:07 EST


On Fri, 6 Oct 2017 13:19:45 -0400
Jason Baron <jbaron@xxxxxxxxxx> wrote:

> If the 'arat' cpu flag is set, then the conditionals in intel_idle()
> that guard calling tick_broadcast_enter()/exit() will never be true.
> Use static_cpu_has(X86_FEATURE_ARAT) to create a fast path to replace
> the conditional.
>
> Signed-off-by: Jason Baron <jbaron@xxxxxxxxxx>
> Cc: Jacob Pan <jacob.jun.pan@xxxxxxxxxxxxxxx>
> Cc: Len Brown <lenb@xxxxxxxxxx>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> ---
> drivers/idle/intel_idle.c | 16 +++++++++++-----
> 1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> index 5dc7ea4..5db5e31 100644
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -913,8 +913,7 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
> struct cpuidle_state *state = &drv->states[index];
> unsigned long eax = flg2MWAIT(state->flags);
> unsigned int cstate;
> -
> - cstate = (((eax) >> MWAIT_SUBSTATE_SIZE) & MWAIT_CSTATE_MASK) + 1;
> + bool uninitialized_var(tick);
>
> /*
> * NB: if CPUIDLE_FLAG_TLB_FLUSHED is set, this idle transition
> @@ -923,12 +922,19 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
> * useful with this knowledge.
> */
>
> - if (!(lapic_timer_reliable_states & (1 << (cstate))))
> - tick_broadcast_enter();
> + if (!static_cpu_has(X86_FEATURE_ARAT)) {
> + cstate = (((eax) >> MWAIT_SUBSTATE_SIZE) &
> + MWAIT_CSTATE_MASK) + 1;
> + tick = false;
> + if (!(lapic_timer_reliable_states & (1 << (cstate)))) {
> + tick = true;
> + tick_broadcast_enter();
> + }
> + }
>
> mwait_idle_with_hints(eax, ecx);
>
> - if (!(lapic_timer_reliable_states & (1 << (cstate))))
> + if (!static_cpu_has(X86_FEATURE_ARAT) && tick)
> tick_broadcast_exit();
>
> return index;

It seems better to set up a function pointer at init time that selects
whether we do the tick broadcast or not (two functions). There is no
need to check the CPU feature on every entry.
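
As a rough illustration of the function-pointer idea, here is a userspace
sketch rather than driver code. The names cpu_has_arat, idle_no_broadcast(),
idle_with_broadcast(), idle_init() and do_idle() are hypothetical, the
tick_broadcast and mwait helpers are stubs that only count calls, and the
MWAIT_* values are assumed for the sake of the example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for this sketch; the driver takes them from kernel headers. */
#define MWAIT_SUBSTATE_SIZE 4
#define MWAIT_CSTATE_MASK   0xf

/* Hypothetical stand-ins for kernel state, so the sketch builds in userspace. */
static bool cpu_has_arat;                     /* models the 'arat' CPU flag */
static unsigned int lapic_timer_reliable_states;
static int broadcast_enter_calls, broadcast_exit_calls, mwait_calls;

/* Stubs that only count how often they are called. */
static void tick_broadcast_enter(void) { broadcast_enter_calls++; }
static void tick_broadcast_exit(void)  { broadcast_exit_calls++; }
static void mwait_idle_with_hints(unsigned long eax, unsigned long ecx)
{
	(void)eax;
	(void)ecx;
	mwait_calls++;
}

/* Variant used when the LAPIC timer keeps running in every C-state. */
static int idle_no_broadcast(unsigned long eax, unsigned long ecx, int index)
{
	mwait_idle_with_hints(eax, ecx);
	return index;
}

/* Variant used when the LAPIC timer may stop in deeper C-states. */
static int idle_with_broadcast(unsigned long eax, unsigned long ecx, int index)
{
	unsigned int cstate =
		((eax >> MWAIT_SUBSTATE_SIZE) & MWAIT_CSTATE_MASK) + 1;
	bool tick = !(lapic_timer_reliable_states & (1u << cstate));

	if (tick)
		tick_broadcast_enter();
	mwait_idle_with_hints(eax, ecx);
	if (tick)
		tick_broadcast_exit();

	return index;
}

/* Chosen once at init time; the idle path itself never tests the feature. */
static int (*idle_enter)(unsigned long eax, unsigned long ecx, int index);

static void idle_init(void)
{
	idle_enter = cpu_has_arat ? idle_no_broadcast : idle_with_broadcast;
}

static int do_idle(unsigned long eax, unsigned long ecx, int index)
{
	assert(idle_enter != NULL);	/* idle_init() must have run first */
	return idle_enter(eax, ecx, index);
}
```

The feature test happens once in idle_init(); every later do_idle() call is
a single indirect call with no per-entry check.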