Re: [tip:x86/urgent] x86/mce: Fix the MCE poll timer logic

From: Chen Gong
Date: Wed Jun 06 2012 - 03:51:30 EST


On 2012/6/6 15:10, tip-bot for Chen Gong wrote:
> Commit-ID: 958fb3c51295764599d6abce87e1a01ace897a3e
> Gitweb: http://git.kernel.org/tip/958fb3c51295764599d6abce87e1a01ace897a3e
> Author: Chen Gong <gong.chen@xxxxxxxxxxxxxxx>
> AuthorDate: Tue, 5 Jun 2012 10:35:02 +0800
> Committer: Ingo Molnar <mingo@xxxxxxxxxx>
> CommitDate: Wed, 6 Jun 2012 08:28:21 +0200
>
> x86/mce: Fix the MCE poll timer logic
>
> In commit 82f7af09 ("x86/mce: Cleanup timer mess"), Thomas just
> forgot the "/ 2" there while cleaning up.
>
> Signed-off-by: Chen Gong <gong.chen@xxxxxxxxxxxxxxx>
> Acked-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: bp@xxxxxxxxx
> Cc: tony.luck@xxxxxxxxx
> Link: http://lkml.kernel.org/r/1338863702-9245-1-git-send-email-gong.chen@xxxxxxxxxxxxxxx
> Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
> ---
> arch/x86/kernel/cpu/mcheck/mce.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> index 0a687fd..a97f3c4 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -1274,7 +1274,7 @@ static void mce_timer_fn(unsigned long data)
>  	 */
>  	iv = __this_cpu_read(mce_next_interval);
>  	if (mce_notify_irq())
> -		iv = max(iv, (unsigned long) HZ/100);
> +		iv = max(iv / 2, (unsigned long) HZ/100);
>  	else
>  		iv = min(iv * 2, round_jiffies_relative(check_interval * HZ));
>  	__this_cpu_write(mce_next_interval, iv);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
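
For reference, the interval adaptation that the patch restores can be
modelled in user space. This is only a sketch: HZ and the 5 minute cap
are assumed values, the helper names are made up for illustration, and
the round_jiffies_relative() rounding from the real code is left out.

```c
/* Assumed values for illustration only; not taken from the kernel tree. */
#define HZ 1000UL
#define CHECK_INTERVAL_SECS 300UL	/* assumed 5 minute polling cap */

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* Pre-fix behaviour: an event never shortens the interval. */
unsigned long next_interval_before_fix(unsigned long iv, int event_seen)
{
	if (event_seen)
		return max_ul(iv, HZ / 100);
	return min_ul(iv * 2, CHECK_INTERVAL_SECS * HZ);
}

/* Post-fix behaviour: halve on an event, double otherwise, both clamped. */
unsigned long next_interval(unsigned long iv, int event_seen)
{
	if (event_seen)
		return max_ul(iv / 2, HZ / 100);
	return min_ul(iv * 2, CHECK_INTERVAL_SECS * HZ);
}
```

With these assumed values, an event at the cap leaves the pre-fix
interval unchanged, while the fixed version halves it until it reaches
the HZ/100 floor.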

In fact, there still exists another potential issue:

static void __mcheck_cpu_init_timer(void)
{
	struct timer_list *t = &__get_cpu_var(mce_timer);
	unsigned long iv = __this_cpu_read(mce_next_interval);

	setup_timer(t, mce_timer_fn, smp_processor_id());

	if (mce_ignore_ce)
		return;

	__this_cpu_write(mce_next_interval, iv);
	if (!iv)
		return;
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Because the 2nd patch has not been merged yet, iv is zero when this
function is called. That means the poll timers are not registered at
startup; they stay unarmed until some other condition triggers
*add_timer_on*.

	t->expires = round_jiffies(jiffies + iv);
	add_timer_on(t, smp_processor_id());
}
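
The early return can be seen in a small user-space model of the control
flow. This is a sketch under assumptions: struct poll_timer_model and
init_poll_timer() are hypothetical names, "now" stands in for jiffies,
and the round_jiffies() rounding is omitted.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU timer state. */
struct poll_timer_model {
	bool armed;
	unsigned long expires;
};

/*
 * Mirrors the order of checks in the function quoted in this mail:
 * the timer is only armed when CE polling is enabled and iv is non-zero.
 * Returns true if the timer was armed.
 */
bool init_poll_timer(struct poll_timer_model *t, unsigned long now,
		     unsigned long iv, bool ignore_ce)
{
	t->armed = false;
	t->expires = 0;

	if (ignore_ce)
		return false;
	if (!iv)
		return false;	/* iv == 0: nothing gets queued */

	t->expires = now + iv;
	t->armed = true;
	return true;
}
```

Calling init_poll_timer() with iv == 0 leaves the model timer unarmed,
which is the situation described for the unpatched tree.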

Another potential issue is that this function calls smp_processor_id()
twice. If the context changes between the two calls, they could return
different CPU numbers. (I'm not sure whether that can happen; besides
secondary_cpu kickoff, CPU online/offline also calls these functions,
even in virtualization environments, etc.) So I think it would be better
to save the value at the beginning of this function. Does that make
sense?
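
A user-space sketch of the idea, not the kernel code itself: the fake
CPU-id function below deliberately returns a different value on every
call to model the worst case, and all names in it are hypothetical.

```c
/* Hypothetical model: which CPU the timer data and the queueing refer to. */
struct timer_model {
	int data_cpu;	/* CPU id passed as timer data */
	int queued_cpu;	/* CPU id the timer is queued on */
};

static int fake_cpu_calls;

/* Alternates between 0 and 1 to imitate a migration between two reads. */
static int fake_smp_processor_id(void)
{
	return fake_cpu_calls++ % 2;
}

/* Reads the CPU id twice, as the quoted function does. */
void init_timer_two_reads(struct timer_model *t)
{
	t->data_cpu = fake_smp_processor_id();
	t->queued_cpu = fake_smp_processor_id();
}

/* Reads the CPU id once and reuses it, as proposed above. */
void init_timer_one_read(struct timer_model *t)
{
	int cpu = fake_smp_processor_id();

	t->data_cpu = cpu;
	t->queued_cpu = cpu;
}
```

In this model the two-read variant ends up with mismatched CPU ids,
while the one-read variant is always consistent. Whether a migration
can really happen on these call paths is the open question.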
