Re: [patch 2/2] x86: mce: Implement cmci poll mode for intelmachines

From: Thomas Gleixner
Date: Tue Jun 05 2012 - 09:35:20 EST


On Tue, 5 Jun 2012, Chen Gong wrote:
> ä 2012/6/5 4:01, Thomas Gleixner åé:
> > On Mon, 4 Jun 2012, Chen Gong wrote:
> > static void mce_timer_fn(unsigned long data)
> > {
> > ...
> > + /* Might have become 0 after CMCI storm subsided */
> > + if (iv) {
> > + t->expires = jiffies + iv;
> > + add_timer_on(t, smp_processor_id());
> > + }
> > +}
>
> I've found under some conditions, *t* is pending on the timer tree, so
> add_timer_on will crash the whole system. Furthermore, if this timer

How should that happen? This is the timer callback function, which is
called from the timer code when the timer expired. It CANNOT be
pending at that point. The add_timer_on() in mce_timer_kick() might
cause that BUG to trigger, but definitely not the one here in the
timer callback.

> function triggers "WARN_ON(smp_processor_id() != data);", this timer

What causes the timer to be added on the wrong CPU in the first place?
The WARN_ON meriliy detects it after the fact.

> will be added on the other CPU, which means it loses the chance to
> decrement *cmci_storm_on_cpus* to zero to reenable the CMCI. Maybe
> this situation happens seldomly, but once it happens, CMCI will never
> be actived again after it is disabled.
>
> > +void mce_timer_kick(unsigned long interval)
> > +{
> > + struct timer_list *t = &__get_cpu_var(mce_timer);
> > + unsigned long when = jiffies + interval;
> > + unsigned long iv = __this_cpu_read(mce_next_interval);
> > +
> > + if (time_before(when, t->expires) && timer_pending(t)) {
> > + mod_timer(t, when);
> > + } else if (!mce_next_interval) {
> > + t->expires = round_jiffies(jiffies + iv);
> > + add_timer_on(t, smp_processor_id());
>
> I've changed "else if (!mce_next_interval)" to "else if (iv)", but

else if (!iv) perhaps ?

> I still think it is not right. Imaging *when* is after t->expires and
> this timer is pending on the timer tree, so it will hit *else if*
> if iv is not zero(common situations), again, add_timer_on will trigger
> BUG_ON because this timer is pending.

See above. Aside of that I rewrote the code to be more clear. See
delta patch below.

> > static void intel_threshold_interrupt(void)
> > {
> > + if (cmci_storm_detect())
> > + return;
> > machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
> > mce_notify_irq();
> > }
>
> I think cmci_storm_detect should be placed in the machine_check_poll,
> not out of it. Because machine_check_poll it the core execution logic
> for CMCI handling, in the meanwhile, poll timer and mce-inject module
> call machine_check_poll at any time. If poll timer or mce-inject run
> too quickly, the CMCI handler has trouble. Whereas, if
> cmci_storm_detect is in the machine_check_poll, this kind of possibility
> can be avoid.

No, it's wrong to put it into machine_check_poll().

machine_check_poll() is a generic functionality which has nothing to
do with CMCI storms. That CMCI storm/poll stuff is intel specific and
therefor the detection logic is in the intel specific interrupt
handler and nowhere else.

Thanks,

tglx
---
arch/x86/kernel/cpu/mcheck/mce.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

Index: linux-2.6/arch/x86/kernel/cpu/mcheck/mce.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/mcheck/mce.c
+++ linux-2.6/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1305,13 +1305,14 @@ void mce_timer_kick(unsigned long interv
unsigned long when = jiffies + interval;
unsigned long iv = __this_cpu_read(mce_next_interval);

- if (time_before(when, t->expires) && timer_pending(t)) {
- mod_timer(t, when);
- } else if (!mce_next_interval) {
- t->expires = round_jiffies(jiffies + iv);
+ if (timer_pending(t)) {
+ if (time_before(when, t->expires))
+ mod_timer(t, when);
+ } else {
+ t->expires = round_jiffies(jiffies + when);
add_timer_on(t, smp_processor_id());
}
- if (interval < iv)
+ if (interval > iv)
__this_cpu_write(mce_next_interval, iv);
}