Re: nmi_watchdog suspicious
From: Cyrill Gorcunov
Date: Tue Jun 17 2008 - 11:51:55 EST
[Maciej W. Rozycki - Tue, Jun 17, 2008 at 12:20:28AM +0100]
| On Mon, 16 Jun 2008, Cyrill Gorcunov wrote:
|
| > Maciej, I think nmi_watchdog could (and probably should) be defined as
| > unsigned. Here my points of why (fix me please if I'm wrong):
| >
| > - if we remain it as unsigned we could simplify setup_nmi_watchdog() to
| > just check for 'if (nmi >= NMI_INVALID)'
|
| This is run once only at the boot if at all -- just to verify the range
| is correct. Other places are executed multiple times during normal
| operation and it is them you should optimise for.
|
| > - current code does check for NMI_NONE _and_ NMI_DISABLED at once in most
| > cases (only the case it dont is - proc_nmi_enabled() wich could be simplified too)
|
| Please note the intent is NMI_DISABLED is a bootstrap default to tell the
| platform the user has not specified any override. With the 32-bit
| platform it used to be promoted automatically to NMI_IO_APIC or
| NMI_LOCAL_APIC as appropriate, but it was removed because of stability
| problems with many systems. It looks it wasn't done in a particularly
| fortunate way -- the new promotion should be to NMI_NONE, but instead it
| was removed altogether.
|
| Preferably the initialization to NMI_NONE should be done as soon as it
| has been determined there was no "nmi_watchdog=" option specified, but in
| practice I think it can simply be done at the beginning of trap_init(),
| before the gate descriptor has been set up for the NMI (after which point
| the NMI handler can be reached). This way no piece of code other than
| setup_nmi_watchdog() would have to care about negative values of
| nmi_watchdog.
|
| > - the only affected of such sign/unsign contention I found is
| > touch_nmi_watchdog() for which I suggested the patch (already in Ingo's tip tree)
| > http://lkml.org/lkml/2008/6/12/200
| > So there could be some 'useless counters resetting' but it could happen for
| > quite short time while APIC in initialization phase.
|
| This is a sloppy coding practice which has led us to the current
| situation with the APIC code -- there should be no "useless code
| execution" unless absolutely unavoidable. I'd feel more comfortable if
| there was a separate variable like nmi_watchdog_active checked in the
| handler instead of nmi_watchdog that would only be set once the watchdog
| has actually been activated.
|
| The whole idea of touch_nmi_watchdog() itself is rather unfortunate too,
| but that's apparently not an easy problem to solve.
|
| Maciej
|
Thanks a lot Maciej for comments! I've marked them. I'm not sure but it seems
I wrote a bit unclear /my english bad indeed/ ;) I mean - this say 'slipping'
(ie useless code executions) _was_ before the patch applied. Now it doesn't
slip on this since we do mention explicitly in which case there should be
alert counters reset. Other then that - will try to handle your notes. Thanks!
- Cyrill -
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/