Re: [patch] enabling APIC and NMI watchdog on UP systems

From: Keith Owens (kaos@ocs.com.au)
Date: Fri Sep 29 2000 - 20:04:55 EST


On Fri, 29 Sep 2000 10:00:37 +0200 (CEST),
Ingo Molnar <mingo@elte.hu> wrote:
>On Fri, 29 Sep 2000, Keith Owens wrote:
>
>> arch/i386/kernel/msr.c has been there since at least 2.4.0-test1. A
>
>there is nothing performance-counter specific about /dev/msr. There is no
>highlevel performance-counter logic in the kernel. So if the kernel
>decides to use one of them, that should not be more of a problem than some
>other user already using the performance counter.

/dev/msr exposes all msrs to user space, including performance
counters. There is code out there that uses /dev/msr to get
performance data. My worry was not about the kernel using performance
counters, because there is no such use. I worry about userspace code
that is already using performance counters.
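
To make the userspace side concrete, here is a minimal sketch in C of how such code typically reads a counter. The helper name read_msr is hypothetical, and the device path is an assumption passed in by the caller (the per-cpu node is commonly /dev/cpu/0/msr). The sketch relies on the msr driver using the file offset as the MSR index; on P6-family CPUs PERFCTR1 is MSR 0xC2.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define MSR_P6_PERFCTR1 0xC2u   /* P6-family performance counter 1 */

/*
 * Hypothetical helper: read one 64-bit MSR through an msr character
 * device. dev_path is supplied by the caller because the node name
 * varies between systems. Returns 0 on success, -1 on any failure.
 */
int read_msr(const char *dev_path, uint32_t msr, uint64_t *value)
{
    assert(dev_path != NULL && value != NULL);

    int fd = open(dev_path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* The MSR index is used as the file offset; each read is 8 bytes. */
    ssize_t n = pread(fd, value, sizeof(*value), (off_t)msr);
    close(fd);

    return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

Any tool built this way assumes it owns the counter between reads, which is exactly what breaks if the kernel starts rewriting PERFCTR1 behind its back.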

>You can simply turn performance counters off via /dev/msr, just like
>you would clear them prior to use anyway.

Sorry, no. As your patch stands, on UP you can turn off the PERFCTR1
NMI via /dev/msr. On SMP you end up with two sources of NMI: the local
APIC on every cpu plus the IO-APIC. The code in
linux/arch/i386/kernel/traps.c resets PERFCTR1 on every NMI, whether or
not PERFCTR1 was the NMI source. That makes PERFCTR1 permanently
unavailable on SMP boxes. Even if the user changes EVNTSEL1 via
/dev/msr, traps.c on SMP will still overwrite the PERFCTR1 value on
every NMI. Not only that, but with two NMI sources the NMI oopser now
kicks in twice as fast on SMP.

  Problem 1: We need a field nmi_watchdog_source so that traps.c can
  tell which NMI source is active. I put that code in there for a good
  reason.

  Problem 2: Booting multiple cpu machines when NMI is generated via
  local APIC causes problems. Each cpu spins in start_secondary()
  until all cpus are active. The local APIC NMI oopser is active at
  this point and after about the fifth cpu it kicks in. This problem
  exists in my kdb patch as well. The easiest solution is to add
  field nmi_watchdog_source and set that to 2 (SMP mode) at the start
  of do_boot_cpu().

  Problem 3: You replaced my wrmsr_eio call, which tested the return
  code, with a plain wrmsr and no return code test. wrmsr_eio returns
  -EIO for an invalid msr; wrmsr will oops. I was allowing for hardware
  that has a local APIC but does not support performance counters. Does
  AMD fall into this category? wrmsr_eio is much safer at almost zero
  extra execution cost.
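
The point of Problem 3 can be illustrated with a small userspace model. This is not the kernel's wrmsr_eio: fake_wrmsr_eio, struct fake_msr and setup_perfctr_watchdog are hypothetical names, and the "hardware" is just an array of the MSRs a given cpu implements. The only behaviour taken from the thread is that a write to an invalid msr returns -EIO instead of faulting. The MSR numbers are the P6-family ones.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_P6_PERFCTR1 0xC2u
#define MSR_P6_EVNTSEL1 0x187u

/* One implemented register in the simulated cpu. */
struct fake_msr {
    uint32_t index;
    uint64_t value;
};

/* Checked write: 0 on success, -EIO if the cpu lacks this msr. */
int fake_wrmsr_eio(struct fake_msr *bank, size_t n,
                   uint32_t index, uint64_t value)
{
    assert(bank != NULL);
    for (size_t i = 0; i < n; i++) {
        if (bank[i].index == index) {
            bank[i].value = value;
            return 0;
        }
    }
    return -EIO;
}

/* Watchdog setup that backs out cleanly on unsupported hardware. */
int setup_perfctr_watchdog(struct fake_msr *bank, size_t n,
                           uint64_t evntsel, uint64_t reload)
{
    int err = fake_wrmsr_eio(bank, n, MSR_P6_EVNTSEL1, evntsel);
    if (err)
        return err;   /* caller can fall back to another NMI source */
    return fake_wrmsr_eio(bank, n, MSR_P6_PERFCTR1, reload);
}
```

The caller pays one extra comparison per write and in exchange gets an error code it can act on, rather than an oops during boot.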

I can live without a config or /proc option for adjusting the NMI
oopser. Field nmi_watchdog_source is required; wrmsr_eio is recommended.
