Re: BUG: scheduling while atomic in acpi_ps_complete_op

From: Vegard Nossum
Date: Mon Aug 24 2009 - 10:50:33 EST


2009/8/24 Eric Paris <eparis@xxxxxxxxxx>:
> On Sat, 2009-08-22 at 01:24 +0400, Alexey Starikovskiy wrote:
>> Eric Paris wrote:
>> > On Sat, 2009-08-22 at 00:12 +0400, Alexey Starikovskiy wrote:
>> >> Hi,
>> >> This should be handled by abe1dfab60e1839d115930286cb421f5a5b193f3.
>> >
>> > And yet I'm getting it from linux-next today.
>> >
>> > So you are apparently failing the in_atomic_preempt_off() test but
>> > succeeding in your !irqs_disabled() test.
>> >
>> > Something isn't right since I'm hitting it hundreds of times on boot.
>> >
>> > -Eric
>> >
>> Ok, let's see if replacing irqs_disabled() with
>> in_atomic_preempt_off() helps...
>
> It does stop my slew of warnings.  Not sure it completely fixes my
> problems though....
>
> [    1.897021] ... counter mask:             0000000700000003
> [    1.906821] ACPI: Core revision 20090625
> [   10.000008] INFO: RCU detected CPU 0 stall (t=10000 jiffies)
> [   10.000008] sending NMI to all CPUs:
> [   21.907580] Setting APIC routing to flat
> [   21.973314] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> [   21.985260] CPU0: Intel(R) Xeon(R) CPU           X5355  @ 2.66GHz stepping 07
> [   21.992017] kmemcheck: Limiting number of CPUs to 1.
> [   21.993065] kmemcheck: Initialized
> [   22.750118] Brought up 1 CPUs
> [   22.751069] Total of 1 processors activated (5333.45 BogoMIPS).
> [   23.493639] khelper used greatest stack depth: 4848 bytes left
> [   24.999193] Booting paravirtualized kernel on bare hardware
> [   25.265364] Time: 17:50:52  Date: 08/21/09
> [   25.616191] NET: Registered protocol family 16
> [   27.765113] ACPI: bus type pci registered
> [   28.795307] PCI: Using configuration type 1 for base access
> [   61.793279] bio: create slab <bio-0> at 0
> [   95.285367] ACPI: BIOS _OSI(Linux) query ignored
> [  102.628227] ACPI: Interpreter enabled
> [  102.630134] ACPI: (supports S0 S1 S5)
> [  102.823225] ACPI: Using IOAPIC for interrupt routing
> [  142.365090] ACPI: No dock devices found.
> [  156.864036] ACPI: PCI Root Bridge [PCI0] (0000:00)
> [  157.460654] pci 0000:00:07.3: quirk: region 1000-103f claimed by PIIX4 ACPI
> [  157.463937] pci 0000:00:07.3: quirk: region 1040-104f claimed by PIIX4 SMB
> [  157.644036] pci 0000:00:11.0: transparent bridge
> [  193.009036] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 11 14 15) *0, disabled.
> [  193.938036] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 *9 10 11 14 15)
> [  194.864036] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 *11 14 15)
> [  195.780036] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 10 11 14 15) *0, disabled.
>
> Something took 20 seconds between "ACPI: Core revision 20090625" and
> "Setting APIC routing to flat"
>
> This is a linux-next kernel, on vmware-server, with kmemcheck enabled.
> Disabling kmemcheck seems to make all of this go away.  If not the ACPI
> guys who should I be talking to?
>
> A little bit later I finally see backtraces from NMIs because of RCU
> stalls.  Anyone have ideas here?
>
> [  213.168161] INFO: RCU detected CPU 0 stall (t=10004 jiffies)

So this is probably just the intrinsic slowness of kmemcheck causing
the big delays and RCU stalls. As far as I understand, it shouldn't
cause any other badness; the 10000 jiffies limit is just a heuristic.
Maybe we need to adjust it when kmemcheck is enabled.
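
To put numbers on the delays quoted above, here is a small sketch that
pulls the timestamps out of a dmesg-style log and measures the time
between two messages. The helper names (parse, gap_between,
stall_seconds) are made up for illustration, and HZ=1000 is only an
assumption: the log does not state the tick rate, although a
10000-jiffy stall reported at the 10 second mark would fit it.

```python
import re

# Matches a dmesg-style prefix such as "[   21.907580] message",
# with or without a leading "> " quote marker.
STAMP = re.compile(r"\[\s*(\d+\.\d+)\]\s*(.*)")

def parse(lines):
    """Return (seconds, message) pairs for lines that carry a timestamp."""
    entries = []
    for line in lines:
        match = STAMP.search(line)
        if match:
            entries.append((float(match.group(1)), match.group(2).rstrip()))
    return entries

def gap_between(entries, first, second):
    """Seconds between the first messages containing each substring."""
    t_first = next(t for t, msg in entries if first in msg)
    t_second = next(t for t, msg in entries if second in msg)
    return t_second - t_first

def stall_seconds(jiffies, hz):
    """Convert a jiffy count to seconds for an assumed tick rate."""
    return jiffies / hz

# Four lines copied from the boot log quoted in this thread.
LOG = """\
> [    1.906821] ACPI: Core revision 20090625
> [   10.000008] INFO: RCU detected CPU 0 stall (t=10000 jiffies)
> [   10.000008] sending NMI to all CPUs:
> [   21.907580] Setting APIC routing to flat
""".splitlines()

entries = parse(LOG)
delay = gap_between(entries, "ACPI: Core revision", "Setting APIC routing")
limit = stall_seconds(10000, hz=1000)  # HZ=1000 is an assumption

print(f"gap: {delay:.3f} s, stall limit at assumed HZ: {limit:.1f} s")
```

Under that assumption the stall limit is 10 seconds, so a 20 second
gap during ACPI init would be expected to trip it at least once.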

I'm more confused about the change you had to make with
irqs_disabled()/in_atomic_preempt_off().
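
For reference, the situation Eric describes (atomic according to
in_atomic_preempt_off(), yet interrupts still enabled) can be sketched
with a toy userspace model. This is not kernel code. Context,
may_resched_old and may_resched_new are hypothetical names, and the
model assumes a preemptible kernel in which holding a plain spinlock
raises the preempt count without touching the interrupt flag. As far
as I understand, the real macro compares a masked preempt count
against a base value; the model reduces that to a nonzero test.

```python
from dataclasses import dataclass

@dataclass
class Context:
    """Toy stand-in for the state the two kernel checks inspect."""
    preempt_count: int = 0   # assumed to be raised while a spinlock is held
    irqs_off: bool = False   # True once local interrupts are disabled

def irqs_disabled(ctx):
    """Model of irqs_disabled(): looks only at the interrupt flag."""
    return ctx.irqs_off

def in_atomic_preempt_off(ctx):
    """Simplified model: any nonzero preempt count counts as atomic."""
    return ctx.preempt_count != 0

def may_resched_old(ctx):
    """Guard as discussed upthread before the change (assumption)."""
    return not irqs_disabled(ctx)

def may_resched_new(ctx):
    """Guard after swapping in in_atomic_preempt_off() (assumption)."""
    return not in_atomic_preempt_off(ctx)

# Plain process context: both guards allow rescheduling.
process_ctx = Context()

# Spinlock held with interrupts still enabled: the guards disagree.
under_spinlock = Context(preempt_count=1, irqs_off=False)

for name, ctx in [("process", process_ctx), ("spinlock", under_spinlock)]:
    print(name, may_resched_old(ctx), may_resched_new(ctx))
```

In this model only the second context separates the two guards: the
old one would allow a reschedule while the preempt count is nonzero,
which is the condition the "scheduling while atomic" check complains
about.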


Vegard