Re: [PATCH] ioapic: fix potential resume deadlock

From: Daniel J Blueman
Date: Tue May 10 2011 - 06:53:33 EST


On 10 May 2011 15:35, Ingo Molnar <mingo@xxxxxxx> wrote:
> * Daniel J Blueman <daniel.blueman@xxxxxxxxx> wrote:
>
>> Fix a potential deadlock when resuming; here the calling function
>> has disabled interrupts, so we cannot sleep.
>>
>> Signed-off-by: Daniel J Blueman <daniel.blueman@xxxxxxxxx>
>>
>> ---
>>  arch/x86/kernel/apic/io_apic.c |    4 ++--
>>  1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
>> index 45fd33d..df63620 100644
>> --- a/arch/x86/kernel/apic/io_apic.c
>> +++ b/arch/x86/kernel/apic/io_apic.c
>> @@ -621,14 +621,14 @@ struct IO_APIC_route_entry **alloc_ioapic_entries(void)
>>       struct IO_APIC_route_entry **ioapic_entries;
>>
>>       ioapic_entries = kzalloc(sizeof(*ioapic_entries) * nr_ioapics,
>> -                             GFP_KERNEL);
>> +                             GFP_ATOMIC);
>>       if (!ioapic_entries)
>>               return 0;
>>
>>       for (apic = 0; apic < nr_ioapics; apic++) {
>>               ioapic_entries[apic] =
>>                       kzalloc(sizeof(struct IO_APIC_route_entry) *
>> -                             nr_ioapic_registers[apic], GFP_KERNEL);
>> +                             nr_ioapic_registers[apic], GFP_ATOMIC);
>>               if (!ioapic_entries[apic])
>>                       goto nomem;
>>       }
>
> Hm, there must be some other bug here.
>
> ioapic entries should be allocated on system bootup and should never really be
> deallocated.
>
> The bug might be in the idea to call to enable_IR_x2apic() on resume - why are
> ioapic entries reallocated there? That call in default_setup_apic_routing() was
> introduced some time ago in:
>
>  fa47f7e52874: x86, x2apic: Simplify apic init in SMP and UP builds
>
> and that it results in reallocation in the suspend patch is distinctly wrong.
>
> I suspect the reason why this has not triggered for others is the relative
> rarity of affected systems?
>
> Suresh?

For reference, this is the warning I get on resume with debugging enabled:

BUG: sleeping function called from invalid context at mm/slub.c:824
in_atomic(): 0, irqs_disabled(): 1, pid: 1842, name: pm-suspend
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last enabled at (0): [< (null)>] (null)
hardirqs last disabled at (0): [<ffffffff8105eca2>] copy_process+
0x4e2/0x1040
softirqs last enabled at (0): [<ffffffff8105eca2>] copy_process+0x4e2/0x1040
softirqs last disabled at (0): [< (null)>] (null)
Pid: 1842, comm: pm-suspend Tainted: G M 2.6.39-rc6-330cd+ #3
Call Trace:
[<ffffffff81098290>] ? print_irqtrace_events+0xd0/0xe0
[<ffffffff8104a11d>] __might_sleep+0xed/0x120
[<ffffffff8112f7c3>] __kmalloc+0xd3/0x140
[<ffffffff81025566>] alloc_ioapic_entries+0x66/0xb0
[<ffffffff81022bc5>] lapic_resume+0x265/0x350
[<ffffffff814059d5>] syscore_resume+0x35/0xd0
[<ffffffff810a8ace>] suspend_enter+0xde/0x120
[<ffffffff810a8b7a>] suspend_devices_and_enter+0x6a/0xb0
[<ffffffff810a8c91>] enter_state+0xd1/0x100
[<ffffffff810a81c6>] state_store+0xc6/0x100
[<ffffffff81329c37>] kobj_attr_store+0x17/0x20
[<ffffffff811a78d1>] sysfs_write_file+0xe1/0x160
[<ffffffff811400c6>] vfs_write+0xc6/0x180
[<ffffffff811403dc>] sys_write+0x4c/0x90
[<ffffffff8170927b>] system_call_fastpath+0x16/0x1b
--
Daniel J Blueman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/