Re: corruption causing crash in __queue_work

From: Nikolay Borisov
Date: Wed Dec 09 2015 - 11:23:25 EST




On 12/09/2015 06:08 PM, Tejun Heo wrote:
> Hello, Nikolay.
>
> On Wed, Dec 09, 2015 at 02:08:56PM +0200, Nikolay Borisov wrote:
>> [73309.529940] BUG: unable to handle kernel NULL pointer dereference at (null)
>> [73309.530238] IP: [<ffffffff8106b663>] __queue_work+0xb3/0x390
> ...
>> [73309.537319] <IRQ>
>> [73309.537373] [<ffffffff8106b940>] ? __queue_work+0x390/0x390
>> [73309.537714] [<ffffffff8106b958>] delayed_work_timer_fn+0x18/0x20
>> [73309.537891] [<ffffffff810ad1d7>] call_timer_fn+0x47/0x110
>> [73309.538071] [<ffffffff810be302>] ? tick_sched_timer+0x52/0xa0
>> [73309.538249] [<ffffffff810adb6f>] run_timer_softirq+0x17f/0x2b0
>> [73309.538425] [<ffffffff8106b940>] ? __queue_work+0x390/0x390
>> [73309.538604] [<ffffffff81057f40>] __do_softirq+0xe0/0x290
>> [73309.538778] [<ffffffff810581e6>] irq_exit+0xa6/0xb0
>> [73309.538952] [<ffffffff8159413a>] smp_apic_timer_interrupt+0x4a/0x59
>> [73309.539128] [<ffffffff815926bb>] apic_timer_interrupt+0x6b/0x70
> ...
>> The gist is that this fail on the following line:
>>
>> if (last_pool && last_pool != pwq->pool) {
>
> That's new.
>
>> Since the pointer 'pwq' is wrong (it is loaded in %rdx) which in this
>> case is 0000000000000000. Looking at the function's source pwq should
>> be loaded by per_cpu_ptr since the if (!(wq->flags & WQ_UNBOUND))
>> check should evaluate to false. So pwq is loaded as the result from
>> unbound_pwq_by_node(wq, cpu_to_node(cpu));
>>
>> Here are the flags of the workqueue:
>> crash> struct workqueue_struct.flags 0xffff8803df464c00
>> flags = 131082
>
> That's ordered unbound workqueue w/ a rescuer.

So the name of the queue is 'dm-thin'. Looking at the sources in
dm-thin, the only place where a workqueue is allocated is here:

pool->wq = alloc_ordered_workqueue("dm-" DM_MSG_PREFIX, WQ_MEM_RECLAIM);

But in this case I guess the caller can't be the culprit? I'm biased wrt
dm-thin because in the past few months I've hit multiple bugs.
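
For what it's worth, the flags value quoted above (131082 == 0x2000a) does
look consistent with that allocation. As far as I can tell,
alloc_ordered_workqueue() ORs WQ_UNBOUND | __WQ_ORDERED into whatever flags
the caller passes. Below is a small userspace sketch that decodes the value.
The bit positions are my reading of include/linux/workqueue.h in 4.2 and
should be treated as an assumption to check against the actual tree;
decode_wq_flags() and the table are just illustrative names, not kernel code.

```c
#include <stdio.h>
#include <string.h>

/* Bit -> name table. ASSUMPTION: values as read from
 * include/linux/workqueue.h in a 4.2 tree; verify before relying on it. */
struct wq_flag_name {
	unsigned int bit;
	const char *name;
};

static const struct wq_flag_name wq_flag_names[] = {
	{ 1u << 1,  "WQ_UNBOUND" },
	{ 1u << 2,  "WQ_FREEZABLE" },
	{ 1u << 3,  "WQ_MEM_RECLAIM" },
	{ 1u << 4,  "WQ_HIGHPRI" },
	{ 1u << 5,  "WQ_CPU_INTENSIVE" },
	{ 1u << 6,  "WQ_SYSFS" },
	{ 1u << 7,  "WQ_POWER_EFFICIENT" },
	{ 1u << 16, "__WQ_DRAINING" },
	{ 1u << 17, "__WQ_ORDERED" },
};

/*
 * Hypothetical helper: write the set flags into buf as "A|B|C".
 * Returns the bits that were not decoded (unknown bits, or bits left
 * over because buf was too small), so 0 means everything was named.
 */
static unsigned int decode_wq_flags(unsigned int flags, char *buf, size_t len)
{
	size_t used = 0;
	size_t i;

	if (len)
		buf[0] = '\0';

	for (i = 0; i < sizeof(wq_flag_names) / sizeof(wq_flag_names[0]); i++) {
		int n;

		if (!(flags & wq_flag_names[i].bit))
			continue;

		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", wq_flag_names[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* output truncated, stop decoding */

		used += (size_t)n;
		flags &= ~wq_flag_names[i].bit;
	}

	return flags;
}
```

Calling decode_wq_flags(131082, buf, sizeof(buf)) with a 128-byte buffer
should leave no bits over and name WQ_UNBOUND, WQ_MEM_RECLAIM and
__WQ_ORDERED, which matches the "ordered unbound workqueue w/ a rescuer"
reading.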

>
>> (0xffff8803df464c00 is indeed the pointer to the workqueue struct,
>> so the flags aren't bogus).
>>
>> So reading the numa_pwq_tbl it seems that it's uninitialised:
>>
>> crash> struct workqueue_struct.numa_pwq_tbl 0xffff8803df464c00
>> numa_pwq_tbl = 0xffff8803df464d10
>> crash> rd -64 0xffff8803df464d10 3
>> ffff8803df464d10: 0000000000000000 0000000000000000 ................
>> ffff8803df464d20: 0000000000000000 ........
>>
>> The machine where the crash occurred has a single NUMA node, so at the
>> very least I would have expected to have a pointer, rather than NULL ptr.
>>
>> Also this crash is not isolated in that I have observed it on multiple
>> other nodes running vanilla 4.2.5/4.2.6 kernels.
>>
>> Any advice how to further debug that?
>
> Adding printk or tracepoints at numa_pwq_tbl_install() to dump what's
> being installed would be helpful. It should at least tell us whether
> it's the table being corrupted by something else or workqueue failing
> to set it up correctly to begin with. How reproducible is the
> problem?

I think we are seeing this at least daily on at least 1 server (we have
multiple servers like that). So adding printk's would likely be the way
to go. Is there anything in particular you would be interested in knowing?
I see RCU stuff around, so it might be a tricky race condition.


>
> Thanks.
>