Re: [ANNOUNCE] v5.14-rc4-rt4

From: Jens Axboe
Date: Wed Aug 04 2021 - 09:32:43 EST


On 8/4/21 7:17 AM, Peter Zijlstra wrote:
> On Wed, Aug 04, 2021 at 01:00:57PM +0200, Sebastian Andrzej Siewior wrote:
>> On 2021-08-04 12:48:05 [+0200], To Daniel Wagner wrote:
>>> On 2021-08-04 12:43:42 [+0200], To Daniel Wagner wrote:
>>>> Odd. Do you have a config for that, please?
>>>
>>> No need.
>>> | [ 90.202543] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:35
>>> | [ 90.202549] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 2047, name: iou-wrk-2041
>>> | [ 90.202555] CPU: 5 PID: 2047 Comm: iou-wrk-2041 Tainted: G W 5.14.0-rc4-rt4+ #89
>>> | [ 90.202561] Call Trace:
>> …
>>> | [ 90.202588] rt_spin_lock+0x19/0x70
>>> | [ 90.202593] ___slab_alloc+0xcb/0x7d0
>> …
>>> | [ 90.202618] kmem_cache_alloc_trace+0x79/0x1f0
>>> | [ 90.202621] io_wqe_dec_running.isra.0+0x98/0xe0
>>> | [ 90.202625] io_wq_worker_sleeping+0x37/0x50
>>> | [ 90.202628] schedule+0x30/0xd0
>>>
>>> Let me look.
>>
>> So this is due to commit
>> 685fe7feedb96 ("io-wq: eliminate the need for a manager thread")
>>
>> introduced in the v5.13-rc1 merge window. The call chain is
>>   schedule()
>>    sched_submit_work()
>>     preempt_disable();
>>     io_wq_worker_sleeping()
>>       raw_spin_lock_irq(&worker->wqe->lock);
>>       io_wqe_dec_running(worker);
>>        io_queue_worker_create()
>>         kmalloc(sizeof(*cwd), GFP_ATOMIC);
>>
>> The lock wqe::lock has been turned into a raw_spinlock_t in commit
>> 95da84659226d ("io_wq: Make io_wqe::lock a raw_spinlock_t")
>>
>> after a careful analysis of the code at that time. This commit breaks
>> things. Is this really needed?
>
> Urgh, doing allocs from schedule seems really yuck. Can we please not do
> this?

Agree, I have an idea of how to get rid of it. Let me experiment a bit...

--
Jens Axboe