Re: [PATCH 1/2] io_uring: clear TIF_NOTIFY_SIGNAL when running task work
From: Pavel Begunkov
Date: Tue Aug 10 2021 - 17:33:47 EST
On 8/10/21 9:28 AM, Nadav Amit wrote:
>> On Aug 9, 2021, at 2:48 PM, Olivier Langlois <olivier@xxxxxxxxxxxxxx> wrote:
>> On Sat, 2021-08-07 at 17:13 -0700, Nadav Amit wrote:
>>> From: Nadav Amit <namit@xxxxxxxxxx>
>>> When using SQPOLL, the submission queue polling thread calls
>>> task_work_run() to run queued work. However, when work is added with
>>> TWA_SIGNAL - as done by io_uring itself - the TIF_NOTIFY_SIGNAL remains
>>> set afterwards and is never cleared.
>>> Consequently, when the submission queue polling thread checks whether
>>> signal_pending(), it may always find a pending signal, if
>>> task_work_add() was ever called before.
>>> The impact of this bug might be different on different kernel versions.
>>> It appears that on 5.14 it would only cause unnecessary calculation and
>>> prevent the polling thread from sleeping. On 5.13, where the bug was
>>> found, it stops the polling thread from finding newly submitted work.
>>> Instead of task_work_run(), use tracehook_notify_signal() that clears
>>> TIF_NOTIFY_SIGNAL. Test for TIF_NOTIFY_SIGNAL in addition to
>>> current->task_works to avoid a race in which task_works is cleared but
>>> the TIF_NOTIFY_SIGNAL is set.
>> thx a lot for this patch!
>> This explains what I am seeing here:
>> I was under the impression that task_work_run() was clearing
>> your patch made me realize that it does not…
> Happy it could help.
> Unfortunately, there seems to be yet another issue (unless my code
> somehow caused it). It seems that when SQPOLL is used, there are cases
> in which we get stuck in io_uring_cancel_sqpoll() when tctx_inflight()
> never goes down to zero.
> Debugging... (while also trying to make some progress with my code)
It's most likely because a request has been lost (mis-refcounted).
Let us know if you need any help. Would be great to solve it for 5.14.
1) if not already, try out Jens' 5.14 branch
2) try to characterise the io_uring use pattern. Poll requests?
Read/write requests? Send/recv? Filesystem vs bdev vs sockets?
If easily reproducible, you can match io_alloc_req() with it
getting into io_dismantle_req();