Re: [Bug] fio hang when running multiple job io_uring/hipri over nvme

From: Jens Axboe
Date: Fri Jun 18 2021 - 10:56:04 EST


On 6/17/21 4:39 PM, Ming Lei wrote:
> On Thu, Jun 17, 2021 at 10:56:53AM -0600, Jens Axboe wrote:
>> On 6/17/21 10:48 AM, Jens Axboe wrote:
>>> On 6/17/21 5:17 AM, Ming Lei wrote:
>>>> Hello,
>>>>
>>>> fio hangs when running the test[1]; the issue is not observed when
>>>> running the same test with a single job.
>>>>
>>>> v5.12 is good, both v5.13-rc3 and the latest v5.13-rc6 are bad.
>>>>
>>>>
>>>> [1] fio test script and log
>>>> + fio --bs=4k --ioengine=io_uring --fixedbufs --registerfiles --hipri
>>>> --iodepth=64 --iodepth_batch_submit=16
>>>> --iodepth_batch_complete_min=16 --filename=/dev/nvme0n1 --direct=1
>>>> --runtime=20 --numjobs=4 --rw=randread
>>>> --name=test --group_reporting
>>>>
>>>> test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
>>>> 4096B-4096B, ioengine=io_uring, iodepth=64
>>>> ...
>>>> fio-3.25
>>>> Starting 4 processes
>>>> fio: filehash.c:64: __lookup_file_hash: Assertion `f->fd != -1' failed.
>>>> fio: pid=1122, got signal=6
>>>> ^Cbs: 3 (f=0): [f(1),r(1),K(1),r(1)][63.6%][eta 00m:20s]
>>>
>>> Funky, would it be possible to bisect this? I'll see if I can reproduce.
>>
>> Actually, this looks like a fio bug, that assert is a bit too trigger
>> happy. Current -git should work, please test and see if things work.
>> I believe it's just kernel timing that causes this, not a kernel issue.
>
> Yeah, current -git does work, thanks for the fix!

Thanks for checking!

--
Jens Axboe