Re: [PATCH] mm/filemap: invalidating pages is still necessary when io with IOCB_NOWAIT
From: Jens Axboe
Date: Mon May 27 2024 - 11:36:39 EST
On 5/27/24 4:09 AM, Liu Wei wrote:
> I am a newer, thanks for the reminder.
>
>>
>>>> when we issuing AIO with direct I/O and IOCB_NOWAIT on a block device, the
>>>> process context will not be blocked.
>>>>
>>>> However, if the device already has page cache in memory, EAGAIN will be
>>>> returned. And even when trying to reissue the AIO with direct I/O and
>>>> IOCB_NOWAIT again, we consistently receive EAGAIN.
>>
>> -EAGAIN doesn't mean "just try again and it'll work".
>>
>>>> Maybe a better way to deal with it: filemap_fdatawrite_range dirty pages
>>>> with WB_SYNC_NONE flag, and invalidate_mapping_pages unmapped pages at
>>>> the same time.
>>>>
>>>> Signed-off-by: Liu Wei <liuwei09@xxxxxxxx>
>>>> ---
>>>> mm/filemap.c | 9 ++++++++-
>>>> 1 file changed, 8 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/filemap.c b/mm/filemap.c
>>>> index 30de18c4fd28..1852a00caf31 100644
>>>> --- a/mm/filemap.c
>>>> +++ b/mm/filemap.c
>>>> @@ -2697,8 +2697,15 @@ int kiocb_invalidate_pages(struct kiocb *iocb, size_t count)
>>>>
>>>> if (iocb->ki_flags & IOCB_NOWAIT) {
>>>> /* we could block if there are any pages in the range */
>>>> - if (filemap_range_has_page(mapping, pos, end))
>>>> + if (filemap_range_has_page(mapping, pos, end)) {
>>>> + if (mapping_needs_writeback(mapping)) {
>>>> + __filemap_fdatawrite_range(mapping,
>>>> + pos, end, WB_SYNC_NONE);
>>>> + }
>>
>> I don't think WB_SYNC_NONE tells it not to block, it just says not to
>> wait for it... So this won't work as-is.
>
> Yes, but I think an asynchronous write-back is better than simply
> return EAGAIN. By using __filemap_fdatawrite_range to trigger a
> writeback, subsequent retries may have a higher chance of success.

And what's the application supposed to do, just hammer on the same
IOCB_NOWAIT submission until it then succeeds? The only way this can
reasonably work for that would be if you can do:

1) Issue IOCB_NOWAIT IO
2) Get -EAGAIN
3) Sync kick off writeback, wait for it to be done
4) Issue IOCB_NOWAIT IO again
5) Success

If you just kick it off, then you'd repeat steps 1..2 ad nauseam until
it works out, not tenable.
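
For illustration only, the 1..5 flow above could be sketched in userspace C
roughly as follows. Every name here (struct nowait_ops,
submit_with_sync_retry, the fake_* helpers) is hypothetical and not a kernel
or libc API. The fake device is a toy that assumes the synchronous step
really does clear whatever caused -EAGAIN, which is exactly the property a
fire-and-forget writeback does not give you.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callbacks; illustrative names, not a real API. */
struct nowait_ops {
	/* Non-blocking attempt: bytes done, or -EAGAIN if it would block.
	 * In a real program this might wrap pwritev2() with RWF_NOWAIT. */
	long (*submit_nowait)(void *ctx);
	/* Blocking step: start writeback AND wait for it; 0 or -errno.
	 * In a real program this might wrap fdatasync(). */
	int (*sync_wait)(void *ctx);
};

/* Steps 1..5: try, and on -EAGAIN sync synchronously, then retry once. */
static long submit_with_sync_retry(const struct nowait_ops *ops, void *ctx)
{
	long ret;
	int err;

	assert(ops && ops->submit_nowait && ops->sync_wait);

	ret = ops->submit_nowait(ctx);		/* step 1 */
	if (ret != -EAGAIN)			/* step 2 not hit: done */
		return ret;

	err = ops->sync_wait(ctx);		/* step 3: must be allowed to block */
	if (err < 0)
		return err;

	/* Step 4: a single retry. May still be -EAGAIN; the caller then
	 * needs a blocking fallback rather than another spin. */
	return ops->submit_nowait(ctx);
}

/* Toy model (assumption): "dirty" stands in for the would-block condition. */
struct fake_dev {
	int dirty;
	int submits;
	int syncs;
};

static long fake_submit(void *ctx)
{
	struct fake_dev *d = ctx;

	d->submits++;
	return d->dirty ? -EAGAIN : 4096;
}

static int fake_sync(void *ctx)
{
	struct fake_dev *d = ctx;

	d->syncs++;
	d->dirty = 0;	/* toy assumption: waiting for sync clears the condition */
	return 0;
}

static const struct nowait_ops fake_ops = {
	.submit_nowait = fake_submit,
	.sync_wait = fake_sync,
};
```

Note that step 3 has to run somewhere that is allowed to block, so this only
makes sense if the caller has such a context available. If the retry in step
4 still returns -EAGAIN, the sketch hands that back instead of looping.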

And this doesn't even include the other point I mentioned, which is
__filemap_fdatawrite_range() IO issue blocking in the first place.

So no, NAK on this patch.

--
Jens Axboe