Re: [PATCH v1 03/10] mm: fs: remove filemap_nr_thps*() functions and their users

From: David Hildenbrand (Arm)

Date: Wed Apr 01 2026 - 15:18:25 EST


On 4/1/26 17:32, Zi Yan wrote:
> On 1 Apr 2026, at 10:35, David Hildenbrand (Arm) wrote:
>
>> On 3/27/26 16:05, Zi Yan wrote:
>>>
>>>
>>> But I added
>>>
>>> + if (!is_shmem && inode_is_open_for_write(mapping->host))
>>> + result = SCAN_FAIL;
>>>
>>> That keeps the original bail out, right?
>>
>> Independent of that, are we sure that the possible race we allow is ok?
>
> Let me think.
>
> do_dentry_open() -> file_get_write_access() -> get_write_access() bumps
> inode->i_writecount atomically and it turns inode_is_open_for_write()
> to true. Then, do_dentry_open() also truncates all pages
> if filemap_nr_thps() is not zero. This pairs with khugepaged’s first
> filemap_nr_thps_inc() then inode_is_open_for_write() to prevent opening
> a fd with write when there is a read-only THP.
>
> After removing READ_ONLY_THP_FOR_FS, khugepaged only creates read-only THPs
> on FSes with large folio support (to be precise, THP support). If an fd
> is opened for write before the inode_is_open_for_write() check, khugepaged
> will stop, which is fine. But if an fd is opened for write after the
> inode_is_open_for_write() check, khugepaged will try to collapse a read-only
> THP while the fd can be written at the same time.

Exactly, that's the race I mean.
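
To make the window concrete, here is a minimal user-space sketch of the
check-then-act ordering. It is not kernel code: struct model_inode,
model_open_for_write() and the timing enum are hypothetical stand-ins for
inode->i_writecount, inode_is_open_for_write() and the point at which
get_write_access() runs. It replays fixed interleavings rather than using
real threads.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct inode; only the write count is modeled. */
struct model_inode {
	int writecount;
};

/* Models inode_is_open_for_write(): true while any writer holds the inode. */
static bool model_open_for_write(const struct model_inode *inode)
{
	return inode->writecount > 0;
}

/* Where the open-for-write lands relative to the collapse-side check. */
enum model_open_timing {
	MODEL_NO_OPEN,
	MODEL_OPEN_BEFORE_CHECK,
	MODEL_OPEN_AFTER_CHECK,
};

/*
 * Replays one fixed interleaving and returns true if the collapse goes
 * ahead while a writer exists.
 */
static bool model_collapse_overlaps_writer(enum model_open_timing timing)
{
	struct model_inode inode = { .writecount = 0 };
	bool bail_out;

	if (timing == MODEL_OPEN_BEFORE_CHECK)
		inode.writecount++;	/* models get_write_access() */

	bail_out = model_open_for_write(&inode);	/* collapse-side check */

	if (timing == MODEL_OPEN_AFTER_CHECK)
		inode.writecount++;	/* writer arrives inside the window */

	if (bail_out)
		return false;	/* models result = SCAN_FAIL */

	/* Collapse proceeds; report whether a writer is now present. */
	return model_open_for_write(&inode);
}

/* Expected outcome of each interleaving in this model. */
void model_check_outcomes(void)
{
	assert(!model_collapse_overlaps_writer(MODEL_NO_OPEN));
	assert(!model_collapse_overlaps_writer(MODEL_OPEN_BEFORE_CHECK));
	assert(model_collapse_overlaps_writer(MODEL_OPEN_AFTER_CHECK));
}
```

Only the open-after-check case lets the collapse run alongside a writer.
The model says nothing about whether that overlap is harmful; it only
shows that the check alone does not exclude it.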

>
> I notice that fd write requires locking the to-be-written folio first
> (I see it from f_ops->write_iter() -> write_begin_get_folio() and assume
> f_ops->write() has the same locking requirement) and khugepaged has already
> locked the to-be-collapsed folio before inode_is_open_for_write(). So if the
> fd is opened for write after inode_is_open_for_write() check, its write
> will wait for khugepaged collapse and see a new THP. Since the FS
> supports THP, writing to the new THP should be fine.
>
> Let me know if my analysis above makes sense. If yes, I will add it
> to the commit message and add a succinct comment about it before
> inode_is_open_for_write().

The khugepaged code is the only code that replaces folios in the pagecache
with other folios. So my main concern is whether that replacement is
problematic under concurrent write access.

You argue that the folio lock is sufficient. That's certainly true for
individual folios, but I am more concerned about the replacement part.

I don't have anything concrete, primarily just pointing out that this is
a change that might unlock some code paths that could not have been
triggered before.
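
As a rough illustration of the "replacement" part, here is a user-space
sketch of a writer that looked up a small folio just before the collapse
swapped it out. It is not kernel code and all names (struct model_folio,
struct model_mapping, model_collapse(), model_write_lookup()) are
hypothetical. It assumes the writer follows a lookup, lock, recheck pattern
and that the collapse clears the old folio's mapping pointer; locking itself
is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_mapping;

/* Hypothetical stand-in for a folio; mapping is NULL once it left the cache. */
struct model_folio {
	struct model_mapping *mapping;
	unsigned int order;
};

/* Hypothetical one-slot page cache. */
struct model_mapping {
	struct model_folio *slot;
};

/* Models the collapse: swap the cached folio for the huge one. */
static void model_collapse(struct model_mapping *m, struct model_folio *huge)
{
	struct model_folio *old = m->slot;

	assert(old != NULL && old != huge);
	huge->mapping = m;
	m->slot = huge;
	old->mapping = NULL;	/* old folio is no longer part of the cache */
}

/*
 * Models a writer: look up the folio, (conceptually) lock it, then recheck
 * that it still belongs to the mapping. If collapse_after_lookup is set,
 * the collapse runs between the lookup and the recheck.
 */
static struct model_folio *model_write_lookup(struct model_mapping *m,
					       struct model_folio *huge,
					       bool collapse_after_lookup)
{
	for (;;) {
		struct model_folio *folio = m->slot;	/* lookup */

		if (collapse_after_lookup) {
			model_collapse(m, huge);
			collapse_after_lookup = false;
		}

		/* A real writer would take the folio lock at this point. */
		if (folio->mapping == m)
			return folio;	/* still cached, safe to use */
		/* Stale folio: drop it and repeat the lookup. */
	}
}
```

In this model the writer ends up on the new folio only because it rechecks
after the lookup. A path that skips the recheck would keep operating on the
replaced folio, which is the kind of case worth auditing.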

--
Cheers,

David