Re: [PATCH v2] mm, thp: use head page in __migration_entry_wait

From: Kirill A. Shutemov
Date: Wed Jun 09 2021 - 05:59:42 EST


On Tue, Jun 08, 2021 at 02:35:23PM +0100, Matthew Wilcox wrote:
> On Tue, Jun 08, 2021 at 03:58:38PM +0300, Kirill A. Shutemov wrote:
> > On Tue, Jun 08, 2021 at 01:32:21PM +0100, Matthew Wilcox wrote:
> > > On Tue, Jun 08, 2021 at 03:00:26PM +0300, Kirill A. Shutemov wrote:
> > > > But there's one quirk: if split succeed we effectively wait on wrong
> > > > page to be unlocked. And it may take indefinite time if split_huge_page()
> > > > was called on the head page.
> > >
> > > Hardly indefinite time ... callers of split_huge_page_to_list() usually
> > > unlock the page soon after. Actually, I can't find one that doesn't call
> > > unlock_page() within a few lines of calling split_huge_page_to_list().
> >
> > I didn't check all callers, but it's not guaranteed by the interface and
> > it's not hard to imagine a future situation when a page got split on the
> > way to IO and kept locked until IO is complete.
>
> I would say that can't happen. Pages are locked when added to the page
> cache and are !Uptodate. You can't put a PTE in a process page table
> until it's Uptodate, and once it's Uptodate, the page is unlocked. So
> any subsequent locks are transient, and not for the purposes of IO
> (writebacks only take the page lock transiently).

Documentation/filesystems/locking.rst:

Note, if the filesystem needs the page to be locked during writeout, that
is ok, too, the page is allowed to be unlocked at any point in time
between the calls to set_page_writeback() and end_page_writeback().

I may be misinterpreting what is written here. I know very little about
the writeback path.

> > The wake up shouldn't have much overhead as in most cases split going to
> > be called on the head page.
>
> I'm not convinced about that. We go out of our way to not wake up pages
> (eg PageWaiters), and we've had some impressively long lists in the past
> (which is why we now have the bookmarks).

Maybe we should be smarter about when to wake up, I dunno.

I just noticed that with the change we have the /potential/ to wait a long
time for the wrong page to be unlocked. The split_huge_page() interface
doesn't enforce that the page gets unlocked soon after the split is complete.
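
To make the concern concrete, here is a toy userspace model, not kernel
code. All names (toy_page, toy_wait_target, toy_split) are hypothetical, and
the model assumes that a successful split leaves only the page passed in by
the caller locked and unlocks every other subpage.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SUBPAGES 4	/* toy size; a real THP has many more subpages */

/* Hypothetical stand-in for struct page: only what the argument needs. */
struct toy_page {
	bool locked;	/* models PG_locked */
	int head;	/* index of the compound head; own index once split */
};

/* Build a locked compound page: page 0 is the head and holds the lock. */
static void toy_make_locked_thp(struct toy_page *p)
{
	for (int i = 0; i < NR_SUBPAGES; i++) {
		p[i].locked = (i == 0);
		p[i].head = 0;
	}
}

/* Page a waiter would sleep on if it picks the head, as in the patch. */
static int toy_wait_target(const struct toy_page *p, int idx)
{
	return p[idx].head;
}

/*
 * Model a successful split (assumption stated above): every subpage
 * becomes independent, and all of them end up unlocked except the one
 * the caller passed in, which stays locked until the caller unlocks it.
 */
static void toy_split(struct toy_page *p, int passed_in)
{
	for (int i = 0; i < NR_SUBPAGES; i++) {
		p[i].head = i;
		p[i].locked = (i == passed_in);
	}
}
```

With a migration entry that points at subpage 2, toy_wait_target() picks
page 0. After toy_split(p, 0), page 2 is unlocked and usable, but page 0 is
still locked, so in this model the waiter keeps sleeping until whoever called
the split unlocks page 0.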

--
Kirill A. Shutemov