Re: [PATCH, RFC 00/10] THP refcounting redesign

From: Kirill A. Shutemov
Date: Tue Jun 10 2014 - 16:47:29 EST


On Tue, Jun 10, 2014 at 03:25:42PM -0500, Christoph Lameter wrote:
> On Mon, 9 Jun 2014, Kirill A. Shutemov wrote:
>
> > To be able to split huge page at any point we have to track which tail
> > page was pinned. It leads to tricky and expensive get_page() on tail pages
> > and also occupy tail_page->_mapcount.
>
> Maybe we should give up the requirement to be able to split a huge page at
> any point?

Yes, that's what the patchset does: we don't allow the page to be split if
any sub-page is pinned.
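
To illustrate the rule, here is a toy userspace model. It is only a sketch:
struct huge_page_model and the model_*() helpers are made-up names for
illustration, not code from the patchset, and it assumes all pins are
counted in one place.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model, not the kernel's struct page. */
struct huge_page_model {
	int pins;	/* extra references on any sub-page, counted in one place */
	bool is_huge;	/* true until the page has been split */
};

static void model_pin(struct huge_page_model *p)
{
	p->pins++;
}

static void model_unpin(struct huge_page_model *p)
{
	assert(p->pins > 0);
	p->pins--;
}

/* Returns 0 on success, -EBUSY if pinned, -EINVAL if already split. */
static int model_split(struct huge_page_model *p)
{
	if (!p->is_huge)
		return -EINVAL;
	if (p->pins > 0)
		return -EBUSY;	/* refuse the split instead of tracking tail pins */
	p->is_huge = false;
	return 0;
}
```

In this model a pinned page simply stays huge until the last pin is dropped,
and a later split attempt can succeed.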

> This got us into the mess AFAICT. Instead we could use the locking
> mechanisms that we have to stop all access to the page and then do the
> conversion?

I ended up using compound_lock to freeze the page count. Not sure if it's
the best option we have.
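
The idea of freezing a count under a lock can be sketched in userspace
roughly as below. This is an analogy only: a C11 atomic_flag spinlock stands
in for compound_lock, and struct page_model and the model_*() helpers are
hypothetical names, not the patchset's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: the count is only ever changed with the lock held. */
struct page_model {
	atomic_flag lock;	/* stands in for the compound_lock bit spinlock */
	int count;		/* reference count */
};

static void model_init(struct page_model *p, int count)
{
	atomic_flag_clear(&p->lock);
	p->count = count;
}

static void model_lock(struct page_model *p)
{
	while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
		;	/* spin until the flag is released */
}

static void model_unlock(struct page_model *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

static void model_get(struct page_model *p)
{
	model_lock(p);
	p->count++;
	model_unlock(p);
}

static void model_put(struct page_model *p)
{
	model_lock(p);
	assert(p->count > 0);
	p->count--;
	model_unlock(p);
}

/* While the lock is held nobody can change the count, so it is "frozen". */
static bool model_try_split(struct page_model *p, int expected)
{
	bool ok;

	model_lock(p);
	ok = (p->count == expected);
	/* a real split would do its work here, before dropping the lock */
	model_unlock(p);
	return ok;
}
```

The cost of this scheme is that every get/put has to take the lock, which is
one reason to wonder whether it is the best option.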

> Page migration can do that so it should be fine with refcounting for
> huge pages exclusively in the head page exactly like a regular page.

We've discussed "split via migration" with Dave. I need to look more closely
at how migration works.

> The problem is then dealing with the locations where we now do rely on
> the ability to split at "any point" (notion is weird in itself and
> suggests issues with synchronization).

As I said, we have only four places where we need to split the page itself
(not only the PMD): swap out, memory failure, KSM and migration. All of them
can tolerate split failure.
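
What "tolerate split failure" means for a caller can be sketched like this.
Again a toy userspace model under stated assumptions: struct thp_model,
model_split() and model_scan_and_split() are hypothetical names, and the
fallback shown (skip the page and move on) is just one plausible policy, not
what each of the four callers actually does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a huge page: pinned pages cannot be split. */
struct thp_model {
	int pins;	/* outstanding pins on any sub-page */
	int split;	/* 1 once the page has been split */
};

static int model_split(struct thp_model *p)
{
	if (p->pins > 0)
		return -EBUSY;
	p->split = 1;
	return 0;
}

/*
 * Hypothetical caller: tries to split every page in the array and skips
 * the ones that refuse. Returns the number of pages it had to skip.
 */
static size_t model_scan_and_split(struct thp_model *pages, size_t n)
{
	size_t skipped = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (model_split(&pages[i]) == -EBUSY) {
			skipped++;	/* leave the page huge, try again later */
			continue;
		}
		/* page is split now; the caller would go on per sub-page */
	}
	return skipped;
}
```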

> Use the standard locking schemes for pages instead?

Could you elaborate here?

> I thought the idea was that we would modify the relevant code and
> that at some point this requirement could go away?
>
> Huge pages (and other larger order pages) will become increasingly
> difficult to handle if relevant page state has to be maintained in tail
> pages and if it differs significantly from regular pages.

Agreed. The patchset drops tail page refcounting.
--
Kirill A. Shutemov