Re: [PATCH v3] mm/swap: Fix release_pages() when releasing devmap pages
From: Dan Williams
Date: Tue Jun 04 2019 - 17:46:41 EST
On Tue, Jun 4, 2019 at 1:17 PM John Hubbard <jhubbard@xxxxxxxxxx> wrote:
>
> On 6/4/19 1:11 PM, Dan Williams wrote:
> > On Tue, Jun 4, 2019 at 12:48 PM John Hubbard <jhubbard@xxxxxxxxxx> wrote:
> >>
> >> On 6/4/19 9:48 AM, ira.weiny@xxxxxxxxx wrote:
> >>> From: Ira Weiny <ira.weiny@xxxxxxxxx>
> >>>
> ...
> >>> diff --git a/mm/swap.c b/mm/swap.c
> >>> index 7ede3eddc12a..6d153ce4cb8c 100644
> >>> --- a/mm/swap.c
> >>> +++ b/mm/swap.c
> >>> @@ -740,15 +740,20 @@ void release_pages(struct page **pages, int nr)
> >>> if (is_huge_zero_page(page))
> >>> continue;
> >>>
> >>> - /* Device public page can not be huge page */
> >>> - if (is_device_public_page(page)) {
> >>> + if (is_zone_device_page(page)) {
> >>> if (locked_pgdat) {
> >>> spin_unlock_irqrestore(&locked_pgdat->lru_lock,
> >>> flags);
> >>> locked_pgdat = NULL;
> >>> }
> >>> - put_devmap_managed_page(page);
> >>> - continue;
> >>> + /*
> >>> + * Not all zone-device-pages require special
> >>> + * processing. Those pages return 'false' from
> >>> + * put_devmap_managed_page() expecting a call to
> >>> + * put_page_testzero()
> >>> + */
> >>
> >> Just a documentation tweak: how about:
> >>
> >> /*
> >> * ZONE_DEVICE pages that return 'false' from
> >> * put_devmap_managed_page() do not require special
> >> * processing, and instead, expect a call to
> >> * put_page_testzero().
> >> */
> >
> > Looks better to me, but maybe just go ahead and list those
> > expectations explicitly. Something like:
> >
> > /*
> > * put_devmap_managed_page() only handles
> > * ZONE_DEVICE (struct dev_pagemap managed)
> > * pages when the hosting dev_pagemap has the
> > * ->free() or ->fault() callback handlers
> > * implemented as indicated by
> > * dev_pagemap.type. Otherwise the expectation
> > * is to fall back to a plain decrement /
> > * put_page_testzero().
> > */
>
> I like it--but not here, because it's too much internal detail in a
> call site that doesn't use that level of detail. The call site looks
> at the return value, only.
>
> Let's instead put that blurb above (or in) the put_devmap_managed_page()
> routine itself. And leave the blurb that I wrote where it is. And then I
> think everything will have an appropriate level of detail in the right places.
Ok. Ideally there wouldn't be any commentary needed at the call site
and the put_page() could be handled internal to
put_devmap_managed_page(), but I did not see a way to do that without
breaking the compile out / static branch optimization when there are
no active ZONE_DEVICE users.