Re: [PATCH v3 2/2] mm: rid swapoff of quadratic complexity

From: Hugh Dickins
Date: Wed Jan 02 2019 - 14:46:13 EST

On Wed, 2 Jan 2019, Vineeth Pillai wrote:
> After reading the code again, I feel like we can make the retry logic
> simpler and avoid the use of oldi. If my understanding is correct,
> except for frontswap case, we reach try_to_unuse() only after we
> disable the swap device. So I think, we would not be seeing any more
> swap usage on the disabled swap device, after we loop through all the
> process and swapin the pages on that device. In that case, we would
> not need the retry logic right?

Wrong. Without heavier locking that would add unwelcome overhead to
common paths, we shall "always" need the retry logic. It does not
come into play very often, but here are two examples of why it's
needed (if I thought longer, I might find more). And in practice,
yes, I sometimes saw 1 retry needed.

One, the issue already discussed, of a multiply-mapped page which is
swapped out, one pte swapped off, but swapped back in by concurrent
fault before the last pte has been swapped off and the page finally
deleted from swap cache. That swapin still references the disabled
swapfile, and will need a retry to unuse (and that retry might need
another). We may fix this later with an rmap walk while still holding
page locked for the first pte; but even if we do, I'd still want to
retain the retry logic, to avoid dependence on corner-case-free
reliable rmap walks.

Two, get_swap_page() allocated a swap entry for shmem file or vma
just before the swapoff started, but the swapper did not reach the
point of inserting that swap entry before try_to_unuse() scanned
the shmem file or vma in question.

> For frontswap case, the patch was missing a check for pages_to_unuse.
> We would still need the retry logic, but as you mentioned, I can
> easily remove the oldi logic and make it simpler. Or probably,
> refactor the frontswap code out as a special case if pages_to_unuse is
> still not zero after the initial loop.

I don't use frontswap myself, and haven't paid any attention to the
frontswap partial swapoff case (though notice now that shmem_unuse()
lacks the plumbing needed for it - that needs fixing); but doubt it
would be a good idea to refactor it out as a separate case.