Re: [PATCH -mm -v5 RESEND] mm, swap: Fix race between swapoff and some swap operations

From: Andrew Morton
Date: Fri Feb 16 2018 - 18:38:31 EST


On Wed, 14 Feb 2018 08:38:00 +0800 "Huang\, Ying" <ying.huang@xxxxxxxxx> wrote:

> Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> writes:
>
> > On Tue, 13 Feb 2018 09:42:20 +0800 "Huang, Ying" <ying.huang@xxxxxxxxx> wrote:
> >
> >> From: Huang Ying <ying.huang@xxxxxxxxx>
> >>
> >> When the swapin is performed, after getting the swap entry information
> >> from the page table, system will swap in the swap entry, without any
> >> lock held to prevent the swap device from being swapoff. This may
> >> cause the race like below,
> >
> > Sigh. In terms of putting all the work into the swapoff path and
> > avoiding overheads in the hot paths, I guess this is about as good as
> > it will get.
> >
> > It's a very low-priority fix so I'd prefer to keep the patch in -mm
> > until Hugh has had an opportunity to think about it.
> >
> >> ...
> >>
> >> +/*
> >> + * Check whether swap entry is valid in the swap device. If so,
> >> + * return pointer to swap_info_struct, and keep the swap entry valid
> >> + * via preventing the swap device from being swapoff, until
> >> + * put_swap_device() is called. Otherwise return NULL.
> >> + */
> >> +struct swap_info_struct *get_swap_device(swp_entry_t entry)
> >> +{
> >> + struct swap_info_struct *si;
> >> + unsigned long type, offset;
> >> +
> >> + if (!entry.val)
> >> + goto out;
> >> + type = swp_type(entry);
> >> + if (type >= nr_swapfiles)
> >> + goto bad_nofile;
> >> + si = swap_info[type];
> >> +
> >> + preempt_disable();
> >
> > This preempt_disable() is later than I'd expect. If a well-timed race
> > occurs, `si' could now be pointing at a defunct entry. If that
> > well-timed race include a swapoff AND a swapon, `si' could be pointing
> > at the info for a new device?
>
> struct swap_info_struct pointed to by swap_info[] will never be freed.
> During swapoff, we only free the memory pointed to by the fields of
> struct swap_info_struct. And when swapon, we will always reuse
> swap_info[type] if it's not NULL. So it should be safe to dereference
> swap_info[type] with preemption enabled.

That's my point. If there's a race window during which there is a
parallel swapoff+swapon, this swap_info_struct may now be in use for a
different device?