Re: [PATCH -mm -v5 RESEND] mm, swap: Fix race between swapoff and some swap operations
From: Huang\, Ying
Date: Tue Feb 13 2018 - 19:38:14 EST
Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> writes:
> On Tue, 13 Feb 2018 09:42:20 +0800 "Huang, Ying" <ying.huang@xxxxxxxxx> wrote:
>
>> From: Huang Ying <ying.huang@xxxxxxxxx>
>>
>> When the swapin is performed, after getting the swap entry information
>> from the page table, system will swap in the swap entry, without any
>> lock held to prevent the swap device from being swapoff. This may
>> cause the race like below,
>
> Sigh. In terms of putting all the work into the swapoff path and
> avoiding overheads in the hot paths, I guess this is about as good as
> it will get.
>
> It's a very low-priority fix so I'd prefer to keep the patch in -mm
> until Hugh has had an opportunity to think about it.
>
>> ...
>>
>> +/*
>> + * Check whether swap entry is valid in the swap device. If so,
>> + * return pointer to swap_info_struct, and keep the swap entry valid
>> + * via preventing the swap device from being swapoff, until
>> + * put_swap_device() is called. Otherwise return NULL.
>> + */
>> +struct swap_info_struct *get_swap_device(swp_entry_t entry)
>> +{
>> + struct swap_info_struct *si;
>> + unsigned long type, offset;
>> +
>> + if (!entry.val)
>> + goto out;
>> + type = swp_type(entry);
>> + if (type >= nr_swapfiles)
>> + goto bad_nofile;
>> + si = swap_info[type];
>> +
>> + preempt_disable();
>
> This preempt_disable() is later than I'd expect. If a well-timed race
> occurs, `si' could now be pointing at a defunct entry. If that
> well-timed race include a swapoff AND a swapon, `si' could be pointing
> at the info for a new device?
struct swap_info_struct pointed to by swap_info[] will never be freed.
During swapoff, we only free the memory pointed to by the fields of
struct swap_info_struct. And when swapon, we will always reuse
swap_info[type] if it's not NULL. So it should be safe to dereference
swap_info[type] with preemption enabled.
Best Regards,
Huang, Ying
>> + if (!(si->flags & SWP_VALID))
>> + goto unlock_out;
>> + offset = swp_offset(entry);
>> + if (offset >= si->max)
>> + goto unlock_out;
>> +
>> + return si;
>> +bad_nofile:
>> + pr_err("%s: %s%08lx\n", __func__, Bad_file, entry.val);
>> +out:
>> + return NULL;
>> +unlock_out:
>> + preempt_enable();
>> + return NULL;
>> +}