On 05/03/2024 15:50, David Hildenbrand wrote:
On 05.03.24 16:13, Ryan Roberts wrote:
There was previously a theoretical window where swapoff() could run and
teardown a swap_info_struct while a call to free_swap_and_cache() was
running in another thread. This could cause, amongst other bad
possibilities, swap_page_trans_huge_swapped() (called by
free_swap_and_cache()) to access the freed memory for swap_map.
This is a theoretical problem and I haven't been able to provoke it from
a test case. But there has been agreement based on code review that this
is possible (see link below).
Fix it by using get_swap_device()/put_swap_device(), which will stall
swapoff(). There was an extra check in _swap_info_get() to confirm that
the swap entry was valid. This wasn't present in get_swap_device() so
I've added it. I couldn't find any existing get_swap_device() call sites
where this extra check would cause any false alarms.
Details of how to provoke one possible issue (thanks to David Hilenbrand
for deriving this):
Almost
"s/Hilenbrand/Hildenbrand/" :)
Ahh sorry... I even explicitly checked it against your name on emails... fat
fingers...
LGTM
Are you planning on sending a doc extension for get_swap_device()?
I saw your comment:
We should likely update the documentation of get_swap_device(), that after
decrementing the refcount, the SI might become stale and should not be touched
without a prior get_swap_device().
But when I went to make the changes, I saw the documentation already said:
...we need to enclose all swap related functions with get_swap_device() and
put_swap_device()... Notice that swapoff ... can still happen before the
percpu_ref_tryget_live() in get_swap_device() or after the percpu_ref_put() in
put_swap_device()... The caller must be prepared for that.
I thought that already covered it? I'm sure as usual, I've misunderstood your
point. Happy to respin if you have something in mind?