Re: [PATCH v5 0/9] mm/swap: Regular page swap optimizations
From: Michal Hocko
Date: Mon Jan 16 2017 - 07:02:46 EST
Hi,
I am seeing a lot of preempt-unsafe warnings with the current mmotm and
I assume that this patchset has introduced the issue. I haven't checked
more closely, but get_swap_page didn't use this_cpu_ptr before "mm/swap:
add cache for swap slots allocation":
[ 57.812314] BUG: using smp_processor_id() in preemptible [00000000] code: kswapd0/527
[ 57.814360] caller is debug_smp_processor_id+0x17/0x19
[ 57.815237] CPU: 1 PID: 527 Comm: kswapd0 Tainted: G W 4.9.0-mmotm-00135-g4e9a9895ebef #1042
[ 57.816019] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1 04/01/2014
[ 57.816019] ffffc900001939c0 ffffffff81329c60 0000000000000001 ffffffff81a0ce06
[ 57.816019] ffffc900001939f0 ffffffff81343c2a 00000000000137a0 ffffea0000dfd2a0
[ 57.816019] ffff88003c49a700 ffffc90000193b10 ffffc90000193a00 ffffffff81343c53
[ 57.816019] Call Trace:
[ 57.816019] [<ffffffff81329c60>] dump_stack+0x68/0x92
[ 57.816019] [<ffffffff81343c2a>] check_preemption_disabled+0xce/0xe0
[ 57.816019] [<ffffffff81343c53>] debug_smp_processor_id+0x17/0x19
[ 57.816019] [<ffffffff8115f06f>] get_swap_page+0x19/0x183
[ 57.816019] [<ffffffff8114e01d>] shmem_writepage+0xce/0x38c
[ 57.816019] [<ffffffff81148916>] shrink_page_list+0x81f/0xdbf
[ 57.816019] [<ffffffff81149652>] shrink_inactive_list+0x2ab/0x594
[ 57.816019] [<ffffffff8114a22f>] shrink_node_memcg+0x4c7/0x673
[ 57.816019] [<ffffffff8114a49f>] shrink_node+0xc4/0x282
[ 57.816019] [<ffffffff8114a49f>] ? shrink_node+0xc4/0x282
[ 57.816019] [<ffffffff8114b8cb>] kswapd+0x656/0x834
[ 57.816019] [<ffffffff8114b275>] ? mem_cgroup_shrink_node+0x2e1/0x2e1
[ 57.816019] [<ffffffff81069fb4>] ? call_usermodehelper_exec_async+0x124/0x12d
[ 57.816019] [<ffffffff81073621>] kthread+0xf9/0x101
[ 57.816019] [<ffffffff81660198>] ? _raw_spin_unlock_irq+0x2c/0x4a
[ 57.816019] [<ffffffff81073528>] ? kthread_park+0x5a/0x5a
[ 57.816019] [<ffffffff81069e90>] ? umh_complete+0x25/0x25
[ 57.816019] [<ffffffff81660b07>] ret_from_fork+0x27/0x40
I thought a simple
diff --git a/mm/swap_slots.c b/mm/swap_slots.c
index 8cf941e09941..732194de58a4 100644
--- a/mm/swap_slots.c
+++ b/mm/swap_slots.c
@@ -303,7 +303,7 @@ swp_entry_t get_swap_page(void)
 	swp_entry_t entry, *pentry;
 	struct swap_slots_cache *cache;

-	cache = this_cpu_ptr(&swp_slots);
+	cache = &get_cpu_var(swp_slots);

 	entry.val = 0;
 	if (check_cache_active()) {
@@ -322,11 +322,13 @@ swp_entry_t get_swap_page(void)
 		}
 		mutex_unlock(&cache->alloc_lock);
 		if (entry.val)
-			return entry;
+			goto out;
 	}

 	get_swap_pages(1, &entry);

+out:
+	put_cpu_var(swp_slots);
 	return entry;
 }

would be the way to go, but the function takes a sleeping lock (the
cache->alloc_lock mutex), so disabling preemption is not a way forward.
So either this is preempt safe for some reason - which IMHO should be
documented in a comment - and raw_cpu_ptr can be used, or this needs
deeper thought.
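
To make the conflict concrete, here is a minimal userspace sketch of the
two variants. It is a toy model and not kernel code: every model_* name is
hypothetical, and it only assumes that this_cpu_ptr is checked for a
preemptible context under CONFIG_DEBUG_PREEMPT, that get_cpu_var/put_cpu_var
disable and re-enable preemption, and that mutex_lock may sleep.

```c
#include <assert.h>

/* Userspace toy model, not kernel code. All model_* names are hypothetical. */
static int model_preempt_count;   /* > 0 means preemption is disabled */
static int model_unsafe_cpu_ptr;  /* per-cpu access while preemptible */
static int model_sleep_in_atomic; /* sleeping lock taken while atomic */

/* Stands in for this_cpu_ptr() with preemption debugging enabled. */
static void model_this_cpu_ptr(void)
{
	if (model_preempt_count == 0)
		model_unsafe_cpu_ptr++;
}

/* Stand in for get_cpu_var()/put_cpu_var(), which toggle preemption. */
static void model_get_cpu_var(void) { model_preempt_count++; }
static void model_put_cpu_var(void) { model_preempt_count--; }

/* Stands in for mutex_lock(): it may sleep, so it must not run atomically. */
static void model_mutex_lock(void)
{
	if (model_preempt_count > 0)
		model_sleep_in_atomic++;
}

static void model_reset(void)
{
	model_preempt_count = 0;
	model_unsafe_cpu_ptr = 0;
	model_sleep_in_atomic = 0;
}

/* Variant A: the current code, per-cpu pointer taken while preemptible. */
static void model_get_swap_page_this_cpu_ptr(void)
{
	model_this_cpu_ptr();
	model_mutex_lock();
}

/* Variant B: the diff above, mutex taken with preemption disabled. */
static void model_get_swap_page_get_cpu_var(void)
{
	model_get_cpu_var();
	model_mutex_lock();
	model_put_cpu_var();
}
```

In this model variant A trips the per-cpu check and variant B trips the
sleeping-lock check instead, which is the reason the simple diff does not
help.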
--
Michal Hocko
SUSE Labs