Re: [PATCH v2] mm/vmscan: fix data races at kswapd_classzone_idx

From: Qian Cai
Date: Wed Feb 26 2020 - 08:49:36 EST


On Tue, 2020-02-25 at 20:06 -0800, Matthew Wilcox wrote:
> On Tue, Feb 25, 2020 at 10:58:27PM -0500, Qian Cai wrote:
> > pgdat->kswapd_classzone_idx could be accessed concurrently in
> > wakeup_kswapd(). Plain writes and reads without any lock protection
> > result in data races. Fix them by adding a pair of READ|WRITE_ONCE() as
> > well as saving a branch (compilers might well optimize the original code
> > in an unintentional way anyway). While at it, also take care of
> > pgdat->kswapd_order and non-kswapd threads in allow_direct_reclaim().
>
> I don't understand why the usages of kswapd_classzone_idx in kswapd() and
> kswapd_try_to_sleep() don't need changing too? kswapd_classzone_idx()
> looks safe to me, but I'm prone to missing stupid things that compilers
> are allowed to do.

Right, I did capture the race this time. I'll post a v3.

[  924.803628][ T6299] BUG: KCSAN: data-race in kswapd / wakeup_kswapd
[  924.809949][ T6299]
[  924.812170][ T6299] write to 0xffff90973ffff2dc of 4 bytes by task 820 on cpu 6:
[  924.819630][ T6299]  kswapd+0x27c/0x8d0
[  924.823509][ T6299]  kthread+0x1e0/0x200
[  924.827471][ T6299]  ret_from_fork+0x27/0x50
[  924.831774][ T6299]
[  924.833987][ T6299] read to 0xffff90973ffff2dc of 4 bytes by task 6299 on cpu 0:
[  924.841442][ T6299]  wakeup_kswapd+0xf3/0x450
[  924.845838][ T6299]  wake_all_kswapds+0x59/0xc0
[  924.850409][ T6299]  __alloc_pages_slowpath+0xdcc/0x1290
[  924.855769][ T6299]  __alloc_pages_nodemask+0x3bb/0x450
[  924.861040][ T6299]  alloc_pages_vma+0x8a/0x2c0
[  924.865612][ T6299]  do_anonymous_page+0x170/0x700
[  924.870443][ T6299]  __handle_mm_fault+0xc9f/0xd00
[  924.875276][ T6299]  handle_mm_fault+0xfc/0x2f0
[  924.879849][ T6299]  do_page_fault+0x263/0x6f9
[  924.884334][ T6299]  page_fault+0x34/0x40
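
For illustration only, the READ_ONCE()/WRITE_ONCE() pairing discussed in
this thread can be sketched in userspace C roughly as below. This is not the
patch itself. The macros are simplified stand-ins for the kernel's (plain
volatile accesses on scalar fields, using the GNU __typeof__ extension), and
struct pgdat_sketch, MAX_NR_ZONES_SKETCH, wakeup_sketch() and
kswapd_reset_sketch() are hypothetical names made up for this example.

```c
/* Simplified stand-ins for the kernel macros (assumption: scalar fields only). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical sentinel meaning "no wakeup request pending". */
#define MAX_NR_ZONES_SKETCH 4

/* Hypothetical, cut-down stand-in for the two pgdat fields in question. */
struct pgdat_sketch {
	int kswapd_classzone_idx;
	int kswapd_order;
};

/*
 * Waker side: each shared field is read exactly once into a local, then
 * raised if the new request is larger. The marked accesses keep the
 * compiler from re-reading, tearing or fusing the loads and stores.
 */
static void wakeup_sketch(struct pgdat_sketch *pgdat, int order,
			  int classzone_idx)
{
	int curr_idx = READ_ONCE(pgdat->kswapd_classzone_idx);

	if (curr_idx == MAX_NR_ZONES_SKETCH || curr_idx < classzone_idx)
		WRITE_ONCE(pgdat->kswapd_classzone_idx, classzone_idx);

	if (READ_ONCE(pgdat->kswapd_order) < order)
		WRITE_ONCE(pgdat->kswapd_order, order);
}

/* kswapd side: the reset is the write that races with the reads above. */
static void kswapd_reset_sketch(struct pgdat_sketch *pgdat)
{
	WRITE_ONCE(pgdat->kswapd_order, 0);
	WRITE_ONCE(pgdat->kswapd_classzone_idx, MAX_NR_ZONES_SKETCH);
}
```

Note that marking the accesses does not make the two-field update atomic; it
only documents the race as intentional and constrains what the compiler may
do with each individual load and store.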