Re: [PATCH] mm: fix list corruptions on shmem shrinklist

From: Andrew Morton
Date: Thu Aug 03 2017 - 19:11:52 EST


On Wed, 2 Aug 2017 22:46:30 -0700 Cong Wang <xiyou.wangcong@xxxxxxxxx> wrote:

> We saw many list corruption warnings on shmem shrinklist:
>
> ...
>
> The problem is that shmem_unused_huge_shrink() moves entries
> from the global sbinfo->shrinklist to its local lists and then
> releases the spinlock. However, a parallel shmem_setattr()
> could access one of these entries directly and add it back to
> the global shrinklist if it is removed, with the spinlock held.
>
> The logic itself looks solid since an entry could be either
> in a local list or the global list, otherwise it is removed
> from one of them by list_del_init(). So probably the race
> condition is that one CPU is in the middle of INIT_LIST_HEAD()

Where is this INIT_LIST_HEAD()?

> but the other CPU calls list_empty(), which returns true
> too early, and then the following list_add_tail() sees a
> corrupted entry.
>
> list_empty_careful() is designed to fix this situation.
>

I'm not sure I'm understanding this. AFAICT all the list operations to
which you refer are synchronized under spin_lock(&sbinfo->shrinklist_lock)?
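
For reference, the window described in the changelog can be sketched in
userspace. This is only an illustration under stated assumptions, not the
shmem code: struct list_head and the helpers below are simplified stand-ins
for the ones in include/linux/list.h (the kernel's memory-access annotations
are omitted), and list_del_init_first_half() / list_del_init_second_half()
are hypothetical helpers that split list_del_init() after the first store
of its re-initialisation step. The sketch shows what each emptiness check
reports in that intermediate state. It does not show that another CPU can
observe the state while shrinklist_lock is held, which is the open question
above.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Simplified INIT_LIST_HEAD(): two separate stores. */
static void init_list_head(struct list_head *list)
{
	list->next = list;	/* first store */
	list->prev = list;	/* second store */
}

/* Like list_empty(): looks at ->next only. */
static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Like list_empty_careful(): requires both pointers to agree. */
static int list_empty_careful(const struct list_head *head)
{
	const struct list_head *next = head->next;

	return next == head && next == head->prev;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	struct list_head *prev = head->prev;

	head->prev = entry;
	entry->next = head;
	entry->prev = prev;
	prev->next = entry;
}

/*
 * Hypothetical helper: unlink the entry, then perform only the first
 * store of the re-initialisation. entry->prev is left stale.
 */
static void list_del_init_first_half(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = entry;
}

/* Hypothetical helper: the remaining store of the re-initialisation. */
static void list_del_init_second_half(struct list_head *entry)
{
	entry->prev = entry;
}

void shrinklist_demo(void)
{
	struct list_head shrinklist, inode_entry;

	init_list_head(&shrinklist);
	init_list_head(&inode_entry);
	list_add_tail(&inode_entry, &shrinklist);
	assert(!list_empty(&inode_entry));

	/* Intermediate state: ->next updated, ->prev not yet. */
	list_del_init_first_half(&inode_entry);
	printf("half-initialised: list_empty=%d list_empty_careful=%d\n",
	       list_empty(&inode_entry), list_empty_careful(&inode_entry));

	/* Fully re-initialised: both checks agree. */
	list_del_init_second_half(&inode_entry);
	printf("initialised:      list_empty=%d list_empty_careful=%d\n",
	       list_empty(&inode_entry), list_empty_careful(&inode_entry));
}
```

Calling shrinklist_demo() from a main() prints both states. In the
half-initialised state list_empty() already reports the entry as empty
while list_empty_careful() does not, because the latter also compares
->prev.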