Re: [RFC PATCH] mm: move xa forward when run across zombie page

From: Matthew Wilcox
Date: Fri Oct 14 2022 - 08:12:14 EST


On Fri, Oct 14, 2022 at 01:30:48PM +0800, zhaoyang.huang wrote:
> From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
>
> The RCU stall below is reported where kswapd is trapped in a livelock while
> shrinking the superblock's inode list. The direct cause is a zombie page that
> stays in the xarray's slot and makes the check-and-retry loop spin forever.
> The root cause is not yet known; it may be an xa update without
> synchronize_rcu etc. I would like to suggest skipping this page to break the
> livelock as a workaround.

No, the underlying bug should be fixed.

> if (!folio || xa_is_value(folio))
> return folio;
>
> - if (!folio_try_get_rcu(folio))
> + if (!folio_try_get_rcu(folio)) {
> + xas_advance(xas, folio->index + folio_nr_pages(folio) - 1);
> goto reset;
> + }

You can't do this anyway. To call folio_nr_pages() and to look at
folio->index, you must have a refcount on the page, and this is the
path where we failed to get the refcount.
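
To illustrate the rule, here is a minimal userspace sketch, not kernel
code. It assumes the try-get fails exactly when the refcount is already
zero, which is my understanding of folio_try_get_rcu(). The struct
model_folio type and the model_* helpers are hypothetical names made up
for this example. The point it shows: the index and size may only be
read after the reference has been taken, and the failure path returns
without touching them.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio; not the kernel's struct folio. */
struct model_folio {
	atomic_int refcount;	/* 0 means the object is on its way to being freed */
	unsigned long index;	/* only stable while a reference is held */
	unsigned long nr_pages;	/* likewise */
};

/* Take a reference unless the count has already dropped to zero. */
static bool model_try_get(struct model_folio *f)
{
	int old = atomic_load(&f->refcount);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&f->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference previously taken with model_try_get(). */
static void model_put(struct model_folio *f)
{
	int old = atomic_fetch_sub(&f->refcount, 1);

	assert(old > 0);
}

/*
 * Compute the last index covered by the folio. The fields are read only
 * after the reference is held; if the reference cannot be taken, return
 * false and leave *last untouched.
 */
static bool model_last_index(struct model_folio *f, unsigned long *last)
{
	if (!model_try_get(f))
		return false;	/* no reference: index and nr_pages are off limits */

	*last = f->index + f->nr_pages - 1;
	model_put(f);
	return true;
}
```

In this model, a caller that gets false back has nothing it can safely
use to advance past the entry, so all it can do is redo the lookup.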