RE: [PATCH 1/1] mm: memory-failure: Re-split hw-poisoned huge page on -EAGAIN

From: Zhuo, Qiuxu
Date: Wed Dec 20 2023 - 03:57:57 EST


Hi Miaohe,

Thanks for the review.
Please see the comments below.

> From: Miaohe Lin <linmiaohe@xxxxxxxxxx>
> ...
> > +
> > +static void split_thp_work_fn(struct work_struct *work) {
> > + struct split_thp_req *req = container_of(work, typeof(*req),
> work.work);
> > + int ret;
> > +
> > + /* Split the thp. */
> > + get_page(req->thp);
>
> Can req->thp be freed when split_thp_work_fn is scheduled ?

It's possible. Thanks for catching this.

Instead of making a new work to re-split the thp,
I'll leverage the existing memory_failure_queue() to resplit the thp in the v2.

>
> > + lock_page(req->thp);
> > + ret = split_huge_page(req->thp);
> > + unlock_page(req->thp);
> > + put_page(req->thp);
> > +
> > + /* Retry with an exponential backoff. */
> > + if (ret && ++req->retries < SPLIT_THP_MAX_RETRY_CNT) {
> > + schedule_delayed_work(to_delayed_work(work),
> > +
> msecs_to_jiffies(SPLIT_THP_INIT_DELAYED_MS << req->retries));
> > + return;
> > + }
> > +
> > + pr_err("%#lx: split unsplit thp %ssuccessfully.\n", page_to_pfn(req-
> >thp), ret ? "un" : "");
> > + kfree(req);
> > + split_thp_pending = false;
>
> split_thp_pending is not protected against split_thp_delayed? Though this
> race should be benign.

Thanks for being concerned about this.

As the Read-Check-Modify of "split_thp_pending" is protected by the
mutex " &mf_mutex", and the worker only modified it to false (no read on it).
In theory, there is no race here.

Will leverage the existing memory_failure_queue() in v2. There should be no
such concern about this race. 😊

-Qiuxu