Re: [PATCH 0/2] fsdax,xfs: fix warning messages

From: Dan Williams
Date: Wed Nov 30 2022 - 16:49:20 EST


Andrew Morton wrote:
> On Tue, 29 Nov 2022 19:59:14 -0800 Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>
> > [ add Andrew ]
> >
> > Shiyang Ruan wrote:
> > > Many test cases fail in dax+reflink mode with warning messages in dmesg.
> > > This also affects dax+noreflink mode if we run the test after a
> > > dax+reflink test. So the most urgent task is fixing the warning
> > > messages.
> > >
> > > Patch 1 fixes some mistakes and adds handling of CoW cases not
> > > previously considered (srcmap is HOLE or UNWRITTEN).
> > > Patch 2 adds the implementation of unshare for fsdax.
> > >
> > > With these fixes, most warning messages in dax_associate_entry() are
> > > gone. But honestly, generic/388 still randomly fails with the warning.
> > > That test shuts down the xfs filesystem while fsstress is running, and
> > > repeats this many times. I think the reason is that dax pages in use
> > > cannot be invalidated in time when the fs is shut down. The next time
> > > such a dax page is associated, it still carries the mapping value set
> > > last time. I'll keep working on it.
> > >
> > > The warning message in dax_writeback_one() is also fixed, thanks to
> > > the dax unshare implementation.
> >
> > Thank you for digging in on this, I had been pinned down on CXL tasks
> > and worried that we would need to mark FS_DAX broken for a cycle, so
> > this is timely.
> >
> > My only concern is that these patches look to have significant collisions with
> > the fsdax page reference counting reworks pending in linux-next. Although,
> > those are still sitting in mm-unstable:
> >
> > http://lore.kernel.org/r/20221108162059.2ee440d5244657c4f16bdca0@xxxxxxxxxxxxxxxxxxxx
>
> As far as I know, Dan's "Fix the DAX-gup mistake" series is somewhat
> stuck. Jan pointed out:
>
> https://lore.kernel.org/all/20221109113849.p7pwob533ijgrytu@quack3/T/#u
>
> or have Jason's issues since been addressed?

No, they have not. I do think the current series is a step forward, but
given that the urgency remains low for the time being (the CXL hotplug
use case is further out, there are no known collisions with ongoing
folio work, and no MEMORY_DEVICE_PRIVATE users are looking to build any
conversions on top for 6.2), I am ok to circle back in 6.3 to get that
follow-on work integrated.

> > My preference would be to move ahead with both, in which case I can help
> > rebase these fixes on top. In that scenario everything would go through
> > Andrew.
> >
> > However, if we are getting too late in the cycle for that path I think
> > these dax-fixes take precedence, and one more cycle to let the page
> > reference count reworks sit is ok.
>
> That sounds like a decent approach. So we go with this series ("fsdax,xfs:
> fix warning messages") and aim at 6.3-rc1 with "Fix the DAX-gup
> mistake"?
>

Yeah, that's the path of least hassle.