Re: Kernel NULL pointer deref and data corruptions with xfs on 6.1

From: Frederick Lawler
Date: Fri Aug 04 2023 - 12:57:36 EST


Hi Matthew,

On Thu, Jul 27, 2023 at 01:27:56PM +0100, Matthew Wilcox wrote:
> On Thu, Jul 27, 2023 at 11:25:33AM +0100, Daniel Dao wrote:
> > On Thu, Jul 27, 2023 at 4:27 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > >
> > > On Fri, Jul 21, 2023 at 11:49:04AM +0100, Daniel Dao wrote:
> > > > We do not have a reproducer yet, but we now have more debugging data
> > > > which hopefully should help narrow this down. Details as follows:
> > > >
> > > > 1. Kernel NULL pointer dereferences in __filemap_get_folio
> > > >
> > > > This happened on a few different hosts, with a few different repeated addresses.
> > > > The addresses are 0000000000000036, 0000000000000076, 00000000000000f6.
> > > > This looks like the xarray is corrupted and we were trying to do some work
> > > > on a sibling entry.
> > >
> > > I think I have a fix for this one. Please try the attached.
> >
> > For some reason I do not see the attached patch. Can you resend it, or
> > is it the same one as in https://bugzilla.kernel.org/show_bug.cgi?id=216646#c31 ?
>
> Yes, that's the one, sorry.

I set up a kernel with this patch to deploy. It'll take some time to
see any results from that. I did run your multiorder.c changes with and
without the change to lib/xarray.c, and they seemed to work as intended.
I didn't see any regressions across multiple seeds with our kernel config.

Fred