Re: BUG: unable to handle kernel NULL pointer dereference in xlog_cil_commit

From: Dave Chinner
Date: Wed Oct 06 2021 - 18:27:55 EST


On Wed, Oct 06, 2021 at 08:43:27AM -0700, Darrick J. Wong wrote:
> On Wed, Oct 06, 2021 at 04:14:43PM +0800, Hao Sun wrote:
> > Hello,
> >
> > When using Healer to fuzz the latest Linux kernel, the following crash
> > was triggered.
> >
> > HEAD commit: 0513e464f900 Merge tag 'perf-tools-fixes-for-v5.15-2021-09-27'
> > git tree: upstream
> > console output:
> > https://drive.google.com/file/d/1vm5fDM220kkghoiGa3Aw_Prl4O_pqAXF/view?usp=sharing
> > kernel config: https://drive.google.com/file/d/1Jqhc4DpCVE8X7d-XBdQnrMoQzifTG5ho/view?usp=sharing
> >
> > Sorry, I don't have a reproducer for this crash, hope the symbolized
> > report can help.
> > If you fix this issue, please add the following tag to the commit:
> > Reported-by: Hao Sun <sunhao.th@xxxxxxxxx>
>
> So figure out how to fix the problem and send a patch. You don't get to
> hand out fixit tasks like you're some kind of manager for people you
> don't employ.

I fully agree with this, Darrick, but OTOH the cynical, jaded
engineer in me says "I don't think people who run bots and
copy/paste their output to mailing lists have the capability to fix
the problems the bots find."

Quite frankly, it's even more of a waste of our time trying to
review crap patches, make suggestions to fix them and then go
around the review loop 15 times getting nowhere like we have in the
past.

So, kvmalloc() sucks dogs balls, as I pointed out in this recent
patch in the intent whiteouts series:

https://lore.kernel.org/linux-xfs/20210902095927.911100-8-david@xxxxxxxxxxxxx/

Because of the crap implementation of kvmalloc(), we can't just pass
__GFP_NOFAIL because that will cause it to try to run
kmalloc_node(__GFP_NORETRY | __GFP_NOFAIL) and that will cause heads
to go all explodey. Not to mention that kvmalloc won't even allow
GFP_NOFS to be passed and still actually do the vmalloc() fallback.

So, basically, we've got to go back to doing an open coded kvmalloc
loop here that cannot fail. Because kvmalloc can fail and we can't
tell it that it must succeed or die trying.

That's what the above patch does - gets rid of the garbage kvmalloc
direct reclaim -> memory compaction behaviour, and wraps it in a
loop so that the fail-fast memory allocation semantics it uses do
not end up causing a shadow buffer allocation failure.
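
To illustrate the shape of that loop, here is a rough userspace sketch.
It is not the code from the patch: try_fast_alloc(),
try_fallback_alloc(), alloc_nofail() and the failure counters are
made-up names that only simulate a fail-fast kmalloc() attempt followed
by a vmalloc() fallback, with malloc() standing in for both.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical knobs: how many times each simulated allocator fails. */
static int fast_failures_left = 2;
static int fallback_failures_left = 1;
static int attempts;

/* Stand-in for a fail-fast kmalloc(): gives up straight away on failure. */
static void *try_fast_alloc(size_t size)
{
	if (fast_failures_left > 0) {
		fast_failures_left--;
		return NULL;
	}
	return malloc(size);
}

/* Stand-in for the vmalloc() fallback, which can also fail. */
static void *try_fallback_alloc(size_t size)
{
	if (fallback_failures_left > 0) {
		fallback_failures_left--;
		return NULL;
	}
	return malloc(size);
}

/*
 * Open coded "must not fail" allocation: try the cheap allocation
 * first, fall back to the second allocator, and go around again
 * until one of them succeeds. The caller never sees NULL.
 */
static void *alloc_nofail(size_t size)
{
	void *p;

	assert(size > 0);
	do {
		attempts++;
		p = try_fast_alloc(size);
		if (!p)
			p = try_fallback_alloc(size);
	} while (!p);

	return p;
}
```

A caller just does p = alloc_nofail(64) and uses the buffer without
checking for NULL. Each individual attempt is allowed to fail fast; it
is only the loop around them that provides the "succeed or die trying"
guarantee.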

So, yeah, I've already fixed this in my trees....

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx