Re: [PATCH] bcachefs: don't call sleeping funcs when handling inconsistency errors

From: Kent Overstreet
Date: Wed Apr 02 2025 - 12:40:40 EST


On Wed, Apr 02, 2025 at 10:03:10PM +0530, Bharadwaj Raju wrote:
> On Wed, Apr 2, 2025 at 9:47 PM Kent Overstreet
> <kent.overstreet@xxxxxxxxx> wrote:
> >
> > On Wed, Apr 02, 2025 at 09:40:40PM +0530, Bharadwaj Raju wrote:
> > > In bch2_bkey_pick_read_device, we're in an RCU lock. So, we can't call
> > > any potentially-sleeping functions. However, we call bch2_dev_rcu,
> > > which calls bch2_fs_inconsistent in its error case. That then calls
> > > bch2_prt_print on a non-atomic printbuf, as well as uses the blocking
> > > variant of bch2_print_string_as_lines, both of which lead to calls to
> > > potentially-sleeping functions, namely krealloc with GFP_KERNEL
> > > and console_lock respectively.
> > >
> > > Give a nonzero atomic to the printbuf, and use the nonblocking variant
> > > of bch2_print_string_as_lines.
> >
> > Sorry, beat you to it :)
> >
> > You also missed the one the syzbot report actually hit -
> > bch2_inconsistent_error().
>
> Oops, thank you.
>
> If I'm not wrong, though, the bch2_print_string_as_lines
> still needs to be changed to bch2_print_string_as_lines_nonblocking?
>
> In my testing that also produces the same BUG warning.
>
> Should I make a patch for that?

Yeah, you're right - please do.

If you're feeling particularly adventurous - print_string_as_lines() is
a hack. I think we should be able to do something more robust by
skipping printk (that's where the 1k limit comes from) and calling
something lower level - that will require digging into the printk
codepath and finding a lower-level function we can call.

I also just noticed that print_string_as_lines() needs to check for
being passed a NULL pointer - in case the printbuf memory allocation
fails. Want to get that one too?
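
For illustration only, here is a rough userspace sketch of the NULL check
described above. It is not the bcachefs code: emit_string_as_lines(),
line_sink_fn and stdout_sink() are hypothetical names, and printing a
"(null)" placeholder for a missing buffer is an assumption about the
desired behaviour, not something decided in this thread.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sink: receives one line (not NUL-terminated) and its length. */
typedef void (*line_sink_fn)(const char *line, size_t len, void *ctx);

/*
 * Userspace sketch, not the kernel implementation: split buf on '\n' and
 * hand each line to sink. Returns the number of lines emitted.
 */
static size_t emit_string_as_lines(const char *buf, line_sink_fn sink, void *ctx)
{
	size_t nr = 0;

	/*
	 * Assumption: if the buffer allocation failed, the caller may pass
	 * NULL, so emit a placeholder instead of dereferencing it.
	 */
	if (!buf) {
		static const char msg[] = "(null)";

		sink(msg, sizeof(msg) - 1, ctx);
		return 1;
	}

	while (*buf) {
		const char *nl = strchr(buf, '\n');
		size_t len = nl ? (size_t)(nl - buf) : strlen(buf);

		sink(buf, len, ctx);
		nr++;

		if (!nl)
			break;
		buf = nl + 1;
	}

	return nr;
}

/* Example sink: write each line to stdout. */
static void stdout_sink(const char *line, size_t len, void *ctx)
{
	(void)ctx;
	printf("%.*s\n", (int)len, line);
}
```

Passing the sink as a callback keeps the splitting logic separate from
whatever ends up doing the output, so the same loop would work if the
output path is later swapped for something lower level than printk.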