Re: filesystem corruption with "scsi: core: Reallocate device's budget map on queue depth change"

From: James Bottomley
Date: Wed Mar 30 2022 - 09:32:07 EST


On Wed, 2022-03-30 at 13:59 +0100, John Garry wrote:
> On 30/03/2022 12:21, Andrea Righi wrote:
> > On Wed, Mar 30, 2022 at 11:38:02AM +0100, John Garry wrote:
> > > On 30/03/2022 11:11, Andrea Righi wrote:
> > > > Hello,
> > > >
> > > > after this commit I'm experiencing some filesystem corruptions
> > > > at boot on a power9 box with an aacraid controller.
> > > >
> > > > At the moment I'm running a 5.15.30 kernel; when the filesystem
> > > > is mounted at boot I see the following errors in the console:
>
> About "scsi: core: Reallocate device's budget map on queue depth
> change" being added to a stable kernel, I am not sure if this was
> really a fix or just a memory optimisation.

I can see how it becomes a problem: the commit frees the old bitmap and
allocates a new one across a queue freeze, but bits in the old one might
still be in use. That is harmless until those commands complete, at
which point we can see a tag come back that is greater than anything we
think we can allocate from the new map. Presumably we don't check for
this, so we end up doing a write to unallocated memory.
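
To make the failure mode concrete, here is a toy user-space model. It
is only a sketch: struct toy_budget_map and the toy_map_*() helpers are
hypothetical names invented for illustration, not the kernel's sbitmap
code, and the range check in the put path is exactly the check I am
assuming we don't have.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-device budget map (not sbitmap). */
struct toy_budget_map {
    unsigned int depth;   /* valid tags are 0 .. depth - 1 */
    bool *in_use;         /* one flag per tag */
};

static int toy_map_init(struct toy_budget_map *m, unsigned int depth)
{
    m->in_use = calloc(depth, sizeof(*m->in_use));
    m->depth = m->in_use ? depth : 0;
    return m->in_use ? 0 : -1;
}

/* Hand out the lowest free tag, or -1 if the map is exhausted. */
static int toy_map_get(struct toy_budget_map *m)
{
    for (unsigned int i = 0; i < m->depth; i++) {
        if (!m->in_use[i]) {
            m->in_use[i] = true;
            return (int)i;
        }
    }
    return -1;
}

/*
 * Free and reallocate at the new depth without waiting for tags that
 * were handed out from the old map. This is the pattern in question.
 */
static int toy_map_resize(struct toy_budget_map *m, unsigned int new_depth)
{
    free(m->in_use);
    return toy_map_init(m, new_depth);
}

/*
 * Release a tag. A tag taken from the old, larger map can be >= depth
 * here; without the range check the store below would land outside
 * the newly allocated array. Returns 0 on success, -1 for a stale tag.
 */
static int toy_map_put_checked(struct toy_budget_map *m, unsigned int tag)
{
    if (tag >= m->depth)
        return -1;
    m->in_use[tag] = false;
    return 0;
}
```

With a depth of 8, taking six tags and then resizing to 4 leaves tag 5
outstanding; when it comes back it no longer fits, and only the range
check keeps the release from writing past the end of the new array.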

I think if you want to reallocate on queue depth reduction, you might
have to drain the queue as well as freeze it.
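
As a rough illustration of what I mean by draining, the toy model
below refuses to reallocate while any tag is still outstanding. Again
this is a user-space sketch under my own assumptions: struct
drained_map and the drained_map_*() helpers are made-up names, and a
real implementation would wait for completions rather than return a
"try again" code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical budget map that also counts outstanding tags. */
struct drained_map {
    unsigned int depth;        /* valid tags are 0 .. depth - 1 */
    unsigned int outstanding;  /* tags handed out and not yet returned */
    bool *in_use;
};

static int drained_map_init(struct drained_map *m, unsigned int depth)
{
    m->in_use = calloc(depth, sizeof(*m->in_use));
    m->depth = m->in_use ? depth : 0;
    m->outstanding = 0;
    return m->in_use ? 0 : -1;
}

/* Hand out the lowest free tag, or -1 if the map is exhausted. */
static int drained_map_get(struct drained_map *m)
{
    for (unsigned int i = 0; i < m->depth; i++) {
        if (!m->in_use[i]) {
            m->in_use[i] = true;
            m->outstanding++;
            return (int)i;
        }
    }
    return -1;
}

/* Return a tag that was previously handed out by drained_map_get(). */
static void drained_map_put(struct drained_map *m, unsigned int tag)
{
    assert(tag < m->depth && m->in_use[tag]);
    m->in_use[tag] = false;
    m->outstanding--;
}

/*
 * Reallocate only once the map is drained.
 * Returns 0 on success, 1 if tags are still outstanding (the caller
 * must wait and retry), or -1 if the allocation failed.
 */
static int drained_map_resize(struct drained_map *m, unsigned int new_depth)
{
    if (m->outstanding != 0)
        return 1;
    free(m->in_use);
    return drained_map_init(m, new_depth);
}
```

The point of the design is that no tag issued from the old map can
outlive it, so a release never has to cope with a tag that the new map
cannot represent.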

James