Re: [GIT PULL] gfs2 fixes

From: Andreas Gruenbacher
Date: Fri Feb 11 2022 - 16:41:02 EST


On Fri, Feb 11, 2022 at 8:48 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, Feb 11, 2022 at 9:05 AM Andreas Gruenbacher <agruenba@xxxxxxxxxx> wrote:
> >
> > * Revert debug commit that causes unexpected data corruption.
>
> Well, apparently not just unexpected, but unexplained too.
>
> That's a bit worrisome. It sounds like the corruption cause is still
> there, just hidden by the lack of __cond_resched()?

Yes, that's what it looks like. My initial suspicion was that somewhere
we're calling gfs2_glock_dq() in a non-sleepable context, in a case
where we know that we're not dropping the last reference and so
gfs2_glock_dq() won't sleep. But there's no such instance in the code,
and testing would also have revealed such cases. The corruption we've
seen always affects whole pages/blocks. Maybe it's an ordering / memory
barrier issue.

Thanks,
Andreas