Re: [PATCH] fs: Fix freeze_bdev()/thaw_bdev() accounting of bd_fsfreeze_sb

From: Bob Peterson
Date: Fri Jan 08 2021 - 08:18:46 EST


----- Original Message -----
> This causes bdev->bd_fsfreeze_sb to be set to NULL even if the call to
> thaw_super right after this line fail. So if a caller tries to call
> thaw_bdev() again after receiving such an error, that next call won't even
> try to call thaw_super(). Is that what we want here? (I don't know much
> about this code, but from a cursory glance I think this difference is
> visible to emergency_thaw_bdev() in fs/buffer.c)
>
> In my version of the patch, I set bdev->bd_fsfreeze_sb to NULL only
> *after* we check that the call to thaw_super() succeeded to avoid this.

Yes, I see your point. Your patch is superior and I'll mine accordingly.

> Thanks a lot for investigating the bug and the patch I sent :)
> Was there actually an issue with that patch I sent? As you said, the bug

No, I never saw your patch until I saw Christoph's reference to it yesterday,
after I had been using my patch to fix the problem. AFAIK, there is no
problem with your patch.

> I think the second difference (decrementing bd_fsfreeze_count when
> get_active_super() returns NULL) doesn't change anything w.r.t the
> use-after-free. It does however, change the behaviour of the function
> slightly, and it might be caller visible (because from a cursory glance, it
> looks like we're reading the bd_fsfreeze_count from some other places like
> fs/super.c). Even before 040f04bd2e82, the code wouldn't decrement
> bd_fsfreeze_count when get_active_super() returned NULL - so is this change
> in behaviour intentional? And if so, maybe it should go in a separate
> patch?

This is the bigger issue, and I'm not very familiar with this code either,
so I'll defer to the experts. Yes, it's a change in behavior, but I think
it makes sense to decrement the bd_fsfreeze_count in this case. Here's why:

If the blockdev is frozen by freeze_bdev while it's being unmounted, the
bd_fsfreeze_count is incremented, but the freeze is ignored. Subsequent
attempts to thaw the device will be ignored but return 0 because the sb
is not found. When the device is mounted again, calls to freeze_bdev
will bypass the call to freeze_super for the newly mounted sb, because
bdev->bd_fsfreeze_count was then incremented from 1 to 2 in freeze_bdev.

if (++bdev->bd_fsfreeze_count > 1)
goto done;

So you're freezing the device without really freezing the superblock.
Seems like dangerous behavior to me. The new sb will only be frozen if
a second thaw is done, which gets them back in sync. I suppose we could
say this is acceptable loss, and your number of thaws should match your
freezes, and if they don't: user error. Still, it seems like we should do
something about it, like refuse to mount a frozen device. Perhaps it already
does that; I'll need to do some research.

Like I said, I don't know this code. I'm just trying to fix a problem
I observed. I'll defer to the experts.

Regards,

Bob Peterson