Re: LVM snapshot broke between 4.14 and 4.16
From: Mike Snitzer
Date: Fri Aug 03 2018 - 15:30:41 EST

On Fri, Aug 03 2018 at 3:09pm -0400,
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Fri, Aug 3, 2018 at 11:54 AM Mike Snitzer <snitzer@xxxxxxxxxx> wrote:
> >
> >
> > As I explained to Ted in my previous reply to this thread: using an lvm2
> > that is of the same vintage as the kernel is generally going to provide
> > a more robust user experience.
>
> You said that yes.
>
> And it is completely irrelevant.
>
> The fact is, if you use an older lvm2, then a newer kernel still needs
> to work. Your "more robust experience" argument has nothing
> what-so-ever to do with that.
>
> Will you get new features from newer user land tools? Sure, usually.
> And entirely immaterial to a kernel regression.

I was merely giving context for the suggestion of keeping lvm2 updated.
Not saying it was relevant for this regression.

> Will newer user land tools hopefully fix other issues? You'd hope so,
> but again - immaterial.
>
> So why are you bringing up a complete red herring? It's entirely
> immaterial to the actual issue at hand.

I was trying to give context for the "best to update lvm2 anyway"
disclaimer that was used. Yeah, it was specious.

And Zdenek exposed way more surface area for you to attack with his
reply to this thread. My initial response to this thread was far more
understated but was effectively: read-only dm-snapshot is rare, I'm
inclined to just let this be.

And yeah, that isn't a good excuse to ignore it but: dm-snapshot is a
steaming pile as compared to dm thin-provisioning so dm-snapshot users
who then go off the beaten path are already masochistic. So the 2 users
who noticed can cope.

But that too is a cop-out.

> I would _hope_ that other projects have the same "no regressions"
> rule that the kernel has, but I know many don't. But whatever other
> projects are out there, and whatever other rules _they_ have for their
> development is also entirely immaterial to the kernel.
>
> The kernel has a simple rule: no user regressions.
>
> Yes, we've had to break that rule very occasionally - when the
> semantics are a huge honking security issue and cannot possibly be
> hidden any other way, then we obviously have to break them.
>
> So it has happened. It's happily quite rare.
>
> But in this case, the issue is that the block layer now enforces the
> read-only protection more. And it seems to be the case that the lvm
> tools set the read-only flag even when they then depended on being
> able to write to them, because we didn't use to.
>
> So just judging from that description, I do suspect that "we can't
> depend on the lvm read-only flag", so a patch like
>
> "let's not turn DM_READONLY_FLAG into actually set_disk_ro(dm_disk(md), 1)"
>
> makes sense.
>
> Obviously, if we can limit that more, that would be lovely.
>
> But dammit, NOBODY gets to say "oh, you should just update user land tools".

I'll have a closer look at all this.

Could be DM in general is lacking for read-only permissions when you
have complex stacking involved.

> Because when they do, I will explode. And I'm 1000% serious that I
> will refuse to work with people who continue to say that or continue
> to make excuses.
>
> And user land developers should damn well know about this. The fact
> that they are apparently not clued in about kernel rules is what
> allowed this bug to go undiscovered and unreported for much too long.
> Apparently the lvm2 user land developers *did* notice the breakage,
> but instead of reporting it as a kernel bug, they worked around it.

Yep, they did. I was unaware myself.

> So user land developers should actually know that if the kernel stops
> working for them, they should *not* work around it. Sure, fix your
> program, but let the kernel people know.

Agreed.

> And kernel people should know that "oh, the user land people already
> changed their behavior" is *not* a "I don't need to care about it".
> Unless the user land fix was so long ago that nobody cares any more.

I never didn't care. I just didn't care much. Because "dm-snapshot".

Mike