Re: [PATCH] s390/vfio-ap: do not open code locks for VFIO_GROUP_NOTIFY_SET_KVM notification
From: Halil Pasic
Date: Tue Jul 13 2021 - 12:45:33 EST
On Tue, 13 Jul 2021 09:48:01 -0400
Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:
> On 7/12/21 7:38 PM, Halil Pasic wrote:
> > On Wed, 7 Jul 2021 11:41:56 -0400
> > Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:
> >
> >> It was pointed out during an unrelated patch review that locks should not
> >> be open coded - i.e., writing the algorithm of a standard lock in a
> >> function instead of using a lock from the standard library. The setting and
> >> testing of the kvm_busy flag and sleeping on a wait_event is the same thing
> >> a lock does. Whatever potential deadlock was found and reported via the
> >> lockdep splat was not magically removed by going to a wait_queue; it just
> >> removed the lockdep annotations that would identify the issue early
> > Did you change your opinion since we last talked about it? This reads to
> > me like we are deadlocky without this patch, because of the last
> > sentence.
>
> The words are a direct paraphrase of Jason G's responses to my
> query regarding what he meant by open coding locks. I
> am choosing to take his word on the subject and remove the
> open coded locks.
>
> Having said that, we do not have a deadlock problem without
> this patch. If you recall, the lockdep splat occurred ONLY when
> running a Secure Execution guest in a CI environment. Since
> AP is not yet supported for SE guests, there is no danger of
> a lockdep splat occurring in a customer environment. Given
> Jason's objections to the original solution (i.e., kvm_busy flag
> and wait queue), I decided to replace the so-called open
> coded locks.
I'm in favor of doing that. But if ("s390/vfio-ap: fix
circular lockdep when setting/clearing crypto masks") ain't buggy,
then this patch does not qualify for stable. For a complete set of
rules consult:
https://github.com/torvalds/linux/blob/master/Documentation/process/stable-kernel-rules.rst
Here are the most relevant points:
* It must fix a real bug that bothers people (not a, "This could be a
problem..." type thing).
* It must fix a problem that causes a build error (but not for things
marked CONFIG_BROKEN), an oops, a hang, data corruption, a real security
issue, or some "oh, that's not good" issue. In short, something critical.
* No "theoretical race condition" issues, unless an explanation of how
the race can be exploited is also provided.
Jason may give it another try to convince us that 0cc00c8d4050 only
silenced lockdep, but vfio_ap remained prone to deadlocks. To the best of my
knowledge, using a condition variable and a mutex is one of the well-known
ways to implement an rwlock.
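To illustrate that construction, here is a minimal userspace sketch using
pthreads rather than kernel primitives. It is not taken from vfio_ap, and
every name in it (demo_rwlock and friends) is hypothetical. It only shows
the general shape: a mutex protects the state, and a condition variable is
what waiters sleep on until the state allows them in.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical rwlock built from a mutex and a condition variable. */
struct demo_rwlock {
	pthread_mutex_t mutex;	/* protects readers and writer */
	pthread_cond_t cond;	/* signalled whenever the lock is released */
	int readers;		/* number of active readers */
	bool writer;		/* true while a writer holds the lock */
};

static void demo_rwlock_init(struct demo_rwlock *l)
{
	pthread_mutex_init(&l->mutex, NULL);
	pthread_cond_init(&l->cond, NULL);
	l->readers = 0;
	l->writer = false;
}

static void demo_read_lock(struct demo_rwlock *l)
{
	pthread_mutex_lock(&l->mutex);
	/* Readers only have to wait for an active writer. */
	while (l->writer)
		pthread_cond_wait(&l->cond, &l->mutex);
	l->readers++;
	pthread_mutex_unlock(&l->mutex);
}

static void demo_read_unlock(struct demo_rwlock *l)
{
	pthread_mutex_lock(&l->mutex);
	assert(l->readers > 0);
	/* The last reader out wakes any waiting writer. */
	if (--l->readers == 0)
		pthread_cond_broadcast(&l->cond);
	pthread_mutex_unlock(&l->mutex);
}

static void demo_write_lock(struct demo_rwlock *l)
{
	pthread_mutex_lock(&l->mutex);
	/* A writer needs exclusive access: no writer and no readers. */
	while (l->writer || l->readers > 0)
		pthread_cond_wait(&l->cond, &l->mutex);
	l->writer = true;
	pthread_mutex_unlock(&l->mutex);
}

static void demo_write_unlock(struct demo_rwlock *l)
{
	pthread_mutex_lock(&l->mutex);
	assert(l->writer);
	l->writer = false;
	/* Wake everyone; readers and writers recheck their conditions. */
	pthread_cond_broadcast(&l->cond);
	pthread_mutex_unlock(&l->mutex);
}
```

A sketch like this is invisible to a lock validator such as lockdep, which
is the concern behind the "open coded locks" objection: whether the locking
is correct and whether a tool can check it are separate questions.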
In my opinion, you should drop the fixes tag, drop the cc stable, and
provide a patch description that corresponds to *your* understanding
of the situation.
Neither the Fixes tag nor the stable process is (IMHO) meant for these
types of (style) issues. And if you don't think the alleged problem is
real, don't make the description of your patch say it is real.
Regards,
Halil
>
> >
> > Regards,
> > Halil
>