Re: Litmus test for question from Al Viro
From: Paul E. McKenney
Date: Mon Oct 05 2020 - 10:01:06 EST
On Mon, Oct 05, 2020 at 10:12:48AM +0100, Will Deacon wrote:
> On Mon, Oct 05, 2020 at 09:20:03AM +0100, Will Deacon wrote:
> > On Sun, Oct 04, 2020 at 10:38:46PM -0400, Alan Stern wrote:
> > > On Sun, Oct 04, 2020 at 04:31:46PM -0700, Paul E. McKenney wrote:
> > > > Nice simple example! How about like this?
> > > >
> > > > Thanx, Paul
> > > >
> > > > ------------------------------------------------------------------------
> > > >
> > > > commit c964f404eabe4d8ce294e59dda713d8c19d340cf
> > > > Author: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>
> > > > Date: Sun Oct 4 16:27:03 2020 -0700
> > > >
> > > > manual/kernel: Add a litmus test with a hidden dependency
> > > >
> > > > This commit adds a litmus test that has a data dependency that can be
> > > > hidden by control flow. In this test, both the taken and the not-taken
> > > > branches of an "if" statement must be accounted for in order to properly
> > > > analyze the litmus test. But herd7 looks only at individual executions
> > > > in isolation, so fails to see the dependency.
> > > >
> > > > Signed-off-by: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>
> > > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > > >
> > > > diff --git a/manual/kernel/crypto-control-data.litmus b/manual/kernel/crypto-control-data.litmus
> > > > new file mode 100644
> > > > index 0000000..6baecf9
> > > > --- /dev/null
> > > > +++ b/manual/kernel/crypto-control-data.litmus
> > > > @@ -0,0 +1,31 @@
> > > > +C crypto-control-data
> > > > +(*
> > > > + * LB plus crypto-control-data plus data
> > > > + *
> > > > + * Result: Sometimes
> > > > + *
> > > > + * This is an example of OOTA and we would like it to be forbidden.
> > > > + * The WRITE_ONCE in P0 is both data-dependent and (at the hardware level)
> > > > + * control-dependent on the preceding READ_ONCE. But the dependencies are
> > > > + * hidden by the form of the conditional control construct, hence the
> > > > + * name "crypto-control-data". The memory model doesn't recognize them.
> > > > + *)
> > > > +
> > > > +{}
> > > > +
> > > > +P0(int *x, int *y)
> > > > +{
> > > > + int r1;
> > > > +
> > > > + r1 = 1;
> > > > + if (READ_ONCE(*x) == 0)
> > > > + r1 = 0;
> > > > + WRITE_ONCE(*y, r1);
> > > > +}
> > > > +
> > > > +P1(int *x, int *y)
> > > > +{
> > > > + WRITE_ONCE(*x, READ_ONCE(*y));
> > > > +}
> > > > +
> > > > +exists (0:r1=1)
> > >
> > > Considering the bug in herd7 pointed out by Akira, we should rewrite P1 as:
> > >
> > > P1(int *x, int *y)
> > > {
> > > int r2;
> > >
> > > r = READ_ONCE(*y);
> >
> > (r2?)
> >
> > > WRITE_ONCE(*x, r2);
> > > }
> > >
> > > Other than that, this is fine.
> >
> > But yes, modulo the typo, I agree that this rewrite is much better than the
> > proposal above. The definition of control dependencies on arm64 (per the Arm
> > ARM [1]) isn't entirely clear that it provides order if the WRITE is
> > executed on both paths of the branch, and I believe there are ongoing
> > efforts to try to tighten that up. I'd rather keep _that_ topic separate
> > from the "bug in herd" topic to avoid extra confusion.
>
> Ah, now I see that you're changing P1 here, not P0. So I'm now nervous
> about claiming that this is a bug in herd without input from Jade or Luc,
> as it does unfortunately tie into the definition of control dependencies
> and it could be a deliberate choice.
>
> Jade, Luc: apparently herd doesn't emit a control dependency edge from
> the READ_ONCE() to the WRITE_ONCE() in the following:
>
>
> P0(int *x, int *y)
> {
> int r1;
>
> r1 = 1;
> if (READ_ONCE(*x) == 0)
> r1 = 0;
> WRITE_ONCE(*y, r1);
> }
>
>
> Is that deliberate?
>
> Setting the arm64 architecture aside for one moment, I think the Linux
> memory model would very much like the control dependency to exist in this
> case. Documenting the unexpected outcome is one thing, but I think it would
> be much better to do it in a way where users can reason about whether or not
> they're falling into this trap rather than warning them that the results may
> be unreliable, which is not likely to build confidence in the tool.

It was in fact a deliberate choice. Exact modeling of what compilers can
and cannot do gets extremely computationally intensive very quickly given
the current state of the art.
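
To make the hidden dependency concrete, here is a minimal userspace sketch.
It assumes simplified stand-ins for READ_ONCE() and WRITE_ONCE() (the real
kernel macros are more involved), and the helper names p0_branch(),
p0_data(), and forms_agree() are made up for this illustration. It shows
that the "if" form of P0 stores the same value as a branch-free expression,
which is one way to see that the value written to y depends on the value
read from x in both the taken and the not-taken case.

```c
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel macros (assumption). */
#define READ_ONCE(v)      (*(volatile int *)&(v))
#define WRITE_ONCE(v, n)  (*(volatile int *)&(v) = (n))

/* P0 as in the litmus test: the stored value is selected by a branch. */
static void p0_branch(int *x, int *y)
{
	int r1;

	r1 = 1;
	if (READ_ONCE(*x) == 0)
		r1 = 0;
	WRITE_ONCE(*y, r1);
}

/* Same computation with the branch folded into a plain data dependency. */
static void p0_data(int *x, int *y)
{
	WRITE_ONCE(*y, READ_ONCE(*x) != 0);
}

/* Returns 1 if both forms store the same value to y for each sample x. */
static int forms_agree(void)
{
	static const int samples[] = { 0, 1, -1, 42 };
	size_t i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		int x = samples[i];
		int y_branch = -1;
		int y_data = -1;

		p0_branch(&x, &y_branch);
		p0_data(&x, &y_data);
		if (y_branch != y_data)
			return 0;
	}
	return 1;
}
```

Calling forms_agree() from a small main() is enough to check the
equivalence single-threaded. It says nothing about what herd7 or any
particular compiler does with either form.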

Thanx, Paul