Re: [GIT PULL] x86/mm changes for v5.8

From: Linus Torvalds
Date: Tue Jun 02 2020 - 19:29:03 EST


On Tue, Jun 2, 2020 at 4:01 PM Singh, Balbir <sblbir@xxxxxxxxxx> wrote:
>
> > (c) and if I read the code correctly, trying to flush the L1D$ on
> > non-intel without the HW support, it causes a WARN_ON_ONCE()! WTF?
>
> That is not correct, the function only complains if we do a software fallback
> flush without allocating the flush pages.

Right.

And if you're not on Intel, then that allocation would never have been
done, since the allocation function returns an error for non-intel
systems.

> That function is not exposed without
> the user using the prctl() API, which allocates those flush pages.

See above: it doesn't actually allocate those pages on anything but intel CPU's.

That said, looking deeper, it then does look like a
l1d_flush_init_once() failure will also cause the code to avoid
setting the TIF_SPEC_L1D_FLUSH bit, so non-intel CPU's will never call
the actual flushing routines, and thus never hit the WARN_ON. Ok.

> > (2) the HW case is done for any vendor, if it reports the "I have the MSR"
>
> No l1d_flush_init_once() fails for users opting in via the prctl(), it
> succeeds for users of L1TF.

Yeah, again it looks like this all is basically just a hack for Intel CPU's.

It should never have been conditional on "do this on Intel".

It should have been conditional on the L1TF bug.

Yes, there's certainly overlap there, but it's not complete.

> > (3) the VMX support certainly has various sanity checks like "oh, CPU
> > doesn't have X86_BUG_L1TF, then I won't do this even if there was some
> > kernel command line to say I should". But the new prctrl doesn't have
> > anything like that. It just enables that L1D$ thing mindlessly,
> > thinking that user-land software somehow knows what it's doing. BS.
>
> So you'd like to see a double opt-in?

I'd like it to be gated on being sane by default, together with some
system option like we have for pretty much all the mitigations.

> Unforunately there is no gating
> of the bug and I tried to make it generic - clearly calling it opt-in
> flushing for the paranoid, for those who really care about CVE-2020-0550.

No, you didn't make it generic at all - you made it depend on
X86_VENDOR_INTEL instead.

So now the logic is "on Intel, do this thing whether it makes sense or
not, on other vendors, never do it whether it _would_ make sense or
not".

That to me is not sensible. I just don't see the logic.

This feature should never be enabled unless X86_BUG_L1TF is on, as far
as I can tell.

And it should never be enabled if SMT is on.

At that point, it at least starts making sense. Maybe we don't need
any further admin options at that point.

> Would this make you happier?
>
> 1. Remove SW fallback flush
> 2. Implement a double opt-in (CAP_SYS_ADMIN for the prctl or a
> system wide disable)?
> 3. Ensure the flush happens only when the current core has
> SMT disabled

I think that (3) case should basically be "X86_BUG_L1TF && !SMT". That
should basically be the default setting for this.

The (2) thing I would prefer to just be the same kind of thing we do
for all the other mitigations: have a kernel command line to override
the defaults.

The SW fallback right now feels wrong to me. It does seem to be very
microarchitecture-specific and I'd really like to understand the
reason for the magic TLB filling. At the same time, if the feature is
at least enabled under sane and understandable circumstances, and
people have a way to turn it off, maybe I don't care too much.

Linus