Re: [PATCH] arch: x86: power: cpu: init %gs before __restore_processor_state (clang)

From: Borislav Petkov
Date: Tue Sep 15 2020 - 18:07:11 EST


On Tue, Sep 15, 2020 at 12:51:47PM -0700, Nick Desaulniers wrote:
> I agree; I also would not have sent the patch though.

Maybe google folks should run stuff by you before sending it up... :-)

> Until LTO has landed upstream, this is definitely somewhat self
> inflicted. This was only debugged last week; even with a compiler fix
> in hand today, it still takes time to ship that compiler and qualify
> it; for other folks on tighter timelines, I can understand why the
> patch was sent,

... because they have the requirement that a patch which gets backported
to a kernel used at google needs to be upstream? Because I'm willing to
bet a lot of cash that no one runs bleeding edge 5.9-rcX in production
over there right now :-)

> and do genuinely appreciate the effort to participate more upstream
> which I'm trying to encourage more of throughout the company (we're
> in a lot of technical debt kernel-wise; and I'm not referring to
> Android...a story over beers perhaps, or ask Greg).

Beers? Always. But I can imagine the reasons: people working on projects
and then those projects getting done and no one cares about upstreaming
stuff after the fact or no one has time ... or policy ... but let's keep
that for beers. :-)

> It's just that this isn't really appropriate since it works around
> a bug in a non-upstream feature, and will go away once we fix the
> toolchain.

Hohumm.

> It would be much nicer if we had the flexibility to disable stack
> protectors per function, rather than per translation unit. I'm going
> to encourage you to encourage your favorite compiler vendor ("write to
> your senator") to support the function attribute
> __attribute__((no_stack_protector)) so that one day,

I already forgot why gcc doesn't do that... Martin, do you know?

> we can use that to stop shipping crap like a9a3ed1eff360 ("x86: Fix
> early boot crash on gcc-10, third try"). Having had that, we could
> have used a nicer workaround until the toolchain was fixed (and one
> day revert a9a3ed1eff360, and d0a8d9378d16, and probably more hacks in
> the kernel).

Yap, agreed. I guess with those new compiler features it always takes a
couple of releases for both the kernel, i.e., the user of that feature,
and the compiler, i.e., the provider of the feature, to figure out what
the proper use cases are, to experiment a bit and then to adjust stuff,
change here and there and then cast in stone. Oh well.
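
For illustration, a minimal sketch of what the per-function opt-out
discussed above could look like, assuming a compiler that accepts
__attribute__((no_stack_protector)). NO_STACK_PROTECTOR and early_fill()
are hypothetical names made up for this example, not kernel identifiers,
and the fallback branch only shows that the per-translation-unit flag
would still be needed where the attribute is missing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical wrapper macro (not a kernel macro): expands to the
 * attribute only when the compiler reports support for it.
 */
#if defined(__has_attribute)
# if __has_attribute(no_stack_protector)
#  define NO_STACK_PROTECTOR __attribute__((no_stack_protector))
# endif
#endif
#ifndef NO_STACK_PROTECTOR
/* Attribute unavailable: the whole file would need -fno-stack-protector. */
# define NO_STACK_PROTECTOR
#endif

/*
 * Stand-in for code that must run before the stack canary can be read
 * (e.g. before %gs is set up on x86). The local array is the kind of
 * object that normally makes the compiler emit a canary check.
 */
NO_STACK_PROTECTOR
size_t early_fill(char *dst, size_t len, char c)
{
	char scratch[64];
	size_t n = len < sizeof(scratch) ? len : sizeof(scratch);

	assert(dst != NULL);
	memset(scratch, c, n);	/* fill the on-stack buffer */
	memcpy(dst, scratch, n);	/* copy at most sizeof(scratch) bytes out */
	return n;
}
```

With something like this, only the marked function would lose the
canary while the rest of the file stays protected, which is the
flexibility being asked for.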

> And the case that's causing the compiler bug in question is something
> all compiler vendors will need to consider in their implementations.

Are you talking to gcc folks about it already so that they DTRT too?

Btw, if it is any consolation, talking to compiler folks is like a charm
in comparison to talking to hardware vendors and trying to get them
to agree on something because they seem to think that the kernel is
software and sure, can be changed to do whatever. But that's another
story for the beers... :-)

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette