Re: [PATCH v2 33/39] x86/cpufeatures: Limit shadow stack to Intel CPUs

From: Edgecombe, Rick P
Date: Tue Oct 04 2022 - 16:35:21 EST


On Tue, 2022-10-04 at 14:43 -0500, John Allen wrote:
> On 10/4/22 10:47 AM, Nathan Chancellor wrote:
> > Hi Kees,
> >
> > On Mon, Oct 03, 2022 at 09:54:26PM -0700, Kees Cook wrote:
> > > On Mon, Oct 03, 2022 at 05:09:04PM -0700, Dave Hansen wrote:
> > > > On 10/3/22 16:57, Kees Cook wrote:
> > > > > On Thu, Sep 29, 2022 at 03:29:30PM -0700, Rick Edgecombe
> > > > > wrote:
> > > > > > Shadow stack is supported on newer AMD processors, but the
> > > > > > kernel
> > > > > > implementation has not been tested on them. Prevent basic
> > > > > > issues from
> > > > > > showing up for normal users by disabling shadow stack on
> > > > > > all CPUs except
> > > > > > Intel until it has been tested. At which point the
> > > > > > limitation should be
> > > > > > removed.
> > > > > >
> > > > > > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>
> > > > >
> > > > > So running the selftests on an AMD system is sufficient to
> > > > > drop this
> > > > > patch?
> > > >
> > > > Yes, that's enough.
> > > >
> > > > I _thought_ the AMD folks provided some tested-by's at some
> > > > point in the
> > > > past. But, maybe I'm confusing this for one of the other
> > > > shared
> > > > features. Either way, I'm sure no tested-by's were dropped on
> > > > purpose.
> > > >
> > > > I'm sure Rick is eager to trim down his series and this would
> > > > be a great
> > > > patch to drop. Does anyone want to make that easy for Rick?
> > > >
> > > > <hint> <hint>
> > >
> > > Hey Gustavo, Nathan, or Nick! I know y'all have some fancy AMD
> > > testing
> > > rigs. Got a moment to spin up this series and run the selftests?
> > > :)
> >
> > I do have access to a system with an EPYC 7513, which does have
> > Shadow
> > Stack support (I can see 'shstk' in the "Flags" section of lscpu
> > with
> > this series). As far as I understand it, AMD only added Shadow
> > Stack
> > with Zen 3; my regular AMD test system is Zen 2 (probably should
> > look at
> > procurring a Zen 3 or Zen 4 one at some point).
> >
> > I applied this series on top of 6.0 and reverted this change then
> > booted
> > it on that system. After building the selftest (which did require
> > 'make headers_install' and a small addition to make it build beyond
> > that, see below), I ran it and this was the result. I am not sure
> > if
> > that is expected or not but the other results seem promising for
> > dropping this patch.
> >
> > $ ./test_shadow_stack_64
> > [INFO] new_ssp = 7f8a36c9fff8, *new_ssp = 7f8a36ca0001
> > [INFO] changing ssp from 7f8a374a0ff0 to 7f8a36c9fff8
> > [INFO] ssp is now 7f8a36ca0000
> > [OK] Shadow stack pivot
> > [OK] Shadow stack faults
> > [INFO] Corrupting shadow stack
> > [INFO] Generated shadow stack violation successfully
> > [OK] Shadow stack violation test
> > [INFO] Gup read -> shstk access success
> > [INFO] Gup write -> shstk access success
> > [INFO] Violation from normal write
> > [INFO] Gup read -> write access success
> > [INFO] Violation from normal write
> > [INFO] Gup write -> write access success
> > [INFO] Cow gup write -> write access success
> > [OK] Shadow gup test
> > [INFO] Violation from shstk access
> > [OK] mprotect() test
> > [OK] Userfaultfd test
> > [FAIL] Alt shadow stack test
>
> The selftest is looking OK on my system (Dell PowerEdge R6515 w/ EPYC
> 7713). I also just pulled a fresh 6.0 kernel and applied the series
> including the fix Nathan mentions below.
>
> $ tools/testing/selftests/x86/test_shadow_stack_64
> [INFO] new_ssp = 7f30cccc5ff8, *new_ssp = 7f30cccc6001
> [INFO] changing ssp from 7f30cd4c6ff0 to 7f30cccc5ff8
> [INFO] ssp is now 7f30cccc6000
> [OK] Shadow stack pivot
> [OK] Shadow stack faults
> [INFO] Corrupting shadow stack
> [INFO] Generated shadow stack violation successfully
> [OK] Shadow stack violation test
> [INFO] Gup read -> shstk access success
> [INFO] Gup write -> shstk access success
> [INFO] Violation from normal write
> [INFO] Gup read -> write access success
> [INFO] Violation from normal write
> [INFO] Gup write -> write access success
> [INFO] Cow gup write -> write access success
> [OK] Shadow gup test
> [INFO] Violation from shstk access
> [OK] mprotect() test
> [OK] Userfaultfd test
> [OK] Alt shadow stack test.

Thanks for the testing. Based on the test, I wonder if this could be a
SW bug. Nathan, could I send you a tweaked test with some more debug
information?

John, are we sure AMD and Intel systems behave the same with respect to
CPUs not creating Dirty=1,Write=0 PTEs in rare situations? Or any other
CET related differences we should hash out? Otherwise I'll drop the
patch for the next version. (and assuming the issue Nathan hit doesn't
turn up anything HW related).