Re: [PATCH v7 01/41] Documentation/x86: Add CET shadow stack description

From: szabolcs.nagy@xxxxxxx
Date: Thu Mar 02 2023 - 11:16:06 EST


The 03/01/2023 18:07, Edgecombe, Rick P wrote:
> On Wed, 2023-03-01 at 14:21 +0000, Szabolcs Nagy wrote:
> > The 02/27/2023 14:29, Rick Edgecombe wrote:
> > > +Application Enabling
> > > +====================
> > > +
> > > +An application's CET capability is marked in its ELF note and can
> > > be verified
> > > +from readelf/llvm-readelf output::
> > > +
> > > + readelf -n <application> | grep -a SHSTK
> > > + properties: x86 feature: SHSTK
> > > +
> > > +The kernel does not process these applications markers directly.
> > > Applications
> > > +or loaders must enable CET features using the interface described
> > > in section 4.
> > > +Typically this would be done in dynamic loader or static runtime
> > > objects, as is
> > > +the case in GLIBC.
> >
> > Note that this has to be an early decision in libc (ld.so or static
> > exe start code), which will be difficult to hook into system wide
> > security policy settings. (e.g. to force shstk on marked binaries.)
>
> In the eager enabling (by the kernel) scenario, how is this improved?
> The loader has to have the option to disable the shadow stack if
> enabling conditions are not met, so it still has to trust userspace to
> not do that. Did you have any more specifics on how the policy would
> work?

i guess my issue is that the arch prctls only allow self policing.
there is no kernel mechanism to set policy from outside the process
that is either inherited or asynchronously set. policy is completely
managed by libc (and done very early).

now i understand that async disable does not work (thanks for the
explanation), but some control for forced enable/locking inherited
across exec could work.

> > From userspace POV I'd prefer if a static exe did not have to parse
> > its own ELF notes (i.e. kernel enabled shstk based on the marking).
>
> This is actually exactly what happens in the glibc patches. My
> understand was that it already been discussed amongst glibc folks.

there were many glibc patches some of which are committed despite not
having an accepted linux abi, so i'm trying to review the linux abi
contracts and expect this patch to be authorative, please bear with me.

> > - I think it's better to have a new limit specifically for shadow
> > stack size (which by default can be RLIMIT_STACK) so userspace
> > can adjust it if needed (another reason is that stack size is
> > not always a good indicator of max call depth).
>
> Hmm, yea. This seems like a good idea, but I don't see why it can't be
> a follow on. The series is quite big just to get the basics. I have
> tried to save some of the enhancements (like alt shadow stack) for the
> future.

it is actually not obvious how to introduce a limit so it is inherited
or reset in a sensible way so i think it is useful to discuss it
together with other issues.

> > The kernel pushes on the shadow stack on signal entry so shadow stack
> > overflow cannot be handled. Please document this as non-recoverable
> > failure.
>
> It doesn't hurt to call it out. Please see the below link for future
> plans to handle this scenario (alt shadow stack).
>
> >
> > I think it can be made recoverable if signals with alternate stack
> > run
> > on a different shadow stack. And the top of the thread shadow stack
> > is
> > just corrupted instead of pushed in the overflow case. Then longjmp
> > out
> > can be made to work (common in stack overflow handling cases), and
> > reliable crash report from the signal handler works (also common).
> >
> > Does SSP get stored into the sigcontext struct somewhere?
>
> No, it's pushed to the shadow stack only. See the v2 coverletter of the
> discussion on the design and reasoning:
>
> https://lore.kernel.org/lkml/20220929222936.14584-1-rick.p.edgecombe@xxxxxxxxx/

i think this should be part of the initial design as it may be hard
to change later.

"sigaltshstk() is separate from sigaltstack(). You can have one
without the other, neither or both together. Because the shadow
stack specific state is pushed to the shadow stack, the two
features don’t need to know about each other."

this means they cannot be changed together atomically.

i'd expect most sigaltstack users to want to be resilient
against shadow stack overflow which means non-portable
code changes.

i don't see why automatic alt shadow stack allocation would
not work (kernel manages it transparently when an alt stack
is installed or disabled).

"Since shadow alt stacks are a new feature, longjmp()ing from an
alt shadow stack will simply not be supported. If a libc want’s
to support this it will need to enable WRSS and write it’s own
restore token."

i think longjmp should work without enabling writes to the shadow
stack in the libc. this can also affect unwinding across signal
handlers (not for c++ but e.g. glibc thread cancellation).

i'd prefer overwriting the shadow stack top entry on overflow to
disallowing longjmp out of a shadow stack overflow handler.

> > I think the map_shadow_stack syscall should be mentioned in this
> > document too.
>
> There is a man page prepared for this. I plan to update the docs to
> reference it when it exists and not duplicate the text. There can be a
> blurb for the time being but it would be short lived.

i wanted to comment on the syscall because i think it may be better
to have a magic mmap MAP_ flag that takes care of everything.

but i can go comment on the specific patch then.

thanks.