Re: [PATCH V10 14/44] mm/pkeys: Introduce pks_set_readwrite()

From: Ira Weiny
Date: Tue May 10 2022 - 17:33:09 EST


On Mon, May 09, 2022 at 02:38:38PM -0700, Kees Cook wrote:
> On Tue, Apr 19, 2022 at 10:06:19AM -0700, ira.weiny@xxxxxxxxx wrote:
> > From: Ira Weiny <ira.weiny@xxxxxxxxx>
> >
> > When kernel code needs access to a PKS-protected page, it will need to
> > change the protections for the pkey to Read/Write.
>
> I'm excited to have this infrastructure available! It'll finally give us
> the "write rarely" infrastructure we've needed:
> https://github.com/KSPP/linux/issues/130

Thanks!

[snip]

> >
> > @@ -275,4 +276,34 @@ void pks_setup(void)
> > cr4_set_bits(X86_CR4_PKS);
> > }
> >
> > +/*
> > + * Do not call this directly, see pks_set*().
> > + *
> > + * @pkey: Key for the domain to change
> > + * @protection: protection bits to be used
> > + *
> > + * Protection utilizes the same protection bits specified for User pkeys
> > + * PKEY_DISABLE_ACCESS
> > + * PKEY_DISABLE_WRITE
> > + *
> > + */
> > +void pks_update_protection(u8 pkey, u8 protection)
>
> For better security, I think this should be a static inline, not a
> callable (i.e. as a non-inline it could be the target of a control-flow
> attack).

Good point! I'll move this to asm/pks.h.

>
> > +{
> > + u32 pkrs;
> > +
> > + if (!cpu_feature_enabled(X86_FEATURE_PKS))
> > + return;
> > +
> > + if (WARN_ON_ONCE(pkey >= PKS_KEY_MAX))
> > + return;
>
> I think this should enforce arguments being __builtin_constant_p(). i.e.
> making sure that all callers of pks_update_protection() are using a
> compile-time constant value. That makes it so that the caller location
> and key become hard-coded (i.e. further reduction in risk to becoming a
> control-flow gadget: the inlining of a const value means no arguments
> any more). For example:
>
> BUILD_BUG_ON(!__builtin_constant_p(pkey));
> BUILD_BUG_ON(!__builtin_constant_p(protection));

Sounds reasonable.

>
> (I think the test code will need some tweaking, but it should be
> possible to adjust it.)

I'll figure it out.

>
> > +
> > + pkrs = current->thread.pkrs;
> > + current->thread.pkrs = pkey_update_pkval(pkrs, pkey,
> > + protection);
> > + preempt_disable();
> > + pks_write_pkrs(current->thread.pkrs);
> > + preempt_enable();
>
> To resist cross-thread attacks, please:
>
> - make pkey_update_pkval() also an inline
> - use the pkrs variable directly and store it back only after the write
>
> For example:
>
> preempt_disable();
> pkrs = pkey_update_pkval(current->thread.pkrs,
> pkey, protection);
> pks_write_pkrs(pkrs);
> current->thread.pkrs = pkrs;
> preempt_enable();
>
> This means that the pkey/protection relationship always lives in a
> CPU-local register and cannot be manipulated by another CPU before the
> MSR write completes.

Yes this sounds good. Thanks for the tip.

> Better yet would be:
>
> preempt_disable();
> rdmsrl(MSR_IA32_PKRS, pkrs);
> pkrs = pkey_update_pkval(pkrs, pkey, protection);
> pks_write_pkrs(pkrs);
> current->thread.pkrs = pkrs;
> preempt_enable();
>
> Then cross-thread attacks cannot corrupt the _other_ PKS keys (i.e.
> write the desired changes to the target's current->thread.pkrs and trigger
> an update to a different pkey, resulting in flushing the attacker's
> changes to that CPU's pkey state).

Unfortunately I don't think this entirely prevents an attack through the
thread.pkrs value. thread.pkrs has to be used to set the MSR when a thread is
scheduled, so the rdmsrl above will, by definition, still pick up a value that
came from thread.pkrs, just from an earlier point in time.

I'm not opposed to doing this, as I think it does reduce the window for such
an attack, but I wanted to mention it, especially since I specifically
avoided ever reading the MSR for performance reasons.

I'm going to run some tests. Perhaps the MSR read is not that big of a deal
and I can convince myself that the performance difference is negligible.

>
> > +/**
> > + * pks_set_readwrite() - Make the domain Read/Write
> > + * @pkey: the pkey for which the access should change.
> > + *
> > + * Allow all access, read and write, to the domain specified by pkey. This is
> > + * not a global update and only affects the current running thread.
> > + */
> > +static inline void pks_set_readwrite(u8 pkey)
> > +{
> > + pks_update_protection(pkey, PKEY_READ_WRITE);
> > +}
>
> While adding these, can you please also add pks_set_nowrite()? This
> will be needed for protecting writes to memory that should be otherwise
> readable.

I have a patch to add pks_set_readonly(), but I was advised not to send it
because this series does not include a use case for it. (PMEM does not need
it.)

Dave, Dan? Are you ok adding that back?

Kees would you prefer pks_set_nowrite() as a name?

>
> With these changes it should be possible to protect the kernel's page
> table entries from "stray" writes. :)

Yes, Rick has done some great work in that area.

Ira

>
> -Kees
>
> > +
> > +#else /* !CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS */
> > +
> > +static inline void pks_set_readwrite(u8 pkey) {}
> > +
> > +#endif /* CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS */
> > +
> > +#endif /* _LINUX_PKS_H */
> > diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
> > index 6c1aa92a92e4..f179544bd33a 100644
> > --- a/include/uapi/asm-generic/mman-common.h
> > +++ b/include/uapi/asm-generic/mman-common.h
> > @@ -80,6 +80,7 @@
> > /* compatibility flags */
> > #define MAP_FILE 0
> >
> > +#define PKEY_READ_WRITE 0x0
> > #define PKEY_DISABLE_ACCESS 0x1
> > #define PKEY_DISABLE_WRITE 0x2
> > #define PKEY_ACCESS_MASK (PKEY_DISABLE_ACCESS |\
> > --
> > 2.35.1
> >
>
> --
> Kees Cook