Re: [PATCH RFC V3 5/9] x86/pks: Add PKS kernel API

From: Ira Weiny
Date: Wed Oct 14 2020 - 21:11:44 EST


On Tue, Oct 13, 2020 at 11:43:57AM -0700, Dave Hansen wrote:
> > +static inline void pks_update_protection(int pkey, unsigned long protection)
> > +{
> > + current->thread.saved_pkrs = update_pkey_val(current->thread.saved_pkrs,
> > + pkey, protection);
> > + preempt_disable();
> > + write_pkrs(current->thread.saved_pkrs);
> > + preempt_enable();
> > +}
>
> Why does this need preempt count manipulation in addition to the
> get/put_cpu_var() inside of write_pkrs()?

This is a bug. The disable should be around the update_pkey_val().

>
> > +/**
> > + * PKS access control functions
> > + *
> > + * Change the access of the domain specified by the pkey. These are global
> > + * updates. They only affects the current running thread. It is undefined and
> > + * a bug for users to call this without having allocated a pkey and using it as
> > + * pkey here.
> > + *
> > + * pks_mknoaccess()
> > + * Disable all access to the domain
> > + * pks_mkread()
> > + * Make the domain Read only
> > + * pks_mkrdwr()
> > + * Make the domain Read/Write
> > + *
> > + * @pkey the pkey for which the access should change.
> > + *
> > + */
> > +void pks_mknoaccess(int pkey)
> > +{
> > + pks_update_protection(pkey, PKEY_DISABLE_ACCESS);
> > +}
> > +EXPORT_SYMBOL_GPL(pks_mknoaccess);
>
> These are named like PTE manipulation functions, which is kinda weird.
>
> What's wrong with: pks_disable_access(pkey) ?

Internal review suggested these names. I'm not dead set on them.

FWIW I would rather they not get to wordy.

I was trying to get some consistency with pks_mk*() as meaning PKS 'make' X.

Do me 'disable' implies a state transition where 'make' implies we are
'setting' an absolute value. I think the later is a better name. And 'make'
made more sense because 'set' is so overloaded IHO.

>
> > +void pks_mkread(int pkey)
> > +{
> > + pks_update_protection(pkey, PKEY_DISABLE_WRITE);
> > +}
> > +EXPORT_SYMBOL_GPL(pks_mkread);
>
> I really don't like this name. It doesn't make readable, or even
> read-only, *especially* if it was already access-disabled.

Ok.

But it does sense if going from access-disable to read, correct?. I could see
this being better named pks_mkreadonly() so that going from RW to this would
make more sense. Especially after thinking about it above 'read only' needs to
be in the name.

Before I change anything I'd like to get consensus on naming.

How about the following?

pks_mk_noaccess()
pks_mk_readonly()
pks_mk_readwrite()

?

>
> > +static const char pks_key_user0[] = "kernel";
> > +
> > +/* Store names of allocated keys for debug. Key 0 is reserved for the kernel. */
> > +static const char *pks_key_users[PKS_NUM_KEYS] = {
> > + pks_key_user0
> > +};
> > +
> > +/*
> > + * Each key is represented by a bit. Bit 0 is set for key 0 and reserved for
> > + * its use. We use ulong for the bit operations but only 16 bits are used.
> > + */
> > +static unsigned long pks_key_allocation_map = 1 << PKS_KERN_DEFAULT_KEY;
> > +
> > +/*
> > + * pks_key_alloc - Allocate a PKS key
> > + *
> > + * @pkey_user: String stored for debugging of key exhaustion. The caller is
> > + * responsible to maintain this memory until pks_key_free().
> > + */
> > +int pks_key_alloc(const char * const pkey_user)
> > +{
> > + int nr;
> > +
> > + if (!cpu_feature_enabled(X86_FEATURE_PKS))
> > + return -EINVAL;
>
> I'm not sure I like -EINVAL for this. I thought we returned -ENOSPC for
> this case for user pkeys.

-ENOTSUP?

I'm not really sure anyone will need to know the difference between the
platform not supporting the key vs running out of them. But they are 2
different error conditions.

>
> > + while (1) {
> > + nr = find_first_zero_bit(&pks_key_allocation_map, PKS_NUM_KEYS);
> > + if (nr >= PKS_NUM_KEYS) {
> > + pr_info("Cannot allocate supervisor key for %s.\n",
> > + pkey_user);
> > + return -ENOSPC;

We return -ENOSPC here when running out of keys.

> > + }
> > + if (!test_and_set_bit_lock(nr, &pks_key_allocation_map))
> > + break;
> > + }
> > +
> > + /* for debugging key exhaustion */
> > + pks_key_users[nr] = pkey_user;
> > +
> > + return nr;
> > +}
> > +EXPORT_SYMBOL_GPL(pks_key_alloc);
> > +
> > +/*
> > + * pks_key_free - Free a previously allocate PKS key
> > + *
> > + * @pkey: Key to be free'ed
> > + */
> > +void pks_key_free(int pkey)
> > +{
> > + if (!cpu_feature_enabled(X86_FEATURE_PKS))
> > + return;
> > +
> > + if (pkey >= PKS_NUM_KEYS || pkey <= PKS_KERN_DEFAULT_KEY)
> > + return;
>
> This seems worthy of a WARN_ON_ONCE() at least. It's essentially
> corrupt data coming into a kernel API.

Ok, Done,
Ira