Re: [RFC PATCH v2 22/27] x86/cet/ibt: User-mode indirect branch tracking support

From: Yu-cheng Yu
Date: Wed Jul 11 2018 - 18:14:00 EST


On Tue, 2018-07-10 at 17:11 -0700, Dave Hansen wrote:
> Is this feature *integral* to shadow stacks?  Or, should it just be
> in a different series?

The whole CET series is mostly about SHSTK; only a small part is for IBT.
The IBT changes cannot be applied on their own without first applying the
SHSTK changes.  Would patch titles help, e.g. x86/cet/ibt, x86/cet/shstk,
etc.?

>
> >
> > diff --git a/arch/x86/include/asm/cet.h b/arch/x86/include/asm/cet.h
> > index d9ae3d86cdd7..71da2cccba16 100644
> > --- a/arch/x86/include/asm/cet.h
> > +++ b/arch/x86/include/asm/cet.h
> > @@ -12,7 +12,10 @@ struct task_struct;
> >  struct cet_status {
> >   unsigned long shstk_base;
> >   unsigned long shstk_size;
> > + unsigned long ibt_bitmap_addr;
> > + unsigned long ibt_bitmap_size;
> >   unsigned int shstk_enabled:1;
> > + unsigned int ibt_enabled:1;
> >  };
> Is there a reason we're not using pointers here?  This seems like the
> kind of place that we probably want __user pointers.

Yes, I will change that.
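
Something like this, perhaps (just a sketch; whether plain void __user *
or a more specific pointer type is better is still open):

struct cet_status {
        void __user *shstk_base;
        unsigned long shstk_size;
        void __user *ibt_bitmap_addr;
        unsigned long ibt_bitmap_size;
        unsigned int shstk_enabled:1;
        unsigned int ibt_enabled:1;
};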

>
>
> >
> > +static unsigned long ibt_mmap(unsigned long addr, unsigned long len)
> > +{
> > + struct mm_struct *mm = current->mm;
> > + unsigned long populate;
> > +
> > + down_write(&mm->mmap_sem);
> > + addr = do_mmap(NULL, addr, len, PROT_READ | PROT_WRITE,
> > +        MAP_ANONYMOUS | MAP_PRIVATE,
> > +        VM_DONTDUMP, 0, &populate, NULL);
> > + up_write(&mm->mmap_sem);
> > +
> > + if (populate)
> > + mm_populate(addr, populate);
> > +
> > + return addr;
> > +}
> We're going to have to start consolidating these at some point.  We
> have at least three of them now, maybe more.

Maybe we can do the following in linux/mm.h?

+static inline unsigned long do_mmap_locked(unsigned long addr,
+        unsigned long len, unsigned long prot, unsigned long flags,
+        vm_flags_t vm_flags)
+{
+        struct mm_struct *mm = current->mm;
+        unsigned long populate;
+
+        down_write(&mm->mmap_sem);
+        addr = do_mmap(NULL, addr, len, prot, flags, vm_flags,
+                       0, &populate, NULL);
+        up_write(&mm->mmap_sem);
+
+        if (populate)
+                mm_populate(addr, populate);
+
+        return addr;
+}
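
Then the IBT case above reduces to a single call, e.g.:

        bitmap = do_mmap_locked(0, size, PROT_READ | PROT_WRITE,
                                MAP_ANONYMOUS | MAP_PRIVATE, VM_DONTDUMP);

and the other copies you mentioned could switch over the same way.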

> >
> > +int cet_setup_ibt_bitmap(void)
> > +{
> > + u64 r;
> > + unsigned long bitmap;
> > + unsigned long size;
> > +
> > + if (!cpu_feature_enabled(X86_FEATURE_IBT))
> > + return -EOPNOTSUPP;
> > +
> > + size = TASK_SIZE_MAX / PAGE_SIZE / BITS_PER_BYTE;
> Just a note: this table is going to be gigantic on 5-level paging
> systems, and userspace won't, by default, use any of that extra
> address space.  I think it ends up being a 512GB allocation in a
> 128TB address space.
>
> Is that a problem?
>
> On 5-level paging systems, maybe we should just stick it up in the
> high part of the address space.

We do not know in advance whether dlopen() will need to create the bitmap.
Do we always reserve the high part of the address space, or force legacy
libs to low addresses?

>
> >
> > + bitmap = ibt_mmap(0, size);
> > +
> > + if (bitmap >= TASK_SIZE_MAX)
> > + return -ENOMEM;
> > +
> > + bitmap &= PAGE_MASK;
> We're page-aligning the result of an mmap()?  Why?

This may not be necessary.  The lower bits of MSR_IA32_U_CET are
settings and not part of the bitmap address.  Is it safer this way?
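
If we keep the masking, I can add a comment so the intent is clear, e.g.:

        /*
         * The lower bits of MSR_IA32_U_CET are control bits, not part
         * of the bitmap address.  do_mmap() already returns a
         * page-aligned address, so this is only defensive.
         */
        bitmap &= PAGE_MASK;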

>
> >
> > + rdmsrl(MSR_IA32_U_CET, r);
> > + r |= (MSR_IA32_CET_LEG_IW_EN | bitmap);
> > + wrmsrl(MSR_IA32_U_CET, r);
> Comments, please.  What is this doing, logically?  Also, why are we
> OR'ing the results into this MSR?  What are we trying to preserve?

I will add comments.
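
Roughly along these lines (assuming the intent is to keep the enable
bits already set in the MSR and only add the legacy-code handling):

        /*
         * Enable legacy compatibility treatment (MSR_IA32_CET_LEG_IW_EN)
         * and install the legacy code bitmap address in the upper bits,
         * preserving whatever enable bits are already set in
         * MSR_IA32_U_CET.
         */
        rdmsrl(MSR_IA32_U_CET, r);
        r |= (MSR_IA32_CET_LEG_IW_EN | bitmap);
        wrmsrl(MSR_IA32_U_CET, r);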

>
> >
> > + current->thread.cet.ibt_bitmap_addr = bitmap;
> > + current->thread.cet.ibt_bitmap_size = size;
> > + return 0;
> > +}
> > +
> > +void cet_disable_ibt(void)
> > +{
> > + u64 r;
> > +
> > + if (!cpu_feature_enabled(X86_FEATURE_IBT))
> > + return;
> Does this need a check for being already disabled?

We need that check.  We cannot write to those MSRs if the CPU does not
support them.
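
If the question is about the task itself already having IBT disabled, we
could also bail out early, for example:

        if (!current->thread.cet.ibt_enabled)
                return;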

>
> >
> > + rdmsrl(MSR_IA32_U_CET, r);
> > + r &= ~(MSR_IA32_CET_ENDBR_EN | MSR_IA32_CET_LEG_IW_EN |
> > +        MSR_IA32_CET_NO_TRACK_EN);
> > + wrmsrl(MSR_IA32_U_CET, r);
> > + current->thread.cet.ibt_enabled = 0;
> > +}
> What's the locking for current->thread.cet?

Currently, the CET state is not locked until the application calls
ARCH_CET_LOCK.

>
> >
> > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> > index 705467839ce8..c609c9ce5691 100644
> > --- a/arch/x86/kernel/cpu/common.c
> > +++ b/arch/x86/kernel/cpu/common.c
> > @@ -413,7 +413,8 @@ __setup("nopku", setup_disable_pku);
> >
> >  static __always_inline void setup_cet(struct cpuinfo_x86 *c)
> >  {
> > - if (cpu_feature_enabled(X86_FEATURE_SHSTK))
> > + if (cpu_feature_enabled(X86_FEATURE_SHSTK) ||
> > +     cpu_feature_enabled(X86_FEATURE_IBT))
> >   cr4_set_bits(X86_CR4_CET);
> >  }
> >
> > @@ -434,6 +435,23 @@ static __init int setup_disable_shstk(char *s)
> >  __setup("no_cet_shstk", setup_disable_shstk);
> >  #endif
> >
> > +#ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER
> > +static __init int setup_disable_ibt(char *s)
> > +{
> > + /* require an exact match without trailing characters */
> > + if (strlen(s))
> > + return 0;
> > +
> > + if (!boot_cpu_has(X86_FEATURE_IBT))
> > + return 1;
> > +
> > + setup_clear_cpu_cap(X86_FEATURE_IBT);
> > + pr_info("x86: 'no_cet_ibt' specified, disabling Branch Tracking\n");
> > + return 1;
> > +}
> > +__setup("no_cet_ibt", setup_disable_ibt);
> > +#endif
> >  /*
> >   * Some CPU features depend on higher CPUID levels, which may not always
> >   * be available due to CPUID level capping or broken virtualization
> > diff --git a/arch/x86/kernel/elf.c b/arch/x86/kernel/elf.c
> > index 233f6dad9c1f..42e08d3b573e 100644
> > --- a/arch/x86/kernel/elf.c
> > +++ b/arch/x86/kernel/elf.c
> > @@ -15,6 +15,7 @@
> >  #include <linux/fs.h>
> >  #include <linux/uaccess.h>
> >  #include <linux/string.h>
> > +#include <linux/compat.h>
> >
> >  /*
> >   * The .note.gnu.property layout:
> > @@ -222,7 +223,8 @@ int arch_setup_features(void *ehdr_p, void *phdr_p,
> >
> >   struct elf64_hdr *ehdr64 = ehdr_p;
> >
> > - if (!cpu_feature_enabled(X86_FEATURE_SHSTK))
> > + if (!cpu_feature_enabled(X86_FEATURE_SHSTK) &&
> > +     !cpu_feature_enabled(X86_FEATURE_IBT))
> >   return 0;
> >
> >   if (ehdr64->e_ident[EI_CLASS] == ELFCLASS64) {
> > @@ -250,6 +252,9 @@ int arch_setup_features(void *ehdr_p, void *phdr_p,
> >   current->thread.cet.shstk_enabled = 0;
> >   current->thread.cet.shstk_base = 0;
> >   current->thread.cet.shstk_size = 0;
> > + current->thread.cet.ibt_enabled = 0;
> > + current->thread.cet.ibt_bitmap_addr = 0;
> > + current->thread.cet.ibt_bitmap_size = 0;
> >   if (cpu_feature_enabled(X86_FEATURE_SHSTK)) {
> >   if (shstk) {
> >   err = cet_setup_shstk();
> > @@ -257,6 +262,15 @@ int arch_setup_features(void *ehdr_p, void *phdr_p,
> >   goto out;
> >   }
> >   }
> > +
> > + if (cpu_feature_enabled(X86_FEATURE_IBT)) {
> > + if (ibt) {
> > + err = cet_setup_ibt();
> > + if (err < 0)
> > + goto out;
> > + }
> > + }
> You introduced 'ibt' before it was used.  Please wait to introduce it
> until you actually use it to make it easier to review.
>
> Also, what's wrong with:
>
> if (cpu_feature_enabled(X86_FEATURE_IBT) && ibt) {
> ...
> }
>
> ?

I will fix it.
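
I.e. introduce 'ibt' in this patch and merge the conditions, something
like:

        if (cpu_feature_enabled(X86_FEATURE_IBT) && ibt) {
                err = cet_setup_ibt();
                if (err < 0)
                        goto out;
        }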