Re: [patch 00/38] x86/retbleed: Call depth tracking mitigation

From: Sami Tolvanen
Date: Mon Jul 18 2022 - 19:11:20 EST


On Mon, Jul 18, 2022 at 3:59 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> On Mon, Jul 18 2022 at 15:48, Sami Tolvanen wrote:
> > On Mon, Jul 18, 2022 at 2:18 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >>
> >> On Mon, Jul 18, 2022 at 10:44:14PM +0200, Thomas Gleixner wrote:
> >> > And we need input from the Clang folks because their CFI work also puts
> >> > stuff in front of the function entry, which nicely collides.
> >>
> >> Right, I need to go look at the latest kCFI patches, that sorta got
> >> side-tracked for working on all the retbleed muck :/
> >>
> >> Basically kCFI wants to preface every (indirect callable) function with:
> >>
> >> __cfi_\func:
> >> int3
> >> movl $0x12345678, %eax
> >> int3
> >> int3
> >> \func:
> >
> > Yes, and in order to avoid scattering the code with call target
> > gadgets, the preamble should remain immediately before the function.
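
(To make that concrete: with the exact preamble layout quoted above, the
check at an indirect call site would look roughly like the following.
This is only a sketch -- it assumes the target pointer is in %r11 and
that the hash immediate ends up 6 bytes before \func, not necessarily
what the compiler finally emits:

	movl	$0x12345678, %r10d	# expected type hash for this prototype
	cmpl	-6(%r11), %r10d		# hash immediate in the preamble before \func
	je	1f
	ud2				# type mismatch, trap
1:	call	*%r11			# or the retpoline thunk
)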
> >
> >> Ofc, we can still put the whole:
> >>
> >> sarq $5, PER_CPU_VAR(__x86_call_depth);
> >> jmp \func_direct
> >>
> >> thing in front of that.
> >
> > Sure, that would work.
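
I.e. the per-function text would end up looking something like this
(made-up label for the accounting stub, just to illustrate the ordering
being discussed, not actual compiler output):

__pad_\func:				# call depth accounting in the padding in front
	sarq	$5, PER_CPU_VAR(__x86_call_depth)
	jmp	\func			# over the preamble to the function entry
__cfi_\func:				# kCFI preamble, still immediately before the entry
	int3
	movl	$0x12345678, %eax
	int3
	int3
\func:
	...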
> >
> >> But it does somewhat destroy the version I had that only needs the
> >> 10 bytes padding for the sarq.
> >
> > There's also the question of how function alignment should work in the
> > KCFI case. Currently, the __cfi_ preamble is 16-byte aligned, which
> > obviously means the function itself isn't.
>
> That's bad. The function entry should be 16-byte aligned, and as I just
> learned, for AMD the ideal alignment would possibly even be 32 bytes as
> that's their I-fetch width. But my experiments show that 16-byte
> alignment, independent of the padding muck, is beneficial for both AMD
> and Intel over the 4-byte alignment we have right now.

OK, that's what I thought. KCFI hasn't landed in Clang yet, so it
shouldn't be a problem to change the alignment so that the function
entry itself ends up 16-byte aligned.
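
With the 8-byte preamble above, the intent would be a layout along these
lines (just a sketch; the filler obviously depends on the final preamble
size and on whatever padding the call depth tracking needs):

	.p2align 4
	.skip	8			# so the 8-byte preamble ends on the next 16-byte boundary
__cfi_\func:				# starts at 16n + 8
	int3
	movl	$0x12345678, %eax
	int3
	int3
\func:					# entry at 16n + 16, i.e. 16-byte aligned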

Sami