Re: [PATCH -tip] kcov: Make runtime functions noinstr-compatible

From: Marco Elver
Date: Thu Jun 04 2020 - 10:23:53 EST


On Thu, 4 Jun 2020 at 16:03, Andrey Konovalov <andreyknvl@xxxxxxxxxx> wrote:
>
> On Thu, Jun 4, 2020 at 1:09 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Thu, Jun 04, 2020 at 11:50:57AM +0200, Marco Elver wrote:
> > > > The KCOV runtime is very minimal, only updating a field in 'current',
> > > > and none of the __sanitizer_cov-functions generates reports or calls
> > > > any other external functions.
> >
> > > Not quite true; it writes to t->kcov_area, and we need to make
> > > absolutely sure that doesn't take faults or trigger anything else
> > > untoward.
> >
> > > Therefore we can make the KCOV runtime noinstr-compatible by:
> > >
> > > 1. always-inlining internal functions and marking
> > > __sanitizer_cov-functions noinstr. The function write_comp_data() is
> > > > now guaranteed to be inlined into __sanitizer_cov_trace_*cmp()
> > > functions, which saves a call in the fast-path and reduces stack
> > > pressure due to the first argument being a constant.
>
> Maybe we could do CFLAGS_REMOVE_kcov.o = $(CC_FLAGS_FTRACE) the same
> way we do it for KASAN? And drop notrace/noinstr from kcov. Would it
> resolve the issue? I'm not sure which solution is better though.

Sadly no. 'noinstr' implies 'notrace', but it also places the function
in the .noinstr.text section so that objtool can check it. But we
should only mark a function 'noinstr' if it (and all of its callees)
satisfy the requirements that Peter outlined (are the requirements
documented somewhere?). In particular, we need to worry about vmalloc
faults.
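
To make the section placement concrete, here is a rough userspace
sketch, not the kernel's actual definition. It assumes GCC or Clang
targeting ELF. The names noinstr_model, entry_path_model,
regular_path_model and addr_in_noinstr_model are made up for
illustration, and the section is called "noinstr_text" only because the
linker generates __start_/__stop_ symbols solely for section names that
are valid C identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace approximation of a noinstr-style annotation (an assumption,
 * not the kernel macro): never inline, never instrument, and place the
 * function in a dedicated section that a tool can inspect.
 */
#define noinstr_model \
	__attribute__((noinline, no_instrument_function, section("noinstr_text")))

/* Bounds of the "noinstr_text" section, provided by the linker. */
extern char __start_noinstr_text[], __stop_noinstr_text[];

/* Hypothetical function that must not be instrumented. */
noinstr_model int entry_path_model(int x)
{
	return x + 1;
}

/* An ordinary function, left in the default text section. */
__attribute__((noinline)) int regular_path_model(int x)
{
	return x + 2;
}

/* True if addr lies inside the modelled noinstr section. */
bool addr_in_noinstr_model(uintptr_t addr)
{
	return addr >= (uintptr_t)__start_noinstr_text &&
	       addr < (uintptr_t)__stop_noinstr_text;
}
```

The point is only that the annotation changes where the code lives, so
a checker can tell by address which functions carry it. It says nothing
about whether the function's callees and the data it touches are safe.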

[...]
> > > -static void notrace write_comp_data(u64 type, u64 arg1, u64 arg2, u64 ip)
> > > +static __always_inline void write_comp_data(u64 type, u64 arg1, u64 arg2, u64 ip)
> > > {
> > > struct task_struct *t;
> > > u64 *area;
> > > @@ -231,59 +231,59 @@ static void notrace write_comp_data(u64 type, u64 arg1, u64 arg2, u64 ip)
> > > }
> > > }
> >
> > This thing; that appears to be the meat of it, right?
> >
> > I can't find where t->kcov_area comes from.. is that always
> > kcov_mmap()'s vmalloc_user() ?
> >
> > That whole kcov_remote stuff confuses me.
> >
> > KCOV_ENABLE() has kcov_fault_in_area(), which supposedly takes the
> > vmalloc faults for the current task, but who does it for the remote?
>
> Hm, no one. This might be an issue, thanks for noticing!
>
> > Now, luckily Joerg went and ripped out the vmalloc faults, let me check
> > where those patches are... w00t, they're upstream in this merge window.
>
> Could you point me to those patches?
>
> Even though it might work fine now, we might get issues if we backport
> remote kcov to older kernels.
>
> >
> > So no #PF from writing to t->kcov_area then, under the assumption that
> > the vmalloc_user() is the only allocation site.
> >
> > But then there's hardware watchpoints, if someone goes and sets a data
> > watchpoint in the kcov_area we're screwed. Nothing actively prevents
> > that from happening. Then again, the same is currently true for much of
> > current :/
> >
> > Also, I think you need __always_inline on kaslr_offset()
> >
> >
> > And, unrelated to this patch in specific, I suppose I'm going to have to
> > extend objtool to look for data that is used from noinstr, to make sure
> > we exclude it from inspection and stuff, like that kaslr offset crud for
> > example.
> >
> > Anyway, yes, it appears you're lucky (for having Joerg remove vmalloc
> > faults) and this mostly should work as is.

Now I am a bit worried: even though we're lucky today, given what
Andrey said about e.g. kcov_remote faults, it'll be hard to ensure we
won't break in the future. The exact set of conditions that make us
lucky today may change, and we have no way of checking this.

I'll try to roll a v2 based on the "if (_RET_IP_ in noinstr section)
return;" and whitelist in objtool approach. Unless you see something
very wrong with that. And I do hope we'll get compiler attributes
eventually.
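
Roughly what I have in mind, as a standalone userspace model rather
than the actual v2. The struct text_range, ip_in_range() and
trace_pc_model() names are hypothetical, and the caller's return address
is passed in explicitly instead of using _RET_IP_. The area layout
(word 0 holds the number of recorded PCs) loosely follows how the
coverage area is used today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the linker-provided bounds of .noinstr.text. */
struct text_range {
	uintptr_t start;	/* first byte of the section */
	uintptr_t end;		/* one past the last byte */
};

/* True if ip falls inside [start, end). */
static inline bool ip_in_range(const struct text_range *r, uintptr_t ip)
{
	return ip >= r->start && ip < r->end;
}

/*
 * Model of a coverage hook: bail out early if the caller lives in the
 * noinstr range, otherwise append the caller's address to the area.
 * area[0] counts the entries stored in area[1..area_words-1].
 * Returns true if the address was recorded.
 */
static bool trace_pc_model(const struct text_range *noinstr,
			   uintptr_t ret_ip,
			   uint64_t *area, uint64_t area_words)
{
	uint64_t pos;

	assert(area_words > 0);

	/* The early return: never touch the area on behalf of noinstr code. */
	if (ip_in_range(noinstr, ret_ip))
		return false;

	pos = area[0] + 1;
	if (pos >= area_words)
		return false;	/* area is full, drop the sample */

	area[pos] = ret_ip;
	area[0] = pos;
	return true;
}
```

With this shape the hook itself still gets called from noinstr code,
which is why objtool would need to whitelist it. The check only
guarantees that the coverage area is not written in that case.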

Thanks,
-- Marco