Re: [thomas@xxxxxxxx: Re: [PATCH] um: Fix kcov crash before kernel is started.]

From: Dmitry Vyukov
Date: Mon Oct 09 2017 - 14:11:14 EST


On Mon, Oct 9, 2017 at 6:47 PM, Thomas Meyer <thomas@xxxxxxxx> wrote:
> ----- Forwarded message from Thomas Meyer <thomas@xxxxxxxx> -----
>
> Hi,
>
> are you able to shed light on this topic?
> Any help is greatly appreciated!
>
> With kind regards
> thomas
>
> Date: Sun, 8 Oct 2017 13:18:24 +0200
> From: Thomas Meyer <thomas@xxxxxxxx>
> To: Richard Weinberger <richard@xxxxxx>
> Cc: user-mode-linux-devel@xxxxxxxxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH] um: Fix kcov crash before kernel is started.
> User-Agent: NeoMutt/20170113 (1.7.2)
>
> On Sun, Oct 08, 2017 at 12:44:12PM +0200, Richard Weinberger wrote:
>> Am Sonntag, 8. Oktober 2017, 12:31:58 CEST schrieb Thomas Meyer:
>> > UML's current_thread_info() unconditionally assumes that the top of the stack
>> > contains the thread_info structure. But on UML the __sanitizer_cov_trace_pc
>> > function is called for *all* functions! This results in an early crash:
>> >
>> > Prevent kcov from using invalid current_thread_info() data by checking
>> > the system_state.
>> >
>> > Signed-off-by: Thomas Meyer <thomas@xxxxxxxx>
>> > ---
>> > kernel/kcov.c | 6 ++++++
>> > 1 file changed, 6 insertions(+)
>> >
>> > diff --git a/kernel/kcov.c b/kernel/kcov.c
>> > index 3f693a0f6f3e..d601c0e956f6 100644
>> > --- a/kernel/kcov.c
>> > +++ b/kernel/kcov.c
>> > @@ -56,6 +56,12 @@ void notrace __sanitizer_cov_trace_pc(void)
>> > struct task_struct *t;
>> > enum kcov_mode mode;
>> >
>> > +#ifdef CONFIG_UML
>> > + if(!(system_state == SYSTEM_SCHEDULING ||
>> > + system_state == SYSTEM_RUNNING))
>> > + return;
>> > +#endif
>>
>> Hmm, and why does it work on all other archs then?
>
> Hi,
>
> I guess UML is different than other archs! But to be honest I'm not sure
> why. I assume that __sanitizer_cov_trace_pc on other archs isn't called
> that early, or that current_thread_info() returns NULL on other archs when
> the first task isn't running yet.
>
> But as I failed to set up the qemu gdb attachment to debug early x86_64
> code, I can't say exactly why.
>
> Maybe someone who knows the inner workings of x86_64 and/or kcov can
> answer this question!


Hi,

Yes, kcov can have issues with early bootstrap code, because it
accesses current, and it can also conflict with, say, per-cpu setup code
(at least that was the case for x86). For x86 and arm64 we just bulk
blacklist instrumentation of the arch code involved in early bootstrap.
See e.g. KCOV_INSTRUMENT in arch/x86/boot/Makefile. I think you need
to do the same for um. Start by bulk ignoring as much as possible
until you get it booting, and then bisect back from there.
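
As a rough, untested sketch of what that could look like: the Makefile
location and the object name below are only assumptions for illustration,
and which um directories actually need it is what the bisection would
have to find out.

```make
# Hypothetical starting point, e.g. in arch/um/kernel/Makefile (assumed
# location): exclude every object built from this directory from kcov
# instrumentation.
KCOV_INSTRUMENT := n

# Later, while bisecting back, drop the directory-wide line above and
# exclude single objects instead. The object name is only an example:
# KCOV_INSTRUMENT_um_arch.o := n
```

The per-directory form is the quickest way to get to a booting kernel;
the per-object form keeps coverage for everything else once the
offending files are known.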