Re: [PATCH v2] init/gcov: allow CONFIG_CONSTRUCTORS on UML to fix module gcov

From: Peter Oberparleiter
Date: Fri May 14 2021 - 09:58:08 EST


I'm adding UML maintainers for their input.

On 08.05.2021 15:50, Lambert wrote:
>> On Jan 21, 2021, at 1:04 AM, Peter Oberparleiter <oberpar@xxxxxxxxxxxxx> wrote:
>> On 20.01.2021 17:20, Johannes Berg wrote:
>>> From: Johannes Berg <johannes.berg@xxxxxxxxx>
>>>
>>> On ARCH=um, loading a module doesn't result in its constructors
>>> getting called, which breaks module gcov since the debugfs files
>>> are never registered. On the other hand, in-kernel constructors
>>> have already been called by the dynamic linker, so we can't call
>>> them again.
>>>
>>> Get out of this conundrum by allowing CONFIG_CONSTRUCTORS to be
>>> selected, but avoiding the in-kernel constructor calls.
>>>
>>> Also remove the "if !UML" from GCOV selecting CONSTRUCTORS now,
>>> since we really do want CONSTRUCTORS, just not kernel binary
>>> ones.
>>>
>>> Signed-off-by: Johannes Berg <johannes.berg@xxxxxxxxx>
>>
>> Looks good+nicer than v1 to me!
>>
>> I also tested this patch on s390 and can confirm that it doesn't break
>> gcov-kernel there.
>>
>> Reviewed-by: Peter Oberparleiter <oberpar@xxxxxxxxxxxxx>
>>
>> Andrew, do we need additional reviews/sign-offs? Could this go in via
>> your tree?
>>
>>> ---
>>> Tested with a kernel configured with CONFIG_GCOV_KERNEL, without
>>> the patch nothing ever appears in /sys/kernel/debug/gcov/ (apart
>>> from the reset file), and with it we get the files and they work.
>>
>> Just to be sure: could you confirm that this test statement for UML also
>> applies to v2, i.e. that it fixes the original problem you were seeing?
>>
>
> Hi Johannes and Peter, sorry to bother but I have one question
> on this change. The do_ctors() won’t be executed for UML
> because *the constructors have already been called for ELF*.
>
> The init_array section for each C file is linked to *vmlinux* between
> *__ctors_start* and *__ctors_end* symbols. See link:
> https://elixir.bootlin.com/linux/v5.12.2/source/include/asm-generic/vmlinux.lds.h#L676
>
> In my environment, UML+GCC 10, I can't find __gcov_init executed
> before kernel starts. So I did some trace and found glibc __libc_csu_init
> will only execute constructors between *__init_array_start*and *__init_array_end*.
> Which means if do_ctors() is not executed for UML, no elsewhere will
> the constructors be executed.
>
> Shall we remove the *!defined(CONFIG_UML)* for GCC, or I just missed
> some steps to make the GCOV work for UML?

I'm not an expert for the UML-arch, but based on this report I did some
investigations. I think that this patch might be partially in error.

Before the patch no constructors were called for UML and as a result
gcov-kernel profiling was not available, neither for built-in code nor
for module code.

With the patch CONFIG_CONSTRUCTORS was made available on UML and with it
gcov-kernel profiling. This works well for kernel modules (which I
assume is what Johannes tested), but it fails for built-in code. Note
that since UML does not specify ARCH_HAS_GCOV_PROFILE_ALL you need to
manually add GCOV_PROFILE := y to any kernel Makefile if you want to
enable profiling for that built-in code.

The subject patch is based on the assumption that constructors for
built-in kernel code is called from glibc when the vmlinux executable is
started. This does not work the way it was assumed it would, and would
be broken if it did.

It does not work because, as was already pointed out, glibc is looking
for the __init_array_start symbol to locate the list of constructor
function pointers, but the kernel uses a different symbol name
(__ctors_start). If you add the __init_array_start and _end symbols
glibc will call these functions correctly, but the kernel will crash.

The reason for the crash is that the kernel implementation of
__gcov_init() which is called from the constructors emitted by GCC's
gcov-profiling code assumes that base kernel facilities (such as printk)
have been initialized and are working as expected. Of course this is not
the case when the function is called by glibc before any kernel init
function was run.

I see multiple ways to go forward:

0. Do nothing

This patch doesn't break anything - it's just only partially effective
in that in enables gcov-kernel profiling for modules, but not for
built-in code.


1. Revert this patch

=> gcov-kernel profiling for kernel modules on UML would no longer work


2. Enable do_ctors() on UML

=> gcov-kernel profiling would work on UML for both kernel modules and
built-in code. There's a risk though that other constructors unrelated
to gcov might be called too late (I saw at least one
register_tm_clones() call while looking at the .init_array section of an
on UML executable). This risk is specific to UML code that runs before
start_kernel(). I'm not sure if there is anything there that consciously
relies on constructors.


3. Change gcov_init() to work without relying on kernel facilities and
add init_array_start and _end symbols to KERNEL_CTORS in
include/asm-generic/vmlinux.lds.h so that glibc can find and run
constructors before start_kernel(). Not sure if this is easily possible.
Certainly this is the option that results in the most amount of work.

=> gcov-kernel profiling would work on UML, any other constructors for
code run before start_kernel() would work, but this would also affect
other kernel code that relies on constructors (currently KASAN)... so
might also have further unwanted side-effects.


Any thoughts, comments or ideas for other alternatives?

>
>>>
>>> v2:
>>> * special-case UML instead of splitting the CONSTRUCTORS config
>>> ---
>>> init/Kconfig | 1 -
>>> init/main.c | 8 +++++++-
>>> kernel/gcov/Kconfig | 2 +-
>>> 3 files changed, 8 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/init/Kconfig b/init/Kconfig
>>> index b77c60f8b963..29ad68325028 100644
>>> --- a/init/Kconfig
>>> +++ b/init/Kconfig
>>> @@ -76,7 +76,6 @@ config CC_HAS_ASM_INLINE
>>>
>>> config CONSTRUCTORS
>>> bool
>>> - depends on !UML
>>>
>>> config IRQ_WORK
>>> bool
>>> diff --git a/init/main.c b/init/main.c
>>> index c68d784376ca..a626e78dbf06 100644
>>> --- a/init/main.c
>>> +++ b/init/main.c
>>> @@ -1066,7 +1066,13 @@ asmlinkage __visible void __init __no_sanitize_address start_kernel(void)
>>> /* Call all constructor functions linked into the kernel. */
>>> static void __init do_ctors(void)
>>> {
>>> -#ifdef CONFIG_CONSTRUCTORS
>>> +/*
>>> + * For UML, the constructors have already been called by the
>>> + * normal setup code as it's just a normal ELF binary, so we
>>> + * cannot do it again - but we do need CONFIG_CONSTRUCTORS
>>> + * even on UML for modules.
>>> + */
>>> +#if defined(CONFIG_CONSTRUCTORS) && !defined(CONFIG_UML)
>>> ctor_fn_t *fn = (ctor_fn_t *) __ctors_start;
>>>
>>> for (; fn < (ctor_fn_t *) __ctors_end; fn++)
>>> diff --git a/kernel/gcov/Kconfig b/kernel/gcov/Kconfig
>>> index 3110c77230c7..f62de2dea8a3 100644
>>> --- a/kernel/gcov/Kconfig
>>> +++ b/kernel/gcov/Kconfig
>>> @@ -4,7 +4,7 @@ menu "GCOV-based kernel profiling"
>>> config GCOV_KERNEL
>>> bool "Enable gcov-based kernel profiling"
>>> depends on DEBUG_FS
>>> - select CONSTRUCTORS if !UML
>>> + select CONSTRUCTORS
>>> default n
>>> help
>>> This option enables gcov-based code profiling (e.g. for code coverage
>>>
>>
>>
>> --
>> Peter Oberparleiter
>> Linux on Z Development - IBM Germany


--
Peter Oberparleiter
Linux on Z Development - IBM Germany