Re: [PATCH for 4.19] tracepoint: Fix: out-of-bound tracepoint array iteration

From: Mathieu Desnoyers
Date: Sat Oct 13 2018 - 15:11:15 EST


----- On Oct 13, 2018, at 2:34 PM, Mathieu Desnoyers mathieu.desnoyers@xxxxxxxxxxxx wrote:

> ----- On Oct 13, 2018, at 11:24 AM, Ard Biesheuvel ard.biesheuvel@xxxxxxxxxx
> wrote:
>
>> On 12 October 2018 at 23:07, Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> wrote:
>>> Hi Mathieu,
>>>
>>> On 12 October 2018 at 22:05, Mathieu Desnoyers
>>> <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
>>>> commit 46e0c9be206f ("kernel: tracepoints: add support for relative
>>>> references") changes the layout of the __tracepoint_ptrs section on
>>>> architectures supporting relative references. However, it does so
>>>> without turning struct tracepoint * const into const int * elsewhere in
>>>> the tracepoint code, which has the following side-effect:
>>>>
>>>> tracepoint_module_{coming,going} invoke
>>>> tp_module_going_check_quiescent() with mod->tracepoints_ptrs
>>>> as first argument, and computes the end address of the array
>>>> for the second argument with:
>>>>
>>>> mod->tracepoints_ptrs + mod->num_tracepoints
>>>>
>>>> However, because the type of mod->tracepoints_ptrs in module.h
>>>> has not been changed from pointer to int, it passes an end
>>>> pointer which is twice larger than the array, causing out-of-bound
>>>> array accesses.
>>>>
>>>> Fix this by introducing a new typedef: tracepoint_ptr_t, which
>>>> is either "const int" on architectures that have PREL32 relocations,
>>>> or "struct tracepoint * const" on architectures that do not have
>>>> this feature.
>>>>
>>>> Also provide a new tracepoint_ptr_deref() static inline to
>>>> encapsulate dereferencing this type rather than duplicate code and
>>>> ugly ifdefs within the for_each_tracepoint_range() implementation.
>>>>
>>>
>>> Apologies for the breakage. FWIW, this looks like the correct approach
>>> to me (and mirrors what I did for initcalls in the same series)
>>>
>>>> This issue appears in 4.19-rc kernels, and should ideally be fixed
>>>> before the end of the rc cycle.
>>>>
>>>
>>> +1
>>>
>>>> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
>>>> Link: http://lkml.kernel.org/r/20180704083651.24360-7-ard.biesheuvel@xxxxxxxxxx
>>>> Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
>>>> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
>>>> Cc: Steven Rostedt (VMware) <rostedt@xxxxxxxxxxx>
>>>> Cc: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
>>>> Cc: Arnd Bergmann <arnd@xxxxxxxx>
>>>> Cc: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
>>>> Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
>>>> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
>>>> Cc: James Morris <james.morris@xxxxxxxxxxxxx>
>>>> Cc: James Morris <jmorris@xxxxxxxxx>
>>>> Cc: Jessica Yu <jeyu@xxxxxxxxxx>
>>>> Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
>>>> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
>>>> Cc: Nicolas Pitre <nico@xxxxxxxxxx>
>>>> Cc: Paul Mackerras <paulus@xxxxxxxxx>
>>>> Cc: Petr Mladek <pmladek@xxxxxxxx>
>>>> Cc: Russell King <linux@xxxxxxxxxxxxxxx>
>>>> Cc: "Serge E. Hallyn" <serge@xxxxxxxxxx>
>>>> Cc: Sergey Senozhatsky <sergey.senozhatsky@xxxxxxxxx>
>>>> Cc: Thomas Garnier <thgarnie@xxxxxxxxxx>
>>>> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>>>> Cc: Will Deacon <will.deacon@xxxxxxx>
>>>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>>>> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>>>> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
>>>
>>> Acked-by: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
>>>
>>
>> This fixes the build breakage for me that kbuild test robot reports.
>>
>> diff --git a/include/linux/module.h b/include/linux/module.h
>> index cdab2451d6be..e19ae08c7fb8 100644
>> --- a/include/linux/module.h
>> +++ b/include/linux/module.h
>> @@ -20,6 +20,7 @@
>> #include <linux/export.h>
>> #include <linux/rbtree_latch.h>
>> #include <linux/error-injection.h>
>> +#include <linux/tracepoint-defs.h>
>
> You've beat me to it :) I'll fold this change in a v2 of the patch,

Digging a bit deeper into module.c, I notice it's not really an
out-of-bound access that is generated by this issue, because
mod->num_tracepoints is set in module.c like this:

	mod->tracepoints_ptrs = section_objs(info, "__tracepoints_ptrs",
					     sizeof(*mod->tracepoints_ptrs),
					     &mod->num_tracepoints);

So basically, since sizeof(*mod->tracepoints_ptrs) is the size of a
pointer (rather than sizeof(int)), num_tracepoints is erroneously set to
half the value it should have on 64-bit architectures. So with an odd
number of tracepoints, we lose the last tracepoint because the integer
division rounds down.
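
To make the arithmetic concrete, here is a minimal user-space sketch.
It assumes that the count is simply the section size in bytes divided
by the object size passed in, rounded down. The names count_objs() and
counted_tracepoints() are hypothetical and exist only for this
illustration; they are not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the count derived from a section:
 * size of the section in bytes divided by the per-object size,
 * rounded down by integer division.
 */
static unsigned int count_objs(size_t section_bytes, size_t object_size)
{
	assert(object_size > 0);
	return (unsigned int)(section_bytes / object_size);
}

/*
 * Count obtained for nr_entries 4-byte relative references when the
 * divisor is elem_size (8 models a 64-bit pointer, 4 models an int).
 */
static unsigned int counted_tracepoints(unsigned int nr_entries,
					size_t elem_size)
{
	return count_objs((size_t)nr_entries * 4, elem_size);
}
```

Under these assumptions, counted_tracepoints(5, 8) gives 2 where
counted_tracepoints(5, 4) gives 5: five 4-byte entries occupy 20
bytes, and 20 / 8 rounds down to 2.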

So in the module going notifier:

	for_each_tracepoint_range(mod->tracepoints_ptrs,
		mod->tracepoints_ptrs + mod->num_tracepoints,
		tp_module_going_check_quiescent, NULL);

the expression (mod->tracepoints_ptrs + mod->num_tracepoints) actually
evaluates to an address within the bounds of the array, but the
iteration misses the last tracepoint if the number of tracepoints is
odd on a 64-bit architecture.
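
As a rough model of the range the loop covers, the sketch below
computes how many 4-byte entries fall before the end pointer when the
pointer arithmetic uses a declared element size that differs from the
real entry size. ENTRY_SIZE and visited_entries() are hypothetical
names used only here, and the model assumes the loop itself steps over
4-byte entries, as the PREL32 branch of for_each_tracepoint_range()
does.

```c
#include <assert.h>
#include <stddef.h>

#define ENTRY_SIZE 4	/* each relative reference is a 32-bit offset */

/*
 * Number of 4-byte entries lying in [begin, begin + num), where both
 * num and the pointer arithmetic use declared_elem_size.
 */
static unsigned int visited_entries(unsigned int nr_entries,
				    size_t declared_elem_size)
{
	size_t section_bytes = (size_t)nr_entries * ENTRY_SIZE;
	size_t num, end_offset;

	assert(declared_elem_size >= ENTRY_SIZE);

	/* Models num_tracepoints: section size / declared element size. */
	num = section_bytes / declared_elem_size;

	/* Models (begin + num) as a byte offset from begin. */
	end_offset = num * declared_elem_size;

	/* The end pointer never goes past the end of the section. */
	assert(end_offset <= section_bytes);

	return (unsigned int)(end_offset / ENTRY_SIZE);
}
```

With a declared element size of 8, visited_entries(4, 8) is 4 but
visited_entries(5, 8) is also 4, so only the last entry of an
odd-sized array is skipped. With the correct size of 4, every entry is
visited.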

So I'll also update the patch changelog in v2. Given it does not change
the patch content, I'll keep your acked-by. Please let me know if you
spot anything.

Thanks,

Mathieu



>
> Thanks!
>
> Mathieu
>
>>
>> #include <linux/percpu.h>
>> #include <asm/module.h>
>>
>>
>>
>>
>>
>>>> ---
>>>> include/linux/module.h | 2 +-
>>>> include/linux/tracepoint-defs.h | 6 ++++++
>>>> include/linux/tracepoint.h | 36 +++++++++++++++++++++------------
>>>> kernel/tracepoint.c | 24 ++++++++--------------
>>>> 4 files changed, 38 insertions(+), 30 deletions(-)
>>>>
>>>> diff --git a/include/linux/module.h b/include/linux/module.h
>>>> index f807f15bebbe..cdab2451d6be 100644
>>>> --- a/include/linux/module.h
>>>> +++ b/include/linux/module.h
>>>> @@ -430,7 +430,7 @@ struct module {
>>>>
>>>> #ifdef CONFIG_TRACEPOINTS
>>>> unsigned int num_tracepoints;
>>>> - struct tracepoint * const *tracepoints_ptrs;
>>>> + tracepoint_ptr_t *tracepoints_ptrs;
>>>> #endif
>>>> #ifdef HAVE_JUMP_LABEL
>>>> struct jump_entry *jump_entries;
>>>> diff --git a/include/linux/tracepoint-defs.h b/include/linux/tracepoint-defs.h
>>>> index 22c5a46e9693..49ba9cde7e4b 100644
>>>> --- a/include/linux/tracepoint-defs.h
>>>> +++ b/include/linux/tracepoint-defs.h
>>>> @@ -35,6 +35,12 @@ struct tracepoint {
>>>> struct tracepoint_func __rcu *funcs;
>>>> };
>>>>
>>>> +#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
>>>> +typedef const int tracepoint_ptr_t;
>>>> +#else
>>>> +typedef struct tracepoint * const tracepoint_ptr_t;
>>>> +#endif
>>>> +
>>>> struct bpf_raw_event_map {
>>>> struct tracepoint *tp;
>>>> void *bpf_func;
>>>> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
>>>> index 041f7e56a289..538ba1a58f5b 100644
>>>> --- a/include/linux/tracepoint.h
>>>> +++ b/include/linux/tracepoint.h
>>>> @@ -99,6 +99,29 @@ extern void syscall_unregfunc(void);
>>>> #define TRACE_DEFINE_ENUM(x)
>>>> #define TRACE_DEFINE_SIZEOF(x)
>>>>
>>>> +#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
>>>> +static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
>>>> +{
>>>> + return offset_to_ptr(p);
>>>> +}
>>>> +
>>>> +#define __TRACEPOINT_ENTRY(name) \
>>>> + asm(" .section \"__tracepoints_ptrs\", \"a\" \n" \
>>>> + " .balign 4 \n" \
>>>> + " .long __tracepoint_" #name " - . \n" \
>>>> + " .previous \n")
>>>> +#else
>>>> +static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
>>>> +{
>>>> + return *p;
>>>> +}
>>>> +
>>>> +#define __TRACEPOINT_ENTRY(name) \
>>>> + static tracepoint_ptr_t __tracepoint_ptr_##name __used \
>>>> + __attribute__((section("__tracepoints_ptrs"))) = \
>>>> + &__tracepoint_##name
>>>> +#endif
>>>> +
>>>> #endif /* _LINUX_TRACEPOINT_H */
>>>>
>>>> /*
>>>> @@ -253,19 +276,6 @@ extern void syscall_unregfunc(void);
>>>> return static_key_false(&__tracepoint_##name.key); \
>>>> }
>>>>
>>>> -#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
>>>> -#define __TRACEPOINT_ENTRY(name) \
>>>> - asm(" .section \"__tracepoints_ptrs\", \"a\" \n" \
>>>> - " .balign 4 \n" \
>>>> - " .long __tracepoint_" #name " - . \n" \
>>>> - " .previous \n")
>>>> -#else
>>>> -#define __TRACEPOINT_ENTRY(name) \
>>>> - static struct tracepoint * const __tracepoint_ptr_##name __used \
>>>> - __attribute__((section("__tracepoints_ptrs"))) = \
>>>> - &__tracepoint_##name
>>>> -#endif
>>>> -
>>>> /*
>>>> * We have no guarantee that gcc and the linker won't up-align the tracepoint
>>>> * structures, so we create an array of pointers that will be used for iteration
>>>> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
>>>> index bf2c06ef9afc..a3be42304485 100644
>>>> --- a/kernel/tracepoint.c
>>>> +++ b/kernel/tracepoint.c
>>>> @@ -28,8 +28,8 @@
>>>> #include <linux/sched/task.h>
>>>> #include <linux/static_key.h>
>>>>
>>>> -extern struct tracepoint * const __start___tracepoints_ptrs[];
>>>> -extern struct tracepoint * const __stop___tracepoints_ptrs[];
>>>> +extern tracepoint_ptr_t __start___tracepoints_ptrs[];
>>>> +extern tracepoint_ptr_t __stop___tracepoints_ptrs[];
>>>>
>>>> DEFINE_SRCU(tracepoint_srcu);
>>>> EXPORT_SYMBOL_GPL(tracepoint_srcu);
>>>> @@ -371,25 +371,17 @@ int tracepoint_probe_unregister(struct tracepoint *tp,
>>>> void *probe, void *data)
>>>> }
>>>> EXPORT_SYMBOL_GPL(tracepoint_probe_unregister);
>>>>
>>>> -static void for_each_tracepoint_range(struct tracepoint * const *begin,
>>>> - struct tracepoint * const *end,
>>>> +static void for_each_tracepoint_range(
>>>> + tracepoint_ptr_t *begin, tracepoint_ptr_t *end,
>>>> void (*fct)(struct tracepoint *tp, void *priv),
>>>> void *priv)
>>>> {
>>>> + tracepoint_ptr_t *iter;
>>>> +
>>>> if (!begin)
>>>> return;
>>>> -
>>>> - if (IS_ENABLED(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS)) {
>>>> - const int *iter;
>>>> -
>>>> - for (iter = (const int *)begin; iter < (const int *)end; iter++)
>>>> - fct(offset_to_ptr(iter), priv);
>>>> - } else {
>>>> - struct tracepoint * const *iter;
>>>> -
>>>> - for (iter = begin; iter < end; iter++)
>>>> - fct(*iter, priv);
>>>> - }
>>>> + for (iter = begin; iter < end; iter++)
>>>> + fct(tracepoint_ptr_deref(iter), priv);
>>>> }
>>>>
>>>> #ifdef CONFIG_MODULES
>>>> --
>>>> 2.17.1
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com