Re: WARN_ON_ONCE

From: Michael Ellerman
Date: Sat Dec 05 2020 - 08:51:45 EST


Alexey Kardashevskiy <aik@xxxxxxxxx> writes:
> On 04/12/2020 12:25, Michael Ellerman wrote:
>> Dmitry Vyukov <dvyukov@xxxxxxxxxx> writes:
>>> On Thu, Dec 3, 2020 at 10:19 AM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>>>> On Thu, Dec 3, 2020 at 10:10 AM Alexey Kardashevskiy <aik@xxxxxxxxx> wrote:
>>>>>
>>>>> Hi!
>>>>>
>>>>> Syzkaller triggered WARN_ON_ONCE at
>>>>>
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/tracepoint.c?h=v5.10-rc6#n266
>>>>>
>>>>>
>>>>> ===
>>>>> static int tracepoint_add_func(struct tracepoint *tp,
>>>>> struct tracepoint_func *func, int prio)
>>>>> {
>>>>> struct tracepoint_func *old, *tp_funcs;
>>>>> int ret;
>>>>>
>>>>> if (tp->regfunc && !static_key_enabled(&tp->key)) {
>>>>> ret = tp->regfunc();
>>>>> if (ret < 0)
>>>>> return ret;
>>>>> }
>>>>>
>>>>> tp_funcs = rcu_dereference_protected(tp->funcs,
>>>>> lockdep_is_held(&tracepoints_mutex));
>>>>> old = func_add(&tp_funcs, func, prio);
>>>>> if (IS_ERR(old)) {
>>>>> WARN_ON_ONCE(PTR_ERR(old) != -ENOMEM);
>>>>> return PTR_ERR(old);
>>>>> }
>>>>>
>>>>> ===
>>>>>
>>>>> What is the common approach here? Syzkaller reacts on this as if it was
>>>>> a bug but WARN_ON_ONCE here seems intentional. Do we still push for
>>>>> removing such warnings?
>>
>> AFAICS it is a bug if that fires.
>>
>> See the commit that added it:
>> d66a270be331 ("tracepoint: Do not warn on ENOMEM")
>>
>> Which says:
>> Tracepoint should only warn when a kernel API user does not respect the
>> required preconditions (e.g. same tracepoint enabled twice,
>
> This says that the userspace can trigger the warning if it does not use
> the API right.

No I don't think it says that.

It's saying that it should be a WARN if a *kernel* user of the
tracepoint API violates the API. The implication is that this condition
should never happen if the kernel is using the tracepoint API correctly,
and so if we hit this condition it indicates a bug in the kernel that
should be fixed.

>> or called
>> to remove a tracepoint that does not exist).
>>
>> Silence warning in out-of-memory conditions, given that the error is
>> returned to the caller.
>>
>>
>> So if you're seeing it then you've someone caused it to return something
>> other than ENOMEM, and that is a bug.
>
> This is an userspace bug which registers the same thing twice, the
> kernel returns a correct error. The question is should it warn by
> WARN_ON or pr_err(). The comment in bug.h suggests pr_err() is the right
> way, is not it?

Userspace must not be able to trigger a WARN.

What is the path into that code from userspace?

Either something on that path should be checking that it's not violating
the API and triggering the WARN, or if that's not possible/easy then the
WARN should be removed.

cheers