Re: linux-next: runtime warning after merge of the cel-fixes tree

From: Chuck Lever III
Date: Thu Apr 07 2022 - 12:04:50 EST




> On Apr 7, 2022, at 11:42 AM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Thu, 7 Apr 2022 15:35:24 +0000
> Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
>
>>> --- a/kernel/trace/trace_events.c
>>> +++ b/kernel/trace/trace_events.c
>>> @@ -392,8 +392,9 @@ static void test_event_printk(struct trace_event_call *call)
>>> if (!(dereference_flags & (1ULL << arg)))
>>> goto next_arg;
>>>
>>> - /* Check for __get_sockaddr */;
>>> - if (str_has_prefix(fmt + i, "__get_sockaddr(")) {
>>> + /* Check for __get_sockaddr or __get_dynamic_array */;
>>> + if (str_has_prefix(fmt + i, "__get_sockaddr(") ||
>>> + str_has_prefix(fmt + i, "__get_dynamic_array(")) {
>>> dereference_flags &= ~(1ULL << arg);
>>> goto next_arg;
>>> }
>>
>> That looks reasonable for present and future kernels.
>
> Actually, I found an issue with the above that will not fix it, but the fix
> to that is not that difficult (currently testing it this time).

If you'd like to test it yourself, the commit that hits this
check is:

https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git/commit/?h=for-rc&id=e2e917f8677da3a2c486c32a19964b3428024ce9


>> We're looking for a solution that can be back-ported to stable,
>> however, because the patch Mr. Rothwell had to revert is meant
>> to address a NULL pointer deref that was introduced several years
>> ago. (Otherwise I would have just used __get_sockaddr() and put
>> my pencil down).
>
> I can make a stable version of this patch, as __get_dynamic_array() has
> been around for a long time. We could even keep the check for
> "__get_sock_addr()" even though it does not exist in older kernels, but
> this is just a string check, so that doesn't matter.
>
>>
>> The simplest option is to take a brute-force approach and
>> convert the sockaddr to a presentation address with an snprintf()
>> call in the TP_fast_assign() arm of the tracepoints. Allow that
>> to be carried into -stable as needed; then in v5.19 apply a
>> clean-up that converts that mess into a proper __get_sockaddr().
>>
>> That way, it stays my issue to address without needing to
>> co-ordinate with fixes in other areas.
>
> This change makes the tracing system more robust, which means it's not just
> changing to solve an issue with your code. I'm happy to get this working
> and backported.

I'm not saying "don't bother fixing the deref check". Backporting
it if you can would be awesome!

But /depending/ on that fix makes fixing my NULL crash into a two-
patch cross-subsystem fix. We would have to document that carefully
so distributors can see that dependency if they need the NULL deref
fix in their kernels.

A narrow single patch fix for my NULL crash would be better for
them IMO, so I'm going to go with the snprintf() version of the
fix.


--
Chuck Lever