Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

From: Tetsuo Handa
Date: Mon May 25 2020 - 06:44:05 EST


On 2020/05/25 17:42, Petr Mladek wrote:
> I see few drawbacks with this patch:
>
> 1. It will cause adding much more messages into the logbuffer even
> though they are not flushed to the console. It might cause that
> more important messages will get overridden before they reach
> console. They might also make hard to read the full log.

Since the user of this twist option will select console loglevel in a way
KERN_DEBUG messages are not printed to consoles, KERN_DEBUG messages will
be immediately processed (and space for future messages will be reclaimed).
Therefore, I don't think that more important messages will get overridden.

This twist option might increase possibility of mixing KERN_DEBUG messages
and non-KERN_DEBUG messages due to KERN_CONT case.

But if these concerns turn out to be a real problem, we can redirect
pr_devel()/pr_debug() to simple snprintf() which evaluates arguments
but discards the result without storing into the logbuffer.

>
> 2. Crash inside printk() causes recursive messages. They are currently
> printed into the printk_safe() buffers and there is a bigger risk
> that they will not reach the console.

Currently "static char textbuf[LOG_LINE_MAX];" is "static" because it is used
under logbuf_lock. If we remove "static", we can use "char textbuf[LOG_LINE_MAX];"
without logbuf_lock. Then, we can bring potentially dangerous-and-slow vscnprintf()
in vprintk_store() to earlier stage (and vprintk_store() will need to do simple
copy) so that oops in printk() will happen before entering printk-safe context.
I think that this change follows a direction which lockless logbuf will want.

>
> 3. pr_debug() messages are not printed by default. It is possible that
> nobody used them for ages. You might get many errors in less
> maintained code instead in the really used one. I mean that you
> will get more noise with less gain.

Given that potentially dangerous-and-slow vscnprintf() is done outside of
printk-safe context, we can get more test coverage without difficult things.

>
>
> Have you tested this patch by the syzcaller with many runs, please?
> Did it helped to actually discover more bugs?
> Did it really made things easier?

syzbot can't test with custom patches. The only way to test this patch is
to send to e.g. linux-next.git which syzbot is testing.

>
> I am not able to judge usefulness without more data. My intuition
> tells me that we should keep the number of syzcaller-related twists
> as small as possible. Otherwise, syscaller will diverge more and
> more from reality.

The twist options are not specific to syzkaller. Anyone can selectively
enable the twist options.



On 2020/05/25 18:11, Sergey Senozhatsky wrote:
> On (20/05/25 10:42), Petr Mladek wrote:
>> On Sun 2020-05-24 23:50:34, Tetsuo Handa wrote:
>>> syzbot found a NULL pointer dereference bug inside mptcp_recvmsg() due to
>>> ssock == NULL, but this bug manifested inside selinux_socket_recvmsg()
>>> because pr_debug() was no-op [1].
>>>
>>> pr_debug("fallback-read subflow=%p",
>>> mptcp_subflow_ctx(ssock->sk));
>>> copied = sock_recvmsg(ssock, msg, flags);
>>
>> The NULL pointer deference was found even without this patch.
>> This patch would just cause that it will manifest itself on another
>> place. What is the benefit, please?

It would help localizing the bug in this specific case.

It's not only about %p, even %d can crash kernel or leak sensitive
info (if it happens after-free/out-of-bounds/uninit). Overall it
increases code coverage and allows to catch more bugs earlier.

>
> Right, I don't get this patch. A NULL-deref is still a NULL pointer deref.
> pr_debug() will fault reading one byte from the address and print something
> like "fallback-read subflow=(efault)" to printk-safe buffer, but then
> sock_recvmsg() is still going to do its thing.

Since this NULL pointer dereference already happens before calling pr_debug(),
we won't store "fallback-read subflow=(efault)" to printk-safe buffer.

Just evaluating pr_devel()/pr_debug() arguments would help finding some bugs.
Again, we can change this twist option to redirect pr_devel()/pr_debug() to
simple snprintf() which evaluates arguments but discards the result without
storing into the logbuffer.