Re: [PATCH] kernel/trace: Add TRACING_ALLOW_PRINTK config option

From: Alexei Starovoitov
Date: Thu Jun 25 2020 - 00:00:08 EST


On Thu, Jun 25, 2020 at 10:00:09AM +0800, Nicolas Boichat wrote:
> On Thu, Jun 25, 2020 at 1:25 AM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Wed, Jun 24, 2020 at 9:07 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> > >
> > > On Wed, 24 Jun 2020 16:45:24 +0800
> > > Nicolas Boichat <drinkcat@xxxxxxxxxxxx> wrote:
> > >
> > > > trace_printk is only meant as a debugging tool, and should never be
> > > > compiled into production code without source code changes, as
> > > > indicated by the warning that shows up on boot if any trace_printk
> > > > is called:
> > > > ** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE **
> > > > ** **
> > > > ** trace_printk() being used. Allocating extra memory. **
> > > > ** **
> > > > ** This means that this is a DEBUG kernel and it is **
> > > > ** unsafe for production use. **
> > > >
> > > > If this option is set to n, the kernel will generate a build-time
> > > > error if trace_printk is used.
> > > >
> > > > Signed-off-by: Nicolas Boichat <drinkcat@xxxxxxxxxxxx>
> > >
> > > Interesting. Note, this will prevent modules with trace_printk from
> > > being loaded as well.
> >
> > Nack.
> > The message is bogus. It's used in production kernels.
> > bpf_trace_printk() calls it.
>
> Interesting. BTW, the same information (trace_printk is for debugging
> only) is repeated all over the place, including where bpf_trace_printk
> is documented:
> https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L757
> https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/bpf.h#L706
> https://elixir.bootlin.com/linux/latest/source/kernel/trace/trace.c#L3157
>
> Steven added that warning (2184db46e425c ("tracing: Print nasty banner
> when trace_printk() is in use")), so maybe he can confirm if it's
> still relevant.

The banner is nasty and it's actively causing harm.
Every few month I have to explain to users that it's absolulte ok to
ignore that banner. Nothing bad is happening with the kernel.
The kernel is still perfectly safe for production use.
It's not a debug kernel.

What bpf_trace_printk() doc is saying that it's not recommended to use
this helper for production bpf programs. There are better alternatives.
It is absolutely fine to use bpf_trace_printk() to debug production and
experimental bpf programs on production servers, android phones and
everywhere else.

> Also, note that emitting the build error is behind a Kconfig option,
> you don't have to select it if you don't want to (the default is =y
> which allows trace_printk).
>
> If the overhead is real, we (Chrome OS) would like to make sure
> trace_printk does not slip into production kernels (we do want to
> provide basic tracing support so we can't just remove CONFIG_TRACING
> as a whole which would make trace_printk no-ops). I could also imagine
> potential security issues if people print raw pointers/sensitive data
> in trace_printk, assuming that the code is for debugging only.
>
> Also, the fact that the kernel test robot already found a stray
> trace_printk in drivers/usb/cdns3/gadget.c makes me think that this
> change is working as intended ,-) (we're going to need to add a few
> Kconfig deps though for other debugging options that intentionally use
> trace_printk).