Re: [PATCH v3 2/2] usb: typec: ucsi: Implement ChromeOS UCSI driver

From: Greg Kroah-Hartman
Date: Tue Apr 09 2024 - 04:35:38 EST


On Mon, Apr 08, 2024 at 06:04:22AM -0700, Guenter Roeck wrote:
> On Thu, Apr 4, 2024 at 6:30 AM Greg Kroah-Hartman
> <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> [ ... ]
>
> > > > > if (WARN_ON_ONCE(val_len > MAX_EC_DATA_SIZE))
> > > > > return -EINVAL;
> > > >
> > > > So if you trigger this, you just rebooted all boxes that have
> > > > panic-on-warn enabled (hint, the HUGE majority in quantity of Linux
> > > > systems out there.)
> > > >
> > > > So don't do that, just handle it like this.
> > >
> > > Does that mean that we should not use WARN at all? What is the best
> > > current practice for WARN usage?
> >
> > To never use it. Handle the issue and recover properly.
> >
> > > I'm asking because for me this looks like a perfect usecase. If I were
> > > at the positiion of the driver developer, I'd like to know the whole
> > > path leading to the bad call, not just the fact that the function was
> > > called with the buffer being too big.
> >
> > Then use ftrace if you are a driver developer, don't crash users boxes
> > please.
> >
> > If you REALLY need a traceback, then provide that, but do NOT use WARN()
> > for just normal debugging calls that you want to leave around in the
> > system for users to trip over.
> >
>
> That is not common practice.
>
> $ git grep WARN_ON drivers/gpu | wc
> 3004 11999 246545
> $ git grep WARN_ON drivers/net/ | wc
> 3679 14564 308230
> $ git grep WARN_ON drivers/net/wireless | wc
> 1985 8112 166081
>
> We get hundreds of thousands of reports with warning backtraces from
> Chromebooks in the field _every single day_. Most of those are from
> drm and wireless subsystems. We even had to scale back the percentage
> of reported warning backtraces because the large volume overwhelmed
> the reporting system. When approached about it, developers usually
> respond with "this backtrace is absolutely necessary", but nothing
> ever happens to fix the reported problems. In practice, they are just
> ignored.

Then push back on the developers please, this isn't ok. WARN_ON
triggers so many automated systems it's not funny. And if a trace back
is really needed, there is a function for that, but really, just fix the
issue and handle it properly.

> This means that any system using drm or wireless interfaces just can
> not really enable panic-on-warn because that would crash the system
> all the time.

I guess Android doesn't use wireless or drm :)

Again, billions of systems in the world has this enabled, let's learn to
live with it and fix up our coding practices to not be lazy.

thanks,

greg k-h