Re: [PATCH] Prevent crash on missing sysfs attribute group

From: Ingo Molnar
Date: Tue Apr 03 2012 - 04:04:23 EST



* Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:

> > Huh, so put repeated, duplicated, inconsistently applied sanity
> > checks into dozens of sysfs attribute using kernel subsystems?
>
> [...]
>
> No. I was not talking about every usage site.

Note, I'm not arguing that this isn't a bug in the P4 PMU driver
- it is clearly a bug and I've applied the fix for it. I'm
arguing about the escallation vector that this bug takes - that
is unnecessarily disruptive:

You were talking about:

> >> FIX perf to include sanity checks.

and what the PMU drivers do here is not uncommon at all, and the
bug (for which I applied the fix and will push to Linus ASAP) is
not uncommon either:

Bugs happen and indirections happen too. perf uses a generic PMU
driver layer where the lower level layers register themselves.
There's at least a dozen similar constructs in the kernel and
you suggest that the right solution is to put checks in every
one of them, while the nice patch from Bruno could catch it too,
in one central place?

If the PMU code used those attributes directly and could
crash/misbehave then you'd have a point. But the first thing
that makes real use of these objects is sysfs - so it's
trivially useful to at minimum have a sanity check there...

> [...] I was talking about the sites that are don't have a
> direct call chain to the sysfs methods and instead do
> something clever that makes backtraces worthless.
>
> In the normal case sysfs registration problems are simple to
> trace back to their source because the backtrace points a
> finger at the piece of code that when registering had a
> problem.

You mean the crash backtrace?

I don't think we should spuriously crash the kernel on NULL
pointer input to generic facilities, especially when a check is
so simple and would catch so many similar patterns of bugs.

That lack of a check escallated a simple missing (and
unimportant) attribute into a "box won't boot at all" bug.
*That* is not acceptable behavior and robustness from a generic
facility, in my book.

In that sense the crash behaves like a BUG_ON().

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/