Re: [PATCH] Prevent crash on missing sysfs attribute group

From: Eric W. Biederman
Date: Tue Apr 03 2012 - 06:42:43 EST


Ingo Molnar <mingo@xxxxxxxxxx> writes:

> * Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>
>> Ingo Molnar <mingo@xxxxxxxxxx> writes:
>>
>> > * Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>> >
>> >> > Huh, so put repeated, duplicated, inconsistently applied sanity
>> >> > checks into dozens of sysfs attribute using kernel subsystems?
>> >>
>> >> [...]
>> >>
>> >> No. I was not talking about every usage site.
>> >
>> > Note, I'm not arguing that this isn't a bug in the P4 PMU driver
>> > - it is clearly a bug and I've applied the fix for it. I'm
>> > arguing about the escallation vector that this bug takes - that
>> > is unnecessarily disruptive:
>> >
>> > You were talking about:
>> >
>> >> >> FIX perf to include sanity checks.
>> >
>> > and what the PMU drivers do here is not uncommon at all, and the
>> > bug (for which I applied the fix and will push to Linus ASAP) is
>> > not uncommon either:
>>
>> > Bugs happen and indirections happen too. perf uses a generic
>> > PMU driver layer where the lower level layers register
>> > themselves. There's at least a dozen similar constructs in
>> > the kernel and you suggest that the right solution is to put
>> > checks in every one of them, while the nice patch from Bruno
>> > could catch it too, in one central place?
>>
>> What is uncommon is that perf_pmu_register is called from an
>> early initcall, and then later a device_init call is used to
>> register the pmu subsystem with sysfs.
>
> This has no relevance to the bug and crash pattern itself
> whatsoever, so stop blathering about unrelated things.

Crashing or BUG_ON or WARN_ON has no relevance to how hard it is to
debug this. They only have relevance on how hard it is to get
the information from a machine that experience this problem.

The call to perf_pmu_register was not in the stack backtrace,
because the call to perf_pmu_register happened long ago and
we put off registering with sysfs until later.

By the time we got to sysfs there was no information available
that would let us know which client of the perf events subsystem
it possibly could be. Not enough information when we register
with sysfs to blame it on something makes it hard to debug.

> Not filling out a sysfs object attribute is an *easy* driver
> level mistake, I've seen it happen on numerous occasions. Not
> crashing on it in the sysfs layer is an *obvious* debugging
> helper, and I don't understand why you are even arguing about
> this.

I agree that not filling out some field of something that is
expected is an easy driver bug to make. As such I would actually
like something to go on the next time someone makes that mistake.

Because the perf pmu subsystem is atypical and has asynchronous
registration with respect to sysfs. These kinds of problems
are harder to find then they need to be.

The only tools to track this down that were realistically available
were git bisect and asking on the list of developers if this bug
looked familiar. There was really not enough information at the point
of the crash to find which attribute was not properly initialized
without being intimately familiar with the perf pmu subsystem.

Having to resort to a git bisect when a few checks extra checks
could have caught the culprit red handed and we could have had
a the function name where the that started the registration
of the sysfs attribute to start debugging with.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/