Re: hung bootup with "drm/radeon/kms: move radeon KMS on/off switch out of staging."

From: Ingo Molnar
Date: Thu Feb 04 2010 - 15:23:37 EST



* Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx> wrote:

> On Thu, 4 Feb 2010 20:32:32 +0100
> Ingo Molnar <mingo@xxxxxxx> wrote:
> > Nobody has reacted to my related boot hang bugreport yet - and it's
> > detailed and fully reproducible (so i can test any proposed fixes as
> > well in short order). I.e. my limited testing has triggered two
> > separate bugs in the same driver - and this will show up in -rc7.
> >
> > It might be all OK and no-one else will see trouble. Or past patterns
> > might repeat themselves and i might simply be an early bird for
> > trouble to come.
> >
> > My (oft repeated) point is that adding new sub-features to existing
> > drivers is not what we do in late -rc's: there's simply not enough
> > time to shake out bugs/regressions in them.
> >
> > We introduce new functionality to existing drivers in the merge
> > window - in the two weeks following a stable kernel's release.
>
> This is the .config issue, right? It doesn't sound like the bug is new,
> you're just seeing it now because of the way you run tests. It shouldn't
> affect any more or fewer users than it did before, and reverting the "move
> radeon KMS out of staging" won't fix the bug at all or prevent anyone from
> seeing it. People using KMS will still use KMS and people without it
> won't, [...]

I think you are missing my point. My point is very simple: an existing non-KMS
user of CONFIG_DRM_RADEON=y (a pre-existing driver) might turn on the new
sub-feature (CONFIG_DRM_RADEON_KMS=y), in the expectation that this is a safe
addition to his currently well-working driver.

( I have to confess i do that all the time for drivers that work well for me,
and if it pops up in a late -rc i sure expect it to be safe to enable. I
don't even read the help text most of the time - if the single-line summary
sounds useful i enable it. Especially if the Kconfig help entry says it's
safe with a new distro, it's not CONFIG_EXPERIMENTAL, it's not marked
CONFIG_BROKEN, it's not in CONFIG_STAGING, etc. )

That action might hang or crash his kernel, and if that user then reports:

" Hey, -rc7 just hung on me after enabling this new .config option it
offered for the radeon driver i am using, please add this to the list of
regressions. "

is this really the right kind of reply:

" Since we moved it from drivers/staging/ to drivers/ this hang you are
seeing is technically not a regression, we might or might not fix it. "

?

I doubt the user would be overly enthusiastic about that kind of reply ;-)

Guys, you should really _think_ about it a minute and realize what the
purpose of a regression policy is.

It's not to be a PITA to subsystem maintainers, it's not an annoyance just to
keep you from doing cool stuff. It's not something which you should try to
lawyer your way out of via as narrow an interpretation as you can.

A regression policy is something that generally helps the quality of Linux,
so it's worth interpreting broadly and generously, in spirit and not just in
letter. If there's a single most prominent complaint i hear about the
upstream kernel, it is that it breaks too often. (right after 'it doesn't
support my graphics hardware' - so i sure can relate to the pragmatic reasons
for pushing KMS strongly!)

If i run into a crash and a hang, you can bet that others will as well.

Ingo