Re: [patch] e1000=y && e1000e=m regression fix

From: Ingo Molnar
Date: Mon Jun 09 2008 - 15:25:53 EST



* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Fri, 11 Apr 2008, Jeff Garzik wrote:
> > Linus Torvalds wrote:
> > > .. but that said, I think your patch is certainly better than what we have
> > > now (or what Ingo was complaining about for the next merge window). I
> > > certainly could live with it. I would just suggest against ever then
> > > removing that "generic E1000" choice.
> >
> > You mean never ever remove PCI-E support from e1000?
>
> No. I mean never ever remove the *configure* level thinking that
> "e1000 is e1000".
>
> There is no sense in *ever* showing it as two drivers to users,
> because users do not see them as separate chipsets. They look
> identical, down to the part names.
>
> If it's a single family, and users can't even easily tell whether they
> have version 1 or version 2 (PCI vs PCI-E), you shouldn't even ask
> them. You should literally ask them: "do you want e1000 support".
>
> That's it.
>
> Once you have asked them that, you can then decide "ok, if you
> *really* know what version of the chip you have, you can decide to
> only get limited driver support".
>
> But that's a secondary thing from a user perspective.
>
> See the patch I already sent out.

btw., in the last 2-3 months i've hit this bug about a dozen times, on
various test-systems i have. And i just hit it a minute ago again,
reminding me of this open issue, with such a config:

CONFIG_E1000=y
# CONFIG_E1000_NAPI is not set
CONFIG_E1000_DISABLE_PACKET_SPLIT=y
CONFIG_E1000E=y
CONFIG_E1000E_ENABLED=y

Every time this bug hits i lose about 30 minutes of testing (sometimes
hours of it, because my testing stalls) and once it took half an hour of
head-scratching to notice that the bl**dy CONFIG_E1000E_ENABLED=y was
once again killing the e1000 driver i rely on having.
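
A pre-flight check along these lines could catch the combination before a
test run starts. This is only a minimal sketch: the function name
e1000_conflict is hypothetical, and it assumes the thing to flag is simply
"e1000 enabled together with CONFIG_E1000E_ENABLED=y", as in the config
above. It is not part of any kernel tooling.

```python
import re

def e1000_conflict(config_text):
    """Return True if a .config has e1000 enabled and E1000E_ENABLED=y.

    Assumption: this pairing is the one worth flagging, based on the
    config fragment quoted in this mail.
    """
    opts = {}
    for line in config_text.splitlines():
        # Only "CONFIG_FOO=value" lines count; "# CONFIG_FOO is not set"
        # lines start with '#' and are skipped.
        m = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if m:
            opts[m.group(1)] = m.group(2)
    e1000_on = opts.get("CONFIG_E1000") in ("y", "m")
    return e1000_on and opts.get("CONFIG_E1000E_ENABLED") == "y"
```

Feeding it the contents of a .config before booting a test kernel would
give an early warning instead of a stalled test run.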

With up to 10 test-systems and a healthy mix of old and new distros it's
just not realistic to reconfigure all those distros to use e1000e.
(Also, i frequently have to bisect back into older kernels and have
scripting to make this work most of the time - if i standardized on
e1000e i'd lose the ability to do automated bisection.)

i have a patch that undoes this e1000 damage but sometimes i forget to
apply it and then the bug can hit me. Whoever thinks that this isn't a
problem in practice hasn't been doing a lot of systematic testing. It's
quite a PITA and it's still not fixed upstream. (and it's not eligible
for the v2.6.26 regression list anymore as it got introduced in v2.6.25)

Ingo