Re: The stability crisis

Chuck Lever (cel@monkey.org)
Sun, 4 Jul 1999 13:55:56 -0400 (EDT)


On Sun, 4 Jul 1999, Andrea Arcangeli wrote:
> We just have interactive and many kind of not interactive debugging tools.
> I am also been able to discover hardware bugs (Epox lockup by running
> disable_dma, now called PCI_QUIRKS) using one of such debugging tools (I
> think it was developed by Ingo).
>
> They all are in the IKD patches I am maintaining for the 2.2.x series.
> There's also kdb from SGI.

these are great tools. the problem is that before they are useful, we
have to wait for them to become available for the kernel revision we are
debugging. neither IKD nor kdb are ported to 2.3 at the moment. i'd love
it if they were integrated into the kernel and enabled/disabled via
CONFIG. i would even do some of this myself if i thought such a patch
might be accepted. to clarify, i recognize that interactive kernel
debugging tools are available, but just not "integrated" as well as they
could be.

i'm diagnosing problems that could use a debugger like kdb -- blocked
processes not making progress, for example. i want to know things like
the process's callback trace and current regs. i want to be able to stop
the kernel and paw through data structures. how are people getting that
kind of information now in 2.3 (that was a genuine question, not
rhetorical)?

> >> Surely quality is more important than quantity in terms of amount of
> >> code (or features) added to the kernel. Or do people think that Windows
> >> 2000 with its 35-40 million lines of code and bloat is a good thing? :-)
> >
> >i know the entrance bar has always been high for Unix-like systems; it's
>
> Fun :). I think the plain _opposite_. Before discovering Linux my only
> entrance for hacking Windows was to binary-patch the assembler. That was a
> far higher entrance bar IMHO.

agreed- windows is nuts. but you *can* diagnose problems on binary-only
systems when given good tools. my experience has been with MVS (gulp),
AIX, and Solaris... both IBM and Sun design their systems to make it easy
to capture problems the first time, and to allow savvy admins the ability
to diagnose some problems themselves. and they reserve the right to
completely ignore such admins :-)

how does that apply to open source? i'm really interested in two things
here: improving the quality of bug reporting, and making it easier to
diagnose problems and hack on the kernel. both Solaris and AIX have some
good ideas to offer here, IMO. they've had to develop good tools because
their users/customers receive their product in binary-only form. think
how nice it will be for open source users to have source code along with
the tools.

> The reason I like GPL is that it makes sure that there will _never_ be an
> entrance bar at all, since _everybody_ can enter (and that's really great
> IMO). Be sure that I wouldn't be here now if there would been an entrance
> bar also on the linux kernel. And new hackers have this great mailing-list
> that it's a great place to learn things, where very often many seasoned
> kernel hackers (e.g. Linus, Alan, SCT, etc...) teach us new things every
> day and really nicely answer to our questions.

no arguments there!

- Chuck Lever

--
corporate:	<chuckl@netscape.com>
personal:	<chucklever@netscape.net> or <cel@monkey.org>

The Linux Scalability project: http://www.citi.umich.edu/projects/linux-scalability/

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/