2.1.X and its separation from the Linux User base

Jim Dennis (jimd@starshine.org)
Thu, 05 Feb 1998 05:29:36 -0800


> It has been almost two years since 2.0.X was released and it seems to me
> that just a few highly specialized people are using 2.1.X for mostly
> limited purposes. I use it for a workstation and I know of others as well.
> Some installations for limited server use I know.

That is the intent of this development model.

> But mostly the people who run production servers or fully featured
> networking stuff use 2.0.X. On the Campus I run I use 2.0.X exclusively.
> I patch 2.0.X with certain 2.1.X features if I need to have them.

There are "servers" and "workstations" -- and there are
"users" and "power users"

"servers" *should* run stable versions (unless they are
"lab servers" for kernel developers).

"workstations" for normal users *should* also run stable versions
(unless you are the BOFH -- and you don't like them).

"workstations" for "power users" *might* run the beta
kernels *if* they're willing to assume that risk model for their
work.

> Have tried to run a 2.1.X kernel during the burn-in phase of a new server
> but quickly had to switch back to 2.0.X due to problems with networking.

That's fine. I wish everyone would grab the latest beta for
a day or two of "shakedown" testing when configuring a new
machine.

> I think it will be very difficult to release 2.2.x and to get a large
> number of users to test it before release. There is not even much concern
> to fix the existing bugs in 2.1.X (See the IP masquerading issue) . Many
> people (like I) have or had to just give up on trying 2.1.X in a fully
> featured environment.

I think that the 2.1.9x kernels will raise a broader interest
in testing and bug fixes. I know at least one driver developer
who's waiting until things stabilize enough that he can tell
whether the bugs he's chasing are his or in other parts of the
kernel.

I agree that a much smaller percentage of Linux users will
be interested in testing the 2.1.x kernels than we saw for
the 1.3.x and the 1.1.x and the 0.99x releases. There are
two obvious reasons for this:

There are *lots* more Linux users now than there
were three years ago -- and many of them are purely
business and non-power users. I hear estimates of
3 to 8 million Linux installations world wide. You
won't find a significant percentage of "beta testers"
or "early adopters" out of *any* group of 3 million
people unless there's a darn pressing need.

There isn't a "darn pressing need." 2.0.x is pretty
good. It has everything that the vast majority of
users want in a kernel (and most of the things that
they "think" they want are probably better done in
user space or as modules and refinements to 2.0).

Now that I've said that, let's ask a different question. If
the *percentage* of users who will be beta testers and
"early adopters" is smaller, what does that mean for
the total number of testers? What effect do these numbers
have on the quality of testing?

I don't know. I don't even place too much credence in the
3 to 8 million user statistic that I quoted, and I can't
even guess at any other numbers. However, my intuition is
that a tiny percentage of the total current users is still
likely to be a much larger number of testers than our entire
user base was five years ago.

I think we also have an edge in the quality of testing that
can be done today over "the bad old days." The users that
are out there now have had lots more experience with Linux
(when I first started with Linux -- I couldn't tell *if*
something was a bug -- or if it was "supposed" to do that
-- or if I was just "doing something wrong").

In addition the phenomenal success of Linux has attracted
many users with vast experience in other Unix platforms.
I used to know a lot of "workstation bigots" who considered
PC's to be worthless toys -- and extended that opinion to
early versions of Linux by association. There used to be
considerable resistance to Linux in some circles (and still
is -- but those circles are now smaller, more isolated, and
considered to be the sorts of lunatic fringes that they used
to consider *us* to be).

In short, Linux is taken seriously.

Another aspect in our favor is that some major companies
have a vested interest in Linux development. For example
the big U.S. automobile manufacturers are a driving force behind
IPSec and IPv6. The Linux FreeS/WAN project is important to
them. Also some smaller companies, like Red Hat, are doing
lots of business, getting lots of good press, and following
the original free software agenda by giving back to the
effort.

I was highly resistant to RPM. There are still things I
*don't* like about it (like the fact that I can't tell it
to just use RCS/CVS check-ins on those configuration files
instead of that ".orig" crap). However, I've also gotten
spoiled rotten by the ability to grab a directory full of
packages and install them all with a simple 'rpm -i' or
'rpm -U' command. A little more work on that and KDE and
Linux boxes may actually beat the Mac in the "ease of use"
category.

I just sent my mom (two thousand miles away) a Mac for her
birthday. She's never used a computer before (except for
"terminals" at work, now Windoze boxen running 3270 emulators,
and the occasional game of Mahjonng for DOS -- that was
installed by a co-worker). The crucial question was: what
can I send her that requires almost no skills and no
administration?

Admittedly, we aren't *quite* there yet. By this time next
year, I might be able to actually send her a PC running Linux.
It might still require the occasional long distance modem
call (or an ssh session to her "ml.org" name or some such)
to fix something or show her a new trick. But we're really
close!

> I think the release time between stable kernels should be reduced
> significantly. Release a stable kernel each 6 months at least and develop
> a testing program. Keep the user base in touch with kernel development
> otherwise we will loose momentum and users will not test the development
> kernel anymore.

Hmm. I personally like the current model. "Stable" means
feature freeze -- bug fixes only. However, to accommodate
your request, I could see where someone (some volunteer,
hint, hint), might maintain a "New features patchkit" to
the stable kernel.

Thus you (or whoever) would go through the betas and
find modules and patches that can reasonably be "down ported"
or "back merged" (or whatever you'd like to call it) into
a patch kit to the stable tree.

Anyone who wanted *some* new features but required a
more stable kernel could grab the patch kit, and selectively
apply the desired new features patches.

Obviously some new features would not be portable to
the older kernels. So this effort might get bogged down
rather quickly. However, there might be some benefit to
it -- and any volunteer that would like to try the experiment
will probably do so.
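The mechanics of such a patch kit are already well supported by
standard tools. Here's a self-contained sketch (the file names and
the "masq fix" content are purely hypothetical) of taking one patch
from the kit, checking it with --dry-run, and applying it:

```shell
#!/bin/sh
# Self-contained sketch: selectively applying one patch from a
# hypothetical "new features patchkit" against a stand-in source tree.
set -e
work=$(mktemp -d)
cd "$work"

# Stand-ins for the stable tree and the tree with the new feature:
mkdir -p linux/net linux.new/net
echo "old masq code"   > linux/net/masq.c
echo "fixed masq code" > linux.new/net/masq.c

# One patch from the kit (diff exits 1 when files differ, hence ||):
diff -ru linux linux.new > masq-fix.diff || true

# Apply only the patches you actually want; --dry-run checks each
# one before it touches the tree.
cd linux
for p in ../masq-fix.diff; do
    if patch -p1 --dry-run < "$p" >/dev/null; then
        patch -p1 < "$p"
    else
        echo "skipping $p (does not apply cleanly)"
    fi
done
cat net/masq.c    # now contains the back-ported fix
```

Anyone tracking the stable tree could run exactly this loop over
whichever subset of the kit they trusted.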

Another couple of questions that come to mind are:

How can we improve the testing of the betas?

How can we ensure that *most* betas will have
a reasonable subset of functionality (that they
are "usable" for some definition of the term)?

I agree that these are serious issues. In particular
the last one has guaranteed that many people weren't willing
to experiment with the 2.1.x kernels for some time now.

I would really like to see a set of regression tests included
with the kernel. With this you'd fetch the latest beta,
do a fresh level 0 backup, build the new kernel (maybe on a
Friday afternoon), then do a 'make test' (or its equivalent)
and let that run all weekend.

It might run 'crashme' for twelve hours; it might include
a series of "rootshell" goodies like the old LDT bug exploits
and the F00F and Cyrix "Coma" (hidden-CLI?) regressions.
It would also run a series of driver specific (non-destructive)
tests. For some of the tests you might need another machine
available -- to "attack" the new kernel with 'ping of death'
and 'land'/'la tierra', 'boink', and other bogus packets. It
could also co-operate with the new kernel on a set of
NFS and RPC load tests. I also suspect that we could
do wonders with a set of regressions that "poke" values
into various nodes under /proc. I have no idea of what
should be poked where -- but I bet some of you do!
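The driver for such a 'make test' target could start out very
simple. Here's a hypothetical sketch -- the test names and commands
are placeholders, not real kernel regressions; real entries would
invoke crashme, the exploit regressions, the driver tests, and the
/proc "poke" suite:

```shell
#!/bin/sh
# Hypothetical sketch of a 'make test' driver: run each regression,
# record pass/fail, and print a summary.
pass=0
fail=0
run_test () {
    name=$1; shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS: $name"
        pass=$((pass + 1))
    else
        echo "FAIL: $name"
        fail=$((fail + 1))
    fi
}

run_test "root-fs-present"  test -d /   # placeholder sanity check
run_test "stand-in-pass"    true        # placeholder regression
run_test "stand-in-fail"    false       # placeholder regression

echo "summary: $pass passed, $fail failed"
```

The point is that the harness itself is trivial; the hard, valuable
work is writing the actual regressions it would run all weekend.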

We might also need some special loadable modules that are
used only for testing (to *beat* on some API's for some
sorts of stress testing). Unfortunately I don't know
of any examples -- or if there's any feasibility to this.

We should probably send out the word to Universities and
trade schools. There have to be classes in "SQA Automation"
that need challenging and rewarding "real world" assignments.
(And my call for the "adopt a man page" project is still
floating around, too).

This would give the users of a new kernel some assurance that
this kernel works on *their* mixture of hardware.

We'd also like a couple hundred (thousand) DejaGNU scripts
for various applications. We might have a massive
./configure script that finds various packages that are
installed on the testbed system (elm, fractint, PostgreSQL,
etc.) and runs suites of regressions on each of those.

This would give the users of a new kernel some assurance that
it works with *their* software.
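The detection half of that ./configure idea is a one-liner per
package. A hedged sketch (the package list here is illustrative,
not a proposal):

```shell
#!/bin/sh
# Hypothetical sketch: probe the testbed for installed applications
# so that only the relevant regression suites get scheduled.
suites=""
for cmd in sh elm fractint psql; do
    if command -v "$cmd" >/dev/null 2>&1; then
        echo "found $cmd -- scheduling its suite"
        suites="$suites $cmd"
    else
        echo "no $cmd -- skipping its suite"
    fi
done
echo "suites to run:$suites"
```

Each detected program would then map to a directory of DejaGNU
scripts that exercise it against the new kernel.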

(One might ask how much value we all get back from them.
Given the large number of configuration options in the Linux
kernel -- I've been amazed for years that I can more or less
arbitrarily select a set of options and get a running
kernel out of it. I'd hate to see a dependency tree of all
the options, and I'm fascinated that the various drivers
*aren't* stomping all over one another all the time).

Another thing that I think would be very cool is if a few
organizations -- the ones that really have a strategic
interest in Linux -- were to devote a few machines to
full time regression testing. Have these machines run
an automated process that builds many different kernels
(spanning the tree of available compile-time options), and
have small, isolated networks of these machines hammer on
one another.
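The core of such a build farm is just an enumeration of option
combinations. A tiny hypothetical sketch (toy scale, and the option
names are illustrative -- a real farm would feed each line to a
scripted 'make config' and then boot-test the result):

```shell
#!/bin/sh
# Hypothetical sketch: enumerate a tiny slice of the compile-time
# option tree.  Each line stands for one kernel that an automated
# build machine would configure, compile, and boot-test.
count=0
for smp in y n; do
    for modules in y n; do
        for masq in y n; do
            echo "CONFIG_SMP=$smp CONFIG_MODULES=$modules CONFIG_IP_MASQUERADE=$masq"
            count=$((count + 1))
        done
    done
done
echo "$count configurations to build and boot-test"
```

Even a toy matrix like this grows exponentially, which is exactly
why dedicated machines (rather than volunteers' workstations) would
be needed to span any useful fraction of the tree.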

> Recommendation:
>
> For 2.2: Freeze, fix the bugs and get things out ASAP before trying any
> new ideas that will break things again. It will be difficult to get 2.2.X
> accepted since most people by this time are mostly satisfied with 2.0.X
> kernels and have developed patches for 2.0.X if they need more
> functionality.

I know people who still run 0.99p15, 1.0.9, or 1.2.13.
The people who adopted Linux a long time ago tend to be
the "daring" early adopter types. The huge surge in users
that we've had since 2.x shipped are *not* pre-selected for
this characteristic. So we should face the simple fact that
2.0.x will be around for a long time to come.

I don't see anything wrong with that.

As for "feature freeze": I'd like to see that happen for
2.1.90. Of course I don't have *any* say in that. I'm not
a contributor -- I'm a lurker who butts in with my opinion.
(For all I know "feature freeze" was called months ago during
one of my hiatuses from the list). We probably won't see
a really stable 2.2 in only ten builds after feature freeze
(not if the last couple of versions are any indication).

I'd really like to see us avoid the 1.3.99pre20 naming
fiasco, and I wasn't much enamored of the 0.99p10 scheme
either.

Would it break anything to have *four* part numbering?
(He said innocently, while fearing that his question would
ignite the fumes!).

I think that a four part numbering system (just for the betas!)
might give us "alpha" levels. The key would be to have
"usable" betas of the form: 2.1.10 and "alphas" of the form
2.1.10.xx. We then set some regression criteria to define
"usable."

This would be a better alternative to the
"stable+new_features_patches" idea I presented above.
However, the two are not exclusive.

Obviously every conscientious developer checks his or her work
before submitting a set of patches. Presumably Linus builds a
new kernel from a fresh set of merges and gives it at least the
"laugh test" (our colloquial term for a process whose official
name was the "Build Acceptance Test" when I was actually in QA).

However, these only cover two or three system configurations.
They should be considered "alphas" until they've been run
on a representative sample of systems and passed a reasonable
set of regressions. (Usually our criteria for going "out"
to "beta" was "no known 'cat 1' bugs" -- that is, no known
defects that will cause data loss or system failure).

I realize that sometimes code will be submitted with known
defects. When the developer "gets stuck" and wants some
others to "look at it" -- this can be a reasonable thing
to do (so long as it's documented). Those should definitely
be confined to "alphas."

> Start 2.3.X for those people who cannot wait to experiment with new ideas.
> Just my unhumble opinion....

Who wants to "rush" 2.2 out the door?

Somewhere there was a wishlist. Of the items therein,
which ones have been adopted by active developers?

(A better question might be: Can we have a roll call
of new features that are under active development?).

Of those, how many seem to be "done"?

So, what's "left"? What is being actively developed and
isn't "done"?

(Note: There are many items on the wishlist that basically
have no chance in blazes of going into 2.2 -- since no one
is working on them. This is a natural by-product of having
people like me (non-programmers) adding pie-in-the-sky wishes
in the hopes that someone "out there" wants to "adopt" them.
I don't see this as a problem; it simply makes those items
irrelevant to 2.2).

Linus,

How were you planning on handling the 2.1.90 through 99
builds? Do you see any value in adding the alpha indicator?
Do you expect a flurry of builds between 2.1.90 and 2.2
(like the "build-a-day" marathons from a couple of years
ago)?

(Luckily for me I don't have to make any decisions here.
I can just make suggestions and let y'all fight it out
and let Linus decide. However, I hope someone who knows
lots more about kernel internals than I do gets excited
about building test suites).

--
Jim Dennis  (800) 938-4078		consulting@starshine.org
Proprietor, Starshine Technical Services:  http://www.starshine.org
        PGP  1024/2ABF03B1 Jim Dennis <jim@starshine.org>
        Key fingerprint =  2524E3FEF0922A84  A27BDEDB38EBB95A