Welcome to linux-kernel

Majordomo@vger.rutgers.edu
Wed, 13 May 1998 22:36:56 -0400


--

Welcome to the linux-kernel mailing list!

Please save this message for future reference. Thank you.

If you ever want to remove yourself from this mailing list, you can send mail to <Majordomo@vger.rutgers.edu> with the following command in the body of your email message:

unsubscribe linux-kernel linuxkernel@uwsg.indiana.edu

Here's the general information for the list you've subscribed to, in case you don't already have it:

= CONTENTS =
= 0. = Introduction
= 1. = The kernel mailing list
= 1.1. = How do I get off of this mailing list ???
= 1.2. = Topic
= 1.3. = Off-topic
= 1.4. = Etiquette
= 1.5. = Bug Reports
= 1.6. = Kernel patches
= 1.7. = Be ready to get spammed
= 2. = On- and off-line resources
= 2.1. = Kernel resources
= 2.1.1. = The Linux Kernel HOWTO - How to compile a kernel
= 2.1.2. = Linux v2 Information HQ
= 2.1.3. = The kernel hackers guide
= 2.1.4. = Mailing lists
= 2.1.5. = Writing device drivers
= 2.1.6. = The Linux Wish List
= 2.1.7. = Periodicals
= 2.1.8. = Books
= 2.2. = Other points of interest
= 2.2.1. = The Linux Home Page
= 2.2.2. = The Linux documentation project
= 2.2.3. = The Linux Software Map
= 2.2.4. = IRC
= 2.2.5. = Linux news groups
= 2.2.6. = Linux Gazette
= 2.2.7. = Other URLs
= 3. = The kernel
= 3.1. = Common problems
= 3.1.1. = Before you dive into it
= 3.1.2. = File system corruption
= 3.1.3. = Signal 11
= 3.1.4. = Seasonal Problems
= 3.1.4.1. = Warning: possible SYN flooding. Sending cookies.
= 3.1.4.2. = Kernel hangs / no output after "Now booting the kernel ..."
= 3.1.4.3. = Ignoring P6 Local APIC Spurious Interrupt Bug
= 3.2. = How to get started on kernel development
= A. = Appendix A - Maintainers
= B. = Appendix B - Linux Kernel Mailing List Bug Report Form
= C. = Appendix C - Unresolved Issues
= D. = Appendix D - Change Log

= 0. = Introduction

This is the Linux kernel mailing list FAQ and usage policy. It is posted to the mailing list regularly. On the World Wide Web it is available on <URL:http://kernelfaq.iconsult.com>.

Note that this FAQ is not so much a FAQ as a collection of "frequent answers".

The first section contains information about the kernel mailing list. Please read it before posting to the list or be ready to be flamed.

= 1. = The kernel mailing list

= 1.1. = How do I get off of this mailing list ???

Send mail containing "unsubscribe linux-kernel" (without the quote marks) in the body of the message (not the subject line) to majordomo@vger.rutgers.edu.

(echo "unsubscribe linux-kernel" | mail majordomo@vger.rutgers.edu)

If that doesn't work you are probably subscribed under a different email address than the one you're using now. In that case the first thing to do is to find out the email address you subscribed with. (You're on your own here, but your system administrator might help you by checking the mail transfer agent's log files.)

Once you have found out the address, use:

(echo "unsubscribe linux-kernel" "the address you subscribed with" | mail majordomo@vger.rutgers.edu)

By the way, a small tip for mailing lists in general: On some mailing lists the "unsubscribe <list> <address>" syntax doesn't work. In that case use Netscape Navigator to send a mail with a faked sender address: Enter your "address you subscribed with" in Netscape's "Options/Mail and News preferences/Your Email" field, fill in your current address in the "Reply-to Address" field and send the unsubscribe mail from Netscape. Normally the mailing list software will believe the faked "From:" field in the mail.

= 1.2. = Topic

This list discusses Linux kernel development. All submissions relevant to that, such as bug reports, enhancement ideas, kernel patches or reports that a patch fixed a bug are appropriate. Please note the emphasis on kernel development as opposed to development of Linux systems in general.

= 1.3. = Off-topic

Please do not ask basic installation or non-kernel related configuration questions on this list. If you are unclear of the distinction between the Linux kernel and other parts of a Linux system, please do not post here until you have learned somewhere else. In particular, if you don't know the difference between XFree86 and the Linux kernel, please do not ask about it here. (If you don't know what I'm talking about, this means you.)

= 1.4. = Etiquette

As Linux grows in popularity, it is inevitable that subscriptions to the list will greatly increase. The list is already quite large and beginning to suffer from the classic Usenet signal to noise ratio problem. Reading the list daily gives a sense of involvement and excitement at being so close to the cutting edge of a compelling and rapidly evolving technology. It is important to note that your post will go out to many, many people and that "send" key is so very close... A good idea to consider: "My opinions really don't matter, but my _code_ most certainly does!" Please help us to prove that this list can scale well to such a large audience.

- Questions

Before posting a question to this list, think twice about whether it is indeed kernel-related. Perhaps another newsgroup or mailing list is better suited for the question. See Section 2 for a list of on-line resources.

In any event have a quick look at the Documentation/Changes file, and ensure that your software is up-to-date. Sometimes things change within the kernel which stop user-level code from working. You'll feel a little silly if the answer to the problem is in the documentation that comes with the kernel, but you just didn't read it.

A good strategy is to wait a day after writing something before posting. The very same information may hit the list during that time, especially if the problem you are experiencing is one which many people will find (e.g., "ps and top have stopped working!"). Probably someone else will ask about it too; there's nothing more annoying than seeing the same question on the list over and over again.

- Answers

Before posting an answer to this list, also think twice! When off-topic mail arrives (e.g., "I can't build the kernel", "how do I convert ASCII to EBCDIC" or "Make money fast"), it is best to answer directly (i.e., off this list). Despite our best efforts, these questions will always appear; there is no easy way to avoid this without moving away from creative anarchy. Dumb questions are at least a positive sign of usage and growth. We all hate spam, but flaming to the list just makes it worse.

Before you post an answer to a legitimate question, think twice again. If possible try to give an answer that might help more people than the original poster. For example posting generic strategies helps a lot of people (especially newbies). Some great examples of such posts by Cameron McKinnon (how to get started) and Doug Ledford (on extfs problems) have ended up in this document.

I know all those 'think twice' are more easily said than done, but remember _everyone_ that even tries to think will make the kernel mailing list a more enjoyable place for all.

"Most people think about twice a year. I got famous by thinking once a week." - George B. Shaw (see Appendix A)

= 1.5. = Bug Reports

There are a few things to consider before reporting kernel error messages:

- Try to have a clue

A good rule of thumb that applies to everything in life, even to Linux kernel development. Think about which things might be of interest to the developers and which are redundant. Find out what other people's bug reports look like and how the list reacts to them.

- The developers don't have access to your system.

This means they don't have much information on how your kernel was built, which addresses certain routines were compiled to or which hardware you run. To get a rough idea what information might be relevant to the developers read the following paragraphs, then have a look at the bug report form and data collection shell script in Appendix B.

The most complicated thing to do is to add symbolic information to your kernel error message. Once (back in the good old days) this was quite an ordeal, but with a modern klogd/syslogd installation it is quite easy. Make sure your kernel's System.map is installed in the right place (/boot/System.map, /System.map or /usr/src/linux/System.map) and from then on klogd will automatically add symbolic information to the kernel messages it logs. See 'man klogd' to check whether the version of klogd you run already supports that feature.
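As a rough sketch, the three locations named above can be checked from the shell. The function name find_system_map is made up for this example, and the order in which the paths are tried is an assumption; 'man klogd' is the authority on what your klogd actually does.

```shell
#!/bin/sh
# Print the first existing file among the candidate paths.
# With no arguments, the three locations named in the text are tried
# (the order used here is an assumption, not klogd's documented order).
find_system_map() {
    if [ "$#" -eq 0 ]; then
        set -- /boot/System.map /System.map /usr/src/linux/System.map
    fi
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    echo "no System.map found" >&2
    return 1
}
```

Calling find_system_map without arguments prints the first of the three paths that exists, or a message on stderr if none does.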

For similar functionality look at the ksymoops program in the kernel source tree, which can be used when klogd/syslogd logged a 'raw' kernel Oops.. message to your disk (or if you copied it down by hand, because the system froze before being able to write to the hard disk.)

Once symbolic information has been added to the report, you'll have to provide everything else relevant to the problem. A general rule of thumb is: Too much information won't hurt, not enough will. Be sure to include at least some general description of your hardware like processor, RAM, how many and what kind of disks, disk controllers (IDE? SCSI?) and expansion boards. In particular, make sure you mention which kernel you are trying to use. Use the bug report form and data collection shell script from Appendix B.

If you feel you should include your .config file, please send the output of "grep '^[^#]' .config" instead of the whole thing, as this saves a lot of wasted space.
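To illustrate what that grep pattern keeps, here is a small sketch. The function name active_config is hypothetical; the pattern itself is the one quoted above.

```shell
#!/bin/sh
# Print only the active settings of a kernel configuration file.
# '^[^#]' matches lines whose first character exists and is not '#',
# so comment lines ("# CONFIG_FOO is not set") and empty lines are dropped.
# Usage: active_config [FILE]   (FILE defaults to .config)
active_config() {
    grep '^[^#]' "${1:-.config}"
}
```

Run inside the kernel source directory, active_config prints just the CONFIG_...=y and CONFIG_...=m style lines, which is usually a small fraction of the file.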

- The developers are busy developing

Often the developers are so busy developing, they will read your mail but not have the time to answer it. While you might say 'it does not take much time to answer an email' you might overlook the fact that developers often get flooded with email, so much that they get nightmares about it. Answering an email does in fact not take much time, answering 100 emails does.

- Trying to help the developers makes the bug vanish faster ...

If you would like to be of great help to the developers you might find out if other people have the same problem. Finding the general patterns of a bug is a job that does not take months, and it is a job that you can perform even if you have never seen a single line of C source.

Try to find some conditions that reliably trigger the problem; this includes asking other people whether they have similar problems. If so, which hardware and software do they use? For example you might find out your ext2fs file system errors are limited to users of the brand xyz SCSI controller Mark 42. _Such_ a result will alert the developer of the xyz SCSI driver, while a message like 'My ext2fs got bad! Linux sucks!' probably won't.

- Try to reach the appropriate people.

Sometimes it is better to communicate to the developers directly by email instead of posting to the mailing list. See the MAINTAINERS file in the Linux source tree to find out the maintainer for a specific Linux subsystem. In addition, there are a number of mailing lists for specific parts of the kernel (e.g. scsi, net, etc.); you might want to join those lists as well, since that is where the experts hang out.

- Use the Linux kernel mailing list bug report form from Appendix B of this document. This form will ask the right questions.

= 1.6. = Kernel patches

A little bit of consideration first: If possible create patches that change _one_ thing in the kernel. Doing so enables people to choose which part of your changes they use (or even include in the distribution kernel.) While your SCSI driver fixes might be perfectly sane, people might not like your change to the network layers changing network addresses from being big-endian to little-endian.

Always use unified (-u) diff format when submitting kernel patches. The unified diff format is very readable and allows 'reverse' application for undoing a patch (which is extremely useful when the patch provider 'diffed' the sources the wrong way round).

Assuming you have two source trees of the _same_ version of Linux, an original one {original-source-tree} and one with your personal changes {your-source-tree}, the recommended procedure for creating patch files is:

$ make -C {original-source-tree} distclean
$ make -C {your-source-tree} distclean
$ diff -urN {original-source-tree} {your-source-tree} >/tmp/linux-patch

The patch file will then be in /tmp/linux-patch waiting to be deployed on the Linux kernel mailing list. When posting your patch, don't forget to mention what it does.
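The diff step can be tried out on two tiny directories before running it on real source trees. The following is only a sketch; the function name make_patch is made up for illustration.

```shell
#!/bin/sh
# Create a unified, recursive diff between two trees.
# Usage: make_patch ORIGINAL_TREE YOUR_TREE PATCH_FILE
make_patch() {
    status=0
    # -u: unified format, -r: recurse, -N: treat absent files as empty.
    diff -urN "$1" "$2" > "$3" || status=$?
    # diff exits with 1 when the trees differ, which is the expected case
    # here; only a status above 1 signals real trouble.
    [ "$status" -le 1 ]
}
```

For a patch meant for the list, run it from the directory that contains both trees and pass relative names, so that the file headers in the patch do not carry your private absolute paths.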

Of course you need to set up two identical source directories to be able to diff the tree later. A nice trick -- it requires a little bit of consideration, though -- is to create the 'your-source-tree' from hard links to the 'original-source-tree':

$ tar xzvf linux-2.1.anything.tar.gz
$ mv linux linux-2.1.anything.orig
$ cp -av --link linux-2.1.anything.orig linux-2.1.anything

This will hardlink every source file from the original tree to a new location; it is very fast, since it does not need to create more than 20 megabytes of files.

You can now apply patches to the linux-2.1.anything source tree. Patch does not change the original files but moves them to <filename>.orig, so the contents of the hard-linked file will not be changed.

Assuming that your editor does the same thing, too (moving original files to backup files before writing out changed ones) you can freely edit within the hardlinked tree.

Now the changed tree can be diffed at high speed, since most files don't just have identical contents, they are identical files in both trees. Naturally removing that tree is quite fast, too.

Thanks to Janos Farkas <URL:mailto:chexum@shadow.banki.hu> for that trick.
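The reason the hard-linked tree stays safe can be seen in a small experiment. This is a sketch on a throwaway directory; the helper name inode and the file names are invented for the example.

```shell
#!/bin/sh
# Print the inode number of a file (first field of "ls -i").
inode() {
    ls -i "$1" | awk '{ print $1 }'
}

demo=$(mktemp -d)
mkdir "$demo/orig" "$demo/work"
echo "original" > "$demo/orig/file.c"

# Hard link: a second name for the same file, no data is copied.
ln "$demo/orig/file.c" "$demo/work/file.c"
[ "$(inode "$demo/orig/file.c")" = "$(inode "$demo/work/file.c")" ] && echo "same file"

# Update the way the text describes patch doing it: move the old file
# aside, then write a new one. The original tree is left untouched.
mv "$demo/work/file.c" "$demo/work/file.c.orig"
echo "changed" > "$demo/work/file.c"

cat "$demo/orig/file.c"
```

The script prints "same file" and then "original". An editor that overwrites the file in place instead of renaming it first would change both trees at once, which is the consideration mentioned above.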

= 1.7. = Be ready to get spammed

Some "nice guys" are obviously monitoring the kernel mailing list to get email addresses. Every time I post to the mailing list I get a bunch of "Earn $800 a week extra income" mails. Be ready to ignore or handle this (maybe using procmail).

= 2. = On- and off-line resources

If you think something should be listed here, please send an e-mail to the current list maintainer (<URL:mailto:kernelfaq@iconsult.com>). Resources will be listed in "URL embedded in text file" syntax to ease transition to these resources.

= 2.1. = Kernel resources

= 2.1.1. = The Linux Kernel HOWTO - How to compile a kernel

You definitely ought to look at this unless you can cite every line of the kernel Makefile from memory.

This document describes how to obtain, unpack, compile and install a new kernel and shows some of the pitfalls that lurk on the road of upgrading.

<URL:http://www.caldera.com/LDP/Kernel-HOWTO.html>
<URL:http://sunsite.unc.edu/LDP/Kernel-HOWTO.html>

If you are looking for all the other HOWTOS and mini-HOWTOS just check out LDP itself:

<URL:http://www.caldera.com/LDP/>
<URL:http://sunsite.unc.edu/LDP/>

= 2.1.2. = Linux v2 Information HQ

<URL:http://www.linuxhq.com>

A very useful site that lists Linux V2 information including, but not limited to, "what's new", "how to upgrade", source code, official and unofficial kernel patches for V2.0 and V2.1 and archives of the Linux mailing lists. Two thumbs up.

Please note that this was known as <URL:http://www.ecsnet.com> for some time, and while the latter URL will continue to be valid for a while, it will eventually go away; it might be wise to update bookmarks to reflect the new location.

= 2.1.3. = The kernel hackers guide

<URL:http://www.redhat.com:8080/HyperNews/get/khg.html>

Once upon a time it was a paper document but due to the 'moving target' nature of a constantly developed kernel it is now completely web-based, using a hyper-news system so users can add input.

= 2.1.4. = Mailing lists

As mentioned above, there are a number of more specialised Linux mailing lists. Many of these are run from vger.rutgers.edu. To get a listing of those running at vger, send an e-mail to <URL:mailto:majordomo@vger.rutgers.edu>, containing the single word "lists". Some of the lists are mentioned in the MAINTAINERS file in the Linux source code.

A lot of mailing lists are archived at the Linux v2 Information HQ <URL:http://www.linuxhq.com/lnxlists>.

Having a close look at the linux-admin list would be worthwhile. Many of the off-topic questions on linux-kernel are appropriate for linux-admin and the latter list seems to be pretty well behaved. The admin list is approaching the amount of traffic on the kernel list.

= 2.1.5. = Writing device drivers

<URL:http://www.redhat.com/~johnsonm/devices.html>

A paper from a talk given by Michael K. Johnson at Spring DECUS'95. According to the author it is probably a bit dated but might still be worth reading.

= 2.1.6. = The Linux Wish List

<URL:http://www.cs.uml.edu/~acahalan/linux/wishlist.html>

The Linux Wish List, compiled from the contents of the kernel mailing list. Check this before posting enhancement requests to the mailing list or to get some inspiration for your kernel project.

= 2.1.7. = Periodicals

- Linux Journal

<URL:http://www.ssc.com/lj/>

Linux Journal has had a long-running series of articles called Kernel Korner, which has had quite a bit of useful information in it. Four articles on writing runtime-loadable character device drivers in issues 23 - 26 have been made available on Linux Journal's WWW site:

<URL:http://www.ssc.com/lj/issue23/1219.html>
<URL:http://www.ssc.com/lj/issue24/kk24.html>
<URL:http://www.ssc.com/lj/issue25/kk25.html>
<URL:http://www.ssc.com/lj/issue26/interrupt.html>

= 2.1.8. = Books

There aren't any book reviews in this section, just pointers to those. First you might want to look at Cameron McKinnon's article on getting started with kernel development, which also mentions some books.

The next place to stop is the Linux Reading List Mini-HOWTO, which, although a bit dated, lists a lot of the books useful to kernel programmers:

<URL:http://www.caldera.com/LDP/HOWTO/mini/Reading-List>
<URL:http://sunsite.unc.edu/LDP/HOWTO/mini/Reading-List>

= 2.2. = Other points of interest

= 2.2.1. = The Linux Home Page

<URL:http://www.linux.org>

A very good starting point for the new and old Linux user. It contains many starting points including information on the history and creation of Linux, FAQ's and HOWTO's, Linux Software Map, Manual Pages, Usenet Newsgroups, Where To Get Linux, Linux Documentation Map, Linux International, Linux On The World Wide Web, Hot Linux News, Linux Gazette, and Linux Journal.

= 2.2.2. = The Linux documentation project

The Linux documentation project:

<URL:http://www.caldera.com/LDP>
<URL:http://sunsite.unc.edu/LDP>

The LDP contains _loads_ of Linux documentation, including, but not limited to the HOWTOS, FAQs, manpages and various guides.

= 2.2.3. = The Linux Software Map

There are a few different web frontends for the Linux Software Map:

<URL:http://www.ssc.com/linux/lsm.html>
<URL:http://www.ngs.fi/lsm>
<URL:http://www.boutell.com/lsm>

If you prefer downloading the whole database and browsing it offline get it at:

<URL:ftp://ftp.execpc.com/pub/lsm/>

= 2.2.4. = IRC

There is said to be online support on IRC, namely servers irc.linpeople.org (207.16.36.11) or vinge.linpeople.org. Look for the channels #help or #natter.

Also irc.blackdown.org is a good place for advanced topics and kernel hackers.

An IRC client is required to connect to the many IRC networks available on the Internet. A good place to find IRC clients is:

<URL:ftp://cs-pub.bu.edu/pub/irc/clients>

= 2.2.5. = Linux news groups

<URL:news:comp.os.linux.advocacy> Benefits of Linux compared to other operating systems

<URL:news:comp.os.linux.announce> Announcements important to the Linux community. (Moderated)

<URL:news:comp.os.linux.answers> FAQs, HOWTOs, READMEs, etc. about Linux. (Moderated)

<URL:news:comp.os.linux.development.apps> Writing Linux applications, porting to Linux

<URL:news:comp.os.linux.development.system> Linux kernels, device drivers, modules

<URL:news:comp.os.linux.hardware> Hardware compatibility with the Linux operating system

<URL:news:comp.os.linux.m68k> Linux operating system on 680x0 Amiga, Atari, VME

<URL:news:comp.os.linux.misc> Linux-specific topics not covered by other groups

<URL:news:comp.os.linux.networking> Networking and communications under Linux

<URL:news:comp.os.linux.setup> Linux installation and system administration

<URL:news:comp.os.linux.x> Linux X Window System servers, clients, libs and fonts

= 2.2.6. = Linux Gazette

<URL:http://www.ssc.com/lg>

The Linux Gazette is a monthly compilation of basic tips, tricks, suggestions, ideas, and short articles about Linux designed to make using Linux fun and easy. LG began as a personal project of John M. Fisk, and grew to include contributions freely provided by a growing number of authors.

= 2.2.7. = Other URLs

- Linux kernel <URL:ftp://ftp.cs.helsinki.fi/pub/Software/Linux/Kernel/>

- Linux on the web <URL:http://www.ssc.com/linux/web.html>

- Sunsite Linux archive <URL:http://sunsite.unc.edu/pub/Linux/>

- Find new files on Sunsite <URL:http://sunsite.unc.edu/paulc/incoming.html>

- Linux kernel change summary <URL:http://www.crynwr.com/kchanges/>

- Linux Source Navigator <URL:http://sunsite.unc.edu/linux-source>

- Linux archive search <URL:http://torgo.ml.org/las>

- Linux network drivers <URL:http://cesdis1.gsfc.nasa.gov:80/linux/drivers/>

- Linux SMP project <URL:http://www.uk.linux.org/SMP/title.html>

- Linux NOW! <URL:http://www.linuxnow.com>

- Linux applications and utilities page <URL:http://www.xnet.com/~blatura/linapps.shtml>

- Woven goods for linux - a collection of WWW applications and hypertext-based information about Linux. <URL:http://www.fokus.gmd.de/linux>

= 3. = The kernel

= 3.1. = Common problems

= 3.1.1. = Before you dive into it

Read up on all strategies for error recovery. Your file system corruption might be caused by the same problem that causes another user's Signal 11 trouble; a kernel is a complex piece of software, and errors happening in the kernel or in hardware below might cause every thinkable and unthinkable kind of problem. Errors propagate in unforeseeable ways. There are hardware problems which show up only on Linux while Win95, NT, OS/2, Doom and Quake run fine on the same computer. You are dealing with software handling hardware; this is extremely complex and right next to black magic. Flipping one bit of memory in the kernel creates strange results, just like adding a drop of tabasco sauce to a witch's cauldron might make her conjure up a horde of croaking frogs instead of the (by her) desired white prince.

= 3.1.2. = File system corruption

"On a block device (block size 1024 bytes) with ext2fs, I don't reliably get back from a file what I first wrote in."

Normally any kind of file system corruption is a sign of hardware problems or problems with a low level I/O driver. Ext2fs is a quite tested and stable file system, but due to its high performance it is likely to dig out problems in the lower levels of the system.

When I experienced ext2fs file system corruption myself, a query on the Linux newsgroups showed others had experienced similar problems. It turned out to be a firmware problem in the Conner CFP1060S hard drive: when data was read _really fast_, the buffer cache algorithm in the drive firmware failed.

There are a lot of things to try when you get ext2fs corruption:

- Use tune2fs to set your file system to drop into read only mode when an error occurs. This will prevent small errors causing catastrophes. The command is:

tune2fs -e remount-ro /dev/???

You should also use that command to set a time interval between file systems checks, because (especially on long running servers) it can take eons to reach the maximum mount count.
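A sketch of how both settings might be combined in one call, assuming the -i option of tune2fs (interval between checks, with a d, w or m suffix). The function name tune2fs_safe_cmd and the 30 day default are arbitrary choices for this example, and the command is only printed so it can be reviewed before it is run.

```shell
#!/bin/sh
# Build the tune2fs command line for "remount read-only on error" plus a
# time-based check interval. The command is printed, not executed.
# Usage: tune2fs_safe_cmd DEVICE [INTERVAL]   (INTERVAL such as 30d, 4w or 6m)
tune2fs_safe_cmd() {
    device=$1
    interval=${2:-30d}    # 30 days is an arbitrary example value
    echo "tune2fs -e remount-ro -i $interval $device"
}
```

For example, tune2fs_safe_cmd /dev/sdb1 4w prints a command that asks for a check at least every four weeks, regardless of the mount count.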

- Check the partition tables

Especially on the Intel x86 platform partition tables can easily be broken. Such a problem can occur when the drives were partitioned using a different disk controller than the one they are now used with. Use fdisk to dump the tables and ensure partitions are set up correctly.

- Tune Linux and your BIOS to slow and safe parameters. Turn off bus (PCI) optimizations.

- Use an empty partition to check if the problem lurks in the kernel levels below the file system. Probably the simplest test is to copy /dev/zero to the partition (using dd) and afterwards compare the partition with /dev/zero (using cmp) to see whether there are any differences.
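The dd and cmp test might look roughly like this. It is a sketch only: the function name zero_check is invented, the -n option of cmp (limit the number of bytes compared) is assumed to be available as in GNU cmp, and the test DESTROYS whatever is stored on the target.

```shell
#!/bin/sh
# DESTRUCTIVE: overwrites TARGET with zeros, then reads it back.
# Usage: zero_check TARGET BLOCKS   (BLOCKS counted in 1024-byte units)
zero_check() {
    target=$1
    blocks=$2
    # Write BLOCKS kilobytes of zeros without truncating the target.
    dd if=/dev/zero of="$target" bs=1024 count="$blocks" conv=notrunc 2>/dev/null || return 2
    sync
    # Compare exactly the written range against /dev/zero. cmp exits
    # non-zero and reports the first differing byte if anything read
    # back is not zero.
    cmp -n "$((blocks * 1024))" "$target" /dev/zero
}
```

Keep in mind that data read back shortly after writing may come from the buffer cache rather than from the disk, so a test range well beyond the size of your RAM tells you more than a small one.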

A very thorough test has been suggested on the mailing list by Doug Ledford <URL:mailto:dledfor@dialnet.net>:

"I'll go one step further with this. I would recommend that the people having problems with ext2fs corruption run the following test (if possible):

Let's say you have a hard drive partition of decent size that you don't mind losing the data on (or even if you do mind, this test can turn up a lot of errors so if you have an inconvenient way of getting back, then you should probably do this anyway)

First, get the exact size of the partition (or the whole drive as the case may be in some circumstances) in 1K blocks.

Divide this total number of blocks into 4 equal chunks (most drives do this easily, some may have a few odd sized chunks).

Write a script like this:

badblocks -w -s -b 1024 -o /tmp/list.1 /dev/??? (blocks * .25) 0 &
badblocks -w -s -b 1024 -o /tmp/list.2 /dev/??? (blocks * .5) (blocks * .25) &
badblocks -w -s -b 1024 -o /tmp/list.3 /dev/??? (blocks * .75) (blocks * .5) &
badblocks -w -s -b 1024 -o /tmp/list.4 /dev/??? (blocks) (blocks * .75) &

A simple shell script like this will run four simultaneous badblocks programs on the drive. A person can then check the files in the /tmp directory to see if any were returned as bad. With modern IDE or SCSI drives, all of these files should have a zero length unless one of two things is true. One, you have a drive developing too many bad sectors to be mapped out (which is cause for alarm in itself) or two, you have corruption in your low level driver (or other low level hardware such as memory or cache or bus transfer problems). If these test return all 0 length files, then we should start looking elsewhere for the problem. Run the test several times, as a single pass may not show the problem. If you are really courageous, you can try doubling the tests by splitting the drive into 8 equal chunks (or if you have two drives you can do both drives at four chunks each at the same time). This is a standard test I use with the aic7xxx driver to find problems with tagged queueing and high commands per lun values. It seems to show problems much quicker than any file system activity would (in my case, I had as many as 24 of these running simultaneously on 6 drives in order to test this out, talk about a dog slow machine, it took about 5 minutes just to start X windows under this load).

In any case, running tests like these to rule out hardware corruption would help greatly in increasing the level of confidence that somehow the ext2fs layer is at fault (which I personally don't think it is except under very rare occasions since I have a hard hit news server running that file system without problems, but I've taken the care and gone to the lengths to run these tests on the particular hardware in that machine and identified bad combinations that can cause problems and worked around them at the driver level)."
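The block ranges in Doug's script have to be worked out by hand. A small helper can do the arithmetic; this is a sketch, the function name badblocks_cmds is made up, and it only prints the commands (badblocks -w destroys data, so review them before running any). How you obtain the partition size in 1K blocks is left open here.

```shell
#!/bin/sh
# Print (not run) one badblocks command per chunk, following the
# "last block, then first block" argument order of the script above.
# Usage: badblocks_cmds DEVICE TOTAL_1K_BLOCKS [CHUNKS]
badblocks_cmds() {
    device=$1
    total=$2
    chunks=${3:-4}
    i=1
    while [ "$i" -le "$chunks" ]; do
        # Integer arithmetic; odd sizes give chunks that differ by a block.
        first=$(( total * (i - 1) / chunks ))
        last=$(( total * i / chunks ))
        echo "badblocks -w -s -b 1024 -o /tmp/list.$i $device $last $first &"
        i=$(( i + 1 ))
    done
}
```

Passing 8 as the third argument produces the eight-chunk variant mentioned for the courageous.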

Later on Doug followed up to another article on the ext2fs corruption thread:

"Correct. And it's very useful information to have at that. If you can produce corruption problems without going through the ext2fs code, then you have hardware corruption of some sort. An example of some of the things in the past that I have personally seen cause hardware corruption which made one *THINK* that something was wrong with the ext2fs code when there wasn't:

1. Bad CPU fans on pentium and high speed 486 machines
2. Bad SCSI cables
3. Memory timing settings in BIOS being just a tad too aggressive
4. Bad memory
5. Bad Pipeline Burst (or other) cache
6. Too long of a SCSI or IDE cable
7. Interference between SCSI and IDE cables running in close proximity to each other
8. Flaky CPU (had been overclocked and partially burnt out)
9. Esoteric BIOS options being enabled when they shouldn't be (this takes some experimentation to find and fix, a change BIOS settings, test to see if problem is gone, if not, reboot and change settings again type thing)

These are a few examples. A second thing to keep in mind is that the ext2fs is a rather fast filesystem by unix standards (it beats the hell out of the EAFS HTFS DTFS etc filesystems from SCO, but who's comparing SCO to linux anyway :) so if you have hardware corruption problems that don't show up except under heavy load, ext2fs is a good filesystem to bring those out :)

And of course, the very reason I posted my original email as part of this thread. A person needs to always keep in mind that if they are getting ext2fs errors about corruption, this does *NOT* always mean the ext2fs is at fault. It means that somewhere along the way, either due to code in the ext2fs, or code in the block driver you are using, or code in the low level driver you are using, or somewhere between the CPU, RAM, cache, bus, controller, drive bus, drive, and magnetic media, something is getting corrupted. It is important in these cases to try and isolate software faults from hardware faults. The purpose of the "script" I posted was to give a convenient way of trying to narrow down the line between hardware and software. There is still software involved with that script, but not as much. You are down to just the badblocks program, the various buffer mechanisms, and the block driver itself (with its underlying low level driver). Generally speaking, the buffer cache is considered to be safe code, so you can rule that out. Most of the block drivers are considered to be the same, so they can be ruled out. This leaves the underlying low level driver and the badblocks program as suspect. The badblocks program is rather simple in design, and an inspection of the source will result in the conclusion that it too can be ruled out (not to mention how many times it's been used to find these problems, yet I've never once heard of it causing sectors that are fine to be mapped as bad unless the underlying driver had problems). That means that the script I posted is really stressing hardware and your underlying low level driver. All in all, that greatly reduces the number of variables to look at. So, a failure during the testing by the badblocks program gives a person somewhere to look. 
They can either fiddle with compile options for their low level driver, or they can start the process of trying to enable/disable things in the computer's BIOS to try and find a culprit (disable cache this run, delay memory timings that run, etc) which then allows a person to try and pinpoint the exact problem, get it fixed, and be on their way :) Further, as long as you fail this test, there is no sense at all in even looking at the ext2fs code since you won't know if you've fixed anything by changing it unless something you did just happened to slow things down enough to keep the problem from showing up. In this case, instead of slowing the machine down to be reliable and leaving fast code in place, you've slowed the code down so it doesn't break your faulty hardware."

- Use debugging tools to check your system.

Memtest-86, a thorough, stand-alone memory test for 386, 486 and 586 systems:

<URL:ftp://sunsite.unc.edu/pub/Linux/system/misc/memtest86-1.2.tar.gz>

If you think a particular tool should be listed, please send mail to <URL:mailto:kernelfaq@iconsult.com>.

= 3.1.3. = Signal 11

If your processes frequently die because of a signal 11, there might be a problem with your hard- or software. There's a FAQ regarding signal 11 at <URL:http://www.bitwizard.nl/sig11>.

You should read the Signal 11 FAQ even if you have a different problem; the procedures mentioned in the FAQ will probably help finding that one, too.

= 3.1.4. = Seasonal Problems

= 3.1.4.1. = Warning: possible SYN flooding. Sending cookies.

> I got 44 of these 2 days ago, then another 35 more.
>
> "Warning: possible SYN flooding. Sending cookies."

This need not be an attack. It _does_ mean that your backlog has become full. This can be a consequence of crummy network connections between you and legitimate remote sites. If you normally don't see 67 connection attempts per second then it's probably an attack.

> My interpretation is that somebody has flooded the irc port to
> kill the server, am I right? What are the chances that this is
> not an attack, but just "one of those things?"

It very much depends on how busy your irc port is and how bad network conditions are between you and the users of your irc server. To really find out if you are being attacked you would need to start taking TCP dumps and look for streams of SYN packets with addresses that are unreachable. Large numbers of packets from the same unreachable address would be a giveaway.
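The tallying part of that check can be sketched in shell. count_syn_sources is a hypothetical helper, not part of tcpdump. It assumes the text format that recent tcpdump versions print with -n ("IP src.port > dst.port: Flags [S], ..."); older versions print something different, so the pattern may need adjusting. The interface name and port in the usage comment are assumptions as well.

```shell
#!/bin/sh
# count_syn_sources: hypothetical helper, not part of tcpdump.
# Reads tcpdump text output on stdin and prints "count source-address"
# for packets that carry SYN without ACK, busiest source first.
# Assumed line format:
#   12:00:01.000000 IP 10.0.0.1.1025 > 192.168.1.2.6667: Flags [S], seq 1, ...
count_syn_sources() {
    awk '
        /Flags \[S\],/ {
            src = $3                    # source as address.port
            sub(/\.[0-9]+$/, "", src)   # drop the trailing port number
            seen[src]++
        }
        END { for (s in seen) print seen[s], s }
    ' | sort -rn
}

# Possible use (needs root; eth0 and port 6667 are assumptions):
#   tcpdump -n -c 1000 -i eth0 \
#     'tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn and port 6667' |
#     count_syn_sources
```

The sketch only counts; whether the busiest sources are actually unreachable still has to be checked by hand, for example with ping or traceroute.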

Answered by Eric Schenk <URL:mailto:Eric.Schenk@dna.lth.sh>

= 3.1.4.2. = Kernel hangs / no output after "Now booting the kernel ..."

> The kernel is loaded, uncompressed and it hangs after the message "Now booting the kernel...".

Are you sure you have VTs enabled? They became optional in 2.1.31, and default to being disabled.
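One way to check is to look at the configuration file the kernel was built from. The sketch below assumes the option is named CONFIG_VT and that the configuration lives in a file such as /usr/src/linux/.config; verify both against your own tree. vt_enabled is a hypothetical helper name.

```shell
#!/bin/sh
# vt_enabled FILE: hypothetical helper. Succeeds if the given kernel
# configuration file enables virtual terminals.
# Assumption: the option is called CONFIG_VT in your kernel version.
vt_enabled() {
    grep -q '^CONFIG_VT=y$' "$1"
}

# Possible use (the path is an assumption; use the .config of the kernel
# you actually booted):
#   if vt_enabled /usr/src/linux/.config; then
#       echo "virtual terminals are enabled"
#   else
#       echo "virtual terminals are disabled; reconfigure and rebuild"
#   fi
```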

= 3.1.4.3. = Ignoring P6 Local APIC Spurious Interrupt Bug

> Is there a problem with the P6? Or with the board? If it's a
> problem with the P6, do all P6's have this problem? Does this
> bug affect the system in any way?

It's a problem with the Local APIC on most steppings of the Pentium Pro CPU. Specifically, a spurious interrupt is delivered as an exception 15 (a reserved code) rather than as interrupt 15. The bug is benign, so long as the kernel ignores the exception 15.

> Is it a problem to comment out the line in the kernel that shows
> this message? It's messing up the display.

You can safely comment it out.

Answered by Leonard N. Zubkoff <URL:mailto:lnz@dandelion.com>

= 3.2. = How to get started on kernel development

Cameron MacKinnon <URL:mailto:mackin@interlog.com> wrote a wonderful article on that topic:

"... I'm not a pro, but I generally know what's going on for at least part of the time. Here's what I did:

I bought books. Here's reviews:

LINUX Kernel Internals, Beck et al, Addison Wesley, 0-201-87741-4. I read about a third of it. It's dated (1.2 kernels) and doesn't have anything about SCSI in it, but it's the only Linux kernel book out there. There's a new version out for 2.0 kernels, but only in the original German.

'The Design and Implementation of the 4.4 BSD Operating System', McKusick et al, Addison Wesley, 0-201-54979-4. A much more readable book, IMHO. It talks about the BSD design in general, why things changed over time, why and how specific performance tradeoffs were made, etcetera.

Also, 'The Magic Garden Explained' or something like that, borrowed, pub. and ISBN unknown. This book is a very thorough coverage of the design of System 5 Release 4 (SVR4), but not as easy to read as the BSD book.

Bottom line: Beg, borrow, check out or steal one book, any book, on the design of the UNIX operating system. Sit in a library or a bookstore reading it, if you haven't got the money. You need to understand how schedulers, pagers, swappers, top and bottom halves, wait queues, inodes, ttys, the boot process, init and some other stuff work. Most of this stuff will be applicable to Linux at the concept level, regardless of the book (ignore anything on SysV STREAMS). Unless you're extremely gifted, the concepts won't reveal themselves to you from kernel source code. LEARN THE CONCEPTS. The Linux community is not a good place to do this - this list assumes that if you're here, you already know them. If you're one of those truly unlucky people with no access to such a book, try to find this info on the net. I've never really looked. If all else fails, proceed to step two:

I read Michael Johnson's Kernel Hackers' Guide. It wasn't perfect when I read it, but that was a while ago. 1) It's probably perfect by now. 2) It's free. You can get it anywhere, including here: <URL:http://www.redhat.com:8080/HyperNews/get/khg.html> It does a good job of mapping the concepts you just learned to actual kernel function calls and processes in Linux. Also, many kernel functions have man pages, though they're horribly out of date.

I subscribed to mailing lists. Initially I was all over: gcc, kernel, a few scsi lists, security... Now I've got it down to a core of kernel, two SCSI driver lists, DIALD, security and SMP. Don't be afraid to subscribe to a lot of lists (read-only!) for a few weeks to see what interests you. You can always unsubscribe later. Some people prefer reading the lists via news, but I'd recommend mail: You SAVE the mail on your hard disk. It becomes your personal reference library (N.B. UNIX has some really great text search and processing tools). You read all the mail. This gives you a feel for what's being worked on and what's not, who knows what they're talking about and who doesn't, and what snags are troubling other users. This is important so you can ask senior developers PRIVATELY when you have questions relating to The Code - unless you genuinely believe that a lot of list subscribers also want the answer. Also, some of the news gateways appear to be brutally broken, randomly mixing messages from different linux lists like a cypherpunk remailer gone mad. I recommend going straight to the source: send 'help' to mailto:majordomo@vger.rutgers.edu

I quickly got over the idea that I could learn everything about the kernel. Last time I looked, it was over 600,000 lines of source. I can muck around with SCSI and network device drivers, I understand the mid level SCSI code, and I've got a reasonably good handle on the scheduler. That leaves high level networking, filesystems, the buffer cache and memory management, to name a few, ABOUT WHICH I HAVEN'T A CLUE. Pick an area you want to diddle with, and concentrate on that. If you don't believe me, grab a dictionary and look up 'hubris'.

I read most (some?) of the important stuff in Documentation/ (you should read it all) and then: I dove into the code, wholeheartedly, for nights (days?) at a time. Pick drivers. Concentrate on the simple ones - you want concepts, not nasty workarounds for buggy hardware. Try 'wc *.c|sort' in your favorite directory. Pick ones that look well formatted and well commented, and see how they're written and how they interact with the higher level stuff. Go into each subdirectory in the whole linux/ tree, and learn what lives there. You should be able to identify what's what from the stuff you read in those books. Note especially mm/ and kernel/, along with their counterparts under arch/. Here lie most of the important functions for juggling memory, interrupts, processes etcetera. Learn to use grep, find and xargs effectively. If you have a strong constitution, look in the scripts/ directory and the Makefiles everywhere to see how the kernel actually gets built. If you're a bit twiddler at heart, look at the low level stuff for your favorite architecture under arch/.

If you've still got the lust for knowledge at this point, you will probably have found 'that special something' that interests you in the kernel. You will know generally how things work from the source, and you will know the right people to ask from the source and the mailing lists. If you have a question, go ahead and ask it. I've found developers to be very helpful when asked questions by someone who's obviously studied the sources. Play around. Recompile. Benchmark. Test.

One thing that's probably overlooked by a lot of Linux people: BSD, 'the other free UNIX'. I can't even tell you the difference between FreeBSD and NetBSD, but for my purposes, I don't care. They're available free on the net or a CD, just like Linux <URL:http://ftp.freebsd.org> and <URL:http://www.freebsd.org>. If you're stumped by something in Linux, seeing how BSD does it is often helpful, especially for device drivers. Also (ahem) BSD code sometimes seems to be commented and formatted somewhat better. I don't run it, I just look at the source.

At this stage your hats will no longer fit, and your dog will have run off with your girlfriend. No matter, because you'll be able to ask, and sometimes answer, intelligent questions about kernel design, in your particular specialty areas. You'll be fixing insidious bugs, improving performance, and posting things like 'this patch is from memory and untested, but it will solve your problem on 2.1.87: [proper patch syntax]'

I'm not at this stage yet, and I've been working at it for a while. That's why I usually post answers to questions like 'where do I begin' rather than 'why did it hang'. The above is working for me, it might work for you. May the Source be With You, Always."
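The shell advice in the article ('wc *.c|sort', and learning grep, find and xargs) can be sketched as two small functions. Both function names are hypothetical, and the paths and the symbol in the usage comments assume a source tree unpacked under /usr/src/linux.

```shell
#!/bin/sh
# Both helpers are hypothetical names, for illustration only.

# smallest_sources DIR: list the .c files directly in DIR by line count,
# smallest first, so the simplest drivers come out on top.
smallest_sources() {
    ( cd "$1" && LC_ALL=C wc -l ./*.c | grep -v ' total$' | sort -n )
}

# find_symbol TREE NAME: print every .c or .h file under TREE that
# mentions NAME as a whole word. find produces the file names and
# xargs hands them to grep in batches.
# Caveat: plain xargs splits on whitespace, so this assumes file names
# without spaces.
find_symbol() {
    find "$1" -type f \( -name '*.c' -o -name '*.h' \) -print |
        xargs grep -l -w "$2" | sort
}

# Possible use (paths and symbol are assumptions):
#   smallest_sources /usr/src/linux/drivers/net | head
#   find_symbol /usr/src/linux/kernel schedule
```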

= A. = Appendix A - Maintainers

This policy and FAQ is currently maintained by Frohwalt Egerer <URL:mailto:froh@iconsult.com>. For communication concerning this document please use <URL:mailto:kernelfaq@iconsult.com>; that mail alias will always point to the current maintainer.

The policy is created from input on the linux kernel mailing list. If desired by members of the list a vote will be held to ratify the policy.

Thanks to David A Rusling <URL:mailto:rusling@linux.reo.dec.com> for providing the foundation to section one. Thanks to Colin Plumb <URL:mailto:colin@nyx.net> who refined my proposal for section one using his fine taste of language.

Thanks to Cameron MacKinnon <URL:mailto:mackin@interlog.com> for his really great article on getting started with kernel development which I adopted into this document.

Thanks to Doug Ledford <URL:mailto:dledford@dialnet.net> for his excellent description on how to hunt down filesystem corruption problems.

Thanks to Eric Hoeltzel <URL:mailto:eric@dogbert.sitewerks.com> for the enormous number of suggestions.

Thanks to Evgeny Rodichev <URL:mailto:er@sai.msu.su> for providing the ver_linux shell script.

And thanks to W. Reilly Cooley, Kevin Fenzi, Gabriel Paubert, Marc Merlin, Tethys, Antoine Reid, Sebastian Benoit, Regis Duchesne, Riccardo Facchetti, J. Sean Connel, Seth M. Landsman, Martin Radford, James Mastros, Nicholas J. Leon, E.Rodichev, Ben Clifford, Melissa Johnson, Dave Wreski, Greg Patterson, Keith Rohrer, Roch-Alexandre Nomine-Beguin, Raymo <slk9q@cc.usu.edu>, Elliot Lee, Greg Alexander, Billy Harvey, Harald Milz, John Carter, Janos Farkas, Tony Gale, Garst R. Reese, Peter P. Eiserloh, Axel Boldt and all that I forgot to mention for their input, suggestions and articles on the mailing list. This FAQ would not exist without their help.

The quote of George B. Shaw in section 1.4 might not be accurate. I heard it on German TV, remembered it for a few weeks and translated it back to English for this FAQ.

= B. = Appendix B - Linux Kernel Mailing List Bug Report Form

Please use the following form to report bugs to the Linux kernel mailing list. Having a standardized bug report form makes it easier for you not to overlook things, and easier for the developers to find just the little tad of information they're really interested in.

First run the ver_linux script included at the end of this Appendix or at <URL:ftp://ftp.sai.msu.su//sai2/ftp/pub/Linux/ver_linux>. It reports the versions of some important subsystems.

Use that information to fill in all fields of the bug report form, and post it to the mailing list with a subject of "ISSUE: <one line summary from [1.]>" for easy identification by the developers.

[1.] One line summary of the problem:
[2.] Full description of the problem/report:
[3.] Keywords (i.e., modules, networking, kernel):
[4.] Kernel version (from /proc/version):
[5.] Output of Oops.. message with symbolic information resolved (see Kernel Mailing List FAQ, Section 1.5):
[6.] A small shell script or example program which triggers the problem (if possible)
[7.] Environment
[7.1.] Software (add the output of the ver_linux script here)
[7.2.] Processor information (from /proc/cpuinfo):
[7.3.] Module information (from /proc/modules):
[7.4.] SCSI information (from /proc/scsi/scsi):
[7.5.] Other information that might be relevant to the problem (please look in /proc and include all information that you think to be relevant):
[X.] Other notes, patches, fixes, workarounds:

ver_linux (by Evgeny Rodichev <URL:mailto:er@sai.msu.su>):

#!/bin/sh
# Before running this script please ensure that your PATH is
# typical as you use for compilation/installation. I use
# /bin /sbin /usr/bin /usr/sbin /usr/local/bin, but it may
# differ on your system.
#
echo '-- Versions installed: (if some fields are empty or looks'
echo '-- unusual then possibly you have very old versions)'
uname -a
insmod -V 1>/tmp/ver_linux.tmp 2>>/tmp/ver_linux.tmp
awk 'NR==1{print "Kernel modules ",$NF}' /tmp/ver_linux.tmp
rm -f /tmp/ver_linux.tmp
echo "Gnu C " `gcc --version`
ld -v 2>&1 | awk -F\) '{print $1}' | awk \
 '/BFD/{print "Binutils ",$NF}'
ls -l `ldd /bin/sh | awk '/libc/{print $3}'` | awk -F. \
 '{print "Linux C Library " $(NF-2)"."$(NF-1)"."$NF}'
ldd -v | awk '{print "Dynamic Linker (ld.so)", $3}'
ls -l /usr/lib/libg++.so | awk -F. \
 '{print "Linux C++ Library " $4"."$5"."$6}'
ps --version 2>&1 | awk 'NR==1{print "Procps ", $NF}'
mount --version | awk -F\- '{print "Mount ", $NF}'
netstat --version | awk \
'NR==1{if ($5 != "") { n=split($5,buf,"-"); ver=buf[n]; done=1 }}
 NR==2{if (done != 1) ver=$3 }
 END{print "Net-tools ",ver}'
loadkeys -h 2>&1 | awk 'NR==1{print "Kbd ",$3}'
expr --v | awk '{print "Sh-utils ", $NF}'
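The /proc based fields of the form can be drafted the same way. The following is only a sketch: proc_section and bug_report_skeleton are hypothetical helper names and are not part of ver_linux. The file names are the ones the form lists, and the optional directory argument (default /proc) exists only so the sketch can be tried against a copy of those files.

```shell
#!/bin/sh
# Hypothetical helpers for drafting fields [4.] and [7.2.] to [7.4.].

# proc_section LABEL FILE: print the form label, then the file contents,
# or a note when the file is missing (e.g. no SCSI support compiled in).
proc_section() {
    echo "$1"
    if [ -r "$2" ]; then
        cat "$2"
    else
        echo "(not available: $2)"
    fi
    echo
}

# bug_report_skeleton [PROCDIR]: emit the /proc based fields in form order.
bug_report_skeleton() {
    proc=${1:-/proc}
    proc_section "[4.] Kernel version (from /proc/version):" "$proc/version"
    proc_section "[7.2.] Processor information (from /proc/cpuinfo):" "$proc/cpuinfo"
    proc_section "[7.3.] Module information (from /proc/modules):" "$proc/modules"
    proc_section "[7.4.] SCSI information (from /proc/scsi/scsi):" "$proc/scsi/scsi"
}

# Possible use:
#   bug_report_skeleton > report.txt
```

The remaining fields (summary, description, Oops output, ver_linux output) still have to be filled in by hand.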

= C. = Appendix C - Unresolved Issues

> "I have recently found a need for an encrypted fs, I have
> however had trouble finding any ones that meet my needs."

=> Which encrypted fs implementations are out there, kernel and userspace versions?

= D. = Appendix D - Change Log

$Log: draft,v $
Revision 1.11 1997/04/10 10:24:01 cvs
Consistent layout (use 2 spaces between sentences)
Added more information on unsubscribing
Added examples how to do kernel diffs
Added SYN Flooding information
Added P6 Local APIC Information
Imported ver_linux update by Evgeny Rodichev

Revision 1.10 1997/04/10 07:29:19 cvs
Contents updated automatically

Revision 1.9 1997/04/07 05:44:23 cvs
Added a revision header to the top of the draft.
Otherwise 1.9 is equivalent to 1.8

Revision 1.8 1997/04/07 05:40:09 cvs
Imported yet another set of suggestions by Eric.
Imported the ver_linux data collection script of Evgeny Rodichev.

Revision 1.7 1997/04/07 04:32:48 cvs
FAQ is available on http://kernelfaq.iconsult.com from now on.

Revision 1.6 1997/04/06 22:05:07 cvs
Imported lots of e-mailed spelling changes for the parts added in 1.5
Elaborated a bit more on some sections
First draft of the Bug Report Form
Included, updated and corrected a number of resources
Started to use '=' markers so grep will build the table of contents for me

Revision 1.5 1997/04/06 02:43:07 cvs
Fixed typo in Newsgroups: line. Feeling sheepish.

Revision 1.4 1997/04/06 02:41:29 cvs
Imported spelling and stylistic corrections
Elaborated on bug reports (section 1.5.)
Included section 1.6. "Kernel patches"
Added some new resources to section 2
Added section 3, and 3.1 (common problems) and moved the 'getting started' text to 3.2.
Added a changelog (maintained by cvs)

Revision 1.1-1.3 "long ago"
Early drafts, 1.3 was not released to the public.