Re: kernel BUG at page_alloc.c:98 -- compiling with distcc

From: Marco Roeland
Date: Mon Apr 05 2004 - 06:49:29 EST


On Monday April 5th 2004 Marco Fais wrote:

> I was not saying *this is the problem*, just noticing that all the
> systems that show this problem have this network card, while the other
> systems that are working perfectly are using other network hardware
> (e100 driver) :)

Yes, my conclusion was too hasty, it *is* driver related! ;-)

With hindsight we also should have tried, of course, a 'strace distccd
--no-detach' in a crashing and a non-crashing situation. This would
probably have shown that 'sendfile()' was the first missing system call
(and therefore likely the culprit) in the crashing situation. Oh, well...

> If you read Andrew's message, seems that distcc uses a function that
> trigger the problem -- sendfile() -- so, if netcat doesn't use it, it's
> clear why doesn't panic the kernel.

Yes, sendfile() in combination with the 8139too driver seems to be
causing the trouble. Until that will hopefully be fixed, it doesn't seem
easy to workaround against. At the moment it looks like it is not an
easy configurable option to *not* want to use zero_copy functionality,
either in the kernel, nor in distcc.

There is an '8139cp' driver too, it's supposed to be working better
as well, perhaps that one might not free the pages that are to be
zero_copied across the network before they are sent?! That is the real
problem if I understand Andrew's mail correctly.

You might send a 'linux 8139too sendfile() panic' kind of bugreport
to the 'netdev@xxxxxxxxxxx' mailing list. That is the list where the
networking gurus are supposed to be hanging out. Although IMVHO this bug
is more on the kernel than on the network side. Also filing an entry to
bugzilla.kernel.org might speed up someone fixing the real problem.

Easiest workaround might be to just use a customised distcc for the
machines involved: just download the source from 'distcc.samba.org', do
a regular './configure', and then in the generated 'src/config.h' hand
edit '#undef HAVE_SENDFILE' and '#undef HAVE_SYS_SENDFILE_H'. That
should stop distcc from using sendfile().
--
Marco Roeland
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/