Re: devfs breakage in 2.4.0 release

From: Andris Pavenis (pavenis@latnet.lv)
Date: Fri Jan 12 2001 - 06:35:20 EST


On Saturday 06 January 2001 15:28, Andris Pavenis wrote:
> Noticed following devfs related problems with kernel version 2.4.0 on one
> Pentium 200MMX box (the same problem with 2.4.0-ac2, but earlier
> 2.4.0-test10 doesn't have this problem)
>
> I was able to reproduce it reliably by following steps:
>          - booted machine in runlevel 3
>          - logged in as user and started MC (on first console)
>          - logged out
>          - logged in as different user (in this case root) and tried to
> start MC again
>
> This time it hangs. The source of problem appears to be devfs related as
> devfsd exited with error message that it cannot state vcc/1 as there is no
> such file or directory. Related parts of log files (boot parameter
> devfs=dall) and other related information (I hope...) is in attachment. Of
> course MC is not behaving nicely, but the primary source of problem seems
> to be devfs

As I tested devfsd dies after I'm logging out (very often on P200MMX,
much more seldom on P3 700). I suspect some devfs related race

> On this machine kernel was compiled for Pentium CPUs. I tried to reproduce
> the same on a different machine with Pentium III 700 using kernel 2.4.0.
> It took more relogging as on Pentium 200, but I got the same problem once
> (on slower machine I was able to reproduce it more reliably).
>

I tries 2.4.1-pre3 and got the same. Modifying devfsd.c to retry stating
some times before giving up workarounds the problem (As far as I tested
I'm getting only one retry ...)

Perhaps it's kernel's bug anyway, but I think it's doesn't harm to make
devfsd slightly more errorproof. I'm including patch for devfsd (I had also
to define __USE_GNU to get devfsd compile with glibc-2.2 at all ...)

Of course best solution would be to fix the race itself (it appeared sometimes
between 2.4.0-test10 and 2.4.0-test12, first one is OK) ....

Andris

--- devfsd/devfsd.c~1 Mon Jul 3 22:43:07 2000
+++ devfsd/devfsd.c Fri Jan 12 13:19:33 2001
@@ -189,6 +189,7 @@
 #include <signal.h>
 #include <regex.h>
 #include <errno.h>
+#define __USE_GNU
 #include <dlfcn.h>
 #include <rpcsvc/ypclnt.h>
 #include <rpcsvc/yp_prot.h>
@@ -918,15 +919,29 @@
     [RETURNS] Nothing.
 */
 {
+ int tries=0;
     mode_t new_mode;
     struct stat statbuf;
 
+Retry:
     if (lstat (info->devname, &statbuf) != 0)
     {
- SYSLOG (LOG_ERR, "error stat(2)ing: \"%s\"\t%s\n",
- info->devname, ERRSTRING);
- SYSLOG (LOG_ERR, "exiting\n");
- exit (1);
+ if (tries<10)
+ {
+ tries++;
+ SYSLOG (LOG_ERR, "error stat(2)ing: \"%s\"\t%s\n",
+ info->devname, ERRSTRING);
+ SYSLOG (LOG_ERR, "retrying (attempt %d) ...\n",tries);
+ usleep (1000); /* Let's sleep a bit */
+ goto Retry;
+ }
+ else
+ {
+ SYSLOG (LOG_ERR, "error stat(2)ing: \"%s\"\t%s\n",
+ info->devname, ERRSTRING);
+ SYSLOG (LOG_ERR, "exiting\n");
+ exit (1);
+ }
     }
     new_mode = (statbuf.st_mode & S_IFMT) |
         (entry->u.permissions.mode & ~S_IFMT);

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Jan 15 2001 - 21:00:33 EST