NFS client stuck in nfs_free_dentries (2.2.15)

From: Miquel van Smoorenburg (miquels@cistron.nl)
Date: Fri May 19 2000 - 09:16:16 EST


We have a shell server that mounts the user home directories, the
mail directories and the stuff from our public FTP server using
NFS from other Linux (unfsd) servers.

We tried to run the 2.2.13 kernel on it, but it crashed every few
hours. Last week I decided to retry it with 2.2.15, and it got an
uptime of 9 days until it crashed again today.

Magic sysrq revealed that the program counter was always in the same
area:

EIP: 0010:[c01466eb]
EIP: 0010:[c0146702]
EIP: 0010:[c014673c]
EIP: 0010:[c0146777]
EIP: 0010:[c0146771]

Excerpt from System.map:

c01466c4 t nfs_free_dentries
c014678c t nfs_zap_caches
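
To make the address arithmetic explicit, here is a small user-space sketch of the lookup done by hand above: find the last System.map entry at or below each sampled EIP and print the offset into it. The names `struct sym`, `symtab`, `lookup` and `print_sample` are made up for this illustration; only the two addresses and symbol names come from the excerpt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One System.map entry: start address and symbol name. */
struct sym {
    unsigned long addr;
    const char *name;
};

/* The two entries quoted from System.map, in ascending address order. */
static const struct sym symtab[] = {
    { 0xc01466c4UL, "nfs_free_dentries" },
    { 0xc014678cUL, "nfs_zap_caches" },
};

/*
 * Return the last symbol whose start address is <= eip, or NULL if eip
 * lies below the first entry. The table must be sorted by address.
 */
const struct sym *lookup(const struct sym *tab, size_t n, unsigned long eip)
{
    const struct sym *best = NULL;
    size_t i;

    assert(tab != NULL);
    for (i = 0; i < n; i++) {
        if (tab[i].addr <= eip)
            best = &tab[i];
    }
    return best;
}

/* Print "address = symbol+0xoffset" for one sampled program counter. */
void print_sample(unsigned long eip)
{
    const struct sym *s = lookup(symtab, sizeof(symtab) / sizeof(symtab[0]), eip);

    if (s)
        printf("%08lx = %s+0x%lx\n", eip, s->name, eip - s->addr);
    else
        printf("%08lx = (below first symbol)\n", eip);
}
```

Calling print_sample() on the five samples gives offsets 0x27, 0x3e, 0x78, 0xb3 and 0xad into nfs_free_dentries, all below the 0xc8 bytes between the two symbols.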

In other words, it's stuck in nfs_free_dentries(). But how is it
possible that it gets stuck in while ((tmp = tmp->next) != head) { }?
Ah, perhaps if dentry->d_count == 0; is that a valid possibility?
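
One way such a walk could spin forever, sketched as a user-space model rather than kernel code: if the d_count == 0 branch ends by jumping back to the restart label (an assumption; the label is visible in the patch context below, the jump is not) and the entry is still on the list afterwards, every pass finds the same entry again. All names here (`struct node`, `walk`, `unlink_on_zero`) are hypothetical, and `count` merely stands in for d_count.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list node, in the style of list_head. */
struct node {
    struct node *next, *prev;
    int count;                  /* stand-in for d_count */
};

void list_init(struct node *head)
{
    head->next = head->prev = head;
}

void list_add_tail(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

void list_del(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = n;
}

/*
 * Walk the list, restarting from the head after handling any entry whose
 * count is zero. With unlink_on_zero set, handling removes the entry, so
 * each restart makes progress. Without it the entry stays on the list and
 * is found again after every restart.
 * Returns the number of loop iterations, or -1 once limit is exceeded.
 */
long walk(struct node *head, int unlink_on_zero, long limit)
{
    struct node *tmp;
    long iterations = 0;

    assert(head != NULL);
restart:
    tmp = head;
    while ((tmp = tmp->next) != head) {
        if (++iterations > limit)
            return -1;          /* watchdog: the walk is spinning */
        if (tmp->count == 0) {
            if (unlink_on_zero)
                list_del(tmp);
            goto restart;
        }
    }
    return iterations;
}
```

With a list holding one entry of count 1 and one of count 0, walk() terminates when unlink_on_zero is set and trips the watchdog when it is not. The counter plays the same role as a debugging aid that bails out after a fixed number of iterations.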

I've applied the following patch to see if I can catch it in the act.
Seeing that it might take a week to trigger the bug, I thought I'd
post this here first for the real nfs hackers to look at.

--- fs/nfs/inode.c.orig Thu May 4 02:16:46 2000
+++ fs/nfs/inode.c Fri May 19 16:12:27 2000
@@ -411,6 +411,7 @@
 {
         struct list_head *tmp, *head = &inode->i_dentry;
         int unhashed;
+        int cnt = 0;
 
 restart:
         tmp = head;
@@ -422,6 +423,13 @@
                 dprintk("nfs_free_dentries: found %s/%s, d_count=%d, hashed=%d\n",
                         dentry->d_parent->d_name.name, dentry->d_name.name,
                         dentry->d_count, !list_empty(&dentry->d_hash));
+                if (cnt++ > 20000) {
+                        printk("nfs_free_dentries: got stuck - debug:\n");
+                        printk(" found %s/%s, d_count=%d, hashed=%d\n",
+                                dentry->d_parent->d_name.name, dentry->d_name.name,
+                                dentry->d_count, !list_empty(&dentry->d_hash));
+                        break;
+                }
                 if (!dentry->d_count) {
                         dget(dentry);
                         d_drop(dentry);

Mike.

-- 
Denial. It's not just a river in Egypt.
