Re: NFSD stops working

From: Neil Brown
Date: Tue Apr 11 2006 - 06:42:42 EST


On Tuesday April 11, pkoch@xxxxxxxxxxxxxxx wrote:
> Hi,
>
> we have a big problem with the nfsd.
>
> Our system is running SuSE SLES 9 SP2 (AMD64)on
> 6 Sun X4100 (1 Opteron 275, 4GB RAM, 1 *QLA2340)
> All machines are connect via SAN to a EMC CX700.
>
> Software installed SuSE SLES 9 SP2, kernel 2.6.5-7.191-smp
> which is required for the GFS (Adic StorNEXT 2.6.1)
> and also for EMC Powerpath to access the lun's.

Any chance of upgrading to SP3 ??

>
> After 8 to 80h one (not allways the same)
> of the NFS server is starting to increase
> the load from 0 -> ~7.5.
> The system is not exporting any file system.
> The system can be accessed using ssh.
> Files CAN be seen, created and delete on the machine
> locally.
> The nfsd can't not be stopped and taking all of the
> cpu time.
> The load is rather small.
> Only solution is a hard reboot for the node.
>
> dmesg shows nothing, neither messages.
> fuser will not work (hanging)
>
> a) How can I get more information ?

echo t > /proc/sysrq-trigger
and collect the trace from /var/log/messages. That will show what
each nfsd thread is doing.

Post that and I'll see if it helps.

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/