[PATCH] Deadlock during heavy write activity to userspace NFS serveron local NFS mount

From: Avi Kivity
Date: Mon Jul 26 2004 - 08:12:57 EST


On heavy write activity, allocators wait synchronously for kswapd to
free some memory. But if kswapd is freeing memory via a userspace NFS
server, that server could be waiting for kswapd, and the system seizes
instantly.

This patch (against RHEL 2.4.21-15EL, but should apply either literally
or conceptually to other kernels) allows a process to declare itself as
kswapd's little helper, and thus will not have to wait on kswapd.

--- a/include/linux/prctl.h 2003-10-23 09:00:00.000000000 +0200
+++ b/include/linux/prctl.h 2004-07-21 13:43:01.000000000 +0300
@@ -43,5 +43,10 @@
# define PR_TIMING_TIMESTAMP 1 /* Accurate timestamp based
process timing */

+/* Get/set PF_MEMALLOC task flag bit */
+#define PR_GET_KSWAPD_HELPER 15
+#define PR_SET_KSWAPD_HELPER 16
+
+
#endif /* _LINUX_PRCTL_H */
--- a/kernel/sys.c 2003-10-23 09:00:00.000000000 +0200
+++ b/kernel/sys.c 2004-07-21 13:42:59.000000000 +0300
@@ -1400,6 +1400,22 @@ asmlinkage long sys_prctl(int option, un
}
current->keep_capabilities = arg2;
break;
+ case PR_GET_KSWAPD_HELPER:
+ if (current->flags & PF_MEMALLOC)
+ error = 1;
+ break;
+ case PR_SET_KSWAPD_HELPER:
+ switch (arg2) {
+ case 0:
+ current->flags &= ~PF_MEMALLOC;
+ break;
+ case 1:
+ current->flags |= PF_MEMALLOC;
+ break;
+ default:
+ error = -EINVAL;
+ }
+ break;
default:
error = -EINVAL;
break;


--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/