[PATCH][V4]Make get_user_pages interruptible

From: Ying Han
Date: Mon Nov 24 2008 - 18:22:17 EST


From: Ying Han <yinghan@xxxxxxxxxx>

make get_user_pages interruptible
The initial implementation of checking TIF_MEMDIE covers the cases of OOM
killing. If the process has been OOM killed, the TIF_MEMDIE is set and it
return immediately. This patch includes:

1. add the case that the SIGKILL is sent by user processes. The process can
try to get_user_pages() unlimited memory even if a user process has sent a
SIGKILL to it(maybe a monitor find the process exceed its memory limit and
try to kill it). In the old implementation, the SIGKILL won't be handled
until the get_user_pages() returns.

2. change the return value to be ERESTARTSYS. It makes no sense to return
ENOMEM if the get_user_pages returned by getting a SIGKILL signal.
Considering the general convention for a system call interrupted by a
signal is ERESTARTNOSYS, so the current return value is consistant to that.

Signed-off-by: Paul Menage <menage@xxxxxxxxxx>
Singed-off-by: Ying Han <yinghan@xxxxxxxxxx>

include/linux/sched.h | 1 +
kernel/signal.c | 2 +-
mm/memory.c | 9 +-

diff --git a/include/linux/sched.h b/include/linux/sched.h
index b483f39..f9c6a8a 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1790,6 +1790,7 @@ extern void sched_dead(struct task_struct *p);
extern int in_group_p(gid_t);
extern int in_egroup_p(gid_t);

+extern int sigkill_pending(struct task_struct *tsk);
extern void proc_caches_init(void);
extern void flush_signals(struct task_struct *);
extern void ignore_signals(struct task_struct *);
diff --git a/kernel/signal.c b/kernel/signal.c
index 105217d..f3f154e 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1497,7 +1497,7 @@ static inline int may_ptrace_stop(void)
* Return nonzero if there is a SIGKILL that should be waking us up.
* Called with the siglock held.
*/
-static int sigkill_pending(struct task_struct *tsk)
+int sigkill_pending(struct task_struct *tsk)
{
return sigismember(&tsk->pending.signal, SIGKILL) ||
sigismember(&tsk->signal->shared_pending.signal, SIGKILL);
diff --git a/mm/memory.c b/mm/memory.c
index 164951c..482820a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1218,12 +1218,12 @@ int __get_user_pages(struct task_struct *tsk, struct m
struct page *page;

/*
- * If tsk is ooming, cut off its access to large memory
- * allocations. It has a pending SIGKILL, but it can't
- * be processed until returning to user space.
+ * If we have a pending SIGKILL, don't keep
+ * allocating memory.
*/
- if (unlikely(test_tsk_thread_flag(tsk, TIF_MEMDIE)))
- return i ? i : -ENOMEM;
+ if (unlikely(sigkill_pending(current) ||
+ sigkill_pending(tsk)))
+ return i ? i : -ERESTARTSYS;

if (write)
foll_flags |= FOLL_WRITE;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/