Re: OOM killer, page fault

From: Minchan Kim
Date: Mon Nov 02 2009 - 01:37:59 EST


On Mon, 2 Nov 2009 14:02:16 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:

> On Mon, 2 Nov 2009 13:56:40 +0900
> Minchan Kim <minchan.kim@xxxxxxxxx> wrote:
>
> > On Mon, 2 Nov 2009 13:24:06 +0900 (JST)
> > KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
> >
> > > Hi,
> > >
> > > (Cc to linux-mm)
> > >
> > > Wow, this is a very strange log.
> > >
> > > > Dear all,
> > > >
> > > > (please Cc)
> > > >
> > > > With 2.6.32-rc5 I got that one:
> > > > [13832.210068] Xorg invoked oom-killer: gfp_mask=0x0, order=0, oom_adj=0
> > >
> > > order = 0
> >
> > I think this problem results from 'gfp_mask = 0x0'.
> > Is it possible?
> >
> > If it isn't a H/W problem, who passes gfp_mask with 0x0?
> > That is the culprit.
> >
> > Could you add BUG_ON(gfp_mask == 0x0) in __alloc_pages_nodemask's head?
> >
>
> Maybe some code returns VM_FAULT_OOM by mistake and pagefault_out_of_memory()
> is called. Digging through mm/memory.c is necessary...

I suspect the GPU drivers used by X.
It seems many of them return VM_FAULT_OOM.
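
To make the suspicion concrete, here is a minimal userspace sketch of the
test that __do_fault() applies to a fault handler's return value. The flag
values are assumed copies of the 2.6.32-era definitions in
include/linux/mm.h, and the two helper names are hypothetical, so please
check them against your own tree.

```c
/*
 * Assumed copies of the VM_FAULT_* bits from include/linux/mm.h
 * (2.6.32-era). Verify against your tree before relying on them.
 */
#define VM_FAULT_OOM      0x0001
#define VM_FAULT_SIGBUS   0x0002
#define VM_FAULT_HWPOISON 0x0010
#define VM_FAULT_NOPAGE   0x0100
#define VM_FAULT_ERROR    (VM_FAULT_OOM | VM_FAULT_SIGBUS | VM_FAULT_HWPOISON)

/* Hypothetical helper: the same mask test __do_fault() uses to bail out. */
static int fault_bails_out(unsigned int ret)
{
	return (ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE)) != 0;
}

/* Hypothetical helper: true when the handler reported out-of-memory. */
static int fault_reports_oom(unsigned int ret)
{
	return (ret & VM_FAULT_OOM) != 0;
}
```

A driver handler that returns VM_FAULT_OOM for any failure passes both
tests, even when no page allocation (and so no real gfp_mask) was involved.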

If it happens during a file-mapped fault, the following debug patch should
show the culprit.

Norbert, could you apply this patch and test again?
Once you have the address, you can find the function symbol in System.map.


diff --git a/mm/memory.c b/mm/memory.c
index 7e91b5f..47e4b15 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2713,7 +2713,11 @@ static int __do_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	vmf.page = NULL;
 
 	ret = vma->vm_ops->fault(vma, &vmf);
-	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE)))
+	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE))) {
+		printk(KERN_DEBUG "vma->vm_ops->fault : 0x%lx\n",
+		       (unsigned long)vma->vm_ops->fault);
+		WARN_ON(1);
 		return ret;
+	}
 
 	if (unlikely(PageHWPoison(vmf.page))) {
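
For the System.map step, a rough userspace sketch of the lookup follows. It
assumes the usual "<hex address> <type> <name>" line format; the function
name lookup_symbol is hypothetical, and this is only an illustration of the
idea, not a replacement for proper tools.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan a System.map-style file ("<hex address> <type> <name>" per line)
 * and copy into `name` the symbol with the highest address that is still
 * <= addr. Returns 0 on success, -1 if the file cannot be opened or no
 * symbol starts at or below addr.
 */
static int lookup_symbol(const char *map_path, unsigned long addr,
			 char *name, size_t name_len)
{
	char line[512], sym[256], type;
	unsigned long sym_addr, best = 0;
	int found = 0;
	FILE *fp;

	if (name_len == 0)
		return -1;

	fp = fopen(map_path, "r");
	if (!fp)
		return -1;

	while (fgets(line, sizeof(line), fp)) {
		/* Skip lines that do not match the expected format. */
		if (sscanf(line, "%lx %c %255s", &sym_addr, &type, sym) != 3)
			continue;
		if (sym_addr <= addr && (!found || sym_addr >= best)) {
			best = sym_addr;
			strncpy(name, sym, name_len - 1);
			name[name_len - 1] = '\0';
			found = 1;
		}
	}

	fclose(fp);
	return found ? 0 : -1;
}
```

Call it with the System.map that matches the running kernel and the address
from the debug line, e.g. lookup_symbol("System.map", addr, buf, sizeof(buf)).
If the handler lives in a module, its address will not be in System.map and
/proc/kallsyms is the place to look instead.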

--
Kind regards,
Minchan Kim