Re: OOM killer, page fault

From: Minchan Kim
Date: Mon Nov 02 2009 - 18:31:37 EST


Hi, Hugh.

On Mon, 2 Nov 2009 16:26:56 +0000 (GMT)
Hugh Dickins <hugh.dickins@xxxxxxxxxxxxx> wrote:

> On Mon, 2 Nov 2009, Minchan Kim wrote:
> > On Mon, 2 Nov 2009 14:02:16 +0900
> > KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> > >
> > > Maybe some code returns VM_FAULT_OOM by mistake and pagefault_oom_killer()
> > > is called. digging mm/memory.c is necessary...
> > >
> > > I wonder why...now is this code
> > > ===
> > > static int do_nonlinear_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> > > unsigned long address, pte_t *page_table, pmd_t *pmd,
> > > unsigned int flags, pte_t orig_pte)
> > > {
> > > pgoff_t pgoff;
> > >
> > > flags |= FAULT_FLAG_NONLINEAR;
> > >
> > > if (!pte_unmap_same(mm, pmd, page_table, orig_pte))
> > > return 0;
> > >
> > > if (unlikely(!(vma->vm_flags & VM_NONLINEAR))) {
> > > /*
> > > * Page table corrupted: show pte and kill process.
> > > */
> > > print_bad_pte(vma, address, orig_pte, NULL);
> > > return VM_FAULT_OOM;
> > > }
> > >
> > > pgoff = pte_to_pgoff(orig_pte);
> > > return __do_fault(mm, vma, address, pmd, pgoff, flags, orig_pte);
> > > }
> > > ==
> > > Then, OOM...is this really OOM ?
> >
> > It seems that the goal is to kill process by OOM trick as comment said.
> >
> > I found It results from Hugh's commit 65500d234e74fc4e8f18e1a429bc24e51e75de4a.
> > I think it's not a real OOM.
> >
> > BTW, If it is culpit in this case, print_bad_pte should have remained any log. :)
>
> Yes, the chances are that this is not related to Norbert's problem.
> But thank you for reminding me of that not-very-nice hack of mine.
>
> It was kind-of valid at the time that I wrote it (2.6.15), when
> VM_FAULT_OOM did kill the faulting process. But since then the fault
> path has rightly been changed (in x86 at least, I didn't check the rest)
> to let the OOM killer decide who to kill: so now there's a danger that
> a pagetable corruption there will instead kill some unrelated process.
>
> Being lazy, I'm inclined simply to change that to VM_FAULT_SIGBUS now:
> which doesn't actually guarantee that the process will be killed, but
> should be better than just repeatedly re-faulting on the entry. (I
> don't much want to SIGKILL current since mm might not be current's.)
>
> That aberrant use of VM_FAULT_OOM has recently been copied into
> do_swap_page() (the first instance; the second instance is right -
> hmm, well, the second instance is normally right, but I guess it
> also covers pagetable corruption cases which we can't distinguish
> there; oh well) and should be corrected there too.
>
> Does VM_FAULT_SIGBUS sound good enough to you?

I am Okay.
First of all, we have to prevent innocent process killing.
Second, although it returns SIGBUS, we can distinguish it from normal SIGBUS
by bad pte log.
Third, we don't want to add new VM_FAULT_XXX as possible as. :)


>
> Hugh


--
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/