Re: abnormal OOM killer message

From: Mel Gorman
Date: Wed Aug 19 2009 - 06:36:16 EST


On Wed, Aug 19, 2009 at 03:49:58PM +0900, Minchan Kim wrote:
> On Wed, 19 Aug 2009 15:24:54 +0900
> ????????? <chungki.woo@xxxxxxxxx> wrote:
>
> > Thank you very much for the replies.
> >
> > But I don't think this is related to the stale data problem in compcache.
> > My question was why the last-chance memory allocation failed.
> > When the OOM killer ran, the memory state did not look like one that
> > should trigger the OOM killer.
> > In particular, there were many free order-0 pages, and the allocation
> > order was zero.
> > I think that last allocation should have succeeded.
> > That's my worry.
>
> Yes. I agree with you.
> Mel, could you comment on this situation?
> Is it possible for an order-0 allocation to fail
> even when there are many free pages in the buddy allocator?
>

Not ordinarily. If it happens, I tend to suspect that the free list data
is corrupted and would put a check in __rmqueue() that looked like

BUG_ON(list_empty(&area->free_list) && area->nr_free);
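
To make the invariant behind that check concrete, here is a small userspace
model of it. This is only a sketch, not kernel code: the names
model_list_head, model_free_area and free_area_is_corrupt are made up for
illustration, and the real struct free_area layout depends on the kernel
version.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list head, modelled on the kernel's list_head. */
struct model_list_head {
	struct model_list_head *next, *prev;
};

/* Hypothetical stand-in for one order's entry in the buddy free lists. */
struct model_free_area {
	struct model_list_head free_list; /* free blocks of this order */
	unsigned long nr_free;            /* how many blocks the counter claims are free */
};

static void model_list_init(struct model_list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void model_list_add(struct model_list_head *entry,
			   struct model_list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

static bool model_list_empty(const struct model_list_head *head)
{
	return head->next == head;
}

/*
 * The condition the BUG_ON would trap: the counter says free blocks
 * exist, but the list that should hold them is empty.
 */
static bool free_area_is_corrupt(const struct model_free_area *area)
{
	return model_list_empty(&area->free_list) && area->nr_free != 0;
}
```

An empty list with nr_free == 0 is fine, and so is a non-empty list with a
non-zero count; only the empty-list, non-zero-count combination is flagged.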

The second question is, why are we in direct reclaim this far above the
watermark? It should only be kswapd that is doing any reclaim at that
point. That makes me wonder again whether the free lists are corrupted.
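
For reference, the watermark test that gates an allocation can be modelled
roughly as below. This is a simplified userspace sketch that loosely follows
the shape of zone_watermark_ok() in 2.6-era kernels. It leaves out the
ALLOC_HIGH and ALLOC_HARDER adjustments, and struct model_zone with its
fields is an assumption made for illustration, not the kernel's struct zone.

```c
#include <stdbool.h>

#define MODEL_MAX_ORDER 11

/* Hypothetical, cut-down view of a zone's free-page accounting. */
struct model_zone {
	long free_pages;                        /* total free pages in the zone */
	long lowmem_reserve;                    /* reserve applying to this request's class */
	unsigned long nr_free[MODEL_MAX_ORDER]; /* free blocks per order */
};

/* Returns true if an allocation of the given order may proceed at this mark. */
static bool model_watermark_ok(const struct model_zone *z, int order, long mark)
{
	long min = mark;
	/* Pages that would remain free after the allocation, plus one. */
	long free_pages = z->free_pages - (1L << order) + 1;
	int o;

	if (free_pages <= min + z->lowmem_reserve)
		return false;

	for (o = 0; o < order; o++) {
		/* Blocks smaller than the request cannot satisfy it. */
		free_pages -= (long)(z->nr_free[o] << o);
		/* Require fewer pages to be free at each higher order. */
		min >>= 1;
		if (free_pages <= min)
			return false;
	}
	return true;
}
```

In this model an order-0 request with free pages well above the mark passes
unless the reserve term is large, which is why reclaim at that point looks
suspicious.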

The other possibility is that the zonelist used for allocation in the
troubled path contains no populated zones. I would put a BUG_ON check in
get_page_from_freelist() to check if the first zone in the zonelist has no
pages. If that bug triggers, it might explain why OOMs are triggering for
no good reason.
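
As a rough illustration of what such a check would test, here is a userspace
model of a zonelist. It is a sketch under assumptions: struct model_zone_info,
struct model_zonelist and both helpers are hypothetical names, and the real
zonelist layout differs between kernel versions.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_ZONES 4

/* Hypothetical, cut-down zone descriptor. */
struct model_zone_info {
	const char *name;
	unsigned long present_pages; /* pages actually backing this zone */
};

/* Zones in fallback order; unused trailing slots are NULL. */
struct model_zonelist {
	struct model_zone_info *zones[MODEL_MAX_ZONES];
};

/* The condition the suggested BUG_ON would trap. */
static bool first_zone_unpopulated(const struct model_zonelist *zl)
{
	const struct model_zone_info *first = zl->zones[0];

	return first == NULL || first->present_pages == 0;
}

/* Broader check: does any zone in the list have pages at all? */
static bool zonelist_has_populated_zone(const struct model_zonelist *zl)
{
	size_t i;

	for (i = 0; i < MODEL_MAX_ZONES && zl->zones[i] != NULL; i++)
		if (zl->zones[i]->present_pages != 0)
			return true;
	return false;
}
```

If the second helper returned false for the zonelist in the failing path, no
allocation from it could succeed regardless of how many pages are free
elsewhere.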

I consider both of those possibilities abnormal though.

> >
> > ------------------------------------------------------------------------
> > page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, order,
> > 		zonelist, ALLOC_WMARK_HIGH|ALLOC_CPUSET);
> > 		/* <== this is the last chance; uses ALLOC_WMARK_HIGH */
> > if (page)
> > 	goto got_pg;
> >
> > out_of_memory(zonelist, gfp_mask, order);
> > goto restart;
> > ------------------------------------------------------------------------
> >
> > > I have a question.
> > > Now the system has 79M as total swap.
> > > It's bigger than system memory size.
> > > Is it possible in compcache?
> > > Can we believe the number?
> >
> > Yes, it's possible. 79 MB is the amount of data that can be swapped.
> > It's the original data size, not the compressed data size.
>
> You mean that 79M worth of your pages are swapped out into compcache's
> reserved memory?
>

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab