Re: [PATCH mmotm] memcg: further prevent OOM with too many dirty pages

From: Michal Hocko
Date: Tue Jul 17 2012 - 02:33:03 EST


On Mon 16-07-12 21:52:51, Hugh Dickins wrote:
> On Mon, 16 Jul 2012, Michal Hocko wrote:
> > On Mon 16-07-12 01:35:34, Hugh Dickins wrote:
> > > But even so, the test still OOMs sometimes: when originally testing
> > > on 3.5-rc6, it OOMed about one time in five or ten; when testing
> > > just now on 3.5-rc6-mm1, it OOMed on the first iteration.
> > >
> > > This residual problem comes from an accumulation of pages under
> > > ordinary writeback, not marked PageReclaim, so rightly not causing
> > > the memcg check to wait on their writeback: these too can prevent
> > > shrink_page_list() from freeing any pages, so many times that memcg
> > > reclaim fails and OOMs.
> >
> > I guess you managed to trigger this with 20M limit, right?
>
> That's right.
>
> > I have tested
> > with different group sizes but the writeback didn't trigger for most of
> > them and all the dirty data were flushed from the reclaim.
>
> I didn't examine writeback stats to confirm, but I guess that just
> occasionally it managed to come in and do enough work to confound us.
>
> > Have you used any special setting for the dirty ratio?
>
> No, I wasn't imaginative enough to try that.
>
> > Or was it with xfs (IIUC that one
> > does ignore writeback from the direct reclaim completely).
>
> No, just ext4 at that point.
>
> I have since tested the final patch with ext4, ext3 (by ext3 driver
> and by ext4 driver), ext2 (by ext2 driver and by ext4 driver), xfs,
> btrfs, vfat, tmpfs (with swap on the USB stick) and block device:
> about an hour on each, no surprises, all okay.
>
> But I didn't experiment beyond the 20M memcg.

Great coverage anyway. Thanks a lot Hugh!

>
> Hugh

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic