Re: [PATCH] i915: slab shrinker have to return -1 if it cant shrink any objects

From: Chris Wilson
Date: Wed Jul 13 2011 - 04:40:39 EST


On Wed, 13 Jul 2011 17:19:22 +0900, KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
> (2011/07/13 16:41), Chris Wilson wrote:
> > On Wed, 13 Jul 2011 09:19:24 +0900, KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
> >> (2011/07/12 19:06), Chris Wilson wrote:
> >>> On Tue, 12 Jul 2011 18:36:50 +0900, KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
> >>>> Hi,
> >>>>
> >>>> sorry for the delay.
> >>>>
> >>>>> On Wed, 29 Jun 2011 20:53:54 -0700, Keith Packard <keithp@xxxxxxxxxx> wrote:
> >>>>>> On Fri, 24 Jun 2011 17:03:22 +0900, KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
> >> The issue is not contention. The problem occurs when the mutex is already held
> >> by the thread calling shrink_slab. i915_gem_inactive_shrink() then has no way to
> >> shrink objects. How do you detect that case?
> >
> > In the primary allocator for the backing pages whilst the mutex is held we
> > do __NORETRY and a manual shrinkage of our buffers before failing. That's
> > the largest allocator, all the others are tiny and short-lived by
> > comparison and left to fail.
>
> __NORETRY might help to avoid a false-positive OOM. But __NORETRY still performs
> a full page reclaim and may drop a lot of innocent page cache, and the system may
> then slow down.

But in this context, that is memory the user has requested to be used with
the GPU, so the page cache is sacrificed to meet the allocation, if
possible.

> Of course, you won't hit such a worst-case scenario easily. But you need to
> consider the worst case if you touch memory management code.

Actually we'd much rather you took us into account when designing the mm.

> > That's the *error* I hit when I originally returned -1.
>
> You misunderstand the code. The third argument is critically important.
> Only when it is 0 (i.e. sc->nr_to_scan == 0) must the shrinker not return a
> negative value. Thus, my patch checks the nr_to_scan argument, and I suggested
> looking at shrink_icache_memory().

Ok.
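
For reference, the return-value rule quoted above can be sketched as a small
userspace mock. This is only an illustration under assumptions: the
mock_cache and mock_shrink_control types and the mock_shrink() function are
hypothetical stand-ins, not the i915 or mm code, and the lock_held flag
stands in for a failed trylock on struct_mutex.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the shrinker's control argument. */
struct mock_shrink_control {
    unsigned long nr_to_scan;   /* 0 means "only report how much is freeable" */
};

/* Hypothetical cache guarded by a lock we may fail to take. */
struct mock_cache {
    bool lock_held;             /* models a failed mutex trylock */
    int nr_inactive;            /* objects that could be freed */
};

/*
 * Returns the number of freeable objects left, or -1 when asked to scan
 * but unable to make progress. A query pass (nr_to_scan == 0) never
 * returns a negative value.
 */
static int mock_shrink(struct mock_cache *c,
                       const struct mock_shrink_control *sc)
{
    if (c->lock_held)
        return sc->nr_to_scan ? -1 : 0;

    if (sc->nr_to_scan) {
        int want = sc->nr_to_scan > (unsigned long)c->nr_inactive
                       ? c->nr_inactive
                       : (int)sc->nr_to_scan;
        c->nr_inactive -= want;  /* "free" up to nr_to_scan objects */
    }
    return c->nr_inactive;
}
```

With the lock held, a query pass yields 0 and a scan pass yields -1; once the
lock is free, a scan pass frees objects and returns what remains.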

> If you think the shrinker protocol is too complicated, a doc update
> patch is really welcome.

What I don't understand is the disconnect between objects to shrink and
the number of pages released. We may have tens of thousands of single page
objects that are expensive to free in comparison to a few 10-100MiB
objects that are just sitting idle. Would it be better to report the
estimated number of shrinkable pages instead?
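
To make the disconnect concrete, here is a rough userspace sketch comparing
the two ways of reporting. The mock_obj type and both counting helpers are
hypothetical, and the 4 KiB page size used in the comments is an assumption;
none of this is taken from the driver.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer object: only its size and idle state matter here. */
struct mock_obj {
    size_t nr_pages;   /* backing pages held by this object */
    bool inactive;     /* true if the object is idle and could be freed */
};

/* What an object-count report would return. */
static size_t count_inactive_objects(const struct mock_obj *objs, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (objs[i].inactive)
            count++;
    return count;
}

/* What a page-based estimate would return instead. */
static size_t count_inactive_pages(const struct mock_obj *objs, size_t n)
{
    size_t pages = 0;
    for (size_t i = 0; i < n; i++)
        if (objs[i].inactive)
            pages += objs[i].nr_pages;
    return pages;
}
```

Three idle single-page objects plus one idle 100MiB object (25600 pages,
assuming 4 KiB pages) count as 4 objects but 25603 pages, so the object count
says little about how much memory a scan can actually release.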
-Chris

--
Chris Wilson, Intel Open Source Technology Centre