Re: [PATCH v6 17/22] drm/shmem-helper: Add generic memory shrinker

From: Daniel Vetter
Date: Fri Jun 24 2022 - 16:23:58 EST

Next message: Daniel Latypov: "Re: [PATCH v2] kunit: tool: refactor internal kconfig handling, allow overriding"
Previous message: Daniel Vetter: "Re: [PATCH v6 17/22] drm/shmem-helper: Add generic memory shrinker"
In reply to: Daniel Vetter: "Re: [PATCH v6 17/22] drm/shmem-helper: Add generic memory shrinker"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Mon, Jun 20, 2022 at 08:18:04AM -0700, Rob Clark wrote:
> On Mon, Jun 20, 2022 at 7:09 AM Dmitry Osipenko
> <dmitry.osipenko@xxxxxxxxxxxxx> wrote:
> >
> > On 6/19/22 20:53, Rob Clark wrote:
> > ...
> > >> +static unsigned long
> > >> +drm_gem_shmem_shrinker_count_objects(struct shrinker *shrinker,
> > >> + struct shrink_control *sc)
> > >> +{
> > >> + struct drm_gem_shmem_shrinker *gem_shrinker = to_drm_shrinker(shrinker);
> > >> + struct drm_gem_shmem_object *shmem;
> > >> + unsigned long count = 0;
> > >> +
> > >> + if (!mutex_trylock(&gem_shrinker->lock))
> > >> + return 0;
> > >> +
> > >> + list_for_each_entry(shmem, &gem_shrinker->lru_evictable, madv_list) {
> > >> + count += shmem->base.size;
> > >> +
> > >> + if (count >= SHRINK_EMPTY)
> > >> + break;
> > >> + }
> > >> +
> > >> + mutex_unlock(&gem_shrinker->lock);
> > >
> > > As I mentioned on other thread, count_objects, being approximate but
> > > lockless and fast is the important thing. Otherwise when you start
> > > hitting the shrinker on many threads, you end up serializing them all,
> > > even if you have no pages to return to the system at that point.
> >
> > Daniel's point for dropping the lockless variant was that we're already
> > in trouble if we're hitting shrinker too often and extra optimizations
> > won't bring much benefits to us.
>
> At least with zram swap (which I highly recommend using even if you
> are not using a physical swap file/partition), swapin/out is actually
> quite fast. And if you are leaning on zram swap to fit 8GB of chrome
> browser on a 4GB device, the shrinker gets hit quite a lot. Lower
> spec (4GB RAM) chromebooks can be under constant memory pressure and
> can quite easily get into a situation where you are hitting the
> shrinker on many threads simultaneously. So it is pretty important
> for all shrinkers in the system (not just drm driver) to be as
> concurrent as possible. As long as you avoid serializing reclaim on
> all the threads, performance can still be quite good, but if you don't
> performance will fall off a cliff.
>
> jfwiw, we are seeing pretty good results (iirc 40-70% increase in open
> tab counts) with the combination of eviction + multigen LRU[1] +
> sizing zram swap to be 2x physical RAM
>
> [1] https://lwn.net/Articles/856931/
>
> > Alright, I'll add back the lockless variant (or will use yours
> > drm_gem_lru) in the next revision. The code difference is very small
> > after all.
> >
> > ...
> > >> + /* prevent racing with the dma-buf importing/exporting */
> > >> + if (!mutex_trylock(&gem_shrinker->dev->object_name_lock)) {
> > >> + *lock_contention |= true;
> > >> + goto resv_unlock;
> > >> + }
> > >
> > > I'm not sure this is a good idea to serialize on object_name_lock.
> > > Purgeable buffers should never be shared (imported or exported). So
> > > at best you are avoiding evicting and immediately swapping back in, in
> > > a rare case, at the cost of serializing multiple threads trying to
> > > reclaim pages in parallel.
> >
> > The object_name_lock shouldn't cause contention in practice. But objects
> > are also pinned on attachment, hence maybe this lock is indeed
> > unnecessary.. I'll re-check it.
>
> I'm not worried about contention with export/import/etc, but
> contention between multiple threads hitting the shrinker in parallel.
> I guess since you are using trylock, it won't *block* the other
> threads hitting shrinker, but they'll just end up looping in
> do_shrink_slab() because they are hitting contention.
>
> I'd have to do some experiments to see how it works out in practice,
> but my gut feel is that it isn't a good idea

Yeah trylock on anything else than the object lock is No Good in the
shrinker. And it really shouldn't be needed, since import/export should
pin stuff as needed. Which should be protected by the dma_resv object
lock. If not, we need to fix that.

Picking a random drm-internal lock like this is definitely no good design.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Next message: Daniel Latypov: "Re: [PATCH v2] kunit: tool: refactor internal kconfig handling, allow overriding"
Previous message: Daniel Vetter: "Re: [PATCH v6 17/22] drm/shmem-helper: Add generic memory shrinker"
In reply to: Daniel Vetter: "Re: [PATCH v6 17/22] drm/shmem-helper: Add generic memory shrinker"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]