Re: [RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation

From: Suren Baghdasaryan
Date: Wed Feb 10 2021 - 12:29:42 EST


On Wed, Feb 10, 2021 at 9:21 AM Daniel Vetter <daniel@xxxxxxxx> wrote:
>
> On Wed, Feb 10, 2021 at 5:39 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> >
> > On Wed, Feb 10, 2021 at 5:06 AM Daniel Vetter <daniel@xxxxxxxx> wrote:
> > >
> > > On Tue, Feb 09, 2021 at 12:16:51PM -0800, Suren Baghdasaryan wrote:
> > > > On Tue, Feb 9, 2021 at 12:03 PM Daniel Vetter <daniel@xxxxxxxx> wrote:
> > > > >
> > > > > On Tue, Feb 9, 2021 at 6:46 PM Christian König <christian.koenig@xxxxxxx> wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > > Am 09.02.21 um 18:33 schrieb Suren Baghdasaryan:
> > > > > > > On Tue, Feb 9, 2021 at 4:57 AM Christian König <christian.koenig@xxxxxxx> wrote:
> > > > > > >> Am 09.02.21 um 13:11 schrieb Christian König:
> > > > > > >>> [SNIP]
> > > > > > >>>>>> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
> > > > > > >>>>>> +{
> > > > > > >>>>>> + spin_lock(&pool->lock);
> > > > > > >>>>>> + list_add_tail(&page->lru, &pool->items);
> > > > > > >>>>>> + pool->count++;
> > > > > > >>>>>> + atomic_long_add(1 << pool->order, &total_pages);
> > > > > > >>>>>> + spin_unlock(&pool->lock);
> > > > > > >>>>>> +
> > > > > > >>>>>> + mod_node_page_state(page_pgdat(page),
> > > > > > >>>>>> NR_KERNEL_MISC_RECLAIMABLE,
> > > > > > >>>>>> + 1 << pool->order);
> > > > > > >>>>> Hui what? What should that be good for?
> > > > > > >>>> This is a carryover from the ION page pool implementation:
> > > > > > >>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fstaging%2Fandroid%2Fion%2Fion_page_pool.c%3Fh%3Dv5.10%23n28&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cdccccff8edcd4d147a5b08d8cd20cff2%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637484888114923580%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=9%2BIBC0tezSV6Ci4S3kWfW%2BQvJm4mdunn3dF6C0kyfCw%3D&amp;reserved=0
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>> My sense is it helps with the vmstat/meminfo accounting so folks can
> > > > > > >>>> see the cached pages are shrinkable/freeable. This maybe falls under
> > > > > > >>>> other dmabuf accounting/stats discussions, so I'm happy to remove it
> > > > > > >>>> for now, or let the drivers using the shared page pool logic handle
> > > > > > >>>> the accounting themselves?
> > > > > > >> Intentionally separated the discussion for that here.
> > > > > > >>
> > > > > > >> As far as I can see this is just bluntly incorrect.
> > > > > > >>
> > > > > > >> Either the page is reclaimable or it is part of our pool and freeable
> > > > > > >> through the shrinker, but never ever both.
> > > > > > > IIRC the original motivation for counting ION pooled pages as
> > > > > > > reclaimable was to include them into /proc/meminfo's MemAvailable
> > > > > > > calculations. NR_KERNEL_MISC_RECLAIMABLE defined as "reclaimable
> > > > > > > non-slab kernel pages" seems like a good place to account for them but
> > > > > > > I might be wrong.
> > > > > >
> > > > > > Yeah, that's what I see here as well. But exactly that is utterly nonsense.
> > > > > >
> > > > > > Those pages are not "free" in the sense that get_free_page could return
> > > > > > them directly.
> > > > >
> > > > > Well on Android that is kinda true, because Android has it's
> > > > > oom-killer (way back it was just a shrinker callback, not sure how it
> > > > > works now), which just shot down all the background apps. So at least
> > > > > some of that (everything used by background apps) is indeed
> > > > > reclaimable on Android.
> > > > >
> > > > > But that doesn't hold on Linux in general, so we can't really do this
> > > > > for common code.
> > > > >
> > > > > Also I had a long meeting with Suren, John and other googles
> > > > > yesterday, and the aim is now to try and support all the Android gpu
> > > > > memory accounting needs with cgroups. That should work, and it will
> > > > > allow Android to handle all the Android-ism in a clean way in upstream
> > > > > code. Or that's at least the plan.
> > > > >
> > > > > I think the only thing we identified that Android still needs on top
> > > > > is the dma-buf sysfs stuff, so that shared buffers (which on Android
> > > > > are always dma-buf, and always stay around as dma-buf fd throughout
> > > > > their lifetime) can be listed/analyzed with full detail.
> > > > >
> > > > > But aside from this the plan for all the per-process or per-heap
> > > > > account, oom-killer integration and everything else is planned to be
> > > > > done with cgroups.
> > > >
> > > > Until cgroups are ready we probably will need to add a sysfs node to
> > > > report the total dmabuf pool size and I think that would cover our
> > > > current accounting need here.
> > > > As I mentioned, not including dmabuf pools into MemAvailable would
> > > > affect that stat and I'm wondering if pools should be considered as
> > > > part of MemAvailable or not. Since MemAvailable includes SReclaimable
> > > > I think it makes sense to include them but maybe there are other
> > > > considerations that I'm missing?
> > >
> > > On Android, yes, on upstream, not so much. Because upstream doesn't have
> > > the android low memory killer cleanup up all the apps, so effectively we
> > > can't reclaim that memory, and we shouldn't report it as such.
> > > -Daniel
> >
> > Hmm. Sorry, I fail to see why Android's low memory killer makes a
> > difference here. In my mind, the pages in the pools are not used but
> > kept there in case heaps need them (maybe that's the part I'm wrong?).
> > These pages can be freed by the shrinker if memory pressure rises. In
> > that sense I think it's very similar to reclaimable slabs which are
> > already accounted as part of MemAvailable. So it seems logical to me
> > to include unused pages in the pools here as well. What am I missing?
>
> Ah yes, those page pool pages we can list. But conceptually (at least
> in the internals) they're shrunk through mm shrinker callbacks, like
> slab cache memory. So not exactly sure where to list that.
>
> Since we have the same pools for gpu allocations on the ttm side and
> John is looking into unifying those, maybe we could add that as a
> patch on top? For some nice consistency across all gpu drivers from
> android to discrete. I think if you, John and Christian from ttm side
> can figure out how these page pools should be reported we'll have
> something that fits? Maybe John can ping you on the other thread with
> the shared pool rfc between ttm and dma-buf heaps (there's so much
> going right now all over I'm a bit lost).

Sounds good. I'll follow up with John to see where this discussion fits better.
Thanks!

>
> Cheers, Daniel
>
> >
> > >
> > > >
> > > > > Android (for now) only needs to account overall gpu
> > > > > memory since none of it is swappable on android drivers anyway, plus
> > > > > no vram, so not much needed.
> > > > >
> > > > > Cheers, Daniel
> > > > >
> > > > > >
> > > > > > Regards,
> > > > > > Christian.
> > > > > >
> > > > > > >
> > > > > > >> In the best case this just messes up the accounting, in the worst case
> > > > > > >> it can cause memory corruption.
> > > > > > >>
> > > > > > >> Christian.
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Daniel Vetter
> > > > > Software Engineer, Intel Corporation
> > > > > http://blog.ffwll.ch
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > http://blog.ffwll.ch
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch