Re: [RFC] Per file OOM badness

From: Daniel Vetter
Date: Tue Jan 30 2018 - 04:31:55 EST


On Wed, Jan 24, 2018 at 01:11:09PM +0100, Christian König wrote:
> Am 24.01.2018 um 12:50 schrieb Michal Hocko:
> > On Wed 24-01-18 12:23:10, Michel Dänzer wrote:
> > > On 2018-01-24 12:01 PM, Michal Hocko wrote:
> > > > On Wed 24-01-18 11:27:15, Michel Dänzer wrote:
> > [...]
> > > > > 2. If the OOM killer kills a process which is sharing BOs with another
> > > > > process, this should result in the other process dropping its references
> > > > > to the BOs as well, at which point the memory is released.
> > > > OK. How exactly are those BOs mapped to the userspace?
> > > I'm not sure what you're asking. Userspace mostly uses a GEM handle to
> > > refer to a BO. There can also be userspace CPU mappings of the BO's
> > > memory, but userspace doesn't need CPU mappings for all BOs and only
> > > creates them as needed.
> > OK, I guess you have to bear with me some more. This whole stack is a
> > complete uknonwn. I am mostly after finding a boundary where you can
> > charge the allocated memory to the process so that the oom killer can
> > consider it. Is there anything like that? Except for the proposed file
> > handle hack?
>
> Not that I knew of.
>
> As I said before we need some kind of callback that a process now starts to
> use a file descriptor, but without anything from that file descriptor mapped
> into the address space.

For more context: With DRI3 and wayland the compositor opens the DRM fd
and then passes it to the client, which then starts allocating stuff. That
makes book-keeping rather annoying.

I guess a good first order approximation would be if we simply charge any
newly allocated buffers to the process that created them, but that means
hanging onto lots of mm_struct pointers since we want to make sure we then
release those pages to the right mm again (since the process that drops
the last ref might be a totally different one, depending upon how the
buffers or DRM fd have been shared).

Would it be ok to hang onto potentially arbitrary mmget references
essentially forever? If that's ok I think we can do your process based
account (minus a few minor inaccuracies for shared stuff perhaps, but no
one cares about that).
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch