Re: [RFC] Make use of non-dynamic dmabuf in RDMA

From: Jason Gunthorpe
Date: Wed Aug 25 2021 - 08:38:10 EST


On Wed, Aug 25, 2021 at 02:27:08PM +0200, Christian König wrote:
> Am 25.08.21 um 14:18 schrieb Jason Gunthorpe:
> > On Wed, Aug 25, 2021 at 08:17:51AM +0200, Christian König wrote:
> >
> > > The only real option where you could do P2P with buffer pinning are those
> > > compute boards where we know that everything is always accessible to
> > > everybody and we will never need to migrate anything. But even then you want
> > > some mechanism like cgroups to take care of limiting this. Otherwise any
> > > runaway process can bring down your whole system.
> > Why? It is not the pin that is the problem, it was allocating GPU
> > dedicated memory in the first place. pinning it just changes the
> > sequence to free it. No different than CPU memory.
>
> Pinning makes the memory un-evictable.
>
> In other words as long as we don't pin anything we can support as many
> processes as we want until we run out of swap space. Swapping sucks badly
> because your applications become pretty much unuseable, but you can easily
> recover from it by killing some process.
>
> With pinning on the other hand somebody sooner or later receives an -ENOMEM
> or -ENOSPC and there is no guarantee that this goes to the right process.

It is not really different - you have the same failure mode once the
system runs out of swap.

This is really the kernel side trying to push a policy to the user
side that the user side doesn't want..

Dedicated systems are a significant use case here and should be
supported, even if the same solution wouldn't be applicable to someone
running a desktop.

Jason