Re: [PATCH] drm/virtio: Use vmalloc for command buffer allocations.

From: David Riley
Date: Fri Aug 30 2019 - 13:49:41 EST


Hi Gerd,

On Fri, Aug 30, 2019 at 4:16 AM Gerd Hoffmann <kraxel@xxxxxxxxxx> wrote:
>
> Hi,
>
> > > > - kfree(vbuf->data_buf);
> > > > + kvfree(vbuf->data_buf);
> > >
> > > if (is_vmalloc_addr(vbuf->data_buf)) ...
> > >
> > > needed here I gues?
> > >
> >
> > kvfree() handles vmalloc/kmalloc/kvmalloc internally by doing that check.
>
> Ok.
>
> > - videobuf_vmalloc_to_sg in drivers/media/v4l2-core/videobuf-dma-sg.c,
> > assumes contiguous array of scatterlist and that the buffer being converted
> > is page aligned
>
> Well, vmalloc memory _is_ page aligned.

True, but this function gets called for all potential enqueuings (eg
resource_create_3d, resource_attach_backing) and I was concerned that
some other usage in the future might not have that guarantee. But if
that's overly being paranoid, I'm willing to backtrack on that.

> sg_alloc_table_from_pages() does alot of what you need, you just need a
> small loop around vmalloc_to_page() create a struct page array
> beforehand.

That feels like an extra allocation when under memory pressure and
more work, to not gain much -- there still needs to be a function that
iterates through all the pages. But I don't feel super strongly about
it and can change it if you think that it will be less maintenance
overhead.

> Completely different approach: use get_user_pages() and don't copy the
> execbuffer at all.

This one I'd rather not do unless we can show that the copy is
actually a problem. As it stands I expect this to be a fallback
instead of the regular case, and if it's not then the VMs need to
revisit how long the control queue size is. Right now most
execbuffers end up being copied into a single contiguous buffer which
results in 3 descriptors of the virtio queue. If vmalloc ends up
being used (which is only a fallback because vmemdup_user tries
kmalloc first still), then there'll be 66 descriptors for a 256KB
buffer. I think that's fine for exceptional cases, but not ideal if
it was commonplace.

Chia-I suggested that we could have a flag for the ioctl execbuffer
indicating that the buffer is BO to solve that, but again I'd prefer
to see that it's actually a problem.

I'll also be moving the sg table construction out of the critical
section and properly accounting for the required number of descriptors
before entering it in response to his comments.

Thanks,
Dave