Re: [RFC PATCH V2 0/7] Do not read from descripto ring
From: Michael S. Tsirkin
Date: Mon Jul 12 2021 - 08:58:46 EST
On Mon, Jul 12, 2021 at 11:07:44AM +0800, Jason Wang wrote:
>
> 在 2021/7/12 上午12:08, Michael S. Tsirkin 写道:
> > On Fri, Jun 04, 2021 at 01:38:01PM +0800, Jason Wang wrote:
> > > 在 2021/5/14 下午7:13, Michael S. Tsirkin 写道:
> > > > On Thu, May 06, 2021 at 01:38:29PM +0100, Christoph Hellwig wrote:
> > > > > On Thu, May 06, 2021 at 04:12:17AM -0400, Michael S. Tsirkin wrote:
> > > > > > Let's try for just a bit, won't make this window anyway:
> > > > > >
> > > > > > I have an old idea. Add a way to find out that unmap is a nop
> > > > > > (or more exactly does not use the address/length).
> > > > > > Then in that case even with DMA API we do not need
> > > > > > the extra data. Hmm?
> > > > > So we actually do have a check for that from the early days of the DMA
> > > > > API, but it only works at compile time: CONFIG_NEED_DMA_MAP_STATE.
> > > > >
> > > > > But given how rare configs without an iommu or swiotlb are these days
> > > > > it has stopped to be very useful. Unfortunately a runtime-version is
> > > > > not entirely trivial, but maybe if we allow for false positives we
> > > > > could do something like this
> > > > >
> > > > > bool dma_direct_need_state(struct device *dev)
> > > > > {
> > > > > /* some areas could not be covered by any map at all */
> > > > > if (dev->dma_range_map)
> > > > > return false;
> > > > > if (force_dma_unencrypted(dev))
> > > > > return false;
> > > > > if (dma_direct_need_sync(dev))
> > > > > return false;
> > > > > return *dev->dma_mask == DMA_BIT_MASK(64);
> > > > > }
> > > > >
> > > > > bool dma_need_state(struct device *dev)
> > > > > {
> > > > > const struct dma_map_ops *ops = get_dma_ops(dev);
> > > > >
> > > > > if (dma_map_direct(dev, ops))
> > > > > return dma_direct_need_state(dev);
> > > > > return ops->unmap_page ||
> > > > > ops->sync_single_for_cpu || ops->sync_single_for_device;
> > > > > }
> > > > Yea that sounds like a good idea. We will need to document that.
> > > >
> > > >
> > > > Something like:
> > > >
> > > > /*
> > > > * dma_need_state - report whether unmap calls use the address and length
> > > > * @dev: device to guery
> > > > *
> > > > * This is a runtime version of CONFIG_NEED_DMA_MAP_STATE.
> > > > *
> > > > * Return the value indicating whether dma_unmap_* and dma_sync_* calls for the device
> > > > * use the DMA state parameters passed to them.
> > > > * The DMA state parameters are: scatter/gather list/table, address and
> > > > * length.
> > > > *
> > > > * If dma_need_state returns false then DMA state parameters are
> > > > * ignored by all dma_unmap_* and dma_sync_* calls, so it is safe to pass 0 for
> > > > * address and length, and DMA_UNMAP_SG_TABLE_INVALID and
> > > > * DMA_UNMAP_SG_LIST_INVALID for s/g table and length respectively.
> > > > * If dma_need_state returns true then DMA state might
> > > > * be used and so the actual values are required.
> > > > */
> > > >
> > > > And we will need DMA_UNMAP_SG_TABLE_INVALID and
> > > > DMA_UNMAP_SG_LIST_INVALID as pointers to an empty global table and list
> > > > for calls such as dma_unmap_sgtable that dereference pointers before checking
> > > > they are used.
> > > >
> > > >
> > > > Does this look good?
> > > >
> > > > The table/length variants are for consistency, virtio specifically does
> > > > not use s/g at the moment, but it seems nicer than leaving
> > > > users wonder what to do about these.
> > > >
> > > > Thoughts? Jason want to try implementing?
> > >
> > > I can add it in my todo list other if other people are interested in this,
> > > please let us know.
> > >
> > > But this is just about saving the efforts of unmap and it doesn't eliminate
> > > the necessary of using private memory (addr, length) for the metadata for
> > > validating the device inputs.
> >
> > Besides unmap, why do we need to validate address?
>
>
> Sorry, it's not validating actually, the driver doesn't do any validation.
> As the subject, the driver will just use the metadata stored in the
> desc_state instead of the one stored in the descriptor ring.
>
>
> > length can be
> > typically validated by specific drivers - not all of them even use it ..
> >
> > > And just to clarify, the slight regression we see is testing without
> > > VIRTIO_F_ACCESS_PLATFORM which means DMA API is not used.
> > I guess this is due to extra cache pressure?
>
>
> Yes.
>
>
> > Maybe create yet another
> > array just for DMA state ...
>
>
> I'm not sure I get this, we use this basically:
>
> struct vring_desc_extra {
> dma_addr_t addr; /* Buffer DMA addr. */
> u32 len; /* Buffer length. */
> u16 flags; /* Descriptor flags. */
> u16 next; /* The next desc state in a list. */
> };
>
> Except for the "next" the rest are all DMA state.
>
> Thanks
I am talking about the dma need state idea where we interrogate the DMA
API to figure out whether unmap is actually a nop.
>
> >
> > > So I will go to post a formal version of this series and we can start from
> > > there.
> > >
> > > Thanks
> > >
> > >