Re: [PATCH 09/16] dma-direct: Support PCI P2PDMA pages in dma-direct map_sg

From: Logan Gunthorpe
Date: Mon May 03 2021 - 13:07:36 EST




On 2021-05-02 5:28 p.m., John Hubbard wrote:
>> @@ -387,19 +388,37 @@ void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
>
> This routine now deserves a little bit of commenting, now that it is
> doing less obvious things. How about something like this:
>
> /*
> * Unmaps pages, except for PCI_P2PDMA pages, which were never mapped in the
> * first place. Instead of unmapping PCI_P2PDMA entries, simply remove the
> * SG_PCI_P2PDMA mark
> */
> void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
> int nents, enum dma_data_direction dir, unsigned long attrs)
> {
>

Ok.

>> struct scatterlist *sg;
>> int i;
>>
>> - for_each_sg(sgl, sg, nents, i)
>> + for_each_sg(sgl, sg, nents, i) {
>> + if (sg_is_pci_p2pdma(sg)) {
>> + sg_unmark_pci_p2pdma(sg);
>> + continue;
>> + }
>> +
>> dma_direct_unmap_page(dev, sg->dma_address, sg_dma_len(sg), dir,
>> attrs);
>> + }
>
> The same thing can be achieved with fewer lines and a bit more clarity.
> Can we please do it like this instead:
>
> 	for_each_sg(sgl, sg, nents, i) {
> 		if (sg_is_pci_p2pdma(sg))
> 			sg_unmark_pci_p2pdma(sg);
> 		else
> 			dma_direct_unmap_page(dev, sg->dma_address,
> 					      sg_dma_len(sg), dir, attrs);
> 	}
>
>

That's debatable (the way I did it emphasizes the common case). But I'll
consider changing it.

>
> Also here, a block comment for the function would be nice. How about
> approximately this:
>
> /*
> * Maps each SG segment. Returns the number of entries mapped, or 0 upon
> * failure. If any entry could not be mapped, then no entries are mapped.
> */
>
> I'll stop complaining about the pre-existing return code conventions,
> since by now you know what I was thinking of saying. :)

Not really part of this patchset... Seems like if you think there should
be a comment like that here, you should send a patch. But this patch
starts returning a negative value here.

>> int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
>> enum dma_data_direction dir, unsigned long attrs)
>> {
>> - int i;
>> + struct pci_p2pdma_map_state p2pdma_state = {};
>
> Is it worth putting this stuff on the stack--is there a noticeable
> performance improvement from caching the state? Because if it's
> invisible, then simplicity is better. I suspect you're right, and that
> it *is* worth it, but it's good to know for real.
>
>> struct scatterlist *sg;
>> + int i, ret = 0;
>>
>> for_each_sg(sgl, sg, nents, i) {
>> + if (is_pci_p2pdma_page(sg_page(sg))) {
>> + ret = pci_p2pdma_map_segment(&p2pdma_state, dev, sg,
>> + attrs);
>> + if (ret < 0) {
>> + goto out_unmap;
>> + } else if (ret) {
>> + ret = 0;
>> + continue;
>
> Is this a bug? If neither of those "if" branches fires (ret == 0), then
> the code (probably unintentionally) falls through and continues on to
> attempt to call dma_direct_map_page()--despite being a PCI_P2PDMA page!

No, it's not a bug. Per the documentation of pci_p2pdma_map_segment(),
if it returns zero the segment should be mapped normally. P2PDMA pages
must be mapped with physical addresses (or IOVA addresses) if the TLPs
for the transaction will go through the host bridge.
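
To illustrate the three-way return convention, here is a rough userspace
model in plain C. It is only a sketch: struct mock_seg, enum mock_route,
mock_p2pdma_map_segment() and mock_map_sg() are hypothetical stand-ins and
not the kernel API, the route field fakes whatever the real code would look
up, and unmapping on error is left out.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of this is kernel API. */
enum mock_route { ROUTE_NOT_SUPPORTED, ROUTE_BUS_ADDR, ROUTE_THRU_HOST_BRIDGE };

struct mock_seg {
	int is_p2pdma;		/* models is_pci_p2pdma_page(sg_page(sg)) */
	enum mock_route route;	/* only meaningful when is_p2pdma is set */
	int bus_mapped;		/* set when mapped with a bus address */
	int direct_mapped;	/* set when mapped through the normal path */
};

/*
 * Models the convention discussed in this thread:
 *   <  0  the segment cannot be mapped,
 *   >  0  the segment was mapped with a bus address, nothing more to do,
 *   == 0  the caller must map the segment normally.
 */
static int mock_p2pdma_map_segment(struct mock_seg *seg)
{
	switch (seg->route) {
	case ROUTE_BUS_ADDR:
		seg->bus_mapped = 1;
		return 1;
	case ROUTE_THRU_HOST_BRIDGE:
		return 0;
	default:
		return -1;
	}
}

/* Returns the number of segments mapped, or a negative value on failure. */
static int mock_map_sg(struct mock_seg *segs, size_t nents)
{
	size_t i;
	int ret;

	for (i = 0; i < nents; i++) {
		if (segs[i].is_p2pdma) {
			ret = mock_p2pdma_map_segment(&segs[i]);
			if (ret < 0)
				return ret;
			if (ret > 0)
				continue;
			/* ret == 0: fall through to the normal mapping */
		}
		segs[i].direct_mapped = 1;
	}
	return (int)nents;
}
```

In this model a P2PDMA segment routed through the host bridge ends up with
direct_mapped set, which is the fall-through case being asked about.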

> See below for suggestions:
>
>> + }
>> + }
>> +
>> sg->dma_address = dma_direct_map_page(dev, sg_page(sg),
>> sg->offset, sg->length, dir, attrs);
>> if (sg->dma_address == DMA_MAPPING_ERROR)
>
> This is another case in which "continue" is misleading and not as good
> as "else". Because unless I'm wrong above, you really only want to take
> one path *or* the other.

No, per above, it's not one path or the other. If it's a P2PDMA page it
may still need to be mapped normally.

> Also, the "else if (ret)" can be simplified to just setting ret = 0
> unconditionally.

I don't follow. If ret is set, we need to unset it before the end of the
loop.
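
A small userspace sketch of why the reset matters. This assumes the error
path returns ret, which this thread implies but does not show in full; enum
mock_outcome and mock_map_sg_ret() are hypothetical names for illustration
only, and the reset_ret flag exists just to compare the two behaviours.

```c
#include <stddef.h>

/* Hypothetical model: each entry says what mapping that segment would do. */
enum mock_outcome { P2P_BUS_MAPPED, DIRECT_OK, DIRECT_FAIL };

static int mock_map_sg_ret(const enum mock_outcome *segs, size_t nents,
			   int reset_ret)
{
	size_t i;
	int ret = 0;

	for (i = 0; i < nents; i++) {
		if (segs[i] == P2P_BUS_MAPPED) {
			ret = 1;	/* positive: already mapped, skip normal path */
			if (reset_ret)
				ret = 0;
			continue;
		}
		if (segs[i] == DIRECT_FAIL)
			goto out_unmap;
	}
	return (int)nents;

out_unmap:
	/* Unmapping omitted; assumed to return ret as the failure value. */
	return ret;
}
```

With a bus-mapped P2PDMA segment followed by a failing normal mapping, the
version without the reset returns the stale positive value, which a caller
would read as success; with the reset it returns 0.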

> Given all that, here's a suggested alternative, which is both shorter
> and clearer, IMHO:
>
> 	for_each_sg(sgl, sg, nents, i) {
> 		if (is_pci_p2pdma_page(sg_page(sg))) {
> 			ret = pci_p2pdma_map_segment(&p2pdma_state, dev, sg,
> 						     attrs);
> 			if (ret < 0)
> 				goto out_unmap;
> 			else
> 				ret = 0;
> 		} else {
> 			sg->dma_address = dma_direct_map_page(dev, sg_page(sg),
> 					sg->offset, sg->length, dir, attrs);
> 			if (sg->dma_address == DMA_MAPPING_ERROR)
> 				goto out_unmap;
> 			sg_dma_len(sg) = sg->length;
> 		}
> 	}

No, per the comments above, this does not accomplish the same thing and
is not correct.

I'll try to add a comment to the code to make it clearer. But the
kernel doc on pci_p2pdma_map_segment() does mention what must be done
for different return values explicitly.

Logan