Re: [PATCH 4/9] soc: apple: Add SART driver
From: Sven Peter
Date: Sat Apr 02 2022 - 08:39:20 EST
Hi,
Thanks for the review!
On Mon, Mar 21, 2022, at 18:07, Arnd Bergmann wrote:
> On Mon, Mar 21, 2022 at 5:50 PM Sven Peter <sven@xxxxxxxxxxxxx> wrote:
>>
>> The NVMe co-processor on the Apple M1 uses a DMA address filter called
>> SART for some DMA transactions. This adds a simple driver used to
>> configure the memory regions from which DMA transactions are allowed.
>>
>> Co-developed-by: Hector Martin <marcan@xxxxxxxxx>
>> Signed-off-by: Hector Martin <marcan@xxxxxxxxx>
>> Signed-off-by: Sven Peter <sven@xxxxxxxxxxxxx>
>
> Can you add some explanation about why this uses a custom interface
> instead of hooking into the dma_map_ops?
Sure.

In a perfect world this would just be an IOMMU implementation, but since
SART can't create any real IOVA space using pagetables, it doesn't fit
inside that subsystem.

In a slightly less perfect world I could just implement dma_map_ops here,
but that won't work either: not all DMA buffers of the NVMe device have
to go through SART, yet all of those allocations happen inside the same
device and would therefore use the same dma_map_ops.

The NVMe controller has two separate DMA filters:

 - NVMMU, which must be set up for any command that uses PRPs and
   ensures that the DMA transactions only touch the pages listed
   inside the PRP structure. NVMMU itself is tightly coupled
   to the NVMe controller: the list of allowed pages is configured
   based on the command's tag id, and even commands that require no
   DMA transactions must be listed inside NVMMU before they are started.
 - SART, which must be set up for some shared memory buffers (e.g.
   log messages from the NVMe firmware) and for some NVMe debug
   commands that don't use PRPs.
   SART is only loosely coupled to the NVMe controller and could
   also be used together with other devices. It's also the only
   thing that changed between M1 and M1 Pro/Max/Ultra, and that's
   why I decided to separate it from the NVMe driver.

I'll add this explanation to the commit message.
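
To make the "explicit interface instead of dma_map_ops" point a bit more
concrete, here is a toy userspace model of an allow-list style filter. It is
only a sketch and not the driver's code: every name in it (toy_sart_add,
toy_sart_remove, toy_sart_allows) and the entry count are made up for
illustration. The idea it shows is that the consumer opts in per buffer,
rather than every mapping of the device being routed through one set of ops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch only, not a statement about the hardware. */
#define TOY_SART_ENTRIES 16

struct toy_sart_entry {
	bool used;
	uint64_t paddr;
	uint64_t size;
};

struct toy_sart {
	struct toy_sart_entry entries[TOY_SART_ENTRIES];
};

/* Allow DMA to [paddr, paddr + size); returns 0, or -1 if no slot is free. */
static int toy_sart_add(struct toy_sart *sart, uint64_t paddr, uint64_t size)
{
	for (size_t i = 0; i < TOY_SART_ENTRIES; i++) {
		if (!sart->entries[i].used) {
			sart->entries[i] = (struct toy_sart_entry){ true, paddr, size };
			return 0;
		}
	}
	return -1;
}

/* Drop an exactly matching region; returns 0, or -1 if it was never added. */
static int toy_sart_remove(struct toy_sart *sart, uint64_t paddr, uint64_t size)
{
	for (size_t i = 0; i < TOY_SART_ENTRIES; i++) {
		struct toy_sart_entry *e = &sart->entries[i];

		if (e->used && e->paddr == paddr && e->size == size) {
			e->used = false;
			return 0;
		}
	}
	return -1;
}

/* Would the filter let a transfer of len bytes starting at addr through? */
static bool toy_sart_allows(const struct toy_sart *sart, uint64_t addr,
			    uint64_t len)
{
	for (size_t i = 0; i < TOY_SART_ENTRIES; i++) {
		const struct toy_sart_entry *e = &sart->entries[i];

		/* Written without addr + len to avoid overflow. */
		if (e->used && addr >= e->paddr && len <= e->size &&
		    addr - e->paddr <= e->size - len)
			return true;
	}
	return false;
}

/* Usage: only the buffer that was explicitly added is reachable. */
static void toy_sart_demo(void)
{
	struct toy_sart sart = { 0 };

	assert(toy_sart_add(&sart, 0x1000, 0x2000) == 0);
	assert(toy_sart_allows(&sart, 0x1800, 0x100));  /* inside the region */
	assert(!toy_sart_allows(&sart, 0x2f00, 0x200)); /* runs past the end */
	assert(toy_sart_remove(&sart, 0x1000, 0x2000) == 0);
	assert(!toy_sart_allows(&sart, 0x1800, 0x100)); /* region is gone */
}
```

In this model a buffer that is never passed to toy_sart_add() is simply not
affected, which mirrors the "not all DMA buffers have to go through SART"
situation described above.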
>
>> +static void sart2_get_entry(struct apple_sart *sart, int index, u8 *flags,
>> + phys_addr_t *paddr, size_t *size)
>> +{
>> + u32 cfg = readl_relaxed(sart->regs + APPLE_SART2_CONFIG(index));
>> + u32 paddr_ = readl_relaxed(sart->regs + APPLE_SART2_PADDR(index));
>
> Why do you use the _relaxed() accessors here and elsewhere in the driver?
This device itself doesn't do any DMA transactions, so it needs no memory
synchronization barriers. Only the consumers (i.e. rtkit and nvme) read from
and write to these buffers (multiple times), and they have the required
barriers in place whenever the buffers are used.

So far these buffers are only allocated at probe time though, so even using
the normal writel/readl here won't hurt performance at all. I can just use
those if you prefer, or alternatively add a comment explaining why _relaxed
is fine here.
This is a bit similar to the discussion for the pinctrl series last year [1].
>
>> +struct apple_sart *apple_sart_get(struct device *dev)
>> +{
>> + struct device_node *sart_node;
>> + struct platform_device *sart_pdev;
>> + struct apple_sart *sart;
>> +
>> + sart_node = of_parse_phandle(dev->of_node, "apple,sart", 0);
>> + if (!sart_node)
>> + return ERR_PTR(ENODEV);
>
> The error pointers need to take negative values, like 'ERR_PTR(-ENODEV)',
> here and everywhere else in the driver.
Ouch, that's my second bug of that kind in the past few days. I'll fix it
here and check the other patches in this series as well.
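
For anyone reading along who wonders why the sign matters, here is a small
userspace sketch. The three helpers are stand-ins modeled on the kernel's
<linux/err.h> (they are re-implemented here so the example compiles outside
the kernel), and toy_lookup() is a made-up function. It shows that a positive
errno is not recognized as an error pointer, so callers checking IS_ERR()
would treat it as a valid pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins modeled on the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Only the top MAX_ERRNO addresses are treated as encoded errors. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical failing lookup; 'buggy' selects the positive errno. */
static void *toy_lookup(bool buggy)
{
	return buggy ? ERR_PTR(ENODEV) : ERR_PTR(-ENODEV);
}

/* Usage: the positive value slips past IS_ERR(), the negative one doesn't. */
static void toy_lookup_demo(void)
{
	assert(!IS_ERR(toy_lookup(true)));  /* bug: looks like a valid pointer */
	assert(IS_ERR(toy_lookup(false)));  /* correctly detected as an error */
	assert(PTR_ERR(toy_lookup(false)) == -ENODEV);
}
```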
Thanks,
Sven
[1] https://lore.kernel.org/lkml/87sfz8zdzb.wl-maz@xxxxxxxxxx/