Re: [PATCH V2] xen/virtio: Handle PCI devices which Host controller is described in DT

From: Oleksandr Tyshchenko
Date: Wed Oct 19 2022 - 14:10:24 EST



On 19.10.22 03:34, Stefano Stabellini wrote:

Hello Stefano

> On Tue, 18 Oct 2022, Oleksandr Tyshchenko wrote:
>> On 18.10.22 03:33, Stefano Stabellini wrote:
>>> On Sat, 15 Oct 2022, Oleksandr Tyshchenko wrote:
>>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
>>>>
>>>> Use the same "xen-grant-dma" device concept for the PCI devices
>>>> behind device-tree based PCI Host controller, but with one modification.
>>>> Unlike for platform devices, we cannot use generic IOMMU bindings
>>>> (iommus property), as we need to support more flexible configuration.
>>>> The problem is that PCI devices under a single PCI Host controller
>>>> may have their backends running in different Xen domains and thus have
>>>> different endpoint IDs (backend domain IDs).
>>> Hi Oleksandr,
>>>
>>> From another email I understood that you successfully managed to
>>> describe in device tree all the individual virtio pci devices so that
>>> you can have iommu-map/iommu-map-mask properties under each virtio
>>> device node. Is that right?
>> No. Here [1] I mentioned that I had experimented with the PCI-IOMMU
>> bindings (iommu-map/iommu-map-mask properties), as the generic IOMMU
>> bindings (iommus property) are insufficient for us, and got it working.
>> Also, I provided a link to the current patch. Sorry if I was unclear.
>>
>> Just to be clear:
>>
>> We do not describe in device-tree all the individual virtio-pci devices
>> (and we do not have to), we only describe generic PCI host bridge node.
>> So we have only a *single* iommu-map property under that PCI host bridge
>> node.
>> The iommu-map property in turn describes the IOMMU connections for the
>> endpoints within that PCI Host bridge according to:
>> https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/pci-iommu.txt
>>
>> For instance, the following iommu-map property under that PCI host
>> bridge node describes the relationship between the IOMMU and two PCI
>> devices (0000:00:01.0 and 0000:00:02.0):
>> iommu-map = <0x08 0xfde9 0x01 0x08 0x10 0xfde9 0x02 0x08>;
>> For 0000:00:01.0 we pass the endpoint ID 1 (backend domid 1)
>> For 0000:00:02.0 we pass the endpoint ID 2 (backend domid 2)
>> Other PCI devices (e.g. 0000:00:03.0) are untranslated (they are not
>> required to use grants for virtio).
> That's great! I misunderstood. Actually I wonder if iommu-map might be
> suitable also for hotplug devices (as long as the backend domid is known
> beforehand). I think that should work?

At the moment I don't see any reason why not. I assume it would also
work.
For hotplug devices, the arch_setup_dma_ops() -> ... ->
xen_grant_setup_dma_ops() chain will also be called, just as it is
for *boot* devices, won't it?


> It should be possible to specify
> PCI device IDs even if those device IDs are not present yet?


I may be mistaken, but I think so. Nothing prevents us from doing so
when creating the PCI Host bridge node in the toolstack (or in Xen, in
the dom0less case).
I don't think we violate anything. We just describe the IOMMU mapping
scheme for PCI devices. If a PCI device with the specified RID appears
at some point in the future, the corresponding SID (backend domid) will
be assigned to it; if it never appears, nothing bad will happen.
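
To make the mapping scheme concrete, here is a minimal user-space
sketch in C applied to the iommu-map example quoted above. It is not
the kernel's of_map_id(), only an illustration of the lookup rule from
the PCI-IOMMU binding (sid = rid - rid-base + iommu-base for any RID
inside an entry's range). The names iommu_map_entry, bdf_to_rid() and
lookup_sid() are made up for this illustration, and the sketch ignores
iommu-map-mask.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One (rid-base, iommu, iommu-base, length) tuple from "iommu-map". */
struct iommu_map_entry {
	uint32_t rid_base;
	uint32_t iommu_phandle;
	uint32_t iommu_base;
	uint32_t length;
};

/* The example property quoted earlier in this thread. */
static const struct iommu_map_entry example_map[] = {
	{ 0x08, 0xfde9, 0x01, 0x08 },	/* 0000:00:01.0 -> backend domid 1 */
	{ 0x10, 0xfde9, 0x02, 0x08 },	/* 0000:00:02.0 -> backend domid 2 */
};

/* Requester ID: bus in bits 15:8, device in bits 7:3, function in bits 2:0. */
static uint32_t bdf_to_rid(uint8_t bus, uint8_t dev, uint8_t fn)
{
	return ((uint32_t)bus << 8) | ((uint32_t)(dev & 0x1f) << 3) |
	       (uint32_t)(fn & 0x07);
}

/*
 * Hypothetical helper: walk the table and, if an entry covers @rid,
 * store the translated ID in @sid and return true. Returns false when
 * no entry matches, i.e. the device would be left untranslated.
 */
static bool lookup_sid(const struct iommu_map_entry *map, size_t n,
		       uint32_t rid, uint32_t *sid)
{
	for (size_t i = 0; i < n; i++) {
		if (rid >= map[i].rid_base &&
		    rid - map[i].rid_base < map[i].length) {
			*sid = rid - map[i].rid_base + map[i].iommu_base;
			return true;
		}
	}
	return false;
}
```

With the example property, bdf_to_rid(0, 1, 0) is 0x08 and maps to
SID 1, bdf_to_rid(0, 2, 0) is 0x10 and maps to SID 2, while
0000:00:03.0 (RID 0x18) matches no entry. The lookup only depends on
the table and the RID, not on whether a device with that RID was
present when the table was written, which is why I would expect it to
behave the same for a device that shows up later.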

I think this is similar to the interrupt-map property, where we describe
the interrupt mapping scheme for PCI devices.


>
> If this works, it could be the best solution actually.


Thanks. I cannot say for sure that it will work, as we don't have
working hotplug at the moment, so it is not possible to verify, but
I don't see why that solution wouldn't work for us.


>
>
>>> If that is the case, then I would rather jump straight to that approach
>>> because I think it is far better than this one.
>> Please see above, I don't have any other approach except the one
>> implemented in the current patch.
>>
>> [1]
>> https://lore.kernel.org/xen-devel/16485bc9-0e2a-788a-93b8-453cc9ef0d3c@xxxxxxxx/
>>
>>
>>> Cheers,
>>>
>>> Stefano
>>>
>>>
>>>
>>>> So use generic PCI-IOMMU bindings instead (iommu-map/iommu-map-mask
>>>> properties), which allow us to describe the relationship between PCI
>>>> devices and backend domain IDs properly.
>>>>
>>>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
>>>> ---
>>>> Slightly RFC. This is needed to support Xen grant mappings for virtio-pci devices
>>>> on Arm at some point in the future. The Xen toolstack side is not completely ready yet.
>>>> Here, for PCI devices we use more flexible way to pass backend domid to the guest
>>>> than for platform devices.
>>>>
>>>> Changes V1 -> V2:
>>>> - update commit description
>>>> - rebase
>>>> - rework to use generic PCI-IOMMU bindings instead of generic IOMMU bindings
>>>>
>>>> Previous discussion is at:
>>>> https://lore.kernel.org/xen-devel/20221006174804.2003029-1-olekstysh@xxxxxxxxx/
>>>>
>>>> Based on:
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git/log/?h=for-linus-6.1
>>>> ---
>>>> drivers/xen/grant-dma-ops.c | 87 ++++++++++++++++++++++++++++++++-----
>>>> 1 file changed, 76 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c
>>>> index daa525df7bdc..b79d9d6ce154 100644
>>>> --- a/drivers/xen/grant-dma-ops.c
>>>> +++ b/drivers/xen/grant-dma-ops.c
>>>> @@ -10,6 +10,7 @@
>>>> #include <linux/module.h>
>>>> #include <linux/dma-map-ops.h>
>>>> #include <linux/of.h>
>>>> +#include <linux/pci.h>
>>>> #include <linux/pfn.h>
>>>> #include <linux/xarray.h>
>>>> #include <linux/virtio_anchor.h>
>>>> @@ -292,12 +293,55 @@ static const struct dma_map_ops xen_grant_dma_ops = {
>>>> .dma_supported = xen_grant_dma_supported,
>>>> };
>>>>
>>>> +static struct device_node *xen_dt_get_pci_host_node(struct device *dev)
>>>> +{
>>>> + struct pci_dev *pdev = to_pci_dev(dev);
>>>> + struct pci_bus *bus = pdev->bus;
>>>> +
>>>> + /* Walk up to the root bus to look for PCI Host controller */
>>>> + while (!pci_is_root_bus(bus))
>>>> + bus = bus->parent;
>>>> +
>>>> + return of_node_get(bus->bridge->parent->of_node);
>>>> +}
>>>> +
>>>> +static struct device_node *xen_dt_get_node(struct device *dev)
>>>> +{
>>>> + if (dev_is_pci(dev))
>>>> + return xen_dt_get_pci_host_node(dev);
>>>> +
>>>> + return of_node_get(dev->of_node);
>>>> +}
>>>> +
>>>> +static int xen_dt_map_id(struct device *dev, struct device_node **iommu_np,
>>>> + u32 *sid)
>>>> +{
>>>> + struct pci_dev *pdev = to_pci_dev(dev);
>>>> + u32 rid = PCI_DEVID(pdev->bus->number, pdev->devfn);
>>>> + struct device_node *host_np;
>>>> + int ret;
>>>> +
>>>> + host_np = xen_dt_get_pci_host_node(dev);
>>>> + if (!host_np)
>>>> + return -ENODEV;
>>>> +
>>>> + ret = of_map_id(host_np, rid, "iommu-map", "iommu-map-mask", iommu_np, sid);
>>>> + of_node_put(host_np);
>>>> +
>>>> + return ret;
>>>> +}
>>>> +
>>>> static bool xen_is_dt_grant_dma_device(struct device *dev)
>>>> {
>>>> - struct device_node *iommu_np;
>>>> + struct device_node *iommu_np = NULL;
>>>> bool has_iommu;
>>>>
>>>> - iommu_np = of_parse_phandle(dev->of_node, "iommus", 0);
>>>> + if (dev_is_pci(dev)) {
>>>> + if (xen_dt_map_id(dev, &iommu_np, NULL))
>>>> + return false;
>>>> + } else
>>>> + iommu_np = of_parse_phandle(dev->of_node, "iommus", 0);
>>>> +
>>>> has_iommu = iommu_np &&
>>>> of_device_is_compatible(iommu_np, "xen,grant-dma");
>>>> of_node_put(iommu_np);
>>>> @@ -307,9 +351,17 @@ static bool xen_is_dt_grant_dma_device(struct device *dev)
>>>>
>>>> bool xen_is_grant_dma_device(struct device *dev)
>>>> {
>>>> + struct device_node *np;
>>>> +
>>>> /* XXX Handle only DT devices for now */
>>>> - if (dev->of_node)
>>>> - return xen_is_dt_grant_dma_device(dev);
>>>> + np = xen_dt_get_node(dev);
>>>> + if (np) {
>>>> + bool ret;
>>>> +
>>>> + ret = xen_is_dt_grant_dma_device(dev);
>>>> + of_node_put(np);
>>>> + return ret;
>>>> + }
>>>>
>>>> return false;
>>>> }
>>>> @@ -325,12 +377,19 @@ bool xen_virtio_mem_acc(struct virtio_device *dev)
>>>> static int xen_dt_grant_init_backend_domid(struct device *dev,
>>>> struct xen_grant_dma_data *data)
>>>> {
>>>> - struct of_phandle_args iommu_spec;
>>>> + struct of_phandle_args iommu_spec = { .args_count = 1 };
>>>>
>>>> - if (of_parse_phandle_with_args(dev->of_node, "iommus", "#iommu-cells",
>>>> - 0, &iommu_spec)) {
>>>> - dev_err(dev, "Cannot parse iommus property\n");
>>>> - return -ESRCH;
>>>> + if (dev_is_pci(dev)) {
>>>> + if (xen_dt_map_id(dev, &iommu_spec.np, iommu_spec.args)) {
>>>> + dev_err(dev, "Cannot translate ID\n");
>>>> + return -ESRCH;
>>>> + }
>>>> + } else {
>>>> + if (of_parse_phandle_with_args(dev->of_node, "iommus", "#iommu-cells",
>>>> + 0, &iommu_spec)) {
>>>> + dev_err(dev, "Cannot parse iommus property\n");
>>>> + return -ESRCH;
>>>> + }
>>>> }
>>>>
>>>> if (!of_device_is_compatible(iommu_spec.np, "xen,grant-dma") ||
>>>> @@ -354,6 +413,7 @@ static int xen_dt_grant_init_backend_domid(struct device *dev,
>>>> void xen_grant_setup_dma_ops(struct device *dev)
>>>> {
>>>> struct xen_grant_dma_data *data;
>>>> + struct device_node *np;
>>>>
>>>> data = find_xen_grant_dma_data(dev);
>>>> if (data) {
>>>> @@ -365,8 +425,13 @@ void xen_grant_setup_dma_ops(struct device *dev)
>>>> if (!data)
>>>> goto err;
>>>>
>>>> - if (dev->of_node) {
>>>> - if (xen_dt_grant_init_backend_domid(dev, data))
>>>> + np = xen_dt_get_node(dev);
>>>> + if (np) {
>>>> + int ret;
>>>> +
>>>> + ret = xen_dt_grant_init_backend_domid(dev, data);
>>>> + of_node_put(np);
>>>> + if (ret)
>>>> goto err;
>>>> } else if (IS_ENABLED(CONFIG_XEN_VIRTIO_FORCE_GRANT)) {
>>>> dev_info(dev, "Using dom0 as backend\n");
>>>> --
>>>> 2.25.1
>>>>
>> --
>> Regards,
>>
>> Oleksandr Tyshchenko
>>
--
Regards,

Oleksandr Tyshchenko