Re: [PATCH v2 1/2] PCI/ATS: Add a helper function to configure ATS STU of a PF

From: Ganapatrao Kulkarni
Date: Wed Mar 15 2023 - 10:30:59 EST




On 14-03-2023 11:42 pm, Sathyanarayanan Kuppuswamy wrote:


On 3/14/23 10:10 AM, Bjorn Helgaas wrote:
On Tue, Mar 14, 2023 at 09:50:06AM -0700, Sathyanarayanan Kuppuswamy wrote:
On 3/14/23 9:02 AM, Bjorn Helgaas wrote:
On Tue, Mar 14, 2023 at 08:06:07PM +0530, Ganapatrao Kulkarni wrote:
On 14-03-2023 06:22 pm, Sathyanarayanan Kuppuswamy wrote:
On 3/14/23 3:08 AM, Ganapatrao Kulkarni wrote:
On 14-03-2023 04:00 am, Sathyanarayanan Kuppuswamy wrote:
On 3/13/23 2:12 PM, Bjorn Helgaas wrote:
On Mon, Feb 27, 2023 at 08:21:36PM -0800, Ganapatrao Kulkarni wrote:
As per PCI specification (PCI Express Base Specification
Revision 6.0, Section 10.5) both PF and VFs of a PCI EP
are permitted to be enabled independently for ATS
capability, however the STU(Smallest Translation Unit) is
shared between PF and VFs. For VFs, it is hardwired to
Zero and the associated PF's value applies to VFs.

In the current code, the STU is being configured while
enabling the PF ATS. Hence, it is not able to enable ATS
for VFs, if it is not enabled on the associated PF
already.

Adding a function pci_ats_stu_configure(), which can be
called to configure the STU during PF enumeration. Latter
enumerations of VFs can successfully enable ATS
independently.

@@ -46,6 +46,35 @@ bool pci_ats_supported(struct pci_dev *dev)
  }
  EXPORT_SYMBOL_GPL(pci_ats_supported);
  +/**
+ * pci_ats_stu_configure - Configure STU of a PF.
+ * @dev: the PCI device
+ * @ps: the IOMMU page shift
+ *
+ * Returns 0 on success, or negative on failure.
+ */
+int pci_ats_stu_configure(struct pci_dev *dev, int ps)
+{
+    u16 ctrl;
+
+    if (dev->ats_enabled || dev->is_virtfn)
+        return 0;

I might return an error for the VF case on the assumption
that it's likely an error in the caller.  I guess one could
argue that it simplifies the caller if it doesn't have to
check for PF vs VF.  But the fact that STU is shared between
PF and VFs is an important part of understanding how ATS
works, so the caller should be aware of the distinction
anyway.

I have already asked this question. But let me repeat it.

We don't have any checks for the PF case here. That means you
can re-configure the STU as many times as you want until ATS
is enabled in PF. So, if there are active VFs which uses this
STU, can PF re-configure the STU at will?

IMO, Since STU is shared, programming it multiple times is not expected from callers code do it, however we can add below check to allow to program STU once from a PF.

diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index 1611bfa1d5da..f7bb01068e18 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -60,6 +60,10 @@ int pci_ats_stu_configure(struct pci_dev *dev, int ps)
        if (dev->ats_enabled || dev->is_virtfn)
                return 0;

+       /* Configured already */
+       if (dev->ats_stu)
+               return 0;

Theoretically, you can re-configure STU as long as no one is using
it. Instead of this check, is there a way to check whether there
are active VMs which enables ATS?

Yes I agree, there is no limitation on how many times you write STU
bits, but practically it is happening while PF is enumerated.

The usage of function pci_ats_stu_configure is almost
similar(subset) to pci_enable_ats and only difference is one does
ATS enable + STU program and another does only STU program.

What would you think of removing the STU update feature from
pci_enable_ats() so it always fails if pci_ats_stu_configure() has not
been called, even when called on the PF, e.g.,

if (ps != pci_physfn(dev)->ats_stu)
return -EINVAL;

If we are removing the STU update from pci_enable_ats(), why
even allow passing "ps (page shift)" parameter? IMO, we can assume that
for STU reconfigure, users will call pci_ats_stu_configure().

The reason to pass "ps" would be to verify that the STU the caller
plans to use matches the actual STU.

Do we really need to verify it? My thinking is, by introducing
pci_ats_stu_configure() we are already trying to decouple the STU config
from pci_enable_ats(). So why again check for it when enabling ATS?

AFAIK, we are not decoupling STU from pci_enable_ats, there is a case where ATS is not enabled on PF (host kernel booted in passthrough) where as ATS gets enabled on VF while it is attached to a VM(through vfio driver). In such case, during PF enumeration, pci_ats_stu_configure is called to program STU and allow ATS to be enabled on VFs.

In normal case, PF enables ATS and STU is programmed and no issue while enabling the VFs.



Since zero is a valid STU, enabling ATS can be decoupled from STU
update.

pci_read_config_word(dev, dev->ats_cap + PCI_ATS_CTRL, &ctrl);
ctrl |= PCI_ATS_CTRL_ENABLE;
pci_write_config_word(dev, dev->ats_cap + PCI_ATS_CTRL, ctrl);

Would probably also have to set "dev->ats_stu = 0" in
pci_disable_ats() to allow the possibility of calling
pci_ats_stu_configure() again.

IMO, I dont think, there is any need to find how many active VMs
with attached VFs and it is not done for pci_enable_ats as well.

Enabling or disabling ATS in a PF or VF has no effect on other
functions.

But changing STU while a VF has ATS enabled would definitely break any
user of that VF, so if it's practical to verify that no VFs have ATS
enabled, I think we should.

I also think it is better to check for a ats_enabled status of VF before
configuring the STU.

May be something like below (untested),

static int is_ats_enabled_in_vf(struct pci_dev *dev)
{
struct pci_sriov *iov = dev->sriov;
struct pci_dev *vdev;

if (dev->is_virtfn)
return -EINVAL;

for (i = 0; i < pci_sriov_get_totalvfs(pdev); i++) {
vdev = pci_get_domain_bus_and_slot(pci_domain_nr(dev->bus),
pci_iov_virtfn_bus(dev, i),
pci_iov_virtfn_devfn(dev, i));

I would try hard to avoid pci_get_domain_bus_and_slot(). That's
expensive (searches *all* PCI devs with for_each_pci_dev()) and
requires dealing with reference counts.

Maybe an atomic count in the PF of VFs with ATS enabled.

if (vdev && vdev->ats_enabled)
return 1;
}

return 0;

}

int pci_ats_stu_configure(struct pci_dev *dev, int ps)
{
...
if (is_ats_enabled_in_vf(dev))
return -EBUSY;


Also the caller has the requirement to call either
pci_ats_stu_configure or pci_enable_ats while enumerating the PF.

        if (!pci_ats_supported(dev))
                return -EINVAL;

+
+    if (!pci_ats_supported(dev))
+        return -EINVAL;
+
+    if (ps < PCI_ATS_MIN_STU)
+        return -EINVAL;
+
+    dev->ats_stu = ps;
+    pci_read_config_word(dev, dev->ats_cap + PCI_ATS_CTRL, &ctrl);
+    ctrl |= PCI_ATS_CTRL_STU(dev->ats_stu - PCI_ATS_MIN_STU);
+    pci_write_config_word(dev, dev->ats_cap + PCI_ATS_CTRL, ctrl);
+
+    return 0;
+}
+EXPORT_SYMBOL_GPL(pci_ats_stu_configure);
+
  /**
   * pci_enable_ats - enable the ATS capability
   * @dev: the PCI device
@@ -68,8 +97,8 @@ int pci_enable_ats(struct pci_dev *dev, int ps)
          return -EINVAL;
        /*
-     * Note that enabling ATS on a VF fails unless it's already enabled
-     * with the same STU on the PF.
+     * Note that enabling ATS on a VF fails unless it's already
+     * configured with the same STU on the PF.
       */
      ctrl = PCI_ATS_CTRL_ENABLE;
      if (dev->is_virtfn) {
diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
index df54cd5b15db..7d62a92aaf23 100644
--- a/include/linux/pci-ats.h
+++ b/include/linux/pci-ats.h
@@ -8,6 +8,7 @@
  /* Address Translation Service */
  bool pci_ats_supported(struct pci_dev *dev);
  int pci_enable_ats(struct pci_dev *dev, int ps);
+int pci_ats_stu_configure(struct pci_dev *dev, int ps);
  void pci_disable_ats(struct pci_dev *dev);
  int pci_ats_queue_depth(struct pci_dev *dev);
  int pci_ats_page_aligned(struct pci_dev *dev);
@@ -16,6 +17,8 @@ static inline bool pci_ats_supported(struct pci_dev *d)
  { return false; }
  static inline int pci_enable_ats(struct pci_dev *d, int ps)
  { return -ENODEV; }
+static inline int pci_ats_stu_configure(struct pci_dev *d, int ps)
+{ return -ENODEV; }
  static inline void pci_disable_ats(struct pci_dev *d) { }
  static inline int pci_ats_queue_depth(struct pci_dev *d)
  { return -ENODEV; }

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


Thanks,
Ganapat