On Mon, Feb 06, 2023 at 02:27:41PM +0300, Evgenii Shatokhin wrote:
Hi Sergey,
First of all, thank you for the detailed explanation. It is clearer now what
is going on and why it is that way.
On 04.02.2023 01:12, Serge Semin wrote:
Hi Evgenii
On Wed, Feb 01, 2023 at 04:54:55PM +0300, Evgenii Shatokhin wrote:
On 31.01.2023 15:42, Robin Murphy wrote:
On 2023-01-31 12:29, Evgenii Shatokhin wrote:
Hi,
On 26.08.2022 02:54, Will McVicker wrote:
Hi All,
I've updated patch 2/2 to address Robin's suggestions. This includes:
* Dropping the while-loop for retrying with a 64-bit mask in favor of
retrying within the error if-statement.
* Using an int for the DMA mask instead of a bool and ternary
operation.
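
The control flow described in the two points above can be sketched in plain userspace C. This is only an illustration, not the patch itself: `mock_ram`, `mock_alloc()` and `alloc_msi_target()` are hypothetical names, and `mock_alloc()` merely stands in for a coherent DMA allocation that fails when no RAM is reachable under the given mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same expression the kernel uses for DMA_BIT_MASK(). */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical description of where the platform's RAM lives. */
struct mock_ram {
	uint64_t start;
	uint64_t end;
};

/*
 * Hypothetical stand-in for a coherent DMA allocation: succeeds only
 * if the lowest RAM address is reachable under the given mask.
 */
static bool mock_alloc(const struct mock_ram *ram, uint64_t mask,
		       uint64_t *addr)
{
	assert(ram->start <= ram->end);
	if (ram->start > mask)
		return false;
	*addr = ram->start;
	return true;
}

/*
 * Try a 32-bit mask first. Retry with 64 bits inside the error branch
 * (no loop), and only if the hardware can take a 64-bit MSI target.
 * Returns the mask width that worked, or 0 on failure.
 */
static int alloc_msi_target(const struct mock_ram *ram, bool hw_msi_64bit,
			    uint64_t *addr)
{
	int bits = 32;	/* an int for the mask width, not a bool + ternary */

	if (!mock_alloc(ram, DMA_BIT_MASK(bits), addr)) {
		if (!hw_msi_64bit)
			return 0;
		bits = 64;
		if (!mock_alloc(ram, DMA_BIT_MASK(bits), addr))
			return 0;
	}
	return bits;
}
```

With RAM located only above 4GB, the first attempt fails and the result depends entirely on `hw_msi_64bit`; with RAM below 4GB the 64-bit path is never taken.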
Thanks again for the reviews and sorry for the extra revision today!
Hopefully this is the last one :) If not, I'd be fine to submit patch 1/2
without 2/2 to avoid resending patch 1/2 for future revisions of patch 2/2
(unless I don't need to do that anyway).
The first patch of the series made it into the mainline kernel, but, it
seems, the second one ("PCI: dwc: Add support for 64-bit MSI target
address") did not. As of 6.2-rc6, it is still missing.
Was it intentionally dropped because of some issues or, perhaps, just by
accident? If it was by accident, could you please queue it for inclusion
into mainline again?
Yes, it was dropped due to the PCI_MSI_FLAGS_64BIT usage apparently
being incorrect, and some other open debate (which all happened on the
v5 thread):
https://lore.kernel.org/linux-pci/YzVTmy9MWh+AjshC@lpieralisi/
I see. If I understand it correctly, the problem was that
PCI_MSI_FLAGS_64BIT flag did not guarantee that 64-bit mask could be used
for that particular allocation. Right?
William was trying to use the flag for a purely software purpose. Setting
PCI_MSI_FLAGS_64BIT didn't actually change the hardware behavior.
He could just as well have provided a driver-private capability
flag. (see below for a more detailed problem description)
The DMA mask issues have now been sorted out,
I suppose, you mean https://lore.kernel.org/all/20230113171409.30470-26-Sergey.Semin@xxxxxxxxxxxxxxxxxxxx/?
Well, the way the DMA-mask issue has been solved was a bit
hacky. I wouldn't call it a fully proper solution. The problem with
pointlessly allocating physical memory for the iMSI-RX engine (it
doesn't perform any DMA) and artificially restricting the coherent-DMA
mask is still there. The patch in the subject was a compromise in
order to at least permit unrestricted streaming DMAs but limiting the
coherent DMAs for the MSI setup to work properly for all peripheral
devices.
It still breaks our particular case when the SoC has no 32-bit-addressable
RAM. We'd set DMA masks to DMA_BIT_MASK(36) in the platform-specific driver
before calling dw_pcie_host_init(). However, dw_pcie_msi_host_init() resets
it to 32-bit, tries dmam_alloc_coherent() and fails.
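
A small userspace check illustrates why the reset to 32 bits is fatal on such a platform. It is a sketch under assumptions: `range_fits_mask()` is a hypothetical helper, and the test values take the inbound window printed in the log quoted later in this thread as the RAM range. Only the `DMA_BIT_MASK()` expression matches the kernel's definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same expression the kernel uses for DMA_BIT_MASK(). */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * Hypothetical helper: true if every address in [start, end] can be
 * expressed under the given DMA mask.
 */
static bool range_fits_mask(uint64_t start, uint64_t end, uint64_t mask)
{
	assert(start <= end);
	return end <= mask;
}
```

For the range 0x0400000000..0x07ffffffff the check passes with DMA_BIT_MASK(36) and fails with DMA_BIT_MASK(32), which is the situation described above.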
Yeah. That's another problem with the implemented approach. But are
you sure the driver had worked even before this patch? AFAICS the
driver allocated the MSI-targeted page from DMA32 zone before this
modification. So the allocation must have failed on your platform too.
You are right. I did not notice earlier that the kernel based on 6.0-stable
we used before did actually contain our SoC-specific workaround for this.
Without that custom patch, initialization of PCIe host does not work. So,
yes, the problem was present earlier too.
With 36-bit masks, the kernel seems to play well with the devices in our
case.
I saw your comment in https://lore.kernel.org/linux-pci/4dc31a63-00b1-f379-c5ac-7dc9425937f4@xxxxxxx/
that drivers should always explicitly set their masks.
Is it a really bad idea to check the current coherent mask's bits in
dw_pcie_msi_host_init() and if it is more than 32 - just issue a warning
rather than reset it to 32-bit unconditionally? That would help in our case.
Or, perhaps, there is a better workaround.
The problem isn't in the value the mask is set to. The problem is
two-leveled, but is mainly connected with the PCIe devices detected on
the PCIe bus. Some of them can't send MSI TLPs to 64-bit
addresses. Since we can't predict beforehand whether such devices exist
on the bus (the LLDD probe is performed together with the
MSI-engine initialization), the solution was to just restrict the MSI
base address to be allocated within the lowest 4GB. Moreover, as I said
above, the iMSI-RX engine doesn't actually cause any DMA, thus there is
no need for any memory allocation. Instead, reserving some PCIe-bus
space/DWORD for MSIs would be enough. Alas, the PCIe subsystem doesn't
provide a way to do so. That's why we have what you see in the driver:
DMA mask restriction and coherent DMA memory allocation.
So, if I understand you correctly, what is needed here is a small area of
PCIe address space accessible to any of the connected PCIe devices. As the
kernel does not know in advance, which restrictions the devices have, it
tries to allocate 32-bit-addressable memory, suitable for DMA. This way, it
would be OK for any attached PCIe device. Right?
Right. The restriction is the 64-bit MSI capability. If any PCIe
peripheral device has no PCI_MSI_64_BIT_ADDR_CAP flag set and the MSI
target address is selected from the space above 4GB, then MSIs from
such a device won't be handled.
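
The restriction can be expressed as a simple predicate over each device's MSI Message Control word. This is a sketch, not driver code: both function names are hypothetical, and it tests the generic PCI_MSI_FLAGS_64BIT bit (0x0080 in Linux's pci_regs.h) as the per-device 64-bit address capability.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit address capable bit of MSI Message Control (pci_regs.h value). */
#define PCI_MSI_FLAGS_64BIT 0x0080

/* Hypothetical helper: can a device with these flags write to target? */
static bool msi_target_usable(uint16_t msi_flags, uint64_t target)
{
	if (msi_flags & PCI_MSI_FLAGS_64BIT)
		return true;			/* any address is fine */
	return target <= 0xffffffffULL;		/* 32-bit-only device */
}

/*
 * Hypothetical helper: the target is safe only if every device on the
 * bus can reach it. The host driver cannot run this check up front
 * because the devices are not enumerated yet when the MSI target is set.
 */
static bool msi_target_usable_by_all(const uint16_t *flags, size_t n,
				     uint64_t target)
{
	assert(flags != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (!msi_target_usable(flags[i], target))
			return false;
	return true;
}
```

A single 32-bit-only device on the bus is enough to rule out a target above 4GB, which is why the driver conservatively allocates below 4GB.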
Note, as I said above, no DMA is actually performed if at least one MSI
vector is enabled. The driver just needs a DWORD within the PCIe bus
space for the MSI MWr TLPs target address and EP+vector data. The MSI
TLP data is decoded by the iMSI-RX engine in order to set the
corresponding flag in the MSI IRQ status register. Such TLPs won't be
passed to the master AXI-bus.
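
A toy model of that decoding step may make the point clearer. Everything here is an assumption made for illustration: the `imsi_rx` structure, the function name, the layout of 8 status registers with 32 vectors each, and the choice to silently drop out-of-range vector numbers. Real hardware behavior may differ in the details.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MSI_CTRLS		8	/* assumed number of status registers */
#define MAX_MSI_IRQS_PER_CTRL	32	/* assumed vectors per register */

/* Hypothetical model of the iMSI-RX engine state. */
struct imsi_rx {
	uint64_t target;			/* programmed MSI address */
	uint32_t status[MAX_MSI_CTRLS];		/* MSI IRQ status bits */
};

/*
 * Model an inbound memory write. Returns true if the engine consumed
 * it as an MSI (nothing reaches memory), false if it is an ordinary
 * write that would be forwarded to the AXI master interface.
 */
static bool imsi_rx_mwr(struct imsi_rx *rx, uint64_t addr, uint32_t data)
{
	assert(rx != NULL);
	if (addr != rx->target)
		return false;

	/* Assumption: out-of-range vector numbers are dropped. */
	if (data < MAX_MSI_CTRLS * MAX_MSI_IRQS_PER_CTRL)
		rx->status[data / MAX_MSI_IRQS_PER_CTRL] |=
			1u << (data % MAX_MSI_IRQS_PER_CTRL);
	return true;
}
```

In this model the target address is only ever compared against, never dereferenced, which is why no RAM needs to exist behind it.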
If only we had a way to auto-detect the PCIe-bus space with no
physical memory behind it and take out a DWORD from it to initialize
the iMSI-RX engine, we could have immediately got rid of the mask
setting operation and the memory allocation. It would have solved your
problem too.
Yes, it would solve our issue too. I do not know, however, if a generic
solution is possible here, but I am no expert in PCIe.
Currently the kernel PCIe subsystem doesn't provide a way to reserve a
range within the PCIe bus memory that has no physical RAM behind it and
is left unused during the BARs resource initialization. Implementing such
functionality (perhaps in the framework of the P2P module or based on
it) would give the generic solution.
For now, we are probably better off with SoC-specific patches, when we know
which PCIe devices can possibly be used and what their restrictions are.
Since you know that there is no RAM below 4GB and you have
matching CPU and PCIe address spaces, then you can just take any
address below 4GB and use it to initialize the MSI-target address
(dw_pcie_rp.msi_data). But make sure that the peripheral PCIe-devices
don't use it for something application-specific (like accessing CPU
MMIO devices mapped to that base address). That seems like the most
universal solution for your case.
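
The sanity check implied by that caveat can be sketched as follows. It is an illustration only: `addr_range` and `msi_addr_is_free()` are hypothetical, and the list of busy ranges (MMIO or PCIe windows below 4GB) would have to come from platform knowledge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive address range that peripherals may target. */
struct addr_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Hypothetical check: the MSI DWORD must be 4-byte aligned, lie fully
 * below 4GB, and not overlap any range devices may legitimately access.
 */
static bool msi_addr_is_free(uint64_t addr, const struct addr_range *busy,
			     size_t n)
{
	assert(busy != NULL || n == 0);
	if ((addr & 3) || addr > 0xffffffffULL - 3)
		return false;
	for (size_t i = 0; i < n; i++)
		if (addr <= busy[i].end && addr + 3 >= busy[i].start)
			return false;
	return true;
}
```

An address that passes such a check could then be used as the MSI target in an SoC-specific patch.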
-Serge(y)
Looking forward to your comments.
so you, or Will, or anyone
else interested should be free to rework this on top of linux-next
(although at this point, more realistically on top of 6.3-rc1 in a few
weeks).
Thanks,
Robin.
Support for 64-bit MSI target addresses is needed for some of our SoCs.
I ran into a situation when there was no available RAM in ZONE_DMA32
during initialization of PCIe host. Hence, dmam_alloc_coherent() failed
in dw_pcie_msi_host_init() and initialization failed with -ENOMEM:
[ 0.374834] dw-pcie 4000000.pcie0: host bridge /soc/pcie0@4000000 ranges:
[ 0.375813] dw-pcie 4000000.pcie0: MEM 0x0041000000..0x004fffffff -> 0x0041000000
[ 0.376171] dw-pcie 4000000.pcie0: IB MEM 0x0400000000..0x07ffffffff -> 0x0400000000
[ 0.377914] dw-pcie 4000000.pcie0: Failed to alloc and map MSI data
[ 0.378191] dw-pcie 4000000.pcie0: Failed to initialize host
[ 0.378255] dw-pcie: probe of 4000000.pcie0 failed with error -12
Mainline kernel 6.2-rc6 was used in that test.
The hardware supports 64-bit target addresses, so the patch "PCI: dwc:
Add support for 64-bit MSI target address" should help with this
particular failure.
Thanks,
Will
Will McVicker (2):
PCI: dwc: Drop dependency on ZONE_DMA32
v6:
* Retrying DMA allocation with 64-bit mask within the error
if-statement.
* Use an int for the DMA mask instead of a bool and ternary operation.
v5:
* Updated patch 2/2 to first try with a 32-bit DMA mask. On failure,
retry with a 64-bit mask if supported.
v4:
* Updated commit descriptions.
* Renamed msi_64b -> msi_64bit.
* Dropped msi_64bit ternary use.
* Dropped export of dw_pcie_msi_capabilities.
v3:
* Switched to a managed DMA allocation.
* Simplified the DMA allocation cleanup.
* Dropped msi_page from struct dw_pcie_rp.
* Allocating a u64 instead of a full page.
v2:
* Fixed build error caught by kernel test robot
* Fixed error handling reported by Isaac Manjarres
PCI: dwc: Add support for 64-bit MSI target address
.../pci/controller/dwc/pcie-designware-host.c | 43 +++++++++----------
drivers/pci/controller/dwc/pcie-designware.c | 8 ++++
drivers/pci/controller/dwc/pcie-designware.h | 2 +-
3 files changed, 30 insertions(+), 23 deletions(-)
base-commit: 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
Thank you in advance.
Regards,
Evgenii