[PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver

From: Rick Wertenbroek
Date: Tue Feb 14 2023 - 09:10:46 EST


This is a series of patches that fixes the PCIe endpoint controller driver
for the Rockchip RK3399 SoC. The driver was introduced in
cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
The original driver had issues and would not allow for the RK3399 to
operate in PCIe endpoint mode correctly. This patch series fixes that so
that the PCIe core controller of the RK3399 SoC can now act as a PCIe
endpoint. This is v2 of the patch series and addresses the concerns that
were raised during the review of the first version.

Thank you in advance for reviewing these changes and hopefully
getting this merged. Having a functional PCIe endpoint controller
driver for the RK3399 would allow to develop further PCIe endpoint
functions through the Linux PCIe endpoint framework using this SoC.

Problem: The Rockchip RK3399 PCIe endpoint controller driver introduced in
cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
did not work.

Summary of problems with the driver :

* Missing dtsi entry
* Could not update Device ID (DID)
* The endpoint could not be configured by a host computer because the
endpoint kept sending Configuration Request Retry Status (CRS) messages
* The kernel would sometimes hang on probe due to access to registers in
a clock domain of which the PLLs were not locked
* The memory window mapping and address translation mechanism had
conflicting mappings and did not follow the technical reference manual
as to how the address translation should be done
* Legacy IRQs were not generated by the endpoint
* Message Signaled interrupts (MSI) were not generated by the endpoint

The problems have been addressed and validated through tests (see below).

Summary of changes :

This patch series is composed of 9 patches that do the following :
* Remove writes to unused registers in the PCIe core register space.
The registers that were written to is marked "unused" and read
only in the technical reference manual of the RK3399 SoC.
* Write PCI Device ID (DID) to correct register, the DID was written to
a read only register and therefore would not update the DID.
* Assert PCI Configuration Enable bit after probe so that it would stop
sending Configuration Request Retry Status (CRS) messages to the
host once configured, without this the host would retry until
timeout and cancel the PCI configuration.
* Add poll and timeout to wait for PHY PLLs to be locked, this
is the only patch that also applies to the root complex function
of the PCIe core controller, without this the kernel would
sometimes access registers in the PHY PLL clock domain when the PLLs
were not yet locked and the system would hang. This was hackily solved
in other non mainline patches (e.g., in armbian) with a "msleep()"
that was added after PHY PLL configuration but without realizing
why it was needed. A poll with timeout seems like a sane approach.
* Add dtsi entry for RK3399 PCIe endpoint core. The new entry is
in "disabled" status by default, so unless it is explicitly enabled
it will not conflict with the PCIe root complex controller entry.
Developers that will enable it would know that the root complex function
then must be disabled, this can be done in the board level DTS.
* Fix window mapping and address translation for endpoint. The window
mapping and address translation did not follow the technical reference
manual and a single memory region was used which resulted in conflicting
address translations for memory allocated in that region. The current
patch allows to allocate up to 32 memory windows with 1MB pages.
* Fix legacy IRQ generation for RK3399 PCIe endpoint core, the legacy IRQs
were not sent by the device because their generation did not follow the
instructions in the technical reference manual. They now work.
* Use u32 variable to access 32-bit registers, u16 variables were used to
access and manipulate data of 32-bit registers, this would lead to
overflows e.g., when left shifting more than 16 bits.
* Add parameter check for RK3399 PCIe endpoint core set_msi(), return
-EINVAL when incompatible parameters are passed.

Validation on real hardware:

This patch series has been tested with kernel 6.0.19 (and 5.19)
on real hardware, a FriendlyElec NanoPC-T4 RK3399 based single computer
board connected to a host computer through PCIe x1 and x4. The PCIe
endpoint test function driver was loaded on the SoC and the PCIe endpoint
test driver was loaded on the host computer. The following tests were
executed through this setup :

* enumeration of the PCIe endpoint device (lspci)
lspci -vvv
* validation of PCI header and capabilities
setpci and lspci -xxxx
* device was recognized by host computer dans PCIe endpoint test driver
was loaded
lspci -v states "Kernel modules: pci_endpoint_test"
* tested the BARs 0-5
sudo /usr/bin/pcitest -b 0
...
sudo /usr/bin/pcitest -b 5
* tested legacy interrupt through the test driver
sudo /usr/bin/pcitest -i 0
sudo /usr/bin/pcitest -l
* tested MSI interrupt through the test driver
sudo /usr/bin/pcitest -i 1
sudo /usr/bin/pcitest -m 1
* tested read/write to and from host through the test driver with checksum
sudo /usr/bin/pcitest -r -s 1024
sudo /usr/bin/pcitest -w -s 1024
* tested read/write with DMA enabled (all read/write tests also did IRQ)
sudo /usr/bin/pcitest -r -d -s 8192
sudo /usr/bin/pcitest -w -d -s 8192

Commands used on the SoC to launch the endpoint function (configfs) :

modprobe -i pci-epf-test
mkdir -p /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0
echo 0xb500 > /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0/deviceid
echo 0x104c > /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0/vendorid
echo 16 > /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0/msi_interrupts
ln -s /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0 \
/sys/kernel/config/pci_ep/controllers/fd000000.pcie-ep/
echo 1 > /sys/kernel/config/pci_ep/controllers/fd000000.pcie-ep/start

Note: to enable the endpoint controller on the board the file :
arch/arm64/boot/dts/rockchip/rk3399-nanopc-t4.dts
Was edited to set the status of &pcie0 to "disabled" and &pcie0_ep
to "okay". This is not submitted as a patch because most users
will use the PCIe core controller in host (root complex) mode
rather than endpoint mode.

I have tested and confirmed all basic functionality required for the
endpoint with the test driver and tools. With the previous state of
the driver the device would not even be enumerated by the host
computer (mainly because of CRS messages being sent back to the root
complex) and tests would not pass (driver would not even be loaded
because DID was not set correctly) and then only the BAR test would
pass. Now all tests pass as stated above.

Best regards
Rick

Rick Wertenbroek (9):
PCI: rockchip: Remove writes to unused registers
PCI: rockchip: Write PCI Device ID to correct register
PCI: rockchip: Assert PCI Configuration Enable bit after probe
PCI: rockchip: Add poll and timeout to wait for PHY PLLs to be locked
arm64: dts: rockchip: Add dtsi entry for RK3399 PCIe endpoint core
PCI: rockchip: Fix window mapping and address translation for endpoint
PCI: rockchip: Fix legacy IRQ generation for RK3399 PCIe endpoint core
PCI: rockchip: Use u32 variable to access 32-bit registers
PCI: rockchip: Add parameter check for RK3399 PCIe endpoint core
set_msi()

arch/arm64/boot/dts/rockchip/rk3399.dtsi | 23 ++++
drivers/pci/controller/pcie-rockchip-ep.c | 143 ++++++++++------------
drivers/pci/controller/pcie-rockchip.c | 16 +++
drivers/pci/controller/pcie-rockchip.h | 36 ++++--
4 files changed, 128 insertions(+), 90 deletions(-)

--
2.25.1