Re: [PATCH RFC 0/2] Generate device tree node for pci devices

From: Sonal Santan
Date: Fri Oct 07 2022 - 18:45:32 EST


On 10/6/22 08:10, Rob Herring wrote:
On Fri, Sep 30, 2022 at 2:29 PM Sonal Santan <sonal.santan@xxxxxxx> wrote:

On 9/26/22 15:44, Rob Herring wrote:
On Fri, Sep 16, 2022 at 6:15 PM Frank Rowand <frowand.list@xxxxxxxxx> wrote:

On 8/29/22 16:43, Lizhi Hou wrote:
This patch series introduces OF overlay support for PCI devices which
primarily addresses two use cases. First, it provides a data driven method
to describe hardware peripherals that are present in a PCI endpoint and
hence can be accessed by the PCI host. An example device is Xilinx/AMD
Alveo PCIe accelerators. Second, it allows reuse of an OF compatible
driver -- often used in SoC platforms -- in a PCI host based system. An
example device is Microchip LAN9662 Ethernet Controller.

This patch series consolidates previous efforts to define such an
infrastructure:
https://lore.kernel.org/lkml/20220305052304.726050-1-lizhi.hou@xxxxxxxxxx/
https://lore.kernel.org/lkml/20220427094502.456111-1-clement.leger@xxxxxxxxxxx/

Normally, the PCI core discovers PCI devices and their BARs using the
PCI enumeration process. However, the process does not provide a way to
discover the hardware peripherals that are present in a PCI device, and
which can be accessed through the PCI BARs. Also, the enumeration process
does not provide a way to associate MSI-X vectors of a PCI device with the
hardware peripherals that are present in the device. PCI device drivers
often use header files to describe the hardware peripherals and their
resources as there is no standard data driven way to do so. This patch
series proposes to use flattened device tree blob to describe the
peripherals in a data driven way.

Based on previous discussion, using
device tree overlay is the best way to unflatten the blob and populate
platform devices.

I still do not agree with this statement. The device tree overlay
implementation is very incomplete and should not be used until it
becomes more complete. No need to debate this right now, but I don't want
to let this go unchallenged.

Then we should remove overlay support. The only way it becomes more
complete is having actual users.

But really, whether this is the right solution to the problem is
independent of the state of kernel overlay support.

If there is no base system device tree on an ACPI based system, then I
am not convinced that a mixed ACPI / device tree implementation is
good architecture.

Most/all of this series is needed for a DT system in which the PCI
devices are not populated in the DT.

I might be more supportive of using a device tree
description of a PCI device in a detached device tree (not linked to
the system device tree, but instead freestanding). Unfortunately the
device tree functions assume a single system devicetree, with no concept
of a freestanding tree (eg, if a NULL device tree node is provided to
a function or macro, it often defaults to the root of the system device
tree). I need to go look at whether the flag OF_DETACHED handles this,
or if it could be leveraged to do so.

Instead of worrying about a theoretical problem, we should see if
there is an actual problem for a user.

I'm not so worried about DT functions themselves, but places which
have 'if ACPI ... else (DT) ...' paths.


Bringing this thread back into focus. Any thoughts on how to move forward?

Reviewers raise concerns/issues and the submitters work to address
them or explain why they aren't an issue. The submitter has to push
things forward. That's how the process works.

We are working on updating the patch set to address the feedback. The design is still based on device tree overlay infrastructure.

As I noted, much of this is needed on a DT system with PCI device not
described in DT. So you could split out any ACPI system support to
avoid that concern for example. Enabling others to exercise these
patches may help too. Perhaps use QEMU to create some imaginary
device.

To verify this patch set, in addition to an x86_64/ACPI based system, we also have an AARCH64/DT QEMU setup where we have attached a physical Alveo device. We haven't run into any ACPI or DTO issues so far.

Perhaps we can introduce this feature in a phased manner where we first enable DT based platforms and then enable ACPI based platforms?

-Sonal

Rob