Re: [PATCH v2 0/6] iommu/exynos: Convert to a module

From: Marek Szyprowski
Date: Mon Mar 13 2023 - 11:52:40 EST


Hi Saravana,

I'm sorry for the delay again, but finally I managed to get back to this
topic.

On 07.02.2023 04:32, Saravana Kannan wrote:
> On Fri, Nov 11, 2022 at 5:30 AM Sam Protsenko
> <semen.protsenko@xxxxxxxxxx> wrote:
>> On Thu, 10 Nov 2022 at 15:36, Marek Szyprowski <m.szyprowski@xxxxxxxxxxx> wrote:
>>
>> [snip]
> Hi Marek and Sam,
>
> I'm replying to both of your comments in this email.
>
>>> I've finally made Exynos IOMMU working as a module on Exynos5433 based
>>> TM2e board. It looks that this will be a bit longer journey that I've
>>> initially thought. I've posted a simple update of the fix for the driver
>>> initialization sequence, but the real problem is in the platform driver
>>> framework and OF helpers.
>>>
>>> Basically to get it working as a module I had to apply the following
>>> changes:
>>>
>>> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
>>> index 3dda62503102..f6921f5fcab6 100644
>>> --- a/drivers/base/dd.c
>>> +++ b/drivers/base/dd.c
>>> @@ -257,7 +257,7 @@ static int deferred_devs_show(struct seq_file *s,
>>> void *data)
>>> DEFINE_SHOW_ATTRIBUTE(deferred_devs);
>>>
>>> #ifdef CONFIG_MODULES
>>> -int driver_deferred_probe_timeout = 10;
>>> +int driver_deferred_probe_timeout = 30;
>>> #else
>>> int driver_deferred_probe_timeout;
>>> #endif
>>> diff --git a/drivers/of/property.c b/drivers/of/property.c
>>> index 967f79b59016..e5df6672fee6 100644
>>> --- a/drivers/of/property.c
>>> +++ b/drivers/of/property.c
>>> @@ -1384,7 +1384,7 @@ static struct device_node *parse_interrupts(struct
>>> device_node *np,
>>> static const struct supplier_bindings of_supplier_bindings[] = {
>>> { .parse_prop = parse_clocks, },
>>> { .parse_prop = parse_interconnects, },
>>> - { .parse_prop = parse_iommus, .optional = true, },
>>> + { .parse_prop = parse_iommus, },
>>> { .parse_prop = parse_iommu_maps, .optional = true, },
>>> { .parse_prop = parse_mboxes, },
>>> { .parse_prop = parse_io_channels, },
>>>
>>> Without that a really nasty things happened.
> I have a command line option to do this without code changes. Use
> fw_devlink.strict=1. That makes all optional properties into mandatory
> ones.
>
> I sent out a series[1] that tried to make fw_devlink.strict=1 the
> default and then use the timeout behavior (more details) to handle
> cases where iommu and dmas (or any other supplier) are optional on a
> specific board. The cover letter of [1] should give some more context.

I'm for removing the optional properties or making them dependent on the
whole subsystem availability (see my next comments).


>>> Initialization of the built-in drivers and loading modules takes time,
>>> so the default 10s deferred probe timeout is not enough to ensure that
>>> the built-in driver won't be probed before the Exynos IOMMU driver is
>>> loaded.
> The 10 second is the minimum delay from the time we hit late_initcall.
> If a driver is registered before the 10s expires, then the timer will
> be extended by another 10s. This behavior landed sometime around the
> end of May 2022. So it should have been in your tree when you tested
> this. I'm surprised this isn't sufficient for your case. Is there
> really a 10s gap in your boot sequence where no module is being loaded
> and then IOMMU modules get loaded later on? I'm kinda surprised by
> this. Is it this long because some serial UART is enabled and it's
> slowing down boot? Or something else?
>
> I'm not saying your case isn't valid or we shouldn't extend the
> timeout. I'm just trying to understand why the current timer behavior
> wasn't able to cover your case.


Well, I have almost all possible CONFIG_DEBUG_* options enabled in
exynos_defconfig, as well as kernel messages routed to UART console, so
all my test system are really slow, but I do this intentionally. I
suspect that over half of the issues I've reported were found because of
such 'unusual' config. Loading modules also takes time in such case, so
maybe the timeout should be extended also by each loaded module?


>> Yeah, the whole time-based sync looks nasty... I remember coming
>> across the slides by Andrzej Hajda called "Deferred Problem" [1], but
>> I guess the proposed solution was never applied. Just hope that
>> increasing the timeout is upstreamable solution.
>>
>> [1] https://events19.linuxfoundation.org/wp-content/uploads/2017/12/Deferred-Problem-Issues-With-Complex-Dependencies-Between-Devices-in-Linux-Kernel-Andrzej-Hajda-Samsung.pdf
> Sam, I kinda skimmed the slides right now. Looks like it talks about
> device links and why they aren't sufficient and makes an alternate
> proposal. fw_devlink is a solution that uses device links and I think
> addresses a lot of the issues that were raised about device links.
> There's still a bunch of TODOs left, but I think the end goal is the
> same. I'm hoping to keep chipping away at it. For now, I've tried to
> make the timer a bit more smart about detecting when modules are
> getting loaded and extending the timer. fw_devlink also enables
> something called sync_state() that's invaluable on a fully modular
> system (search lore for references to that to get some idea).
>
> The slides talk about a solution that will allow devices to probe with
> limited functionality with whatever suppliers are available and then
> reprobe as more suppliers are available. I'm not sure how well that'll
> work across the board. It's going to be a bit weird if your phone
> display goes off and then comes on again because an IOMMU driver got
> loaded (and it can now do DRM playback). For now, I'm not going to
> focus on that option because there are enough existing issues/TODOs to
> work on for fw_devlink.
>
>>> The second change fixes the problem that driver core probes Exynos IOMMU
>>> controllers in parallel to probing the master devices, what results in
>>> calling exynos_iommu_of_xlate() and exynos_iommu_probe_device() even on
>>> the partially initialized IOMMU controllers or initializing the dma_ops
>>> under the already probed and working master device. This was easy to
>>> observe especially on the master devices with multiple IOMMU
>>> controllers. I wasn't able to solve this concurrency/race issues inside
>>> the Exynos IOMMU driver.
>>>
>>> Frankly speaking I don't know what is the rationale for making the
>>> 'iommus' property optional, but this simply doesn't work well with IOMMU
>>> driver being a module. CCed Saravana and Rob for this.
>>>
>> The patch which makes 'iommus' optional doesn't provide much of
>> insight on reasons in commit message either.
> This was the commit text:
>
> Not all DT bindings are mandatory bindings. Add support for optional DT
> bindings and mark iommus, iommu-map, dmas as optional DT bindings.
>
> I thought it was obvious enough, but I guess I could have done better.
> Geert convinced me that iommu's aren't always necessary and devices
> could work perfectly well without them or dmas. And he has a bunch of
> boards like that. So I went with adding optional and then introducing
> fw_devlink.strict.

Well, that depends heavily on the kernel config and the intentions
behind it. Indeed one might have IOMMU driver not selected in the config
(for various reasons, like forcing drivers to use contiguous memory
buffers for max performance) and system should boot fine. That time one
didn't consider the IOMMUs being compiled as modules. The main problem
is how to distinguish between the first case (intentionally no IOMMU
driver or even no IOMMU support) and the modularized case (driver not
yet loaded). I assume that one can skip waiting for IOMMUs only if the
whole IOMMU subsystem is not compiled in, otherwise the IOMMU driver can
still come in and it cannot properly start operating if client device
drivers already started their operation.

DMAs, although similar a bit, are imho a bit different. Drivers usually
have some kind of fallback if DMA is not yet available, but usually
drivers can switch their operation on-fly between PIO and DMA (and some
even do that intentionally and use PIO for very small transfers).

> However, at this point in time, I believe none of them should be
> marked as optional because technically any property can be optional
> depending on what the firmware has set up and what the driver does. We
> should figure this out at runtime on a board level -- which is what
> [1] is trying to do. Yeah, not very pretty, but there hasn't been a
> better solution that's not "have userspace tell us it's done loading
> modules" (that's a "kernel depends on userspace to work correctly"
> thing that no one likes). I've fixed some of the issues raised in [1]
> in a fw_devlink improvement series[2] and I plan on continuing to work
> on this until hopefully [1] can land.
>
>>> Without fixing the above issues, I would add a warning that compiling
>>> the driver as a module leads to serious issues.
>>>
>> Nice catch! It doesn't reproduce on my platform, alas. Can I expect
>> you to submit those patches? If so, I'll probably just wait for those
>> to be applied, and then re-send my modularization series on top of it.
>> Does that sounds reasonable?
> For now, maybe we could add a config to enable fw_devlink.strict=1 by
> default and then select it if you make specific iommu drivers into
> modules? And then Geert won't set it for his driver, but you can set
> it for your driver?

I think the best we can do now is to either add a kconfig option to
force strict fw_devlinks (so Exynos IOMMU can be selected as a module
only if that one is selected) or do a runtime check during the Exynos
IOMMU initalization - and fail with appropriate error message if
fw_devlink.strict!=1. What do you think?


Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland