Re: [PATCH v1 0/5] Solve postboot supplier cleanup and optimize probe ordering

From: Rob Herring
Date: Fri May 24 2019 - 09:07:55 EST

Next message: Josh Poimboeuf: "Re: Getting empty callchain from perf_callchain_kernel()"
Previous message: Vince Weaver: "Re: [PATCH 1/2] perf/x86: Disable non generic regs for software/probe events"
In reply to: Saravana Kannan: "Re: [PATCH v1 0/5] Solve postboot supplier cleanup and optimize probe ordering"
Next in thread: Saravana Kannan: "Re: [PATCH v1 0/5] Solve postboot supplier cleanup and optimize probe ordering"

On Thu, May 23, 2019 at 8:01 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
>
> Add a generic "depends-on" property that allows specifying mandatory
> functional dependencies between devices. Add device-links after the
> devices are created (but before they are probed) by looking at this
> "depends-on" property.

The DT already has dependency information. A node with a 'clocks'
property has its dependency right there. We should use that rather
than duplicate the information.
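
As a rough illustration of reading dependencies from the properties
that are already there, here is a minimal sketch. It uses a toy
dict-based tree rather than the kernel's DT code; the SUPPLIER_PROPS
list and the suppliers_of() helper are hypothetical names made up for
this example, and phandles are modeled as plain node names.

```python
# Toy model of a device tree: node name -> {property: [referenced node names]}.
# Real DT properties hold phandles plus specifier cells; that detail is omitted.

# Hypothetical, illustrative list of properties that point at suppliers.
SUPPLIER_PROPS = ("clocks", "interrupt-parent", "pinctrl-0", "vin-supply")


def suppliers_of(tree, node):
    """Return the sorted names of nodes that `node` references as suppliers."""
    found = set()
    for prop in SUPPLIER_PROPS:
        for target in tree[node].get(prop, []):
            # Ignore dangling references and self-references.
            if target in tree and target != node:
                found.add(target)
    return sorted(found)


tree = {
    "clk0": {},
    "reg0": {},
    "uart0": {"clocks": ["clk0"], "vin-supply": ["reg0"]},
}
```

Here suppliers_of(tree, "uart0") yields the clock and regulator nodes
without any extra "depends-on" property in the tree.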

> This property is used instead of existing DT properties that specify
> phandles of other devices (Eg: clocks, pinctrl, regulators, etc). This
> is because not all resources referred to by existing DT properties are
> mandatory functional dependencies. Some devices/drivers might be able
> to operate with reduced functionality when some of the resources
> aren't available. For example, a device could operate in polling mode
> if no IRQ is available, a device could skip doing power management if
> clock or voltage control isn't available and they are left on, etc.

Yeah, but none of these examples are typically what you'd want to
happen. These cases are a property of the OS, not the DT. For example,
until recently, if you added pinctrl bindings to your DT, the kernel
would no longer boot because it would be looking for a pinctrl driver.
That's wrong because the DT should not be coupled to the OS like that.
Adding this property will cause the same problem.

> So, adding mandatory functional dependency links between devices by
> looking at referred phandles in DT properties won't work as it would
> prevent probing devices that could be probed. By having an explicit
> depends-on property, we can handle these cases correctly.
>
> Having functional dependencies explicitly called out in DT and
> automatically added before the devices are probed, provides the
> following benefits:
>
> - Optimizes device probe order and avoids the useless work of
> attempting probes of devices that will not probe successfully
> (because their suppliers aren't present or haven't probed yet).
>
> For example, in a commonly available mobile SoC, registering just
> one consumer device's driver at an initcall level earlier than the
> supplier device's driver causes 11 failed probe attempts before the
> consumer device probes successfully. This was with a kernel with all
> the drivers statically compiled in. This problem gets a lot worse if
> all the drivers are loaded as modules without direct symbol
> dependencies.

Do you have data on how much time is spent? Past 'smarter probing'
attempts have not shown a significant difference.

> - Supplier devices like clock providers, regulators providers, etc
> need to keep the resources they provide active and at a particular
> state(s) during boot up even if their current set of consumers don't
> request the resource to be active. This is because the rest of the
> consumers might not have probed yet and turning off the resource
> before all the consumers have probed could lead to a hang or
> undesired user experience.

We already know generally what devices are dependencies because you
just listed them. Why don't we make the kernel smarter by
instantiating these core devices/drivers first instead of relying on
initcall and link order?
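
To make "instantiate the suppliers first" concrete, here is a minimal
sketch of ordering by a dependency graph instead of by initcall level
or link order. It is a toy model, not kernel code: probe_order() and
the device names are hypothetical, and the ordering itself comes from
Python's standard graphlib.TopologicalSorter (Python 3.9+).

```python
from graphlib import TopologicalSorter


def probe_order(deps):
    """Order devices so that every supplier comes before its consumers.

    `deps` maps each device name to the list of suppliers it needs.
    graphlib raises CycleError if the dependencies form a cycle.
    """
    sorter = TopologicalSorter()
    for consumer in sorted(deps):
        # add(node, *predecessors): predecessors are emitted before node.
        sorter.add(consumer, *sorted(deps[consumer]))
    return list(sorter.static_order())


deps = {
    "uart0": ["clk0", "reg0"],  # consumer of a clock and a regulator
    "reg0": ["clk0"],           # the regulator itself needs the clock
    "clk0": [],                 # core supplier with no dependencies
}
order = probe_order(deps)
```

With this input the clock provider lands ahead of the regulator, and
both land ahead of the UART, regardless of the order in which the
entries were registered.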

> Some frameworks (Eg: regulator) handle this today by turning off
> "unused" resources at late_initcall_sync and hoping all the devices
> have probed by then. This is not a valid assumption for systems with
> loadable modules. Other frameworks (Eg: clock) just don't handle
> this due to the lack of a clear signal for when they can turn off
> resources. This leads to downstream hacks to handle cases like this
> that can easily be solved in the upstream kernel.

IMO, we should get rid of this auto-disabling.

Rob
