Re: [PATCH v1] clk: Convert managed get functions to devm_add_action API

From: Robin Murphy
Date: Thu Dec 12 2019 - 16:07:58 EST


On 2019-12-12 7:10 pm, Dmitry Torokhov wrote:
On Thu, Dec 12, 2019 at 06:15:16PM +0000, Robin Murphy wrote:
On 12/12/2019 4:59 pm, Marc Gonzalez wrote:
On 12/12/2019 15:47, Robin Murphy wrote:

On 12/12/2019 1:53 pm, Marc Gonzalez wrote:

On 11/12/2019 23:28, Dmitry Torokhov wrote:

On Wed, Dec 11, 2019 at 05:17:28PM +0100, Marc Gonzalez wrote:

What is the rationale for the devm_add_action API?

For one-off and possibly complex unwind actions in drivers that wish to use
the devm API (as mixing devm and manual release is verboten). It is also
often used when a core subsystem does not provide enough devm APIs.

Thanks for the insight, Dmitry. Thanks to Robin too.

This is what I understand so far:

devm_add_action() is nice because it hides/factorizes the complexity
of the devres API, but it incurs a small storage overhead of one
pointer per call, which makes it unfit for frequently used actions,
such as clk_get.

Is that correct?

My question is: why not design the API without the small overhead?

Probably because on most architectures, ARCH_KMALLOC_MINALIGN is at
least as big as two pointers anyway, so this "overhead" should mostly be
free in practice. Plus the devres API is almost entirely about being
able to write simple robust code, rather than absolute efficiency - I
mean, struct devres itself is already 5 pointers large at the absolute
minimum ;)

(3 pointers: 1 list_head, which is itself 2 pointers, + 1 function pointer)

Ah yes, I failed to mentally preprocess the debug config :)
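
To make the 3-versus-5 count concrete, here is a small user-space sketch that mirrors the shape of struct devres_node with and without the debug-only fields. The mock type names are hypothetical, and the field layout is an assumption modelled on drivers/base/devres.c of this era (a list_head, a release callback, plus a name and size under CONFIG_DEBUG_DEVRES), not the kernel definition itself.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct list_head: two pointers. */
struct mock_list_head {
	struct mock_list_head *next, *prev;
};

/* Hypothetical mirror of devres_node without the debug fields. */
struct mock_node {
	struct mock_list_head entry;
	void (*release)(void *dev, void *res);
};

/* Same layout with the two debug-only fields added (assumed). */
struct mock_node_debug {
	struct mock_list_head entry;
	void (*release)(void *dev, void *res);
	const char *name;
	size_t size;
};

/* Size of a type expressed in pointer-sized words. */
#define WORDS(type) (sizeof(type) / sizeof(void *))

static size_t node_words(void)
{
	return WORDS(struct mock_node);
}

static size_t node_debug_words(void)
{
	return WORDS(struct mock_node_debug);
}
```

On the usual ABIs where size_t and function pointers are pointer-sized, node_words() gives the non-debug count and node_debug_words() the debug count discussed above.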

I'm confused. The first patch was criticized for potentially adding
an extra pointer for every devm_clk_get (e.g. 800 bytes on a 64-bit
platform with 100 clocks).

I'm not sure it was a criticism so much as an observation of an aspect that
deserved consideration (certainly it was on my part, and I read Dmitry's "It
might still, ..." as implying the same). I'd say by this point it has been
thoroughly considered, and personally I'm now happy with the conclusion that
the kind of embedded platforms that will have many dozens of clocks are also
the kind that will tend to have enough padding to make it moot, and thus the
code simplification probably is worthwhile overall.
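
The padding argument can be sketched with a deliberately simplified size model. It assumes the header is padded up to the minimum alignment and that the allocator hands out memory in multiples of that alignment; real kmalloc size classes are coarser than this. The function names, the 24-byte header, and the example alignments are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to a multiple of align (align must be a power of two). */
static size_t round_up_to(size_t n, size_t align)
{
	return (n + align - 1) & ~(align - 1);
}

/*
 * Simplified footprint of one devres-style allocation: the header is
 * padded so the payload starts on a minalign boundary, and the payload
 * itself is rounded up to minalign (assumed allocator granularity).
 */
static size_t devres_footprint(size_t header, size_t payload, size_t minalign)
{
	return round_up_to(header, minalign) + round_up_to(payload, minalign);
}
```

Under this model, with a 24-byte header and a 128-byte minimum alignment, a one-pointer payload (8 bytes) and a two-pointer payload (16 bytes) both occupy 256 bytes, so the extra pointer costs nothing. With an 8-byte minimum alignment the two differ by 8 bytes, which is where the 800-byte figure for 100 clocks comes from.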

I wonder if we could actually avoid aligning the data to
ARCH_KMALLOC_MINALIGN in every case. It is definitely needed for the
devm_k*alloc() group of functions, as they are direct replacements for the
k*alloc() APIs that give users aligned memory, but for other data
structures (clocks, regulators, etc.) it is not required.

That's a very good point - perhaps something like this (only done properly)?

Robin.

diff --git a/drivers/base/devres.c b/drivers/base/devres.c
index 0bbb328bd17f..2382f963abbe 100644
--- a/drivers/base/devres.c
+++ b/drivers/base/devres.c
@@ -26,14 +26,7 @@ struct devres_node {

 struct devres {
 	struct devres_node node;
-	/*
-	 * Some archs want to perform DMA into kmalloc caches
-	 * and need a guaranteed alignment larger than
-	 * the alignment of a 64-bit integer.
-	 * Thus we use ARCH_KMALLOC_MINALIGN here and get exactly the same
-	 * buffer alignment as if it was allocated by plain kmalloc().
-	 */
-	u8 __aligned(ARCH_KMALLOC_MINALIGN) data[];
+	u8 data[];
 };

 struct devres_group {
@@ -810,6 +803,17 @@ static int devm_kmalloc_match(struct device *dev, void *res, void *data)
 void * devm_kmalloc(struct device *dev, size_t size, gfp_t gfp)
 {
 	struct devres *dr;
+	size_t align;
+
+	/*
+	 * Some archs want to perform DMA into kmalloc caches
+	 * and need a guaranteed alignment larger than
+	 * the alignment of a 64-bit integer.
+	 * Thus we use ARCH_KMALLOC_MINALIGN here and get exactly the same
+	 * buffer alignment as if it was allocated by plain kmalloc().
+	 */
+	align = (ARCH_KMALLOC_MINALIGN - sizeof(*dr)) % ARCH_KMALLOC_MINALIGN;
+	size += align;

 	/* use raw alloc_dr for kmalloc caller tracing */
 	dr = alloc_dr(devm_kmalloc_release, size, gfp, dev_to_node(dev));
@@ -822,7 +826,7 @@ void * devm_kmalloc(struct device *dev, size_t size, gfp_t gfp)
 	 */
 	set_node_dbginfo(&dr->node, "devm_kzalloc_release", size);
 	devres_add(dev, dr->data);
-	return dr->data;
+	return dr->data + align;
 }
 EXPORT_SYMBOL_GPL(devm_kmalloc);
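
For reference, the padding expression in the second hunk can be exercised in user space. This is only a sketch of the arithmetic: payload_pad() is a hypothetical helper, the header sizes are made-up examples, and it assumes the alignment is a power of two, so that the unsigned wraparound when the header is larger than the alignment still yields the right remainder.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes of padding to insert after a header of hdr bytes so that the
 * payload starts on a minalign boundary. Mirrors the expression used
 * in the diff; minalign must be a power of two, because the
 * subtraction may wrap around when hdr > minalign.
 */
static size_t payload_pad(size_t hdr, size_t minalign)
{
	return (minalign - hdr) % minalign;
}
```

With a 24-byte header and a 128-byte alignment this gives 104 bytes of padding, placing the payload at offset 128. With an 8-byte alignment the same header needs no padding at all, which is the saving the change is after for non-kmalloc users.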