Re: [PATCH v1] clk: Convert managed get functions to devm_add_action API
From: Dmitry Torokhov
Date: Thu Dec 12 2019 - 19:16:35 EST
On Thu, Dec 12, 2019 at 09:08:04PM +0000, Robin Murphy wrote:
> On 2019-12-12 7:10 pm, Dmitry Torokhov wrote:
> > On Thu, Dec 12, 2019 at 06:15:16PM +0000, Robin Murphy wrote:
> > > On 12/12/2019 4:59 pm, Marc Gonzalez wrote:
> > > > On 12/12/2019 15:47, Robin Murphy wrote:
> > > >
> > > > > On 12/12/2019 1:53 pm, Marc Gonzalez wrote:
> > > > >
> > > > > > On 11/12/2019 23:28, Dmitry Torokhov wrote:
> > > > > >
> > > > > > > On Wed, Dec 11, 2019 at 05:17:28PM +0100, Marc Gonzalez wrote:
> > > > > > >
> > > > > > > > What is the rationale for the devm_add_action API?
> > > > > > >
> > > > > > > For one-off and maybe complex unwind actions in drivers that wish to use
> > > > > > > devm API (as mixing devm and manual release is verboten). Also is often
> > > > > > > used when some core subsystem does not provide enough devm APIs.
> > > > > >
> > > > > > Thanks for the insight, Dmitry. Thanks to Robin too.
> > > > > >
> > > > > > This is what I understand so far:
> > > > > >
> > > > > > devm_add_action() is nice because it hides/factorizes the complexity
> > > > > > of the devres API, but it incurs a small storage overhead of one
> > > > > > pointer per call, which makes it unfit for frequently used actions,
> > > > > > such as clk_get.
> > > > > >
> > > > > > Is that correct?
> > > > > >
> > > > > > My question is: why not design the API without the small overhead?
> > > > >
> > > > > Probably because on most architectures, ARCH_KMALLOC_MINALIGN is at
> > > > > least as big as two pointers anyway, so this "overhead" should mostly be
> > > > > free in practice. Plus the devres API is almost entirely about being
> > > > > able to write simple robust code, rather than absolute efficiency - I
> > > > > mean, struct devres itself is already 5 pointers large at the absolute
> > > > > minimum ;)
> > > >
> > > > (3 pointers: 1 list_head + 1 function pointer)
> > >
> > > Ah yes, I failed to mentally preprocess the debug config :)
> > >
> > > > I'm confused. The first patch was criticized for potentially adding
> > > > an extra pointer for every devm_clk_get (e.g. 800 bytes on a 64-bit
> > > > platform with 100 clocks).
> > >
> > > I'm not sure it was a criticism so much as an observation of an aspect that
> > > deserved consideration (certainly it was on my part, and I read Dmitry's "It
> > > might still, ..." as implying the same). I'd say by this point it has been
> > > thoroughly considered, and personally I'm now happy with the conclusion that
> > > the kind of embedded platforms that will have many dozens of clocks are also
> > > the kind that will tend to have enough padding to make it moot, and thus the
> > > code simplification probably is worthwhile overall.
> >
> > I wonder if we could actually avoid allocating the data with
> > ARCH_KMALLOC_MINALIGN in all the cases. It is definitely needed for the
> > devm_k*alloc() group of functions as they are direct replacement for
> > k*alloc() APIs that give users aligned memory, but for other data
> > structures (clocks, regulators, etc, etc) it is not required.
>
> That's a very good point - perhaps something like this (only done properly)?
Yes, but it has to be done carefully.
>
> Robin.
>
> diff --git a/drivers/base/devres.c b/drivers/base/devres.c
> index 0bbb328bd17f..2382f963abbe 100644
> --- a/drivers/base/devres.c
> +++ b/drivers/base/devres.c
> @@ -26,14 +26,7 @@ struct devres_node {
>
> struct devres {
> struct devres_node node;
> - /*
> - * Some archs want to perform DMA into kmalloc caches
> - * and need a guaranteed alignment larger than
> - * the alignment of a 64-bit integer.
> - * Thus we use ARCH_KMALLOC_MINALIGN here and get exactly the same
> - * buffer alignment as if it was allocated by plain kmalloc().
> - */
> - u8 __aligned(ARCH_KMALLOC_MINALIGN) data[];
> + u8 data[];
> };
>
> struct devres_group {
> @@ -810,6 +803,17 @@ static int devm_kmalloc_match(struct device *dev, void *res, void *data)
> void * devm_kmalloc(struct device *dev, size_t size, gfp_t gfp)
> {
> struct devres *dr;
> + size_t align;
> +
> + /*
> + * Some archs want to perform DMA into kmalloc caches
> + * and need a guaranteed alignment larger than
> + * the alignment of a 64-bit integer.
> + * Thus we use ARCH_KMALLOC_MINALIGN here and get exactly the same
> + * buffer alignment as if it was allocated by plain kmalloc().
> + */
> + align = (ARCH_KMALLOC_MINALIGN - sizeof(*dr)) % ARCH_KMALLOC_MINALIGN;
> + size += align;
>
> /* use raw alloc_dr for kmalloc caller tracing */
> dr = alloc_dr(devm_kmalloc_release, size, gfp, dev_to_node(dev));
> @@ -822,7 +826,7 @@ void * devm_kmalloc(struct device *dev, size_t size, gfp_t gfp)
> */
> set_node_dbginfo(&dr->node, "devm_kzalloc_release", size);
> devres_add(dev, dr->data);
I think it has to be "devres_add(dev, dr->data + align);" here, as the match
function compares the pointer passed to devm_kfree() with the one stored in
the devres structure.
> - return dr->data;
> + return dr->data + align;
> }
> EXPORT_SYMBOL_GPL(devm_kmalloc);
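To illustrate the concern, here is a minimal userspace sketch (not kernel
code). Everything prefixed demo_ is hypothetical: DEMO_MINALIGN stands in for
ARCH_KMALLOC_MINALIGN, struct demo_devres is a simplified header, and the
"registered" field models the assumption that the match step compares against
an explicitly recorded pointer; the real devres.c may derive that pointer
differently. The padding is computed in a form that does not rely on unsigned
wrap-around when the header is larger than the alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: stand-in for ARCH_KMALLOC_MINALIGN; the real value is per-arch. */
#define DEMO_MINALIGN 128u

/* Hypothetical, simplified header; not the kernel's struct devres. */
struct demo_devres {
	void *prev, *next;        /* stands in for the list_head */
	void (*release)(void *);  /* release callback */
	void *registered;         /* pointer a match step would compare against */
	unsigned char data[];     /* payload with no forced alignment */
};

/*
 * Bytes of padding needed after a header of header_size bytes so that the
 * payload starts on a DEMO_MINALIGN boundary, assuming the allocation itself
 * starts on such a boundary.
 */
static size_t demo_pad(size_t header_size)
{
	return (DEMO_MINALIGN - header_size % DEMO_MINALIGN) % DEMO_MINALIGN;
}

/*
 * Allocate header + padding + size bytes and return the aligned payload.
 * register_padded selects which pointer gets recorded for later matching:
 * non-zero records data + pad, zero records plain data.
 */
static void *demo_alloc(size_t size, int register_padded,
			struct demo_devres **out)
{
	size_t hdr = offsetof(struct demo_devres, data);
	size_t pad = demo_pad(hdr);
	size_t total = hdr + pad + size;
	struct demo_devres *dr;

	/* aligned_alloc() wants the size to be a multiple of the alignment. */
	total = (total + DEMO_MINALIGN - 1) / DEMO_MINALIGN * DEMO_MINALIGN;
	dr = aligned_alloc(DEMO_MINALIGN, total);
	if (!dr)
		return NULL;

	dr->registered = register_padded ? dr->data + pad : dr->data;
	*out = dr;
	return dr->data + pad;
}

/* Models a match step: does the caller's pointer equal the recorded one? */
static int demo_match(const struct demo_devres *dr, const void *p)
{
	return dr->registered == p;
}
```

In this model, recording dr->data while returning dr->data + pad means a later
lookup with the returned pointer never matches, which is the kind of mismatch
I would expect devm_kfree() to hit.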
Thanks.
--
Dmitry