Re: [PATCH v10 3/5] CPM/QE: use genalloc to manage CPM/QE muram

From: Scott Wood
Date: Tue Sep 22 2015 - 20:19:50 EST


On Tue, 2015-09-22 at 03:10 -0500, Zhao Qiang-B45475 wrote:
> On Tue, Sep 22, 2015 at 06:47 AM +0800, Wood Scott-B07421 wrote:
> > -----Original Message-----
> > From: Wood Scott-B07421
> > Sent: Tuesday, September 22, 2015 6:47 AM
> > To: Zhao Qiang-B45475
> > Cc: linux-kernel@xxxxxxxxxxxxxxx; linuxppc-dev@xxxxxxxxxxxxxxxx;
> > lauraa@xxxxxxxxxxxxxx; Xie Xiaobo-R63061; benh@xxxxxxxxxxxxxxxxxxx; Li
> > Yang-Leo-R58472; paulus@xxxxxxxxx
> > Subject: Re: [PATCH v10 3/5] CPM/QE: use genalloc to manage CPM/QE muram
> >
> > On Fri, Sep 18, 2015 at 03:15:19PM +0800, Zhao Qiang wrote:
> > > Use genalloc to manage CPM/QE muram instead of rheap.
> > >
> > > Signed-off-by: Zhao Qiang <qiang.zhao@xxxxxxxxxxxxx>
> > > ---
> > > Changes for v9:
> > > - splitted from patch 3/5, modify cpm muram management functions.
> > > Changes for v10:
> > > - modify cpm muram first, then move to qe_common
> > > - modify commit.
> > >
> > > arch/powerpc/platforms/Kconfig | 1 +
> > > arch/powerpc/sysdev/cpm_common.c | 150
> > > +++++++++++++++++++++++++++------------
> > > 2 files changed, 107 insertions(+), 44 deletions(-)
> > >
> > > diff --git a/arch/powerpc/platforms/Kconfig
> > > b/arch/powerpc/platforms/Kconfig index b7f9c40..01f98a2 100644
> > > --- a/arch/powerpc/platforms/Kconfig
> > > +++ b/arch/powerpc/platforms/Kconfig
> > > @@ -276,6 +276,7 @@ config QUICC_ENGINE
> > > bool "Freescale QUICC Engine (QE) Support"
> > > depends on FSL_SOC && PPC32
> > > select PPC_LIB_RHEAP
> > > + select GENERIC_ALLOCATOR
> > > select CRC32
> > > help
> > > The QUICC Engine (QE) is a new generation of communications diff
> > > --git a/arch/powerpc/sysdev/cpm_common.c
> > > b/arch/powerpc/sysdev/cpm_common.c
> > > index 4f78695..453d18c 100644
> > > --- a/arch/powerpc/sysdev/cpm_common.c
> > > +++ b/arch/powerpc/sysdev/cpm_common.c
> > > @@ -17,6 +17,7 @@
> > > * published by the Free Software Foundation.
> > > */
> > >
> > > +#include <linux/genalloc.h>
> > > #include <linux/init.h>
> > > #include <linux/of_device.h>
> > > #include <linux/spinlock.h>
> > > @@ -27,7 +28,6 @@
> > >
> > > #include <asm/udbg.h>
> > > #include <asm/io.h>
> > > -#include <asm/rheap.h>
> > > #include <asm/cpm.h>
> > >
> > > #include <mm/mmu_decl.h>
> > > @@ -65,14 +65,24 @@ void __init udbg_init_cpm(void) } #endif
> > >
> > > +static struct gen_pool *muram_pool;
> > > +static struct genpool_data_align muram_pool_data; static struct
> > > +genpool_data_fixed muram_pool_data_fixed;
> >
> > Why are these global? If you keep the data local to the caller (and use
> > gen_pool_alloc_data()) then you probably don't need cpm_muram_lock.
>
> Ok
>
> >
> > > static spinlock_t cpm_muram_lock;
> > > -static rh_block_t cpm_boot_muram_rh_block[16]; -static rh_info_t
> > > cpm_muram_info; static u8 __iomem *muram_vbase; static phys_addr_t
> > > muram_pbase;
> > >
> > > -/* Max address size we deal with */
> > > +struct muram_block {
> > > + struct list_head head;
> > > + unsigned long start;
> > > + int size;
> > > +};
> > > +
> > > +static LIST_HEAD(muram_block_list);
> > > +
> > > +/* max address size we deal with */
> > > #define OF_MAX_ADDR_CELLS 4
> > > +#define GENPOOL_OFFSET 4096
> >
> > Is 4096 bytes the maximum alignment you'll ever need? Wouldn't it be
> > safer to use a much larger offset?
>
> Yes, 4096 is the maximum alignment I ever need.

Still, I'd be more comfortable with a larger offset.

Better yet would be using gen_pool_add_virt() and using virtual addresses for
the allocations, similar to http://patchwork.ozlabs.org/patch/504000/

> > > int cpm_muram_init(void)
> > > {
> > > @@ -86,113 +96,165 @@ int cpm_muram_init(void)
> > > if (muram_pbase)
> > > return 0;
> > >
> > > - spin_lock_init(&cpm_muram_lock);
> >
> > Why are you eliminating the lock init, given that you're not getting rid
> > of the lock?
> >
> > > - /* initialize the info header */
> > > - rh_init(&cpm_muram_info, 1,
> > > - sizeof(cpm_boot_muram_rh_block) /
> > > - sizeof(cpm_boot_muram_rh_block[0]),
> > > - cpm_boot_muram_rh_block);
> > > -
> > > np = of_find_compatible_node(NULL, NULL, "fsl,cpm-muram-data");
> > > if (!np) {
> > > /* try legacy bindings */
> > > np = of_find_node_by_name(NULL, "data-only");
> > > if (!np) {
> > > - printk(KERN_ERR "Cannot find CPM muram data node");
> > > + pr_err("Cannot find CPM muram data node");
> > > ret = -ENODEV;
> > > goto out;
> > > }
> > > }
> > >
> > > + muram_pool = gen_pool_create(1, -1);
> >
> > Do we really need byte-granularity?
>
> It is 2byte-granularity, 4byte-granularity is needed
>
> >
> > > muram_pbase = of_translate_address(np, zero);
> > > if (muram_pbase == (phys_addr_t)OF_BAD_ADDR) {
> > > - printk(KERN_ERR "Cannot translate zero through CPM muram
> > node");
> > > + pr_err("Cannot translate zero through CPM muram node");
> > > ret = -ENODEV;
> > > - goto out;
> > > + goto err;
> > > }
> > >
> > > while (of_address_to_resource(np, i++, &r) == 0) {
> > > if (r.end > max)
> > > max = r.end;
> > > + ret = gen_pool_add(muram_pool, r.start - muram_pbase +
> > > + GENPOOL_OFFSET, resource_size(&r), -1);
> > > + if (ret) {
> > > + pr_err("QE: couldn't add muram to pool!\n");
> > > + goto err;
> > > + }
> > >
> > > - rh_attach_region(&cpm_muram_info, r.start - muram_pbase,
> > > - resource_size(&r));
> > > }
> > >
> > > muram_vbase = ioremap(muram_pbase, max - muram_pbase + 1);
> > > if (!muram_vbase) {
> > > - printk(KERN_ERR "Cannot map CPM muram");
> > > + pr_err("Cannot map QE muram");
> > > ret = -ENOMEM;
> > > + goto err;
> > > }
> > > -
> > > + goto out;
> > > +err:
> > > + gen_pool_destroy(muram_pool);
> > > out:
> > > of_node_put(np);
> > > return ret;
> >
> > Having both "err" and "out" is confusing. Instead call it "out_pool" or
> > similar.
>
> Ok
>
> > > {
> > > - int ret;
> > > +
> > > + unsigned long start;
> > > unsigned long flags;
> > > + unsigned long size_alloc = size;
> > > + struct muram_block *entry;
> > > + int end_bit;
> > > + int order = muram_pool->min_alloc_order;
> > >
> > > spin_lock_irqsave(&cpm_muram_lock, flags);
> > > - ret = rh_free(&cpm_muram_info, offset);
> > > + end_bit = (offset >> order) + ((size + (1UL << order) - 1) >>
> > order);
> > > + if ((offset + size) > (end_bit << order))
> > > + size_alloc = size + (1UL << order);
> >
> > Why do you need to do all these calculations here?
>
> So do it in gen_pool_fixed_alloc?

Could you explain why they're needed at all?

> gen_pool_fixed_alloc just
> Can see numbers of blocks. It can't do calculations in gen_pool_fixed_alloc.

Are you saying that this:

struct genpool_data_fixed {
unsigned long offset; /* The offset of the specific region */
};

refers to blocks and not bytes? And you didn't even mention that in the
comment?

It should be bytes. Actually, it should be an address, not an offset. It
should be the same value that you receive back from the allocator. And I'm
not sure how this is going to work with genalloc chunks that don't start at
zero. I think it really does need to be a separate toplevel function, and
not just an allocation algorithm (or else the allocation algorithm is going
to need more context on what chunk it's working on).

Speaking of chunks and the allocation function, I wonder what happens if you
hit "if (start_bit >= end_bit) continue;" and then enter the loop with a new
chunk and start_bit != 0...

-Scott

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/