Re: mm/percpu.c: use smarter memory allocation for struct pcpu_alloc_info (crisv32 hang)

From: Jesper Nilsson
Date: Wed Nov 22 2017 - 10:35:03 EST


On Mon, Nov 20, 2017 at 10:50:46PM -0500, Nicolas Pitre wrote:
> On Mon, 20 Nov 2017, Guenter Roeck wrote:
> > On Mon, Nov 20, 2017 at 07:28:21PM -0500, Nicolas Pitre wrote:
> > > On Mon, 20 Nov 2017, Guenter Roeck wrote:
> > >
> > > > bdata->node_min_pfn=60000 PFN_PHYS(bdata->node_min_pfn)=c0000000 start_off=536000 region=c0536000
> > >
> > > If PFN_PHYS(bdata->node_min_pfn)=c0000000 and
> > > region=c0536000 that means phys_to_virt() is a no-op.
> > >
> > No, it is |= 0x80000000
>
> Then the bootmem registration looks very fishy. If you have:
>
> > I think the problem is the 0x60000 in bdata->node_min_pfn. It is shifted
> > left by PFN_PHYS, making it 0xc0000000, which in my understanding is
> > a virtual address.
>
> Exact.
>
> #define __pa(x) ((unsigned long)(x) & 0x7fffffff)
> #define __va(x) ((void *)((unsigned long)(x) | 0x80000000))
>
> With that, the only possible physical address range you may have is
> 0x40000000 - 0x7fffffff, and it better start at 0x40000000. If that's
> not where your RAM is then something is wrong.
>
> This is in fact a very bad idea to define __va() and __pa() using
> bitwise operations as this hides mistakes like defining physical RAM
> address at 0xc0000000. Instead, it should look like:
>
> #define __pa(x) ((unsigned long)(x) - 0x80000000)
> #define __va(x) ((void *)((unsigned long)(x) + 0x80000000))
>
> This way, bad physical RAM address definitions will be caught
> immediately.
>
> > That doesn't seem to be easy to fix. It seems there is a mixup of physical
> > and virtual addresses in the architecture.
>
> Well... I don't think there is much else to say other than this needs
> fixing.

The memory map for the ETRAX FS has the SDRAM mapped at both
0x40000000-0x7fffffff and 0xc0000000-0xffffffff; the only difference
between the two windows is cached versus non-cached access.
That is actively (ab)used in the port, unfortunately, although I'm
uncertain whether this is the problem in this case.
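
To make the arithmetic concrete, here is a rough user-space sketch (not
kernel code) of the two __pa()/__va() styles applied to the numbers from
Guenter's log. The helper names are made up for illustration, addresses
are modelled as uint32_t, and the page shift of 13 is only inferred from
PFN 0x60000 turning into 0xc0000000:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: hypothetical names, 32-bit addresses as uint32_t.
 * SKETCH_PAGE_SHIFT is inferred from the log (0x60000 -> 0xc0000000). */
#define SKETCH_PAGE_SHIFT 13
#define SKETCH_OFFSET     0x80000000u

static inline uint32_t pfn_phys(uint32_t pfn) { return pfn << SKETCH_PAGE_SHIFT; }

/* Bitwise variants, as quoted above. */
static inline uint32_t pa_bitwise(uint32_t va) { return va & 0x7fffffffu; }
static inline uint32_t va_bitwise(uint32_t pa) { return pa | SKETCH_OFFSET; }

/* Arithmetic variants, as Nicolas suggests. */
static inline uint32_t pa_arith(uint32_t va) { return va - SKETCH_OFFSET; }
static inline uint32_t va_arith(uint32_t pa) { return pa + SKETCH_OFFSET; }

static void check_sketch(void)
{
	uint32_t base = pfn_phys(0x60000);	/* node_min_pfn from the log */

	assert(base == 0xc0000000u);

	/* Bitwise __va() leaves the value untouched, so the odd base is hidden. */
	assert(va_bitwise(base + 0x536000u) == 0xc0536000u);

	/* The arithmetic form wraps modulo 2^32 and returns a different value. */
	assert(va_arith(base) == 0x40000000u);

	/* For a base of 0x40000000 both forms agree. */
	assert(va_bitwise(0x40000000u) == 0xc0000000u);
	assert(va_arith(0x40000000u) == 0xc0000000u);
}
```

With the bitwise form, both 0x40000000 and 0xc0000000 end up at the same
"virtual" address, which is how the two SDRAM windows can get mixed up
without anything complaining.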

I get the same behaviour in my QEMU, but I've not been able to make
sense of anything yet...

> Nicolas

/^JN - Jesper Nilsson
--
Jesper Nilsson -- jesper.nilsson@xxxxxxxx