Re: [PATCH 1/2] percpu: km: remove SMP check

From: Dennis Zhou
Date: Wed Feb 27 2019 - 11:41:34 EST


On Wed, Feb 27, 2019 at 01:02:16PM +0000, Peng Fan wrote:
> Hi Dennis
>
> > -----Original Message-----
> > From: Dennis Zhou [mailto:dennis@xxxxxxxxxx]
> > Sent: 2019å2æ27æ 1:04
> > To: Christopher Lameter <cl@xxxxxxxxx>
> > Cc: Peng Fan <peng.fan@xxxxxxx>; tj@xxxxxxxxxx; linux-mm@xxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx; van.freenix@xxxxxxxxx
> > Subject: Re: [PATCH 1/2] percpu: km: remove SMP check
> >
> > On Tue, Feb 26, 2019 at 03:16:44PM +0000, Christopher Lameter wrote:
> > > On Mon, 25 Feb 2019, Dennis Zhou wrote:
> > >
> > > > > @@ -27,7 +27,7 @@
> > > > > * chunk size is not aligned. percpu-km code will whine about it.
> > > > > */
> > > > >
> > > > > -#if defined(CONFIG_SMP) &&
> > > > > defined(CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK)
> > > > > +#if defined(CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK)
> > > > > #error "contiguous percpu allocation is incompatible with paged first
> > chunk"
> > > > > #endif
> > > > >
> > > > > --
> > > > > 2.16.4
> > > > >
> > > >
> > > > Hi,
> > > >
> > > > I think keeping CONFIG_SMP makes this easier to remember
> > > > dependencies rather than having to dig into the config. So this is a NACK
> > from me.
> > >
> > > But it simplifies the code and makes it easier to read.
> > >
> > >
> >
> > I think the check isn't quite right after looking at it a little longer.
> > Looking at x86, I believe you can compile it with !SMP and
> > CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK will still be set. This should
> > still work because x86 has an MMU.
>
> You are right, x86 could boots up with NEED_PER_CPU_PAGE_FIRST_CHUNK
> =y and SMP=n. Tested with qemu, info as below:
>
> / # zcat /proc/config.gz | grep NEED_PER_CPU_KM
> CONFIG_NEED_PER_CPU_KM=y
> / # zcat /proc/config.gz | grep SMP
> CONFIG_BROKEN_ON_SMP=y
> # CONFIG_SMP is not set
> CONFIG_GENERIC_SMP_IDLE_THREAD=y
> / # zcat /proc/config.gz | grep NEED_PER_CPU_PAGE_FIRST_CHUNK
> CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
> / # cat /proc/cpuinfo
> processor : 0
> vendor_id : AuthenticAMD
> cpu family : 6
> model : 6
> model name : QEMU Virtual CPU version 2.5+
> stepping : 3
> cpu MHz : 3192.613
> cache size : 512 KB
> fpu : yes
> fpu_exception : yes
> cpuid level : 13
> wp : yes
> flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm nopl cpuid pni cx16 hypervisor lahf_lm svm 3dnowprefetl
> bugs : fxsave_leak sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass
> bogomips : 6385.22
> TLB size : 1024 4K pages
> clflush size : 64
> cache_alignment : 64
> address sizes : 42 bits physical, 48 bits virtual
> power management:
>
>
> But from the comments in this file:
> "
> * - CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK must not be defined. It's
> * not compatible with PER_CPU_KM. EMBED_FIRST_CHUNK should work
> * fine.
> "
>
> I did not read into details why it is not allowed, but x86 could still work with KM
> and NEED_PER_CPU_PAGE_FIRST_CHUNK.
>

The first chunk requires special handling on SMP to bring the static
variables into the percpu address space. On UP, identity mapping makes
static variables indistinguishable by aligning the percpu address space
and the virtual adress space. The percpu-km allocator allocates full
chunks at a time to deal with NOMMU archs. So the difference is if the
virtual address space is the same as the physical.

> >
> > I think more correctly it would be something like below, but I don't have the
> > time to fully verify it right now.
> >
> > Thanks,
> > Dennis
> >
> > ---
> > diff --git a/mm/percpu-km.c b/mm/percpu-km.c index
> > 0f643dc2dc65..69ccad7d9807 100644
> > --- a/mm/percpu-km.c
> > +++ b/mm/percpu-km.c
> > @@ -27,7 +27,7 @@
> > * chunk size is not aligned. percpu-km code will whine about it.
> > */
> >
> > -#if defined(CONFIG_SMP) &&
> > defined(CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK)
> > +#if !defined(CONFIG_MMU) &&
> > +defined(CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK)
> > #error "contiguous percpu allocation is incompatible with paged first chunk"
> > #endif
> >
>
> Acked-by: Peng Fan <peng.fan@xxxxxxx>
>
> Thanks,
> Peng

While this change may seem right to me. Verification would be to double
check other architectures too. x86 just happened to be a counter example
I had in mind. Unless someone reports this as being an issue or someone
takes the time to validate this more thoroughly than my cursory look.
I think the risk of this outweighs the benefit. This may be something I
fix in the future when I have more time. This would also involve making
sure the comments are consistent.

Thanks,
Dennis