Re: nr_cpu_ids vs AMD 3970x(32 physical CPUs)

From: Uladzislau Rezki
Date: Fri Jul 03 2020 - 18:22:33 EST


On Fri, Jul 03, 2020 at 10:51:57PM +0100, Matthew Wilcox wrote:
> On Fri, Jul 03, 2020 at 11:20:47PM +0200, Uladzislau Rezki wrote:
> > Some background:
> > Actually I have been thinking about making the vmalloc address space
> > per-CPU, i.e. dividing it into per-CPU address spaces so that an
> > allocation becomes lock-less. That would eliminate high lock
> > contention. When I built a prototype I noticed that there is a silly
> > issue with nr_cpu_ids on some systems.
>
> vfree() may happen on a different CPU from the one which called vmalloc(),
> so I'm not sure you're going to get as large a win as you think you will.
>
Hmm.. According to my tests the difference is approximately 7x/10x, but
I should also say that as of now those tests are only a draft. Indeed,
vfree() can be done on a different CPU, but I do not think it is a big
issue. The main goal is to make vmalloc() scale with the number of CPUs
in a system, because the more CPUs there are, the more contended an
allocation becomes.

Doing vfree() on another CPU would be a kind of noise (the critical
section is short), whereas the other CPUs will still be able to make
progress because they take their own locks.
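
To make the idea a bit more concrete, here is a rough user-space sketch,
not the prototype itself. It assumes the address space is cut into equal
slices, one per CPU, each with its own lock and a trivial bump
allocator. All names in it (NR_ZONES, ZONE_SIZE, SPACE_BASE, struct
zone, zone_alloc(), zone_free()) are made up for illustration and do not
exist in mm/vmalloc.c.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NR_ZONES   4            /* hypothetical stand-in for nr_cpu_ids */
#define ZONE_SIZE  (1UL << 20)  /* hypothetical size of one per-CPU slice */
#define SPACE_BASE 0x100000UL   /* hypothetical start of the address space */

struct zone {
	atomic_flag lock;   /* per-zone spinlock, no global lock anywhere */
	uintptr_t next;     /* bump pointer inside this zone */
	size_t live;        /* number of outstanding allocations */
};

static struct zone zones[NR_ZONES];

static void zone_lock(struct zone *z)
{
	while (atomic_flag_test_and_set_explicit(&z->lock, memory_order_acquire))
		;  /* spin; the critical sections below are short */
}

static void zone_unlock(struct zone *z)
{
	atomic_flag_clear_explicit(&z->lock, memory_order_release);
}

static void zones_init(void)
{
	for (int i = 0; i < NR_ZONES; i++) {
		atomic_flag_clear(&zones[i].lock);
		zones[i].next = SPACE_BASE + (uintptr_t)i * ZONE_SIZE;
		zones[i].live = 0;
	}
}

/* Allocate from the caller's own zone; only that zone's lock is taken. */
static uintptr_t zone_alloc(int cpu, size_t size)
{
	struct zone *z = &zones[cpu];
	uintptr_t end = SPACE_BASE + (uintptr_t)(cpu + 1) * ZONE_SIZE;
	uintptr_t addr = 0;  /* 0 means "no space left in this zone" */

	zone_lock(z);
	if (size != 0 && size <= end - z->next) {
		addr = z->next;
		z->next += size;
		z->live++;
	}
	zone_unlock(z);
	return addr;
}

/*
 * Free from any CPU: the owning zone is derived from the address, so a
 * remote free contends only with that one zone, not with all the others.
 * Returns the index of the zone the address belonged to.
 */
static int zone_free(uintptr_t addr)
{
	assert(addr >= SPACE_BASE &&
	       addr < SPACE_BASE + (uintptr_t)NR_ZONES * ZONE_SIZE);

	int owner = (int)((addr - SPACE_BASE) / ZONE_SIZE);
	struct zone *z = &zones[owner];

	zone_lock(z);
	z->live--;  /* a real allocator would return the range for reuse here */
	zone_unlock(z);
	return owner;
}
```

In this sketch a CPU that calls zone_free() on an address it did not
allocate briefly takes the owner's lock, while allocations in all other
zones are not affected.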

--
Vlad Rezki