Re: [PATCH] tcp: use kvmalloc_array() to allocate table_perturb

From: Eric Dumazet
Date: Tue Jun 07 2022 - 00:04:08 EST


On Mon, Jun 6, 2022 at 8:56 PM Muchun Song <songmuchun@xxxxxxxxxxxxx> wrote:
>
> On Tue, Jun 7, 2022 at 12:13 AM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> >
> > On Mon, Jun 6, 2022 at 9:05 AM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> > >
> > > On Mon, Jun 6, 2022 at 12:08 AM Muchun Song <songmuchun@xxxxxxxxxxxxx> wrote:
> > > >
> > > > On our servers, high-order (>= 6) memory may be unavailable because
> > > > we reserve lots of HugeTLB pages at boot, and the system then panics.
> > > > So use kvmalloc_array() to allocate table_perturb.
> > > >
> > > > Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
> > >
> > > Please add a Fixes: tag and CC the original author.
> > >
>
> Will do.
>
> > > Thanks.
> >
> > Also using alloc_large_system_hash() might be a better option anyway,
> > spreading pages on multiple nodes on NUMA hosts.
>
> Using alloc_large_system_hash() LGTM, but I don't see where
> alloc_large_system_hash() or vmalloc_huge() spreads the allocation
> across multiple nodes. What did I miss here?

This is done by default. You do not have to do anything special. Just
call alloc_large_system_hash().

For instance, on a two-socket system:

# grep alloc_large_system_hash /proc/vmallocinfo
0x000000005536618c-0x00000000a4ae0198 12288 alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
0x000000003beddc38-0x0000000092b61b54 12288 alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
0x0000000092b61b54-0x000000005c33d7fb 12288 alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
0x000000004c0588af-0x0000000012cf548f 12288 alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
0x000000008d50035e-0x00000000f434e297 266240 alloc_large_system_hash+0x1df/0x2f0 pages=64 vmalloc N0=32 N1=32
0x00000000fe631da3-0x00000000b60e95b8 268439552 alloc_large_system_hash+0x1df/0x2f0 pages=65536 vmalloc vpages N0=32768 N1=32768
0x00000000b60e95b8-0x0000000062eb7a11 528384 alloc_large_system_hash+0x1df/0x2f0 pages=128 vmalloc N0=64 N1=64
0x0000000062eb7a11-0x000000005408af10 134221824 alloc_large_system_hash+0x1df/0x2f0 pages=32768 vmalloc vpages N0=16384 N1=16384
0x000000005408af10-0x0000000054fb99eb 4198400 alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
0x0000000054fb99eb-0x00000000a130e604 4198400 alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
0x00000000a130e604-0x00000000e6e62c85 4198400 alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
0x00000000e6e62c85-0x000000005ca0ef7c 2101248 alloc_large_system_hash+0x1df/0x2f0 pages=512 vmalloc N0=256 N1=256
0x000000005ca0ef7c-0x000000003bfe757f 1052672 alloc_large_system_hash+0x1df/0x2f0 pages=256 vmalloc N0=128 N1=128
0x000000003bfe757f-0x00000000bf49fcbd 4198400 alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
0x00000000bf49fcbd-0x00000000902de200 1052672 alloc_large_system_hash+0x1df/0x2f0 pages=256 vmalloc N0=128 N1=128
0x00000000902de200-0x00000000c3d2821a 2101248 alloc_large_system_hash+0x1df/0x2f0 pages=512 vmalloc N0=256 N1=256
0x00000000c3d2821a-0x000000002ddc68f6 2101248 alloc_large_system_hash+0x1df/0x2f0 pages=512 vmalloc N0=256 N1=256

You can see N0=X and N1=X on every entry, meaning the pages are evenly
spread across the two nodes.
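
If you would rather check this programmatically than by eye, here is a
rough sketch that parses the per-node counts. It assumes each mapping
sits on a single line, as the kernel emits it; the helper names
(per_node_pages, is_evenly_spread) and the three-line SAMPLE are made up
for illustration and are not part of any kernel tooling.

```python
import re

# Three entries copied from the output above, one mapping per line.
SAMPLE = """\
0x000000005536618c-0x00000000a4ae0198 12288 alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
0x000000008d50035e-0x00000000f434e297 266240 alloc_large_system_hash+0x1df/0x2f0 pages=64 vmalloc N0=32 N1=32
0x00000000fe631da3-0x00000000b60e95b8 268439552 alloc_large_system_hash+0x1df/0x2f0 pages=65536 vmalloc vpages N0=32768 N1=32768
"""

PAGES_RE = re.compile(r"\bpages=(\d+)\b")   # total pages in the mapping
NODE_RE = re.compile(r"\bN(\d+)=(\d+)\b")   # per-node counts: N<node>=<pages>


def per_node_pages(text, caller="alloc_large_system_hash"):
    """Return [(total_pages, {node_id: pages}), ...] for lines mentioning caller."""
    rows = []
    for line in text.splitlines():
        if caller not in line:
            continue
        pages = PAGES_RE.search(line)
        if pages is None:
            continue  # entry without a pages= field, nothing to compare
        nodes = {int(node): int(count) for node, count in NODE_RE.findall(line)}
        rows.append((int(pages.group(1)), nodes))
    return rows


def is_evenly_spread(nodes):
    """True when more than one node is used and all nodes hold the same count."""
    return len(nodes) > 1 and len(set(nodes.values())) == 1


if __name__ == "__main__":
    for total, nodes in per_node_pages(SAMPLE):
        print(total, nodes, "even" if is_evenly_spread(nodes) else "uneven")
```

To run it against a live system, pass the contents of /proc/vmallocinfo
instead of SAMPLE; reading that file normally requires root.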