Re: [PATCH] tcp: use kvmalloc_array() to allocate table_perturb
From: Muchun Song
Date: Tue Jun 07 2022 - 00:16:17 EST
On Tue, Jun 7, 2022 at 12:03 PM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
>
> On Mon, Jun 6, 2022 at 8:56 PM Muchun Song <songmuchun@xxxxxxxxxxxxx> wrote:
> >
> > On Tue, Jun 7, 2022 at 12:13 AM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> > >
> > > On Mon, Jun 6, 2022 at 9:05 AM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> > > >
> > > > On Mon, Jun 6, 2022 at 12:08 AM Muchun Song <songmuchun@xxxxxxxxxxxxx> wrote:
> > > > >
> > > > > > On our servers, high-order (>= 6) memory may be unavailable because we
> > > > > > reserve lots of HugeTLB pages at boot. The allocation then fails and
> > > > > > the system panics. So use kvmalloc_array() to allocate table_perturb.
> > > > >
> > > > > Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
> > > >
> > > > Please add a Fixes: tag and CC the original author.
> > > >
> >
> > Will do.
> >
> > > > Thanks.
> > >
> > > Also using alloc_large_system_hash() might be a better option anyway,
> > > spreading pages on multiple nodes on NUMA hosts.
> >
> > Using alloc_large_system_hash() LGTM, but I didn't see where the memory
> > is spread across multiple nodes in alloc_large_system_hash() or
> > vmalloc_huge(). What did I miss here?
>
> This is done by default. You do not have to do anything special. Just
> call alloc_large_system_hash().
>
> For instance, on two socket system:
>
> # grep alloc_large_system_hash /proc/vmallocinfo
> 0x000000005536618c-0x00000000a4ae0198 12288
> alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
> 0x000000003beddc38-0x0000000092b61b54 12288
> alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
> 0x0000000092b61b54-0x000000005c33d7fb 12288
> alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
> 0x000000004c0588af-0x0000000012cf548f 12288
> alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
> 0x000000008d50035e-0x00000000f434e297 266240
> alloc_large_system_hash+0x1df/0x2f0 pages=64 vmalloc N0=32 N1=32
> 0x00000000fe631da3-0x00000000b60e95b8 268439552
> alloc_large_system_hash+0x1df/0x2f0 pages=65536 vmalloc vpages N0=32768 N1=32768
> 0x00000000b60e95b8-0x0000000062eb7a11 528384
> alloc_large_system_hash+0x1df/0x2f0 pages=128 vmalloc N0=64 N1=64
> 0x0000000062eb7a11-0x000000005408af10 134221824
> alloc_large_system_hash+0x1df/0x2f0 pages=32768 vmalloc vpages N0=16384 N1=16384
> 0x000000005408af10-0x0000000054fb99eb 4198400
> alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
> 0x0000000054fb99eb-0x00000000a130e604 4198400
> alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
> 0x00000000a130e604-0x00000000e6e62c85 4198400
> alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
> 0x00000000e6e62c85-0x000000005ca0ef7c 2101248
> alloc_large_system_hash+0x1df/0x2f0 pages=512 vmalloc N0=256 N1=256
> 0x000000005ca0ef7c-0x000000003bfe757f 1052672
> alloc_large_system_hash+0x1df/0x2f0 pages=256 vmalloc N0=128 N1=128
> 0x000000003bfe757f-0x00000000bf49fcbd 4198400
> alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
> 0x00000000bf49fcbd-0x00000000902de200 1052672
> alloc_large_system_hash+0x1df/0x2f0 pages=256 vmalloc N0=128 N1=128
> 0x00000000902de200-0x00000000c3d2821a 2101248
> alloc_large_system_hash+0x1df/0x2f0 pages=512 vmalloc N0=256 N1=256
> 0x00000000c3d2821a-0x000000002ddc68f6 2101248
> alloc_large_system_hash+0x1df/0x2f0 pages=512 vmalloc N0=256 N1=256
>
> You can see N0=X and N1=X, meaning pages are evenly spread across the two nodes.
Thanks a lot. Really helpful information.
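
For anyone who wants to check the per-node spread on their own machine,
here is a minimal userspace sketch (not kernel code, and not part of the
patch) that sums the N<node>=<pages> fields from /proc/vmallocinfo lines
like the ones quoted above. The helper name add_node_pages() and the
MAX_NODES cap are made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NODES 64 /* illustrative cap for this sketch, not a kernel constant */

/*
 * Scan one line of /proc/vmallocinfo output for "N<node>=<pages>" tokens
 * and add each page count to per_node[node]. Returns the number of tokens
 * that were found and accumulated on this line.
 */
static int add_node_pages(const char *line, unsigned long per_node[MAX_NODES])
{
	const char *p = line;
	int found = 0;

	assert(line != NULL);

	while ((p = strchr(p, 'N')) != NULL) {
		char *end;
		unsigned long node, pages;

		/* A token must start the line or follow whitespace, then a digit. */
		if ((p != line && !isspace((unsigned char)p[-1])) ||
		    !isdigit((unsigned char)p[1])) {
			p++;
			continue;
		}

		node = strtoul(p + 1, &end, 10);
		if (*end != '=' || !isdigit((unsigned char)end[1])) {
			p++;
			continue;
		}

		pages = strtoul(end + 1, &end, 10);
		if (node < MAX_NODES) {
			per_node[node] += pages;
			found++;
		}
		p = end;
	}

	return found;
}
```

Feed it every line read from /proc/vmallocinfo that mentions
alloc_large_system_hash (plus any wrapped continuation lines) and compare
the totals in per_node[]. Roughly equal totals match the even N0/N1 split
in Eric's output.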