Re: [PATCH v16 net-next 01/23] net/tcp: Prepare tcp_md5sig_pool for TCP-AO

From: Eric Dumazet
Date: Thu Dec 21 2023 - 09:32:17 EST


On Mon, Oct 23, 2023 at 9:22 PM Dmitry Safonov <dima@xxxxxxxxxx> wrote:
>
> TCP-AO, similarly to TCP-MD5, needs to allocate tfms on a slow-path,
> which is setsockopt() and use crypto ahash requests on fast paths,
> which are RX/TX softirqs. Also, it needs a temporary/scratch buffer
> for preparing the hash.
>
> Rework tcp_md5sig_pool in order to support other hashing algorithms
> than MD5. It will make it possible to share pre-allocated crypto_ahash
> descriptors and scratch area between all TCP hash users.
>
> Internally tcp_sigpool calls crypto_clone_ahash() API over pre-allocated
> crypto ahash tfm. Kudos to Herbert, who provided this new crypto API.
>
> I was a little concerned over GFP_ATOMIC allocations of ahash and
> crypto_request in RX/TX (see tcp_sigpool_start()), so I benchmarked both
> "backends" with different algorithms, using patched version of iperf3[2].
> On my laptop with i7-7600U @ 2.80GHz:
>
>                           clone-tfm                per-CPU-requests
> TCP-MD5                   2.25 Gbits/sec           2.30 Gbits/sec
> TCP-AO(hmac(sha1))        2.53 Gbits/sec           2.54 Gbits/sec
> TCP-AO(hmac(sha512))      1.67 Gbits/sec           1.64 Gbits/sec
> TCP-AO(hmac(sha384))      1.77 Gbits/sec           1.80 Gbits/sec
> TCP-AO(hmac(sha224))      1.29 Gbits/sec           1.30 Gbits/sec
> TCP-AO(hmac(sha3-512))    481 Mbits/sec            480 Mbits/sec
> TCP-AO(hmac(md5))         2.07 Gbits/sec           2.12 Gbits/sec
> TCP-AO(hmac(rmd160))      1.01 Gbits/sec           995 Mbits/sec
> TCP-AO(cmac(aes128))      [not supported yet]      2.11 Gbits/sec
>
> So, it seems that my concerns don't have strong grounds and per-CPU
> crypto_request allocation can be dropped/removed from tcp_sigpool once
> ciphers get crypto_clone_ahash() support.
>
> [1]: https://lore.kernel.org/all/ZDefxOq6Ax0JeTRH@xxxxxxxxxxxxxxxxxxx/T/#u
> [2]: https://github.com/0x7f454c46/iperf/tree/tcp-md5-ao
> Signed-off-by: Dmitry Safonov <dima@xxxxxxxxxx>
> Reviewed-by: Steen Hegelund <Steen.Hegelund@xxxxxxxxxxxxx>
> Acked-by: David Ahern <dsahern@xxxxxxxxxx>
>

...

> +int tcp_sigpool_alloc_ahash(const char *alg, size_t scratch_size)
> +{
> +	int i, ret;
> +
> +	/* slow-path */
> +	mutex_lock(&cpool_mutex);
> +	ret = sigpool_reserve_scratch(scratch_size);
> +	if (ret)
> +		goto out;
> +	for (i = 0; i < cpool_populated; i++) {
> +		if (!cpool[i].alg)
> +			continue;
> +		if (strcmp(cpool[i].alg, alg))
> +			continue;
> +
> +		if (kref_read(&cpool[i].kref) > 0)
> +			kref_get(&cpool[i].kref);

This sequence is racy.

You must use kref_get_unless_zero().

> +		else
> +			kref_init(&cpool[i].kref);
> +		ret = i;
> +		goto out;
> +	}
> +
> +
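
To illustrate the race flagged above: kref_read() followed by kref_get() is
a check and an increment done as two separate steps, so the count can drop
to zero in between and kref_get() then increments from 0, which is what the
refcount warning complains about. Below is a minimal userspace sketch of the
two patterns using C11 atomics. It is only a model, not kernel code:
struct sketch_ref, sketch_get_racy() and sketch_get_unless_zero() are
hypothetical names standing in for struct kref, the kref_read()/kref_get()
pair and kref_get_unless_zero().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct kref. */
struct sketch_ref {
	atomic_uint count;
};

/*
 * Racy pattern: the check and the increment are two separate steps,
 * so another thread may drop the count to zero between them.
 */
static bool sketch_get_racy(struct sketch_ref *r)
{
	if (atomic_load(&r->count) > 0) {
		atomic_fetch_add(&r->count, 1);	/* may increment from 0 */
		return true;
	}
	return false;
}

/*
 * Atomic pattern: take a reference only if the value we observed was
 * non-zero, retrying if another thread changed it in the meantime.
 */
static bool sketch_get_unless_zero(struct sketch_ref *r)
{
	unsigned int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}
```

In the quoted hunk, one possible shape would be to call
kref_get_unless_zero(&cpool[i].kref) and fall back to kref_init() only when
it returns false, instead of testing kref_read() first.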

syzbot reported:

refcount_t: addition on 0; use-after-free.
WARNING: CPU: 2 PID: 31702 at lib/refcount.c:25 refcount_warn_saturate+0x1ca/0x210 lib/refcount.c:25
Modules linked in:
CPU: 2 PID: 31702 Comm: syz-executor.3 Not tainted 6.7.0-rc6-syzkaller-00044-g1a44b0073b92 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:refcount_warn_saturate+0x1ca/0x210 lib/refcount.c:25
Code: ff 89 de e8 58 a3 25 fd 84 db 0f 85 e6 fe ff ff e8 1b a8 25 fd c6 05 9a 88 a1 0a 01 90 48 c7 c7 00 9d 2e 8b e8 b7 ec eb fc 90 <0f> 0b 90 90 e9 c3 fe ff ff e8 f8 a7 25 fd c6 05 75 88 a1 0a 01 90
RSP: 0018:ffffc900296df850 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc9002c40a000
RDX: 0000000000040000 RSI: ffffffff814db526 RDI: 0000000000000001
RBP: ffffffff92b5b7b0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000002 R12: 0000000000000010
R13: ffffffff92b5b7b0 R14: 0000000000000001 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88802c800000(0063) knlGS:00000000f7efdb40
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 00000000f7354000 CR3: 0000000050ee3000 CR4: 0000000000350ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__refcount_add include/linux/refcount.h:199 [inline]
__refcount_inc include/linux/refcount.h:250 [inline]
refcount_inc include/linux/refcount.h:267 [inline]
kref_get include/linux/kref.h:45 [inline]
tcp_sigpool_alloc_ahash+0x9cb/0xce0 net/ipv4/tcp_sigpool.c:166
tcp_md5_alloc_sigpool+0x1b/0x40 net/ipv4/tcp.c:4379
tcp_md5_do_add+0x192/0x460 net/ipv4/tcp_ipv4.c:1403
tcp_v6_parse_md5_keys+0x68d/0x860 net/ipv6/tcp_ipv6.c:676
do_tcp_setsockopt+0x1302/0x2880 net/ipv4/tcp.c:3644
tcp_setsockopt+0xd4/0x100 net/ipv4/tcp.c:3726
do_sock_setsockopt+0x222/0x470 net/socket.c:2311
__sys_setsockopt+0x1a6/0x270 net/socket.c:2334
__do_sys_setsockopt net/socket.c:2343 [inline]
__se_sys_setsockopt net/socket.c:2340 [inline]
__ia32_sys_setsockopt+0xbc/0x150 net/socket.c:2340
do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
__do_fast_syscall_32+0x62/0xe0 arch/x86/entry/common.c:321
do_fast_syscall_32+0x33/0x70 arch/x86/entry/common.c:346
entry_SYSENTER_compat_after_hwframe+0x70/0x7a
RIP: 0023:0xf7f02579
Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
RSP: 002b:00000000f7efd5ac EFLAGS: 00000292 ORIG_RAX: 000000000000016e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000000006
RDX: 000000000000000e RSI: 0000000020000000 RDI: 00000000000000d8
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000292 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000