"netfs: Can't donate prior to front"

From: Max Kellermann
Date: Fri Feb 07 2025 - 13:42:41 EST


Hi,

The following crash occurs with Linux 6.13.1 on our servers about every 20 minutes:

netfs: Can't donate prior to front
R=00070d30[3] s=9a000-9bfff 0/2000/2000
folio: 98000-9bfff
donated: prev=0 next=0
s=9a000 av=2000 part=2000
------------[ cut here ]------------
kernel BUG at fs/netfs/read_collect.c:315!
Oops: invalid opcode: 0000 [#1] SMP PTI
CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Not tainted 6.13.1-cm4all2-hp #416
Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 11/23/2021
RIP: 0010:netfs_consume_read_data.isra.0+0xa72/0xab0
Code: 48 89 ea 31 f6 48 c7 c7 bb 7a d0 ae e8 b7 d2 d1 ff 48 8b 4c 24 20 4c 89 e2 48 c7 c7 d7 7a d0 ae 48 8b 74 24 18 e8 9e d2 d1 ff <0f> 0b 4c 89 ef 48 89 54 24 10 4c 89 44 24 08 e8 1a 4e b5 00 48 c7
RSP: 0018:ffffb434cc448db0 EFLAGS: 00010246
RAX: 0000000000000019 RBX: ffff8fa63d9cbec0 RCX: 0000000000000027
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8fbb1f9db840
RBP: 0000000000000000 R08: 00000000ffffbfff R09: 0000000000000001
R10: 0000000000000003 R11: ffff8fd31f6a0000 R12: 0000000000002000
R13: ffff8fa5350aaee8 R14: 0000000000004000 R15: ffff8fa5350aaee8
FS: 0000000000000000(0000) GS:ffff8fbb1f9c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f9c5000ef48 CR3: 0000000bcee2e001 CR4: 00000000001706f0
Call Trace:
<IRQ>
? die+0x32/0x80
? do_trap+0xd8/0x100
? do_error_trap+0x65/0x80
? netfs_consume_read_data.isra.0+0xa72/0xab0
? exc_invalid_op+0x4c/0x60
? netfs_consume_read_data.isra.0+0xa72/0xab0
? asm_exc_invalid_op+0x16/0x20
? netfs_consume_read_data.isra.0+0xa72/0xab0
? __pfx_cachefiles_read_complete+0x10/0x10
netfs_read_subreq_terminated+0x22d/0x370
cachefiles_read_complete+0x48/0xf0
iomap_dio_bio_end_io+0x125/0x160
blk_update_request+0xea/0x3e0
scsi_end_request+0x27/0x190
scsi_io_completion+0x43/0x6c0
blk_complete_reqs+0x40/0x50
handle_softirqs+0xd1/0x280
irq_exit_rcu+0x91/0xb0
common_interrupt+0x79/0xa0
</IRQ>
<TASK>
asm_common_interrupt+0x22/0x40
RIP: 0010:cpuidle_enter_state+0xba/0x3b0
Code: 00 e8 ea 86 1c ff e8 45 f7 ff ff 8b 53 04 49 89 c5 0f 1f 44 00 00 31 ff e8 73 b9 1b ff 45 84 ff 0f 85 f8 01 00 00 fb 45 85 f6 <0f> 88 46 01 00 00 48 8b 04 24 49 63 ce 48 6b d1 68 49 29 c5 48 89
RSP: 0018:ffffb434c018be98 EFLAGS: 00000202
RAX: ffff8fbb1f9c0000 RBX: ffffd41cbe7e3448 RCX: 000000000000001f
RDX: 0000000000000007 RSI: 000000003149acb2 RDI: 0000000000000000
RBP: 0000000000000004 R08: 0000000000000002 R09: 0000000000000000
R10: 0000000000000004 R11: 000000000000001f R12: ffffffffaf660060
R13: 0000030f6179fa73 R14: 0000000000000004 R15: 0000000000000000
? cpuidle_enter_state+0xad/0x3b0
cpuidle_enter+0x29/0x40
do_idle+0x19c/0x200
cpu_startup_entry+0x25/0x30
start_secondary+0xf3/0x100
common_startup_64+0x13e/0x148
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
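
The four diagnostic lines printed before the BUG are all hexadecimal, so the relationship between the subrequest and the folio is easier to see once decoded. The following is only a minimal sketch for reading them, not anything taken from the kernel: `decode_dump` is a hypothetical helper, and it assumes that both `s=` and `folio:` ranges are inclusive byte offsets and that `prev`/`next` are byte counts donated by the neighbouring subrequests. The meaning of the `0/2000/2000` triple is not needed here and is left uninterpreted.

```python
import re

# Diagnostic lines copied verbatim from the report above; all values are hex.
DUMP = """\
R=00070d30[3] s=9a000-9bfff 0/2000/2000
folio: 98000-9bfff
donated: prev=0 next=0
s=9a000 av=2000 part=2000
"""


def decode_dump(text):
    """Hypothetical helper: parse the dump and compare subrequest and folio ranges.

    Assumes inclusive byte ranges and byte-count donations (see lead-in).
    """
    def hexes(pattern):
        match = re.search(pattern, text)
        if match is None:
            raise ValueError(f"pattern not found: {pattern}")
        return [int(group, 16) for group in match.groups()]

    h = r"([0-9a-f]+)"
    # The trailing space selects the "s=START-END " form on the R= line,
    # not the later "s=9a000 av=..." line.
    sub_start, sub_end = hexes(rf"s={h}-{h} ")
    folio_start, folio_end = hexes(rf"folio: {h}-{h}")
    prev_donated, next_donated = hexes(rf"donated: prev={h} next={h}")

    return {
        "folio_size": folio_end - folio_start + 1,
        "subreq_size": sub_end - sub_start + 1,
        # Bytes at the front of the folio not reached by the subrequest
        # even after extending it backwards by the previous donation.
        "uncovered_front": (sub_start - prev_donated) - folio_start,
        # Bytes at the back of the folio not reached by the subrequest
        # even after extending it forwards by the next donation.
        "uncovered_back": folio_end - (sub_end + next_donated),
    }


if __name__ == "__main__":
    for name, value in decode_dump(DUMP).items():
        print(f"{name}: {value:#x} ({value} bytes)")
```

Under those assumptions, the dump describes a 0x4000-byte folio whose second half is covered by a 0x2000-byte subrequest, with 0x2000 bytes at the front of the folio that neither the subrequest nor a donation from the previous subrequest accounts for.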

This trace is from a server with heavy NFS traffic and fscache enabled.

Please help, and let me know if you need more information.

Max