Re: [PATCH] [PATCH V5] mqueue: introduce new do_mq_timedreceive2() [ mq_peek syscall] for non-destructive receive and inspection,fix minor issue,prepared doc.
From: Geert Uytterhoeven
Date: Fri Mar 06 2026 - 04:41:57 EST
Hi Mathura,
CC Arnd
Thanks for your patch!
On Fri, 6 Mar 2026 at 08:50, Mathura_Kumar <academic1mathura@xxxxxxxxx> wrote:
> Changes from v3 to v5:
>
> - Fixed style issues
> - Prepared/modified documentation in Documentation/userspace-api/ipc.rst
> - Corrected file references
> - Squashed the previous four commits into this one for convenience in
>   applying and merging
> - Fixed a RISC-V related macro handling error
> - Removed some struct definitions from ipc/mqueue.c for better compiler control flow
>
> Required Attention:-As per mail from bot
What does this mean?
>
> kernel test robot noticed the following build errors:
> [auto build test ERROR on shuah-kselftest/next]
> [also build test ERROR on shuah-kselftest/fixes next-20260304]
> [cannot apply to arnd-asm-generic/master linus/master tip/x86/asm v6.16-rc1]
>
> Short Description:
> POSIX message queues currently lack a mechanism to read
> a message without removing it from the queue. This is a
> long-standing limitation when queue state must be inspected
> without altering it.
>
> Modifying existing mq_receive() semantics via additional
> flags was considered. However, altering behavior of an
> existing syscall risks breaking backward compatibility
> for applications relying on current semantics. Since
> mq_receive() guarantees message removal, changing this
> contract is not safe.
>
> To preserve ABI stability, this patch introduces a new
> system call that performs a non-destructive receive
> operation (peek). The existing behavior remains unchanged.
>
> Design considerations:
>
> Two approaches for copying message data to userspace
> were evaluated:
>
> 1) Refcount-based message lifecycle handling
> - Avoids an intermediate temporary kernel copy
> - Extends message lifetime
> - But may increase writer starvation under heavy load, complicate
>   priority management, and delay freeing space in the inode, because
>   an outstanding reference may prevent the message from being freed
>
> 2) Temporary kernel buffer copy
> - Copies message into a bounded kernel buffer
> - Reduces time message remains locked
> - Improves fairness under write-heavy workloads
> - Simpler lifetime management
>
> My implementation adopts the temporary buffer approach
> to minimize starvation and reduce locking complexity.
> The design allows future transition if refcounting is
> deemed preferable.
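The temporary-buffer idea described above can be modelled in userspace. The following is only a minimal sketch under stated assumptions, not the patch's code: struct demo_queue, demo_peek(), demo_receive() and DEMO_MSG_MAX are hypothetical names, and a pthread mutex stands in for the queue lock.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MSG_MAX 64	/* hypothetical bound on message size */

/* Hypothetical single-slot "queue"; not the kernel's mqueue structures. */
struct demo_queue {
	pthread_mutex_t lock;
	char msg[DEMO_MSG_MAX];
	size_t len;		/* 0 means the queue is empty */
};

/*
 * Peek: copy the message into a bounded temporary buffer while holding
 * the lock, drop the lock, then copy to the caller's buffer.
 * The message stays queued. Returns its length, or -1 if the queue is
 * empty or the caller's buffer is too small.
 */
static long demo_peek(struct demo_queue *q, char *dst, size_t dst_len)
{
	char tmp[DEMO_MSG_MAX];
	size_t len;

	pthread_mutex_lock(&q->lock);
	len = q->len;
	if (len == 0 || len > dst_len) {
		pthread_mutex_unlock(&q->lock);
		return -1;
	}
	memcpy(tmp, q->msg, len);
	pthread_mutex_unlock(&q->lock);

	/* The copy-out to the caller happens without the lock held. */
	memcpy(dst, tmp, len);
	return (long)len;
}

/* Destructive receive, for contrast: same copy, but the slot is emptied. */
static long demo_receive(struct demo_queue *q, char *dst, size_t dst_len)
{
	char tmp[DEMO_MSG_MAX];
	size_t len;

	pthread_mutex_lock(&q->lock);
	len = q->len;
	if (len == 0 || len > dst_len) {
		pthread_mutex_unlock(&q->lock);
		return -1;
	}
	memcpy(tmp, q->msg, len);
	q->len = 0;		/* message is removed from the queue */
	pthread_mutex_unlock(&q->lock);

	memcpy(dst, tmp, len);
	return (long)len;
}
```

In this model the lock is held only for a bounded memcpy(), which is the property the temporary-buffer approach relies on. The cost is one extra copy per peek.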
>
> Architecture support: ALL
>
> Testing:
> - 15+ functional test cases
> - Multi-threaded producer/consumer scenarios
> - Concurrent pop and peek
> - Edge cases: empty queue, FIFO,
>   invalid flags, signal interruption, etc.
>
> Use cases:
>
> 1) Observability in distributed systems
> e.g. monitoring tools can inspect queue contents without interfering with
> normal processing
>
> 2) Checkpointing systems can inspect the queue without consuming messages and store them
>
> 3) Resource-aware processing
>
> 4) Selective Consumption for Specialized Workers
>
> Signed-off-by: Mathura_Kumar <academic1mathura@xxxxxxxxx>
> --- a/Documentation/userspace-api/index.rst
> +++ b/Documentation/userspace-api/index.rst
> --- /dev/null
> +++ b/Documentation/userspace-api/ipc.rst
> @@ -0,0 +1,273 @@
> +1) Overview
> +-----------
> +
> +POSIX message queues on Linux provide mq_receive() and mq_timedreceive()
> +for consuming messages from a queue. Both interfaces require the caller
> +to pass the message buffer, length, and priority pointer as individual
> +arguments to the system call. This imposes a fixed calling convention
> +that cannot be extended without breaking the ABI.
> +
> +mq_timedreceive2() introduces a new system call entry point that accepts
> +message buffer parameters via a struct argument rather than as individual
> +syscall arguments. This frees the remaining syscall argument slots for
> +new functionality flags and a message index, enabling non-destructive
> +peek and indexed access semantics that are not possible with the
> +original interface.
> +
> +Two variants are provided:
> +
> + mq_timedreceive2() - primary variant, 64-bit time (Y2038-safe)
> + mq_timedreceive2_time32() - 32-bit time variant for legacy and compat
What is the rationale behind adding a new syscall that is not Y2038-safe?
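For reference, the limit behind the Y2038 concern can be worked out with plain integer arithmetic. This is a minimal sketch: struct demo_hms, demo_split() and demo_fits_time32() are hypothetical helpers for illustration, not kernel interfaces.

```c
#include <stdint.h>

/* Days plus time of day, counted from 1970-01-01 00:00:00 UTC. */
struct demo_hms {
	int64_t days;
	int hours;
	int minutes;
	int seconds;
};

/* Split a non-negative count of epoch seconds into days and time of day. */
static struct demo_hms demo_split(int64_t secs)
{
	struct demo_hms r;
	int64_t rem = secs % 86400;

	r.days = secs / 86400;
	r.hours = (int)(rem / 3600);
	r.minutes = (int)((rem % 3600) / 60);
	r.seconds = (int)(rem % 60);
	return r;
}

/* Returns 1 if a deadline in epoch seconds fits a signed 32-bit tv_sec. */
static int demo_fits_time32(int64_t deadline_secs)
{
	return deadline_secs >= INT32_MIN && deadline_secs <= INT32_MAX;
}
```

demo_split(INT32_MAX) yields 24855 days plus 03:14:07, i.e. 2038-01-19 03:14:07 UTC. An absolute timeout later than that cannot be expressed through a 32-bit time interface.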
> --- a/arch/arm/tools/syscall.tbl
> +++ b/arch/arm/tools/syscall.tbl
> @@ -486,3 +486,5 @@
> 469 common file_setattr sys_file_setattr
> 470 common listns sys_listns
> 471 common rseq_slice_yield sys_rseq_slice_yield
> +472 common mq_timedreceive2 sys_mq_timedreceive2_time32
> +473 common mq_timedreceive2_time64 sys_mq_timedreceive2
> diff --git a/arch/arm64/tools/syscall_32.tbl b/arch/arm64/tools/syscall_32.tbl
> index 62d93d88e0fe..02603704231e 100644
> --- a/arch/arm64/tools/syscall_32.tbl
> +++ b/arch/arm64/tools/syscall_32.tbl
> @@ -483,3 +483,5 @@
> 469 common file_setattr sys_file_setattr
> 470 common listns sys_listns
> 471 common rseq_slice_yield sys_rseq_slice_yield
> +472 common mq_timedreceive2_time64 sys_mq_timedreceive2
> +473 common mq_timedreceive2 sys_mq_timedreceive2_time32
> diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
> index 248934257101..b67f23edda11 100644
> --- a/arch/m68k/kernel/syscalls/syscall.tbl
> +++ b/arch/m68k/kernel/syscalls/syscall.tbl
> @@ -471,3 +471,6 @@
> 469 common file_setattr sys_file_setattr
> 470 common listns sys_listns
> 471 common rseq_slice_yield sys_rseq_slice_yield
> +472 common mq_timedreceive2 sys_mq_timedreceive2_time32
> +473 common mq_timedreceive2_time64 sys_mq_timedreceive2
> +
> diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
> index 223d26303627..89fb60006a99 100644
> --- a/arch/microblaze/kernel/syscalls/syscall.tbl
> +++ b/arch/microblaze/kernel/syscalls/syscall.tbl
> @@ -477,3 +477,6 @@
> 469 common file_setattr sys_file_setattr
> 470 common listns sys_listns
> 471 common rseq_slice_yield sys_rseq_slice_yield
> +472 common mq_timedreceive2 sys_mq_timedreceive2_time32
> +473 common mq_timedreceive2_time64 sys_mq_timedreceive2
> +
> diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
> index 7430714e2b8f..fee830be67a6 100644
> --- a/arch/mips/kernel/syscalls/syscall_n32.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
> @@ -410,3 +410,5 @@
> 469 n32 file_setattr sys_file_setattr
> 470 n32 listns sys_listns
> 471 n32 rseq_slice_yield sys_rseq_slice_yield
> +472 n32 mq_timedreceive2 sys_mq_timedreceive2_time32
> +473 n32 mq_timedreceive2_time64 sys_mq_timedreceive2
So these receive two variants...
> diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
> index 630aab9e5425..75de6ee2df94 100644
> --- a/arch/mips/kernel/syscalls/syscall_n64.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
> @@ -386,3 +386,4 @@
> 469 n64 file_setattr sys_file_setattr
> 470 n64 listns sys_listns
> 471 n64 rseq_slice_yield sys_rseq_slice_yield
> +472 n64 mq_timedreceive2 sys_mq_timedreceive2
MIPS64 receives only one?
> diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
> index 128653112284..8694a2d2f084 100644
> --- a/arch/mips/kernel/syscalls/syscall_o32.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
> @@ -459,3 +459,5 @@
> 469 o32 file_setattr sys_file_setattr
> 470 o32 listns sys_listns
> 471 o32 rseq_slice_yield sys_rseq_slice_yield
> +472 o32 mq_timedreceive2 sys_mq_timedreceive2_time32
> +473 o32 mq_timedreceive2_time64 sys_mq_timedreceive2
> diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
> index 4fcc7c58a105..fd90a2f500b6 100644
> --- a/arch/powerpc/kernel/syscalls/syscall.tbl
> +++ b/arch/powerpc/kernel/syscalls/syscall.tbl
> @@ -562,3 +562,6 @@
> 469 common file_setattr sys_file_setattr
> 470 common listns sys_listns
> 471 nospu rseq_slice_yield sys_rseq_slice_yield
> +472 32 mq_timedreceive2 sys_mq_timedreceive2_time32
> +473 64 mq_timedreceive2 sys_mq_timedreceive2
> +474 32 mq_timedreceive2_time64 sys_mq_timedreceive2 sys_mq_timedreceive2
PowerPC receives three variants?
Oh, the first two are for 32-bit vs. 64-bit, so they should use the
same syscall number.
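I.e. something along these lines (untested sketch, only to illustrate the shared number; whether the time32 entry should exist at all is a separate question):

```
472	32	mq_timedreceive2		sys_mq_timedreceive2_time32
472	64	mq_timedreceive2		sys_mq_timedreceive2
```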
More of this below...
Furthermore, this breaks the "new syscalls use the same number on
most architectures"-rule. Next free slot is now 472, 473, or 474,
depending on architecture.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds