Re: [GIT PULL] ring-buffer: Updates for v6.15
From: Steven Rostedt
Date: Thu Mar 27 2025 - 23:01:17 EST
On Thu, 27 Mar 2025 19:19:33 -0700
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Thu, 27 Mar 2025 at 19:17, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > But NO, that is *not* what is used by the code.
> >
> > You literally use "struct page *" whenever you want to mmap it.
> >
> > If the *only* thing that was used was the virtual address, this
> > wouldn't be a discussion.
>
> Just to clarify: if you don't mmap this into user space, then that's
> fine. Then you can keep just the kernel virtual address.
>
> And I already told you that that is one option: just don't mmap.
When I found that the physical memory mapped code when mapped to user
space was causing crashes as the virt_to_page() code didn't work with
vmap(), that was exactly what I did, and mentioned that it would be an
easy fix, but I guess not in a way you liked.
https://lore.kernel.org/all/20250215144719.404616dc@xxxxxxxxxxxxxxxxx/
- Prevent mmap()ing persistent ring buffer
The persistent ring buffer uses vmap() to map the persistent memory.
Currently, the mmap() logic only uses virt_to_page() to get the page
from the ring buffer memory and use that to map to user space. This works
because a normal ring buffer uses alloc_page() to allocate its memory.
But because the persistent ring buffer use vmap() it causes a kernel
crash. Fixing this to work with vmap() is not hard, but since mmap() on
persistent memory buffers never worked, just have the mmap() return
-ENODEV (what was returned before mmap() for persistent memory ring
buffers, as they never supported mmap. Normal buffers will still allow
mmap(). Implementing mmap() for persistent memory ring buffers can wait
till the next merge window.
>
> But as long as you insist on mmaping the buffer into user space, you
> follow the rules. And the rules are that you don't play games. You do
> this RIGHT, or you don't do it at all.
Are you OK with the idea of moving the mapping code into the ring
buffer code (which I think is cleaner) and then keeping track of that
to get to the pages with the helper function?
struct page *rb_get_page(struct trace_buffer *buffer, unsigned long addr)
{
if (buffer->flags & RB_FL_PHYSICAL) {
addr -= buffer->vmap_start;
addr += buffer->phys_start;
return pfn_to_page(addr >> PAGE_SHIFT);
}
return virt_to_page(addr);
}
It doesn't keep track of the pages themselves, as then I would need to
keep track of both the virtual and physical addresses. But since the
pages are either allocated via page_alloc() and converted by
page_address(), or is mapped on a physically contiguous memory area,
where the physical address can be easily calculated.
Or is this too hacky for you?
>
> And you don't lie about things and claim that the only thing that is
> used is the kernel virtual address.
I didn't say the "only thing" I said most of the code. The memory
mapping to user space is very new, and used for tooling that is doing
streaming, where it consumes the data without saving it. It is only
using it for calculations, where memory mapping is faster than reading
as is doesn't have the overhead of a copy. But as this is new, it feels
like an exception and not the rule. Hence, why I said most of the code
doesn't care about struct page. Maybe you see this as lying, but to me,
it's simply my point of view.
-- Steve