Re: [PATCH v2 1/2] tracing: ring-buffer: Have the ring buffer code do the vmap of physical memory
From: Jann Horn
Date: Mon Mar 31 2025 - 20:10:00 EST
On Tue, Apr 1, 2025 at 1:41 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> On Mon, 31 Mar 2025 14:42:38 -0700
> Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> > .. and *after* you've given it back to the memory allocator, and it
> > gets allocated using the page allocators, at that point go ahead and use
> > 'struct page' as much as you want.
> >
> > Before that, don't. Even if it might work. Because you didn't allocate
> > it as a struct page, and for all you know it might be treated as a
> > different hotplug memory zone or whatever when given back.
>
> Hmm, so if we need to map this memory to user space memory, then I can't
> use the method from this patch series, if I have to avoid struct page.
>
> Should I then be using vm_iomap_memory() passing in the physical address?
For mapping random physical memory ranges into userspace, we have
helpers like remap_pfn_range() (the easy option, for use in an mmap
handler, in case you want to map one contiguous physical
region into userspace) and vmf_insert_pfn() (for use in a page fault
handler, in case you want to map random physical pages into userspace
on demand).
> As for architectures that do not have user/kernel data cache coherency, how
> does one flush the page when there's an update on the kernel side so that
> the user side doesn't see stale data?
flush_kernel_vmap_range() (and invalidate_kernel_vmap_range() for the
other direction) might be what you want... I found those by going
backwards from an arch-specific cache-flushing implementation.
> As the code currently uses flush_dcache_folio(), I'm guessing there's an
> easy way to create a folio that points to physical memory that's not part
> of the memory allocator?
Creating your own folio structs sounds like a bad idea; folio structs
are supposed to be in specific kernel memory regions. For example,
conversions from folio* to physical address can involve pointer
arithmetic on the folio*, or they can involve reading members of the
pointed-to folio.
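
To make the pointer-arithmetic case concrete, here is a userspace toy
model, not kernel code. It assumes a flat array of per-page structs,
loosely modelled on a flat mem_map; every model_* name is made up for
illustration and none of them is a kernel API. The conversion to a
physical address only makes sense for structs that live inside that
array, which is why a struct created elsewhere cannot be converted
meaningfully.

```c
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12   /* assumed 4 KiB pages */
#define MODEL_NR_PAGES   16   /* size of the toy "physical memory" */

/* Hypothetical stand-in for a per-page metadata struct. */
struct model_page {
	unsigned long flags;
};

/* The one region where per-page structs are supposed to live. */
static struct model_page model_mem_map[MODEL_NR_PAGES];

/* Returns 1 if p points exactly at an element of model_mem_map, else 0. */
static int model_page_is_valid(const struct model_page *p)
{
	uintptr_t base = (uintptr_t)model_mem_map;
	uintptr_t addr = (uintptr_t)p;

	if (addr < base || addr >= base + sizeof(model_mem_map))
		return 0;
	return (addr - base) % sizeof(struct model_page) == 0;
}

/*
 * Page frame number derived purely from the struct's position in the
 * array. Integer arithmetic is used so that passing a pointer from
 * outside the array yields a meaningless number rather than undefined
 * pointer subtraction.
 */
static uint64_t model_page_to_pfn(const struct model_page *p)
{
	uintptr_t offset = (uintptr_t)p - (uintptr_t)model_mem_map;

	return (uint64_t)(offset / sizeof(struct model_page));
}

/* Physical address of the page described by p, under the same model. */
static uint64_t model_page_to_phys(const struct model_page *p)
{
	return model_page_to_pfn(p) << MODEL_PAGE_SHIFT;
}
```

In this model, model_page_to_phys(&model_mem_map[3]) is 3 << 12, because
the result depends only on where the struct sits in the array. A struct
declared anywhere else (on the stack, say) fails model_page_is_valid(),
and model_page_to_phys() on it returns a number unrelated to any memory
the caller meant to describe.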