On 2025-04-02 05:21, Mike Rapoport wrote:
On Tue, Apr 01, 2025 at 06:58:12PM -0400, Steven Rostedt wrote:
From: Steven Rostedt <rostedt@xxxxxxxxxxx>
Enforce that the address and the size of the memory used by the persistent
ring buffer is page aligned. Also update the documentation to reflect this
requirement.
I've been loosely following this thread, and I'm confused about one
thing.
AFAIU the goal is to have the ftrace persistent ring buffer written to
through a memory range mapped by vmap_page_range(), and userspace maps
the buffer with its own virtual mappings.
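For context, here is a rough sketch of how that kernel-side mapping could
be wired up, based on the map_pages()/vmap_page_range() names that appear
later in this patch; the exact flags and error handling are my guess, not
necessarily what the series does:

	/* Sketch: give a persistent physical range a kernel virtual mapping. */
	static void *map_pages(phys_addr_t start, unsigned long size)
	{
		struct vm_struct *area;
		unsigned long vaddr;

		/* Reserve a chunk of vmalloc address space to map into. */
		area = get_vm_area(size, VM_IOREMAP);
		if (!area)
			return NULL;

		vaddr = (unsigned long)area->addr;

		/* Populate the reserved range with the physical pages. */
		if (vmap_page_range(vaddr, vaddr + size, start,
				    pgprot_nx(PAGE_KERNEL)) < 0) {
			free_vm_area(area);
			return NULL;
		}

		return (void *)vaddr;
	}

Userspace then gets its own, independent mapping of the same physical
pages via mmap(), which is where the aliasing question comes in.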
With respect to architectures with an aliasing D-cache, is the plan:
A) To make sure all persistent ring buffer mappings are aligned on
SHMLBA:
Quoting "Documentation/core-api/cachetlb.rst":
Is your port susceptible to virtual aliasing in its D-cache?
Well, if your D-cache is virtually indexed, is larger in size than
PAGE_SIZE, and does not prevent multiple cache lines for the same
physical address from existing at once, you have this problem.
If your D-cache has this problem, first define asm/shmparam.h SHMLBA
properly, it should essentially be the size of your virtually
addressed D-cache (or if the size is variable, the largest possible
size). This setting will force the SYSv IPC layer to only allow user
processes to mmap shared memory at address which are a multiple of
this value.
or
B) To flush both the kernel and userspace mappings when a ring buffer
page is handed over from writer to reader?
I've seen both approaches being discussed in the recent threads, with
some participants recommending approach (A), but then the code
revisions that follow take approach (B).
AFAIU, if we are aiming for approach (A), then I'm missing where
vmap_page_range() guarantees that the _kernel_ virtual mapping is
SHMLBA aligned. AFAIU, only user mappings are aligned on SHMLBA.
And if we are aiming for approach (A), then the explicit flushing
is not needed when handing over pages from writer to reader.
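To make options (A) and (B) above concrete, a minimal sketch (the helper
names and the 'page' parameter are made up for illustration):

	#include <linux/align.h>	/* IS_ALIGNED() */
	#include <linux/highmem.h>	/* flush_dcache_page() */
	#include <asm/shmparam.h>	/* SHMLBA */

	/* (A) Colour the mappings: if the kernel alias and every user alias
	 * are SHMLBA aligned, they share a cache colour and no flushing is
	 * needed on aliasing D-caches. */
	static int check_alias_colour(unsigned long kernel_vaddr,
				      unsigned long user_vaddr)
	{
		if (!IS_ALIGNED(kernel_vaddr, SHMLBA) ||
		    !IS_ALIGNED(user_vaddr, SHMLBA))
			return -EINVAL;
		return 0;
	}

	/* (B) Allow arbitrary alignment, but flush the kernel alias each
	 * time a sub-buffer page is handed from the writer to the reader,
	 * so the user alias observes coherent data. */
	static void hand_over_page(struct page *page)
	{
		flush_dcache_page(page);
	}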
Please let me know if I'm missing something,
Thanks,
Mathieu
Link: https://lore.kernel.org/all/CAHk-=whUOfVucfJRt7E0AH+GV41ELmS4wJqxHDnui6Giddfkzw@xxxxxxxxxxxxxx/
Suggested-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>
---
Documentation/admin-guide/kernel-parameters.txt | 2 ++
Documentation/trace/debugging.rst | 2 ++
kernel/trace/trace.c | 12 ++++++++++++
3 files changed, 16 insertions(+)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 3435a062a208..f904fd8481bd 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -7266,6 +7266,8 @@
This is just one of many ways that can clear memory. Make sure your system
keeps the content of memory across reboots before relying on this option.
+ NB: Both the mapped address and size must be page aligned for the architecture.
+
See also Documentation/trace/debugging.rst
diff --git a/Documentation/trace/debugging.rst b/Documentation/trace/debugging.rst
index 54fb16239d70..d54bc500af80 100644
--- a/Documentation/trace/debugging.rst
+++ b/Documentation/trace/debugging.rst
@@ -136,6 +136,8 @@ kernel, so only the same kernel is guaranteed to work if the mapping is
preserved. Switching to a different kernel version may find a different
layout and mark the buffer as invalid.
+NB: Both the mapped address and size must be page aligned for the architecture.
+
Using trace_printk() in the boot instance
-----------------------------------------
By default, the content of trace_printk() goes into the top level tracing
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index de6d7f0e6206..de9c237e5826 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -10788,6 +10788,18 @@ __init static void enable_instances(void)
}
if (start) {
+ /* Start and size must be page aligned */
+ if (start & ~PAGE_MASK) {
+ pr_warn("Tracing: mapping start addr %lx is not page aligned\n",
+ (unsigned long)start);
+ continue;
+ }
+ if (size & ~PAGE_MASK) {
+ pr_warn("Tracing: mapping size %lx is not page aligned\n",
+ (unsigned long)size);
+ continue;
+ }
Better to use %pa for printing a physical address, as on 32-bit systems
phys_addr_t may be unsigned long long:
pr_warn("Tracing: mapping size %pa is not page aligned\n", &size);
+
addr = map_pages(start, size);
if (addr) {
pr_info("Tracing: mapped boot instance %s at physical memory %pa of size 0x%lx\n",
--
2.47.2