Re: [PATCH] tracing: eprobe: read the complete FILTER_PTR_STRING pointer
From: Martin Kaiser
Date: Wed Jun 17 2026 - 04:37:38 EST
Hiramatsu-san,
thank you for reviewing my patch.
Thus wrote Masami Hiramatsu (mhiramat@xxxxxxxxxx):
> Ah, this is a bit complicated. It seems to work with sched_switch event
> as commit f04dec93466a ("tracing/eprobes: Fix reading of string fields"):
> echo 'e:sw sched/sched_switch comm=$next_comm:string' > dynamic_events
> # TASK-PID CPU# ||||| TIMESTAMP FUNCTION
> # | | | ||||| | |
> sh-162 [002] d..3. 54.027213: sw: (sched.sched_switch) comm="swapper/2"
> <idle>-0 [007] d..3. 54.034573: sw: (sched.sched_switch) comm="rcu_preempt"
> rcu_preempt-15 [007] d..3. 54.034589: sw: (sched.sched_switch) comm="swapper/7"
> Maybe comm is stored as a fixed string information in the event record?
Yes, this example does not execute my change.
> /sys/kernel/tracing # cat events/sched/sched_switch/format
> name: sched_switch
> ID: 254
> format:
> field:unsigned short common_type; offset:0; size:2; signed:0;
> field:unsigned char common_flags; offset:2; size:1; signed:0;
> field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
> field:int common_pid; offset:4; size:4; signed:1;
> field:char prev_comm[16]; offset:8; size:16; signed:0;
> field:pid_t prev_pid; offset:24; size:4; signed:1;
> field:int prev_prio; offset:28; size:4; signed:1;
> field:long prev_state; offset:32; size:8; signed:1;
> field:char next_comm[16]; offset:40; size:16; signed:0;
> field:pid_t next_pid; offset:56; size:4; signed:1;
> field:int next_prio; offset:60; size:4; signed:1;
> But the filename is a pointer.
> /sys/kernel/tracing # cat events/syscalls/sys_enter_openat/format
> name: sys_enter_openat
> ID: 705
> format:
> field:unsigned short common_type; offset:0; size:2; signed:0;
> field:unsigned char common_flags; offset:2; size:1; signed:0;
> field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
> field:int common_pid; offset:4; size:4; signed:1;
> field:int __syscall_nr; offset:8; size:4; signed:1;
> field:int dfd; offset:16; size:8; signed:0;
> field:const char * filename; offset:24; size:8; signed:0;
> field:int flags; offset:32; size:8; signed:0;
> field:umode_t mode; offset:40; size:8; signed:0;
> field:__data_loc char[] __filename_val; offset:48; size:4; signed:0;
> In this case, the filename field should use __data_loc directly instead of
> pointing data on the ring buffer.
> Can you try
> echo 'e syscalls.sys_enter_openat $__filename_val:string' > \
> /sys/kernel/tracing/dynamic_events
> Instead?
This field is working as expected.
I still believe that the handling of FILTER_PTR_STRING is not correct. The
pointer is stored in the ringbuffer as unsigned long and read as a char. This
gives us a truncated pointer that cannot be dereferenced.
> I think better solution is fixing sycall tracer.
I would say that syscall trace is doing the right thing. The ringbuffer entry
is a struct syscall_trace_enter, the syscall arguments are unsigned longs.
They are written in ftrace_syscall_enter, this looks correct to me.
A const char * syscall argument is using FILTER_PTR_STRING, the unsigned long
argument from the ringbuffer is read as a char and then converted to a
truncated pointer.
Thanks,
Martin