Re: [PATCH] tracing/timerlat: Check tlat_var for NULL in timerlat_fd_release
From: Steven Rostedt
Date: Wed Aug 21 2024 - 16:02:55 EST

On Tue, 20 Aug 2024 15:00:01 +0200
tglozar@xxxxxxxxxx wrote:
> From: Tomas Glozar <tglozar@xxxxxxxxxx>
>
> When running timerlat with a userspace workload (NO_OSNOISE_WORKLOAD),
> a NULL pointer dereference can be triggered by sending consecutive SIGINT
> and SIGTERM signals to the workload process. That then causes
> timerlat_fd_release to be called twice in a row, and the second time,
> hrtimer_cancel is called on a zeroed hrtimer struct, causing the NULL
> dereference.
>
> This can be reproduced using rtla:
> ```
> $ while true; do rtla timerlat top -u -q & PID=$!; sleep 5; \
> kill -INT $PID; sleep 0.001; kill -TERM $PID; wait $PID; done
> [1] 1675
> [1]+ Aborted (SIGTERM) rtla timerlat top -u -q
> [1] 1688
> client_loop: send disconnect: Broken pipe
> ```
> triggering the bug:

I'm able to reproduce this with the above. Unfortunately, I can still
reproduce it after applying this patch :-(

Looking at the code, the logic for handling the kthread seems off. I'll
spend a little time to see if I can figure it out.

-- Steve