Re: [PATCH] kfence: Replace local_clock() with ktime_get_boot_fast_ns()

From: Juntong Deng
Date: Wed Nov 22 2023 - 16:36:18 EST


On 2023/11/23 4:35, Marco Elver wrote:
On Wed, 22 Nov 2023 at 21:01, Juntong Deng <juntong.deng@xxxxxxxxxxx> wrote:

The time obtained by local_clock() is the local CPU time, which may
drift between CPUs and is not suitable for comparison across CPUs.

It is possible for allocation and free to occur on different CPUs,
and using local_clock() to record timestamps may cause confusion.

The same problem exists with printk logging.

ktime_get_boot_fast_ns() is based on clock sources and can be used
reliably and accurately for comparison across CPUs.

You may be right here, however, the choice of local_clock() was
deliberate: it's the same timestamp source that printk uses.

Also, on systems where there is drift, the arch selects
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK (like on x86) and the drift is
generally bounded.

Signed-off-by: Juntong Deng <juntong.deng@xxxxxxxxxxx>
---
mm/kfence/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 3872528d0963..041c03394193 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -295,7 +295,7 @@ metadata_update_state(struct kfence_metadata *meta, enum kfence_object_state nex
track->num_stack_entries = num_stack_entries;
track->pid = task_pid_nr(current);
track->cpu = raw_smp_processor_id();
- track->ts_nsec = local_clock(); /* Same source as printk timestamps. */
+ track->ts_nsec = ktime_get_boot_fast_ns();

You have ignored the comment placed here - now it's no longer the same
source as printk timestamps. I think not being able to correlate
information from KFENCE reports with timestamps in lines from printk
is worse.

For now, I have to Nack: Unless you can prove that
ktime_get_boot_fast_ns() can still be correlated with timestamps from
printk timestamps, I think this change only trades one problem for
another.

Thanks,
-- Marco

Honestly, the possibility of accurately matching a message in the printk
log by the timestamp in the kfence report is very low, since allocation
and free do not directly correspond to a certain event.

Since time drifts across CPUs, timestamps may be different even if
allocation and free can correspond to a certain event.

If we really need to find the relevant printk logs by the timestamps in
the kfence report, all we can do is to look for messages that are within
a certain time range.

If we are looking for messages in a certain time range, there is not
much difference between local_clock() and ktime_get_boot_fast_ns().

Also, this patch is in preparation for my next patch.

My next patch is to show the PID, CPU number, and timestamp when the
error occurred, in this case time drift from different CPUs can
cause confusion.

For example, use-after-free caused by a subtle race condition, in which
the time between the free and the error occur will be very close.

Time drift from different CPUs may cause it to appear in the report that
the error timestamp precedes the free timestamp.