Re: [PATCH] vfs: Add a trace point in the mark_inode_dirty function

From: Ingo Molnar
Date: Thu Nov 12 2009 - 02:22:39 EST



* Kok, Auke <auke-jan.h.kok@xxxxxxxxx> wrote:

> If you already know what the file object is, sure. We're interested in
> the case where we have no clue what the file object actually is to
> begin with. Having a trace with a random inode number pop up and then
> disappear into thin air won't help much at all, especially if we can't
> map it back to something "real" on disk in time.

Yep.

It's similar to PID/comm tracing, which we already do consistently for
all major task events such as fork/exit, sleep/wakeup/context-switch,
etc.

By the 'use inode numbers' argument it should be perfectly fine to only
trace the physical PID itself, and look up the comm later in /proc, or
to add a syscall to do it.

In reality it's not fine. It's not just the unnecessary overhead (you
have to look up something you already had): tasks exit quickly in
high-frequency workloads (so the comm is lost), the PID might not match
up anymore, tasks can change their comm, etc.
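
To make the failure mode concrete, here is a small userspace sketch (not
kernel code, and not part of the patch under discussion). The task table
and every function and struct name in it are hypothetical stand-ins: it
only shows that a PID-only event resolved later returns whatever owns
the PID now, while an event that copied the comm keeps the right answer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define COMM_LEN 16
#define MAX_PID  8

/* Hypothetical stand-in for the live task table (what /proc would show). */
static char task_table[MAX_PID][COMM_LEN];

struct pid_only_event { int pid; };
struct pid_comm_event { int pid; char comm[COMM_LEN]; };

void task_start(int pid, const char *comm)
{
	snprintf(task_table[pid], COMM_LEN, "%s", comm);
}

void task_exit(int pid)
{
	task_table[pid][0] = '\0';
}

/* Copy the comm into the event record at the time of the event. */
struct pid_comm_event trace_with_comm(int pid)
{
	struct pid_comm_event ev = { .pid = pid };

	snprintf(ev.comm, COMM_LEN, "%s", task_table[pid]);
	return ev;
}

/* Late lookup: returns whatever the table holds for that PID right now. */
const char *late_lookup(struct pid_only_event ev)
{
	return task_table[ev.pid];
}

void demo_late_lookup(void)
{
	struct pid_only_event a = { .pid = 3 };
	struct pid_comm_event b;

	task_start(3, "cc1");
	b = trace_with_comm(3);

	task_exit(3);			/* short-lived task is gone ... */
	assert(late_lookup(a)[0] == '\0');	/* ... and so is its comm */

	task_start(3, "bash");		/* PID gets reused */
	assert(strcmp(late_lookup(a), "bash") == 0);	/* wrong task */
	assert(strcmp(b.comm, "cc1") == 0);		/* still correct */
}
```

Calling demo_late_lookup() walks through the exit and PID-reuse cases;
the comm-change case fails the same way, since the late lookup only ever
sees the current name.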

The most important principle with event logging is that we want the
highest-quality information and we want a trustworthy and simple data
source: so for tasks we want the PID and the comm, and for files we want
the top name component and perhaps also the inode number (plus a
filesystem id), captured when the event happened.
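
As a rough illustration of what such a file event record could carry,
here is a userspace sketch. It is not the kernel's tracepoint API and
not the layout used by the patch; the struct names, field names and the
32-byte name limit are all assumptions made for the example. The point
is only that name, inode number and filesystem id are copied when the
event fires, so a later rename does not change what the trace says.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_LEN 32	/* assumed limit for the example */

/* Hypothetical stand-in for an in-kernel file object. */
struct file_obj {
	unsigned long ino;
	unsigned int  fs_id;
	char          name[NAME_LEN];
};

/* Event record: a snapshot, not a pointer back to the object. */
struct dirty_event {
	unsigned long ino;
	unsigned int  fs_id;
	char          name[NAME_LEN];
};

/* Copy everything needed to identify the file at event time. */
void record_dirty(struct dirty_event *ev, const struct file_obj *f)
{
	ev->ino = f->ino;
	ev->fs_id = f->fs_id;
	snprintf(ev->name, sizeof(ev->name), "%s", f->name);
}

void demo_dirty_event(void)
{
	struct file_obj f = { .ino = 1234, .fs_id = 8, .name = "journal.tmp" };
	struct dirty_event ev;

	record_dirty(&ev, &f);

	/* The file is renamed after the event was logged. */
	snprintf(f.name, sizeof(f.name), "%s", "journal");

	/* The record still describes the file as it was at event time. */
	assert(strcmp(ev.name, "journal.tmp") == 0);
	assert(ev.ino == 1234 && ev.fs_id == 8);
}
```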

Ingo