Re: [RFC PATCH 1/3] Unified trace buffer
From: Linus Torvalds
Date: Wed Sep 24 2008 - 16:57:20 EST
On Wed, 24 Sep 2008, Mathieu Desnoyers wrote:
>
> [...] Those will likely be low event-rate
> situations where it is useful to take a bigger snapshot of a problematic
> condition, but still to have it synchronized with the rest of the trace
> data. e.g. :
>
> - Writing a whole video frame into the trace upon video card glitch.
> - Writing a jumbo frame (up to 9000 bytes) into the buffer when a
> network card error is detected or when some iptables rules (LOG, TRACE
> ?) are reached.
> - Dumping a kernel stack (potentially 8KB) in a single event when a
> kernel OOPS is reached.
> - Dumping a userspace process stack into the trace upon SIGILL, SIGSEGV
> and friends.
But these are _all_ things that would be much better off with an "allocate
a separate buffer, and just add a pointer to the trace" approach.
Why? If for no other reason than the fact that we don't even want to spend
lots of time (atomically) copying the big data into the trace buffer!
Just allocate the buffer and fill it in (maybe it's pre-allocated already,
like when a network packet event happens!) and do all of that
independently of the low-level trace code. And then add the trace with the
pointer.
We want the low-level trace code to be useful for things like interrupt
events etc, which makes it a _disaster_ to try to add huge buffers
directly to the ring buffer. You also don't want to allocate a
multi-megabyte ring buffer for some odd case that happens rarely, when you
can allocate the big memory users dynamically.
So limiting a trace entry to 4kB does not mean that you can't add more
than 4kB to the trace - it just means that you need to have a "data
indirection" trace type. Nothing more, nothing less.
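One way to picture such a "data indirection" trace type is the user-space sketch below. It is only an illustration under stated assumptions: every name in it (`trace_entry`, `trace_ring`, `TRACE_DATA_INDIRECT`, `trace_add_indirect`) is hypothetical and not taken from the RFC patch, `malloc`/`free` stand in for whatever allocator the kernel would really use, and locking and atomicity are ignored. The point it tries to show is that the ring buffer entry stays small, while the large payload lives in a separately allocated buffer that the entry merely points to.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical; none come from the RFC patch. */
enum trace_type {
	TRACE_NONE = 0,
	TRACE_IRQ,            /* small event stored inline */
	TRACE_DATA_INDIRECT,  /* entry only points at a separate buffer */
};

/* Small fixed-size entry: this is all that lives in the ring itself. */
struct trace_entry {
	uint8_t type;
	union {
		uint32_t irq;
		struct {
			void  *data;  /* big payload, allocated elsewhere */
			size_t size;
		} indirect;
	} u;
};

#define RING_ENTRIES 64

/* Must start out zeroed so that every slot reads as TRACE_NONE. */
struct trace_ring {
	struct trace_entry slot[RING_ENTRIES];
	unsigned int head;
};

/* Claim the next slot, releasing any big buffer the old entry owned. */
static struct trace_entry *ring_reserve(struct trace_ring *r)
{
	struct trace_entry *e = &r->slot[r->head % RING_ENTRIES];

	if (e->type == TRACE_DATA_INDIRECT)
		free(e->u.indirect.data);
	memset(e, 0, sizeof(*e));
	r->head++;
	return e;
}

/* Fast path: a tiny event written straight into the ring. */
static const struct trace_entry *trace_irq(struct trace_ring *r, uint32_t irq)
{
	struct trace_entry *e = ring_reserve(r);

	e->type = TRACE_IRQ;
	e->u.irq = irq;
	return e;
}

/*
 * Slow path: the caller has already allocated and filled 'data'
 * independently of the trace code. Only the pointer and size are
 * recorded; the ring takes ownership of the buffer.
 */
static const struct trace_entry *trace_add_indirect(struct trace_ring *r,
						    void *data, size_t size)
{
	struct trace_entry *e = ring_reserve(r);

	e->type = TRACE_DATA_INDIRECT;
	e->u.indirect.data = data;
	e->u.indirect.size = size;
	return e;
}
```

In this sketch a 9000-byte jumbo frame costs the ring one small entry, the same as an interrupt event, and no large copy happens while the entry is being written.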
[ And btw - you'd need that *anyway* for other reasons. You also don't
want any length field to have to be 32 bits wide etc - the
length field of the trace buffer entry should be something really small
like 8 or 16 bits, or even be implicit in the type for some basic event
types, so that a trace event doesn't necessarily waste any bits at ALL
on the length field ]
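The length-field remark can be sketched in the same spirit. The encoding below is an assumption made purely for illustration, not the format proposed in the patch: event types below a hypothetical `EV_NR_FIXED` have their payload size implied by the type (so the header is a single type byte), and only a hypothetical `EV_VARLEN` type carries an explicit 8-bit length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical event types; the numbering is an assumption. */
enum {
	EV_IRQ_ENTRY    = 0,  /* payload: 4 bytes, implied by type */
	EV_IRQ_EXIT     = 1,  /* payload: 4 bytes, implied by type */
	EV_SCHED_SWITCH = 2,  /* payload: 8 bytes, implied by type */
	EV_VARLEN       = 3,  /* payload: explicit 8-bit length    */
};

#define EV_NR_FIXED 3

/* Payload size for each fixed-size type; no length bits are stored. */
static const uint8_t fixed_payload_len[EV_NR_FIXED] = { 4, 4, 8 };

/*
 * Encode one event into 'out'. Returns the number of bytes written,
 * or 0 if the event is invalid or does not fit in 'room' bytes.
 */
static size_t trace_encode(uint8_t *out, size_t room, uint8_t type,
			   const void *payload, size_t len)
{
	size_t hdr;

	if (type < EV_NR_FIXED) {
		if (len != fixed_payload_len[type])
			return 0;
		hdr = 1;              /* type byte only */
	} else if (type == EV_VARLEN) {
		if (len > UINT8_MAX)
			return 0;     /* too big: needs indirection instead */
		hdr = 2;              /* type byte + 8-bit length */
	} else {
		return 0;
	}

	if (hdr + len > room)
		return 0;

	out[0] = type;
	if (hdr == 2)
		out[1] = (uint8_t)len;
	if (len)
		memcpy(out + hdr, payload, len);
	return hdr + len;
}

/*
 * Total size of the entry starting at 'in', so a reader can step to
 * the next one. Assumes a well-formed entry written by trace_encode().
 */
static size_t trace_entry_size(const uint8_t *in)
{
	if (in[0] < EV_NR_FIXED)
		return 1 + (size_t)fixed_payload_len[in[0]];
	return 2 + (size_t)in[1];
}
```

With this layout an interrupt event occupies five bytes in total, none of them spent on a length, and anything that does not fit in an 8-bit length is rejected, which is where a pointer-carrying entry would take over.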
Linus