Re: Random scheduler/unaligned accesses crashes with perf lockevents on sparc 64
From: Steven Rostedt
Date: Tue Apr 06 2010 - 09:41:35 EST
It's best to send to my rostedt@xxxxxxxxxxx account, just like it is
best to send to your davem@xxxxxxxxxxxxx ;-)
On Mon, 2010-04-05 at 19:15 -0700, David Miller wrote:
> From: Frederic Weisbecker <fweisbec@xxxxxxxxx>
> Date: Mon, 5 Apr 2010 21:40:58 +0200
> > It happens without CONFIG_FUNCTION_TRACER as well (but it happens
> > when the function tracer runs). And I hadn't your
> > perf_arch_save_caller_regs() when I triggered this.
> I think there's still something wrong with the ring buffer stuff on
> architectures like sparc64.
> Stephen, I'm looking at the 8-byte alignment fix that was made a few
> weeks ago, commit:
> commit 2271048d1b3b0aabf83d25b29c20646dcabedc05
> Author: Steven Rostedt <srostedt@xxxxxxxxxx>
> Date: Thu Mar 18 17:54:19 2010 -0400
> ring-buffer: Do 8 byte alignment for 64 bit that can not handle 4 byte align
> and I'm not so sure it's completely correct.
> Originally, the ring buffer entries determine where the entry data
> resides (either &event->array or &event->array) based upon the
> Beforehand, in all cases:
> 1) If length could be encoded into event->type_len (ie. <=
> RB_MAX_SMALL_DATA) then event->type_len holds the length
> and the event data starts at &event->array
> 2) Otherwise (length > RB_MAX_SMALL_DATA) the length is
> encoded into event->array and the event data starts at
> But now, there is a new semantic when CONFIG_64BIT is true and
> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is false (which isn't the right
> test btw, f.e. sparc 32-bit needs this handling just like sparc 64-bit
> does since it uses full 64-bit loads and stores to access u64 objects
> and thus will crash without proper alignment, the correct test should
> be just CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS being false).
OK, so the a 64 bit word still needs 64 bit alignment when storing to a
I wonder if we should just have a special copy in this case for the
events and remove this patch in the ring buffer. That is:
And have in !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS be:
#define __assgin_word(dest, src) \
memcpy(&(dest), &(src), sizeof(src));
This would fix it for all.
> This new semantic is:
> 1) Entries always encode the length in ->array and ->type_len
> is set to zero.
> And then there are special cases like events of type
> RINGBUF_TYPE_PADDING which, even though ->type_len is non-zero, encode
> a length field in ->array which is used by the ring buffer
> iterators such as rb_event_length(), but this only applies only if
> event->time_delta is non-zero. (Phew!)
The RINGBUF_TYPE_PADDING is used to either fill the rest of the sub
buffer, where alignment does not matter, or to replace a deleted event
that had another event after, it, which should already be aligned.
> The commit adjusts the code in rb_calculate_event_length() to force 8
> byte chunks when RB_FORCE_8BYTE_ALIGNMENT is set. It also adjusted
> the rb_update_event() logic so that it unconditionally uses
> event->array for the length on such platforms.
> However I don't see any logic added to ring_buffer_event_length()
> to handle this forcing. That alone can't explain the crashes
> Frederic and I are seeing, since only oprofile seems to use that
> helper function, but I can just imagine there might be other
> subtle bugs linering after the above commit.
> Anyways, that's just the inital potential problem I've discovered.
> I'll start auditing the rest of this code.
> I wonder if there's a simpler way to implement this alignment fix such
> that we don't have to constantly make sure scores of locations in
> ring_buffer.c get this magic exception case correct.
> We should probably also BUILD_BUG_ON() if BUF_PAGE_HDR_SIZE is not
> a multiple of the necessary alignment, since the ring buffer
> entries start at the end of that.
> Also I noticed (painfully :-) that 2.6.33 needs a backport of this
> alignment fix too, so we should submit it to -stable (once we sift
> out all the bugs of course).
What about removing the logic from the ring buffer and moving it to the
TRACE_EVENT() macros as I suggested above?
We would probably need a way to read the buffers too.
I also know that Mathieu has some strange tricks to force alignment but
I'm still not convinced it would make things any less fragile than what
is already there.
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/