Re: [PATCH 2/6] tracing: increase size of number of possibleevents
From: Steven Rostedt
Date: Mon May 04 2009 - 10:12:57 EST
On Mon, 4 May 2009, Li Zefan wrote:
> >>>>> # cat /debug/tracing/trace_pipe
> >>>>>
> >>>>> console 2:
> >>>>> while (1) {
> >>>>> insmod trace-events-sample.ko
> >>>>> echo foo_bar > /debug/tracing/set_event
> >>>>> rmmod trace-events-sample.ko
> >>>>> }
> >>>>>
> >>>>> I got this immediately:
> >>>>>
> >>>>> BUG: unable to handle kernel NULL pointer dereference at 0000006f
> >>>>> IP: [<c05210f3>] bstr_printf+0x2ce/0x302
> >>>>> ...
> >>>>> Call Trace:
> >>>>> [<c0476d12>] ? trace_seq_bprintf+0x28/0x41
> >>>>> [<c0477569>] ? trace_bprint_print+0x58/0x6c
> >>>>> [<c0472ffc>] ? print_trace_line+0x2c5/0x2df
> >>>>> [<c0428a41>] ? sub_preempt_count+0x85/0xa0
> >>>>> [<c04758cf>] ? tracing_read_pipe+0x118/0x191
> >>>>> [<c04757b7>] ? tracing_read_pipe+0x0/0x191
> >>>>> [<c04b09f9>] ? vfs_read+0x8f/0x136
> >>>>> [<c04b0da3>] ? sys_read+0x40/0x65
> >>>>> [<c0402a68>] ? sysenter_do_call+0x12/0x36
> >>>>>
> >>>>> (We can even get other crashes..)
> >>>>>
> >>>> Can you send me your full bootlog and config. I'm not able to reproduce
> >>>> this.
> >>> And the SHA1 of the HEAD of the git repo you are using.
> >>>
> >>> Thanks,
> >>>
> >> the HEAD: 9ce5424d75e56891905b77d1589924765e62059a + this patch
> >>
> >> (I commented out "hi" from the sample module)
> >
> > I found that commenting out the "hi" too helped in causing races.
> >
> >> I think it's because "type" is wrapped back to 0. I changed the test script
> >> like this:
> >>
> >> while (foo_bar.id < 65536)
> >> {
> >> insmod trace-events-sample.ko
> >> rmmod trace-events-sample.ko
> >> }
> >>
> >> for ((; ;))
> >> {
> >> insmod /home/lizf/linux-2.6-tip/samples/trace_events/*.ko
> >> echo trace-events-sample:foo_bar > /mnt/tracing/set_event
> >> sleep 1
> >> cat /mnt/tracing/trace
> >> rmmod /home/lizf/linux-2.6-tip/samples/trace_events/*.ko
> >> }
> >>
> >
> > Could you try my git repo with my latest changes?
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git
> >
> > branch: rfc/debugfs
> >
>
> The bug won't be triggered with your latest changes, but it has nothing
> to do with the debugfs patch, it's because this patch:
> "tracing/events: reuse trace event ids after overflow"
>
> But that patch just hides the bug not fixes it.. I thought I explained
> the bug clearly. :(
>
> I think the bug is triggered because a wrong trace_event->trace() is called.
>
>
> Before the above patch:
> id overflowed
> =>foo_bar.id == 6 == TRACE_BPRINT
> =>echo foo_bar > set_event
> =>sleep N secs
> =>rmmod sample
> =>cat trace
> =>trace_bprint_event->trace() is called!!
> =>oops!!!
>
>
> After the above patch:
>
> Suppose id 0 ~ 65534 are used, only 65535 is left, and we have 2 different
> sample module trace_event subsystems foo and bar.
> =>insmod foo.ko, and foo.id == 65535
> =>echo foo > set_event
> =>sleep N secs
> =>rmmod foo.ko
> =>insmod bar.ko, and bar.id == 65535
> =>cat trace
> =>bar->trace() will be called on events which were generated by foo!
>
> So is it possible/sane to discard corresponding events from ring buffer when
> we unload a module?
>
> Correct me if I'm wrong..
OK, I understand what you are saying.
You are saying that if we reuse the event id, then there is a case that
the event can be used by another module, and if we have old data in the
ring buffer the other module's callback will be called with the previous
module's data.
Right?
A module should not be storing pointers into the ring buffer. We do
that with the trace_print, but I think we were suppose to do something
special if we detect that it is a module being called.
Frederic?
If the module only stores data and never a pointer, then the worst thing
stale data would do is show junk.
Scanning the ring buffer and discarding the module data can be very racy.
We would need to disable the ring buffer while doing this. Probably the
best thing to do is to simpy reset the ring buffers if it detects that a
module is being unloaded and has callbacks. The data in the ring buffer is
garbage anyway, so might as well reset it.
-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/