Re: [PATCH v2] trace: tracing_event_filter: fast path when no subsystem filters
From: Nick Lowell
Date: Thu Oct 12 2023 - 16:37:53 EST
On Wed, Oct 4, 2023 at 10:52 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Wed, 4 Oct 2023 10:39:33 -0400
> Nick Lowell <nicholas.lowell@xxxxxxxxx> wrote:
>
> > > ------------[ cut here ]------------
> > > WARNING: CPU: 5 PID: 944 at kernel/trace/trace_events_filter.c:2423 apply_subsystem_event_filter+0x18c/0x5e0
> > > Modules linked in:
> > > CPU: 5 PID: 944 Comm: trace-cmd Not tainted 6.6.0-rc4-test-00009-gff7cd7446fe5 #102
> > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > > RIP: 0010:apply_subsystem_event_filter+0x18c/0x5e0
> > > Code: 44 24 08 00 00 00 00 48 8b 6d 00 4c 39 f5 75 bc 48 8b 44 24 18 4c 8b 60 18 4c 89 e5 45 84 ff 75 14 48 85 ed 0f 84 37 ff ff ff <0f> 0b eb 10 e8 4b be fd ff eb b0 4d 85 e4 0f 84 a3 02 00 00 48 8b
> > > RSP: 0018:ffff9b4941607db8 EFLAGS: 00010286
> > > RAX: ffff8b2780a77280 RBX: ffff8b2780a77400 RCX: 0000000000000000
> > > RDX: 0000000000000000 RSI: ffff8b2781c11c38 RDI: ffff8b2781c11c38
> > > RBP: ffff8b28df449030 R08: ffff8b2781c11c38 R09: 0000000000000000
> > > R10: ffff8b2781c11c38 R11: 0000000000000000 R12: ffff8b28df449030
> > > R13: ffffffffaaf64de0 R14: ffffffffaaf66bb8 R15: 0000000000000000
> > > FS: 00007fd221def3c0(0000) GS:ffff8b28f7d40000(0000) knlGS:0000000000000000
> > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 000056117c93e160 CR3: 000000010173a003 CR4: 0000000000170ee0
> > > Call Trace:
> > > <TASK>
> > > ? apply_subsystem_event_filter+0x18c/0x5e0
> > > ? __warn+0x81/0x130
> > > ? apply_subsystem_event_filter+0x18c/0x5e0
> > > ? report_bug+0x191/0x1c0
> > > ? handle_bug+0x3c/0x80
> > > ? exc_invalid_op+0x17/0x70
> > > ? asm_exc_invalid_op+0x1a/0x20
> > > ? apply_subsystem_event_filter+0x18c/0x5e0
> > > ? apply_subsystem_event_filter+0x5b/0x5e0
> > > ? __check_object_size+0x25b/0x2c0
> > > subsystem_filter_write+0x41/0x70
> > > vfs_write+0xf2/0x440
> > > ? kmem_cache_free+0x22/0x350
> > > ksys_write+0x6f/0xf0
> > > do_syscall_64+0x3f/0xc0
> > > entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> > > RIP: 0033:0x7fd221ee7ae0
> > >
> > > -- Steve
> >
> > Is this just informative indicating that there are issues with how
> > filters are being used or are you saying there is something else I
> > need to do before this patch is approved?
> > What version of trace-cmd is that using?
>
> Not sure if it matters, but the above was with trace-cmd v3.2.
>
> So, I guess we need to look a bit deeper at the change.
>
> > @@ -2411,7 +2418,12 @@ int apply_subsystem_event_filter(struct trace_subsystem_dir *dir,
> > }
> >
> > if (!strcmp(strstrip(filter_string), "0")) {
> > - filter_free_subsystem_preds(dir, tr);
> > + /* If nothing was freed, we do not need to sync */
> > + if (!filter_free_subsystem_preds(dir, tr)) {
> > + if(!(WARN_ON_ONCE(system->filter)))
> > + goto out_unlock;
>
> When do we want to skip the below?
>
> The original version just did the "goto out_unlock" before the
> "system->filter" check, and that would have caused a memory leak, or just
> left the "system->filter" around when unneeded.
>
> In other words, the patch is incorrect in general then.
>
> > + }
> > +
> > remove_filter_string(system->filter);
> > filter = system->filter;
> > system->filter = NULL;
>
> I believe, what you really want here is simply:
>
> bool sync;
>
> [..]
>
> if (!strcmp(strstrip(filter_string), "0")) {
> + sync = filter_free_subsystem_preds(dir, tr);
> remove_filter_string(system->filter);
> filter = system->filter;
> system->filter = NULL;
> - /* Ensure all filters are no longer used */
> - tracepoint_synchronize_unregister();
> + if (sync) {
> + /* Ensure all filters are no longer used */
> + tracepoint_synchronize_unregister();
> + }
> filter_free_subsystem_filters(dir, tr);
>
I really appreciate the continued feedback. I was able to reproduce.
I think I'm understanding better but still need some help.
I am actually wondering if remove_filter_string(system->filter) should
also return bool as an OR'd input for sync.
Should it be something like this?
if (!strcmp(strstrip(filter_string), "0")) {
- filter_free_subsystem_preds(dir, tr);
- remove_filter_string(system->filter);
+ bool sync;
+
+ sync = filter_free_subsystem_preds(dir, tr);
+ sync = sync || remove_filter_string(system->filter);
filter = system->filter;
system->filter = NULL;
- /* Ensure all filters are no longer used */
- tracepoint_synchronize_unregister();
+ /* If nothing was freed, we do not need to sync */
+ if(sync) {
+ /* Ensure all filters are no longer used */
+ tracepoint_synchronize_unregister();
+ }
filter_free_subsystem_filters(dir, tr);
__free_filter(filter);
goto out_unlock;
> Maybe even pass in "sync" to the filter_free_subsystem_filters() to make
> sure there were nothing to be freed, and do the WARN_ON_ONCE() then.
>
> __free_filter(filter);
> goto out_unlock;
> }
>
> -- Steve
I'm not sure if I see the reasoning for the WARN_ON_ONCE() in
filter_free_subsystem_filters()
because it ends up checking the same if(!filter) just like
filter_free_subsystem_preds() did earlier. It doesn't
seem to do anything with system->filter. I actually wonder if !sync,
could filter_free_subsystem_filters()
be skipped altogether. Help me if I'm missing something.
Thanks again,
Nick