Re: [PATCH 2/5] mm: remove unlikly NULL from kfree

From: Steven Rostedt
Date: Wed Mar 25 2009 - 11:08:49 EST



On Wed, 25 Mar 2009, Steven Rostedt wrote:

>
> On Wed, 25 Mar 2009, Pekka Enberg wrote:
>
> > Hi Steven,
> >
> > On Wed, 25 Mar 2009, Pekka Enberg wrote:
> > > > OK, so according to Steven, audit_syscall_exit() is one such call-site
> > > > that shows up in the traces. I don't really understand what's going on
> > > > there but if it is sane, maybe that warrants the removal of unlikely()
> > > > from kfree(). Hmm?
> >
> > On Wed, 2009-03-25 at 10:47 -0400, Steven Rostedt wrote:
> > > After disabling AUDIT_SYSCALLS I have this:
> > >
> > > # cat /debug/tracing/trace | sort -u
> > >
> > > record_nulls: ptr=(null) (ext3_get_acl+0x1e0/0x3f0 [ext3])
> > > record_nulls: ptr=(null) (free_bitmap+0x29/0x70)
> > > record_nulls: ptr=(null) (free_tty_struct+0x1d/0x40)
> > > record_nulls: ptr=(null) (ftrace_graph_exit_task+0x1e/0x20)
> > > record_nulls: ptr=(null) (inet_sock_destruct+0x1cb/0x2a0)
> > > record_nulls: ptr=(null) (ip_cork_release+0x24/0x50)
> > > record_nulls: ptr=(null) (keyctl_join_session_keyring+0x5a/0x70)
> > > record_nulls: ptr=(null) (key_user_lookup+0x183/0x220)
> > > record_nulls: ptr=(null) (kobject_set_name_vargs+0x43/0x50)
> > > record_nulls: ptr=(null) (netlink_release+0x1a4/0x2f0)
> > > record_nulls: ptr=(null) (release_sysfs_dirent+0x20/0xc0)
> > > record_nulls: ptr=(null) (sysfs_open_file+0x1c8/0x3e0)
> > > record_nulls: ptr=(null) (tty_write+0x16a/0x290)
> > >
> > > I added a hook to only record when NULL is passed into kfree.
> > >
> > > Also note, that after disabling AUDIT_SYSCALLS I now only have roughly 7%
> > > NULL hit rate. Still, unlikely is probably not a benefit here.
> >
> > Thanks for doing this. Do you mean that 93% hit ratio is not enough to
> > be a performance gain?
>
> I think it was Christoph Lameter (good you Cc'd him) who told me that
> anything less than 99% is not worthy of a (un)likely macro.
>
> I honestly don't know.

I think the theory is that gcc and the CPU handle normal branch
prediction well on their own. But if you use one of the prediction
macros, gcc (and some archs) behave differently, such that taking the
wrong branch can cost more than the time saved by all the correct hits.

Again, I'm not sure. I haven't run the benchmarks. Perhaps someone else
knows the details here better than I do.
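
For anyone following along, here is a rough userspace sketch of what is
being discussed. The two macros have the same shape as the ones in
include/linux/compiler.h. Everything else (sketch_free_hinted(),
sketch_free_plain(), the two counters and null_rate_percent()) is
hypothetical and made up for illustration. It is not the kfree() code
and not the record_nulls hook. It only shows where the hint sits and how
a NULL rate like the 7% above could be counted; it measures nothing
about branch cost.

```c
#include <stddef.h>
#include <stdlib.h>

/* Same shape as the kernel's hint macros. */
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)

/* Hypothetical counters, standing in for a tracing hook. */
static unsigned long nr_calls;
static unsigned long nr_nulls;

/* Hypothetical stand-in for a free routine that keeps the hint. */
static void sketch_free_hinted(void *ptr)
{
	nr_calls++;
	/* Tells gcc to lay out the NULL case as the cold path. */
	if (unlikely(ptr == NULL)) {
		nr_nulls++;
		return;
	}
	free(ptr);
}

/* Same logic with the hint removed; gcc and the CPU decide. */
static void sketch_free_plain(void *ptr)
{
	nr_calls++;
	if (ptr == NULL) {
		nr_nulls++;
		return;
	}
	free(ptr);
}

/* Percentage of recorded calls that were passed NULL. */
static double null_rate_percent(void)
{
	return nr_calls ? 100.0 * nr_nulls / nr_calls : 0.0;
}
```

To compare the two variants one would have to time them under a real
workload, which is exactly the benchmark I have not done.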

-- Steve
