Re: [PATCH] de_thread: Use tsk not current.
From: Andreas Mohr
Date: Wed Jul 12 2006 - 04:50:26 EST
Hi,
On Tue, Jul 11, 2006 at 11:08:33PM +1000, Nick Piggin wrote:
> Andrew Morton wrote:
> >On Mon, 10 Jul 2006 22:42:25 -0600
> >ebiederm@xxxxxxxxxxxx (Eric W. Biederman) wrote:
> >
> >
> >>Ingo Oeser pointed out that because current expands to an inline
> >>function it
> >>is more space efficient and somewhat faster to simply keep a cached copy
> >>of
> >>current in another variable. This patch implements that for the
> >>de_thread
> >>function.
> >>
> >>- if (thread_group_empty(current))
> >>+ if (thread_group_empty(tsk))
> >>- if (unlikely(current->group_leader == child_reaper))
> >>- child_reaper = current;
> >>+ if (unlikely(tsk->group_leader == child_reaper))
> >>+ child_reaper = tsk;
> >>- zap_other_threads(current);
> >>+ zap_other_threads(tsk);
> >> read_unlock(&tasklist_lock);
> >>...
> >
> >
> >This saves nearly 100 bytes of text on gcc-4.1.0/i686.
>
> Why can't current be a pure function, I wonder?
Most likely due to compiler issues:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17972
(which turns out to deal with current_thread_info() specifically!)
http://www.cs.helsinki.fi/linux/linux-kernel/2003-22/0265.html
However as I've been interested in this issue since I noticed
AGI stalls (pipeline stalls) in oprofile output on x86 recently
due to very non-parallel "mov %esp,%eax; and $0xffffe000,%eax"
sequence (but couldn't think of a way to get rid of this),
I'm going to verify what adding pure does with my >= 4.0.0 gcc on my x86 box.
I'm expecting quite some *general* performance improvement in the
low percentage range here...
(since this would be much more than simply merging multiple "current"
into a local stack variable, since *every* current_thread_info() call
would benefit from this and current_tread_info() is used all over the
place in high-profile call sites)
If this works for >= 4.0.0, then I'll try to add conditional support
for pure etc. in our compiler version dependent infrastructure headers.
Plus, we also access current_thread_info() related things within loops
in some cases; would be much better then to store it in a stack variable
outside, methinks. Or could this conflict with aggressive preemption???
Andreas Mohr
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/