Re: Add prefetch switch stack hook in scheduler function

From: Ingo Molnar
Date: Thu Jul 28 2005 - 05:12:37 EST



* Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:

> >i'm not sure what you mean by prefetching next->timestamp, it's an
> >inline field to 'next', in the first cacheline of it, which we've
> >already used so it's present. (If you mean the value of next->timestamp,
> >that has no address meaning at all so would lead to unpredictable
> >results on some arches.)
> >
>
> No, I meant the cacheline holding the field of course. I guess I could
> have looked for a field further down, but even so, ->timestamp might
> be 96 bytes into the structure on a 64-bit arch, which may or may not
> be the first cacheline... but you get the idea.

i'd rather wait with that one. A prefetch can generate a prefetch of the
next cacheline too, so it might or might not make sense to prefetch both
cachelines explicitly.

Also, explicitly prefetching &next->timestamp is ugly because it bakes
in assumptions about the structure's layout and the cacheline size. If
multi-cacheline prefetching becomes necessary then this needs to be
implemented via a generic extension to prefetch.h:

prefetch_area(void *first_addr, void *last_addr)

(or as addr,len)

This way an architecture can decide how it prefetches contiguous
cachelines _without knowing anything about the structure being
prefetched_.

Also, generic code using the prefetches could then designate
'cache-hot' areas without having to know the architecture's cacheline
size or prefetching tactics. (prefetch_area() could be resolved at
compile time, so it would be equivalent to a single-cacheline prefetch
on arches with larger cachelines.)
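
A minimal user-space sketch of the prefetch_area() idea, assuming a
64-byte cacheline and GCC's __builtin_prefetch(). The sketch_ names and
the fixed line size are assumptions for illustration only; a kernel
version would presumably take the line size from the architecture
(e.g. L1_CACHE_BYTES) and use the arch's own prefetch():

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cacheline size for this sketch; real code would get it per-arch. */
#define SKETCH_CACHELINE_BYTES 64u

/* Number of cachelines touched by the inclusive byte range [first, last]. */
static inline size_t sketch_lines_spanned(uintptr_t first, uintptr_t last)
{
	uintptr_t mask = ~(uintptr_t)(SKETCH_CACHELINE_BYTES - 1);

	assert(first <= last);
	return (size_t)(((last & mask) - (first & mask)) /
			SKETCH_CACHELINE_BYTES) + 1;
}

/*
 * Hypothetical prefetch_area(): issue one prefetch per cacheline in
 * [first_addr, last_addr]. The caller only names the hot range; it does
 * not need to know the line size. Returns the number of prefetches issued.
 */
static inline size_t sketch_prefetch_area(const void *first_addr,
					  const void *last_addr)
{
	uintptr_t line = (uintptr_t)first_addr &
			 ~(uintptr_t)(SKETCH_CACHELINE_BYTES - 1);
	size_t n = sketch_lines_spanned((uintptr_t)first_addr,
					(uintptr_t)last_addr);
	size_t i;

	for (i = 0; i < n; i++, line += SKETCH_CACHELINE_BYTES)
		__builtin_prefetch((const void *)line, 0, 3);

	return n;
}
```

With a larger SKETCH_CACHELINE_BYTES the same range collapses to fewer
prefetches, which is the "equivalent to a single-cacheline prefetch"
case for a small enough area.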

> > i'd like to keep generic bits in generic code, and only move things
> > to per-arch include files if absolutely necessary. next->mm is
> > generic.
>
> Yeah, then a specific field _within_ next->mm or thread_info may want
> to be fetched. In short, I don't see any argument why we shouldn't
> call the function prefetch_task().

it's a fundamental thing: we _don't_ want to push generic code into
architectures, and there's nothing per-arch about next->mm.

thread_info is really small on most arches, and even where it is not,
the most important data is always grouped at the top of the structure.
(this is a pretty fundamental concept throughout the kernel)
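
To illustrate the "hot data at the top" layout, here is a small sketch
with a made-up structure (sketch_thread_info is hypothetical and does
not match any real arch's thread_info). It assumes the structure itself
starts on a cacheline boundary and that no cacheline is smaller than 32
bytes:

```c
#include <stddef.h>

/* Smallest cacheline size this sketch assumes any arch would have. */
#define SKETCH_MIN_LINE_BYTES 32u

/* Hypothetical layout: frequently used fields first, cold data after. */
struct sketch_thread_info {
	unsigned long flags;		/* hot: checked on every switch */
	int preempt_count;		/* hot */
	char cold_data[256];		/* rarely touched */
};

/* Compile-time check that the hot fields fit in the first cacheline. */
_Static_assert(offsetof(struct sketch_thread_info, preempt_count) +
	       sizeof(int) <= SKETCH_MIN_LINE_BYTES,
	       "hot fields must stay in the first cacheline");

/* Which cacheline (0-based) a given structure offset falls into. */
static inline size_t sketch_line_index(size_t offset, size_t line_bytes)
{
	return offset / line_bytes;
}
```

A field at offset 96 gives index 1 with 64-byte lines and index 0 with
128-byte lines, which is why prefetching a specific field's address
depends on the line size while prefetching the top of the structure
does not.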

> Secondly, I don't really like your prefetch(kernel_stack()) function
> because it doesn't really give architectures enough control over
> exactly what cachelines they get in memory.

my point is, it comes down to concrete examples: it may or may not make
sense to do things per-arch.

If this was per-arch, most arches wouldn't get this benefit, and any
improvements would likely stay stuck in whichever arch made them. We've
seen that happen too many times to repeat this mistake again. We've got
24 architectures (and counting), so any code 'left to the architecture
to control' will propagate to other architectures only very slowly.

try to look at it from another angle: probably only because i insisted
on doing this in the generic code and not per-arch did we end up
discovering the potential generic need to prefetch next->thread_info and
next->mm.

> prefetching and memory access patterns of all this stuff are fairly
> architecture specific.

true of course, but from this it does not follow at all that we should
move all task-switch related prefetching into a prefetch_task() call!

> [...] I see nothing wrong with having a prefetch_task() call.
> (Although I agree things like thread_info->flags and next->mm can be
> done in generic code).

great that we now agree wrt. thread_info and next->mm. My remaining
point is, once we prefetch ->thread_info, ->mm and the kernel stack,
nothing else significant remains! (It's still very much possible that
something needs to be prefetched per-arch, but i'd like to see a robust
case be made for it, instead of your global 'it might happen' argument.)
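
As a rough user-space sketch of what the generic side could look like
(all sketch_ names and the three-field task structure are hypothetical,
not the scheduler's actual code; __builtin_prefetch() stands in for the
kernel's prefetch()):

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real kernel structures. */
struct sketch_mm { unsigned long pgd; };
struct sketch_thread_info { unsigned long flags; };

struct sketch_task {
	struct sketch_thread_info *thread_info;
	struct sketch_mm *mm;		/* NULL for kernel threads */
	void *kernel_sp;		/* saved kernel stack pointer */
};

/*
 * Generic prefetch of the three things the next task will touch first.
 * Nothing here depends on the architecture. Returns the number of
 * prefetches issued, purely so the sketch can be checked.
 */
static inline int sketch_prefetch_next(const struct sketch_task *next)
{
	int issued = 0;

	__builtin_prefetch(next->thread_info);
	issued++;

	if (next->mm) {
		__builtin_prefetch(next->mm);
		issued++;
	}

	__builtin_prefetch(next->kernel_sp);
	issued++;

	return issued;
}
```

Anything beyond these three would be the per-arch case that still needs
a concrete example.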

Ingo