Re: [PATCH V2 0/4] sched: add new 'book' scheduling domain

From: Peter Zijlstra
Date: Wed Sep 01 2010 - 06:06:59 EST


On Tue, 2010-08-31 at 10:28 +0200, Heiko Carstens wrote:
> This patch set adds (yet) another scheduling domain to the scheduler. The
> reason for this is that the recent (s390) z196 architecture has four cache
> levels and uniform memory access (sort of -- see below).
> The cpu/cache/memory hierarchy is as follows:
>
> Each cpu has its private L1 (64KB I-cache + 128KB D-cache) and L2 (1.5MB)
> cache.
> A core consists of four cpus with a 24MB shared L3 cache.
> A book consists of six cores with a 192MB shared L4 cache.
>
> The z196 architecture has no SMT.
> Also the statement that we have uniform memory access is not entirely
> correct. Actually the machine uses memory striping, so it "looks" like
> we have UMA until the next slice of memory gets accessed.
> However there is no interface which tells us which piece of memory is local
> or remote. So we (have to) simplify and assume that the cost of each memory
> access with L4 cache miss is the same.
>
> In order to somehow use the information about the cache hierarchy so that
> the scheduler can make some decisions that improve cache hits I added the
> 'BOOK' scheduling domain between the MC and CPU domains.
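
For illustration only, the hierarchy quoted above can be sketched as a
mapping from a cpu number to its core and book. The 4-cpus-per-core and
6-cores-per-book figures come from the description; the linear cpu
numbering and the helper names are assumptions made for this sketch and
are not the s390 topology code.

```c
#include <assert.h>

/* Figures taken from the quoted z196 description. */
#define CPUS_PER_CORE   4   /* cpus sharing one L3 cache */
#define CORES_PER_BOOK  6   /* cores sharing one L4 cache */
#define CPUS_PER_BOOK   (CPUS_PER_CORE * CORES_PER_BOOK)

/*
 * Hypothetical helpers. Assumption: cpus are numbered linearly,
 * core by core and book by book, starting at 0.
 */
static int cpu_to_core(int cpu)
{
	return cpu / CPUS_PER_CORE;
}

static int cpu_to_book(int cpu)
{
	return cpu / CPUS_PER_BOOK;
}

/*
 * Return the closest cache level two distinct cpus share:
 * 3 for L3 (same core), 4 for L4 (same book), 0 for none.
 */
static int shared_cache_level(int a, int b)
{
	if (cpu_to_core(a) == cpu_to_core(b))
		return 3;
	if (cpu_to_book(a) == cpu_to_book(b))
		return 4;
	return 0;
}
```

Under these assumptions cpus 5 and 6 land in the same core (shared L3),
cpus 3 and 4 only in the same book (shared L4), and cpus 23 and 24 in
different books, which is the distinction the BOOK domain is meant to
expose to the load balancer.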

Took the patches, but the description of the main patch is a bit
wanting. It implies books are useful for NUMA-like things when there
isn't any information on where the node boundaries are. That isn't what
you say here, which is that a book is the L4 cache level.

<rant>
Ideally we'd kill all the sd->level stuff, rework the domain creation
as outlined before, and simply go by the sd->flags domain properties. At
that point you can simply tag this as yet another cache level.
</rant>
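
A rough userspace sketch of what "go by flags" could look like, purely
to illustrate the idea: each domain carries property bits and code asks
about the property instead of comparing level numbers. The flag names,
the struct and the table are hypothetical; they are not the kernel's
SD_* flags or struct sched_domain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical property flags; NOT the kernel's SD_* definitions. */
#define DOM_SHARE_CACHE  0x1u  /* cpus in this domain share a cache */
#define DOM_SHARE_CORE   0x2u  /* SMT siblings; unused, z196 has no SMT */

/* Hypothetical domain descriptor: a name plus property bits. */
struct domain_desc {
	const char *name;
	unsigned int flags;
};

/* Illustrative z196-like stack, smallest domain first. */
static const struct domain_desc z196_domains[] = {
	{ "MC",   DOM_SHARE_CACHE },  /* L3 shared within a core */
	{ "BOOK", DOM_SHARE_CACHE },  /* L4 shared within a book */
	{ "CPU",  0 },                /* no cache shared at this level */
};

/* Count the domains that are cache levels, looking only at flags. */
static size_t count_cache_domains(const struct domain_desc *d, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (d[i].flags & DOM_SHARE_CACHE)
			count++;
	return count;
}
```

In this sketch the book needs no special casing: it is one more entry
with the cache-sharing bit set, and nothing depends on where it sits in
a level enumeration.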


