* Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:
> You're probably mostly right, but I really don't know if I'd start
> with the assumption that threads don't share anything. I think they're
> very likely to share memory and cache.
it all depends on the workload i guess, but generally if the application
scales well then the threads only share data in a read-mostly manner -
hence we can balance at creation time.
if the application does not scale well then balancing too early cannot
make the app perform much worse.
things like JVMs tend to want good balancing - they really are userspace
simulations of separate contexts with little sharing and good overall
scalability of the architecture.
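
as a rough illustration of what balancing at creation time means here, a
minimal sketch follows. it is not the code in sched2.patch: the function
name pick_cpu_at_create() and the flat nr_running[] array are hypothetical,
and a real scheduler would walk its own per-CPU runqueue structures with
proper locking. the new thread goes to the least loaded CPU and stays on
the creating CPU unless another one is strictly less loaded:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from sched2.patch.
 *
 * nr_running[i] is assumed to hold the number of runnable tasks on CPU i.
 * Returns the CPU a newly created thread should start on.
 */
static size_t pick_cpu_at_create(const unsigned int *nr_running,
                                 size_t nr_cpus, size_t this_cpu)
{
	size_t best = this_cpu;
	size_t cpu;

	assert(nr_cpus > 0 && this_cpu < nr_cpus);

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		/*
		 * Strict '<' keeps the thread on the creating CPU on ties,
		 * so it only moves when another CPU is strictly less loaded.
		 */
		if (nr_running[cpu] < nr_running[best])
			best = cpu;
	}
	return best;
}
```

the scan above is linear in the number of CPUs, which is the cost that
matters once it is attached to a common operation like thread creation.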
> Also, these additional system wide balance points don't come for free
> if you attach them to common operations (as opposed to the slow
the implementation in sched2.patch does not take this into account yet.
There are a number of things we can do about the 500 CPUs case. Eg. only
do the balance search towards the next N nodes/cpus (tunable via a