Re: [PATCH 1/5] cpuset memory spread basic implementation

From: Ingo Molnar
Date: Mon Feb 06 2006 - 01:17:34 EST



* Andrew Morton <akpm@xxxxxxxx> wrote:

> I think the bottom line here is that the kernel just cannot predict
> the future and it will need help from the applications and/or
> administrators to be able to do optimal things. For that,
> finer-grained one-knob-per-concept controls would be better.

yep. The cleanest would be to let tasks identify the fundamental access
pattern with different granularity. I'm wondering whether it would be
enough to simply extend madvise and fadvise to 'task' scope as well, and
change the pagecache allocation pattern to 'spread out' pages on NUMA,
if POSIX_FADV_RANDOM / MADV_RANDOM is specified.

hence 'global' workloads could set the per-task [and perhaps per-cpuset]
access-pattern default to POSIX_FADV_RANDOM, while 'local' workloads
could set it to POSIX_FADV_SEQUENTIAL [or leave it at the default].

another API solution: perhaps there should be a per-mountpoint
fadvise/madvise hint? Thus the database in question could set the access
pattern for the object itself. (or an ACL tag could achieve the same)
That approach would have the advantage of being quite finegrained, and
would limit the 'interleaving' strategy to the affected objects alone.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/