Re: [patch] x86, mm: pass in 'total' to __copy_from_user_*nocache()

From: Nick Piggin
Date: Mon Mar 02 2009 - 23:30:53 EST


On Tuesday 03 March 2009 08:25:43 Ingo Molnar wrote:
> * Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > We can play games in the kernel. We do know how many sockets
> > there are. We do know the cache size. We _could_ try to make
> > an educated guess at whether the next user of the data will be
> > DMA or not. So there are unquestionably heuristics we could
> > apply, but I also do suspect that they'd inevitably be pretty
> > arbitrary.
>
> There's a higher-level meta-code argument to consider as well,
> and i find it equally important: finding this rather obvious and
> easy to measure performance regression caused by the
> introduction of MOVNT took us two years.
>
> That's _way_ too long - and adding more heuristics and more
> complexity will just increase this latency. (as we will create
> small niches of special cases with special workloads where we
> might or might not regress)
>
> So, if any such change is done, we can only do it if we have the
> latency of performance-regression-finding on the order of days
> or at most weeks - not years.

Something like temporal vs nontemporal stores into the pagecache
is a fundamental tradeoff that depends on userspace access
patterns the kernel can't know about. For the kernel side of
the equation, we could query the underlying backing store to see
whether it is going to do any CPU operations on the data before
writeout, e.g. checksumming or encryption. But even then, the
latency between dirtying the pagecache and initiating the writeout
means that the data is probably no longer in cache at the time of
writeout anyway. So the primary factor really is userspace access
patterns, I think.

In situations like that, I think the only way to really "solve"
the problem is to provide a way for userspace to ask for temporal
or nontemporal access. Yeah, this is just passing the buck, and
probably most apps that try to use this will do no better job than
the kernel :) But it allows ones that really care to give the
kernel much better information.
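
To make the tradeoff concrete, here is a rough userspace sketch of
what "ask for temporal or nontemporal" could look like at the copy
level. It is not the kernel's __copy_from_user_*nocache() code and
not an existing interface: the names copy_hint, copy_with_hint and
copy_nontemporal are made up for illustration. It assumes x86 with
SSE2 for the streaming path and falls back to memcpy() elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* _mm_loadu_si128, _mm_stream_si128, _mm_sfence */
#endif

/* Hypothetical hint a caller could pass; not a real kernel or libc API. */
enum copy_hint {
    COPY_TEMPORAL,      /* destination will be touched again soon */
    COPY_NONTEMPORAL    /* destination will not be read by the CPU soon */
};

/* Copy using streaming (MOVNTDQ) stores where SSE2 is available. */
static void copy_nontemporal(void *dst, const void *src, size_t len)
{
#if defined(__SSE2__)
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* MOVNTDQ needs a 16-byte aligned destination: copy the head normally. */
    while (len > 0 && ((uintptr_t)d & 15) != 0) {
        *d++ = *s++;
        len--;
    }

    /* Stream full 16-byte chunks past the cache. */
    for (; len >= 16; d += 16, s += 16, len -= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_stream_si128((__m128i *)d, v);
    }

    /* Order the weakly-ordered streaming stores before later stores. */
    _mm_sfence();

    /* Copy the tail (fewer than 16 bytes) with ordinary stores. */
    memcpy(d, s, len);
#else
    /* Assumption: no streaming stores available, use a plain copy. */
    memcpy(dst, src, len);
#endif
}

/* Dispatch on the caller's hint instead of guessing from the size. */
static void copy_with_hint(void *dst, const void *src, size_t len,
                           enum copy_hint hint)
{
    assert(dst != NULL && src != NULL);

    if (hint == COPY_NONTEMPORAL)
        copy_nontemporal(dst, src, len);
    else
        memcpy(dst, src, len);   /* ordinary, cache-allocating stores */
}
```

The point of the sketch is only that the caller, who knows whether
the data will be reused, makes the choice. Both paths produce the
same bytes; they differ only in what ends up in the CPU cache.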
