Re: [profile] amortize atomic hit count increments

From: William Lee Irwin III
Date: Tue Sep 14 2004 - 01:11:41 EST


On Mon, 13 Sep 2004 22:32:18 -0700 William Lee Irwin III wrote:
>> There is also an unusual facet to this; the TLB overhead of a loop like:
[...]
>> is very large and caused "effective nontermination", otherwise known as
>> "exhausting the user's patience", on SGI's systems after about half an
>> hour. So some TLB overhead amortization is necessary for this to be
>> feasible. I suspect iterating over pages of the profile buffer and
>> storing intermediate results for a page full of profile buffer hits
>> in a buffer page may suffice though I've not tried it.

On Mon, Sep 13, 2004 at 10:49:43PM -0700, David S. Miller wrote:
> I bet that, like we found out about page tables on 64-bit, these
> profile buffers are sparsely populated with hits. So perhaps a
> per-cpu bitmap that indicates regions that might have any hits
> at all, allowing large amounts of skipping and thus amortizing the
> scan cost.

Well, that would speed it up, but the older patches avoided the
catastrophe just by processing all the hits for one cpu at a time, and
the buffering methods above, applied to your suggested accounting
structures, likely work well enough that the overhead of processing
unused portions of the bitmap can be ignored. I don't really want to
address performance issues in this code beyond effective or actual
nontermination, and would rather leave highly efficient methods to
oprofile. (In fact, some others believe that even bugfixes for such
issues should be ignored for kernel/profile.c, contrary to my notion
that it shouldn't crash systems regardless of their size.)
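
To make the two ideas in this exchange concrete (folding in all the
hits for one cpu at a time, and a per-cpu bitmap marking regions that
might have any hits), here is a rough userspace model. It is only a
sketch under stated assumptions, not code from kernel/profile.c or from
any of the patches discussed: every name and size in it (cpu_hits,
record_hit, flush_cpu, NR_SLOTS, REGION_SLOTS and so on) is
hypothetical, and it ignores locking, interrupts and real per-cpu
allocation entirely.

```c
/* Userspace model only; all names and sizes here are hypothetical. */
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NR_CPUS       4
#define NR_SLOTS      4096   /* entries in the modelled profile buffer */
#define REGION_SLOTS  64     /* slots covered by one bitmap bit */
#define NR_REGIONS    (NR_SLOTS / REGION_SLOTS)
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NR_WORDS      ((NR_REGIONS + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct cpu_hits {
    unsigned int  count[NR_SLOTS];  /* private, non-atomic hit counts */
    unsigned long dirty[NR_WORDS];  /* bit r set => region r may hold hits */
};

static struct cpu_hits per_cpu[NR_CPUS];
static unsigned long   global_hits[NR_SLOTS];

/* Record one hit; touches only the given cpu's private data. */
static void record_hit(unsigned int cpu, size_t slot)
{
    assert(cpu < NR_CPUS && slot < NR_SLOTS);
    struct cpu_hits *h = &per_cpu[cpu];
    size_t region = slot / REGION_SLOTS;

    h->count[slot]++;
    h->dirty[region / BITS_PER_WORD] |= 1UL << (region % BITS_PER_WORD);
}

/*
 * Fold one cpu's hits into the global buffer, skipping every region
 * whose bit is clear. Returns the number of regions actually scanned.
 */
static size_t flush_cpu(unsigned int cpu)
{
    struct cpu_hits *h = &per_cpu[cpu];
    size_t scanned = 0;

    for (size_t region = 0; region < NR_REGIONS; region++) {
        unsigned long *word = &h->dirty[region / BITS_PER_WORD];
        unsigned long bit = 1UL << (region % BITS_PER_WORD);

        if (!(*word & bit))
            continue;               /* no hits recorded here: skip */
        *word &= ~bit;
        scanned++;

        size_t first = region * REGION_SLOTS;
        for (size_t s = first; s < first + REGION_SLOTS; s++) {
            global_hits[s] += h->count[s];
            h->count[s] = 0;
        }
    }
    return scanned;
}

/* Process the cpus one at a time rather than interleaving them. */
static size_t flush_all(void)
{
    size_t scanned = 0;

    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        scanned += flush_cpu(cpu);
    return scanned;
}
```

In this model the work done by flush_all() is proportional to the
number of regions that recorded hits plus one pass over each cpu's
small bitmap, rather than to NR_CPUS * NR_SLOTS. Whether the bitmap
pass matters in practice is exactly the overhead the paragraph above
suggests can be ignored.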


-- wli