Re: [patch] jiffies wraparound [Re: 2.1.125 Show stopper list: Draft]

Jim Gettys (jg@pa.dec.com)
Wed, 28 Oct 1998 13:03:05 -0800


Linus, believe the claim. TLB misses can make a significant performance
difference for graphics on lots of architectures; this is particularly
true on architectures where TLB misses invoke code to do the fill.

Here is an example. Imagine you are drawing a vertical line (a common
operation). On a 32-bit pixmap 1280 pixels wide (screen size), you will
be touching addresses that are 5120 bytes (1280 * 4) apart; this puts each
reference in a separate page (if the page size is 4096 bytes). As soon as
the line is longer than the number of TLB entries you have, you take one
TLB miss per pixel drawn. Ugh... You can spend much or most of your time
in TLB misses in pathological cases (the pathology is related to the
number of entries you have).
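
To make the arithmetic concrete, here is a rough sketch in C of the case
above. It is a toy model, not a description of any real hardware: the
names (toy_tlb, vline_misses_last_pass) are made up for illustration, and
the fully associative LRU TLB is an assumption. It redraws the same
vertical line several times and counts the misses in the last pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MAX_TLB_ENTRIES = 256 };

/* Toy fully associative TLB with LRU replacement. This is an assumption
 * for illustration; real TLBs differ in size, associativity and policy. */
struct toy_tlb {
    uint64_t page[MAX_TLB_ENTRIES];
    uint64_t last_use[MAX_TLB_ENTRIES];
    size_t   entries;   /* capacity being modeled */
    size_t   valid;     /* slots filled so far */
    uint64_t clock;     /* logical time, used for LRU ordering */
};

static void toy_tlb_init(struct toy_tlb *t, size_t entries)
{
    assert(entries > 0 && entries <= MAX_TLB_ENTRIES);
    t->entries = entries;
    t->valid = 0;
    t->clock = 0;
}

/* Look up one page number. Returns 1 on a miss, 0 on a hit. */
static int toy_tlb_access(struct toy_tlb *t, uint64_t page)
{
    size_t i, victim = 0;

    t->clock++;
    for (i = 0; i < t->valid; i++) {
        if (t->page[i] == page) {
            t->last_use[i] = t->clock;
            return 0;
        }
    }
    if (t->valid < t->entries) {
        victim = t->valid++;            /* fill an empty slot */
    } else {
        for (i = 1; i < t->valid; i++)  /* evict least recently used */
            if (t->last_use[i] < t->last_use[victim])
                victim = i;
    }
    t->page[victim] = page;
    t->last_use[victim] = t->clock;
    return 1;
}

/* Draw the same vertical line (column 0) `passes` times and return the
 * number of TLB misses taken during the final pass. */
size_t vline_misses_last_pass(size_t width_px, size_t bytes_per_px,
                              size_t height_px, size_t page_size,
                              size_t tlb_entries, int passes)
{
    struct toy_tlb tlb;
    size_t misses = 0;

    toy_tlb_init(&tlb, tlb_entries);
    for (int p = 0; p < passes; p++) {
        misses = 0;
        for (size_t y = 0; y < height_px; y++) {
            /* One pixel per scanline: stride is width * bytes/pixel. */
            uint64_t addr = (uint64_t)y * width_px * bytes_per_px;
            misses += (size_t)toy_tlb_access(&tlb, addr / page_size);
        }
    }
    return misses;
}
```

In this model, with a 1280-pixel-wide 32-bit pixmap, 4096-byte pages and
64 entries, a 32-pixel line redrawn a second time takes no misses, while a
100-pixel line misses on every pixel of every pass.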

This was modeled carefully during the Alpha design, in reaction to
significant graphics performance problems measured on the MIPS R2000. It is
one of the reasons that large TLB entries started appearing in a number of
architectures. I don't remember the exact details, but a very significant
amount of our CPU time was going to TLB misses. It depended strongly upon
the graphics primitive being executed (for example, a vertical line marches
through memory at a great rate). This was particularly true for dumb frame
buffer implementations. It comes up today more often when you are doing
graphics operations to off-screen memory, where typical graphics
accelerator chips are unable to work (at the time, it was also a problem on
screen, as dumb frame buffers were common). If you are interested further,
I can talk to Joel McCormack and dredge up more details.

Large entries have also been very useful to avoid misses in shared libraries.

But this has nothing to do with the basic page size reported by the
kernel, so I agree with your statement that it has nothing to do with
the basic page table size and that exporting it is very silly.

It is something that only a loader, or a graphics device driver,
might even want or need to know. But even this is questionable.

If I'm implementing graphics in the CPU software, I might want to be able
to specially get my hands on the right entry to map where I'm munching
on bytes.

Best of all, however, would be if the system could detect such
strange behavior; but this can be more trouble than it is worth.

I think the madvise() call already covers this case quite well; if the
system is advised that a large block of memory is MADV_RANDOM or
MADV_SEQUENTIAL, as appropriate, then it might try to use a large entry.
So this is how/where it is at all visible to users, and there, it is only
advice. And having the OS support these can be a significant win.
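
For what it is worth, the user-visible side could look roughly like the
sketch below. The helper name alloc_advised is made up for illustration.
Only mmap() and madvise() are real calls, and whether a given kernel
responds to the hint by using a large entry is entirely up to that kernel;
nothing here requests one explicitly.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE     /* for madvise() and MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map an anonymous buffer (say, an off-screen
 * pixmap) and tell the kernel how we expect to walk it. Returns the
 * mapping, or NULL if mmap() fails. */
void *alloc_advised(size_t len, int advice)
{
    void *p;

    assert(len > 0);
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* This is only advice: the buffer is usable whether or not the
     * kernel accepts or acts on the hint, so a failure is ignored. */
    (void)madvise(p, len, advice);
    return p;
}
```

A caller would pass MADV_SEQUENTIAL or MADV_RANDOM as the advice, use the
buffer as usual, and release it with munmap() when done.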

- Jim

--
Jim Gettys
Digital Industry Standards and Consortia
Compaq Computer Corporation
Visiting Scientist, World Wide Web Consortium, M.I.T.
http://www.w3.org/People/Gettys/
jg@w3.org, jg@pa.dec.com

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/