Re: 4MB pages and framebuffer access, x11perf results, 2.1.125

Jim Gettys (jg@pa.dec.com)
Fri, 30 Oct 1998 06:41:52 -0800


> > Whether it is worth doing a general solution, and what such an
> > implementation should look like, and whether more than the hack solution
> > is appropriate, I don't know off hand (and I'm completely ignorant of x86
> > paging hardware).
> > If there is interest, I can try to find out if for Digital UNIX on Alpha
> > it has been worth doing more than hack 1) (I think that we also may use
> > the large TLB entries for shared libraries, if memory serves).
> This would mean that shared libraries couldn't be swapped page-by-page but
> only as a large chunk. That sounds like a serious memory waste and IO overhead
> if only parts of the shared libraries are used. Or do I miss something here?
>
> I could imagine 4MB pages in userspace could be useful for lots of other
> applications though.

Basic system shared libraries typically never get swapped out. You arrange
that the standard set of libraries all map closely together, and map them
all with the large TLB entry. Libc and friends, for example, better never
get swapped out.

And I don't remember for sure if DU actually implements such a thing.
As I said, I'm willing to try to find out to what extent large TLB entries
have been exploited in Digital UNIX; most of my information dates from
early Alpha architecture documents, well before implementations had time
to be optimized.

The question here is what has actually turned out to be useful in practice,
balancing difficulty of implementation in a cross-platform system against
return on investment.

Mapping frame buffers statically to avoid TLB misses (one way or the other)
is old hat, was implemented to solve an observed performance problem,
with measurable benefits, and has been done for the better part of a decade;
I'm sure that is worth implementing.
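
To put rough numbers on the frame buffer case, here is a minimal sketch of
the arithmetic, not a measurement. It counts how many pages (and so how many
TLB entries) are needed to cover a frame buffer with 4KB versus 4MB x86
pages. The frame buffer size (1280x1024 at 4 bytes per pixel) and the helper
name pages_needed are assumptions chosen for illustration, and the region is
assumed to start on a page boundary.

```python
KB = 1024
MB = 1024 * KB


def pages_needed(region_bytes, page_bytes):
    """Pages (and TLB entries) needed to map a page-aligned region."""
    return -(-region_bytes // page_bytes)  # ceiling division


# Hypothetical frame buffer: 1280x1024 pixels, 4 bytes per pixel (assumed).
fb_bytes = 1280 * 1024 * 4

small = pages_needed(fb_bytes, 4 * KB)  # ordinary 4KB pages
large = pages_needed(fb_bytes, 4 * MB)  # 4MB pages

print(small, large)  # 1280 2
```

A region that does not start on a page boundary can need one more page than
this count suggests.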

What is less clear is the cost/benefit of more general schemes.

I can imagine all sorts of possible uses; how many of them have significant
performance benefit is much less clear. Real data is always better than
handwaving. Some probably do, and some probably don't, with different
amounts of implementation complexity involved for each. For frame buffers,
the case was made in the late 1980s with hard data (very much like that
presented in this thread); for more general schemes, I don't have data
at hand. I suggest that the next step for any discussion of implementing
a more general scheme is some research in the literature and community
to find out how much bang is available for how many bucks.

- Jim
