Re: quicklists confuse meminfo

From: Jeremy Fitzhardinge
Date: Mon Mar 10 2008 - 13:32:11 EST


Hugh Dickins wrote:
On Mon, 10 Mar 2008, Andi Kleen wrote:
Christoph Lameter <clameter@xxxxxxx> writes:
Zeroed pages however will not address the issue of having initialized pgd (which seems to be what i386 needs).
pgd is tiny on i386 PAE (4 * 16 bytes). Are you sure reinitializing that
is a serious issue? ...

It used to be tiny (32 aligned bytes), then 2.6.22's quicklist enlarged
that to a whole (lowmem) page. I think we were all too busy with other
stuff to protest loudly enough about that bloat.

Yes, I was surprised about the pgd page allocation. It's currently necessary for Xen (but I think I can fix that easily enough). But I was especially surprised it was imposed for everyone.

If the quicklists are going, it'd be good for PAE to go back to a
kmem_cache of 32-byte entries as in 2.6.21 - I think Ingo's patch is
still using a whole page there.

+1. We'd still need to maintain a list to link all the pgds together, but I think we can just allocate that out of the cache too (either the same object or a separate pgd list cache).

Or have sl?b alignment changes, or virtualization issues (locking
per underlying struct page?), made a kmem_cache awkward there now?
I think only Xen has a constraint. At the moment we rely on page-sized pgds for two things:

1. Xen marks the whole page as being of pgd-type, and so it can't
have non-pgd contents (as the page would be RO, and any contents
would be validated as pgd entries). However I can fix that by
maintaining a separate per-cpu pgd page, and just copy the four
entries over when cr3 is reloaded. This would move the
Xen-specific requirements into the Xen code without affecting the
rest of the kernel.
2. We still need to maintain a list of pgds, as I discussed above.

So from my perspective there are no insoluble problems, and I'd fully support the transition.

I've been working on unifying the pgalloc stuff, and the quicklist (i386) vs non-quicklist (x86-64) use has been a bit of a thorn in my side. Of course there'll still be the page-sized pgd vs non-page-sized pgd, but we just have to live with that.

J
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/