Re: [PATCH] MM: Support more pagesizes for MAP_HUGETLB/SHM_HUGETLB v6

From: Andi Kleen
Date: Mon Oct 22 2012 - 22:28:36 EST

On Tue, Oct 23, 2012 at 12:44:24PM +1100, Benjamin Herrenschmidt wrote:
> On Mon, 2012-10-22 at 17:53 +0200, Michael Kerrisk (man-pages) wrote:
> > This is all seems to make an awful muck of the API...
> .../...
> > There seems to be a reasonable argument here for an mmap3() with a
> > 64-bit flags argument...
> I tend to agree. There's a similar issue happening when we try to shovel

Could you comment on the expected range of page sizes on PPC?

I looked at this again and I don't think we have anywhere near 28 flags
in use so far. The man page currently lists only 16 (including
MAP_UNINITIALIZED).

So I don't see why I can't have 6 bits from that.

I have no idea why the MAP_UNINITIALIZED flag was put into this strange
location anyways instead of directly after the existing flags or just
into one of the unused slots.

I suppose I could put my bits before it; there's plenty of space.

Existing flags on x86:

#define MAP_SHARED 0x01 /* Share changes */
#define MAP_PRIVATE 0x02 /* Changes are private */

0x04 unused
0x08 unused

#define MAP_FIXED 0x10 /* Interpret addr exactly */
#define MAP_ANONYMOUS 0x20 /* don't use a file */

0x40 unused

#define MAP_GROWSDOWN 0x0100 /* stack-like segment */

0x200 unused
0x400 unused

#define MAP_DENYWRITE 0x0800 /* ETXTBSY */
#define MAP_EXECUTABLE 0x1000 /* mark it as an executable */
#define MAP_LOCKED 0x2000 /* pages are locked */
#define MAP_NORESERVE 0x4000 /* don't check for reservations */
#define MAP_POPULATE 0x8000 /* populate (prefault) pagetables */
#define MAP_NONBLOCK 0x10000 /* do not block on IO */
#define MAP_STACK 0x20000 /* give out an address that is best suited for process/thread stacks */
#define MAP_HUGETLB 0x40000 /* create a huge page mapping */

/* all free here: 6 bits for me? 0x80000..0x1000000 */

#define MAP_UNINITIALIZED 0x4000000 /* For anonymous mmap, memory could be uninitialized */

/* more free bits. */

Overall it seems there's no real shortage of bits.

> things into protection bits, like we do with SAO (strong access
> ordering) and want to do with per-page endian on embedded.

mprotect already does this.

Unless someone finds a good reason why this can't work I'll just move
the range to 0x80000..0x1000000.

ak@xxxxxxxxxxxxxxx -- Speaking for myself only
