Re: [PATCH v4 1/9] introduce __pfn_t for scatterlists and pmem

From: Dan Williams
Date: Fri Jun 05 2015 - 18:13:01 EST


On Fri, Jun 5, 2015 at 2:37 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, Jun 5, 2015 at 2:19 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>> +enum {
>> +#if BITS_PER_LONG == 64
>> + PFN_SHIFT = 3,
>> + /* device-pfn not covered by memmap */
>> + PFN_DEV = (1UL << 2),
>> +#else
>> + PFN_SHIFT = 2,
>> +#endif
>> + PFN_MASK = (1UL << PFN_SHIFT) - 1,
>> + PFN_SG_CHAIN = (1UL << 0),
>> + PFN_SG_LAST = (1UL << 1),
>> +};
>
> Ugh. Just make PFN_SHIFT unconditional. Make it 2, unconditionally.
> Or, if you want to have more bits, make it three unconditionally, and
> make 'struct page' just be at least 8-byte aligned even on 32-bit.
>
> Even on 32-bit architectures, there's plenty of bits. There's no
> reason to "pack" this optimally. Remember: it's a page frame number,
> so there's the page size shifting going on in physical memory, and
> even if you shift the PFN by 3 - or four or five - bits
> unconditionally (rather than try to shift it by some minimal number),
> you're covering a *lot* of physical memory.

It is a page frame number, but page_to_pfn_t() just stores the value
of the struct page pointer directly, so we really only have the
pointer alignment bits. I do this so that kmap_atomic_pfn_t() can
optionally call kmap_atomic() if the pfn is mapped.
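
A minimal sketch of that constraint, assuming a stand-in struct page and
hypothetical helper names (pfn_t_sketch, page_to_tagged(), tagged_to_page(),
tagged_flags()). It is not the code from the patch; it only shows that when
the pointer itself is stored, the flag bits have to fit in the bits that the
pointer's alignment leaves zero:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page, forced to 8-byte alignment so
 * the low three bits of any pointer to it are always zero. */
struct page {
	unsigned long flags;
} __attribute__((aligned(8)));

/* Flag values as in the quoted enum, 64-bit case. */
enum {
	PFN_SHIFT	= 3,
	PFN_MASK	= (1UL << PFN_SHIFT) - 1,
	PFN_SG_CHAIN	= (1UL << 0),
	PFN_SG_LAST	= (1UL << 1),
	PFN_DEV		= (1UL << 2),
};

/* Hypothetical wrapper type: one word holding pointer plus flags. */
typedef struct {
	uintptr_t data;
} pfn_t_sketch;

/* Store the page pointer directly and fold the flags into its low bits. */
static pfn_t_sketch page_to_tagged(struct page *page, unsigned long flags)
{
	pfn_t_sketch pfn;

	/* Only the alignment bits are free; anything else would corrupt
	 * the pointer. */
	assert(((uintptr_t)page & PFN_MASK) == 0);
	pfn.data = (uintptr_t)page | (flags & PFN_MASK);
	return pfn;
}

/* Mask the flags off to recover the original pointer. */
static struct page *tagged_to_page(pfn_t_sketch pfn)
{
	return (struct page *)(pfn.data & ~(uintptr_t)PFN_MASK);
}

/* Extract just the flag bits. */
static unsigned long tagged_flags(pfn_t_sketch pfn)
{
	return pfn.data & PFN_MASK;
}
```

With this layout, tagged_to_page(page_to_tagged(p, PFN_SG_LAST)) returns p
unchanged, and no shift is involved at all, which is why the page-size
argument does not buy any extra bits here.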

>
> Say you're a 32-bit architecture with a 4k page size, and you lose
> three bits to "type" bits. You still have 32+12-3=41 bits of physical
> address space. Which is way more than realistic for a 32-bit
> architecture anyway, even with PAE (or PXE or whatever ARM calls it).
> Not that I see persistent memory being all that relevant on 32-bit
> hardware anyway.
>
> So I think if you actually do want that third bit, you're better off
> just marking "struct page" as being __aligned__((8)) and getting the
> three bits unconditionally. Just make the rule be that mem_map[] has
> to be 8-byte aligned.
>
> Even 16-byte alignment would probably be fine. No?
>
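
The arithmetic in the quoted paragraph can be checked with a small helper.
This is only a sketch, and phys_addr_bits() is a hypothetical name used for
illustration; it applies to the case where a real page frame number, rather
than a pointer, is stored in the word:

```c
#include <assert.h>

/*
 * Bits of physical address space reachable when a page frame number is
 * stored in a word of word_bits bits, with type_bits of that word
 * reserved for flags. The page offset (page_shift bits) is implied by
 * the pfn and so is added back.
 */
static unsigned int phys_addr_bits(unsigned int word_bits,
				   unsigned int page_shift,
				   unsigned int type_bits)
{
	return word_bits + page_shift - type_bits;
}
```

For a 32-bit word, 4k pages (page_shift of 12) and three type bits,
phys_addr_bits(32, 12, 3) gives 41, matching the figure above.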

Ooh, that's great, I was already lamenting the fact that I had run out
of bits. One of the reasons to go to 16-byte alignment is to have
another bit to further qualify the pfn as persistent memory, not just
un-mapped memory. The rationale would be to generate, and verify
proper usage of, __pmem-annotated pointers.

...but I'm still waiting for someone to tell me I'm needlessly
complicating things with a __pmem annotation [1].

[1]: https://lists.01.org/pipermail/linux-nvdimm/2015-June/001087.html