Re: [RFC PATCH] mm: support CONFIG_ZONE_DEVICE + CONFIG_ZONE_DMA

From: Dan Williams
Date: Tue Jan 26 2016 - 16:48:22 EST


On Tue, Jan 26, 2016 at 1:42 PM, Vlastimil Babka <vbabka@xxxxxxx> wrote:
> On 26.1.2016 1:06, Dan Williams wrote:
>> It appears devices requiring ZONE_DMA are still prevalent (see link
>> below). For this reason the proposal to require turning off ZONE_DMA to
>> enable ZONE_DEVICE is untenable in the short term. We want a single
>> kernel image to be able to support legacy devices as well as next
>> generation persistent memory platforms.
>>
>> Towards this end, alias ZONE_DMA and ZONE_DEVICE to work around needing
>> to maintain a unique zone number for ZONE_DEVICE. Record the geometry
>> of ZONE_DMA at init (->init_spanned_pages) and use that information in
>> is_zone_device_page() to differentiate pages allocated via
>> devm_memremap_pages() vs true ZONE_DMA pages. Otherwise, use the
>> simpler definition of is_zone_device_page() when ZONE_DMA is turned off.
>>
>> Note that this also teaches the memory hot remove path that the zone may
>> not have sections for all pfn spans (->zone_dyn_start_pfn).
>>
>> A user visible implication of this change is potentially an unexpectedly
>> high "spanned" value in /proc/zoneinfo for the DMA zone.
>
> [+CC Joonsoo, Laura]
>
> Sounds like quite a hack :(

Indeed...

> Would it be possible to extend the bits encoding
> zone? Potentially, ZONE_CMA could be added one day...

Not without impacting the ability to quickly look up the NUMA node and
parent section for a page. See ZONES_WIDTH, NODES_WIDTH, and
SECTIONS_WIDTH.
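
To make the bit budget concrete, here is a rough userspace sketch of how
the node and zone share the top of the page flags word. The widths,
WORD_BITS, and the helpers pack_flags(), flags_to_nid(),
flags_to_zonenum() and flag_bits_left() are illustrative assumptions for
this mail, not the kernel's definitions; real values depend on the
config, and SECTIONS_WIDTH is simply taken as 0 here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative widths only; the real values are config dependent.
 * Layout modelled here, from the top of a 64-bit word:
 *   | NODE | ZONE | ... remaining page flag bits ... |
 * SECTIONS_WIDTH is assumed to be 0 in this sketch.
 */
#define WORD_BITS      64u
#define SECTIONS_WIDTH 0u
#define NODES_WIDTH    10u
#define ZONES_WIDTH    2u

#define NODES_PGSHIFT  (WORD_BITS - SECTIONS_WIDTH - NODES_WIDTH)
#define ZONES_PGSHIFT  (NODES_PGSHIFT - ZONES_WIDTH)

#define NODES_MASK     ((UINT64_C(1) << NODES_WIDTH) - 1)
#define ZONES_MASK     ((UINT64_C(1) << ZONES_WIDTH) - 1)

/* Hypothetical helper: build a flags word from node, zone and low flag bits. */
static uint64_t pack_flags(unsigned int nid, unsigned int zone,
			   uint64_t low_flags)
{
	assert(nid <= NODES_MASK && zone <= ZONES_MASK);
	assert((low_flags >> ZONES_PGSHIFT) == 0);
	return ((uint64_t)nid << NODES_PGSHIFT) |
	       ((uint64_t)zone << ZONES_PGSHIFT) | low_flags;
}

/* Node lookup is a single shift and mask, which is why it is cheap. */
static unsigned int flags_to_nid(uint64_t flags)
{
	return (unsigned int)((flags >> NODES_PGSHIFT) & NODES_MASK);
}

/* Zone lookup works the same way, one field further down. */
static unsigned int flags_to_zonenum(uint64_t flags)
{
	return (unsigned int)((flags >> ZONES_PGSHIFT) & ZONES_MASK);
}

/* Bits left for ordinary page flags given a candidate zone width. */
static unsigned int flag_bits_left(unsigned int zones_width)
{
	return WORD_BITS - SECTIONS_WIDTH - NODES_WIDTH - zones_width;
}
```

Every bit added to the zone field comes out of what flag_bits_left()
returns, or out of the node/section fields on configs where the word is
already full.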

My initial implementation of ZONE_DEVICE ran into this conflict once
ZONES_SHIFT went above 2, so I fell back to cannibalizing ZONE_DMA.
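
For illustration, the arithmetic behind that limit can be sketched as
below. zones_shift_for() is a hypothetical helper, not a kernel
function; it just computes the smallest number of bits able to index a
given zone count. The zone counts in the usage note assume a 64-bit x86
style configuration with DMA, DMA32, NORMAL and MOVABLE.

```c
#include <assert.h>

/*
 * Hypothetical helper: smallest number of bits that can index
 * max_nr_zones distinct zones (i.e. ceil(log2(max_nr_zones))).
 */
static unsigned int zones_shift_for(unsigned int max_nr_zones)
{
	unsigned int shift = 0;

	assert(max_nr_zones >= 1);
	while ((1u << shift) < max_nr_zones)
		shift++;
	return shift;
}
```

With four zones zones_shift_for(4) is 2, which fits. Adding a fifth
unique zone gives zones_shift_for(5) == 3, and that extra bit is the
one that collides with the node/section encoding.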