Re: [RFC PATCH] mm: support CONFIG_ZONE_DEVICE + CONFIG_ZONE_DMA

From: Dan Williams
Date: Tue Jan 26 2016 - 17:34:10 EST


On Tue, Jan 26, 2016 at 2:11 PM, Andrew Morton
<akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, 25 Jan 2016 16:06:40 -0800 Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>
>> It appears devices requiring ZONE_DMA are still prevalent (see link
>> below). For this reason the proposal to require turning off ZONE_DMA to
>> enable ZONE_DEVICE is untenable in the short term.
>
> More than "short term". When can we ever nuke ZONE_DMA?

I'm assuming at some point these legacy devices will die off or move
to something attached over a more capable bus like USB?

> This was a pretty big goof - the removal of ZONE_DMA whizzed straight
> past my attention, alas. In fact I never noticed the patch at all
> until I got some conflicts in -next a few weeks later (wasn't cc'ed).
> And then I didn't read the changelog closely enough.

I endeavor to never surprise you again...

To be clear, the patch did not disable ZONE_DMA by default, but it was
indeed a goof to assume that ZONE_DMA was less prevalent than it turns
out to be.

>> We want a single
>> kernel image to be able to support legacy devices as well as next
>> generation persistent memory platforms.
>
> yup.
>
>> Towards this end, alias ZONE_DMA and ZONE_DEVICE to work around needing
>> to maintain a unique zone number for ZONE_DEVICE. Record the geometry
>> of ZONE_DMA at init (->init_spanned_pages) and use that information in
>> is_zone_device_page() to differentiate pages allocated via
>> devm_memremap_pages() vs true ZONE_DMA pages. Otherwise, use the
>> simpler definition of is_zone_device_page() when ZONE_DMA is turned off.
>>
>> Note that this also teaches the memory hot remove path that the zone may
>> not have sections for all pfn spans (->zone_dyn_start_pfn).
>>
>> A user visible implication of this change is potentially an unexpectedly
>> high "spanned" value in /proc/zoneinfo for the DMA zone.
>
> Well, all these icky tricks are to avoid increasing ZONES_SHIFT, yes?
> Is it possible to just use ZONES_SHIFT=3?

Last I tried I hit this warning in mm/memory.c

#warning Unfortunate NUMA and NUMA Balancing config, growing page-frame for last_cpupid.
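
Roughly, that warning fires when the fields packed above the PG_* bits
in page->flags no longer leave room for last_cpupid, so it has to move
into its own field in struct page. A simplified sketch of that bit
budget follows. The struct and function names are hypothetical, and the
widths are plain inputs rather than values from any real config:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical description of what has to fit in page->flags.
 * Every width here is an illustrative input, not a real config value.
 */
struct flags_layout {
	unsigned int bits_per_long;     /* e.g. 64 on a 64-bit build */
	unsigned int nr_pageflags;      /* low bits used by PG_* flags */
	unsigned int sections_width;    /* section bits, may be 0 */
	unsigned int nodes_shift;       /* bits for the node id */
	unsigned int zones_shift;       /* bits for the zone number */
	unsigned int last_cpupid_shift; /* bits NUMA balancing wants */
};

/* True if last_cpupid still fits in page->flags next to everything else. */
static bool last_cpupid_fits(const struct flags_layout *l)
{
	unsigned int used;

	assert(l->bits_per_long > 0);

	used = l->nr_pageflags + l->sections_width + l->nodes_shift +
	       l->zones_shift + l->last_cpupid_shift;

	return used <= l->bits_per_long;
}
```

With a layout that is already full at zones_shift = 2, bumping it to 3
makes last_cpupid_fits() return false, which is the case the #warning
complains about.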

> Also, this "dynamically added pfn of the zone" thing is a new concept
> and I think it should be more completely documented somewhere in the
> code.

Ok, I'll take a look.
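
As a starting point for that documentation, here is a minimal sketch of
the distinction the changelog describes: pfns inside the span recorded
for ZONE_DMA at init are true DMA pages, and anything added to the
aliased zone later is treated as device memory. The struct, its fields
and the helper name are hypothetical simplifications, not the code in
the patch, and the sketch assumes the pfn is already known to belong to
the aliased zone:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the zone fields the check needs. */
struct zone_sketch {
	unsigned long zone_start_pfn;     /* first pfn of the zone */
	unsigned long init_spanned_pages; /* span recorded at init time */
};

/*
 * Returns true if @pfn lies outside the span recorded at init, i.e. it
 * was added dynamically and is assumed to be device memory.
 */
static bool pfn_is_dynamic(const struct zone_sketch *z, unsigned long pfn)
{
	unsigned long init_end;

	assert(z != NULL);

	init_end = z->zone_start_pfn + z->init_spanned_pages;

	/* Inside the init-time span: a true ZONE_DMA page. */
	if (pfn >= z->zone_start_pfn && pfn < init_end)
		return false;

	return true;
}
```

When ZONE_DMA is turned off there is no aliasing, so none of this is
needed and the simpler zone-number comparison applies.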