Re: [PATCH v4 2/3] common: DMA-mapping: add DMA_ATTR_NOHUGEPAGE attribute

From: Doug Anderson
Date: Fri Jan 08 2016 - 18:05:22 EST


On Fri, Jan 8, 2016 at 5:35 AM, Russell King - ARM Linux
<linux@xxxxxxxxxxxxxxxx> wrote:
> On Thu, Jan 07, 2016 at 04:36:44PM -0800, Douglas Anderson wrote:
>> This patch adds the DMA_ATTR_NOHUGEPAGE attribute to the DMA-mapping
>> subsystem.
> It would be nicer to keep things in the positive-logic sense if at all
> possible: flags that indicate "we don't want something" tend to end up
> with double or triple negatives somewhere which makes understanding
> the code much harder. It's a shame we have MADV_NOHUGEPAGE...
> That's not a strong view, but a preference.

Thanks for your thoughts. I agree that double-negatives can be confusing...

IMHO There are two problems with changing to DMA_ATTR_HUGE_PAGE, though:

1. I have to go and touch all existing DMA-mapping code to set
DMA_ATTR_HUGE_PAGE. That will be a big patchset and touch more code,
making it more likely to break something. If I have to do that I can,
but I prefer not to because changing defaults like this tends to make
for subtle bugs. One other thing to think about is that it's my
understanding that a large chunk of the ARM developers out there are
working on various differing versions of the kernel. If you've got
drivers in your tree that haven't been patched to account for the new
default but you pick my patch you won't get a compile time
error--you'll get a very subtle performance regression. Maybe we
don't care too much about those out-of-tree and old kernel folks, but
it's something to think about.

2. Personally I think of this attribute like a tristate with 3 values:

A) Do whatever you think best. Maybe allocate huge pages if it's
super easy and there are lots of huge pages around. AKA similar to
today's behavior.

B) Try extra hard to get huge pages. Maybe do some extra sorting of
pages to build up big ones. Maybe do a little extra collection.
Something like that.

C) Don't waste even en extra CPU cycle getting huge pages. I don't need them.

Encoding a tristate with bitfields basically means you need two
opposite bit definitions: DMA_ATTR_NO_HUGE_PAGE and DMA_ATTR_HUGE_PAGE
(and specifying both is an error case).

Right now I only need two states of the theoretical tristate: A) and
C). I could add "DMA_ATTR_HUGE_PAGE" in my patch, but I have no
patches queued up to use it. As Christoph Hellwig has reinforced in
his reply, folks tend not to want unused code in the kernel, so my
thoughts are that someone can add this second attribute when they have
a use for it.