Re: [RFC PATCH 4/6] mm, compaction: skip buddy pages by their order in the migrate scanner

From: Vlastimil Babka
Date: Wed Jun 11 2014 - 08:19:04 EST


On 06/11/2014 01:54 AM, David Rientjes wrote:
On Tue, 10 Jun 2014, Vlastimil Babka wrote:

I think the compiler is allowed to turn this into

if (ACCESS_ONCE(page_private(page)) > 0 &&
    ACCESS_ONCE(page_private(page)) < MAX_ORDER)
	low_pfn += (1UL << ACCESS_ONCE(page_private(page))) - 1;

since the inline function has a return value of unsigned long but gcc may
not do this. I think

/*
 * Big fat comment describing why we're using ACCESS_ONCE(), that
 * we're ok to race, and that this is meaningful only because of
 * the previous PageBuddy() check.
 */
unsigned long pageblock_order = ACCESS_ONCE(page_private(page));

is better.

I've talked about it with a gcc guy, and (although he didn't actually see the
code, so I may not have explained it perfectly) he says the compiler will
inline page_order_unsafe() so that there's effectively:

unsigned long freepage_order = ACCESS_ONCE(page_private(page));

and now it cannot just replace all freepage_order occurrences with new
page_private() accesses. So thanks to the inlining, the volatile qualification
propagates to where it matters. It makes sense to me, but whether that's
guaranteed by the standard or is gcc-specific, I don't know.
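
For illustration, here is a minimal userspace sketch of the pattern being
discussed. It is not the code from the patch: struct page is reduced to a
hypothetical single field, MAX_ORDER is an assumed value, ACCESS_ONCE() is a
simplified stand-in for the kernel macro (it relies on the gcc/clang
__typeof__ extension), and skip_buddy_page() and check_skip() are made-up
names. The point is that the volatile read happens once, into a local, and
every later use refers to that local.

```c
#include <assert.h>

/* Simplified stand-in for the kernel macro: force one volatile access. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

#define MAX_ORDER 11	/* assumed value, for illustration only */

/* Hypothetical reduced struct page; the real one has many more fields. */
struct page {
	unsigned long private;
};

#define page_private(page) ((page)->private)

/* Racy read of the buddy order; the caller must sanity-check the result. */
static inline unsigned long page_order_unsafe(struct page *page)
{
	return ACCESS_ONCE(page_private(page));
}

/* Hypothetical helper: skip past a free page whose order looks plausible. */
static unsigned long skip_buddy_page(struct page *page, unsigned long low_pfn)
{
	/* Read once; the checks and the shift below all use this local. */
	unsigned long freepage_order = page_order_unsafe(page);

	if (freepage_order > 0 && freepage_order < MAX_ORDER)
		low_pfn += (1UL << freepage_order) - 1;

	return low_pfn;
}

/* Usage: an order-3 page skips 2^3 - 1 pfns, an out-of-range order skips none. */
static void check_skip(void)
{
	struct page p = { .private = 3 };

	assert(skip_buddy_page(&p, 100) == 107);
	p.private = MAX_ORDER;
	assert(skip_buddy_page(&p, 100) == 100);
}
```

Whether a given compiler really keeps this to a single load is the question
the rest of this thread tries to answer by looking at the generated code.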


I hate to belabor this point, but I think gcc does treat it differently.
If you compare the assembly generated for your patch with what you get from

unsigned long freepage_order = ACCESS_ONCE(page_private(page));

instead, then if you enable annotation you'll see that gcc treats the
store as page_x->D.y.private in your patch vs. MEM[(volatile long unsigned
int *)page_x + 48B] with the above.

Hm, are you sure you compiled a version that used page_order_unsafe() and not page_order()? Because I do see:

MEM[(volatile long unsigned int *)valid_page_114 + 48B];

That's gcc 4.8.1, but our gcc guy said he tried 4.5+ and all versions behaved like this, and that it would be a gcc bug if they didn't.
He also did a test where page_order() was called twice in one function and page_order_unsafe() twice in another function. The page_order() calls were reduced to a single access in the assembly, while the page_order_unsafe() calls remained two accesses.
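
A rough reconstruction of that kind of test, as a sketch only (the actual
test code was not posted; struct page is again a hypothetical one-field
stand-in, ACCESS_ONCE() is a simplified version of the kernel macro, and
twice_plain(), twice_unsafe() and check_twice() are made-up names):

```c
#include <assert.h>

/* Simplified stand-in for the kernel macro: force one volatile access. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical reduced struct page. */
struct page {
	unsigned long private;
};

#define page_private(page) ((page)->private)

/* Plain read: the compiler is free to merge repeated reads. */
static inline unsigned long page_order(struct page *page)
{
	return page_private(page);
}

/* Volatile read: each call should be emitted as its own access. */
static inline unsigned long page_order_unsafe(struct page *page)
{
	return ACCESS_ONCE(page_private(page));
}

/* Two plain reads; an optimizing compiler may load the field only once. */
unsigned long twice_plain(struct page *page)
{
	return page_order(page) + page_order(page);
}

/* Two volatile reads; the compiler is expected to keep both loads. */
unsigned long twice_unsafe(struct page *page)
{
	return page_order_unsafe(page) + page_order_unsafe(page);
}

/* Single-threaded sanity check: with no concurrent writer both agree. */
void check_twice(void)
{
	struct page p = { .private = 3 };

	assert(twice_plain(&p) == twice_unsafe(&p));
}
```

The difference only shows up in the generated code, for example by comparing
the two functions in the output of gcc -O2 -S. What is actually emitted
depends on the compiler version and flags.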

I don't have the ability to prove that all versions of gcc optimization
will not choose to reaccess page_private(page) here, but it does show that
at least gcc 4.6.3 does not consider them to be equivalents.

