On Tue, 10 Jun 2014, Vlastimil Babka wrote:
I think the compiler is allowed to turn this into
	if (ACCESS_ONCE(page_private(page)) > 0 &&
	    ACCESS_ONCE(page_private(page)) < MAX_ORDER)
		low_pfn += (1UL << ACCESS_ONCE(page_private(page))) - 1;
since the inline function has a return value of unsigned long, although gcc
may not actually do this. I think
/*
* Big fat comment describing why we're using ACCESS_ONCE(), that
* we're ok to race, and that this is meaningful only because of
* the previous PageBuddy() check.
*/
unsigned long pageblock_order = ACCESS_ONCE(page_private(page));
is better.
I've talked about it with a gcc guy and (although he didn't actually see the
code so it might be due to me not explaining it perfectly), the compiler will
inline page_order_unsafe() so that there's effectively:
unsigned long freepage_order = ACCESS_ONCE(page_private(page));
and now it cannot just replace all freepage_order occurrences with new
page_private() accesses. So thanks to the inlining, the volatile qualification
propagates to where it matters. It makes sense to me, but whether that is
guaranteed by the standard or is gcc-specific, I don't know.
I hate to belabor this point, but I think gcc does treat it differently.
If you compare the assembly generated by your patch with what you get from
unsigned long freepage_order = ACCESS_ONCE(page_private(page));
instead, and you enable annotation, you'll see that gcc treats the
access as page_x->D.y.private in your patch vs. MEM[(volatile long unsigned
int *)page_x + 48B] with the above.
I don't have the ability to prove that all versions of gcc optimization
will not choose to reaccess page_private(page) here, but it does show that
at least gcc 4.6.3 does not consider them to be equivalent.
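
For reference, the snapshot pattern discussed above can be sketched in
userspace as follows. This is only an illustration, not the actual patch:
struct fake_page and skip_free_page() are hypothetical names, MAX_ORDER is
assumed to be 11, and the ACCESS_ONCE() and page_private() definitions are
written out here as stand-ins for the kernel macros (they rely on gcc's
typeof extension).

```c
#include <assert.h>

/* Stand-in for the kernel macro: force exactly one volatile access. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Assumed value, for illustration only. */
#define MAX_ORDER 11

/* Hypothetical, stripped-down stand-in for struct page. */
struct fake_page {
	unsigned long private;
};

/* Stand-in for page_private(); expands to an lvalue so & can be applied. */
#define page_private(page) ((page)->private)

/*
 * Hypothetical helper. The order is read exactly once into a local
 * variable. A racing writer may change page->private at any time, but
 * the range check and the shift below always operate on the same
 * snapshot, so a value that passed the check is the value that is used.
 */
static unsigned long skip_free_page(struct fake_page *page,
				    unsigned long low_pfn)
{
	unsigned long freepage_order = ACCESS_ONCE(page_private(page));

	if (freepage_order > 0 && freepage_order < MAX_ORDER)
		low_pfn += (1UL << freepage_order) - 1;

	return low_pfn;
}
```

With the volatile access assigned to a local, the compiler has no license to
re-read page->private for each later use of freepage_order, which is the
property being argued for in this thread.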