Re: [PATCHv3 4/5] mm: make compound_head() robust

From: Paul E. McKenney
Date: Tue Aug 25 2015 - 17:20:08 EST

On Tue, Aug 25, 2015 at 10:46:44PM +0200, Vlastimil Babka wrote:
> On 25.8.2015 22:11, Paul E. McKenney wrote:
> > On Tue, Aug 25, 2015 at 09:33:54PM +0300, Kirill A. Shutemov wrote:
> >> On Tue, Aug 25, 2015 at 01:44:13PM +0200, Vlastimil Babka wrote:
> >>> On 08/21/2015 02:10 PM, Kirill A. Shutemov wrote:
> >>>> On Thu, Aug 20, 2015 at 04:36:43PM -0700, Andrew Morton wrote:
> >>>>> On Wed, 19 Aug 2015 12:21:45 +0300 "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx> wrote:
> >>>>>
> >>>>>> The patch introduces page->compound_head into third double word block in
> >>>>>> front of compound_dtor and compound_order. That means it shares storage
> >>>>>> space with:
> >>>>>>
> >>>>>> - page->;
> >>>>>> - page->next;
> >>>>>> - page->;
> >>>>>> - page->pmd_huge_pte;
> >>>>>>
> >>>
> >>> We should probably ask Paul about the chances that would like
> >>> to use the bit too one day?
> >>
> >> +Paul.
> >
> > The call_rcu() function does stomp that bit, but if you stop using that
> > bit before you invoke call_rcu(), no problem.
> You mean that it sets the bit 0 of during its processing?

Not at the moment, though RCU will splat if given a misaligned rcu_head
structure, because that bit might someday be used to flag callbacks
that do nothing but free memory. If RCU needs to do that (e.g., to
promote energy efficiency), then that bit might well be set during
RCU grace-period processing.
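
For illustration only, the alignment requirement can be sketched in
user-space C. The names fake_rcu_head and fake_rcu_head_aligned() are
hypothetical stand-ins, not kernel API; the point is just that an
rcu_head whose address has bit 0 clear leaves that bit free for use
as a flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rcu_head; not the kernel definition. */
struct fake_rcu_head {
	struct fake_rcu_head *next;
	void (*func)(struct fake_rcu_head *head);
};

/*
 * Return true if bit 0 of the address is clear, that is, the structure
 * is aligned well enough that bit 0 of a pointer to it could carry a flag.
 */
static bool fake_rcu_head_aligned(const void *head)
{
	return ((uintptr_t)head & 0x1) == 0;
}
```

A structure containing pointers is naturally aligned to more than one
byte, so the check only fails for a deliberately misplaced rcu_head.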

> That's
> bad news then. It's not that we would trigger that bit when the rcu_head part of
> the union is "active". It's that pfn scanners could inspect such page at
> arbitrary time, see the bit 0 set (due to RCU processing) and think that it's a
> tail page of a compound page, and interpret the rest of the pointer as a pointer
> to the head page (to test it for flags etc).

On the other hand, if you avoid scanning rcu_head structures for pages
that are currently waiting for a grace period, no problem. RCU does
not use the rcu_head structure at all except during the time between
when call_rcu() is invoked on that rcu_head structure and the time that
the callback is invoked.

Is there some other page state that indicates that the page is waiting
for a grace period? If so, you could simply avoid testing that bit in
that case.
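
A rough user-space sketch of that idea, assuming such a state exists.
Everything here is hypothetical: fake_page is not the real struct page
layout, and FAKE_PG_RCU_PENDING stands in for whatever page state would
say "waiting for a grace period". The sketch only shows the ordering:
check the state first, and interpret bit 0 as a tail-page marker only
when the word is known not to be in use by RCU.

```c
#include <stdint.h>

/* Hypothetical flag: page is between call_rcu() and callback invocation. */
#define FAKE_PG_RCU_PENDING 0x1UL

/* Hypothetical stand-in for struct page; not the kernel layout. */
struct fake_page {
	unsigned long flags;         /* hypothetical page state bits */
	unsigned long compound_head; /* shares storage with the rcu_head word */
};

/* Mark 'tail' as a tail page by storing the head pointer with bit 0 set. */
static void fake_set_tail(struct fake_page *tail, struct fake_page *head)
{
	tail->compound_head = (unsigned long)head | 1UL;
}

/*
 * Return the head page for 'page'. If the page is waiting for a grace
 * period, the shared word belongs to RCU, so bit 0 is not tested at all.
 */
static struct fake_page *fake_compound_head(struct fake_page *page)
{
	unsigned long word;

	if (page->flags & FAKE_PG_RCU_PENDING)
		return page;

	word = page->compound_head;
	if (word & 1UL)
		return (struct fake_page *)(word - 1);
	return page;
}
```

A real scanner would also have to worry about the flag and the word
changing concurrently, which this single-threaded sketch ignores.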

Thanx, Paul
