Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed hugepage to userspace

From: Hugh Dickins
Date: Mon Jul 09 2012 - 19:57:49 EST


On Mon, 9 Jul 2012, Will Deacon wrote:
> On Mon, Jul 09, 2012 at 01:25:23PM +0100, Michal Hocko wrote:
> > On Wed 04-07-12 15:32:56, Will Deacon wrote:
> > > When allocating and returning clear huge pages to userspace as a
> > > response to a fault, we may zero and return a mapping to a previously
> > > dirtied physical region (for example, it may have been written by
> > > a private mapping which was freed as a result of an ftruncate on the
> > > backing file). On architectures with Harvard caches, this can lead to
> > > I/D inconsistency since the zeroed view may not be visible to the
> > > instruction stream.
> > >
> > > This patch solves the problem by flushing the region after allocating
> > > and clearing a new huge page. Note that PowerPC avoids this issue by
> > > performing the flushing in their clear_user_page implementation to keep
> > > the loader happy, however this is closely tied to the semantics of the
> > > PG_arch_1 page flag which is architecture-specific.
> > >
> > > Acked-by: Catalin Marinas <catalin.marinas@xxxxxxx>
> > > Signed-off-by: Will Deacon <will.deacon@xxxxxxx>
> > > ---
> > > mm/hugetlb.c | 1 +
> > > 1 files changed, 1 insertions(+), 0 deletions(-)
> > >
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index e198831..b83d026 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -2646,6 +2646,7 @@ retry:
> > > goto out;
> > > }
> > > clear_huge_page(page, address, pages_per_huge_page(h));
> > > + flush_dcache_page(page);
> > > __SetPageUptodate(page);
> >
> > Does this have to be explicit in the arch independent code?
> > It seems that ia64 uses flush_dcache_page already in the clear_user_page
>
> It would match what is done in similar situations by cow_user_page (mm/memory.c)
> and shmem_writepage (mm/shmem.c). Other subsystems also have explicit page
> flushing (DMA bounce, ksm) so I think this is the right place for it.

I am not at all sure if you are right or not:
please let's consult linux-arch about this - now Cc'ed.

If this hugetlb_no_page() were solely mapping the hugepage into that
userspace, I would say you are wrong. It's the job of clear_huge_page()
to take the mapped address into account, and pass it down to the
architecture-specific implementation, to do whatever flushing is
needed - you should be providing that in your architecture.

In particular, notice how clear_huge_page() goes round a loop of
clear_user_highpage()s: in your patch, you're expecting the implementation
of flush_dcache_page() to notice whether or not this is a hugepage, and
flush the appropriate size.

Perhaps yours is the only architecture to need this on huge, and your
flush_dcache_page() implements it correctly; but it does seem surprising.

If I start to grep the architectures for non-empty flush_dcache_page(),
I soon find things in arch/arm such as v4_mc_copy_user_highpage() doing
if (!test_and_set_bit(PG_dcache_clean,)) __flush_dcache_page() - where
the naming suggests that I'm right, it's the architecture's responsibility
to arrange whatever flushing is needed in its copy and clear page functions.

But... this hugetlb_no_page() has a VM_MAYSHARE case below, which puts
the new page into page cache, making it accessible by other processes:
that may indeed be reason for flush_dcache_page() there - or a loop of
flush_dcache_page()s. But I worry then that in the !VM_MAYSHARE case
you would be duplicating expensive flushes: perhaps they should be
restricted to the VM_MAYSHARE block.

Hugh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/