Re: problem in follow_hugetlb_page on ppc64 architecture with get_user_pages

From: David Gibson
Date: Tue Nov 06 2007 - 23:23:56 EST


On Tue, Nov 06, 2007 at 04:06:04PM +0100, Hoang-Nam Nguyen wrote:
> Hello Roland!
> > We currently see this when testing Infiniband on ppc64 with ehca +
> > hugetlbfs.
> > From reading the code this should also be an issue on other architectures.
> > Roland, Adam, are you aware of anything in this area with mellanox
> > Infiniband cards or other usages with I/O adapters?
> Below is a testcase demonstrating this problem. You need to install
> libhugetlbfs.so and run it as below:
> HUGETLB_MORECORE=yes LD_PRELOAD=libhugetlbfs.so ./hugetlb_ibtest 100
>
> This testcase does the following steps (high level desc):
> 1. malloc two buffers each of 100MB for send and recv
> 2. register them as memory regions
> 3. create queue pair QP
> 4. send data in send buffer using QP to itself (target is then recv buffer)
> 5. compare the contents of the two buffers
>
> It runs fine without libhugetlbfs. If you call it with libhugetlbfs as
> above, step 5 will fail. If you do memset() of the buffers before step 2
> (register mr), then it runs without errors.
> It appears that hugetlb_cow() is called when the first write access is
> performed after the MRs have been registered. That means the testcase is
> seeing different pages than the ones registered to the adapter...
>
> I was able to reproduce this with mthca on 2.6.23/ppc64 and fc6/intel.

We should cut this down to the bare essentials and fold it into the
libhugetlbfs testsuite.
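
A rough sketch of the control flow a reduced test might follow, based
only on the five steps and the memset() workaround described above.
Everything named here is hypothetical: register_region() and
loopback_transfer() are userspace stand-ins (a real test would register
the buffers with the adapter and send through a QP), so this sketch
cannot reproduce the kernel problem by itself. It only shows where the
pre-fault sits relative to registration and the first write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for memory-region registration. A real test
 * would pin the pages through the adapter's verbs interface here. */
int register_region(void *buf, size_t len)
{
    (void)buf;
    (void)len;
    return 0;
}

/* Hypothetical stand-in for "send to self via the QP": a plain copy. */
void loopback_transfer(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
}

/* Return the offset of the first differing byte, or -1 if equal. */
long first_mismatch(const unsigned char *a, const unsigned char *b,
                    size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (a[i] != b[i])
            return (long)i;
    return -1;
}

/* Run steps 1-5. If prefault is nonzero, memset() both buffers before
 * registration, which is the workaround reported in the thread.
 * Returns 0 if the buffers match, 1 on mismatch, -1 on setup failure. */
int run_case(size_t len, int prefault)
{
    unsigned char *send = malloc(len);   /* step 1 */
    unsigned char *recv = malloc(len);
    int rc = -1;

    if (!send || !recv)
        goto out;

    if (prefault) {                      /* touch pages before step 2 */
        memset(send, 0, len);
        memset(recv, 0, len);
    }

    if (register_region(send, len) || register_region(recv, len))
        goto out;                        /* step 2 */

    /* First write after registration: the access that reportedly
     * ends up in hugetlb_cow() when the pages were not pre-faulted. */
    for (size_t i = 0; i < len; i++)
        send[i] = (unsigned char)(i * 31 + 7);

    loopback_transfer(recv, send, len);  /* steps 3-4, stubbed */

    rc = (first_mismatch(send, recv, len) == -1) ? 0 : 1;  /* step 5 */
out:
    free(send);
    free(recv);
    return rc;
}
```

With the stubs as written, run_case() returns 0 for both values of
prefault. The interesting case only appears once the stubs are replaced
by real registration and a real transfer, and the binary is run with
HUGETLB_MORECORE=yes and LD_PRELOAD=libhugetlbfs.so as in the original
report.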

--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson