[RFC] Should PAGE_RW be added to PAGE_COPY?

From: Yingchao Zhou
Date: Wed Sep 13 2006 - 10:02:56 EST



The current kernel defines PAGE_COPY without the write bit. This can cause intermittently inconsistent data for user-level network drivers such as those for InfiniBand, Quadrics and Myrinet. The problem has also been mentioned by Costin Iancu in the paper "HUNTing the Overlap" (PACT'05).
An example of this phenomenon is the following sequence:
1) register a memory region BUFF for receiving messages,
2) receive a message,
3) call mprotect(...PROT_NONE...) and then mprotect(...PROT_READ|PROT_WRITE) on BUFF,
4) write into BUFF,
5) receive again.
The data seen after the second receive may not be the data sent by the peer machine, but the data the process itself wrote in the 4th step.
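
The user-space part of this sequence can be sketched roughly as follows. This is only an illustration: registration and receive depend on the NIC library and appear here as comments, and the helper name toggle_and_write is hypothetical. On its own the code just replays the two mprotect calls and the write that takes the fault.

```c
#define _DEFAULT_SOURCE
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: replays the mprotect calls and the write into BUFF
 * on an anonymous private mapping. Returns 0 on success, -1 on failure. */
int toggle_and_write(size_t len)
{
    char *buff = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buff == MAP_FAILED)
        return -1;

    /* Touch the buffer so it is backed by real pages. */
    memset(buff, 0xAA, len);

    /* Register BUFF with the NIC and receive a message here
     * (NIC-specific, not shown). */

    /* Drop all access, then restore read/write access. */
    if (mprotect(buff, len, PROT_NONE) != 0 ||
        mprotect(buff, len, PROT_READ | PROT_WRITE) != 0) {
        munmap(buff, len);
        return -1;
    }

    /* Write into BUFF; according to the report, this is the write
     * that goes through the page fault handler. */
    buff[0] = 0x55;

    /* The process itself still sees consistent data; the problem
     * described above only shows up on the next NIC receive. */
    int ok = (buff[0] == 0x55 && (unsigned char)buff[1] == 0xAA);

    munmap(buff, len);
    return ok ? 0 : -1;
}
```

Calling toggle_and_write(4096) from a test program should return 0; observing the stale data itself requires a real registered buffer and a peer sending messages.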

The reason is:
1) The user-level network driver locks the physical pages when the memory region is registered;
2) The two mprotect calls change the ptes in the region to PAGE_COPY, so a write to any page in the region causes a page fault;
3) The page fault handler goes to do_wp_page, and there, if the page is locked, a new page is allocated and installed in the pte. So the physical page seen by the host is no longer the one seen by the NIC.
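
The effect of these three points can be imitated with a toy model in plain C. This is not kernel code: the toy_page and toy_pte structures and all function names are hypothetical, and the "fault" is just a branch that copies the page when it is marked locked, as described above.

```c
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 64

struct toy_page { int locked; char data[TOY_PAGE_SIZE]; };
struct toy_pte  { struct toy_page *page; int writable; };

/* Write one byte through the pte. A write to a non-writable pte
 * emulates the fault: a locked page is copied and the pte is
 * switched to the copy, otherwise the page is reused in place. */
static int toy_write(struct toy_pte *pte, size_t off, char val)
{
    if (!pte->writable) {
        if (pte->page->locked) {
            struct toy_page *copy = malloc(sizeof(*copy));
            if (!copy)
                return -1;
            memcpy(copy->data, pte->page->data, TOY_PAGE_SIZE);
            copy->locked = 0;
            pte->page = copy;
        }
        pte->writable = 1;
    }
    pte->page->data[off] = val;
    return 0;
}

/* Returns 1 if the host fails to see data later written by the "NIC",
 * 0 if it sees it, -1 on allocation failure. */
int toy_host_misses_nic_data(int page_locked)
{
    struct toy_page *orig = calloc(1, sizeof(*orig));
    if (!orig)
        return -1;
    orig->locked = page_locked;

    struct toy_page *nic_view = orig;   /* NIC keeps the page from registration */
    struct toy_pte pte = { orig, 0 };   /* after mprotect: no write bit */

    if (toy_write(&pte, 0, 'H') != 0) { /* host writes into BUFF */
        free(orig);
        return -1;
    }
    nic_view->data[0] = 'N';            /* second receive lands in the old page */

    int missed = (pte.page->data[0] != 'N');
    if (pte.page != orig)
        free(pte.page);
    free(orig);
    return missed;
}
```

In this model the host misses the NIC's data only when the page was locked at the time of the write, which matches the intermittent behaviour reported above.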

Adding PAGE_RW to PAGE_COPY will resolve this problem.
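
As a rough illustration of the proposal, the protection can be viewed as a set of bit flags. The TOY_ names below are hypothetical and the values are only modelled on the i386 page table bits; the real definitions live in the architecture's pgtable headers and may differ between architectures and kernel versions.

```c
/* Toy protection bits, modelled on i386 (an assumption, not kernel code). */
#define TOY_PAGE_PRESENT  0x001UL
#define TOY_PAGE_RW       0x002UL
#define TOY_PAGE_USER     0x004UL
#define TOY_PAGE_ACCESSED 0x020UL

/* Protection without the write bit, as described in this mail. */
#define TOY_PAGE_COPY \
    (TOY_PAGE_PRESENT | TOY_PAGE_USER | TOY_PAGE_ACCESSED)

/* The proposed variant with the write bit added. */
#define TOY_PAGE_COPY_RW (TOY_PAGE_COPY | TOY_PAGE_RW)

/* Returns 1 if a write through a pte with this protection would fault. */
int toy_write_faults(unsigned long prot)
{
    return (prot & TOY_PAGE_RW) == 0;
}
```

Here toy_write_faults(TOY_PAGE_COPY) is 1 and toy_write_faults(TOY_PAGE_COPY_RW) is 0, i.e. with the write bit set the write into BUFF would not go through the fault path at all.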
In my opinion, the reason for the absence of RW is to save memory by mapping all pages that are only read to ZERO_PAGE. But are there really programs that perform many reads from memory without ever initializing it?

___________________________________________________
_ Yingchao Zhou _
_ ICT, CAS _
_ (86)010-62601009 _
___________________________________________________

