On Thu, 7 Apr 2005, Nick Piggin wrote:
> Kumar Gala wrote:
> > ptep_get_and_clear has a signature that looks something like:
> >
> > static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
> >                                        unsigned long addr, pte_t *ptep)
> >
> > It appears that it's supposed to return the pte_t pointed to by ptep
> > before it's modified. Why do we bother doing this? The caller seems
> > perfectly able to dereference ptep and hold on to it. Am I missing
> > something here?
> >
>
> You need to be able to *atomically* clear the pte and retrieve the
> old value.

Clearing the pte clears the present bit, so the CPU generates a fault
the next time this pte is referenced.

The problem with replacing pte values is that the executing code races
with the CPU's MMU, which may itself update the pte (on i386 I believe
the hardware can set bits such as accessed and dirty). If you read the
pte first and cleared it later, there would be a small window in which
the MMU could modify the pte. Those changes would go undetected because
the later write overwrites them.

Using ptep_get_and_clear ensures that this does not happen.
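
To make the window concrete, here is a rough user-space sketch, not kernel
code. The sim_* names and the bit values are made up for illustration, the
"MMU" is just a function call placed inside the window, and C11
atomic_exchange stands in for whatever atomic primitive an architecture
would really use:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
typedef _Atomic unsigned long sim_pte_t;
#define SIM_PRESENT 0x1UL
#define SIM_DIRTY   0x40UL

/*
 * Stand-in for the MMU handling a write access: if the entry is present,
 * hardware marks it dirty. A non-present entry would fault instead, so
 * nothing is written back. (Single-threaded simulation only.)
 */
static void sim_mmu_write_access(sim_pte_t *ptep)
{
    assert(ptep != NULL);
    if (atomic_load(ptep) & SIM_PRESENT)
        atomic_fetch_or(ptep, SIM_DIRTY);
}

/* Racy variant: read the entry, then clear it in a separate step. */
static unsigned long sim_read_then_clear(sim_pte_t *ptep)
{
    unsigned long old = atomic_load(ptep);

    sim_mmu_write_access(ptep);  /* hardware update lands in the window */
    atomic_store(ptep, 0UL);     /* ...and is silently overwritten here */
    return old;                  /* caller never sees the dirty bit */
}

/* Atomic variant: fetch the old value and clear in one indivisible step. */
static unsigned long sim_get_and_clear(sim_pte_t *ptep)
{
    return atomic_exchange(ptep, 0UL);
}
```

With the exchange there is no window: an MMU update either lands before
the exchange and shows up in the returned value, or it comes after, finds
the entry not present, and faults.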