Re: [patch 2.6.13-rc4] fix get_user_pages bug

From: Hugh Dickins
Date: Tue Aug 02 2005 - 12:26:52 EST


On Tue, 2 Aug 2005, Linus Torvalds wrote:
>
> Since we will have dropped the page table lock when calling
> handle_mm_fault() (which will just re-get the lock and then drop it
> again) _and_ since we don't actually mark the page dirty if it was
> writable, it's entirely possible that the VM scanner comes in and just
> drops the page from the page tables.
>
> Now, that doesn't sound so bad, but what we have then is a page that is
> marked dirty in the "struct page", but hasn't been actually dirtied yet.
> It could get written out and marked clean (can anybody say "preemptible
> kernel"?) before we ever actually do the write to the page.
>
> The thing is, we should always set the dirty bit either atomically with
> the access (normal "CPU sets the dirty bit on write") _or_ we should set
> it after the write (having kept a reference to the page).
>
> Or does anybody see anything that protects us here?
>
> Now, I don't think we can fix that race (which is probably pretty much
> impossible to hit in practice) in the 2.6.13 timeframe.

I believe this particular race has been recognized since day one of
get_user_pages, and we've always demanded that the caller must do a
SetPageDirty (I should probably say set_page_dirty) before freeing
the pages held for writing.

Which is why I was a bit puzzled to see that prior set_page_dirty
in __follow_page, which Andrew identified as for s390.

> Maybe I'll have to just accept the horrid "VM_FAULT_RACE" patch. I don't
> much like it, but..

I've not yet reached a conclusion on that,
need to think more about doing mkclean in copy_one_pte.

Hugh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/