On Tue, 30 May 2006 12:54:53 +1000
Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:
I guess so. Is plugging still needed now that the IO layer should
get larger requests? Disabling it might result in a small initial
request (although even that may be good for pipelining)...
Mysterious question, that. A few years ago I think Jens tried pulling unplugging
out, but some devices still want it (magneto-optical storage iirc). And I
think we did try removing it, and it caused hurt.
Otherwise, we could make set_page_dirty_lock use a weird non-unplugging
variant
Yes, that would work. In fact the number of times when direct-io actually
calls set_page_dirty_lock() is infinitesimal - I had to jump through hoops
to even test that code. The speculative
set-the-page-dirty-before-we-do-the-IO thing is very effective. So the
performace impact of making such a change would be nil.
That's for the direct-io case. Other cases might be hurt more.
Also, perhaps we could poke kblockd to do it for us.
sync_page wants to get either the current mapping, or a NULL one.
The sync_page methods must then be able to handle running into a
NULL mapping.
With splice, the mapping can change, so you can have the wrong
sync_page callback run against the page.
Oh.
The ->pin() calls in pipe_to_file and pipe_to_sendpage?Well yes, writing to a page would be the main reason to set it dirty.Whereabouts?
Is splice broken as well? I'm not sure that it always has a ref on the
inode when stealing a page.
One for Jens...
But the vma isn't going to disappear because mmap_sem is held; and theIt sounds like you think fixing the set_page_dirty_lock callers wouldn'tNo, I think it's damn impossible ;)
be too difficult? I wouldn't know (although the ptrace one should be
able to be turned into a set_page_dirty, because we're holding mmap_sem).
get_user_pages() has gotten us a random pagecache page. How do we
non-racily get at the address_space prior to locking that page?
I don't think we can.
vma should hold a ref on the inode I think?
That's true during the get_user_pages() call. Be we run
set_page_dirty_lock() much later, after IO completion.
Anyway, it is possible that most of the problems could be solved by locking
the page at the time of lookup, and unlocking it on completion/dirtying...
it's just that that would be a bit of a task.
But lock_page() requires access to the address_space. To kick the IO so we
don't wait for ever.
Can we somehow add BUG_ONs to
lock_page to ensure we've got an inode ref?
WARN_ONs.