Ok, good.. And you found another buglet.
> Only one note on filemap_nopage:
>
> if (inode->i_op->readpage(inode, page) != 0)
> goto failure; _
> if (PageError(page)) |
> goto failure; |
> if (PageUptodate(page)) |
> goto success; |
> |
> If it was block device, page is still locked at this moment and
> flags can change at any moment (race condition!).
> It is pretty silly to check them here,
> they will be !error && !uptodate (provided we have not
> super-fast device 8))
> If it was NFS, it is correct but ugly.
> More clean solution is to use PG_error bit as bit triggering
> synchronous readpage returning error code on failure,
> rather than result of page operation and check it before readpage.
No, we actually _have_ to call readpage regardless of any old
error-status, because the error might go away (maybe the error was due to
protection problems with the previous user, but now we're another person
so..).
But you're right - this is a bug, and we just need to handle it correctly
like in generic_file_read: we need to do a "wait_on_page(page)" to make
sure the IO has actually completed (look at generic_file_read() which has
the same kind of setup, and does this right - doing a synchronous read
is not a performance problem because this happens only for the error
case anyway).
In short, we shouldn't look at PageError() as being any kind of final error
condition - it can go away by retrying. Try it with NFS, first reading a file
as root (and getting a protection error back dur to root being mapped to
"nobody" on the server side), and then a bit later reading the file as the
real owner.
Now, you may consider it a bad idea to do the readpage() on other devices
than NFS, because on them the protection state doesn't really change (all
the protection checks have already been done by the time we do the
physical read). However, even there it's the prudent thing to do - and as
this is not really in the "normal" flow of control anyway it doesn't
really hurt to try to re-read the disk in the hope that the error
condition has gone away.
[ Umm, that showed another problem: brw_page() never cleared the error
flag before the operation, so even if it _did_ succeed the next time
around we wouldn't have gotten an error-free return ]
Anyway, the code worked correctly for NFS (because the NFS code does the
readpage synchronously anyway for the error condition), and while it was
broken for disk IO it didn't really matter (because the error bit was always
set regadless of the operation).
So the #9 code is slightly wrong, but it gets the right results anyway. But
if you want to have the right code, you need to add the "wait_on_page()" to
filemap_nopage(), and add a "clear_bit(PG_error, &page->flags)" to brw_page
(the same place as we clear the uptodate flag). Could you check with
those fixes?
Linus