: So the #9 code is slightly wrong, but it gets the right results anyway. But
: if you want to have the right code, you need to add the "wait_on_page()" to
: filemap_nopage(), and add a "clear_bit(PG_error, &page->flags)" to brw_page
: (the same place as we clear the uptodate flag). Could you check with
: those fixes?
Checked, it works.
I'd like to clarify my viewpoint. Hope, it can be useful.
An unlocked page can be in 4 states:
1. just inserted to page cache, not uptodate and not error,
we MUST call readpage before going to sleep the first time.
readpage handler MUST lock page or move it into one of 2-4 states
before going to sleep.
2. uptodate - OK, error flag is impossible.
3. not uptodate and not error - fatal error
4. not uptodate and error - transient error
If readpage handler does not understand error bit (currently,
buffer.c), it does not touch it and state 4 is impossible.
It is natural for block devices, that have no permission restrictions
and make retries on hardware driver level.
(It can be problem if someone will make encrypted loopback device
with per user access restrictions, it should work as NFS does)
More smart readpage handlers (currently, only NFS) lock page and then:
- if page was marked by error flag, make synchronous request,
wait for operation completion and in the case of failure return
correct error code and set error flag again.
- if page was not marked by error flag, make asynchronous request and
return 0. When operation will be completed (failed) appropriate
callback will move page to state 2 or 4 and unlock it.
Really, it is the way that NFS works now.
Do you understand now, why I proposed to rename PG_error to PG_synchronous?
I am sure that bugs in filemap.c would be impossible, if this bit
were not called PG_error 8)8)
Well, if not PG_synchronous, then PG_transient_error.
Alexey Kuznetsov.