Re: [PATCH] lock_page() doesn't lock if __wait_on_bit_lock returns -EINTR

From: Chris Mason
Date: Mon Dec 14 2015 - 19:00:24 EST


On Mon, Dec 14, 2015 at 01:33:56PM -0500, Dave Jones wrote:
> On Sat, Dec 12, 2015 at 07:07:46PM -0500, Chris Mason wrote:
> > On Sat, Dec 12, 2015 at 11:41:26AM -0800, Linus Torvalds wrote:
> > > On Sat, Dec 12, 2015 at 10:33 AM, Linus Torvalds
> > > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > Peter, did that patch also handle just plain "lock_page()" case?
> > >
> > > Looking more at it, I think this all goes back to commit 743162013d40
> > > ("sched: Remove proliferation of wait_on_bit() action functions").
> > >
> > > It looks like PeterZ's pending patch should fix this, by passing in
> > > the proper TASK_UNINTERRUPTIBLE to the bit_wait_io function, and going
> > > back to signal_pending_state(). PeterZ, did I follow the history of
> > > this correctly?
> >
> > Looks right to me, I found Peter's patch and have it running now. After
> > about 6 hours my patch did eventually crash again under trinity. Btrfs has a
> > very old (from 2011) bug in the error handling path that trinity is
> > banging on.
>
> Is the other bug this one ? I've hit this quite a lot over the last 12 months,
> and now that the lock_page bug is fixed this is showing up again.

Linus, I'll send this in a pull request, but just to close the loop in
this thread:

From: Chris Mason <clm@xxxxxx>
Subject: [PATCH] Btrfs: check prepare_uptodate_page() error code earlier

prepare_pages() may end up calling prepare_uptodate_page() twice if our
write only spans a single page. But if the first call returns an error,
our page will be unlocked and it's not safe to call it again.

This bug goes all the way back to 2011, and it's not something commonly
hit.

While we're here, add a more explicit check for the page being truncated
away. The bare lock_page() alone is protected only by good thoughts and
i_mutex, which we're sure to regret eventually.

Reported-by: Dave Jones <dsj@xxxxxx>
Signed-off-by: Chris Mason <clm@xxxxxx>
---
 fs/btrfs/file.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 72e7346..0f09526 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1291,7 +1291,8 @@ out:
  * on error we return an unlocked page and the error value
  * on success we return a locked page and 0
  */
-static int prepare_uptodate_page(struct page *page, u64 pos,
+static int prepare_uptodate_page(struct inode *inode,
+				 struct page *page, u64 pos,
 				 bool force_uptodate)
 {
 	int ret = 0;
@@ -1306,6 +1307,10 @@ static int prepare_uptodate_page(struct page *page, u64 pos,
 			unlock_page(page);
 			return -EIO;
 		}
+		if (page->mapping != inode->i_mapping) {
+			unlock_page(page);
+			return -EAGAIN;
+		}
 	}
 	return 0;
 }
@@ -1324,6 +1329,7 @@ static noinline int prepare_pages(struct inode *inode, struct page **pages,
 	int faili;

 	for (i = 0; i < num_pages; i++) {
+again:
 		pages[i] = find_or_create_page(inode->i_mapping, index + i,
 					       mask | __GFP_WRITE);
 		if (!pages[i]) {
@@ -1333,13 +1339,17 @@ static noinline int prepare_pages(struct inode *inode, struct page **pages,
 		}

 		if (i == 0)
-			err = prepare_uptodate_page(pages[i], pos,
+			err = prepare_uptodate_page(inode, pages[i], pos,
 						    force_uptodate);
-		if (i == num_pages - 1)
-			err = prepare_uptodate_page(pages[i],
+		if (!err && i == num_pages - 1)
+			err = prepare_uptodate_page(inode, pages[i],
 						    pos + write_bytes, false);
 		if (err) {
 			page_cache_release(pages[i]);
+			if (err == -EAGAIN) {
+				err = 0;
+				goto again;
+			}
 			faili = i - 1;
 			goto fail;
 		}
--
2.4.6
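
For anyone who wants to poke at the control flow outside the kernel, here
is a rough userspace sketch of the single-page case described in the
commit message. It is not the btrfs code. struct toy_page,
toy_prepare_uptodate_page(), toy_prepare_single_page() and toy_demo() are
made-up names, the TOY_* constants are local stand-ins for the errno
values, and the "locked" flag only imitates the contract from the comment
in the patch (unlocked page on error, locked page on success).

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the kernel errno values; not the real headers. */
enum { TOY_EIO = 5, TOY_EAGAIN = 11 };

/* Hypothetical toy page: only the state this sketch needs. */
struct toy_page {
	bool locked;
	int eio_left;		/* upcoming calls that fail with -TOY_EIO */
	int eagain_left;	/* upcoming calls that see a truncated page */
	int unlocked_calls;	/* times the helper was handed an unlocked page */
};

/*
 * Same contract as the comment in the patch: on error the page comes
 * back unlocked, on success it comes back locked and we return 0.
 */
static int toy_prepare_uptodate_page(struct toy_page *p)
{
	if (!p->locked) {
		p->unlocked_calls++;	/* the unsafe second call */
		return -TOY_EIO;
	}
	if (p->eagain_left > 0) {
		p->eagain_left--;
		p->locked = false;
		return -TOY_EAGAIN;
	}
	if (p->eio_left > 0) {
		p->eio_left--;
		p->locked = false;
		return -TOY_EIO;
	}
	return 0;
}

/* One-page write: the page is both the first and the last page. */
static int toy_prepare_single_page(struct toy_page *p, bool check_err_early)
{
	int err;

again:
	p->locked = true;	/* stands in for find_or_create_page() */
	err = toy_prepare_uptodate_page(p);		/* i == 0 */
	if (!check_err_early || !err)
		err = toy_prepare_uptodate_page(p);	/* i == num_pages - 1 */
	if (err == -TOY_EAGAIN)
		goto again;	/* page went away, look it up again */
	return err;
}

/* Runs the three scenarios; returns 0 if they behave as described. */
static int toy_demo(void)
{
	struct toy_page old_flow = { .eio_left = 1 };
	struct toy_page new_flow = { .eio_left = 1 };
	struct toy_page truncated = { .eagain_left = 2 };
	int err_old = toy_prepare_single_page(&old_flow, false);
	int err_new = toy_prepare_single_page(&new_flow, true);
	int err_trunc = toy_prepare_single_page(&truncated, true);

	/* Old flow: the second call runs on an already unlocked page. */
	assert(err_old == -TOY_EIO && old_flow.unlocked_calls == 1);
	/* New flow: the error is seen first, so there is no second call. */
	assert(err_new == -TOY_EIO && new_flow.unlocked_calls == 0);
	/* Truncation: retried until the lookup returns a usable page. */
	assert(err_trunc == 0 && truncated.locked);
	assert(truncated.unlocked_calls == 0);
	return 0;
}
```

Calling toy_demo() from a main() exercises all three paths. With
check_err_early set to false the helper is handed an unlocked page once,
which is the situation the "!err &&" check in the patch avoids.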
