Re: Fw: crash on x86_64 - mm related?

From: Andrew Morton
Date: Mon Jan 09 2006 - 00:12:25 EST


Linus Torvalds <torvalds@xxxxxxxx> wrote:
>
> On Sun, 8 Jan 2006, Ryan Richter wrote:
> >
> > Kernel BUG at mm/swap.c:49
>
> Well, it sure triggered.
>
> > Process taper (pid: 4501, threadinfo ffff8101453d8000, task ffff81017d0143c0)
> > Call Trace:<ffffffff8028c614>{sgl_unmap_user_pages+124}
> > <ffffffff8028834d>{release_buffering+27}
>
> and it's that same sgl_unmap_user_pages() that keeps on triggering it.
>
> Which was not what I was hoping for. I was hoping we'd see somebody _else_
> decrementing the page count below the map count, and get a new clue.
>
> However, the page flags you show later on (0x1c) ended up making me take
> notice of something. That's "dirty", and maybe it's from
>
> 	if (dirtied)
> 		SetPageDirty(page);
>
> in that same sgl_unmap_user_pages() routine.. And it strikes me that that
> is bogus.
>
> Code like that should use "set_page_dirty()", which does the appropriate
> callbacks to the filesystem for that page. I wonder if the bug is simply
> because the ST code just sets the dirty bit without telling anybody else
> about it...
>

It should be using set_page_dirty_lock(). As should st_unmap_user_pages().
I doubt if this would explain a refcounting problem though.
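
To make the distinction concrete, here is a minimal userspace sketch of the
three helpers. It is not kernel code: every model_* name and the flag bit
value are invented for illustration. The assumption being modelled is that
SetPageDirty() only flips the flag, set_page_dirty() goes through the
mapping's callback so the page's owner hears about it, and
set_page_dirty_lock() does the same with the page locked.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; these are NOT the kernel's definitions. */
#define MODEL_PG_DIRTY (1UL << 4)	/* hypothetical bit position */

struct model_page;

struct model_mapping {
	int notified;	/* how often the owner was told about a dirtying */
	int (*set_page_dirty)(struct model_page *page);
};

struct model_page {
	unsigned long flags;
	int locked;
	struct model_mapping *mapping;
};

/* Models SetPageDirty(): flips the flag bit and tells nobody. */
static void model_SetPageDirty(struct model_page *page)
{
	page->flags |= MODEL_PG_DIRTY;
}

/* A stand-in for a filesystem's dirtying callback. */
static int model_fs_set_page_dirty(struct model_page *page)
{
	page->flags |= MODEL_PG_DIRTY;
	page->mapping->notified++;
	return 1;
}

/* Models set_page_dirty(): the mapping's callback does the work. */
static int model_set_page_dirty(struct model_page *page)
{
	struct model_mapping *mapping = page->mapping;

	if (mapping && mapping->set_page_dirty)
		return mapping->set_page_dirty(page);
	if (page->flags & MODEL_PG_DIRTY)
		return 0;
	page->flags |= MODEL_PG_DIRTY;
	return 1;
}

/* Models set_page_dirty_lock(): same thing, with the page lock held. */
static int model_set_page_dirty_lock(struct model_page *page)
{
	int ret;

	assert(!page->locked);
	page->locked = 1;
	ret = model_set_page_dirty(page);
	page->locked = 0;
	return ret;
}
```

In this model a page dirtied through model_SetPageDirty() has the bit set
but leaves the mapping's notified count at zero, which is the "without
telling anybody else" case described above.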

Ryan, it might be worth poisoning the page pointer, to see if the
completion is being called twice:


diff -puN drivers/scsi/st.c~a drivers/scsi/st.c
--- devel/drivers/scsi/st.c~a 2006-01-08 21:11:47.000000000 -0800
+++ devel-akpm/drivers/scsi/st.c 2006-01-08 21:12:13.000000000 -0800
@@ -4482,11 +4482,12 @@ static int sgl_unmap_user_pages(struct s
 		struct page *page = sgl[i].page;

 		if (dirtied)
-			SetPageDirty(page);
+			set_page_dirty_lock(page);
 		/* FIXME: cache flush missing for rw==READ
 		 * FIXME: call the correct reference counting function
 		 */
 		page_cache_release(page);
+		sgl[i].page = NULL;
 	}

 	return 0;
_
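
The idea behind the poisoning can be sketched in userspace C. This is only a
model under stated assumptions: the sketch_* names and the reference counter
are hypothetical, and where the patched driver would presumably fault on the
NULL pointer during a second call, the model counts the cleared slots instead
so the effect can be observed without crashing.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; NOT the kernel's struct page or scatterlist. */
struct sketch_page {
	int count;	/* models the page reference count */
};

struct sketch_sg {
	struct sketch_page *page;
};

/*
 * Drop one reference per slot and clear the slot afterwards.
 * Returns the number of slots that were already cleared, i.e. the
 * evidence that this list has been unmapped before.
 */
static int sketch_unmap_user_pages(struct sketch_sg *sgl, int nr_pages)
{
	int i, already_cleared = 0;

	for (i = 0; i < nr_pages; i++) {
		struct sketch_page *page = sgl[i].page;

		if (page == NULL) {
			/* A repeat call lands here instead of dropping
			 * a second reference on the same page. */
			already_cleared++;
			continue;
		}
		assert(page->count > 0);
		page->count--;		/* models page_cache_release() */
		sgl[i].page = NULL;	/* the poisoning step */
	}
	return already_cleared;
}
```

Without the clearing step, a second call would decrement each count again,
and the damage would only be noticed later and somewhere else. With it, the
repeat is visible at the point where it happens.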
