Re: WARNING: at mm/page-writeback.c:1990__set_page_dirty_nobuffers+0x13a/0x170()
From: Hugh Dickins
Date: Sat Jun 02 2012 - 03:20:38 EST
On Fri, 1 Jun 2012, Linus Torvalds wrote:
> On Fri, Jun 1, 2012 at 9:40 PM, Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> >
> > Move the lock after the loop, I think you meant.
>
> Well, I wasn't sure if anything inside the loop might need it. I don't
> *think* so, but at the same time, what protects "page_order(page)"
> (or, indeed PageBuddy()) from being stable while that loop content
> uses them?
Yes, I believe you're right, page_order(page) could supply nonsense
if it's not stabilized under zone->lock along with PageBuddy(page).
Though if this rescue_unmovable_pageblock() is just best-effort,
with a little more care we can probably avoid the lock in there.
>
> I don't understand that code at all. It does that crazy iteration over
> page, and changes "page" in random ways,
I don't think they're random ways: when buddy it uses the order to skip
that block, otherwise it goes page by page, considering a free (I guess
on pcp) page or an lru page as good for movable.
> and then finishes up with a
> totally new "page" value that is some random thing that is *after* the
> end_page thing. WHAT?
>
> The code makes no sense. It tests all those pages within the
> page-block, but then after it has done all those tests, it does the
> final
>
> set_pageblock_migratetype(..)
> move_freepages_block(..)
>
> using a page that is *beyond* the pageblock (and with the whole
> page_order() thing, who knows just how far beyond it?)
I totally missed that, thank goodness you did not. Yes, it's rubbish.
It goes to this effort to find a suitable pageblock, then chooses the
next one instead (or possibly another). Perhaps it would get even
better results using a random number generator in there.
>
> It looks entirely too much like random-monkey code to me.
Presumably it should be passing start_page instead of page
to set_pageblock_migratetype() and move_freepages_block().
But this does seem to be code of the kind, that the longer you look
at it, the more bugs you find. And I worry about what trouble it
might then cause, if it actually started to work in the way it was
intending. I don't think fixing it up is wise for -rc1.
Commit 5ceb9ce6fe9462a298bb2cd5c9f1ca6cb80a0199
("mm: compaction: handle incorrect MIGRATE_UNMOVABLE type pageblocks")
appears to revert cleanly, and I'm running with it reverted now.
I'm not saying it shouldn't come back later, but does anyone see
an argument against reverting it now?
Hugh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/