Re: inode->i_wb_list corruption.

From: Dave Jones
Date: Mon Mar 12 2012 - 19:26:40 EST


On Fri, Mar 09, 2012 at 12:08:07PM -0800, Keith Packard wrote:
> On Fri, 9 Mar 2012 13:00:15 -0500, Dave Jones <davej@xxxxxxxxxx> wrote:
>
> > i915_drm_thaw is a deep nest of functions though, so this is going to be
> > hard to track down where that write is coming from. Because the corruption
> > seems to happen to pages that are already allocated, we probably can't
> > even rely on DEBUG_PAGEALLOC, though it might be worth trying.
>
> I'm worried that the write is coming through the GTT, which would make
> sense as these look like pixel values. If this is on Ironlake (core
> I3-I7 first gen), we know there are issues when VT-d is enabled, and
> the work-around for that doesn't appear to be in place for the hibernate
> resume case.

Thinking about how the GTT could contain stale pointers, I came up with this scenario:

Before we begin the thaw, the initramfs sets up a framebuffer.
This causes the GTT to be set up.

- Thaw begins, and the hardware state still points to the GTT set up by the
  modesetting code. At this point, any graphics operations are going to cause
  writes through those translations. Bad news if we just wrote a bunch of
  thawed data there.

or..

- Thaw begins, and data is written over the GTT set up by the initramfs, but
  the hardware registers still point at it until thaw is complete, when we
  reprogram the GTT registers to their pre-hibernate values.

If we could somehow set modeset=0 automatically if we detect a hibernate
partition it would probably 'solve' it, but I suspect the real answer
would be to do GTT teardown before we do a thaw.
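
To make the first scenario concrete, here is a rough toy model in Python.
It is only a sketch of the idea, not i915 code: "memory" is a list of pages,
the "GTT" is a list of page indices the GPU writes through, and every name
in it (gpu_draw, thaw, run, the page contents) is made up for illustration.
It assumes the restored image happens to land on a page the boot-time
framebuffer was using.

```python
# Toy model only. None of these names correspond to real i915 or kernel APIs.

PIXEL = "pixel"

def gpu_draw(memory, gtt):
    """Simulate the GPU writing pixel data through every GTT entry."""
    for page in gtt:
        memory[page] = PIXEL

def thaw(memory, image, gtt, teardown_first):
    """Restore the hibernation image and report which restored pages got clobbered."""
    if teardown_first:
        gtt.clear()          # no translations left for the GPU to write through
    for page, data in image.items():
        memory[page] = data  # restored data may land on old framebuffer pages
    gpu_draw(memory, gtt)    # rendering that happens before the GTT is reprogrammed
    return [page for page, data in image.items() if memory[page] != data]

def run(teardown_first):
    memory = ["free"] * 8
    gtt = [2, 3]                                    # framebuffer pages set up at boot
    image = {1: "inode", 2: "inode", 5: "dentry"}   # pages the image restores
    return thaw(memory, image, gtt, teardown_first)

print(run(False))  # restored pages overwritten via the stale GTT
print(run(True))   # same thaw with the GTT torn down first
```

Without the teardown, the restored page that overlaps the old framebuffer
ends up holding pixel data; with the teardown, nothing is overwritten.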

Dave
