Re: [linux-lvm] 2.6.22-rc4 XFS fails after hibernate/resume

From: David Greaves
Date: Mon Jun 18 2007 - 15:14:44 EST


OK, just an quick ack

When I resumed tonight (having done a freeze/thaw over the suspend) some libata errors threw up during the resume and there was an eventual hard hang. Maybe I spoke to soon?

I'm going to have to do some more testing...

David Chinner wrote:
On Mon, Jun 18, 2007 at 08:49:34AM +0100, David Greaves wrote:
David Greaves wrote:
So doing:
xfs_freeze -f /scratch
sync
echo platform > /sys/power/disk
echo disk > /sys/power/state
# resume
xfs_freeze -u /scratch

Works (for now - more usage testing tonight)

Verrry interesting.
Good :)


What you were seeing was an XFS shutdown occurring because the free space
btree was corrupted. IOWs, the process of suspend/resume has resulted
in either bad data being written to disk, the correct data not being
written to disk or the cached block being corrupted in memory.
That's the kind of thing I was suspecting, yes.

If you run xfs_check on the filesystem after it has shut down after a resume,
can you tell us if it reports on-disk corruption? Note: do not run xfs_repair
to check this - it does not check the free space btrees; instead it simply
rebuilds them from scratch. If xfs_check reports an error, then run xfs_repair
to fix it up.
OK, I can try this tonight...

FWIW, I'm on record stating that "sync" is not sufficient to quiesce an XFS
filesystem for a suspend/resume to work safely and have argued that the only
safe thing to do is freeze the filesystem before suspend and thaw it after
resume. This is why I originally asked you to test that with the other problem
that you reported. Up until this point in time, there's been no evidence to
prove either side of the argument......

Cheers,

Dave.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/