Re: swsusp is at it... again

From: fchabaud@free.fr
Date: Tue Mar 05 2002 - 11:18:19 EST


On 5 Mar, Pavel Machek wrote:
> Hi!
>
> After about 20 resume cycles (compiled kernel with swsusp making
> machine suspend/resume) I got that nasty FS corruption, again.
>
> So...
>
> 1) Maybe your ext3 patches are not at fault.

I suspect all this comes from a failed suspend followed by an immediate
resume. I have re-enabled your panic! I believe that if a task isn't
stopped and the suspend is aborted (calling thaw_process and so on),
something gets altered. Maybe resume implicitly assumes a state that is
not completely reached when a task cannot be stopped.

I also changed the task-stopping code to stop normal tasks first and
then kernel threads (I had to add a new PF_KERNTHREAD flag). Perhaps the
bug has to do with the *order* in which processes are stopped. I thought
of that because kernel messages are written to log files: what happens
if the kjournald thread is stopped while a task is still writing?
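
To make the ordering concrete, here is a rough user-space sketch of the
two-pass idea, not the actual patch. Only PF_KERNTHREAD comes from the
change described above; struct task, PF_STOPPED, stop_pass() and
stop_all() are hypothetical names invented for the illustration, and the
flag values are arbitrary.

```c
#include <stddef.h>

/* Hypothetical stand-ins for kernel structures. Only the PF_KERNTHREAD
 * name comes from the message; the values and PF_STOPPED are invented. */
#define PF_KERNTHREAD 0x1u   /* task is a kernel thread */
#define PF_STOPPED    0x2u   /* task has been stopped (hypothetical) */

struct task {
    const char *name;
    unsigned int flags;
    int stop_order;          /* 0 = not stopped yet, else 1, 2, 3, ... */
};

/* Stop every not-yet-stopped task whose PF_KERNTHREAD bit equals
 * want_kthread. Returns the next free sequence number. */
static int stop_pass(struct task *tasks, size_t n,
                     unsigned int want_kthread, int next)
{
    for (size_t i = 0; i < n; i++) {
        unsigned int is_kthread = tasks[i].flags & PF_KERNTHREAD;

        if (is_kthread != want_kthread || (tasks[i].flags & PF_STOPPED))
            continue;
        tasks[i].flags |= PF_STOPPED;
        tasks[i].stop_order = next++;
    }
    return next;
}

/* Two passes: normal tasks first, kernel threads second, so that a
 * thread such as kjournald is still running while normal tasks may
 * write. Returns the number of tasks stopped. */
static int stop_all(struct task *tasks, size_t n)
{
    int next = 1;

    next = stop_pass(tasks, n, 0, next);
    next = stop_pass(tasks, n, PF_KERNTHREAD, next);
    return next - 1;
}
```

With a table holding kjournald (flagged PF_KERNTHREAD) plus two normal
tasks, stop_all() stops the two normal tasks before kjournald, whatever
their position in the table.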
 
> 2) Be careful using swsusp patch. Real careful.
>
> 3) Don't trust fsck. With this kind of corruption, e2fsck 1.19 will
> report "clean" but will not repair it, putting your fs into
> self-destruct mode. Bad bad. It's fixed in newer versions. Always run
> fsck twice, second time with -f.

tune2fs -e panic
is also a good precaution, at least for ext3 filesystems, because all my
root inode crashes were preceded by ext3 error messages, and these
messages sometimes appeared several hours before the actual crash.

--
Florent Chabaud
http://www.ssi.gouv.fr | http://fchabaud.free.fr
