Re: an infinite loop in ext4 in 3.14

From: Mikulas Patocka
Date: Thu Apr 17 2014 - 18:37:30 EST

On Thu, 17 Apr 2014, Theodore Ts'o wrote:

> On Thu, Apr 17, 2014 at 03:23:13PM -0400, Mikulas Patocka wrote:
> >
> > I hit a bug in ext4 - jbd2 was stuck in an infinite loop when remounting
> > the root filesystem read-only during shutdown.
>
> Is this at all repeatable?

No - it happened just once.

> I suspect what happened is that we're not
> checking the error return from jbd2_log_do_checkpoint(), and if it ran
> into an error doing the jbd2_log_do_checkpoint --- for example, if it

There were no I/O errors on the console when the lockup happened.

> wasn't able to write to the journal --- say, because __wait_cp_io()
> returned -EIO, we might be spinning in the while loop in jbd2_journal_flush:
>
> > while (!err && journal->j_checkpoint_transactions != NULL) {
>
>
> (as you suspected).
>
> I can add some error checking, but it would be interesting to know if
> you can easily reproduce the problem so we can confirm if that's what
> was really going on.

I can write a script that reboots the machine and run it overnight...
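
For reference, here is a minimal userspace sketch of the failure mode suspected above. It is not kernel code: toy_journal, toy_do_checkpoint() and toy_flush() are hypothetical stand-ins for the journal, jbd2_log_do_checkpoint() and the loop in jbd2_journal_flush, and the iteration cap exists only so that the model terminates. It assumes that a failing checkpoint makes no progress on the checkpoint list.

```c
#include <errno.h>

/* Hypothetical stand-in for the journal; not the kernel's journal_t. */
struct toy_journal {
	int pending;	/* > 0 models j_checkpoint_transactions != NULL */
	int io_broken;	/* nonzero: checkpointing fails without progress */
};

/* Hypothetical stand-in for jbd2_log_do_checkpoint(). */
static int toy_do_checkpoint(struct toy_journal *j)
{
	if (j->io_broken)
		return -EIO;	/* assumed: nothing leaves the list on error */
	j->pending--;
	return 0;
}

/*
 * Model of the flush loop. With check_err == 0 the return value of the
 * checkpoint call is dropped, so err never becomes nonzero. Returns the
 * number of iterations executed, capped at max_iter.
 */
static long toy_flush(struct toy_journal *j, int check_err, long max_iter)
{
	int err = 0;
	long iters = 0;

	while (!err && j->pending > 0 && iters < max_iter) {
		int ret = toy_do_checkpoint(j);

		if (check_err)
			err = ret;
		iters++;
	}
	return iters;
}
```

In this model, a journal with io_broken set and the error ignored runs until the cap is reached, which corresponds to the spin. With the error propagated into err, the same journal leaves the loop after one iteration.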

> Regards,
>
> - Ted

Mikulas