Re: [PATCH, RFC] Ext4: Mount partition as read only if during orphancleanup truncate fails to obtain journal handle.

From: Ashish Sangwan
Date: Fri Dec 07 2012 - 04:22:10 EST

On Thu, Dec 6, 2012 at 10:39 PM, Theodore Ts'o <tytso@xxxxxxx> wrote:
> On Thu, Dec 06, 2012 at 04:59:43PM +0530, Ashish Sangwan wrote:
>> Did you get any time to look into this patch?
>> This problem is specific to ext4, since ext4_truncate, unlike
>> ext3_truncate, does not clean the orphan list.
>> Instead, in case of failure to obtain handle, orphan list cleanup is
>> done in ext4_setattr.
>> But during mount, ext4_truncate is not called via ext4_setattr and
>> hence the problem.
>> What do you think?
> In the patch description, you mentioned that this occurs when
> there is a failure to obtain a journal handle. Is this a hypothetical
> thing that you exposed via some kind of tester which checks to see
> what happens if kmalloc() randomly fails some number of allocation
> requests? Or was it happening in real life? And if it is happening
> in real life, do we understand why it's happening, and is there
> something we should be doing to mitigate against the root cause of the
> failure?
Thanks for looking into the patch.
Yes, you are right. The situation is hypothetical.
We were checking how Ext4 behaves when an error is returned at
different points in the truncate/hole punch path, and that's when we
stumbled onto this.
In practice, we cannot reproduce this scenario without modifying Ext4's
code, hence the RFC tag on the patch.
If you think that obtaining a journal handle can never fail at mount
time, then please ignore this patch.

> The alternative to your patch is to do something similar to what ext3
> does. That is, if there are any inodes left on the orphan list, to
> iterate through them all and then clean up the orphan list. Perhaps
> we should then also call ext4_error() since technically the file
> system may very well be inconsistent (there may be allocated inodes
> holding blocks which are no longer connected to the directory hierarchy,
> which e2fsck would be able to clean up). But that could potentially
> cause the system to panic or remount the file system read-only,
> depending on what the errors= behavior is set to. Which is why I go
> back to the original question; do we understand why ext4_truncate()
> was failing during orphan cleanup in the first place?
> - Ted
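
The alternative described above (walk whatever is left on the orphan list, drop it, then flag the file system as possibly inconsistent) could look roughly like the following. This is a userspace model only, not kernel code: struct orphan_node, struct model_sb, error_flagged and drain_leftover_orphans() are hypothetical names made up for illustration, and setting error_flagged merely marks the point where the kernel would call ext4_error().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical orphan-list node; the kernel links inodes differently. */
struct orphan_node {
	unsigned long ino;
	struct orphan_node *next;
};

/* Hypothetical superblock model: orphan list head plus an error flag. */
struct model_sb {
	struct orphan_node *orphans;
	int error_flagged;	/* set where the kernel would call ext4_error() */
};

/*
 * Unlink every node still on the orphan list and return how many were
 * dropped. If any were found, flag the file system as possibly
 * inconsistent so that a later fsck can reclaim the blocks.
 */
static unsigned int drain_leftover_orphans(struct model_sb *sb)
{
	unsigned int dropped = 0;

	assert(sb != NULL);

	while (sb->orphans != NULL) {
		struct orphan_node *node = sb->orphans;

		sb->orphans = node->next;
		node->next = NULL;
		fprintf(stderr, "dropping leftover orphan inode %lu\n",
			node->ino);
		dropped++;
	}

	if (dropped > 0)
		sb->error_flagged = 1;

	return dropped;
}
```

In this model an empty list leaves error_flagged untouched, so a clean orphan pass would not trigger the errors= behavior.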

Right, using ext4_error() would have been a better choice than
explicitly setting the RO flag.
I wonder why ext4_truncate, unlike ext3_truncate, does not clear the
orphan list when it fails to obtain a journal handle?
We would not have to bother about explicitly clearing the orphan list
in the first place if we could just add a label like out_notrans (as
in ext3):
	if (IS_ERR(handle))
		goto out_notrans;
	...
out_notrans:
	/*
	 * Delete the inode from orphan list so that it doesn't stay there
	 * forever and trigger assertion on umount.
	 */
	if (inode->i_nlink)
		ext3_orphan_del(NULL, inode);
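
To make the intended control flow concrete, here is a minimal userspace sketch of the out_notrans idea. It is not kernel code. struct model_inode, struct model_handle, model_journal_start(), model_orphan_del() and model_truncate() are hypothetical stand-ins, and a NULL return models the failure that the kernel reports through IS_ERR().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an in-memory inode (not struct inode). */
struct model_inode {
	unsigned int i_nlink;	/* nonzero: the file is still linked */
	bool on_orphan_list;	/* true while the inode is on the orphan list */
};

/* Hypothetical stand-in for a journal handle. */
struct model_handle {
	int unused;
};

/* Returns NULL to model a failure to obtain a journal handle. */
static struct model_handle *model_journal_start(bool inject_failure)
{
	static struct model_handle h;

	return inject_failure ? NULL : &h;
}

/* Models the effect of ext3_orphan_del(): take the inode off the list. */
static void model_orphan_del(struct model_inode *inode)
{
	inode->on_orphan_list = false;
}

/* Returns 0 on success, -1 if no handle could be obtained. */
static int model_truncate(struct model_inode *inode, bool inject_failure)
{
	struct model_handle *handle;

	assert(inode != NULL);

	handle = model_journal_start(inject_failure);
	if (handle == NULL)
		goto out_notrans;

	/* ... the real truncate work would run under the handle here ... */
	if (inode->i_nlink)
		model_orphan_del(inode);
	return 0;

out_notrans:
	/*
	 * No transaction was started, but a still-linked inode must not
	 * stay on the orphan list forever.
	 */
	if (inode->i_nlink)
		model_orphan_del(inode);
	return -1;
}
```

With this shape, a still-linked inode leaves the orphan list whether or not the handle was obtained, while an inode with i_nlink == 0 stays on the list in both cases.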
