Re: [PATCH 09/20] ext4: Initialize timestamps limits

From: Arnd Bergmann
Date: Sat Aug 03 2019 - 16:25:03 EST


On Sat, Aug 3, 2019 at 6:03 PM Theodore Y. Ts'o <tytso@xxxxxxx> wrote:
>
> On Sat, Aug 03, 2019 at 11:30:22AM +0200, Arnd Bergmann wrote:
> >
> > I see in the ext4 code that we always try to expand i_extra_size
> > to s_want_extra_isize in ext4_mark_inode_dirty(), and that
> > s_want_extra_isize is always at least s_min_extra_isize, so
> > we constantly try to expand the inode to fit.
>
> Yes, we *try*. But we may not succeed. There may actually be a
> problem here if the cause is due to there simply is no space in the
> external xattr block, so we might try and try every time we try to
> modify that inode, and it would be a performance mess. If it's due to
> there being no room in the current transaction, then it's highly
> likely it will succeed the next time.
>
> > Did older versions of ext4 or ext3 ignore s_min_extra_isize
> > when creating inodes despite
> > EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE,
> > or is there another possibility I'm missing?
>
> s_min_extra_isize could get changed in order to make room for some new
> file system feature --- such as extended timestamps.

Ok, that explains it. I assumed s_min_extra_isize was meant to
not be modifiable, and did not find a way to change it using the
kernel or tune2fs, but now I can see that debugfs can set it.

> If you want to pretend that file systems never get upgraded, then life
> is much simpler. The general approach is that for less-sophisticated
> customers (e.g., most people running enterprise distros) file system
> upgrades are not a thing. But for sophisticated users, we do try to
> make thing work for people who are aware of the risks / caveats /
> rough edges. Google won't have been able to upgrade thousands and
> thousands of servers in data centers all over the world if we limited
> ourselves to Red Hat's support restrictions. Backup / reformat /
> restore really isn't a practical rollout strategy for many exabytes of
> file systems.
>
> It sounds like your safety checks / warnings are mostly targeted at
> low-information customers, no?

Yes, that seems like a reasonable compromise: just warn based
on s_min_extra_isize, and assume that anyone who used debugfs
to set s_min_extra_isize to a higher value from an ext3 file system
during the migration to ext4 was aware of the risks already.

That leaves the question of what we should set the s_time_gran
and s_time_max to on a superblock with s_min_extra_isize<16
and s_want_extra_isize>=16.

If we base it on s_min_extra_isize, we never try to set a timestamp
later than 2038 and so will never fail, but anyone with a grandfathered
s_min_extra_isize from ext3 won't be able to set extended
timestamps on any files any more. Based on s_want_extra_isize
we would keep the current behavior, but could add a custom
warning in the ext4 code about the small s_min_extra_isize
indicating a theoretical problem.

Arnd