Re: [git pull] first batch of ufs fixes

From: Richard Narron
Date: Fri Jun 16 2017 - 10:29:10 EST

On Thu, 15 Jun 2017, Al Viro wrote:

On Wed, Jun 14, 2017 at 08:11:33AM +0100, Al Viro wrote:
FWIW, it seems to work here. Said that, *BSD fsck_ffs is not worth much -
play a bit with redundancy in UFS superblock (starting with fragment
and block sizes, their ratio, logarithms, bitmasks, etc.) and you can
screw at least 10.3 into the ground when mounting an image that passes
their fsck. Sure, anyone who mounts untrusted images is a cretin who
deserves everything they get, fsck or no fsck, but... no complaints from
fsck is not a reliable indicator of image being in good condition and
that's PITA for testing.

Another pile of fun: "reserve ->s_minfree percents of total" logics had
been broken.
* using hardwired 5% is wrong - especially for ufs2, where it's
not even the default
* ufs_freespace() returns u64; testing for <= 0 is not doing the
right thing
* no capability checks before we need them, TYVM...
* ufs2 needs 64bit uspi->s_dsize (and ->s_size, while we are at it).
64bit variants were even calculated - and never used.
* while we are at it, doing "multiply the total data frags by
s_minfree and divide by 100" every time we allocate a block is bloody
dumb - that should be calculated once.

We really need to get the sodding tail unpacking moved up from the place
where it's buried - turns out that my doubts about that code managing to
avoid deadlocks had been correct. Long-term we need to move that thing
to iomap-based ->write_iter() and do unpacking there and in truncate().
For now I've slapped together something that is easier to backport -
avoiding ->truncate_mutex when possible and not holding ->s_lock over

Another bug in the same area: ufs_get_locked_page() doesn't guarantee
that buffer_heads are attached (race with vmscan trying to evict the
page in question can end with buffer_heads freed and page left alive
and uptodate). Callers do expect buffer_heads to be there, so we either
need to do create_empty_buffers() in those callers or in ufs_get_locked_page();
I went for the latter for now.

Off-by-one in ufs_truncate_blocks(): the logics when deciding whether
we need to do anything with direct blocks is broken when new size is
within the last direct block. It's better to find the path to the
last byte _not_ to be removed and use that instead of the path to the
beginning of the first block to be freed.

I've pushed fixes for those into vfs.git#ufs-fixes; they do need more
testing before I send a pull request, though.

The 8 patches in the ufs-fixes group were applied to Linux 4.12-rc5.
They seem to work fine with the simple testing that I do.

I tested all 3 BSDs, FreeBSD 11.0, OpenBSD 6.1 and NetBSD 7.1 using 2 filesystems, 44bsd (ufs1) and ufs2.
I found no errors doing a Linux mkdir, copy large file, BSD fsck, Linux rm large file, rmdir and BSD fsck in any of the 6 combinations.

Doing a "df" on BSD and Linux now match on the counts including the "Available" counts.

It might be worth testing with ufs filesystems using softdep and/or journaling. Should the Linux mount command reject such filesystems?

Now that ufs write access is working more or less, we're dangerous.