i.e. why didn't it round the start offset down to 48?
Answering that question will tell you where the bug is.
After xfs_bmap_compute_alignments() -> xfs_bmap_extsize_align(), ap->offset=48 - that seems ok.
Maybe the problem is in xfs_bmap_process_allocated_extent(). In the problematic case, the values on entry to that function are:
args->fsbno=7840 args->len=16 ap->offset=48 orig_offset=56 orig_length=24
So, as the comment there says, the allocation could not satisfy the original length request (16 blocks allocated versus 24 requested), so we move the start of the mapping up.
I assume that we just don't want to do that for forcealign, correct?
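To make the arithmetic concrete, here is a small userspace model of the offset adjustment as I read it from that function. It is only a sketch, not the kernel code: struct model_ap, model_process_allocated_extent() and model_final_offset() are made-up names, and the two-branch condition is my reading of the code, so please check it against your tree.

```c
#include <stdint.h>

/* Hypothetical stand-in for the few xfs_bmalloca fields used here. */
struct model_ap {
	uint64_t offset;	/* file offset of the mapping, in blocks */
	uint64_t length;	/* length of the mapping, in blocks */
};

/*
 * Simplified model (an assumption, not the kernel source): if the
 * allocated extent is no longer than the original request, map it at
 * the original offset; otherwise, if it stops short of the end of the
 * original request, slide it up so that both ends coincide.
 */
static void
model_process_allocated_extent(struct model_ap *ap, uint64_t alloc_len,
			       uint64_t orig_offset, uint64_t orig_length)
{
	ap->length = alloc_len;
	if (ap->length <= orig_length)
		ap->offset = orig_offset;
	else if (ap->offset + ap->length < orig_offset + orig_length)
		ap->offset = orig_offset + orig_length - ap->length;
}

/* Convenience wrapper returning the offset the mapping ends up at. */
static uint64_t
model_final_offset(uint64_t aligned_offset, uint64_t alloc_len,
		   uint64_t orig_offset, uint64_t orig_length)
{
	struct model_ap ap = { .offset = aligned_offset, .length = 0 };

	model_process_allocated_extent(&ap, alloc_len, orig_offset,
				       orig_length);
	return ap.offset;
}
```

With the values above (ap->offset=48, args->len=16, orig_offset=56, orig_length=24) this model ends up at offset 56. With a 32-block allocation it would stay at 48.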
Of course, if the allocation start is rounded down to 48, then
the length should be rounded up to 32 to cover the entire range we
are writing new data to.
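
For reference, the rounding described here works out as follows. This is only a sketch: the alignment of 16 blocks is my assumption from the numbers in this thread, and align_range() is a hypothetical helper, not an XFS function.

```c
#include <stdint.h>

/* Result of aligning a block range; units are filesystem blocks. */
struct aligned_range {
	uint64_t offset;
	uint64_t length;
};

/*
 * Round the start of [offset, offset + length) down and the end up to
 * a multiple of align. align must be non-zero.
 */
static struct aligned_range
align_range(uint64_t offset, uint64_t length, uint64_t align)
{
	struct aligned_range r;
	uint64_t end = offset + length;

	r.offset = offset - (offset % align);		/* round start down */
	end = ((end + align - 1) / align) * align;	/* round end up */
	r.length = end - r.offset;
	return r;
}
```

For offset=56, length=24 and an assumed alignment of 16, the start rounds down to 48 and the end (80) is already aligned, giving a length of 32.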