Re: [PATCH 06/12] iomap: Introduce read_inline() function hook
From: Goldwyn Rodrigues
Date: Thu Oct 10 2024 - 14:11:07 EST
On 10:47 07/10, Darrick J. Wong wrote:
> On Sat, Oct 05, 2024 at 03:30:53AM +0100, Matthew Wilcox wrote:
> > On Fri, Oct 04, 2024 at 04:04:33PM -0400, Goldwyn Rodrigues wrote:
> > > Introduce read_inline() function hook for reading inline extents. This
> > > is performed for filesystems such as btrfs which may compress the data
> > > in the inline extents.
>
> Why don't you set iomap->inline_data to the uncompressed buffer, let
> readahead copy it to the pagecache, and free it in ->iomap_end?
This will increase the number of copies. BTRFS uncompresses directly
into pagecache. Yes, this is an option but at the cost of efficiency.
>
> > This feels like an attempt to work around "iomap doesn't support
> > compressed extents" by keeping the decompression in the filesystem,
> > instead of extending iomap to support compressed extents itself.
> > I'd certainly prefer iomap to support compressed extents, but maybe I'm
> > in a minority here.
>
> I'm not an expert on fs compression, but I get the impression that most
> filesystems handle reads by allocating some folios, reading off the disk
> into those folios, and decompressing into the pagecache folios. It
> might be kind of amusing to try to hoist that into the vfs/iomap at some
> point, but I think the next problem you'd run into is that fscrypt has
> similar requirements, since it's also a data transformation step.
> fsverity I think is less complicated because it only needs to read the
> pagecache contents at the very end to check it against the merkle tree.
>
> That, I think, is why this btrfs iomap port want to override submit_bio,
> right? So that it can allocate a separate set of folios, create a
> second bio around that, submit the second bio, decode what it read off
> the disk into the folios of the first bio, then "complete" the first bio
> so that iomap only has to update the pagecache state and doesn't have to
> know about the encoding magic?
Yes, but that is not the only reason. BTRFS also calculates and checks
block checksums for data read during bio completions.
>
> And then, having established that beachhead, porting btrfs fscrypt is
> a simple matter of adding more transformation steps to the ioend
> processing of the second bio (aka the one that actually calls the disk),
> right? And I think all that processing stuff is more or less already in
> place for the existing read path, so it should be trivial (ha!) to
> call it in an iomap context instead of straight from btrfs.
> iomap_folio_state notwithstanding, of course.
>
> Hmm. I'll have to give some thought to what would the ideal iomap data
> transformation pipeline look like?
The order of transformation would make all the difference, and I am not
sure if everyone involved can come to a conclusion that all
transformations should be done in a particular decided order.
FWIW, checksums are performed on what is read/written on disk. So
for writes, compression happens before checksums.
>
> Though I already have a sneaking suspicion that will morph into "If I
> wanted to add {crypt,verity,compression} to xfs how would I do that?" ->
> "How would I design a pipeine to handle all three to avoid bouncing
> pages between workqueue threads like ext4 does?" -> "Oh gosh now I have
> a totally different design than any of the existing implementations." ->
> "Well, crumbs. :("
>
> I'll start that by asking: Hey btrfs developers, what do you like and
> hate about the current way that btrfs handles fscrypt, compression, and
> fsverity? Assuming that you can set all three on a file, right?
--
Goldwyn