Re: [RFC PATCH 1/2] mm: introduce bmap_walk()
From: Darrick J. Wong
Date: Mon Jun 19 2017 - 12:19:04 EST
On Sun, Jun 18, 2017 at 09:51:52AM +0200, Christoph Hellwig wrote:
> On Sat, Jun 17, 2017 at 05:29:23AM -0700, Dan Williams wrote:
> > On Fri, Jun 16, 2017 at 10:22 PM, Christoph Hellwig <hch@xxxxxx> wrote:
> > > On Fri, Jun 16, 2017 at 06:15:29PM -0700, Dan Williams wrote:
> > >> Refactor the core of generic_swapfile_activate() into bmap_walk() so
> > >> that it can be used by a new daxfile_activate() helper (to be added).
> > >
> > > No way in hell! generic_swapfile_activate needs to die and no new users
> > > of ->bmap over my dead body. It's guaranteed to fuck up your data left,
> > > right and center.
> >
> > Certainly you're not saying that existing swapfiles are broken, so I
> > wonder what bugs you're talking about?
>
> They are somewhat broken, but we manage to paper over the fact.
>
> And in fact if you plan to use a method marked:
>
> /* Unfortunately this kludge is needed for FIBMAP. Don't use it */
> sector_t (*bmap)(struct address_space *, sector_t);
>
> I'd expect a little research.
>
> By its signature alone ->bmap can't do a whole lot - it can try to
> translate the _current_ mapping of a relative block number to a physical
> one, and do extremely crude error reporting.
>
> Notice what it can't do:
>
> a) provide any guarantee that the block mapping doesn't change any time
> after it returned
> b) deal with the fact that there might not be anything like a physical block
> c) put the physical block into any sort of context, that is explain what
> device it actually is relative to
>
> So yes, swap files are broken. They sort of work by:
>
> a) ensuring that ->bmap is not implemented for anything fancy (btrfs), or
> living with it doing I/O into random places (XFS RT subvolumes, *cough*)
Ye $deities, it really /doesn't/ check XFS_IS_REALTIME_INODE(ip)! AIEEEE!
Uh... patch soon.
> b) doing extremely heavy handed locking to ensure things don't change at all
> (S_SWAPFILE). This might kinda sorta work for swapfiles which are
> part of the system and require privileges, but an absolute no-go
> for anything else
> c) simply not using this brain-haired system - see the swap over NFS
> support, or the WIP swap over btrfs patches.
>
> > Unless you had plans to go remove bmap() I don't see how this gets in
> > your way at all.
>
> I'm not talking about getting in my way. I'm talking about you doing
> something incredibly stupid. Don't do that.
>
> > That said, I think "please don't add a new bmap()
> > user, use iomap instead" is a fair comment. You know me well enough to
> > know that would be all it takes to redirect my work, I can do without
> > the bluster.
>
> But that's not the point. The point is that ->bmap() semantics simply
> do not work in practice because they don't make sense.
Seconded, bmap doesn't coordinate with the filesystem in any way to
guarantee that the mappings are stable, nor does it seem to care about
delayed alloc reservations. Granted I suspect the dax usage model is
"all the blocks were already allocated" so there are no da reservations,
but still, ugh, bmap. :)
--D
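
For anyone following along, here is a rough userspace sketch of the
signature problem discussed above. It is a toy model, not kernel code:
toy_sector_t, struct toy_extent, toy_bmap() and toy_map_extent() are all
made-up names, and the extent layout is invented purely for illustration.
It only models the "bare block number" limitation (no hole/delalloc state,
no device context); it does not attempt to model mapping stability.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
typedef uint64_t toy_sector_t;

enum toy_map_type { TOY_HOLE, TOY_DELALLOC, TOY_MAPPED };

struct toy_extent {
	toy_sector_t logical;   /* first file-relative block */
	toy_sector_t physical;  /* first on-device block, meaningful only if TOY_MAPPED */
	toy_sector_t len;       /* length in blocks */
	enum toy_map_type type;
	int dev_id;             /* device that 'physical' is relative to, -1 if none */
};

/* Made-up layout: device 0 plays the data device, device 1 a realtime device. */
static const struct toy_extent toy_file[] = {
	{  0, 100, 4, TOY_MAPPED,    0 },
	{  4,   0, 4, TOY_HOLE,     -1 },
	{  8,   0, 4, TOY_DELALLOC, -1 },
	{ 12, 100, 4, TOY_MAPPED,    1 },
};

/* Find the extent record covering 'block', or NULL if it is out of range. */
static const struct toy_extent *toy_lookup(toy_sector_t block)
{
	size_t i;

	for (i = 0; i < sizeof(toy_file) / sizeof(toy_file[0]); i++) {
		const struct toy_extent *e = &toy_file[i];

		if (block >= e->logical && block < e->logical + e->len)
			return e;
	}
	return NULL;
}

/* bmap-shaped: one bare block number out; 0 covers hole, delalloc and out-of-range alike. */
toy_sector_t toy_bmap(toy_sector_t block)
{
	const struct toy_extent *e = toy_lookup(block);

	if (!e || e->type != TOY_MAPPED)
		return 0;
	return e->physical + (block - e->logical);
}

/* Extent-shaped: copies the whole record; returns 0 on success, -1 if out of range. */
int toy_map_extent(toy_sector_t block, struct toy_extent *out)
{
	const struct toy_extent *e = toy_lookup(block);

	if (!e)
		return -1;
	*out = *e;
	return 0;
}
```

In this toy layout toy_bmap(1) and toy_bmap(13) return the same number even
though the two blocks sit on different devices, and a hole, a delalloc
range and an out-of-range block all come back as 0. The extent-shaped
lookup keeps those cases apart, which is roughly the argument for using an
extent-based interface instead.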