Re: [RFC] xfs: fake fallocate success for always CoW inodes

From: Christoph Hellwig

Date: Thu Nov 06 2025 - 09:46:15 EST


On Thu, Nov 06, 2025 at 02:42:30PM +0000, Matthew Wilcox wrote:
> On Thu, Nov 06, 2025 at 02:52:12PM +0100, Christoph Hellwig wrote:
> > On Thu, Nov 06, 2025 at 02:48:12PM +0100, Florian Weimer wrote:
> > > * Hans Holmberg:
> > >
> > > > We don't support preallocations for CoW inodes and we currently fail
> > > > with -EOPNOTSUPP, but this causes an issue for users of glibc's
> > > > posix_fallocate[1]. If fallocate fails, posix_fallocate falls back on
> > > > writing actual data into the range to try to allocate blocks that way.
> > > > That does not actually guarantee anything for CoW inodes however as we
> > > > write out of place.
> > >
> > > Why doesn't fallocate trigger the copy instead? Isn't this what the
> > > user is requesting?
> >
> > What copy?
>
> I believe Florian is thinking of CoW in the sense of "share while read
> only, then you have a mutable block allocation", rather than the
> WAFL (or SMR) sense of "we always put writes in a new location".

Note that the glibc posix_fallocate(3) fallback will never copy anyway.
It does a racy and somewhat broken check for whether there is already
data, and if it thinks there isn't, it writes zeroes. Which is the
wrong thing for just about every use case imaginable. And the only
thing to stop it from doing that is to implement fallocate(2) and
return success.
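
For illustration only, here is a minimal sketch of how an application
could sidestep the zero-writing fallback today by calling fallocate(2)
directly instead of posix_fallocate(3) and treating EOPNOTSUPP as "no
preallocation available". The helper name prealloc_no_fallback and its
return convention are hypothetical and not taken from glibc or the
kernel.

```c
#define _GNU_SOURCE	/* fallocate() is a GNU extension in <fcntl.h> */
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical helper: ask the kernel to preallocate [offset, offset + len)
 * and never fall back to writing zeroes into the file.
 *
 * Returns  0 if the kernel reported success,
 *          1 if the file system does not support preallocation,
 *         -1 on any other error (errno is left set by fallocate()).
 */
static int prealloc_no_fallback(int fd, off_t offset, off_t len)
{
	if (fallocate(fd, 0, offset, len) == 0)
		return 0;

	/* Not supported here: report it and leave the file data untouched. */
	if (errno == EOPNOTSUPP)
		return 1;

	return -1;
}
```

A caller that gets 1 back can simply carry on without preallocation
rather than writing into the range.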