Re: [RFC PATCH 1/2] fs: introduce FALLOC_FL_FORCE_ZERO to fallocate

From: Darrick J. Wong
Date: Mon Jan 06 2025 - 12:31:44 EST


On Mon, Jan 06, 2025 at 08:27:49AM -0800, Christoph Hellwig wrote:
> On Mon, Jan 06, 2025 at 11:17:32AM -0500, Theodore Ts'o wrote:
> > Yes. And we might decide that it should be done using some kind of
> > ioctl, such as BLKDISCARD, as opposed to a new fallocate operation,
> > since it really isn't a filesystem metadata operation, just as
> > BLKDISCARD isn't. The other side of the argument is that ioctls are
> > ugly, and maybe all new such operations should be plumbed through via
> > fallocate as opposed to adding a new ioctl. I don't have strong
> > feelings on this, although I *do* believe that whatever interface we
> > use, whether it be fallocate or ioctl, it should be supported by block
> > devices and files in a file system, to make life easier for those
> > databases that want to support running on a raw block device (for
> > full-page advertisements on the back cover of the Businessweek
> > magazine) or on files (which is how 99.9% of all real-world users
> > actually run enterprise databases. :-)
>
> If you want the operation to work for files it needs to be routed
> through the file system as otherwise you can't make it actually
> work coherently. While you could add a new ioctl that works on a
> file, fallocate seems like a much better interface. Supporting it
> on a block device is trivial, as it can mostly (or even entirely
> depending on the exact definition of the interface) reuse the existing
> zero range / punch hole code.

I think we should wire it up as a new FALLOC_FL_WRITE_ZEROES mode,
document very vigorously that it exists to facilitate pure overwrites
(specifically that it returns EOPNOTSUPP for always-cow files), and not
add more ioctls.

(That said, doesn't BLKZEROOUT already do this for bdevs?)
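
For reference, here is a rough userspace sketch of what callers can do with
the interfaces that exist today: BLKZEROOUT on a bdev, and
FALLOC_FL_ZERO_RANGE on a regular file with a plain pwrite() fallback when
the filesystem returns EOPNOTSUPP. The helper names zero_range() and
zero_by_write() are made up for illustration. Note that FALLOC_FL_ZERO_RANGE
is usually implemented by converting the range to unwritten extents, so it
does not give the "pure overwrite" guarantee discussed above; the proposed
FALLOC_FL_WRITE_ZEROES mode is deliberately not used here because it is not
in any released uapi header.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>        /* fallocate() */
#include <linux/falloc.h> /* FALLOC_FL_ZERO_RANGE */
#include <linux/fs.h>     /* BLKZEROOUT */
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical fallback: write explicit zeroes over [off, off + len). */
static int zero_by_write(int fd, off_t off, off_t len)
{
    char buf[65536];

    memset(buf, 0, sizeof(buf));
    while (len > 0) {
        size_t chunk = len < (off_t)sizeof(buf) ? (size_t)len : sizeof(buf);
        ssize_t n = pwrite(fd, buf, chunk, off);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0) {
            /* No progress; report an error instead of looping forever. */
            errno = EIO;
            return -1;
        }
        off += n;
        len -= n;
    }
    return 0;
}

/*
 * Hypothetical helper: zero [off, off + len) on a block device or a
 * regular file. Returns 0 on success, -1 with errno set on failure.
 */
int zero_range(int fd, off_t off, off_t len)
{
    struct stat st;

    if (fstat(fd, &st) < 0)
        return -1;

    if (S_ISBLK(st.st_mode)) {
        /*
         * BLKZEROOUT takes { start, length } in bytes; both must be
         * aligned to the device's logical block size.
         */
        uint64_t range[2] = { (uint64_t)off, (uint64_t)len };

        return ioctl(fd, BLKZEROOUT, range);
    }

    /* Regular file: ask the filesystem first. */
    if (fallocate(fd, FALLOC_FL_ZERO_RANGE, off, len) == 0)
        return 0;
    if (errno != EOPNOTSUPP)
        return -1;

    /* Filesystem does not implement zero range; write the zeroes. */
    return zero_by_write(fd, off, len);
}
```

A caller would open the file or device O_RDWR and call
zero_range(fd, off, len). With a new fallocate mode, both branches above
could collapse into a single fallocate() call, which is the argument for
not adding another ioctl.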

--D