On Mon, Apr 29, 2024 at 05:47:42PM +0000, John Garry wrote:
Support atomic writes by producing a single BIO with REQ_ATOMIC flag set.
We rely on the FS to guarantee extent alignment, such that an atomic write
should never straddle two or more extents. The FS should also check for
validity of an atomic write length/alignment.
Signed-off-by: John Garry <john.g.garry@xxxxxxxxxx>
---
fs/iomap/direct-io.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index a3ed7cfa95bc..d7bdeb675068 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -275,6 +275,7 @@ static inline blk_opf_t iomap_dio_bio_opflags(struct iomap_dio *dio,
static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
struct iomap_dio *dio)
{
+ bool is_atomic = dio->iocb->ki_flags & IOCB_ATOMIC;
const struct iomap *iomap = &iter->iomap;
struct inode *inode = iter->inode;
unsigned int zeroing_size, pad;
@@ -387,6 +388,9 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
bio->bi_iter.bi_sector = iomap_sector(iomap, pos);
bio->bi_write_hint = inode->i_write_hint;
bio->bi_ioprio = dio->iocb->ki_ioprio;
+ if (is_atomic)
+ bio->bi_opf |= REQ_ATOMIC;
REQ_ATOMIC is only valid for write IO, isn't it?
This should be added in iomap_dio_bio_opflags() after it is
determined we are doing a write operation. Regardless, it should be
added in iomap_dio_bio_opflags(), not here. That also allows us to
get rid of the is_atomic variable.
+
bio->bi_private = dio;
bio->bi_end_io = iomap_dio_bio_end_io;
@@ -403,6 +407,12 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
}
n = bio->bi_iter.bi_size;
+ if (is_atomic && n != orig_count) {
+ /* This bio should have covered the complete length */
+ ret = -EINVAL;
+ bio_put(bio);
+ goto out;
+ }
What happens now if we've done zeroing IO before this? I suspect we
might expose stale data if the partial block zeroing converts the
unwritten extent in full...
if (dio->flags & IOMAP_DIO_WRITE) {
task_io_account_write(n);
} else {
Ignoring the error handling issues, this code might be better as:
if (dio->flags & IOMAP_DIO_WRITE) {
if ((opflags & REQ_ATOMIC) && n != orig_count) {
/* atomic writes are all or nothing */
ret = -EIO
bio_put(bio);
goto out;
}
}
so that we are not putting atomic write error checks in the read IO
submission path.