On 2020/06/26 2:18, Kanchan Joshi wrote:
Sorry, I meant file pointer.
Semantics --->
Zone-append, by its nature, may perform write on a different location than what
was specified. It does not fit into POSIX, and trying to fit may just undermine
its benefit. It may be better to keep semantics as close to zone-append as
possible i.e. specify zone-start location, and obtain the actual-write location
post completion. Towards that goal, existing async APIs seem to fit fine.
Async APIs (uring, linux aio) do not work on implicit write-pointer and demand
explicit write offset (which is what we need for append). Neither write-pointer
What do you mean by "implicit write pointer" ? Are you referring to the behavior
of AIO write with a block device file opened with O_APPEND ? Then yes, it does not
work. But that is perfectly fine for regular files, that is for zonefs.
I would prefer that this paragraph simply state the semantic that is implemented
first. Then explain why the choice. But first, clarify how the API works, what
is allowed, what's not etc. That will also simplify reviewing the code as one
can then check the code against the goal.
Yes. I was referring to the problem of returning actual write-location using
is taken as input, nor is it updated on completion. And there is a clear way to
get zone-append result. Zone-aware applications while using these async APIs
can be fine with, for the lack of better word, zone-append semantics itself.
Sync APIs work with implicit write-pointer (at least a few of those), and there is
no way to obtain zone-append result, making it hard for user-space zone-append.
Sync APIs are executed under inode lock, at least for regular files. So there is
absolutely no problem to use zone append. zonefs does it already. The problem is
the lack of locking for block device file.
Tests --->
Using new interface in fio (uring and libaio engine) by extending zbd tests
for zone-append: https://github.com/axboe/fio/pull/1026
Changes since v1:
- No new opcodes in uring or aio. Use RWF_ZONE_APPEND flag instead.
- linux-aio changes vanish because of no new opcode
- Fixed the overflow and other issues mentioned by Damien
- Simplified uring support code, fixed the issues mentioned by Pavel
- Added error checks
Kanchan Joshi (1):
  fs,block: Introduce RWF_ZONE_APPEND and handling in direct IO path

Selvakumar S (1):
  io_uring: add support for zone-append

 fs/block_dev.c          | 28 ++++++++++++++++++++++++----
 fs/io_uring.c           | 32 ++++++++++++++++++++++++++++++--
 include/linux/fs.h      |  9 +++++++++
 include/uapi/linux/fs.h |  5 ++++-
 4 files changed, 67 insertions(+), 7 deletions(-)
--
Damien Le Moal
Western Digital Research