Re: [PATCH v3] ext4: fix direct I/O read error
From: Jiang Ying
Date: Wed Jul 01 2020 - 12:00:44 EST
Does anyone else have any comments on PATCH v3? Suggestions are welcome.
Thanks!
Sent from my iPhone
> On June 29, 2020, at 5:45, Jiang Ying <jiangying8582@xxxxxxx> wrote:
>
> This patch fixes an ext4 direct I/O read error that occurs when
> the read size is not aligned with the block size.
>
> The following test demonstrates the error.
>
> (1) Create a file whose size is not aligned with the block size:
> $ dd if=/dev/zero of=./test.jar bs=1000 count=3
>
> (2) Write a source file named "direct_io_read_file.c" as follows:
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <fcntl.h>
> #include <sys/file.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <string.h>
> #define BUF_SIZE 1024
>
> int main(void)
> {
> 	int fd;
> 	int ret;
>
> 	unsigned char *buf;
> 	/* O_DIRECT needs an aligned user buffer */
> 	ret = posix_memalign((void **)&buf, 512, BUF_SIZE);
> 	if (ret) {
> 		perror("posix_memalign failed");
> 		exit(1);
> 	}
> 	fd = open("./test.jar", O_RDONLY | O_DIRECT, 0755);
> 	if (fd < 0){
> 		perror("open ./test.jar failed");
> 		exit(1);
> 	}
>
> 	/* Read until EOF (ret == 0) or an error (ret < 0) */
> 	do {
> 		ret = read(fd, buf, BUF_SIZE);
> 		printf("ret=%d\n",ret);
> 		if (ret < 0) {
> 			/* the message says "write", but this is the read path */
> 			perror("write test.jar failed");
> 		}
> 	} while (ret > 0);
>
> 	free(buf);
> 	close(fd);
> 	return 0;
> }
>
> (3) Compile the source file:
> $gcc direct_io_read_file.c -D_GNU_SOURCE
>
> (4) Run the test program:
> $./a.out
>
> The result is as follows:
> ret=1024
> ret=1024
> ret=952
> ret=-1
> write test.jar failed: Invalid argument.
>
> I have also tested this program on XFS, which does not have this
> problem because XFS uses iomap_dio_rw() for direct I/O reads. The
> comparison between the read offset and the file size is done in
> iomap_dio_rw(); the code is as follows:
>
> 	if (pos < size) {
> 		retval = filemap_write_and_wait_range(mapping, pos,
> 				pos + iov_length(iov, nr_segs) - 1);
>
> 		if (!retval) {
> 			retval = mapping->a_ops->direct_IO(READ, iocb,
> 					iov, pos, nr_segs);
> 		}
> 		...
> 	}
>
> Direct I/O is performed only when "pos < size"; otherwise 0 is returned.
>
> I have tested the fix on ext4, and the result conforms to the
> description of EINVAL in read(2), which is as follows:
> #include <unistd.h>
> ssize_t read(int fd, void *buf, size_t count);
>
> EINVAL
> fd is attached to an object which is unsuitable for reading;
> or the file was opened with the O_DIRECT flag, and either the
> address specified in buf, the value specified in count, or the
> current file offset is not suitably aligned.
>
> So I think this patch can be applied to fix ext4 direct I/O error.
>
> However, ext4 switched to the iomap infrastructure for direct I/O
> reads in kernel 5.5 with commit <b1b4705d54ab>
> ("ext4: introduce direct I/O read using iomap infrastructure"),
> so ext4 now behaves the same as XFS: both use iomap_dio_rw() for direct
> I/O reads. This problem therefore does not exist on kernel 5.5 for ext4.
>
> From the above description, we can see that this problem exists on all
> kernel versions between kernel 3.14 and kernel 5.4. Please apply this
> patch to these kernel versions, or use the kernel 5.5 approach to fix
> this problem.
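
For callers stuck on an affected kernel without this patch, the same check can be approximated in user space. The sketch below is only an illustration and not part of the patch. read_stop_at_eof() is a hypothetical helper name. It is also racy if another process changes the file size between fstat() and read(), which the in-kernel check does not suffer from in the same way.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical wrapper around read(): return 0 (EOF) when the current
 * file offset is already at or past the file size, so that no O_DIRECT
 * read is issued at an unaligned end-of-file offset.
 */
static ssize_t read_stop_at_eof(int fd, void *buf, size_t count)
{
	struct stat st;
	off_t pos = lseek(fd, 0, SEEK_CUR);	/* query the current offset */

	if (pos < 0 || fstat(fd, &st) < 0)
		return -1;			/* errno set by lseek()/fstat() */
	if (pos >= st.st_size)
		return 0;			/* mirrors "offset >= size" in the patch */
	return read(fd, buf, count);
}
```

In the test program above, replacing the read() call in the loop with read_stop_at_eof(fd, buf, BUF_SIZE) is the intended usage.
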
>
> Fixes: 9fe55eea7e4b ("Fix race when checking i_size on direct i/o read")
> Co-developed-by: Wang Long <wanglong19@xxxxxxxxxxx>
> Signed-off-by: Wang Long <wanglong19@xxxxxxxxxxx>
> Signed-off-by: Jiang Ying <jiangying8582@xxxxxxx>
>
> Changes since V2:
> Improve the commit message and make a small change to
> the patch, e.g.:
>
> Before:
> loff_t size;
> size = i_size_read(inode);
> After:
> loff_t size = i_size_read(inode);
>
> Changes since V1:
> Use real names in Signed-off-by and add the "Fixes:" tag
>
> ---
> fs/ext4/inode.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 516faa2..a66b0ac 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3821,6 +3821,11 @@ static ssize_t ext4_direct_IO_read(struct kiocb *iocb, struct iov_iter *iter)
>  	struct inode *inode = mapping->host;
>  	size_t count = iov_iter_count(iter);
>  	ssize_t ret;
> +	loff_t offset = iocb->ki_pos;
> +	loff_t size = i_size_read(inode);
> +
> +	if (offset >= size)
> +		return 0;
>  
>  	/*
>  	 * Shared inode_lock is enough for us - it protects against concurrent
> --
> 1.8.3.1