Re: [RFC v2 PATCH] fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE

From: Jan Kara
Date: Wed Dec 16 2015 - 10:34:12 EST


On Tue 15-12-15 09:43:25, Abhi Das wrote:
> During testing, I discovered that __generic_file_splice_read() returns
> 0 (EOF) when aops->readpage fails with AOP_TRUNCATED_PAGE on the first
> page of a single/multi-page splice read operation. This EOF return code
> causes the userspace test to (correctly) report a zero-length read error
> when it was expecting otherwise.
>
> The current strategy of returning a partial non-zero read when ->readpage
> returns AOP_TRUNCATED_PAGE works only when the failed page is not the
> first of the lot being processed.
>
> This patch attempts to retry lookup and call ->readpage again on pages
> that had previously failed with AOP_TRUNCATED_PAGE. With this patch, my
> tests pass and I haven't noticed any unwanted side effects.
>
> This version fixes a return code issue pointed out by Bob Peterson.
>
> Signed-off-by: Abhi Das <adas@xxxxxxxxxx>
> Cc: Bob Peterson <rpeterso@xxxxxxxxxx>
> ---
> fs/splice.c | 13 +++++++------
> 1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/fs/splice.c b/fs/splice.c
> index 801c21c..365cd2a 100644
> --- a/fs/splice.c
> +++ b/fs/splice.c
> @@ -387,6 +387,7 @@ __generic_file_splice_read(struct file *in, loff_t *ppos,
> spd.nr_pages = 0;
> for (page_nr = 0; page_nr < nr_pages; page_nr++) {
> unsigned int this_len;
> + int retries = 0;
>
> if (!len)
> break;
> @@ -415,6 +416,7 @@ __generic_file_splice_read(struct file *in, loff_t *ppos,
> */
> if (!page->mapping) {
> unlock_page(page);
> +retry_lookup:
> page = find_or_create_page(mapping, index,
> mapping_gfp_mask(mapping));
>
> @@ -439,14 +441,13 @@ __generic_file_splice_read(struct file *in, loff_t *ppos,
> error = mapping->a_ops->readpage(in, page);
> if (unlikely(error)) {
> /*
> - * We really should re-lookup the page here,
> - * but it complicates things a lot. Instead
> - * lets just do what we already stored, and
> - * we'll get it the next time we are called.
> + * Re-lookup the page
> */
> - if (error == AOP_TRUNCATED_PAGE)
> + if (error == AOP_TRUNCATED_PAGE) {
> error = 0;
> -
> + if (retries++ < 3)
> + goto retry_lookup;
> + }

I don't like this retry-three-times loop. That is still leaving the
possibility of 0 return just much less likely (so it will lead to even
weirder and harded to debug failures). IMO we should just terminate the
loop like we did previously if spd.nr_pages > 0 and we retry indefinitely
if it is the first page that failed to read.

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/