Re: [RESEND4, PATCH 2/2] fuse: require /dev/fuse reads to have enough buffer capacity as negotiated

From: Kirill Smelkov
Date: Wed Apr 24 2019 - 08:26:38 EST


On Wed, Apr 24, 2019 at 12:48:36PM +0200, Miklos Szeredi wrote:
> On Wed, Mar 27, 2019 at 11:44 AM Kirill Smelkov <kirr@xxxxxxxxxx> wrote:
> >
> > A FUSE filesystem server queues /dev/fuse sys_read calls to get
> > filesystem requests to handle. It does not know in advance what the
> > next request will be, as it can be anything the client issues: LOOKUP,
> > READ, WRITE, ... Many requests are short and retrieve data from the
> > filesystem. However, WRITE and NOTIFY_REPLY write data into the
> > filesystem.
> >
> > Before entering the operation phase, the FUSE filesystem server and the
> > kernel client negotiate the maximum write size the client will ever
> > issue. After negotiation, the contract between server and client is
> > that the filesystem server should queue /dev/fuse sys_read calls with
> > enough buffer capacity to receive any client request, WRITE in
> > particular, while the FUSE client should not send WRITE requests with
> > a payload larger than the negotiated max_write. The FUSE client in the
> > kernel and libfuse historically reserve 4K for the request header. The
> > contract is therefore that the filesystem server should queue sys_reads
> > with a 4K+max_write buffer.
> >
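A minimal sketch of the buffer sizing described above, seen from the
server side. The helper name and the two constants are hypothetical and
only restate the numbers from the text (4K header reserve, 8K minimum);
they are not taken from libfuse or the kernel headers.

```c
#include <stddef.h>

/* Assumption: 4K reserved for the request header, as stated in the text. */
#define SKETCH_HEADER_RESERVE  ((size_t)4096)
/* Assumption: the 8K minimum read buffer mentioned in the text. */
#define SKETCH_MIN_READ_BUFFER ((size_t)8192)

/*
 * Hypothetical helper: capacity the server should give to each
 * read(2) on /dev/fuse after max_write has been negotiated.
 */
static size_t sketch_read_buf_size(size_t max_write)
{
    size_t need = SKETCH_HEADER_RESERVE + max_write;

    /* Never go below the minimum, even for a tiny max_write. */
    return need < SKETCH_MIN_READ_BUFFER ? SKETCH_MIN_READ_BUFFER : need;
}
```

A server would allocate sketch_read_buf_size(max_write) bytes once after
INIT and reuse that buffer for every /dev/fuse read.
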
> > If the filesystem server does not follow this contract, what can happen
> > is that fuse_dev_do_read will see that the request size is > buffer size,
> > and then it will return EIO to the client who issued the request but
> > won't indicate in any way to the filesystem server that there is a
> > problem. This can be hard to diagnose because for some requests, e.g.
> > NOTIFY_REPLY, which mimics WRITE, there is no client thread waiting
> > for request completion and that EIO goes nowhere, while on the
> > filesystem server side it looks like the kernel is not replying
> > after a successful NOTIFY_RETRIEVE request made by the server.
> >
> > -> We can make the problem easy to diagnose if we indicate via error
> > return to the filesystem server when it is violating the contract.
> > This should not cause problems in practice: if a filesystem server is
> > using a shorter buffer, writes to it were already very likely to cause
> > EIO, and if the filesystem is read-only it should already be following
> > the 8K minimum buffer size (either FUSE_MIN_READ_BUFFER, see
> > 1d3d752b47, or 4K + min(max_write) = 4K, which process_init_reply
> > takes care of).
> >
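The check proposed above can be sketched in user-space C roughly as
follows. This is an illustration only, not the patch itself: the function
name is hypothetical, and the quoted description does not say which error
code is returned to the server, so EINVAL here is an assumption.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical model of the capacity check: reject a /dev/fuse read whose
 * buffer is smaller than max(8K, 4K + max_write) instead of silently
 * failing the client's request with EIO.
 */
static int sketch_check_read_capacity(size_t nbytes, size_t max_write)
{
    size_t need = (size_t)4096 + max_write;  /* 4K header reserve + data */

    if (need < (size_t)8192)                 /* 8K minimum from the text */
        need = 8192;

    /* Assumption: EINVAL is the error reported to the server. */
    return nbytes < need ? -EINVAL : 0;
}
```

With max_write negotiated at 128K, a 135168-byte buffer passes and
anything shorter is rejected before a request is dequeued.
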
> > Please see [1] for the context in which the stuck-filesystem problem
> > was hit for real (because the kernel client was incorrectly sending more
> > than max_write data with NOTIFY_REPLY; see also the previous patch), how
> > the situation was traced, and for a more involved patch that did not
> > make it into the tree.
> >
> > [1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2
>
> Applied.

Thanks. Looking forward to seeing it appear in fuse.git#for-next.

Kirill