Re: [V9fs-developer] 9pfs hangs since 4.7

From: Al Viro
Date: Mon Jan 09 2017 - 15:05:57 EST


On Mon, Jan 09, 2017 at 08:39:31PM +0200, Tuomas Tynkkynen wrote:

> Yes, this does seem to be related to this or otherwise MAX_REQ related!
> - Bumping MAX_REQ up to 1024 makes the hang go away (on 4.7).
> - Dropping it to 64 makes the same hang happen on kernels where it worked
> before (I tried 4.4.x).
> - Doing s/(MAX_REQ - 1)/MAX_REQ/ makes the hang go away.

Note that it's still possible to trigger the same situation with that
off-by-one taken care of; if the client sends 64 Treadlink and 64 Tflush
(one for each of those), then follows with another pile of Treadlink (feeding
them in as soon as free slots appear), the server can end up with
pdu_alloc() failing - completion of a readlink will release its slot
immediately, but its pdu won't get freed until the corresponding flush has
gotten the CPU.
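
To make the accounting concrete, here is a toy model of that scenario. It is
only a sketch: the struct, the function names and the pool size used in the
example are made up for illustration, and nothing here is taken from the qemu
source. The only behaviour it encodes is the one described above: a reply
returns the ring slot at once, while the pdu stays allocated until the
pending flush has run.

```c
#include <assert.h>

/* Toy accounting model; all names here are hypothetical, not qemu's. */
struct pool_model {
    int free_slots;  /* ring slots the client may still fill */
    int free_pdus;   /* pdus the server may still allocate */
};

enum submit_result { SUBMIT_OK, SUBMIT_NO_SLOT, SUBMIT_NO_PDU };

static void pool_init(struct pool_model *m, int n)
{
    assert(n > 0);
    m->free_slots = n;
    m->free_pdus = n;
}

/* Client queues one request; the server then tries to allocate a pdu. */
static enum submit_result submit_request(struct pool_model *m)
{
    if (m->free_slots == 0)
        return SUBMIT_NO_SLOT;
    m->free_slots--;
    if (m->free_pdus == 0)
        return SUBMIT_NO_PDU;   /* the point where pdu_alloc() would fail */
    m->free_pdus--;
    return SUBMIT_OK;
}

/* A request is answered while a flush for it is still waiting:
 * the slot comes back, the pdu does not. */
static void reply_with_flush_pending(struct pool_model *m)
{
    m->free_slots++;
}

/* The flush finally runs: its own slot and pdu come back, and so does
 * the pdu of the request it was waiting for. */
static void flush_completes(struct pool_model *m)
{
    m->free_slots++;
    m->free_pdus += 2;
}
```

With an assumed pool of 128, filling it with 64 Treadlink plus 64 Tflush and
then calling reply_with_flush_pending() once leaves one free slot and no free
pdu, so the next submit_request() returns SUBMIT_NO_PDU.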

I'm _not_ familiar with scheduling in qemu and I don't quite understand
the mechanism of getting from "handle_9p_output() bailed out with some
requests still not processed" to "no further requests get processed", so
it might be that for some reason triggering the former as described above
won't escalate to the latter, but I wouldn't count upon that.

Another thing I'm very certain about is that 9 0 0 0 108 1 0 1 0 sent by a
broken client (Tflush, tag = 1, oldtag = 1) will do nasty things to the qemu
server. v9fs_flush() will try to find the pdu of request to cancel,
find its own argument, put itself on its ->complete and yield CPU, expecting
to get back once the victim gets through to pdu_complete(). Since the
victim is itself...
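
For reference, those nine bytes decode as size[4] type[1] tag[2] oldtag[2],
all little-endian, with 108 being Tflush. A minimal sketch of decoding such a
message and spotting the self-referential case follows; the struct and helper
names are invented for this example and are not the ones qemu uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TFLUSH_TYPE = 108, TFLUSH_SIZE = 9 };

/* Hypothetical in-memory form of a decoded Tflush. */
struct tflush_msg {
    uint32_t size;
    uint8_t  type;
    uint16_t tag;
    uint16_t oldtag;
};

static uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if buf does not hold a well-formed Tflush. */
static int parse_tflush(const uint8_t *buf, size_t len, struct tflush_msg *out)
{
    assert(buf != NULL && out != NULL);
    if (len < TFLUSH_SIZE)
        return -1;
    out->size   = get_le32(buf);
    out->type   = buf[4];
    out->tag    = get_le16(buf + 5);
    out->oldtag = get_le16(buf + 7);
    if (out->size != TFLUSH_SIZE || out->type != TFLUSH_TYPE)
        return -1;
    return 0;
}

/* A flush whose oldtag equals its own tag would end up waiting on itself. */
static int tflush_refers_to_itself(const struct tflush_msg *m)
{
    return m->tag == m->oldtag;
}
```

A server could run a check like tflush_refers_to_itself() before looking up
the victim pdu, so that such a request never gets to wait on its own
completion.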

AFAICS, the things a client can expect wrt Tflush handling are
* in no case should Rflush be sent before the reply to the request
it's trying to cancel
* the only case when server _may_ not send Rflush is the arrival
of more than one Tflush with the same oldtag; in that case it is allowed
to suppress replies to earlier ones. If they are not suppressed, replies
should come in the order of Tflush arrivals.
* if reply to Tflush is sent (see above), it must be Rflush.
* multiple Tflush with the same oldtag are allowed; the Linux kernel
client does not issue those, but other clients might. As a matter of
fact, the Plan 9 kernel client *does* issue those.
* Tflush to Tflush is a no-op; it still needs a reply, and ordering
constraints apply (its reply can't be sent before the reply to the Tflush
it refers to, which, in turn, can't be sent before the reply to the request
the first Tflush refers to). Normally such requests are not sent, but
in principle they are allowed.
* Tflush to a request that isn't being processed should be
answered immediately. The same goes for a Tflush referring to itself.
The former is not an error (we might have already sent a reply), but
the latter might be worth a loud warning - clients are definitely not
supposed to do that. It still needs Rflush in response - Rerror is not
allowed.
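
A rough sketch of how a server might turn some of those rules into a
decision, covering only the "answer immediately", "wait for the target" and
"warn on self-reference" cases. The types and the function are hypothetical;
the optional suppression of earlier Tflush replies for the same oldtag is not
modelled. Note that the reply is Rflush on every path, never Rerror.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum flush_action {
    FLUSH_REPLY_NOW,          /* send Rflush immediately */
    FLUSH_REPLY_AFTER_TARGET  /* send Rflush only after the target's reply */
};

/* Hypothetical result type: what to do, and whether to complain loudly. */
struct flush_decision {
    enum flush_action action;
    bool warn;                /* client did something it is not supposed to */
};

/*
 * target_in_flight says whether a request with tag == oldtag is still
 * being processed (that request may itself be a Tflush).
 */
static struct flush_decision decide_tflush(uint16_t tag, uint16_t oldtag,
                                           bool target_in_flight)
{
    struct flush_decision d = { FLUSH_REPLY_NOW, false };

    if (tag == oldtag) {
        /* Tflush referring to itself: answer at once, but warn. */
        d.warn = true;
        return d;
    }
    if (!target_in_flight) {
        /* Not an error; the reply may already have been sent. */
        return d;
    }
    /* Rflush must not overtake the reply to the request being cancelled. */
    d.action = FLUSH_REPLY_AFTER_TARGET;
    return d;
}

/* Usage example: the three cases handled above. */
static void decide_tflush_examples(void)
{
    assert(decide_tflush(1, 1, true).warn);
    assert(decide_tflush(2, 1, false).action == FLUSH_REPLY_NOW);
    assert(decide_tflush(2, 1, true).action == FLUSH_REPLY_AFTER_TARGET);
}
```

Checking tag == oldtag first matters: a self-referential Tflush is in flight
by definition, so testing target_in_flight first would send it down the
waiting path.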