Re: [V9fs-developer] 9pfs hangs since 4.7

From: Greg Kurz
Date: Tue Jan 10 2017 - 03:33:56 EST


On Sun, 8 Jan 2017 05:46:39 +0000
Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:

> On Sat, Jan 07, 2017 at 06:15:23PM +0000, Al Viro wrote:
> > On Sat, Jan 07, 2017 at 05:19:10PM +0000, Al Viro wrote:
> >
> > > released) simply trigger virtio_queue_notify_vq() again? It *is* a bug
> > > (if we get a burst filling a previously empty queue all at once, there won't
> > > be any slots becoming freed
> >
> > Umm... Nope, even in that scenario we'll have all requests except the last
> > one processed. I'm trying to put together a simpler reproducer, but...
> > no luck so far.
>
> FWIW, here's another fun bug in qemu 9pfs: client may issue multiple
> Tflush on the same pending request. Server must not reply to them
> out of order and it must reply to at least the last one. If client has
> sent more than one Tflush, it must treat an arrival of Rflush to one of
> those as if it has received an Rflush to each of the earlier Tflush as
> well (IOW, *their* tags are released). Client may not reuse the tag of
> victim request until it has received Rflush for the last Tflush it has
> sent.
>
> Linux kernel-side 9p client doesn't issue such multiple Tflush, but the
> protocol allows that.
>
> qemu server does not guarantee in-order replies to multiple Tflush; worse,
> while it is likely to reply only to one of those, it may very well be
> the _first_ one. Subsequent ones will be lost; moreover, it'll leak
> both coroutines and ring slots.
>

Yeah I'm aware about that, I had started to work on a fix but it's low
priority on my TODO list since linux guests don't do that... and I got
scheduled.

> AFAICS, a clean way to deal with that would be to handle Tflush synchronously,
> right in pdu_submit():
> if pdu->type == Tflush
> look the victim request up.
> if !victim || victim == pdu // [*]
> send reply and free pdu immediately.
> if victim->type == Tflush // [**]
> pdu->victim = victim->victim
> else
> pdu->victim = victim
> mark the victim cancelled
> add pdu to the tail of pdu->victim->flushes
> and let pdu_complete(pdu) send a reply to each request in pdu->flushes
> as well (freeing them as we go)
>

I had kept the asynchronous way and it resulted in quite convoluted code.
I'll give a try as per your suggestion.

> Some locking might be needed to avoid the races between pdu_submit() and
> pdu_complete(), but it's not going to be a wide critical area. I'm not
> familiar with qemu locking primitives, sorry; what's providing an exclusion
> between QLIST_INSERT_HEAD in pdu_alloc() (from handle_9p_output())
> and QLIST_REMOVE in pdu_free() (from pdu_complete())? Looks like the same
> thing ought to suffice here...
>

This code runs in coroutines (stack switching, see include/qemu/coroutine.h for
details): it is serialized.

> [*] a cute way to hurt the current qemu server, BTW - coroutine that waits for
> itself to complete...
>

Indeed :)

> [**] Tflush to Tflush is another fun corner case - it does *not* do anything
> to the earlier Tflush, but reply to it must follow that to original Tflush
> to avoid tag reuse problems. Note that this is not the "multiple Tflush"
> case mentioned above - those would all have oldtag pointing to the same
> request; they are not chained and unlike this one can happen with legitimate
> clients. Tflush to Tflush, OTOH, is not something a sane client would do.

I agree I don't see when a client would want to do that but FWIW, it is
mentioned in the last paragraph of http://man.cat-v.org/plan_9/5/flush :)

Thanks for your valuable suggestions, Al!

Cheers.

--
Greg