Re: [PATCH 3/3] 9p: Add mempools for RPCs

From: Christian Schoenebeck
Date: Sun Jul 10 2022 - 09:00:50 EST

Greg on CC: please correct me on false assumptions on QEMU side ...

On Samstag, 9. Juli 2022 22:50:30 CEST Dominique Martinet wrote:
> Christian Schoenebeck wrote on Sat, Jul 09, 2022 at 08:08:41PM +0200:
> > Mmm, I "think" that wouldn't be something new. There is no guarantee that
> > client would not get a late response delivery by server of a request that
> > client has already thrown away.
> No. Well, it shouldn't -- responding to tflush should guarantee that the
> associated request is thrown away by the server

Yes, but that's another aspect of Tflush, its main purpose actually: client
tells server that it no longer cares about the previously sent request with
oldtag=X. That prevents the server routines from hanging for good on things
that the client no longer cares about anyway, which otherwise might eventually
lead to a complete server lockup on certain setups.

On QEMU side we have a dedicated 'synth' fs driver test case to ensure that
this really works (a simulated fs I/O call that never returns -> Tflush aborts
it -> Test Passed):

> Order is not explicit, but I read this:
> > If it recognizes oldtag as the tag of a pending transaction, it should
> > abort any pending response and discard that tag.
> late replies to the oldtag are no longer allowed once rflush has been
> sent.

That's not quite correct, it also explicitly says this:

"The server may respond to the pending request before responding to the
Tflush."

And independent of what the 9p2000 spec says, consider this:

1. client sends a huge Twrite request
2. server starts to perform that write but it takes very long
3.A impatient client sends a Tflush to abort it
3.B server finally responds to Twrite with a normal Rwrite

These last two actions 3.A and 3.B may happen concurrently within the same
transport time frame, or "at the same time" if you will. There is no way to
prevent that from happening.
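
To make the consequence for the client concrete, here is a minimal sketch of
tag bookkeeping that tolerates this race. It is a toy model, not the Linux
client code: the class TagTable and all its method names are hypothetical. The
only rules it encodes are the ones discussed in this thread: a reply to oldtag
that arrives before Rflush is still honored, and oldtag only becomes reusable
once Rflush has arrived.

```python
class TagTable:
    """Toy client-side tag bookkeeping (hypothetical, not the Linux client)."""

    def __init__(self):
        self.in_use = {}    # tag -> "inflight" or "replied"
        self.flushes = {}   # tag of a pending Tflush -> oldtag it targets

    def send(self, tag):
        if tag in self.in_use:
            raise ValueError(f"tag {tag} is still reserved")
        self.in_use[tag] = "inflight"

    def send_flush(self, tag, oldtag):
        self.send(tag)
        self.flushes[tag] = oldtag

    def receive(self, tag):
        """Process a reply and return the tags that became reusable."""
        if tag in self.flushes:
            # Rflush arrived: the Tflush tag is free, and so is oldtag,
            # unless another Tflush for the same oldtag is still pending.
            oldtag = self.flushes.pop(tag)
            del self.in_use[tag]
            freed = [tag]
            if oldtag not in self.flushes.values():
                if self.in_use.pop(oldtag, None) is not None:
                    freed.append(oldtag)
            return freed
        if tag in self.flushes.values():
            # The reply raced with our Tflush (steps 3.A / 3.B): honor it,
            # but keep the tag reserved until the Rflush shows up.
            self.in_use[tag] = "replied"
            return []
        # Ordinary reply to a request that was never flushed.
        del self.in_use[tag]
        return [tag]


table = TagTable()
table.send(1)                         # Twrite tag=1
table.send_flush(2, oldtag=1)         # Tflush tag=2 oldtag=1
freed_by_rwrite = table.receive(1)    # Rwrite tag=1 arrives first
freed_by_rflush = table.receive(2)    # Rflush tag=2 arrives second
```

In this model the late Rwrite frees nothing; both tags are released only when
the Rflush is processed.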

> But I guess that also depends on the transport being sequential -- that
> is the case for TCP but is it true for virtio as well? e.g. if a server
> replies something and immediately replies rflush are we guaranteed
> rflush is received second by the client?

That's more an issue for the higher-level 9p server controller code than for
the low-level transport:

In the scenario described above, QEMU server would always send Rflush response
second, yes. So client would receive:

1. Rwrite or R(l)error
2. Rflush

Whether the same assumption can be made for every 9p server implementation
though, I could not say.

As for transport: virtio itself is really just two FIFO ringbuffers (one
ringbuffer client -> server, one ringbuffer server -> client). Once either
side has placed its request/response message there, it is there, standing in
the queue and waiting to be pulled by the other side, no way back. Both sides
pull messages out of their FIFO one by one, no look-ahead. And a significant
amount of time may pass before either side pulls the respective next message.
The order of messages received on one side always corresponds to the order of
messages sent by the other side, but that only applies to one ringbuffer
(direction). The two ringbuffers (message directions) are completely
independent from each other though, so no ordering assumption can be made
between them.
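
A toy model of that, just to show the ordering property. This is not virtio
code: the class Rings and its methods are made up for illustration, and each
ringbuffer is reduced to a plain FIFO.

```python
from collections import deque


class Rings:
    """Two independent FIFOs, one per message direction (hypothetical model)."""

    def __init__(self):
        self.to_server = deque()   # client -> server
        self.to_client = deque()   # server -> client

    def client_send(self, msg):
        self.to_server.append(msg)

    def server_send(self, msg):
        self.to_client.append(msg)

    def server_pull(self):
        return self.to_server.popleft()

    def client_pull(self):
        return self.to_client.popleft()


rings = Rings()
rings.client_send("Twrite tag=1")
rings.server_pull()                        # server starts the write
rings.server_send("Rwrite tag=1")          # write finished, reply is queued
# The client has not looked at its incoming FIFO yet and gives up waiting:
rings.client_send("Tflush tag=2 oldtag=1")
rings.server_pull()                        # server sees the Tflush
rings.server_send("Rflush tag=2")

# Within one direction the order is preserved ...
seen_by_client = [rings.client_pull(), rings.client_pull()]
```

Here the client pulls Rwrite first and Rflush second, in the order the server
queued them. What the model cannot express is any ordering between the Tflush
in one FIFO and the Rwrite in the other: the Tflush was sent after the Rwrite
had already been queued, and neither side could have known.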

> There's also this bit:
> > When the client sends a Tflush, it must wait to receive the
> > corresponding Rflush before reusing oldtag for subsequent messages
> if we free the request at this point we'd reuse the tag immediately,
> which definitely lead to troubles.

Yes, that's the point: I never understood why the Linux client does this. I
find it problematic to recycle IDs in a distributed system within a short time
window. Additionally it makes 9p protocol debugging more difficult, as you
often look at tag numbers in logs and think, "does this reference the previous
request, or is it about a new one now?"
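
One way to avoid quick recycling would be to hand out tags cyclically, so a
freed tag only comes back after the rest of the tag space has been walked.
The following is a sketch of that idea only, not a description of what any
existing client does: RoundRobinTags and its limit parameter are hypothetical.
The one protocol fact used is that tags are 16 bit and 0xFFFF is reserved as
NOTAG.

```python
class RoundRobinTags:
    """Cyclic tag allocator sketch (hypothetical, not an existing client)."""

    NOTAG = 0xFFFF  # reserved by 9p, never handed out

    def __init__(self, limit=NOTAG):
        self.limit = limit      # tags are drawn from 0 .. limit - 1
        self.cursor = 0         # next candidate tag
        self.in_use = set()

    def alloc(self):
        # Walk forward from the cursor, skipping tags still in flight.
        for _ in range(self.limit):
            tag = self.cursor
            self.cursor = (self.cursor + 1) % self.limit
            if tag not in self.in_use:
                self.in_use.add(tag)
                return tag
        raise RuntimeError("no free tags")

    def free(self, tag):
        self.in_use.remove(tag)


tags = RoundRobinTags(limit=4)   # tiny tag space, just for the demo
first = tags.alloc()
tags.free(first)
second = tags.alloc()            # does not hand the freed tag straight back
```

With the full 16 bit tag space a freed tag would not reappear until the cursor
has wrapped around, which also keeps tag numbers in logs unambiguous for much
longer.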

> > What happens on server side is: requests come in sequentially, and are
> > started to be processed exactly in that order. But then they are actually
> > running in parallel on worker threads, dispatched back and forth between
> > threads several times. And Tflush itself is really just another request.
> > So there is no guarantee that the response order corresponds to the order
> > of requests originally sent by client, and if client sent a Tflush, it
> > might still get a response to its causal, abolished "normal" request.
> yes and processing flush ought to get a lock or something and look for
> oldtag.
> Looking at qemu code it does it right: processing flush find the old
> request and marks it as cancelled, then it waits for the request to
> finish (and possibly get discarded) during which (pdu_complete) it'll
> wake the flush up; so spurrious replies of a tag after flush should not
> be possible.
> --
> Dominique