Re: [PATCH 3/3] 9p: Add mempools for RPCs
From: Christian Schoenebeck
Date: Sun Jul 10 2022 - 11:17:07 EST
On Sunday, 10 July 2022 15:19:56 CEST Dominique Martinet wrote:
> Christian Schoenebeck wrote on Sun, Jul 10, 2022 at 02:57:58PM +0200:
> > On Saturday, 9 July 2022 22:50:30 CEST Dominique Martinet wrote:
> > > Christian Schoenebeck wrote on Sat, Jul 09, 2022 at 08:08:41PM +0200:
> > > late replies to the oldtag are no longer allowed once Rflush has been
> > > sent.
> > That's not quite correct, it also explicitly says this:
> > "The server may respond to the pending request before responding to the
> > Tflush."
> > And independent of what the 9p2000 spec says, consider this:
> > 1. client sends a huge Twrite request
> > 2. server starts to perform that write but it takes very long
> > 3.A impatient client sends a Tflush to abort it
> > 3.B server finally responds to Twrite with a normal Rwrite
> > These last two actions 3.A and 3.B may happen concurrently within the same
> > transport time frame, or "at the same time" if you will. There is no way to
> > prevent that from happening.
> Yes, and that is precisely why we cannot free the buffers from the
> Twrite until we got the Rflush.
> Until the Rflush comes, a Rwrite can still come at any time so we cannot
> just free these resources.
With the current client version, agreed, as the client might then incorrectly
look up a wrong (new) request under the already recycled tag number. With
consecutive tag numbers this would not happen: a client lookup with the old tag
number would fail -> ignore reply. However ...
> In theory it'd be possible to free the buffers for some protocol and
> throw the data with the bathwater, but the man page says that in this
> case we should ignore the flush and behave as if the request behaved
> properly because of side-effects e.g. even if you try to interrupt an
> unlink() call if the server says it removed it, well, it's removed so we
> should tell userspace that.
... good point! I was probably thinking too much about Twrite/Tread examples,
so I hadn't considered that case.
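
A minimal userspace sketch of the rule discussed above, to make it concrete.
This is not the actual net/9p code; all type and function names are
hypothetical. It assumes only what was said in this thread: after a Tflush,
the original reply may still arrive and must then be honored, and nothing may
be released or reused until the Rflush has come in.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of one in-flight request (not a kernel type). */
enum req_outcome { REQ_PENDING, REQ_COMPLETED, REQ_CANCELLED };

struct req {
    enum req_outcome outcome;
    bool flush_pending;   /* Tflush sent, Rflush not yet received */
};

static void req_send(struct req *r)
{
    r->outcome = REQ_PENDING;
    r->flush_pending = false;
}

/* Client gives up waiting and sends Tflush; the request stays alive. */
static void req_send_tflush(struct req *r)
{
    if (r->outcome == REQ_PENDING)
        r->flush_pending = true;
}

/* The original reply (e.g. Rwrite) arrived, possibly racing the Tflush. */
static void req_on_reply(struct req *r)
{
    /* The server did the work, so report it even if a flush is pending. */
    if (r->outcome == REQ_PENDING)
        r->outcome = REQ_COMPLETED;
}

/* Rflush arrived: no further reply for the old tag can follow. */
static void req_on_rflush(struct req *r)
{
    r->flush_pending = false;
    if (r->outcome == REQ_PENDING)
        r->outcome = REQ_CANCELLED;
}

/* Buffers and tag may only be released once nothing can reference them. */
static bool req_can_release(const struct req *r)
{
    return r->outcome != REQ_PENDING && !r->flush_pending;
}
```

With this model, a reply that races with the Tflush still ends up as
REQ_COMPLETED, while req_can_release() stays false until the Rflush is seen.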
> > > > When the client sends a Tflush, it must wait to receive the
> > > > corresponding Rflush before reusing oldtag for subsequent messages
> > >
> > > if we free the request at this point we'd reuse the tag immediately,
> > > which would definitely lead to trouble.
> > Yes, that's the point: I never understood why the Linux client does this.
> > I find it problematic to recycle IDs in a distributed system within a
> > short time window. Additionally it also makes 9p protocol debugging more
> > difficult, as you often look at tag numbers in logs and think, "does this
> > reference the previous request, or is it about a new one now?"
> I can definitely agree with that.
> We need to keep track of used tags, but we don't need to pick the lowest
> tag available -- maybe the IDR code that allocates tag can be configured
> to endlessly increment and loop around, only avoiding duplicates?
> Ah, here it is, from Documentation/core-api/idr.rst:
> If you need to allocate IDs sequentially, you can use
> idr_alloc_cyclic(). The IDR becomes less efficient when dealing
> with larger IDs, so using this function comes at a slight cost.
> That would be another "easy change", if you'd like to check that cost at
> some point...
Nice! I'll definitely give this a whirl and will report back!
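
For illustration, a minimal userspace sketch of the two allocation policies
being compared. It deliberately does not use the kernel IDR API; the tiny
fixed-size pool, TAG_MAX, TAG_NONE and all function names are assumptions made
up for this example. It only shows the behavioural difference: "lowest free"
hands a just-freed tag straight back, while a cyclic cursor moves on and only
wraps around once the tag space is exhausted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TAG_MAX  8        /* tiny tag space, for illustration only */
#define TAG_NONE 0xFFFF   /* hypothetical "no tag available" marker */

struct tag_pool {
    bool in_use[TAG_MAX];
    uint16_t next;        /* cursor, used by the cyclic policy only */
};

static void tag_pool_init(struct tag_pool *p)
{
    memset(p, 0, sizeof(*p));
}

/* "Lowest free" policy: a freed tag is reused by the very next request. */
static uint16_t tag_alloc_lowest(struct tag_pool *p)
{
    for (uint16_t t = 0; t < TAG_MAX; t++) {
        if (!p->in_use[t]) {
            p->in_use[t] = true;
            return t;
        }
    }
    return TAG_NONE;
}

/* Cyclic policy: start searching after the last tag handed out. */
static uint16_t tag_alloc_cyclic(struct tag_pool *p)
{
    for (uint16_t i = 0; i < TAG_MAX; i++) {
        uint16_t t = (uint16_t)((p->next + i) % TAG_MAX);
        if (!p->in_use[t]) {
            p->in_use[t] = true;
            p->next = (uint16_t)((t + 1) % TAG_MAX);
            return t;
        }
    }
    return TAG_NONE;
}

static void tag_free(struct tag_pool *p, uint16_t t)
{
    assert(t < TAG_MAX && p->in_use[t]);
    p->in_use[t] = false;
}
```

In both variants, tags that are still in use are never handed out twice; the
cyclic one merely delays reuse, which is the property that helps when reading
logs.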
> (until we notice that some server has a static array for tags and stop
> working once you use a tag > 64 or something...)
That would be an incorrect server implementation then, a.k.a. a bug. The spec is
clear that tag numbers are generated by the client and does not mandate any
> Anyway, this is getting off-topic -- the point is that we need to keep
> resources around for the original reply when we send a Tflush, so we
> can't just free that buffer first unless you're really good with it.
> It'd be tempting to just steal its buffers but these might still be
> useful, if e.g. both replies come in parallel.
> (speaking of which, why do we need two buffers? Do we ever re-use the
> sent buffer once the reply comes?... this all looks sequential to me...)
Yep, I was thinking the exact same, but for now I would leave it this way.
> So instead of arguing here I'd say let's first finish your smaller reqs
> patches and make mempool again on top of that with a failsafe just for
> flush buffers to never fall back on mempool; I think that'll be easier to
> do in this order.
OK then, fine with me!
No time today, but I hope to post a new version next week.