RE: [Xen-devel] [PATCH 2/4] xenbus: limit when state is forced to closed

From: Durrant, Paul
Date: Mon Dec 09 2019 - 11:26:23 EST


> -----Original Message-----
[snip]
> >
> > Well unbind is pretty useless now IMO since bind doesn't work, and a
> transition straight to closed is just plain wrong anyway.
>
> Why do you claim that a straight transition into the closed state is
> wrong?

It's badly documented, I agree, but have a look at https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/xen-netback/xenbus.c#n480. Connected -> Closed is not a valid transition, and I don't think it was ever intended to be.

>
> I don't see any such mention in blkif.h, which also doesn't contain
> any guidelines regarding closing state transitions, so unless
> otherwise stated somewhere else transitions into closed can happen
> from any state IMO.
>

They can, but it is even more poorly documented what should be done in this case.

> > But, we could have a flag that the backend driver sets to say that it
> supports transparent re-bind that gates this code. Would that make you
> feel more comfortable?
>
> Having an option to leave state untouched when unbinding would be fine
> for me, otherwise state should be set to closed when unbinding. I
> don't think there's anything else that needs to be done in this
> regard, the cleanup should be exactly the same the only difference
> being the setting of all the active backends to closed state.
>

Ok, I'll add such a flag and define it for blkback only, in patch #4 i.e. when it actually gains the ability to rebind.

> > If you want unbind to actually do a proper unplug then that's extra work
> and not really something I want to tackle (and re-bind would still need to
> be toolstack initiated as something would have to re-create the xenstore
> area).
>
> Why do you say the xenstore area would need to be recreated?
>
> Setting state to closed shouldn't cause any cleanup of the xenstore
> area, as that should already happen for example when using pvgrub
> since in that case grub itself disconnects and already causes a
> transition to closed and a re-attachment afterwards by the guest
> kernel.
>

For some reason, when I originally tested, the xenstore area disappeared. I checked again and it did not this time. I just ended up with a frontend stuck in state 5 (because it is the system disk and won't go offline) trying to talk to a non-existent backend. Upon re-bind the backend goes into state 5 (because it sees the 5 in the frontend) and leaves the guest wedged.

Paul