Re: [PATCH RFC 2/2] vfio-ccw: support for halt/clear subchannel
From: Cornelia Huck
Date: Fri Jun 08 2018 - 08:20:32 EST
On Thu, 7 Jun 2018 18:37:16 +0200
Pierre Morel <pmorel@xxxxxxxxxxxxx> wrote:
> On 07/06/2018 11:54, Cornelia Huck wrote:
> > --------------
> > | scsw(g) | ssch
> > -------------- |
> > | guest
> > --------------------------------------------------------------
> > | qemu
> > -------------- v
> > | scsw(q) | emulate
> > -------------- |
> > |
> > -------------- v
> > | scsw(r) | pwrite()
> > -------------- |
> > |
> > --------------------------------------------------------------
> > | vfio
> > v
> > ssch
> > |
> > --------------------------------------------------------------
> > | hardware
> > -------------- v
> > | scsw(h) | actually do something
> > --------------
> >
> > The guest issues a ssch (which gets intercepted; it won't get control
> > back until ssch finishes with a cc set.) scsw(g) won't change, unless
> > the guest does a stsch for the subchannel on another vcpu, in which
> > case it will get whatever information qemu holds in scsw(q) at that
> > point in time.
> >
> > When qemu starts to emulate the guest's ssch, it will set the start
> > function bit in the fctl field of scsw(q). It then copies scsw(q) to
> > scsw(r) in the vfio region.
> >
> > The vfio code will then proceed to call ssch on the real subchannel.
> > This is the first time we get really asynchronous, as the ssch will
> > return with cc set and the start function will be performed at some
> > point in time. If we would do a stsch on the real subchannel, we would
> > see that scsw(h) now has the start function bit set.
> >
> > Currently, we won't return back up the chain until we get an interrupt
> > from the hardware, at which time we update the scsw(r) from the irb.
>
> I do not find where the thread waits for interrupt inside the
> write->fsm_io_request->fsm_io_helper->ssch chain.
>
> I must have miss it ten times. Can you point it to me?
I'm not surprised you did not find it, because it isn't there :)
I've let myself be confused by the /* Start channel program and wait
for I/O interrupt. */ comment -- it is wrong and should be removed (but
we did have that waiting initially). So we already have the async
situation :)
>
> > This will propagate into the scsw(q). At the time we finish handling
> > the guest's ssch and return control to it, we're all done and if the
> > guest does a stsch to update its scsw(g), it will get the current
> > scsw(q) which will already contain the scsw from the interrupt's irb
> > (indicating that the start function is already finished).
> >
> > Now let's imagine we have a future implementation that handles actually
> > performing the start on the hardware asynchronously, i.e. it returns
> > control to the guest without the interrupt having been posted (let's
> > say that it is a longer-running I/O request). If the guest now did a
> > stsch to update scsw(g), it would get the current state of scsw(q),
> > which would be "start function set, but not done yet".
> >
> > If the guest now does a hsch, it would trap in the same way as the ssch
> > before. When qemu gets control, it adds the halt bit in scsw(q) (which
> > is in accordance with the architecture).
>
> This I agree.
> The scsw(q) is part of the QEMU device.
>
> > My proposal is to do the same
> > copying to scsw(r) again, which would mean we get a request with both
> > the halt and the start bit set. The vfio code now needs to do a hsch
> > (instead of a ssch). The real channel subsystem should figure this out,
> > as we can't reliably check whether the start function has concluded
> > already (there's always a race window).
>
> This I do not agree scsw(r) is part of the driver.
> The interface here is not a device interface anymore but a driver
> interface.
> SCSW is a status, it is at its place in QEMU device interface with the
> guest
> but here pwrite() sends a command.
Hm, I rather consider that "we write a status, and the backend figures
out what to do based on that status".
>
>
> Since we do not do a STSCH on each command, scsw(q), should be
> updated by QEMU depending on syscall return value.
> This is not done currently, only the success path is right
> because it is set before the call.
>
> If I read right and the IRQ is asynchronous, the scsw(q) is not right,
> because not updated, after the interrupt is received from the guest.
Yes, we're probably missing some updates.