Re: [PATCH v3 0/8] vfio: ccw: Refactoring the VFIO CCW state machine

From: Cornelia Huck
Date: Tue Jun 19 2018 - 10:00:54 EST


On Thu, 14 Jun 2018 10:06:31 +0200
Pierre Morel <pmorel@xxxxxxxxxxxxx> wrote:

> I tried to make a better description to add later in documentation
> or in the next cover-letter.
>
> Note that in the current patch series I did not implement online/offline
> events but just kept the previous state changes.
> Not sure if it was a good idea, the goal was to be as simple as possible
> for this iteration.
> If you think it is worth to continue in this direction I will re-introduce
> these as events.

I'm still trying to figure out what we want from the state machine.
I've tried sketching your fsm proposal as outlined by you below for
myself and I have some remarks :)

>
> ======================
>
> 1) In CCW VFIO, Finite State Machine is used to clarify the VFIO CCW driver
> actions to be take depending on the events occurring in a device.
>
> To protect the state transitions, a mutex protect the action
> started when an event occurs in a specific state.
>
> The actions must never sleep or keep the mutex a long timer.
>
> To be able to take the mutex when hardware events occur we start
> a work on a dedicated workqueue, posting the event from inside the
> workqueue.
>
> The state machine has very few states describing the device driver life.
>
> ------------------------------------------------------------
> | NOT_OPERATIONALÂ | No guest is associated with the device|

I don't think that this is the right description. It is either the
initial state, or something has happened that rendered the device
inaccessible.

Also, we need to be careful with the virtual machine vs. guest
terminology. The only thing that has an impact from the guest is when
it does I/O and when an interrupt is generated as a result (i.e., the
IDLE/BUSY transitions). The other transitions are triggered by virtual
machine setup.

> ------------------------------------------------------------
> | STANDBYÂÂÂÂÂÂÂÂÂ | A guest is associated but not started |

This is basically "device is bound to driver, but no mdev has been set
up". In my understanding, we need to get the device to IDLE state
before the vm can present it to the guest (be it before the machine is
up or during hotplug).

> ------------------------------------------------------------
> | IDLEÂÂÂÂÂÂÂÂÂÂÂÂ | Guest started device readyÂÂÂÂÂÂÂÂÂÂÂ |

Agree with "device ready", disagree with "guest started" (see above).
"Device ready, accepting I/O requests"?

> ------------------------------------------------------------
> | BUSYÂÂÂÂÂÂÂÂÂÂÂÂ | The device is busy doing IOÂÂÂÂÂÂÂÂÂÂ |
> ------------------------------------------------------------
> | QUIESCINGÂÂÂÂÂÂÂ | The driver is releasing the deviceÂÂÂ |

Now you are talking about the driver :) This should probably be done
for the other states as well.

Isn't that state for waiting for outstanding I/O to be terminated
before the mdev is destroyed? IOW, the device may stay bound to the
driver afterwards?

> ------------------------------------------------------------
>
>
> 2) The following Events may apply to the state machine:
>
> If the event is not described, it means it has no influence
> on the state and that no action is required.
>
> 2.1) FSM in state NOT_OPERATIONAL
> ------------------------------------------------------------
> | INITÂÂÂÂÂÂÂÂÂÂÂÂ | a guest will drive the SCÂÂÂÂÂÂÂÂÂÂÂÂ |

Better to write out "subchannel", I could not figure out the
abbreviation immediately.

Also, doesn't the event really mean "we're initializing the device"?
The ultimate intention is of course for a guest to use the device, but
the immediate intention is just "get the device through the first
initialization steps".

> | |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | triggered by: mdev_openÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |
> | | action: initialize driver structures |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | action: register IOMMU notifierÂÂÂÂÂÂ |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | state on success: STANDBYÂÂÂÂÂÂÂÂÂÂÂÂ |
> | | state on error : NOT_OPERATIONAL |
> ------------------------------------------------------------
>
> 2.2) FSM in state STANDBY
>
> ------------------------------------------------------------
> | ONLINEÂÂÂÂÂÂÂÂÂÂ | The host wants the SC onlineÂÂÂÂÂÂÂÂÂ |

Isn't that rather "an mdev is created"?

> | |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | triggered by: vfio_resetÂÂÂÂÂÂÂÂÂÂÂÂÂ |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | action: enable SCÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | state on success: IDLEÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |
> | | state on error : STANDBY |

What happens if the subchannel is not operational when we try to get it
ready? Can it go to NOT_OPERATIONAL in that case?

> ------------------------------------------------------------
>
> Some operations may be intercepted by the state machine but
> will not induce a state change:
> OFFLINE: re-issue the disabling of the SC

Should that even be possible? If we're still busy tearing it down,
shouldn't we be in QUIESCING state?

> IRQÂÂÂ : re-issue the disabling of the SC

Either we should have cleared the subchannel out during QUIESCING, or
we got an unsolicited interrupt. The latter can be avoided if we make
sure that the device is disabled when we go to STANDBY.

>
> 2.3) FSM in state IDLE
>
> ------------------------------------------------------------
> | SSCH | The guest issue the ssch instruction |
> | |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | triggered by: vfio_write SSCH=1ÂÂÂÂÂÂ |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | action: start an IO requestÂÂÂÂÂÂÂÂÂÂ |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | state on success: BUSYÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |
> | | state on error : IDLE |

Should there be any way to drop to NOT_OPERATIONAL as well? We'll
probably get a notification from the common I/O layer if that happens,
though.

> ------------------------------------------------------------
> 2.4) FSM in states IDLE or BUSY
>
> ------------------------------------------------------------
> | IRQÂÂÂÂÂÂÂÂÂÂÂÂÂ | The hardware issue an interruptÂÂÂÂÂÂ |

I think we can get this in QUIESCING as well, and it is an indication
that QUIESCING is done (for final states).

If the subchannel is enabled while in STANDBY, we could get an
interrupt there as well.

> | |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | triggered by: vfio_ccw_sch_irqÂÂÂÂÂÂÂ |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | action: update SCSW forward IRQÂÂÂÂÂÂ |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | state on success: IDLEÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |

One thing to consider (and I'm not sure we're handling it correctly
right now): What if we don't have a final status for the I/O yet, i.e.
there will be another interrupt for the I/O? Should we stay in BUSY in
that case?

> | | state on error : IDLE |

Hm, what error?

> ------------------------------------------------------------
>
> ------------------------------------------------------------
> | OFFLINEÂÂÂÂÂÂÂÂÂ | The host wants the SC offlineÂÂÂÂÂÂÂÂ |

That's mdev teardown, I guess?

> | |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | triggered by: css removeÂÂÂÂÂÂÂÂÂÂÂÂÂ |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | triggered by: css shutdownÂÂÂÂÂÂÂÂÂÂÂ |

These as well.

> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | action: disable SCÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | state on success: STANDBYÂÂÂÂÂÂÂÂÂÂÂÂ |

If it's really a css remove (unbind from driver or device removal), it
should not really have a target state, as the vfio-ccw device will be
gone, no? This is only correct for mdev removal.

> | | state on error : NOT_OPERATIONAL |
> ------------------------------------------------------------
>
> ------------------------------------------------------------
> | STORE_SCHIBÂÂÂÂÂ | The hardware issue "schib updated"ÂÂÂ |

This is the common I/O layer's sch_event callback. That can mean
different things:
- Something regarding the paths to the device changed. We can translate
that to "schib updated".
- Device is gone. This is a different situation; we may either try to
implement the disconnected state, or we may give up the device.

Also, can't we get this event in QUIESCING or STANDBY as well?

> | |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | triggered by: sch_eventÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |
> | | action: Store the SCHIB in IO region |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | state on success: no changeÂÂÂÂÂÂÂÂÂÂ |

As said, we may want some different handling for "device gone"
situations.

Also, what happens if we get that event in BUSY? Is the I/O still
running?

> | | state on error : NOT_OPERATIONAL |
> ------------------------------------------------------------
>
> NOTE 1:
> ÂÂ All out of band IO related instructions, XSCH, CSCH, HSCH
> ÂÂ can be started in both states IDLE and BUSY.

Hm. What do you mean by "out of band"?

You can, of course, get all of the instructions above in both IDLE and
BUSY states. The question is what we want to do with them. Do we want
to put them through to the hardware in any case? If we do e.g. a hsch
and get a cc 2, we don't want to drop to IDLE, but stay in BUSY (as the
start function is likely still running).

What about RSCH?

>
> NOTE 2:
> ÂÂ Currently IRQ means solicited interrupt, but handle both
> ÂÂ solicited and unsolicited interrupts.
> ÂÂ The unsolicited interrupt event may be implemented separatly.

We need to be ready for unsolicited interrupts at any point in time. It
mainly depends on the state whether we want to forward them to the
guest (BUSY, IDLE) or not (QUIESCING, STANDBY).

>
> 2.5) FSM in any state
>
> ------------------------------------------------------------
> | NOT_OPERATIONALÂ | The device is releasedÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |
> | |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | triggered by: mdev_releaseÂÂÂÂÂÂÂÂÂÂÂ |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | action: unregister IOMMU notifierÂÂÂÂ |
> |ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | state on success: NOT_OPERATIONALÂÂÂÂ |
> ------------------------------------------------------------

Confused. Isn't that "device gone or not operational" as well?