On 3/14/19 12:33 PM, Oleksandr Andrushchenko wrote:
> On 3/14/19 17:40, Boris Ostrovsky wrote:
>> On 3/14/19 11:10 AM, Oleksandr Andrushchenko wrote:
>>> On 3/14/19 5:02 PM, Boris Ostrovsky wrote:
>>>> On 3/14/19 10:52 AM, Oleksandr Andrushchenko wrote:
>>>>> On 3/14/19 4:47 PM, Boris Ostrovsky wrote:
>>>>>> On 3/14/19 9:17 AM, Oleksandr Andrushchenko wrote:
>>>>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>>>>>>>>
>>>>>>>>> Currently on driver resume we remove all the network queues and
>>>>>>>>> destroy shared Tx/Rx rings leaving the driver in its current state
>>>>>>>>> and never signaling the backend of this frontend's state change.
>>>>>>>>> This leads to a number of consequences:
>>>>>>>>> - when frontend withdraws granted references to the rings etc. it
>>>>>>>>>   cannot be cleanly done as the backend still holds those (it was
>>>>>>>>>   not told to free the resources)
>>>>>>>>> - it is not possible to resume driver operation as all the
>>>>>>>>>   communication means with the backend were destroyed by the
>>>>>>>>>   frontend, thus making the frontend appear to the guest OS as
>>>>>>>>>   functional, but not really.
>>>>>>>> What do you mean? Are you saying that after resume you lose
>>>>>>>> connectivity?
>>>>>>> Exactly, if you take a look at the .resume callback as it is now,
>>>>>>> what it does is destroy the rings etc. and never notify the
>>>>>>> backend of that, e.g. it stays in, say, connected state with
>>>>>>> communication channels destroyed. It never goes into any other
>>>>>>> Xen bus state, so there is no way its state machine can help
>>>>>>> recovering.
>>>>>> My tree is about a month old so perhaps there is some sort of
>>>>>> regression but this certainly works for me. After resume netfront
>>>>>> gets XenbusStateInitWait from backend which causes xennet_connect().
>>>>> Ah, the difference may be in the way we get the guest to enter
>>>>> the suspend state. I am making my guest suspend with:
>>>>> echo mem > /sys/power/state
>>>>> And then I use an interrupt to the guest (this is a test code)
>>>>> to wake it up.
>>>>> Could you please share your exact use-case when the guest enters
>>>>> suspend and what you do to resume it?
>>>> xl save / xl restore
>>> Hm, so it does work for your use-case, but doesn't for mine.
>>> I can see no way backend may want to enter XenbusStateInitWait in my
>>> use-case as it simply doesn't know we want him to.

Yours looks like ACPI path, I don't know how well it was tested TBH.

-boris

>>> What would be the best way forward?
>>> 1. Implement .resume properly as, for example, block front does [1]
>>> 2. Remove .resume completely: this does work as long as backend
>>>    doesn't change anything
>> For save/restore (migration) there is no guarantee that the new backend
>> has the same set of features.
>>> I am still a bit unsure if we really need to re-initialize rings,
>>> re-read front's config from Xenstore etc - what changes on backend
>>> side are expected when we resume the front driver?
>> Number of queues, for example. Or things in xennet_fix_features().
> Ok, so it seems I have no choice, but implement proper .resume then )
> Thank you!
>> -boris
>>> Thank you,
>>> Oleksandr
[1]
https://elixir.bootlin.com/linux/v5.0.2/source/drivers/block/xen-blkfront.c#L2072
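
The failure mode discussed in this thread can be illustrated with a toy model. This is only a sketch under stated assumptions, not the netfront or netback code: every `toy_*` name is hypothetical, the three states are a simplified subset of the real Xenbus states, and the backend's reaction to a frontend state change is an assumption made for illustration. It does not model the `xl save` / `xl restore` path, where the backend side is set up again on restore. The point it tries to show is the one made above: if resume destroys the rings without changing the advertised state, nothing ever triggers a reconnect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified subset of Xenbus states (the real enum has more). */
enum toy_state { TOY_INITIALISING, TOY_INIT_WAIT, TOY_CONNECTED };

struct toy_link {
    enum toy_state front;  /* state advertised by the frontend */
    enum toy_state back;   /* state advertised by the backend */
    bool rings_valid;      /* shared Tx/Rx rings exist and are usable */
};

/* Assumed backend reaction to a frontend state change. */
static void toy_backend_on_front_change(struct toy_link *l)
{
    if (l->front == TOY_INITIALISING)
        l->back = TOY_INIT_WAIT;      /* release old rings, wait for new ones */
    else if (l->front == TOY_CONNECTED && l->back == TOY_INIT_WAIT)
        l->back = TOY_CONNECTED;
}

/* Frontend reaction to a backend state change: reconnect on InitWait. */
static void toy_frontend_on_back_change(struct toy_link *l)
{
    if (l->back == TOY_INIT_WAIT) {
        l->rings_valid = true;        /* stands in for xennet_connect() */
        l->front = TOY_CONNECTED;
        toy_backend_on_front_change(l);
    }
}

/* Resume as described in the patch text: rings destroyed, nothing signalled. */
static void toy_resume_silent(struct toy_link *l)
{
    assert(l != NULL);
    l->rings_valid = false;
    /* front stays TOY_CONNECTED, so no handler on either side ever runs */
}

/* Resume that announces a state change, so the handshake restarts. */
static void toy_resume_signalled(struct toy_link *l)
{
    assert(l != NULL);
    l->rings_valid = false;
    l->front = TOY_INITIALISING;
    toy_backend_on_front_change(l);   /* backend moves to InitWait */
    toy_frontend_on_back_change(l);   /* frontend rebuilds rings, reconnects */
}

/* The link carries traffic only if both ends are connected over valid rings. */
static bool toy_usable(const struct toy_link *l)
{
    return l->rings_valid &&
           l->front == TOY_CONNECTED && l->back == TOY_CONNECTED;
}
```

Starting from a connected link, `toy_resume_silent()` leaves both ends advertising the connected state over rings that no longer exist, while `toy_resume_signalled()` walks the handshake again and ends with a usable link.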