Re: Introduce xenwatch multithreading (mtwatch)
From: Juergen Gross
Date: Fri Sep 14 2018 - 05:18:40 EST
On 14/09/18 09:34, Dongli Zhang wrote:
> Hi,
>
> This patch set introduces xenwatch multithreading (mtwatch) based on the
> below xen summit 2018 design session notes:
>
> https://lists.xenproject.org/archives/html/xen-devel/2018-07/msg00017.html
>
>
> xenwatch_thread is a single kernel thread that processes the callback
> functions for subscribed xenwatch events one at a time. The xenwatch thread
> stalls in 'D' state if any callback function stalls uninterruptibly.
>
> Creating or destroying a domU fails if xenwatch is stalled in 'D' state,
> because the paravirtual driver init/uninit cannot complete. Usually, the only
> option is to reboot the dom0 server, unless there is a workaround that lets
> the stalled xenwatch event callback function move forward and complete.
> Below is the output of 'xl create' when xenwatch is stalled (the issue is
> reproduced on purpose by hooking netif_receive_skb() to intercept an
> sk_buff sent out from vifX.Y on dom0 with patch at
> https://github.com/finallyjustice/patchset/blob/master/xenwatch-stall-by-vif.patch):
>
> # xl create pv.cfg
> Parsing config from pv.cfg
> libxl: error: libxl_device.c:1080:device_backend_callback: Domain 2:unable to add device with path /local/domain/0/backend/vbd/2/51712
> libxl: error: libxl_create.c:1278:domcreate_launch_dm: Domain 2:unable to add disk devices
> libxl: error: libxl_device.c:1080:device_backend_callback: Domain 2:unable to remove device with path /local/domain/0/backend/vbd/2/51712
> libxl: error: libxl_domain.c:1073:devices_destroy_cb: Domain 2:libxl__devices_destroy failed
> libxl: error: libxl_domain.c:1000:libxl__destroy_domid: Domain 2:Non-existant domain
> libxl: error: libxl_domain.c:959:domain_destroy_callback: Domain 2:Unable to destroy guest
> libxl: error: libxl_domain.c:886:domain_destroy_cb: Domain 2:Destruction of domain failed
>
>
> The idea of this patch set is to create a per-domU xenwatch thread for each
> domid. The per-domid thread is created when the first pv backend device (for
> this domid and with xenwatch multithreading enabled) is created, and it is
> destroyed when the last pv backend device (for this domid and with xenwatch
> multithreading enabled) is removed. A per-domid xs_watch_event is never put
> on the default event list; it is put directly on the per-domid event list.
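
The lifecycle described above (per-domid state created with the first backend device, torn down with the last, and events routed to a per-domid list instead of the default one) can be sketched in user-space C. This is a simplified illustration under stated assumptions, not code from the patch set: every name here (`mtwatch_domain`, `mtwatch_get`, `mtwatch_put`, `mtwatch_dispatch`, `MAX_DOMS`) is hypothetical, `struct event` merely stands in for `xs_watch_event`, and the worker thread itself and all locking are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_DOMS 16                 /* arbitrary bound for this sketch */

struct event {                      /* stand-in for xs_watch_event */
    int domid;
    struct event *next;
};

struct event_list {
    struct event *head, *tail;
    int len;
};

struct mtwatch_domain {             /* hypothetical per-domid state */
    int refcnt;                     /* live backend devices for this domid */
    struct event_list events;       /* would be drained by the per-domid thread */
};

struct mtwatch_domain *domains[MAX_DOMS];
struct event_list default_events;   /* would be drained by the default thread */

void event_list_append(struct event_list *l, struct event *e)
{
    e->next = NULL;
    if (l->tail)
        l->tail->next = e;
    else
        l->head = e;
    l->tail = e;
    l->len++;
}

/* Called when a backend device for domid is created. Returns 0 on success. */
int mtwatch_get(int domid)
{
    if (domid < 0 || domid >= MAX_DOMS)
        return -1;
    if (!domains[domid]) {
        /* First device: allocate state; a real version would also
         * start the per-domid thread here. */
        domains[domid] = calloc(1, sizeof(*domains[domid]));
        if (!domains[domid])
            return -1;
    }
    domains[domid]->refcnt++;
    return 0;
}

/* Called when a backend device for domid is removed. */
void mtwatch_put(int domid)
{
    struct mtwatch_domain *d = domains[domid];

    assert(d && d->refcnt > 0);
    if (--d->refcnt == 0) {
        /* Last device: a real version would stop the thread first.
         * Events are caller-owned in this sketch, so none are freed. */
        free(d);
        domains[domid] = NULL;
    }
}

/* Route an event. Returns 1 if queued per-domid, 0 if on the default list. */
int mtwatch_dispatch(struct event *e)
{
    struct mtwatch_domain *d = NULL;

    if (e->domid >= 0 && e->domid < MAX_DOMS)
        d = domains[e->domid];
    if (d) {
        event_list_append(&d->events, e);   /* never touches default list */
        return 1;
    }
    event_list_append(&default_events, e);
    return 0;
}
```

The reference count is what ties the thread's lifetime to the first and last backend device; an event for a domid without per-domid state falls back to the default list, so a stall in one domain's callback would only block that domain's queue.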
>
>
> For more details, please refer to the xen summit 2018 design session notes
> and presentation slides:
>
> https://lists.xenproject.org/archives/html/xen-devel/2018-07/msg00017.html
> http://www.donglizhang.org/xenwatch_multithreading.pdf
>
> ----------------------------------------------------------------
>
> Dongli Zhang (6):
> xenbus: prepare data structures and parameter for xenwatch multithreading
> xenbus: implement the xenwatch multithreading framework
> xenbus: dispatch per-domU watch event to per-domU xenwatch thread
> xenbus: process otherend_watch event at 'state' entry in xenwatch multithreading
> xenbus: process be_watch events in xenwatch multithreading
> drivers: enable xenwatch multithreading for xen-netback and xen-blkback driver
>
>  Documentation/admin-guide/kernel-parameters.txt |   3 +
>  drivers/block/xen-blkback/xenbus.c              |   3 +-
>  drivers/net/xen-netback/xenbus.c                |   1 +
>  drivers/xen/xenbus/xenbus_probe.c               |  24 +-
>  drivers/xen/xenbus/xenbus_probe_backend.c       |  32 +++
>  drivers/xen/xenbus/xenbus_xs.c                  | 357 +++++++++++++++++++++++-
>  include/xen/xenbus.h                            |  70 +++++
>  7 files changed, 484 insertions(+), 6 deletions(-)
>
> Thank you very much!
One general remark regarding your commit messages: Please drop the
patch number from the text itself; it's enough to have it in the mail
subject line. Later it is completely irrelevant in git history.
Juergen