Re: [PATCH] ptp: Add vDSO-style vmclock support
From: Michael S. Tsirkin
Date: Fri Jul 26 2024 - 01:56:06 EST
On Fri, Jul 26, 2024 at 01:09:24AM -0400, Michael S. Tsirkin wrote:
> On Thu, Jul 25, 2024 at 10:29:18PM +0100, David Woodhouse wrote:
> > > > > Then can't we fix it by interrupting all CPUs right after LM?
> > > > >
> > > > > To me that seems like a cleaner approach - we then compartmentalize
> > > > > the ABI issue - kernel has its own ABI against userspace,
> > > > > devices have their own ABI against kernel.
> > > > > It'd mean we need a way to detect that interrupt was sent,
> > > > > maybe yet another counter inside that structure.
> > > > >
> > > > > WDYT?
> > > > >
> > > > > By the way the same idea would work for snapshots -
> > > > > some people wanted to expose that info to userspace, too.
> >
> > Those people included me. I wanted to interrupt all the vCPUs, even the
> > ones which were in userspace at the moment of migration, and have the
> > kernel deal with passing it on to userspace via a different ABI.
> >
> > It ends up being complex and intricate, and requiring a lot of new
> > kernel and userspace support. I gave up on it in the end for snapshots,
> > and didn't go there again for this.
>
> Maybe because you insist on using ACPI?
> I see a fairly simple way to do it. For example, with virtio:
>
> one vq per CPU, with a single outstanding buffer,
> callback copies from the buffer into the userspace
> visible memory.
>
> Want me to show you the code?
Couldn't resist, so I wrote a bit of this code.

Fundamentally, we keep a copy of the hypervisor ABI
in the device:

	struct virtclk_info {
		struct vmclock_abi abi;
		spinlock_t lock;		/* protects abi */
		struct virtqueue **vq;		/* one vq per CPU */
	};
Each vq has its own copy:

	struct virtqueue_info {
		struct scatterlist sg[1];
		struct vmclock_abi abi;
	};
We add it during probe:

	sg_init_one(vqi->sg, &vqi->abi, sizeof(vqi->abi));
	virtqueue_add_inbuf(vq,
			    vqi->sg, 1,
			    &vqi->abi,
			    GFP_ATOMIC);
We set the affinity for each vq:

	for (i = 0; i < num_online_cpus(); i++)
		virtqueue_set_affinity(vci->vq[i], cpumask_of(i));

(virtio-net does this, and it handles CPU hotplug as well.)
Each vq callback would do:

	static void vmclock_cb(struct virtqueue *vq)
	{
		struct virtclk_info *vci = vq->vdev->priv;
		struct virtqueue_info *vqi = vq->priv;
		void *buf;
		unsigned int len;

		buf = virtqueue_get_buf(vq, &len);
		if (!buf)
			return;

		/* The only buffer ever queued on this vq is its own copy */
		BUG_ON(buf != &vqi->abi);

		spin_lock(&vci->lock);
		if (memcmp(&vci->abi, &vqi->abi, sizeof(vqi->abi))) {
			memcpy(&vci->abi, &vqi->abi, sizeof(vqi->abi));
		}

		/* Update the userspace visible structure now */
		.....

		/* Re-add the buffer */
		virtqueue_add_inbuf(vq,
				    vqi->sg, 1,
				    &vqi->abi,
				    GFP_ATOMIC);
		spin_unlock(&vci->lock);
	}
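
To make the compare-and-copy step a bit more concrete, here is a rough
userspace model of what the callback does with the data. It is only a
sketch under assumptions, not driver code: struct model_abi,
struct user_page, publish_if_changed() and read_snapshot() are all
hypothetical names, the field layout is a stand-in for struct vmclock_abi,
and the sequence counter is just one possible way to publish the result
to userspace readers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct vmclock_abi; the real layout differs.
 * Fields are sized so the struct has no implicit padding (memcmp-safe). */
struct model_abi {
	uint64_t counter;
	uint64_t period;
	uint32_t generation;
	uint32_t reserved;
};

/* Hypothetical page shared with userspace readers. */
struct user_page {
	atomic_uint seq;		/* odd while an update is in progress */
	struct model_abi abi;
};

/*
 * Writer side. In the driver this would run with the device lock held,
 * so there is a single writer at a time.
 * Returns true if the device copy changed and was published.
 */
static bool publish_if_changed(struct model_abi *dev_copy,
			       const struct model_abi *vq_copy,
			       struct user_page *page)
{
	/* Another vq may already have delivered the same update. */
	if (memcmp(dev_copy, vq_copy, sizeof(*dev_copy)) == 0)
		return false;

	memcpy(dev_copy, vq_copy, sizeof(*dev_copy));

	/* seq becomes odd: readers will retry until it is even again. */
	atomic_fetch_add_explicit(&page->seq, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	page->abi = *dev_copy;	/* plain copy; real code would use kernel seqcount helpers */
	atomic_fetch_add_explicit(&page->seq, 1, memory_order_release);
	return true;
}

/* Reader side: retry until a stable, even sequence number is observed. */
static struct model_abi read_snapshot(struct user_page *page)
{
	struct model_abi snap;
	unsigned int start;

	do {
		start = atomic_load_explicit(&page->seq, memory_order_acquire);
		snap = page->abi;
		atomic_thread_fence(memory_order_acquire);
	} while ((start & 1) ||
		 start != atomic_load_explicit(&page->seq, memory_order_relaxed));

	return snap;
}
```

The point of the memcmp() is that every per-CPU vq receives the same
update, so only the first callback to run actually changes the device
copy; the rest see identical data and republish nothing.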
That's it!
Where's the problem here?
--
MST