RE: [PATCH 2/2] random: add fork_event sysctl for polling VM forks
From: Michael Kelley (LINUX)
Date: Wed May 04 2022 - 11:45:45 EST
From: Alexander Graf <graf@xxxxxxxxxx> Sent: Monday, May 2, 2022 11:35 AM
>
> On 02.05.22 20:04, Jason A. Donenfeld wrote:
> > Hey Lennart,
> >
> > On Mon, May 02, 2022 at 06:51:19PM +0200, Lennart Poettering wrote:
> >> On Mo, 02.05.22 18:12, Jason A. Donenfeld (Jason@xxxxxxxxx) wrote:
> >>
> >>>>> In order to inform userspace of virtual machine forks, this commit adds
> >>>>> a "fork_event" sysctl, which does not return any data, but allows
> >>>>> userspace processes to poll() on it for notification of VM forks.
> >>>>>
> >>>>> It avoids exposing the actual vmgenid from the hypervisor to userspace,
> >>>>> in case there is any randomness value in keeping it secret. Rather,
> >>>>> userspace is expected to simply use getrandom() if it wants a fresh
> >>>>> value.
> >>>> Wouldn't it make sense to expose a monotonic 64bit counter of detected
> >>>> VM forks since boot through read()? It might be interesting to know
> >>>> for userspace how many forks it missed the fork events for. Moreover it
> >>>> might be interesting to userspace to know if any fork happened so far
> >>>> *at* *all*, by checking if the counter is non-zero.
> >>> "Might be interesting" is different from "definitely useful". I'm not
> >>> going to add this without a clear use case. This feature is pretty
> >>> narrowly scoped in its objectives right now, and I intend to keep it
> >>> that way if possible.
> >> Sure, whatever. I mean, if you think it's preferable to have 3 API
> >> abstractions for the same concept each for its special use case, then
> >> that's certainly one way to do things. I personally would try to
> >> figure out a modicum of generalization for things like this. But maybe
> >> that's just me…
> >>
> >> I can just tell you, that in systemd we'd have a use case for consuming
> >> such a generation counter: we try to provide stable MAC addresses for
> >> synthetic network interfaces managed by networkd, so we hash them from
> >> /etc/machine-id, but otoh people also want them to change when they
> >> clone their VMs. We could very nicely solve this if we had a
> >> generation counter easily accessible from userspace, that starts at 0
> >> initially. Because then we can hash as we always did when the counter
> >> is zero, but otherwise use something else, possibly hashed from the
> >> generation counter.
> > This doesn't work, because you could have memory-A split into memory-A.1
> > and memory-A.2, and both A.2 and A.1 would ++counter, and wind up with
> > the same new value "2". The solution is to instead have the hypervisor
> > pass a unique value and a counter. We currently have a 16 byte unique
> > value from the hypervisor, which I'm keeping as a kernel space secret
> > for the RNG; we're waiting on a word-sized monotonic counter interface
> > from hypervisors in the future. When we have the latter, then we can
> > start talking about mmapable things. Your use case would probably be
> > served by exposing that 16-byte unique value (hashed with some constant
> > for safety I suppose), but I'm hesitant to start going down that route
> > all at once, especially if we're to have a more useful counter in the
> > future.
>
>
> Michael, since we already changed the CID in the spec, can we add a
> property to the device that indicates the first 4 bytes of the UUID will
> always be different between parent and child?
>
> That should give us the ability to mmap the vmgenid directly to user
> space and act based on a simple u32 compare for clone notification, no?
>
I'm not ignoring this request, but my interpretation of the subsequent
discussion is that it's probably not the path that we want to go down
anyway. Is that a correct interpretation?

Also, the chances of getting the Windows team to focus on a revision
to the spec are not high, especially a revision that has new semantics. :-(
Getting the new CID added was a relatively low bar, though I'm still trying
to get the publicly available version of the spec updated to include the
new CID.

Michael
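
The objection in the quoted exchange to a guest-local fork counter, and the u32-compare idea that followed it, can be illustrated with a small model. This is only a sketch under stated assumptions, not anything from the patch or the vmgenid spec: `Snapshot`, `fork_vm`, `fresh_gen_id`, and `first_u32` are hypothetical names, and the sequential allocator merely stands in for a hypervisor that guarantees the first 4 bytes of the ID differ between parent and child.

```python
from dataclasses import dataclass
from itertools import count

# Hypothetical stand-in for the hypervisor: it has a global view, so it can
# hand every clone a distinct 16-byte ID. A sequential allocator is used here
# only to keep the model deterministic.
_next_id = count(1)


def fresh_gen_id() -> bytes:
    """Return a 16-byte ID whose first 4 bytes differ from all earlier IDs."""
    return next(_next_id).to_bytes(16, "little")


@dataclass(frozen=True)
class Snapshot:
    counter: int   # guest-local count of detected forks
    gen_id: bytes  # unique value supplied by the (modelled) hypervisor


def fork_vm(parent: Snapshot) -> Snapshot:
    """Model one VM fork: the clone bumps its own counter and gets a new ID."""
    return Snapshot(counter=parent.counter + 1, gen_id=fresh_gen_id())


def first_u32(gen_id: bytes) -> int:
    """The 'simple u32 compare' key: the first 4 bytes of the ID."""
    return int.from_bytes(gen_id[:4], "little")


memory_a = Snapshot(counter=1, gen_id=fresh_gen_id())
a1 = fork_vm(memory_a)  # memory-A.1
a2 = fork_vm(memory_a)  # memory-A.2

# Both clones independently increment and land on the same counter value,
# so the counter alone cannot tell the siblings apart.
print(a1.counter, a2.counter)

# The hypervisor-supplied IDs differ, and so do their first 4 bytes.
print(first_u32(memory_a.gen_id), first_u32(a1.gen_id), first_u32(a2.gen_id))
```

In this model a value derived from `counter` alone would be identical in both clones, while a comparison on `first_u32(gen_id)` distinguishes each clone from its parent and from its sibling, provided the hypervisor actually makes that guarantee.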