Re: [PATCH RFC] pass write value to in_range pointers

From: Michael S. Tsirkin
Date: Mon Jun 22 2009 - 13:28:01 EST

On Mon, Jun 22, 2009 at 12:29:10PM -0400, Gregory Haskins wrote:
> Michael S. Tsirkin wrote:
> > On Mon, Jun 22, 2009 at 11:45:00AM -0400, Gregory Haskins wrote:
> >
> >> Michael S. Tsirkin wrote:
> >>
> >>> It seems that a lot of complexity and trickiness with iosignalfd is
> >>> handling the group/item relationship, which comes about because kvm does
> >>> not currently let a device on the bus claim a write transaction based on the
> >>> value written. This could be greatly simplified if the value written
> >>> was passed to the in_range check for write operation. We could then
> >>> simply make each kvm_iosignalfd a device on the bus.
> >>>
> >>> What does everyone think of the following lightly tested patch?
> >>>
> >>>
> >> Hi Michael,
> >> Its interesting, but I am not convinced its necessary. We created the
> >> group/item layout because iosignalfds are unique in that they are
> >> probably the only IO device that wants to do some kind of address
> >> aliasing.
> >>
> >
> > We actually already have aliasing: is_write flag is used for this
> > purpose.
> Yes, but read/write address aliasing is not the same thing is
> multi-match data aliasing.

What's the big difference?

> Besides, your proposal also breaks

s/break/removes limitation/ :)

> some of
> the natural relationship models
> (e.g. all the aliased iosignal_items
> always belong to the same underlying device. io_bus entries have an
> arbitrary topology).

iosignal_item is an artifact, they are not seen by user -
they are just a work around an API limitation.

And they are only grouped if the same PIO offset is used for all accesses.
Why is not always the case. If a device uses several PIO offsets
(as virtio does), you create separate devices for a single guest device too.

> > Actually, it's possible to remove is_write by passing
> > a null pointer in write_val for reads. I like this a bit less as
> > the code generated is less compact ... Avi, what do you think?
> >
> >
> >> With what you are proposing here, you are adding aliasing
> >> support to the general infrastructure which I am not (yet) convinced is
> >> necessary.
> >>
> >
> > Infrastructure is a big name for something that adds a total of 10 lines to kvm.
> > And it should at least halve the size of your 450-line patch.
> >
> Your patch isn't complete until some critical missing features are added
> to io_bus, though, so its not really just 10 lines.
> For one, it will
> need to support much more than 6 devices.

Isn't this like a #define change? With the item patch we are still
limited in the number of groups we can create.

What we gain is a simple array/list instead of a tree of
linked lists that makes cheshire cheese out of CPU data cache.

> It will also need to support
> multiple matches.

What, signal many fds on the same address/value pair?
I see this as a bug. Why is this a good thing to support?
Just increases the chance of leaking this fd.

> Also you are proposing an general interface change
> that doesn't make sense to all but one device type. So now every
> io-device developer that comes along will scratch their head at what to
> do with that field.

What do they do with is_write now? Ignore it. It's used in a whole
of 1 place.

> None of these are insurmountable hurdles, but my point is that today the
> complexity is encapsulated in the proper place IMO.

It's better to get rid of complexity than encapsulate it.

> E.g. The one and
> only device that cares to do this "weird" thing handles it behind an
> interface that makes sense to all parties involved.
> >
> >> If there isn't a use case for other devices to have
> >> aliasing, I would think the logic is best contained in iosignalfd. Do
> >> you have something in mind?
> >>
> >
> > One is enough :)
> >
> I am not convinced yet. ;) It appears to me that we are leaking
> iosignalfd-isms into the general code. If there is another device that
> wants to do something similar, ok. But I can't think of any.

You never know. is_write was used by a whole of 1 user: coalesced_mmio,
then your patch comes along ...

> > Seriously, do you see that this saves you all of RCU, linked lists and
> > counters?
> Well, also keep in mind we will probably be converting io_bus to RCU
> very soon, so we are going the opposite direction ;)
> Kind Regards,
> -Greg

Same direction. Let's put RCU in iobus, we don't need another one on
top of it. That's encapsulating complexity.

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at