Re: [RFC PATCH 03/11] Drivers: hv: vmbus: Replace the per-CPU channel lists with a global array of channels
From: Vitaly Kuznetsov
Date: Fri Apr 03 2020 - 10:56:51 EST
Andrea Parri <parri.andrea@xxxxxxxxx> writes:
> On Mon, Mar 30, 2020 at 02:45:54PM +0200, Vitaly Kuznetsov wrote:
>> Andrea Parri <parri.andrea@xxxxxxxxx> writes:
>>
>> >> Correct me if I'm wrong, but currently vmbus_chan_sched() accesses
>> >> per-cpu list of channels on the same CPU so we don't need a spinlock to
>> >> guarantee that during an interrupt we'll be able to see the update if it
>> >> happened before the interrupt (in chronological order). With a global
>> >> list of relids, who guarantees that an interrupt handler on another CPU
>> >> will actually see the modified list?
>> >
>> > Thanks for pointing this out!
>> >
>> > The offer/resume path presents implicit full memory barriers,
>> > program-order after the array store, which should guarantee the
>> > visibility of the store to *all* CPUs before the offer/resume can
>> > complete (cf.
>> >
>> > tools/memory-model/Documentation/explanation.txt, Sect. #13
>> >
>> > and assuming that the offer/resume for a channel must complete before
>> > the corresponding handler, which seems to be the case considered that
>> > some essential channel fields are initialized only later...)
>> >
>> > IIUC, the spin lock approach you suggested will work and be "simpler";
>> > an obvious side effect would be, well, a global synchronization point
>> > in vmbus_chan_sched()...
>> >
>> > Thoughts?
>>
>> This is, of course, very theoretical as if we're seeing an interrupt for
>> a channel at the same time we're writing its relid we're already in
>> trouble. I can, however, try to suggest one tiny improvement:
>
> Indeed. I think the idea (still quite informal) is that:
>
> 1) the mapping of the channel relid is propagated to (visible from)
> all CPUs before add_channel_work is queued (full barrier in
> queue_work()),
>
> 2) add_channel_work is queued before the channel is opened (aka,
> before the channel ring buffer is allocated/initialized and the
> OPENCHANNEL msg is sent and acked from Hyper-V, cf. OPEN_STATE),
>
> 3) the channel is opened before Hyper-V can start sending interrupts
> for the channel, and hence before vmbus_chan_sched() can find the
> channel relid in recv_int_page set,
>
> 4) vmbus_chan_sched() finds the channel's relid in recv_int_page
> set before it searches/loads from the channel array (full barrier
> in sync_test_and_clear_bit()).
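
For reference, the ordering in 1)-4) can be sketched in userspace C11.
This is only an analogue under stated assumptions, not the driver code:
MAX_RELIDS, struct channel, map_relid(), host_signal() and
chan_sched_one() are hypothetical names, the seq_cst fence stands in for
the full barrier in queue_work(), and the seq_cst fetch-and stands in
for sync_test_and_clear_bit().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_RELIDS 64	/* hypothetical bound: one bit per relid below */

struct channel { int relid; };	/* stand-in for struct vmbus_channel */

/* Global relid -> channel array and a one-word "interrupt page". */
static struct channel *_Atomic channels[MAX_RELIDS];
static atomic_ullong recv_int_page;

/* Steps 1-2: store the mapping, then a full barrier before any later work. */
static void map_relid(struct channel *ch)
{
	assert(ch->relid >= 0 && ch->relid < MAX_RELIDS);
	atomic_store_explicit(&channels[ch->relid], ch, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
}

/* Step 3: the "host" signals the channel only after it has been opened. */
static void host_signal(int relid)
{
	atomic_fetch_or_explicit(&recv_int_page, 1ULL << relid,
				 memory_order_seq_cst);
}

/* Step 4: test-and-clear the bit (fully ordered), then load the array. */
static struct channel *chan_sched_one(int relid)
{
	unsigned long long bit = 1ULL << relid;
	unsigned long long old;

	old = atomic_fetch_and_explicit(&recv_int_page, ~bit,
					memory_order_seq_cst);
	if (!(old & bit))
		return NULL;	/* no interrupt pending for this relid */

	return atomic_load_explicit(&channels[relid], memory_order_relaxed);
}
```

A handler that observes the bit set by host_signal() is ordered after
the fence in map_relid(), so its load from channels[] returns the stored
pointer without taking a lock.
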
>
> This is for the "normal"/not resuming from hibernation case; for the
> latter, notice that:
>
> a) vmbus_isr() (and vmbus_chan_sched()) cannot run until
> vmbus_bus_resume() has finished (@resume_noirq callback),
>
> b) vmbus_bus_resume() cannot complete before nr_chan_fixup_on_resume
> equals 0 in check_ready_for_resume_event().
>
> (and check_ready_for_resume_event() also provides a full barrier).
>
> If this makes sense to you, I'll try to add some of the above in comments.
>
It does, thank you!
--
Vitaly