Re: [RFC PATCH 11/13] x86/uintr: Introduce uintr_wait() syscall
From: Andy Lutomirski
Date: Thu Sep 30 2021 - 18:02:12 EST
On Thu, Sep 30, 2021, at 12:29 PM, Thomas Gleixner wrote:
> On Thu, Sep 30 2021 at 11:08, Andy Lutomirski wrote:
>> On Tue, Sep 28, 2021, at 9:56 PM, Sohil Mehta wrote:
>> I think we have three choices:
>>
>> Use a fancy wrapper around SENDUIPI. This is probably a bad idea.
>>
>> Treat the NV-2 as a real interrupt and honor affinity settings. This
>> will be annoying and slow, I think, if it's even workable at all.
>
> We can make it a real interrupt in form of a per CPU interrupt, but
> affinity settings are not really feasible because the affinity is in the
> UPID.ndst field. So, yes we can target it to some CPU, but that's racy.
>
>> Handle this case with faults instead of interrupts. We could set a
>> reserved bit in UPID so that SENDUIPI results in #GP, decode it, and
>> process it. This puts the onus on the actual task causing trouble,
>> which is nice, and it lets us find the UPID and target directly
>> instead of walking all of them. I don't know how well it would play
>> with hypothetical future hardware-initiated uintrs, though.
>
> I thought about that as well and dismissed it due to the hardware
> initiated ones but thinking more about it, those need some translation
> unit (e.g. irq remapping) anyway, so it might be doable to catch those
> as well. So we could just ignore them for now and go for the #GP trick
> and deal with the device initiated ones later when they come around :)
Sounds good to me. In the long run, if Intel wants device initiated fancy interrupts to work well, they need a new design.
>
> But even with that we still need to keep track of the armed ones per CPU
> so we can handle CPU hotunplug correctly. Sigh...
I don’t think any real work is needed. We will only ever have armed UPIDs (with notification interrupts enabled) for running tasks, and hot-unplugged CPUs don’t have running tasks. We do need a way to drain pending IPIs before we offline a CPU, but that’s a separate problem and may be unsolvable for all I know. Is there a magic APIC operation to wait until all initiated IPIs targeting the local CPU arrive? I guess we can also just mask the notification vector so that it won’t crash us if we get a stale IPI after going offline.
>
> Thanks,
>
> tglx