Re: [PATCH] uio: Allow to take irq bottom-half into irq_handler with additional dt-binding
From: Andrey Zhizhikin
Date: Wed Dec 06 2017 - 11:59:14 EST
Hello Greg,
Thanks a lot for your prompt reply!
First off, this is my first patch submission to the kernel, so thanks a
lot for your additional guidance here regarding the missing pieces.
Please don't judge me too harshly. :)
I will add the new DT binding to Documentation and contact the DT
maintainers to have it discussed. However, I was not able to find any
dt-binding documentation for uio drivers in the kernel, so presumably
I would have to create a new entry there...
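For reference, here is a rough sketch of what such a binding entry
could look like, together with an illustrative node. The node name,
addresses and interrupt specifier below are made up for the example,
and the compatible string is whatever gets matched via the
uio_pdrv_genirq.of_id module parameter:

    Optional properties:
    - no-threaded-irq : boolean; when present, the UIO interrupt is
      requested with IRQF_NO_THREAD, so the bottom half runs in
      hard-IRQ context instead of a threaded handler.

    Example (illustrative values only):

        uio_device@43c00000 {
                compatible = "generic-uio";
                reg = <0x43c00000 0x1000>;
                interrupts = <0 29 4>;
                no-threaded-irq;
        };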
As for the latency win, and running the patch on a system which has
all IRQs in threaded mode:
I actually originated this patch on a PREEMPT_RT kernel
configuration, where all IRQs are threaded. I have an ARM-based system
running around 20 genirq UIO instances, and I demoted 2 of those
from threaded to non-threaded IRQ handlers without any issues recorded
in any of the IRQ handlers. The patch is written exactly with the
logic that if the new property is not found, then system behavior is
not amended and all IRQs stay threaded. If needed, a developer can
enable this property for their node, but then they should be well
aware of the property's implications.
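To make that fallback behavior concrete, the probe-side logic boils
down to a single conditional flag. A minimal sketch follows; the
helper name is mine, and the exact placement inside
uio_pdrv_genirq_probe() may differ from the actual patch:

    #include <linux/of.h>
    #include <linux/interrupt.h>
    #include <linux/uio_driver.h>

    /*
     * Translate the optional "no-threaded-irq" DT property into
     * IRQF_NO_THREAD before the UIO core requests the interrupt.
     * When the property is absent, irq_flags are left untouched, so
     * the handler stays threaded (e.g. on PREEMPT_RT), exactly as
     * before the patch.
     */
    static void uio_apply_no_threaded_irq(struct device_node *node,
                                          struct uio_info *uioinfo)
    {
            if (of_property_read_bool(node, "no-threaded-irq"))
                    uioinfo->irq_flags |= IRQF_NO_THREAD;
    }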
On average, analyzing with ftrace and kernelshark, I observed a gain
of 20-30 usec per chain: irq_handler_entry -> irq/XX-uio -> <user-space
IRQ part>, so I would say the gain is not very significant for an
average user-space task. However, IMHO there are several hidden
benefits to this modification, namely:
- It eliminates a few re-scheduling operations, making the INT ACK
code more robust and relieving pressure on the scheduler when the HW
interrupt for this IRQ is signaled at a high enough frequency;
- It makes the top and bottom halves execute back-to-back with IRQs
off, making the operation pseudo-atomic;
- The above gain might become significant when it is comparable to
the system's average latency figures.
I have a worst-case latency measured with cyclictest here of 50
usec, so as a developer I would consider the above gain worth having
in my system. :)
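For completeness, one way to capture such a chain with ftrace and
view it in kernelshark (an illustrative invocation, using only the
standard irq and sched tracepoints) is:

    # record hard-IRQ entry/exit and scheduler wakeups/switches,
    # then inspect the irq_handler_entry -> irq/XX-uio -> user-space
    # chain graphically
    trace-cmd record -e irq -e sched_wakeup -e sched_switch
    kernelshark trace.dat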
Please let me know what you think of these points and whether they
make sense to you; otherwise I can drop this patch.
--
Regards,
Andrey.
On Wed, Dec 6, 2017 at 4:31 PM, Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> On Wed, Dec 06, 2017 at 03:55:40PM +0100, Andrey Zhizhikin wrote:
>> Certain kernel preemption models use threaded interrupt handlers,
>> which is in general quite beneficial. However, threaded handlers
>> introduce additional scheduler overhead when the bottom-half thread
>> must be woken up and scheduled for execution. This can result in
>> additional latency, which in certain cases is not desired.
>>
>> A UIO driver with the generic IRQ handler that wraps a HW block
>> might suffer a small degradation when its bottom half is executed,
>> since the bottom half must be woken up by the scheduler every time
>> an INT is delivered. For high-rate INT signals, this also brings
>> additional undesired load on the scheduler itself.
>>
>> Since the actual ACK is performed in the top half, and the bottom
>> half of the UIO driver with the generic IRQ handler is relatively
>> lightweight (only a flag is set upon INT reception), it might be
>> beneficial to move this bottom half into the irq_handler itself,
>> rather than have a separate thread service it.
>>
>> This patch aims to address the task above and, in addition,
>> introduces a new dt-binding which can be configured on a per-node
>> basis. That means developers utilizing the UIO driver can decide
>> which UIO instances are critical in terms of interrupt processing,
>> and move their corresponding bottom halves into the irq_handler to
>> fight the additional scheduling latency.
>>
>> New DT binding:
>> - no-threaded-irq: when present, request_irq() is called with the
>> IRQF_NO_THREAD flag set, effectively skipping the threaded interrupt
>> handler and taking the bottom half into the irq_handler
>>
>> Signed-off-by: Andrey Zhizhikin <andrey.z@xxxxxxxxx>
>
> For new DT bindings, don't you have to add them to the in-kernel
> documentation and get an ack from the DT maintainers? Please do that
> here.
>
> Also, how much does this really save in latency/delay by not allowing
> a threaded irq? What about systems that run all irqs in threaded mode?
> Will that break something here?
>
> thanks,
>
> greg k-h