Le 30/01/2025 à 21:26, Sebastian Andrzej Siewior a écrit :
On 2025-01-30 22:27:07 [+0530], Shrikanth Hegde wrote:
| #DEFINE need_irq_preemption() \
| (static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
|
| if (need_irq_preemption()) {
be a bit smaller/ quicker? This could be a fast path ;)
I am okay with either way. I did try both[1], there wasn't any significant difference,
hence chose a simpler one. May be system size, workload pattern might matter.
Let me do some more testing to see which one wins.
Is there any specific benchmark which might help here?
No idea. As per bean counting: preempt_model_preemptible() should
resolve in two function calls + conditional in the dynamic case. This
should be more expensive compared to a nop/ branch ;)
I did a build comparison on pmac32_defconfig + dynamic preemption, the static branch looks definitely more optimal:
Without static branch:
...
294: 7c 08 02 a6 mflr r0
298: 90 01 00 24 stw r0,36(r1)
29c: 48 00 00 01 bl 29c <interrupt_exit_kernel_prepare+0x50>
29c: R_PPC_REL24 preempt_model_full
2a0: 2c 03 00 00 cmpwi r3,0
2a4: 41 82 00 a4 beq 348 <interrupt_exit_kernel_prepare+0xfc>
...
2a8: 81 22 00 4c lwz r9,76(r2)
2ac: 71 29 00 04 andi. r9,r9,4
2b0: 40 82 00 d4 bne 384 <interrupt_exit_kernel_prepare+0x138>
...
2b4: 7d 20 00 a6 mfmsr r9
2b8: 55 29 07 fa rlwinm r9,r9,0,31,29
...
348: 48 00 00 01 bl 348 <interrupt_exit_kernel_prepare+0xfc>
348: R_PPC_REL24 preempt_model_lazy
34c: 2c 03 00 00 cmpwi r3,0
350: 40 a2 ff 58 bne 2a8 <interrupt_exit_kernel_prepare+0x5c>
354: 4b ff ff 60 b 2b4 <interrupt_exit_kernel_prepare+0x68>
With the static branch:
...
294: 48 00 00 58 b 2ec <interrupt_exit_kernel_prepare+0xa0>
...
298: 7d 20 00 a6 mfmsr r9
29c: 55 29 07 fa rlwinm r9,r9,0,31,29
...
2ec: 81 22 00 4c lwz r9,76(r2)
2f0: 71 29 00 04 andi. r9,r9,4
2f4: 41 82 ff a4 beq 298 <interrupt_exit_kernel_prepare+0x4c>
...
With the static branch it is just an unconditional branch being later replaced by a NOP. It must be more efficient.
So in first case we need to get and save LR, call preempt_model_full(), compare with 0, call preempt_model_lazy() and compare again, and at some point read and restore LR.
And the preempt_model_() functions are not tiny functions:
00003620 <preempt_model_full>:
3620: 3d 20 00 00 lis r9,0
3622: R_PPC_ADDR16_HA preempt_dynamic_mode
3624: 94 21 ff f0 stwu r1,-16(r1)
3628: 80 69 00 00 lwz r3,0(r9)
362a: R_PPC_ADDR16_LO preempt_dynamic_mode
362c: 2c 03 ff ff cmpwi r3,-1
3630: 41 82 00 18 beq 3648 <preempt_model_full+0x28>
3634: 68 63 00 02 xori r3,r3,2
3638: 38 21 00 10 addi r1,r1,16
363c: 7c 63 00 34 cntlzw r3,r3
3640: 54 63 d9 7e srwi r3,r3,5
3644: 4e 80 00 20 blr
...
3648: 0f e0 00 00 twui r0,0
364c: 68 63 00 02 xori r3,r3,2
3650: 38 21 00 10 addi r1,r1,16
3654: 7c 63 00 34 cntlzw r3,r3
3658: 54 63 d9 7e srwi r3,r3,5
365c: 4e 80 00 20 blr
00003660 <preempt_model_lazy>:
3660: 3d 20 00 00 lis r9,0
3662: R_PPC_ADDR16_HA preempt_dynamic_mode
3664: 94 21 ff f0 stwu r1,-16(r1)
3668: 80 69 00 00 lwz r3,0(r9)
366a: R_PPC_ADDR16_LO preempt_dynamic_mode
366c: 2c 03 ff ff cmpwi r3,-1
3670: 41 82 00 18 beq 3688 <preempt_model_lazy+0x28>
3674: 68 63 00 03 xori r3,r3,3
3678: 38 21 00 10 addi r1,r1,16
367c: 7c 63 00 34 cntlzw r3,r3
3680: 54 63 d9 7e srwi r3,r3,5
3684: 4e 80 00 20 blr
...
3688: 0f e0 00 00 twui r0,0
368c: 68 63 00 03 xori r3,r3,3
3690: 38 21 00 10 addi r1,r1,16
3694: 7c 63 00 34 cntlzw r3,r3
3698: 54 63 d9 7e srwi r3,r3,5
369c: 4e 80 00 20 blr
But you would still need preempt_model_preemptible() for the !DYN case.
+ preempt_model_voluntary() ? "voluntary" :
+ preempt_model_full() ? "full" :
+ preempt_model_lazy() ? "lazy" :
+ "",
So intend to rework this part. I have patches stashed at
https://eur01.safelinks.protection.outlook.com/? url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Fbigeasy%2Fstaging.git%2Flog%2F%3Fh%3Dpreemption_string&data=05%7C02%7Cchristophe.leroy%40csgroup.eu%7C553a8f640a9c4514597d08dd416c6534%7C8b87af7d86474dc78df45f69a2011bb5%7C0%7C0%7C638738655988556429%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&sdata=ZfVQi1iRYaVCTrZ5vIhOQ7yAKaDnOwFi0NRjycfrI5A%3D&reserved=0
which I didn't sent yet due to the merge window. Just a heads up ;)
Makes sense. I had seen at-least two places where this code was there, ftrace/powerpc.
There were way more places..
You want me to remove this part?
No, just be aware.
I don't know how this will be routed I guess we merge the sched pieces
first and then I submit the other pieces via the relevant maintainer
tree. In that case please be aware that all parts get removed/ replaced
properly.
Sebastian