On 2019/11/11 21:27, Will Deacon wrote:
On Mon, Nov 11, 2019 at 09:23:55PM +0800, Zhenyu Ye wrote:
ARMv8.4-TLBI provides TLB invalidation instructions that apply to a
range of input addresses. This patch adds support for this feature.
This is the second version of the patch.
I traced __flush_tlb_range() for one minute and collected the following
statistics:
PAGENUM COUNT
1 34944
2 5683
3 1343
4 7857
5 838
9 339
16 933
19 427
20 5821
23 279
41 338
141 279
512 428
1668 120
2038 100
These data are based on kernel-5.4.0, where PAGENUM = end - start (in
pages) and COUNT is the number of calls to __flush_tlb_range() in one
minute. Only rows with COUNT >= 100 are shown. The kernel was booted
normally with transparent hugepages enabled. As we can see, although
most user TLBI ranges were 1 page long, the number of long-range
flushes cannot be ignored.
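To make the weight of the long ranges concrete, the table can be summed
both by call and by page. The sketch below is only arithmetic over the
rows listed above; the struct and function names are made up for
illustration, and it counts pages covered by the calls, not TLBI
instructions actually issued (the kernel may treat large ranges
differently).

```c
#include <stddef.h>

/* One row of the table above: range length in pages and calls per minute. */
struct flush_sample {
	unsigned long pagenum;
	unsigned long count;
};

/* Rows copied from the table above (COUNT >= 100 only). */
static const struct flush_sample samples[] = {
	{ 1, 34944 }, { 2, 5683 }, { 3, 1343 }, { 4, 7857 }, { 5, 838 },
	{ 9, 339 }, { 16, 933 }, { 19, 427 }, { 20, 5821 }, { 23, 279 },
	{ 41, 338 }, { 141, 279 }, { 512, 428 }, { 1668, 120 }, { 2038, 100 },
};

#define NR_SAMPLES (sizeof(samples) / sizeof(samples[0]))

/* Total number of listed __flush_tlb_range() calls. */
unsigned long total_calls(void)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < NR_SAMPLES; i++)
		sum += samples[i].count;
	return sum;
}

/* Total number of pages covered by the listed calls. */
unsigned long total_pages(void)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < NR_SAMPLES; i++)
		sum += samples[i].pagenum * samples[i].count;
	return sum;
}
```

Comparing samples[0].count against total_calls() and total_pages()
shows how the single-page calls dominate by call count but not by the
number of pages covered.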
The TLB range feature can improve performance considerably compared to
the current implementation. For example, flushing a 512-page range needs
only 1 instruction, as opposed to 512 instructions with the current
implementation. And for a new hardware feature, having support is better
than not.
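As a rough illustration of why one instruction can be enough, here is a
minimal sketch of the range arithmetic. It assumes the architectural
rule that a range operation covers (NUM + 1) * 2^(5 * SCALE + 1) pages,
with NUM in 0..31 and SCALE in 0..3. The helper names are hypothetical
and this is not the encoding code from the patch itself.

```c
/*
 * Pages covered by one range operation, assuming
 * pages = (NUM + 1) * 2^(5 * SCALE + 1).
 */
unsigned long tlbi_range_pages(unsigned int num, unsigned int scale)
{
	return (unsigned long)(num + 1) << (5 * scale + 1);
}

/*
 * Hypothetical helper: find a NUM/SCALE pair that covers exactly
 * 'pages' pages with a single range operation. Returns 0 on success,
 * -1 if no single operation matches (for example an odd page count,
 * since the smallest range is 2 pages).
 */
int tlbi_range_encode(unsigned long pages, unsigned int *num,
		      unsigned int *scale)
{
	if (pages == 0)
		return -1;

	for (unsigned int s = 0; s <= 3; s++) {
		unsigned long unit = 1UL << (5 * s + 1);

		if (pages % unit == 0 && pages / unit <= 32) {
			*scale = s;
			*num = (unsigned int)(pages / unit) - 1;
			return 0;
		}
	}
	return -1;
}
```

Under this assumption, a 512-page range maps to SCALE = 1, NUM = 7.
Lengths that do not fit one operation would need to be split into
several range operations plus single-page invalidations, which this
sketch does not attempt.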
Signed-off-by: Zhenyu Ye <yezhenyu2@xxxxxxxxxx>
---
ChangeLog v1 -> v2:
- Change the main implementation of this feature.
- Add some comments.
How does this address my concerns here:
https://lore.kernel.org/linux-arm-kernel/20191031131649.GB27196@willie-the-truck/
?
Will
I think your concern is more about the hardware level, and we can do
nothing about this at all in software. The interconnect/DVM
implementation is not exposed to the software layer (and does not need
to be), and should probably be constrained at the hardware level.