Re: [PATCH v2 0/3] Fix several bugs in KVM stage 2 translation

From: wangyanan (Y)
Date: Wed Dec 02 2020 - 07:00:59 EST


Hi Will, Marc,
On 2020/12/2 4:59, Will Deacon wrote:
On Wed, Dec 02, 2020 at 04:10:31AM +0800, Yanan Wang wrote:
When installing a new pte entry or updating an old valid entry in stage 2
translation, we use get_page()/put_page() to track the page_count of the
page-table pages. PATCH 1/3 fixes incorrect use of get_page()/put_page() in
stage 2, which could leave page-table pages unable to be freed when unmapping
a range.
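
To illustrate the accounting rule described above, here is a rough sketch. It is
a toy user-space model and not the kernel code: struct table_page and the toy_*
helpers are made-up names, and the refcount field only stands in for page_count.
The assumption is that a reference is taken only when an entry goes from invalid
to valid, so updating an already valid entry must not take another one.

```c
#include <assert.h>
#include <stdbool.h>

#define PTE_VALID 1UL
#define PTES_PER_TABLE 4   /* tiny table, just for illustration */

/* Hypothetical stand-in for a page-table page and its page_count. */
struct table_page {
	int refcount;          /* 1 == allocated, holds no valid entries */
	unsigned long pte[PTES_PER_TABLE];
};

/* Install or update an entry; take a reference only on invalid -> valid. */
static void toy_set_pte(struct table_page *t, int idx, unsigned long new_pte)
{
	bool was_valid = t->pte[idx] & PTE_VALID;

	t->pte[idx] = new_pte | PTE_VALID;
	if (!was_valid)
		t->refcount++;     /* models get_page() */
}

/* Clear an entry; drop the reference only if the entry was valid. */
static void toy_clear_pte(struct table_page *t, int idx)
{
	if (!(t->pte[idx] & PTE_VALID))
		return;
	t->pte[idx] = 0;
	t->refcount--;         /* models put_page() */
}

/* Map, update the same slot, then unmap; returns the final refcount. */
static int toy_map_update_unmap(void)
{
	struct table_page t = { .refcount = 1 };

	toy_set_pte(&t, 0, 0x1000);   /* new entry: takes a reference */
	toy_set_pte(&t, 0, 0x2000);   /* update of a valid entry: no new reference */
	toy_clear_pte(&t, 0);         /* unmap: drops the one reference */
	return t.refcount;
}
```

If the update path took a second reference, the count would never drop back to
its initial value after the unmap, and the table page could not be freed.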

When dirty logging of a guest with hugepages is finished, we should merge tables
back into a block entry if the mapping needs to be adjusted to a huge mapping.
Besides installing the block entry, we must both free the non-huge page-table
pages and invalidate all the TLB entries of the non-huge mappings for the block.
PATCH 2/3 adds the missing TLBI when merging tables into a block entry.

The rewrite of the page-table code and fault handling adds two different handlers
for "just relaxing permissions" and "map by stage2 page-table walk", which is a
good improvement. Yet, in user_mem_abort(), the conditions under which we choose
between these two fault handlers are not strictly distinguished. Calling the
inappropriate fault handler causes guest errors such as an infinite loop (which
results in a soft lockup). So, a solution that strictly distinguishes the
conditions is introduced in PATCH 3/3.
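
As a rough sketch of what "strictly distinguishing" the two cases could look
like (a toy model only, not the actual change to user_mem_abort(); the enum
names, choose_handler() and its parameters are made up for illustration): the
assumption here is that relaxing permissions is only enough when the fault is a
permission fault and the size of the existing mapping already equals the size
we want to map. Everything else goes through the map path.

```c
#include <assert.h>

/* Hypothetical fault classification, for illustration only. */
enum fault_kind { FAULT_TRANSLATION, FAULT_PERMISSION };

/* The two handlers named in the cover letter. */
enum handler { HANDLER_RELAX_PERMS, HANDLER_MAP };

/*
 * fault_granule: size covered by the entry that raised the fault.
 * target_size:   size of the mapping we want to end up with.
 */
static enum handler choose_handler(enum fault_kind kind,
				   unsigned long fault_granule,
				   unsigned long target_size)
{
	/* Only a permission fault on a same-sized mapping can be relaxed. */
	if (kind == FAULT_PERMISSION && fault_granule == target_size)
		return HANDLER_RELAX_PERMS;

	/* Missing mapping, or the mapping size has to change: remap. */
	return HANDLER_MAP;
}
```

For example, a permission fault on a 4K entry while the target is a 2M block
would be sent to the map path in this model, instead of relaxing permissions on
an entry of the wrong size over and over.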
For the series:

Acked-by: Will Deacon <will@xxxxxxxxxx>

Thanks for reporting these, helping me to understand the issues and then
spinning a v2 so promptly.

Will

Thanks for the help and suggestions.

BTW: there are two more things below that I want to talk about.

1. Recently, I have been looking at the ARMv8.4-TTRem feature, which is aimed at changing the block size in stage 2 mappings.

I plan to implement this feature for stage 2 translation, for the cases of splitting a block into tables and merging tables into a block.

This feature supports changing the block size without performing *break-before-make*, which might improve performance.

What do you think about this?
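
To make the difference concrete, here is a rough sketch of the two update
sequences. It is a toy user-space model, not kernel code: struct toy_stats and
the toy_replace_* helpers are made-up names, and the counters only stand in for
the real TLBI/barrier work. It also assumes that stale TLB entries for the old
size still get invalidated after a direct replacement.

```c
#include <assert.h>

/* Hypothetical counters for the cost of one descriptor update. */
struct toy_stats {
	int tlbi;             /* number of TLB invalidations issued */
	int invalid_windows;  /* times the entry was observably invalid */
};

/* Break-before-make: invalidate the entry, flush, then write the new one. */
static void toy_replace_bbm(unsigned long *desc, unsigned long new_desc,
			    struct toy_stats *st)
{
	*desc = 0;               /* break: entry is invalid from here */
	st->invalid_windows++;   /* faults may be taken in this window */
	st->tlbi++;              /* models TLBI + barriers */
	*desc = new_desc;        /* make: install the new block/table */
}

/* Direct replacement, as TTRem would allow: no invalid window. */
static void toy_replace_direct(unsigned long *desc, unsigned long new_desc,
			       struct toy_stats *st)
{
	*desc = new_desc;        /* old and new may briefly coexist in the TLB */
	st->tlbi++;              /* assumed: drop stale entries afterwards */
}
```

In this model both paths end with the same descriptor, but only the
break-before-make path goes through a window where the entry is invalid.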


2. The issues we discussed before were found in practice when the guest state changes from dirty logging enabled to dirty logging canceled.

Given that, I could add a test covering this case to selftests/ or kvm-unit-tests, if it's necessary.


Yanan