Re: [RFC 02/14] x86/apic: Initialize Secure AVIC APIC backing page

From: Neeraj Upadhyay
Date: Wed Oct 09 2024 - 13:53:28 EST

On 10/9/2024 10:33 PM, Dave Hansen wrote:
> On 10/9/24 09:31, Neeraj Upadhyay wrote:
>>> Second, this looks to be allocating a potentially large physically
>>> contiguous chunk of memory, then handing it out 4k at a time. The loop is:
>>>
>>>     buf = alloc(NR_CPUS * PAGE_SIZE);
>>>     for (i = 0; i < NR_CPUS; i++)
>>>         foo[i] = buf + i * PAGE_SIZE;
>>>
>>> but could be:
>>>
>>>     for (i = 0; i < NR_CPUS; i++)
>>>         foo[i] = alloc(PAGE_SIZE);
>>>
>>> right?
>>
>> A single contiguous allocation is done here to reduce the TLB impact of backing
>> page accesses (e.g. sending an IPI requires writing to the target CPU's backing page).
>> I can change it to allocate in 2M chunks instead of one big allocation.
>> Is that fine? Also, as described in the commit message, reserving an entire 2M chunk
>> for backing pages prevents NPT entries from being split into individual 4K entries.
>> Such a split can happen if the guest does not allocate part of a 2M page for backing
>> pages and a page state change (from private to shared) is done for that part.
>
> Ick.
>
> First, this needs to be thoroughly commented, not in the changelogs.
>

Ok.

> Second, this is premature optimization at its finest. Just imagine if
> _every_ site that needed 16k or 32k of shared memory decided to allocate
> a 2M chunk for this _and_ used it sparsely. What's the average number
> of vCPUs in a guest? 4? 8?
>

Got it.

> The absolute minimum that we can do here is some stupid infrastructure
> that you call for allocating shared pages, or for things that _will_ be
> converted to shared so they get packed.
>
> But hacking uncommented 2M allocations into every site seems like
> insanity to me.
>
> IMNHO, you can either invest the time to put the infrastructure in place
> and get 2M pages, or you can live with the suboptimal performance of 4k.

I will start with 4K allocations. Later, I will collect performance numbers and,
if they justify it, propose a change to the allocation scheme: for example,
allocating the backing pages in larger contiguous batches out of the total
required (num_possible_cpus() * 4K), without reserving a full 2M.
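
To make the trade-off concrete, here is a rough userspace sketch, not the
actual patch. All names are hypothetical, and aligned_alloc() stands in for
the kernel page allocator. It shows the per-CPU 4K scheme next to the
arithmetic for what a 2M-granular reservation would set aside:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed sizes for illustration: 4K backing page, 2M reservation granule. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_HUGE_SIZE (2UL * 1024 * 1024)

/* Per-CPU scheme: one independent, page-aligned 4K allocation per CPU. */
int alloc_backing_pages_4k(void **pages, size_t nr_cpus)
{
	size_t i;

	assert(pages && nr_cpus > 0);

	for (i = 0; i < nr_cpus; i++) {
		pages[i] = aligned_alloc(SKETCH_PAGE_SIZE, SKETCH_PAGE_SIZE);
		if (!pages[i])
			goto err;
	}
	return 0;

err:
	/* Unwind the pages allocated before the failure. */
	while (i--)
		free(pages[i]);
	return -1;
}

void free_backing_pages_4k(void **pages, size_t nr_cpus)
{
	size_t i;

	for (i = 0; i < nr_cpus; i++)
		free(pages[i]);
}

/* Bytes the backing pages actually need: one 4K page per CPU. */
size_t backing_bytes_needed(size_t nr_cpus)
{
	return nr_cpus * SKETCH_PAGE_SIZE;
}

/* Bytes set aside if the same need is rounded up to whole 2M chunks. */
size_t backing_bytes_reserved_2m(size_t nr_cpus)
{
	size_t need = backing_bytes_needed(nr_cpus);

	return (need + SKETCH_HUGE_SIZE - 1) / SKETCH_HUGE_SIZE * SKETCH_HUGE_SIZE;
}
```

Under these assumptions an 8-vCPU guest needs 32K of backing pages, while a
2M-granular reservation would hold a full 2M for them.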


- Neeraj