Re: [PATCH 09/21] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with operand SEPT
From: Yan Zhao
Date: Sat Sep 14 2024 - 06:02:45 EST
> > ===Resources & users list===
> >
> > Resources              SHARED users               EXCLUSIVE users
> > ------------------------------------------------------------------------
> > (1) TDR                tdh_mng_rdwr               tdh_mng_create
> >                        tdh_vp_create              tdh_mng_add_cx
> >                        tdh_vp_addcx               tdh_mng_init
> >                        tdh_vp_init                tdh_mng_vpflushdone
> >                        tdh_vp_enter               tdh_mng_key_config
> >                        tdh_vp_flush               tdh_mng_key_freeid
> >                        tdh_vp_rd_wr               tdh_mr_extend
> >                        tdh_mem_sept_add           tdh_mr_finalize
> >                        tdh_mem_sept_remove        tdh_vp_init_apicid
> >                        tdh_mem_page_aug           tdh_mem_page_add
> >                        tdh_mem_page_remove
> >                        tdh_mem_range_block
> >                        tdh_mem_track
> >                        tdh_mem_range_unblock
> >                        tdh_phymem_page_reclaim
>
> In pamt_walk() it calls promote_sharex_lock_hp() with the lock type passed into
> pamt_walk(), and tdh_phymem_page_reclaim() passed TDX_LOCK_EXCLUSIVE. So that is
> an exclusive lock. But we can ignore it because we only do reclaim at TD tear
> down time?
Hmm, if the page to reclaim is not a TDR page, lock_and_map_implicit_tdr() is
called to lock the page's corresponding TDR page with a SHARED lock.
If the page to reclaim is a TDR page, it is indeed locked EXCLUSIVE.

But pamt_walk() calls promote_sharex_lock_hp() for the passed-in
TDX_LOCK_EXCLUSIVE only when one of these conditions holds:
	if ((pamt_1gb->pt == PT_REG) || (target_size == PT_1GB)) or
	if ((pamt_2mb->pt == PT_REG) || (target_size == PT_2MB))

"pamt_1gb->pt == PT_REG" (or "pamt_2mb->pt == PT_REG") is true when the entry
is assigned (not PT_NDA) and is a normal page (i.e. not TDR, TDVPR...).
That is the case only after tdh_mem_page_add()/tdh_mem_page_aug() assigns the
page to a TD with a huge page size, which will not happen for a TDR page.

For normal pages, once huge pages are supported in the future, it looks like we
need to update tdh_phymem_page_reclaim() to include size info too.
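
To make the promotion condition above concrete, here is a rough model of it.
This is only a sketch: the enums and helper names below are made-up stand-ins
for illustration, not the TDX module's real definitions, and it models nothing
beyond the two conditions quoted above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins, NOT the TDX module's real definitions. */
enum page_type { PT_NDA, PT_REG, PT_TDR };
enum page_size { PT_4KB, PT_2MB, PT_1GB };

/*
 * Condition under which the walk promotes the lock at one PAMT level to
 * the caller's lock type: the entry at that level is a regular page, or
 * the walk targets exactly that level.
 */
static bool promotes_at_level(enum page_type level_pt,
			      enum page_size level_size,
			      enum page_size target_size)
{
	return level_pt == PT_REG || target_size == level_size;
}

/*
 * True if a walk to a target_size page promotes at the 1GB or 2MB level,
 * given the page types recorded in the 1GB and 2MB PAMT entries.
 */
static bool promotes_above_4kb(enum page_type pt_1gb, enum page_type pt_2mb,
			       enum page_size target_size)
{
	return promotes_at_level(pt_1gb, PT_1GB, target_size) ||
	       promotes_at_level(pt_2mb, PT_2MB, target_size);
}
```

In this model, a 4KB reclaim where neither upper-level entry is PT_REG (the
TDR case, assuming its upper entries are never PT_REG) gives
promotes_above_4kb(PT_NDA, PT_NDA, PT_4KB) == false, while a 4KB target under
a 2MB regular mapping, promotes_above_4kb(PT_NDA, PT_REG, PT_4KB), is true.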
>
> Separately, I wonder if we should try to add this info as comments around the
> SEAMCALL implementations. The locking is not part of the spec, but nevertheless
> the kernel is being coded against these assumptions. So it can sort of be
> like "the kernel assumes this" and we can at least record what the reason was.
> Or maybe just comment the parts that KVM assumes.
Agreed.
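
One possible way to record these assumptions, purely as a sketch: the
suggestion above is about comments, but a small lookup table is used here
instead so the example can be checked. The rows are copied from the TDR list
in this thread; the enum, struct and function names are hypothetical and do
not exist in the kernel.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration; not existing kernel code. */
enum tdr_lock_mode { TDR_LOCK_UNKNOWN, TDR_LOCK_SHARED, TDR_LOCK_EXCLUSIVE };

struct seamcall_lock_note {
	const char *seamcall;
	enum tdr_lock_mode tdr;	/* lock mode the kernel assumes on the TDR */
};

/* A few rows taken from the resources & users list quoted in this thread. */
static const struct seamcall_lock_note tdr_lock_notes[] = {
	{ "tdh_vp_enter",     TDR_LOCK_SHARED },
	{ "tdh_mem_sept_add", TDR_LOCK_SHARED },
	{ "tdh_mem_page_aug", TDR_LOCK_SHARED },
	{ "tdh_mem_page_add", TDR_LOCK_EXCLUSIVE },
	{ "tdh_mr_extend",    TDR_LOCK_EXCLUSIVE },
	{ "tdh_mng_init",     TDR_LOCK_EXCLUSIVE },
};

/* Return the recorded TDR lock mode, or TDR_LOCK_UNKNOWN if not listed. */
static enum tdr_lock_mode assumed_tdr_lock(const char *seamcall)
{
	size_t i;

	for (i = 0; i < sizeof(tdr_lock_notes) / sizeof(tdr_lock_notes[0]); i++)
		if (strcmp(tdr_lock_notes[i].seamcall, seamcall) == 0)
			return tdr_lock_notes[i].tdr;
	return TDR_LOCK_UNKNOWN;
}
```

tdh_phymem_page_reclaim is left out of the table on purpose, since its TDR
lock mode depends on whether the reclaimed page is itself the TDR page.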