Re: [PATCH v1 1/3] x86/tdx: Add TDX Guest event notify interrupt support
From: Sathyanarayanan Kuppuswamy
Date: Mon Mar 27 2023 - 22:50:41 EST
Hi Kai,
On 3/27/23 7:38 PM, Huang, Kai wrote:
>> +/* Reserve an IRQ from x86_vector_domain for TD event notification */
>> +static int __init tdx_event_irq_init(void)
>> +{
>> + struct irq_alloc_info info;
>> + cpumask_t saved_cpumask;
>> + struct irq_cfg *cfg;
>> + int cpu, irq;
>> +
>> + if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
>> + return 0;
>> +
>> + init_irq_alloc_info(&info, NULL);
>> +
>> + /*
>> + * Event notification vector will be delivered to the CPU
>> + * in which TDVMCALL_SETUP_NOTIFY_INTR hypercall is requested.
>> + * So set the IRQ affinity to the current CPU.
>> + */
>> + cpu = get_cpu();
>> + cpumask_copy(&saved_cpumask, current->cpus_ptr);
>> + info.mask = cpumask_of(cpu);
>> + put_cpu();
> The 'saved_cpumask' related code is ugly. If you move put_cpu() to the end of
> this function, I think you can remove all related code:
>
> cpu = get_cpu();
>
> /*
> * Set @info->mask to local cpu to make sure a valid vector is
> * pre-allocated when TDX event notification IRQ is allocated
> * from x86_vector_domain.
> */
> init_irq_alloc_info(&info, cpumask_of(cpu));
>
> // rest staff: request_irq(), hypercall ...
>
> put_cpu();
>
init_irq_alloc_info() is a sleeping function. Since get_cpu() disables
preemption, we cannot call sleeping function after it. Initially, I
have implemented it like you have mentioned. However, I discovered the
following error.
[ 2.400755] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
[ 2.404664] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
[ 2.408671] preempt_count: 1, expected: 0
[ 2.412650] RCU nest depth: 0, expected: 0
[ 2.412666] no locks held by swapper/0/1.
[ 2.416650] Preemption disabled at:
[ 2.416650] [<ffffffff83b8089f>] tdx_arch_init+0x38/0x117
[ 2.420670] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc4-00117-g672ca073d9f9-dirty #2527
[ 2.424650] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[ 2.424650] Call Trace:
[ 2.424650] <TASK>
[ 2.424650] dump_stack_lvl+0x6a/0x86
[ 2.424650] __might_resched.cold+0xf4/0x12f
[ 2.424650] __mutex_lock+0x50/0x810
[ 2.424650] ? lock_is_held_type+0xd8/0x130
[ 2.424650] ? __irq_alloc_descs+0xcf/0x310
[ 2.424650] ? find_held_lock+0x2b/0x80
[ 2.424650] ? __irq_alloc_descs+0xcf/0x310
[ 2.424650] __irq_alloc_descs+0xcf/0x310
[ 2.424650] irq_domain_alloc_descs.part.0+0x49/0xa0
[ 2.424650] __irq_domain_alloc_irqs+0x2a0/0x4f0
[ 2.424650] ? next_arg+0x129/0x1f0
[ 2.424650] ? tdx_guest_init+0x5b/0x5b
[ 2.424650] tdx_arch_init+0x8e/0x117
[ 2.424650] do_one_initcall+0x137/0x2ec
[ 2.424650] ? rcu_read_lock_sched_held+0x36/0x60
[ 2.424650] kernel_init_freeable+0x1e3/0x241
[ 2.424650] ? rest_init+0x1a0/0x1a0
[ 2.424650] kernel_init+0x17/0x170
[ 2.424650] ? rest_init+0x1a0/0x1a0
[ 2.424650] ret_from_fork+0x1f/0x30
[ 2.424650] </TASK>
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer