Re: [Qemu-devel] [PATCH v9 04/12] vfio iommu: Add support for mediated devices

From: Kirti Wankhede
Date: Tue Nov 01 2016 - 23:30:57 EST




On 11/2/2016 6:54 AM, Alexey Kardashevskiy wrote:
> On 02/11/16 01:01, Kirti Wankhede wrote:
>>
>>
>> On 10/28/2016 7:48 AM, Alexey Kardashevskiy wrote:
>>> On 27/10/16 23:31, Kirti Wankhede wrote:
>>>>
>>>>
>>>> On 10/27/2016 12:50 PM, Alexey Kardashevskiy wrote:
>>>>> On 18/10/16 08:22, Kirti Wankhede wrote:
>>>>>> VFIO IOMMU drivers are designed for the devices which are IOMMU capable.
>>>>>> Mediated device only uses IOMMU APIs, the underlying hardware can be
>>>>>> managed by an IOMMU domain.
>>>>>>
>>>>>> Aim of this change is:
>>>>>> - To use most of the code of TYPE1 IOMMU driver for mediated devices
>>>>>> - To support direct assigned device and mediated device in single module
>>>>>>
>>>>>> Added two new callback functions to struct vfio_iommu_driver_ops. Backend
>>>>>> IOMMU module that supports pining and unpinning pages for mdev devices
>>>>>> should provide these functions.
>>>>>> Added APIs for pining and unpining pages to VFIO module. These calls back
>>>>>> into backend iommu module to actually pin and unpin pages.
>>>>>>
>>>>>> This change adds pin and unpin support for mediated device to TYPE1 IOMMU
>>>>>> backend module. More details:
>>>>>> - When iommu_group of mediated devices is attached, task structure is
>>>>>> cached which is used later to pin pages and page accounting.
>>>>>
>>>>>
>>>>> For SPAPR TCE IOMMU driver, I ended up caching mm_struct with
>>>>> atomic_inc(&container->mm->mm_count) (patches are on the way) instead of
>>>>> using @current or task as the process might be gone while VFIO container is
>>>>> still alive and @mm might be needed to do proper cleanup; this might not be
>>>>> an issue with this patchset now but still you seem to only use @mm from
>>>>> task_struct.
>>>>>
>>>>
>>>> Consider the example of QEMU process which creates VFIO container, QEMU
>>>> in its teardown path would release the container. How could container be
>>>> alive when process is gone?
>>>
>>> do_exit() in kernel/exit.c calls exit_mm() (which sets NULL to tsk->mm)
>>> first, and then releases open files by calling exit_files(). So
>>> container's release() does not have current->mm.
>>>
>>
>> Incrementing usage count (get_task_struct()) while saving task structure
>> and decementing it (put_task_struct()) from release() should work here.
>> Updating the patch.
>
> I cannot see how the task->usage counter prevents do_exit() from performing
> the exit, can you?
>

It will not prevent exit from do_exit(), but that will make sure that we
don't have stale pointer of task structure. Then we can check whether
the task is alive and get mm pointer in teardown path as below:

{
struct task_struct *task = domain->external_addr_space->task;
struct mm_struct *mm = NULL;

put_pfn(pfn, prot);

if (pid_alive(task))
mm = get_task_mm(task);

if (mm) {
if (do_accounting)
vfio_lock_acct(task, -1);

mmput(mm);
}
}

Thanks,
Kirti