Re: [Qemu-devel] [PATCH v9 04/12] vfio iommu: Add support for mediated devices

From: Kirti Wankhede
Date: Wed Nov 02 2016 - 08:46:24 EST

On 11/2/2016 5:51 PM, Jike Song wrote:
> On 11/02/2016 12:09 PM, Alexey Kardashevskiy wrote:
>> On 02/11/16 14:29, Kirti Wankhede wrote:
>>> On 11/2/2016 6:54 AM, Alexey Kardashevskiy wrote:
>>>> On 02/11/16 01:01, Kirti Wankhede wrote:
>>>>> On 10/28/2016 7:48 AM, Alexey Kardashevskiy wrote:
>>>>>> On 27/10/16 23:31, Kirti Wankhede wrote:
>>>>>>> On 10/27/2016 12:50 PM, Alexey Kardashevskiy wrote:
>>>>>>>> On 18/10/16 08:22, Kirti Wankhede wrote:
>>>>>>>>> VFIO IOMMU drivers are designed for the devices which are IOMMU capable.
>>>>>>>>> Mediated device only uses IOMMU APIs, the underlying hardware can be
>>>>>>>>> managed by an IOMMU domain.
>>>>>>>>> Aim of this change is:
>>>>>>>>> - To use most of the code of TYPE1 IOMMU driver for mediated devices
>>>>>>>>> - To support direct assigned device and mediated device in single module
>>>>>>>>> Added two new callback functions to struct vfio_iommu_driver_ops. Backend
>>>>>>>>> IOMMU module that supports pining and unpinning pages for mdev devices
>>>>>>>>> should provide these functions.
>>>>>>>>> Added APIs for pining and unpining pages to VFIO module. These calls back
>>>>>>>>> into backend iommu module to actually pin and unpin pages.
>>>>>>>>> This change adds pin and unpin support for mediated device to TYPE1 IOMMU
>>>>>>>>> backend module. More details:
>>>>>>>>> - When iommu_group of mediated devices is attached, task structure is
>>>>>>>>> cached which is used later to pin pages and page accounting.
>>>>>>>> For SPAPR TCE IOMMU driver, I ended up caching mm_struct with
>>>>>>>> atomic_inc(&container->mm->mm_count) (patches are on the way) instead of
>>>>>>>> using @current or task as the process might be gone while VFIO container is
>>>>>>>> still alive and @mm might be needed to do proper cleanup; this might not be
>>>>>>>> an issue with this patchset now but still you seem to only use @mm from
>>>>>>>> task_struct.
>>>>>>> Consider the example of QEMU process which creates VFIO container, QEMU
>>>>>>> in its teardown path would release the container. How could container be
>>>>>>> alive when process is gone?
>>>>>> do_exit() in kernel/exit.c calls exit_mm() (which sets NULL to tsk->mm)
>>>>>> first, and then releases open files by calling exit_files(). So
>>>>>> container's release() does not have current->mm.
>>>>> Incrementing usage count (get_task_struct()) while saving task structure
>>>>> and decementing it (put_task_struct()) from release() should work here.
>>>>> Updating the patch.
>>>> I cannot see how the task->usage counter prevents do_exit() from performing
>>>> the exit, can you?
>>> It will not prevent exit from do_exit(), but that will make sure that we
>>> don't have stale pointer of task structure. Then we can check whether
>>> the task is alive and get mm pointer in teardown path as below:
>> Or you could just reference and use @mm as KVM and others do. Or there is
>> anything else you need from @current than just @mm?
> I agree. If @mm is the only thing needed, there is really no reason to
> refer to the @task :-)

In vfio_lock_acct(), that is for page accounting, if mm->mmap_sem is
already held then page accounting is deferred, where task structure is
used to get mm and work is deferred only if mm exist:
mm = get_task_mm(task);

That is where this module need task structure.