Re: [PATCH v1] drm/ttm: Refcount allocated tail pages
From: Dmitry Osipenko
Date: Mon Aug 15 2022 - 09:46:06 EST
On 8/15/22 16:06, Christian König wrote:
> Am 15.08.22 um 13:50 schrieb Dmitry Osipenko:
>> On 8/15/22 14:28, Christian König wrote:
>>>>>> Maybe it was discussed privately? In this case I will be happy to get
>>>>>> more info from you about the root of the problem so I could start to
>>>>>> look at how to fix it properly. It's not apparent where the problem
>>>>>> is to a TTM newbie like me.
>>>>>>
>>>>> Well, this is completely unfixable. See, the whole purpose of TTM is
>>>>> to allow tracking which parts of a buffer object are mapped where.
>>>>>
>>>>> If you circumvent that and increase the page reference yourself, then
>>>>> that whole functionality can't work correctly any more.
>>>> Are you suggesting that the problem is that TTM doesn't see the KVM
>>>> page faults/mappings?
>>> Yes, and no. It's one of the issues, but there is more behind that (e.g.
>>> what happens when TTM switches from pages to local memory for backing a
>>> BO).
>> If a KVM page fault could reach TTM, then it should be able to relocate
>> the BO. I see now where the problem is, thanks. Although, I'm wondering
>> whether it already works somehow. I'll try to play with the AMDGPU
>> shrinker and see what happens on guest mapping of a relocated BO.
>
> Well the page fault already somehow reaches TTM, otherwise the pfn
> couldn't be filled in in the first place.
>
> The issue is more that KVM should never ever grab a page reference to
> pages mapped with VM_IO or VM_PFNMAP.
>
> Essentially we need to apply the same restriction as with
> get_user_pages() here.
>
>>> Another question is why is KVM accessing the page structure in the first
>>> place? The VMA is mapped with VM_PFNMAP and VM_IO, KVM should never ever
>>> touch any of those pages.
>> https://elixir.bootlin.com/linux/v5.19/source/virt/kvm/kvm_main.c#L2549
>>
>
> Well that comment sounds like KVM is doing the right thing, so I'm
> wondering what exactly is going on here.
KVM actually doesn't hold the page reference; it takes a temporary
reference during the page fault and then drops it once the page is
mapped, IIUC. Is that still illegal for TTM? Or is there a possibility
of a race condition here?
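
To make the question concrete, here is a toy userspace model of the
take-then-drop pattern. It is only a sketch, not kernel code: struct
toy_page and the toy_*() helpers are made-up names, and the zero starting
refcount for the second page is an assumption about how a page that
nobody refcounted (e.g. a tail page of a higher-order allocation) would
look to the fault path.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page; only the refcount matters here. */
struct toy_page {
	int refcount;
	bool freed;
};

/* Models get_page(): unconditionally bump the count. */
void toy_get_page(struct toy_page *p)
{
	p->refcount++;
}

/* Models put_page(): the page is released when the count hits zero. */
void toy_put_page(struct toy_page *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0)
		p->freed = true;
}

/* Models the fault path as described above: temporary ref, map, drop. */
void toy_fault(struct toy_page *p)
{
	toy_get_page(p);
	/* ... the pfn would be installed into the guest mapping here ... */
	toy_put_page(p);
}
```

In this model a page that starts with refcount 1 comes back from
toy_fault() unchanged, while a page that starts with refcount 0 ends up
marked as freed even though its owner never released it. So even a
short-lived reference is not harmless if the page was never refcounted
in the first place.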
--
Best regards,
Dmitry