Re: [PATCH] drm/amdgpu: cache in more vm fault information

From: Alex Deucher
Date: Wed Mar 06 2024 - 11:15:43 EST


On Wed, Mar 6, 2024 at 11:06 AM Khatri, Sunil <sukhatri@xxxxxxx> wrote:
>
>
> On 3/6/2024 9:07 PM, Christian König wrote:
> > Am 06.03.24 um 16:13 schrieb Khatri, Sunil:
> >>
> >> On 3/6/2024 8:34 PM, Christian König wrote:
> >>> Am 06.03.24 um 15:29 schrieb Alex Deucher:
> >>>> On Wed, Mar 6, 2024 at 8:04 AM Khatri, Sunil <sukhatri@xxxxxxx> wrote:
> >>>>>
> >>>>> On 3/6/2024 6:12 PM, Christian König wrote:
> >>>>>> Am 06.03.24 um 11:40 schrieb Khatri, Sunil:
> >>>>>>> On 3/6/2024 3:37 PM, Christian König wrote:
> >>>>>>>> Am 06.03.24 um 10:04 schrieb Sunil Khatri:
> >>>>>>>>> When a page fault interrupt is raised, there
> >>>>>>>>> is a lot more information that is useful for
> >>>>>>>>> developers to analyse the page fault.
> >>>>>>>> Well, actually that information is not that interesting because
> >>>>>>>> it is hw generation specific.
> >>>>>>>>
> >>>>>>>> You should probably rather use the decoded strings here, e.g. hub,
> >>>>>>>> client, xcc_id, node_id etc...
> >>>>>>>>
> >>>>>>>> See gmc_v9_0_process_interrupt() for an example.
> >>>>>>> I saw that v9 does provide more information than v10 and v11,
> >>>>>>> like node_id and which die the fault came from, but that is again
> >>>>>>> very specific to IP_VERSION(9, 4, 3). I don't know why that
> >>>>>>> information is not there in v10 and v11.
> >>>>>>> I agree with your point, but as of now during a page fault we
> >>>>>>> dump this information, which is useful: which client generated
> >>>>>>> the interrupt, for which src, and other information like the
> >>>>>>> address. So I think we should provide similar information in the
> >>>>>>> devcoredump.
> >>>>>>>
> >>>>>>> Currently we do not have all this information from either the job
> >>>>>>> or the vm derived from the job during a reset. We could surely add
> >>>>>>> more relevant information later on request, but this information
> >>>>>>> is useful, as eventually it is only developers who would use the
> >>>>>>> dump file provided by a customer to debug.
> >>>>>>>
> >>>>>>> Below is the information that I dump in the devcoredump. I feel it
> >>>>>>> is good information, and new information could be added later.
> >>>>>>>
> >>>>>>>> Page fault information
> >>>>>>>> [gfxhub] page fault (src_id:0 ring:24 vmid:3 pasid:32773)
> >>>>>>>> in page starting at address 0x0000000000000000 from client 0x1b
> >>>>>>>> (UTCL2)
> >>>>>> This is a perfect example of what I mean. What you record in the
> >>>>>> patch is the client_id, but this is basically meaningless unless
> >>>>>> you have access to the AMD internal hw documentation.
> >>>>>>
> >>>>>> What you really need is the client in decoded form, in this case
> >>>>>> UTCL2. You can keep the client_id additionally, but the decoded
> >>>>>> client string is mandatory to have I think.
> >>>>>>
> >>>>> Sure, I am capturing that information. I am trying to keep memory
> >>>>> writes to a minimum as we are still in interrupt context here;
> >>>>> that is why I recorded the integer information rather than decoding
> >>>>> and writing strings there itself, and postponed the decoding until
> >>>>> we dump.
> >>>>>
> >>>>> For example, decoding to gfxhub/mmhub based on vmhub/vmid_src, and
> >>>>> the client string from the client id. So are we good to go with
> >>>>> sharing the above details in the devcoredump using the additional
> >>>>> information cached from the page fault?
> >>>> I think amdgpu_vm_fault_info() has everything you need already (vmhub,
> >>>> status, and addr). client_id and src_id are just tokens in the
> >>>> interrupt cookie so we know which IP to route the interrupt to. We
> >>>> know what they will be because otherwise we'd be in the interrupt
> >>>> handler for a different IP. I don't think ring_id has any useful
> >>>> information in this context and vmid and pasid are probably not too
> >>>> useful because they are just tokens to associate the fault with a
> >>>> process. It would be better to have the process name.
> >>
> >> Just to share context here Alex, I am preparing this for devcoredump;
> >> my intention was to replicate the information which the KMD already
> >> shares in dmesg for page faults. If we do not add the client id
> >> specifically, we would not be able to share enough information in
> >> the devcoredump.
> >> It would be just the address and hub (gfxhub/mmhub), and I think that
> >> is only partial information, as src id, client id and IP block share
> >> good information.
> >>
> >> For process related information, we already capture that as
> >> part of the dump from existing functionality.
> >> **** AMDGPU Device Coredump ****
> >> version: 1
> >> kernel: 6.7.0-amd-staging-drm-next
> >> module: amdgpu
> >> time: 45.084775181
> >> process_name: soft_recovery_p PID: 1780
> >>
> >> Ring timed out details
> >> IP Type: 0 Ring Name: gfx_0.0.0
> >>
> >> Page fault information
> >> [gfxhub] page fault (src_id:0 ring:24 vmid:3 pasid:32773)
> >> in page starting at address 0x0000000000000000 from client 0x1b (UTCL2)
> >> VRAM is lost due to GPU reset!
> >>
> >> Regards
> >> Sunil
> >>
> >>>
> >>> The decoded client name would be really useful I think since the
> >>> fault handler is a catch-all and handles a whole bunch of different
> >>> clients.
> >>>
> >>> But that should be ideally passed in as const string instead of the
> >>> hw generation specific client_id.
> >>>
> >>> As long as it's only a pointer we also don't run into the trouble
> >>> that we need to allocate memory for it.
> >>
> >> I agree, but I would prefer adding the client id and decoding it in
> >> the devcoredump using soc15_ih_clientid_name[fault_info->client_id].
> >> Otherwise we have to sprintf this string into fault_info in irq
> >> context, which I guess writes more bytes to memory than an
> >> integer :)
> >
> > Well, I totally agree that we shouldn't fiddle too much in the interrupt
> > handler, but exactly what you suggest here won't work.
> >
> > The client_id is hw generation specific, so the only one who has that
> > is the hw generation specific fault handler. Just compare the defines
> > here:
> >
> > https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c#L83
> >
> >
> > and here:
> >
> > https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/gfxhub_v11_5_0.c#L38
> >
> >
> Got your point. Let me see, but this is a lot of work in irq context.
> Either we can drop the client id entirely, as Alex is suggesting
> here, since it will always be the same client and src id, or let me come
> up with a patch and see if it is acceptable.
>
> Also, as Alex pointed out, we need to decode from the status register
> which kind of page fault it is (permission, read, write etc.). This is
> again all family specific, and it is all in IRQ context. I am not
> feeling good about it, but let me try to share all that in a new patch.
>

I don't think you need to decode it. As long as you have a way to
identify the chip, we can just include the raw status register and the
developer can decode it when they look at the devcoredump.
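
To make the split concrete, here is a minimal sketch in plain userspace C,
not amdgpu code. Everything in it is an assumption made for illustration:
struct fault_record, struct status_layout, example_layouts, the chip_id
values and the mask/shift numbers are hypothetical and do not describe any
real register layout. It only shows the shape of the idea: the interrupt
side stores raw integers, and whoever reads the dump decodes them per chip.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* What the interrupt path caches: plain integers only, nothing decoded. */
struct fault_record {
	uint32_t chip_id;	/* hypothetical stand-in for "a way to identify the chip" */
	uint64_t addr;
	uint32_t status;	/* raw fault status register value */
};

/* HYPOTHETICAL per-chip field layout; real masks and shifts differ per hw generation. */
struct status_layout {
	uint32_t chip_id;
	uint32_t rw_mask;
	unsigned int rw_shift;
	uint32_t cid_mask;
	unsigned int cid_shift;
};

const struct status_layout example_layouts[] = {
	{ 1, 0x00040000u, 18, 0x0003fe00u, 9 },
	{ 2, 0x00000100u,  8, 0x000000ffu, 0 },
};

/* Extract one field from a raw register value. */
uint32_t get_field(uint32_t raw, uint32_t mask, unsigned int shift)
{
	return (raw & mask) >> shift;
}

/* Interrupt-side work: a few stores, no string handling, no allocation. */
void fault_record_store(struct fault_record *rec, uint32_t chip_id,
			uint64_t addr, uint32_t status)
{
	assert(rec != NULL);
	if (!status)	/* keep the last useful fault, as the existing cache does */
		return;
	rec->chip_id = chip_id;
	rec->addr = addr;
	rec->status = status;
}

/* Dump-side work: always print the raw value, decode only if the layout is known. */
int fault_record_format(const struct fault_record *rec, char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < sizeof(example_layouts) / sizeof(example_layouts[0]); i++) {
		const struct status_layout *l = &example_layouts[i];

		if (l->chip_id != rec->chip_id)
			continue;
		return snprintf(buf, len, "status:0x%08X rw:%u cid:0x%x",
				(unsigned int)rec->status,
				(unsigned int)get_field(rec->status, l->rw_mask, l->rw_shift),
				(unsigned int)get_field(rec->status, l->cid_mask, l->cid_shift));
	}
	return snprintf(buf, len, "status:0x%08X (layout unknown)",
			(unsigned int)rec->status);
}
```

The raw value is printed in both branches, so a dump from a chip the
decoder does not know about is still usable by someone with the register
documentation.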

Alex


> Regards
> Sunil.
>
> > Regards,
> > Christian.
> >
> >>
> >> We can argue about values like pasid, vmid and ring id being taken
> >> out if they are not useful at all.
> >>
> >> Regards
> >> Sunil
> >>
> >>>
> >>> Christian.
> >>>
> >>>>
> >>>> Alex
> >>>>
> >>>>> regards
> >>>>> sunil
> >>>>>
> >>>>>> Regards,
> >>>>>> Christian.
> >>>>>>
> >>>>>>> Regards
> >>>>>>> Sunil Khatri
> >>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Christian.
> >>>>>>>>
> >>>>>>>>> Add all such information in the last cached
> >>>>>>>>> pagefault from an interrupt handler.
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Sunil Khatri <sunil.khatri@xxxxxxx>
> >>>>>>>>> ---
> >>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 +++++++--
> >>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 7 ++++++-
> >>>>>>>>> drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 2 +-
> >>>>>>>>> drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 2 +-
> >>>>>>>>> drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 2 +-
> >>>>>>>>> drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 2 +-
> >>>>>>>>> drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 2 +-
> >>>>>>>>> 7 files changed, 18 insertions(+), 8 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >>>>>>>>> index 4299ce386322..b77e8e28769d 100644
> >>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >>>>>>>>> @@ -2905,7 +2905,7 @@ void amdgpu_debugfs_vm_bo_info(struct
> >>>>>>>>> amdgpu_vm *vm, struct seq_file *m)
> >>>>>>>>> * Cache the fault info for later use by userspace in
> >>>>>>>>> debugging.
> >>>>>>>>> */
> >>>>>>>>> void amdgpu_vm_update_fault_cache(struct amdgpu_device *adev,
> >>>>>>>>> - unsigned int pasid,
> >>>>>>>>> + struct amdgpu_iv_entry *entry,
> >>>>>>>>> uint64_t addr,
> >>>>>>>>> uint32_t status,
> >>>>>>>>> unsigned int vmhub)
> >>>>>>>>> @@ -2915,7 +2915,7 @@ void amdgpu_vm_update_fault_cache(struct
> >>>>>>>>> amdgpu_device *adev,
> >>>>>>>>> xa_lock_irqsave(&adev->vm_manager.pasids, flags);
> >>>>>>>>> - vm = xa_load(&adev->vm_manager.pasids, pasid);
> >>>>>>>>> + vm = xa_load(&adev->vm_manager.pasids, entry->pasid);
> >>>>>>>>> /* Don't update the fault cache if status is 0. In the
> >>>>>>>>> multiple
> >>>>>>>>> * fault case, subsequent faults will return a 0 status
> >>>>>>>>> which is
> >>>>>>>>> * useless for userspace and replaces the useful fault
> >>>>>>>>> status, so
> >>>>>>>>> @@ -2924,6 +2924,11 @@ void amdgpu_vm_update_fault_cache(struct
> >>>>>>>>> amdgpu_device *adev,
> >>>>>>>>> if (vm && status) {
> >>>>>>>>> vm->fault_info.addr = addr;
> >>>>>>>>> vm->fault_info.status = status;
> >>>>>>>>> + vm->fault_info.client_id = entry->client_id;
> >>>>>>>>> + vm->fault_info.src_id = entry->src_id;
> >>>>>>>>> + vm->fault_info.vmid = entry->vmid;
> >>>>>>>>> + vm->fault_info.pasid = entry->pasid;
> >>>>>>>>> + vm->fault_info.ring_id = entry->ring_id;
> >>>>>>>>> if (AMDGPU_IS_GFXHUB(vmhub)) {
> >>>>>>>>> vm->fault_info.vmhub = AMDGPU_VMHUB_TYPE_GFX;
> >>>>>>>>> vm->fault_info.vmhub |=
> >>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>>>>>>>> index 047ec1930d12..c7782a89bdb5 100644
> >>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>>>>>>>> @@ -286,6 +286,11 @@ struct amdgpu_vm_fault_info {
> >>>>>>>>> uint32_t status;
> >>>>>>>>> /* which vmhub? gfxhub, mmhub, etc. */
> >>>>>>>>> unsigned int vmhub;
> >>>>>>>>> + unsigned int client_id;
> >>>>>>>>> + unsigned int src_id;
> >>>>>>>>> + unsigned int ring_id;
> >>>>>>>>> + unsigned int pasid;
> >>>>>>>>> + unsigned int vmid;
> >>>>>>>>> };
> >>>>>>>>> struct amdgpu_vm {
> >>>>>>>>> @@ -605,7 +610,7 @@ static inline void
> >>>>>>>>> amdgpu_vm_eviction_unlock(struct amdgpu_vm *vm)
> >>>>>>>>> }
> >>>>>>>>> void amdgpu_vm_update_fault_cache(struct amdgpu_device
> >>>>>>>>> *adev,
> >>>>>>>>> - unsigned int pasid,
> >>>>>>>>> + struct amdgpu_iv_entry *entry,
> >>>>>>>>> uint64_t addr,
> >>>>>>>>> uint32_t status,
> >>>>>>>>> unsigned int vmhub);
> >>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> >>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> >>>>>>>>> index d933e19e0cf5..6b177ce8db0e 100644
> >>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> >>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> >>>>>>>>> @@ -150,7 +150,7 @@ static int gmc_v10_0_process_interrupt(struct
> >>>>>>>>> amdgpu_device *adev,
> >>>>>>>>> status = RREG32(hub->vm_l2_pro_fault_status);
> >>>>>>>>> WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1);
> >>>>>>>>> - amdgpu_vm_update_fault_cache(adev, entry->pasid,
> >>>>>>>>> addr,
> >>>>>>>>> status,
> >>>>>>>>> + amdgpu_vm_update_fault_cache(adev, entry, addr, status,
> >>>>>>>>> entry->vmid_src ? AMDGPU_MMHUB0(0) :
> >>>>>>>>> AMDGPU_GFXHUB(0));
> >>>>>>>>> }
> >>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> >>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> >>>>>>>>> index 527dc917e049..bcf254856a3e 100644
> >>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> >>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> >>>>>>>>> @@ -121,7 +121,7 @@ static int gmc_v11_0_process_interrupt(struct
> >>>>>>>>> amdgpu_device *adev,
> >>>>>>>>> status = RREG32(hub->vm_l2_pro_fault_status);
> >>>>>>>>> WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1);
> >>>>>>>>> - amdgpu_vm_update_fault_cache(adev, entry->pasid,
> >>>>>>>>> addr,
> >>>>>>>>> status,
> >>>>>>>>> + amdgpu_vm_update_fault_cache(adev, entry, addr, status,
> >>>>>>>>> entry->vmid_src ? AMDGPU_MMHUB0(0) :
> >>>>>>>>> AMDGPU_GFXHUB(0));
> >>>>>>>>> }
> >>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> >>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> >>>>>>>>> index 3da7b6a2b00d..e9517ebbe1fd 100644
> >>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> >>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> >>>>>>>>> @@ -1270,7 +1270,7 @@ static int
> >>>>>>>>> gmc_v7_0_process_interrupt(struct
> >>>>>>>>> amdgpu_device *adev,
> >>>>>>>>> if (!addr && !status)
> >>>>>>>>> return 0;
> >>>>>>>>> - amdgpu_vm_update_fault_cache(adev, entry->pasid,
> >>>>>>>>> + amdgpu_vm_update_fault_cache(adev, entry,
> >>>>>>>>> ((u64)addr) << AMDGPU_GPU_PAGE_SHIFT,
> >>>>>>>>> status, AMDGPU_GFXHUB(0));
> >>>>>>>>> if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST)
> >>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> >>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> >>>>>>>>> index d20e5f20ee31..a271bf832312 100644
> >>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> >>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> >>>>>>>>> @@ -1438,7 +1438,7 @@ static int
> >>>>>>>>> gmc_v8_0_process_interrupt(struct
> >>>>>>>>> amdgpu_device *adev,
> >>>>>>>>> if (!addr && !status)
> >>>>>>>>> return 0;
> >>>>>>>>> - amdgpu_vm_update_fault_cache(adev, entry->pasid,
> >>>>>>>>> + amdgpu_vm_update_fault_cache(adev, entry,
> >>>>>>>>> ((u64)addr) << AMDGPU_GPU_PAGE_SHIFT,
> >>>>>>>>> status, AMDGPU_GFXHUB(0));
> >>>>>>>>> if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST)
> >>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> >>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> >>>>>>>>> index 47b63a4ce68b..dc9fb1fb9540 100644
> >>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> >>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> >>>>>>>>> @@ -666,7 +666,7 @@ static int gmc_v9_0_process_interrupt(struct
> >>>>>>>>> amdgpu_device *adev,
> >>>>>>>>> rw = REG_GET_FIELD(status,
> >>>>>>>>> VM_L2_PROTECTION_FAULT_STATUS, RW);
> >>>>>>>>> WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1);
> >>>>>>>>> - amdgpu_vm_update_fault_cache(adev, entry->pasid, addr,
> >>>>>>>>> status, vmhub);
> >>>>>>>>> + amdgpu_vm_update_fault_cache(adev, entry, addr, status,
> >>>>>>>>> vmhub);
> >>>>>>>>> dev_err(adev->dev,
> >>>>>>>>> "VM_L2_PROTECTION_FAULT_STATUS:0x%08X\n",
> >>>
> >