[AMD Official Use Only - General]
Could the attached patch help?
Evan
-----Original Message-----
From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of ???
Sent: Friday, November 18, 2022 5:25 PM
To: Michel Dänzer <michel.daenzer@xxxxxxxxxxx>; Koenig, Christian
<Christian.Koenig@xxxxxxx>; Deucher, Alexander
<Alexander.Deucher@xxxxxxx>
Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Pan, Xinhui <Xinhui.Pan@xxxxxxx>;
linux-kernel@xxxxxxxxxxxxxxx; dri-devel@xxxxxxxxxxxxxxxxxxxxx
Subject: Re: [PATCH] drm/amdgpu: add mb for si
On 2022/11/18 17:18, Michel Dänzer wrote:
On 11/18/22 09:01, Christian König wrote:
On 18.11.22 at 08:48, Zhenneng Li wrote:
During reboot testing on the arm64 platform, boot may fail, so add this mb in smc.
The error messages are as follows:
[ 6.996395][ 7] [ T295] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block <si_dpm> failed -22
[ 7.006919][ 7] [ T295] amdgpu 0000:04:00.0: amdgpu_device_ip_late_init failed
[ 7.014224][ 7] [ T295] amdgpu 0000:04:00.0: Fatal error during GPU init
Memory barriers are not supposed to be sprinkled around like this; you need to give a detailed explanation of why this is necessary.
Regards,
Christian.
In particular, it makes no sense in this specific place, since it cannot directly affect the values of rst & clk.
Signed-off-by: Zhenneng Li <lizhenneng@xxxxxxxxxx>
---
drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
index 8f994ffa9cd1..c7656f22278d 100644
--- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
+++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
@@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct amdgpu_device *adev)
 	u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL);
 	u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0);
+	mb();
+
 	if (!(rst & RST_REG) && !(clk & CK_DISABLE))
 		return true;
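To illustrate the remark about where the mb() is placed: the check depends only on two values that are already held in local variables, so a barrier issued after both reads cannot change its outcome. Below is a minimal standalone sketch with the register reads replaced by plain arguments. The function name and the bit masks are placeholders chosen for illustration and are not taken from the driver headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit masks for illustration only; the real definitions
 * live in the driver headers and may differ. */
#define EX_RST_REG    (1u << 0)
#define EX_CK_DISABLE (1u << 0)

/* Hypothetical pure form of the check. Both register values have
 * already been read into locals by the time this runs, so nothing
 * executed between the reads and this test can alter the result. */
static bool ex_smc_running_from_values(uint32_t rst, uint32_t clk)
{
	return !(rst & EX_RST_REG) && !(clk & EX_CK_DISABLE);
}
```

If a barrier or delay changes the observed behaviour, the more likely explanation is that it changes when the reads happen relative to the hardware, not how the values are evaluated afterwards.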
I think so too.
But when I run the reboot test on nine desktop machines, one or two of them may report
this error after hundreds or thousands of reboots. At first I used msleep() instead of
mb(); both methods work, but I don't know what the root cause is.
When I use this method with another vendor's Oland card, the error message is
reported again.
What could be the root cause?
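Since both msleep() and mb() appear to hide the failure, one hypothesis worth testing is that the check sometimes runs too early, i.e. a timing issue and not a memory-ordering one. Below is a minimal user-space sketch of a bounded retry that reports how many attempts were needed, which could help confirm or rule this out. Everything in it is hypothetical: the callback stands in for the register check, and the names and retry limit are assumptions, not driver code or a proposed fix.

```c
#include <stdbool.h>

/* Hypothetical callback type standing in for the register check. */
typedef bool (*ex_check_fn)(void *ctx);

/* Retry the check up to max_tries times. Returns the 1-based attempt
 * on which it first succeeded, or 0 if it never succeeded. */
static unsigned int ex_poll_until(ex_check_fn check, void *ctx,
				  unsigned int max_tries)
{
	for (unsigned int i = 1; i <= max_tries; i++) {
		if (check(ctx))
			return i;
		/* In a driver, a short delay would go here. */
	}
	return 0;
}

/* Mock check that starts returning true after ready_after calls. */
struct ex_mock {
	unsigned int calls;
	unsigned int ready_after;
};

static bool ex_mock_check(void *ctx)
{
	struct ex_mock *m = ctx;

	return ++m->calls > m->ready_after;
}
```

Logging the returned attempt count across many reboots would show whether the failing boots are the ones where the first check comes too early.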
Test environment:
graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87
driver: amdgpu
os: Ubuntu 20.04
platform: arm64
kernel: 5.4.18