Re: [PATCH v4 3/5] memory: tegra186-emc: Support non-bpmp icc scaling

From: Aaron Kling

Date: Tue Nov 11 2025 - 18:17:53 EST


On Tue, Nov 11, 2025 at 3:29 PM Jon Hunter <jonathanh@xxxxxxxxxx> wrote:
>
>
> On 11/11/2025 17:04, Aaron Kling wrote:
>
> ...
>
> > My setup uses the boot stack from L4T r32.7.6, though cboot is source
> > built and has had changes over time to support newer Android versions.
> > There shouldn't be anything there that would affect emc clock, though.
> >
> > I'm seeing the emc clock stay at the boot value, namely 1600MHz. Per
> > both debugfs clk/emc/clk_rate and bpmp/debug/clk/emc/rate. I don't
> > even see 250MHz as an option. Debugfs emc/available_rates lists 204MHz
> > as the closest entry.
> >
> > I'm trying to think what could cause a drop in the selected clock
> > rate. This patch should only dynamically change the rate if the opp
> > tables exist, enabling the cpufreq based scaling via icc. But those
> > tables don't exist on linux-next right now. My test ramdisk does
> > nothing except set up sysfs/procfs/etc just enough to run a busybox
> > shell for debugging. Do the Nvidia regression testing boot scripts do
> > anything to sysfs or debugfs that would affect emc?
>
> So this is definitely coming from ICC. On boot I see a request for
> 250MHz coming from the PCIe driver ...
>
> [ 13.861227] tegra186_emc_icc_set_bw-356: rate 250000000
> [ 13.861350] CPU: 1 UID: 0 PID: 68 Comm: kworker/u32:1 Not tainted 6.18.0-rc4-next-20251110-00001-gfc12493c80fb-dirty #9 PREEMPT
> [ 13.861362] Hardware name: NVIDIA Jetson AGX Xavier Developer Kit (DT)
> [ 13.861370] Workqueue: events_unbound deferred_probe_work_func
> [ 13.861388] Call trace:
> [ 13.861393] show_stack+0x18/0x24 (C)
> [ 13.861407] dump_stack_lvl+0x74/0x8c
> [ 13.861419] dump_stack+0x18/0x24
> [ 13.861426] tegra186_emc_icc_set_bw+0xc8/0x14c
> [ 13.861438] apply_constraints+0x70/0xb0
> [ 13.861451] icc_set_bw+0x88/0x128
> [ 13.861461] tegra_pcie_icc_set+0x7c/0x10c [pcie_tegra194]
> [ 13.861499] tegra_pcie_dw_start_link+0x178/0x2b0 [pcie_tegra194]
> [ 13.861510] dw_pcie_host_init+0x664/0x6e0
> [ 13.861523] tegra_pcie_dw_probe+0x6d4/0xbfc [pcie_tegra194]
> [ 13.861534] platform_probe+0x5c/0x98
> [ 13.861547] really_probe+0xbc/0x2a8
> [ 13.861555] __driver_probe_device+0x78/0x12c
> [ 13.861563] driver_probe_device+0x3c/0x15c
> [ 13.861572] __device_attach_driver+0xb8/0x134
> [ 13.861580] bus_for_each_drv+0x84/0xe0
> [ 13.861588] __device_attach+0x9c/0x188
> [ 13.861596] device_initial_probe+0x14/0x20
> [ 13.861610] bus_probe_device+0xac/0xb0
> [ 13.861619] deferred_probe_work_func+0x88/0xc0
> [ 13.861627] process_one_work+0x148/0x28c
> [ 13.861640] worker_thread+0x2d0/0x3d8
> [ 13.861648] kthread+0x128/0x200
> [ 13.861659] ret_from_fork+0x10/0x20
>
> The actual rate that is set is 408MHz if I read the rate after
> it is set ...
>
> [ 13.912099] tegra186_emc_icc_set_bw-362: rate 408000000
>
> This is a simple boot test and so nothing we are doing via
> debugfs/sysfs to influence this.

Alright, I think I've got the picture of what's going on now. The
standard arm64 defconfig builds the t194 pcie driver as a module, and
the simple busybox ramdisk I use for mainline regression testing
doesn't load any modules. If I set the pcie driver to built-in, I can
reproduce the issue. I don't see the issue in my normal use case,
because I have the dt changes applied as well.

So it appears that the pcie driver submits an icc bandwidth request.
Without cpufreq submitting bandwidth as well, the emc driver ends up
with a very low bandwidth number and thus sets a very low emc freq.
The question becomes... what to do about it? If the related dt changes
were submitted to linux-next, everything should fall into place. I'm
not sure where this falls on the severity scale, since it doesn't
outright break boot or prevent operation.
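
To make the failure mode concrete, here is a rough standalone sketch of
what I think is happening. It is not the driver code. The rate table
only contains the three rates mentioned in this thread (the real
available_rates list is longer), the helper names are hypothetical, and
taking the max of the client requests is a simplification of how icc
aggregates bandwidth.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: partial table, only the rates mentioned in this thread. */
static const uint64_t emc_rates_hz[] = {
	204000000ULL, 408000000ULL, 1600000000ULL,
};
#define NUM_EMC_RATES (sizeof(emc_rates_hz) / sizeof(emc_rates_hz[0]))

/* Hypothetical helper: lowest listed rate that satisfies the request,
 * falling back to the highest listed rate. */
static uint64_t emc_round_rate(uint64_t requested_hz)
{
	for (size_t i = 0; i < NUM_EMC_RATES; i++)
		if (emc_rates_hz[i] >= requested_hz)
			return emc_rates_hz[i];
	return emc_rates_hz[NUM_EMC_RATES - 1];
}

/* Hypothetical helper: simplified aggregation, the largest request
 * from any client wins, then it is rounded to a listed rate. */
static uint64_t emc_pick_rate(const uint64_t *requests_hz, size_t n)
{
	uint64_t needed = 0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		if (requests_hz[i] > needed)
			needed = requests_hz[i];
	return emc_round_rate(needed);
}
```

With only the 250MHz pcie request from Jon's log, emc_pick_rate()
returns 408000000, which matches the rate he read back. Adding a
second, larger request (standing in for cpufreq; the value would be
whatever the opp tables ask for) raises the result accordingly.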

Aaron