Re: [PATCH V4 4/5] soc: qcom: Introduce SCMI based Memlat (Memory Latency) governor

From: Dmitry Baryshkov
Date: Tue Dec 17 2024 - 07:10:50 EST


On Tue, Dec 17, 2024 at 04:35:15PM +0530, Sibi Sankar wrote:
>
>
> On 12/17/24 16:16, Dmitry Baryshkov wrote:
> > On Tue, Dec 17, 2024 at 03:46:24PM +0530, Sibi Sankar wrote:
> > >
> > >
> > > On 12/5/24 17:00, Dmitry Baryshkov wrote:
> > > > On Thu, 5 Dec 2024 at 12:53, Sibi Sankar <quic_sibis@xxxxxxxxxxx> wrote:
> > > > >
> > > > >
> > > > >
> > > > > On 11/14/24 18:02, Dmitry Baryshkov wrote:
> > > > > > On Thu, Nov 14, 2024 at 09:43:53AM +0530, Sibi Sankar wrote:
> > > > > > >
> > > > > > >
> > > > > > > On 10/26/24 23:46, Dmitry Baryshkov wrote:
> > > > > > > > On Tue, Oct 22, 2024 at 01:48:25PM +0530, Sibi Sankar wrote:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On 10/7/24 23:27, Dmitry Baryshkov wrote:
> > > > > > > > > > On Mon, Oct 07, 2024 at 11:40:22AM GMT, Sibi Sankar wrote:
> > > > > >
> > > > > > > > > >
> > > > > > > > > > > +};
> > > > > > > > > > > +
> > > > > > > > > > > +struct map_param_msg {
> > > > > > > > > > > + u32 hw_type;
> > > > > > > > > > > + u32 mon_idx;
> > > > > > > > > > > + u32 nr_rows;
> > > > > > > > > > > + struct map_table tbl[MAX_MAP_ENTRIES];
> > > > > > > > > > > +} __packed;
> > > > > > > > > > > +
> > > > > > > > > > > +struct node_msg {
> > > > > > > > > > > + u32 cpumask;
> > > > > > > > > > > + u32 hw_type;
> > > > > > > > > > > + u32 mon_type;
> > > > > > > > > > > + u32 mon_idx;
> > > > > > > > > > > + char mon_name[MAX_NAME_LEN];
> > > > > > > > > > > +};
> > > > > > > > > > > +
> > > > > > > > > > > +struct scalar_param_msg {
> > > > > > > > > > > + u32 hw_type;
> > > > > > > > > > > + u32 mon_idx;
> > > > > > > > > > > + u32 val;
> > > > > > > > > > > +};
> > > > > > > > > > > +
> > > > > > > > > > > +enum common_ev_idx {
> > > > > > > > > > > + INST_IDX,
> > > > > > > > > > > + CYC_IDX,
> > > > > > > > > > > + CONST_CYC_IDX,
> > > > > > > > > > > + FE_STALL_IDX,
> > > > > > > > > > > + BE_STALL_IDX,
> > > > > > > > > > > + NUM_COMMON_EVS
> > > > > > > > > > > +};
> > > > > > > > > > > +
> > > > > > > > > > > +enum grp_ev_idx {
> > > > > > > > > > > + MISS_IDX,
> > > > > > > > > > > + WB_IDX,
> > > > > > > > > > > + ACC_IDX,
> > > > > > > > > > > + NUM_GRP_EVS
> > > > > > > > > > > +};
> > > > > > > > > > > +
> > > > > > > > > > > +#define EV_CPU_CYCLES 0
> > > > > > > > > > > +#define EV_INST_RETIRED 2
> > > > > > > > > > > +#define EV_L2_D_RFILL 5
> > > > > > > > > > > +
> > > > > > > > > > > +struct ev_map_msg {
> > > > > > > > > > > + u32 num_evs;
> > > > > > > > > > > + u32 hw_type;
> > > > > > > > > > > + u32 cid[NUM_COMMON_EVS];
> > > > > > > > > > > +};
> > > > > > > > > > > +
> > > > > > > > > > > +struct cpufreq_memfreq_map {
> > > > > > > > > > > + unsigned int cpufreq_mhz;
> > > > > > > > > > > + unsigned int memfreq_khz;
> > > > > > > > > > > +};
> > > > > > > > > > > +
> > > > > > > > > > > +struct scmi_monitor_info {
> > > > > > > > > > > + struct cpufreq_memfreq_map *freq_map;
> > > > > > > > > > > + char mon_name[MAX_NAME_LEN];
> > > > > > > > > > > + u32 mon_idx;
> > > > > > > > > > > + u32 mon_type;
> > > > > > > > > > > + u32 ipm_ceil;
> > > > > > > > > > > + u32 mask;
> > > > > > > > > > > + u32 freq_map_len;
> > > > > > > > > > > +};
> > > > > > > > > > > +
> > > > > > > > > > > +struct scmi_memory_info {
> > > > > > > > > > > + struct scmi_monitor_info *monitor[MAX_MONITOR_CNT];
> > > > > > > > > > > + u32 hw_type;
> > > > > > > > > > > + int monitor_cnt;
> > > > > > > > > > > + u32 min_freq;
> > > > > > > > > > > + u32 max_freq;
> > > > > > > > > > > +};
> > > > > > > > > > > +
> > > > > > > > > > > +struct scmi_memlat_info {
> > > > > > > > > > > + struct scmi_protocol_handle *ph;
> > > > > > > > > > > + const struct qcom_generic_ext_ops *ops;
> > > > > > > > > > > + struct scmi_memory_info *memory[MAX_MEMORY_TYPES];
> > > > > > > > > > > + u32 cluster_info[NR_CPUS];
> > > > > > > > > > > + int memory_cnt;
> > > > > > > > > > > +};
> > > > > > > > > > > +
> > > > > > > > > > > +static int populate_cluster_info(u32 *cluster_info)
> > > > > > > > > > > +{
> > > > > > > > > > > + char name[MAX_NAME_LEN];
> > > > > > > > > > > + int i = 0;
> > > > > > > > > > > +
> > > > > > > > > > > + struct device_node *cn __free(device_node) = of_find_node_by_path("/cpus");
> > > > > > > > > > > + if (!cn)
> > > > > > > > > > > + return -ENODEV;
> > > > > > > > > > > +
> > > > > > > > > > > + struct device_node *map __free(device_node) = of_get_child_by_name(cn, "cpu-map");
> > > > > > > > > > > + if (!map)
> > > > > > > > > > > + return -ENODEV;
> > > > > > > > > > > +
> > > > > > > > > > > + do {
> > > > > > > > > > > + snprintf(name, sizeof(name), "cluster%d", i);
> > > > > > > > > > > + struct device_node *c __free(device_node) = of_get_child_by_name(map, name);
> > > > > > > > > > > + if (!c)
> > > > > > > > > > > + break;
> > > > > > > > > > > +
> > > > > > > > > > > + *(cluster_info + i) = of_get_child_count(c);
> > > > > > > > > > > + i++;
> > > > > > > > > > > + } while (1);
> > > > > > > > > >
> > > > > > > > > > Can you use existing API from drivers/base/arch_topology.c? If not, can
> > > > > > > > > > it be extended to support your usecase?
> > > > > > > > >
> > > > > > > > > ack. But I'm pretty sure it's going to take a while for reaching such
> > > > > > > > > an agreement so I'll drop this feature during the next re-spin.
> > > > > > > >
> > > > > > > > Why? What kind of API do you actually need? The arch_topology.c simply
> > > > > > > > exports a table of struct cpu_topology. Is it somehow different from
> > > > > > > > what you are parsing manually?
> > > > > > >
> > > > > > > yup, we had to figure out the physical id of the cpu
> > > > > > > since cpus can be disabled by the bootloader using
> > > > > > > status = "failed" property and we have to pass this
> > > > > > > onto the cpucp memlat algorithm.
> > > > > >
> > > > > > Isn't it equal to the index in the cpu_topology table?
> > > > >
> > > > > from what I see cpu_topology indexes remain unpopulated
> > > > > for cpus that are disabled since get_cpu_for_node
> > > > > ignores those?
> > > >
> > > > Why do you need cpu_topology for disabled aka non-existing CPU devices?
> > >
> > > sorry, was out sick and couldn't get back earlier. We need to know
> > > which cpus are disabled to pass on the correct mask of enabled cpus
> > > to the SCP.
> >
> > Yes. So isn't it enough to know only the enabled CPUs?
>
> yes just knowing the physical index of the enabled cpus
> should be enough.

Then the existing cpu_topology is enough for your case, isn't it?
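
To make the point concrete, here is a rough userspace sketch of deriving the
enabled-CPU mask purely from the entries that are populated. It does not use
the real arch_topology API: struct example_cpu and example_enabled_mask() are
hypothetical names, and the assumption is that each populated entry carries
the physical index of an enabled CPU while disabled CPUs simply have no entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one populated (i.e. enabled) topology entry. */
struct example_cpu {
	unsigned int phys_id;	/* physical index of the enabled CPU */
};

/*
 * Build a 32-bit mask with one bit set per enabled CPU.
 * Disabled CPUs have no entry in the table, so their bits stay clear.
 */
static uint32_t example_enabled_mask(const struct example_cpu *cpus, size_t n)
{
	uint32_t mask = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* The mask is a u32, so only physical ids 0..31 fit. */
		assert(cpus[i].phys_id < 32);
		mask |= UINT32_C(1) << cpus[i].phys_id;
	}

	return mask;
}
```

With entries for physical CPUs 0, 1 and 3 (CPU 2 disabled and therefore
absent), the resulting mask has bits 0, 1 and 3 set and bit 2 clear.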

--
With best wishes
Dmitry