Re: [PATCH v5 3/4] arm64: topology: Support SMT control on ACPI based system

From: Morten Rasmussen
Date: Thu Sep 12 2024 - 07:59:34 EST


On Fri, Sep 06, 2024 at 04:36:30PM +0800, Yicong Yang wrote:
> On 2024/9/6 15:06, Morten Rasmussen wrote:
> > Hi Yicong,
> >
> > On Thu, Sep 05, 2024 at 08:02:20PM +0800, Yicong Yang wrote:
> >> On 2024/9/5 16:34, Pierre Gondois wrote:
> >>> Hello Yicong,
> >>>
> >>> If a platform has CPUs with:
> >>> - 1 thread
> >>> - X (!= 1) threads
> >>> Then I think that the asymmetry is not detected
> >>
> >> Ah ok, I only handle the case where there are several thread numbers except no SMT CPUs in the
> >> system. For this case I was thinking we don't need to handle this since there's only one kind
> >> of SMT core in the system, control should works fine for the SMT CPU clusters and we may not
> >> care about the CPUs with no SMT.
> >>
> >> Below code should handle the case if we initialize the max_smt_thread_num to 0. I also
> >> reword the warning messages to match the fact. For heterogeneous SMT topology we still
> >> support control SMT by on/off toggle but not fully support setting the thread number.
> >>
> >> int max_smt_thread_num = 0;
> >> [...]
> >> /*
> >> * This should be a short loop depending on the number of heterogeneous
> >> * CPU clusters. Typically on a homogeneous system there's only one
> >> * entry in the XArray.
> >> */
> >> xa_for_each(&hetero_cpu, hetero_id, entry) {
> >> /*
> >> * If max_smt_thread_num has been initialized and doesn't match
> >> * the thread number of this entry, then the system has
> >> * heterogeneous SMT topology.
> >> */
> >> if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)
> >> pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
> >
> > What does 'partly supported' mean here?
> >
> > If the SMT control doesn't work as intended for this topology, I don't
> > think it should be enabled for it.
> >
>
> The /sys/devices/system/cpu/smt/control supports 2 kind of controls [1]
> (a) enable/disable SMT entirely by writing on/off
> (b) enable/disable SMT partially by writing a valid thread number (CONFIG_SMT_NUM_THREADS_DYNAMIC)
>
> We assume 3 kind of SMT topology:
> 1. homogeneous SMT topology, all the CPUs support SMT or not
> 2.1 heterogeneous SMT topology, part of CPU clusters have SMT and others not. Clusters support SMT
> have the same SMT thread number
> 2.2 heterogeneous SMT topology, part of CPU clusters have SMT and the thread number may differs
> (e.g. cluster 1 is of SMT 2 and cluster 2 is of SMT 4. not sure there's such a real system)
>
> For any of above SMT topology, control (a) should work as expected. When enabling SMT by writing "on"
> the SMT is disabled for those CPU clusters who have SMT. Same for disabling SMT by writing "off".
> But control (b) may not work for case 2.2 since the SMT thread number is not the same across
> the system.
>
> For this series alone we won't met this since CONFIG_SMT_NUM_THREADS_DYNAMIC is not enabled.
> So control (b) only supports write 1/max_thread which behaves same as write off/on and will
> work as intended for all the 3 cases. As discussed Pierre will add support for
> CONFIG_SMT_NUM_THREADS_DYNAMIC since thunderX2 is a symmetric SMT4 machine and
> CONFIG_SMT_NUM_THREADS_DYNAMIC would be useful. We thought a warning should be useful
> if running on a system of case 2.2.

Thanks for explaining the situation.

So IIUC, for case 2.2 there will be _no_ failures if someone writes a
value different from 1 or max_threads?

The SMT control code can handle that max_threads isn't the correct
number of threads for all cores in the system?