Re: [PATCH V2 2/3] arch_topology: Avoid use-after-free for scale_freq_data

From: Viresh Kumar
Date: Wed Jun 16 2021 - 23:06:20 EST


On 16-06-21, 13:00, Ionela Voinescu wrote:
> I would agree if it wasn't for the fact that the driver provides the
> set_freq_scale() implementation that ends up using driver internal data
> which could have been freed by the driver's own .exit()/stop_cpu()
> callbacks. The API and the generic implementation has the responsibility
> of making sure of sane access to its own structures.

How do you see timer core or workqueue core then ? Why do they make
sure they don't end up calling user's function once we ask them not to
?

And also scheduler's own util update mechanism, the exact thing
happens there. Users (cpufreq governors) call
cpufreq_add_update_util_hook() and cpufreq_remove_update_util_hook()
to pass their own data structure "struct update_util_data", which has
ia callback within. This is what happens from scheduler's context in
those cases.

static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
{
struct update_util_data *data;

data = rcu_dereference_sched(*per_cpu_ptr(&cpufreq_update_util_data,
cpu_of(rq)));
if (data)
data->func(data, rq_clock(rq), flags);
}

Also note that some kind of synchronisation mechanism is indeed
required between topology_set_scale_freq_source() and
topology_clear_scale_freq_source(), there is no race there today,
sure, but this is an API which is made generic.

> Even if we would want to keep drivers from shooting themselves in the
> foot, I would prefer we postpone it until we have more users for this,
> before we add any synchronisation mechanisms to functionality called
> on the tick.

The rcu mechanism is very much used in the scheduler itself because it
is lightweight. Honestly I don't even see any other way (w.r.t.
locking) users can fix it at their end. They don't know which was the
last tick that used their callback.

> Let's see if there's a less invasive solution to fix CPPC for now, what
> do you think?

For me, this change is required in the API despite how CPPC ends up
using it. Else we are failing at defining the API itself IMHO.

--
viresh