Re: [PATCH v3 1/1] coresight: prevent deactivate active config while enabling the config

From: Mike Leach
Date: Mon Mar 24 2025 - 14:21:59 EST


Hi,

On Mon, 24 Mar 2025 at 13:59, Yeoreum Yun <yeoreum.yun@xxxxxxx> wrote:
>
> Hi Mike,
>
> Please ignore my foremer mail.. and please see my comments for your
> suggestion.
>
> > Hi
> >
> > On Fri, 14 Mar 2025 at 15:25, Yeo Reum Yun <YeoReum.Yun@xxxxxxx> wrote:
> > >
> > > Hi, Mike.
> > >
> > > > > static void cscfg_remove_owned_csdev_features(struct coresight_device *csdev, void *load_owner)
> > > > > @@ -867,6 +870,28 @@ void cscfg_csdev_reset_feats(struct coresight_device *csdev)
> > > > > }
> > > > > EXPORT_SYMBOL_GPL(cscfg_csdev_reset_feats);
> > > > >
> > > > > +static bool cscfg_config_desc_get(struct cscfg_config_desc *config_desc, bool enable)
> > > > > +{
> > > > > + if (enable)
> > > > > + return atomic_inc_not_zero(&config_desc->active_cnt);
> > > > > +
> > > >
> > > > Not sure why we have an "enable" parameter here - it completely
> > > > changes the meaning of the function - with no comment at the start.
> > >
> > > Sorry. But what I intended is to distinguish
> > > - activation of config
> > > - enable of activated config.
> > > Because, current coresight doesn't grab the module reference on enable of activate config,
> > > But It grabs that reference only in activation.
> > > That's why I used to "enable" parameter to distinguish this
> > > while I integrate with module_owner count.
> > >
> > > > > list_for_each_entry(config_desc, &cscfg_mgr->config_desc_list, item) {
> > > > > if ((unsigned long)config_desc->event_ea->var == cfg_hash) {
> > > > > - atomic_dec(&config_desc->active_cnt);
> > > > > atomic_dec(&cscfg_mgr->sys_active_cnt);
> > > > > - cscfg_owner_put(config_desc->load_owner);
> > > > > + cscfg_config_desc_put(config_desc);
> > > > > dev_dbg(cscfg_device(), "Deactivate config %s.\n", config_desc->name);
> > > > > break;
> > > > > }
> > > > > @@ -1047,7 +1066,7 @@ int cscfg_csdev_enable_active_config(struct coresight_device *csdev,
> > > > > unsigned long cfg_hash, int preset)
> > > > > {
> > > > > struct cscfg_config_csdev *config_csdev_active = NULL, *config_csdev_item;
> > > > > - const struct cscfg_config_desc *config_desc;
> > > > > + struct cscfg_config_desc *config_desc;
> > > > > unsigned long flags;
> > > > > int err = 0;
> > > > >
> > > > > @@ -1062,8 +1081,8 @@ int cscfg_csdev_enable_active_config(struct coresight_device *csdev,
> > > > > raw_spin_lock_irqsave(&csdev->cscfg_csdev_lock, flags);
> > > > > list_for_each_entry(config_csdev_item, &csdev->config_csdev_list, node) {
> > > > > config_desc = config_csdev_item->config_desc;
> > > > > - if ((atomic_read(&config_desc->active_cnt)) &&
> > > > > - ((unsigned long)config_desc->event_ea->var == cfg_hash)) {
> > > > > + if (((unsigned long)config_desc->event_ea->var == cfg_hash) &&
> > > > > + cscfg_config_desc_get(config_desc, true)) {
> > > > >
> > > > This obfuscates the logic of the comparisons without good reason. With
> > > > the true parameter, the function does no "get" operation but just
> > > > replicates the logic being replaced - checking the active_cnt is
> > > > non-zero.
> > > >
> > > > Restore this to the original logic to make it readable again
> > >
> > > It's not a replicates of comparsion logic, but if true,
> >
> > sorry - missed that point .
> >
> > > It get the reference of active_cnt but not get module reference.
> > > The fundemental fault in the UAF becase of just "atomic_read()"
> > > so, it should hold reference in here.
> > >
> > > So, If you think the cscfg_config_desc_get()'s parameter makes obfuscation,
> > > I think there're two way to modfiy.
> > >
> > > 1. cscfg_config_desc_get()/put() always grab/drop the module count.
> > > 2. remove cscfg_config_desc_get()/put() but just use atomic_XXX(&active_cnt) only
> > > with cscfg_owner_get()/put()
> > >
> > > Any thougt?
> > >
> > > Thanks!
> > >
> > >
> >
> > The get and put functions are asymmetrical w.r.t. owner.
> >
> > The put will put owner if active count decrements to 0,
> > The get if not on enable path will put owner unconditionally.
> >
> > This means that the caller has to work out the correct input conditions.
> >
> > Might be better if:-
> >
> > get_desc()
> > {
> > if (! desc->refcnt) {
> > if (!get_owner())
> > return false;
> > }
> > desc->refcnt++;
> > return true;
> > }
>
> I think This makes another problem when
> it races with _cscfg_deactivate_config().
>
> CPU0 CPU1
> (sysfs enable) load module
> cscfg_load_config_sets()
> activate config. // sysfs
> (sys_active_cnt == 1)
>
> // sysfs
> _cscfg_deactivate_config()
> (sys_active_cnt == 0)
> (config->active_cnt = 0)
> ...
> cscfg_csdev_enable_active_config()
> lock(csdev->cscfg_csdev_lock)
> // here get module reference??
> // even sys_active_cnt == 0 and
> // config->active_cnt == 1.
> get_desc()
> unlock(csdev->cscfg_csdev_lock)
>
>
> // access to config_desc which freed
> // while unloading module.
> cfs_csdev_enable_config
>
>
> Because, the desc->refcnt meaning of zero is different from the context.
> - while activate . it should get module reference if zero.
> - while enable active configuration, if zero, it should be failed.
>
> that means to prevent this race, the core key point is:
> when config->active_cnt == 0, it should be failed in cscfg_csdev_enable_active_config()
>

This is not a failure case, it simple means that this config should
not be activated for this device.
It is possible for a configuration to be active on the system, without
it being active for a particular coresight device.

Having a get/put interface for the config descriptor - which prevents
the config from being unloaded is fine, the key logic here is that we
are searching a list of possible enabled configurations for this
device and taking the necessary action to enable it if we find one -
and there can only ever be a single configuration enabled for a trace
session.

Therefore when the list of loaded configs for a device is > 1, then
all but one is allowed to be active - so the search code will validly
find instances where config->active_cnt == 0.

My objection to the original interface was not the get/put operations
to protect the module from unload, but the fact that the logic
deciding if a config needed to be enabled on the device was hidden
inside the get() operation.
My suggestion is to restore the logic that decides if there is a
config to enable on the device be clear in the enable function itself,
then use get/put as appropriate to prevent module unload.

Regards

Mike


> Because, according to context the handling the zero reference value is
> different, It seems,to integrate the get_desc() interface to handle
> above case together without extra arguments (in case of here is
> "enable").
>
> If this interface is really ugly and unhappy to you,
> I think It should remove get_desc()/put_desc().
> although we can add new interface for cscfg_config_desc_get() for enable
> path. but it makes people more confused.
>
> So my suggestion is:
> - sustain this patch's contents
> - or remove get_desc()/put_desc() interface but use
> atomic_inc_zero(&config_desc->active_cnt) directly in
> cscfg_csdev_enable_active_config()
>
> Any thougt?
>
> Thanks.



--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK