Re: [PATCH] thermal: sysfs: Perform bounds check when storing thermal states
From: Varad Gautam
Date: Wed Jul 06 2022 - 08:30:40 EST
On Wed, Jul 6, 2022 at 12:21 PM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, Jul 06, 2022 at 12:01:19PM +0200, Varad Gautam wrote:
> > On Wed, Jul 6, 2022 at 11:21 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Wed, Jul 06, 2022 at 04:51:59PM +0800, Zhang Rui wrote:
> > > > On Wed, 2022-07-06 at 09:16 +0200, Varad Gautam wrote:
> > > > > On Wed, Jul 6, 2022 at 8:45 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > > > wrote:
> > > > > >
> > > > > > On Tue, Jul 05, 2022 at 11:02:50PM +0200, Varad Gautam wrote:
> > > > > > > On Tue, Jul 5, 2022 at 6:18 PM Greg KH <
> > > > > > > gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > On Tue, Jul 05, 2022 at 03:00:02PM +0000, Varad Gautam wrote:
> > > > > > > > > Check that a user-provided thermal state is within the
> > > > > > > > > maximum
> > > > > > > > > thermal states supported by a given driver before attempting
> > > > > > > > > to
> > > > > > > > > apply it. This prevents a subsequent OOB access in
> > > > > > > > > thermal_cooling_device_stats_update() while performing
> > > > > > > > > state-transition accounting on drivers that do not have this
> > > > > > > > > check
> > > > > > > > > in their set_cur_state() handle.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Varad Gautam <varadgautam@xxxxxxxxxx>
> > > > > > > > > Cc: stable@xxxxxxxxxxxxxxx
> > > > > > > > > ---
> > > > > > > > > drivers/thermal/thermal_sysfs.c | 12 +++++++++++-
> > > > > > > > > 1 file changed, 11 insertions(+), 1 deletion(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/thermal/thermal_sysfs.c
> > > > > > > > > b/drivers/thermal/thermal_sysfs.c
> > > > > > > > > index 1c4aac8464a7..0c6b0223b133 100644
> > > > > > > > > --- a/drivers/thermal/thermal_sysfs.c
> > > > > > > > > +++ b/drivers/thermal/thermal_sysfs.c
> > > > > > > > > @@ -607,7 +607,7 @@ cur_state_store(struct device *dev,
> > > > > > > > > struct device_attribute *attr,
> > > > > > > > > const char *buf, size_t count)
> > > > > > > > > {
> > > > > > > > > struct thermal_cooling_device *cdev =
> > > > > > > > > to_cooling_device(dev);
> > > > > > > > > - unsigned long state;
> > > > > > > > > + unsigned long state, max_state;
> > > > > > > > > int result;
> > > > > > > > >
> > > > > > > > > if (sscanf(buf, "%ld\n", &state) != 1)
> > > > > > > > > @@ -618,10 +618,20 @@ cur_state_store(struct device *dev,
> > > > > > > > > struct device_attribute *attr,
> > > > > > > > >
> > > > > > > > > mutex_lock(&cdev->lock);
> > > > > > > > >
> > > > > > > > > + result = cdev->ops->get_max_state(cdev, &max_state);
> > > > > > > > > + if (result)
> > > > > > > > > + goto unlock;
> > > > > > > > > +
> > > > > > > > > + if (state > max_state) {
> > > > > > > > > + result = -EINVAL;
> > > > > > > > > + goto unlock;
> > > > > > > > > + }
> > > > > > > > > +
> > > > > > > > > result = cdev->ops->set_cur_state(cdev, state);
> > > > > > > >
> > > > > > > > Why doesn't set_cur_state() check the max state before setting
> > > > > > > > it? Why
> > > > > > > > are the callers forced to always check it before? That feels
> > > > > > > > wrong...
> > > > > > > >
> > > > > > >
> > > > > > > The problem lies in thermal_cooling_device_stats_update(), not
> > > > > > > set_cur_state().
> > > > > > >
> > > > > > > If ->set_cur_state() doesn't error out on invalid state,
> > > > > > > thermal_cooling_device_stats_update() does a:
> > > > > > >
> > > > > > > stats->trans_table[stats->state * stats->max_states +
> > > > > > > new_state]++;
> > > > > > >
> > > > > > > stats->trans_table reserves space depending on max_states, but
> > > > > > > we'd end up
> > > > > > > reading/writing outside it. cur_state_store() can prevent this
> > > > > > > regardless of
> > > > > > > the driver's ->set_cur_state() implementation.
> > > > > >
> > > > > > Why wouldn't cur_state_store() check for an out-of-bounds condition
> > > > > > by
> > > > > > calling get_max_state() and then return an error if it is invalid,
> > > > > > preventing thermal_cooling_device_stats_update() from ever being
> > > > > > called?
> > > > > >
> > > > >
> > > > > That's what this patch does, it adds the out-of-bounds check.
> > > >
> > > > No, I think Greg' question is
> > > > why cdev->ops->set_cur_state() return 0 when setting a cooling state
> > > > that exceeds the maximum cooling state?
> > >
> > > Yes, that is what I am asking, it should not allow a state to be
> > > exceeded.
> > >
> >
> > Indeed, it is upto the driver to return !0 from cdev->ops->set_cur_state()
> > when setting state > max - and it is a driver bug for not doing so.
> >
> > But a buggy driver should not lead to cur_state_store() performing an OOB
> > access.
>
> Agreed, which is why the code that does the access should check before
> it does so. Right now you are relying on the sysfs code to do so, which
> seems very wrong.
>
I see the point.
The OOB access happens in thermal_cooling_device_stats_update().
By placing the check in cur_state_store(), I'm trying to ensure
two things for a buggy driver:
1. The driver's cdev->ops->set_cur_state() doesn't get called if
the new state is > max state. This is to prevent the driver
from storing the new (invalid) state internally. If the driver
didn't realise/reject an invalid state, chances are it will try
to propagate it internally and take actions according to that,
which can have side effects on system stability.
2. The kernel doesn't do an OOB access in
thermal_cooling_device_stats_update().
> thanks,
>
> greg k-h