Re: [PATCH v2] atm/fore200e: Fix possible data race in fore200e_open()
From: Gui-Dong Han
Date: Thu Jan 23 2025 - 19:51:41 EST
On Thu, Jan 23, 2025 at 11:12 PM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
>
> On Wed, 22 Jan 2025 02:37:45 +0000 Gui-Dong Han wrote:
> > Protect access to fore200e->available_cell_rate with rate_mtx lock to
> > prevent potential data race.
> >
> > In this case, since the update depends on a prior read, a data race
> > could lead to a wrong fore200e.available_cell_rate value.
> >
> > The field fore200e.available_cell_rate is generally protected by the lock
> > fore200e.rate_mtx when accessed. In all other read and write cases, this
> > field is consistently protected by the lock, except for this case and
> > during initialization.
>
> Please describe the call paths which interact to cause the race.
The fore200e.available_cell_rate is a shared resource used to track
the available bandwidth for allocation.
The functions fore200e_open(), fore200e_close(), and
fore200e_change_qos(), which modify fore200e.available_cell_rate to
reflect bandwidth availability, may interact and cause a race
condition.
fore200e_open(struct atm_vcc *vcc)
{
...
/* Pseudo-CBR bandwidth requested? */
if ((vcc->qos.txtp.traffic_class == ATM_CBR) &&
(vcc->qos.txtp.max_pcr > 0)) {
mutex_lock(&fore200e->rate_mtx);
if (fore200e->available_cell_rate < vcc->qos.txtp.max_pcr) {
mutex_unlock(&fore200e->rate_mtx);
... // Error handling code
return -EAGAIN;
}
/* Reserve bandwidth */
fore200e->available_cell_rate -= vcc->qos.txtp.max_pcr;
mutex_unlock(&fore200e->rate_mtx);
}
...
if (fore200e_activate_vcin(fore200e, 1, vcc, vcc->qos.rxtp.max_sdu) < 0) {
... // Error handling code
fore200e->available_cell_rate += vcc->qos.txtp.max_pcr;
... // Error handling code
return -EINVAL;
}
}
There is further evidence within fore200e_open() itself. The function
correctly takes the lock when subtracting vcc->qos.txtp.max_pcr from
fore200e.available_cell_rate to reserve bandwidth, but later, in the
error handling path, it adds vcc->qos.txtp.max_pcr back without
holding the lock. There is no justification for modifying a shared
resource without the corresponding lock in this case.
Regards,
Han