Re: [PATCH 1/3] iio: light: Add driver for ap3216c

From: Sven Van Asbroeck
Date: Tue Feb 12 2019 - 23:40:29 EST


On Tue, Feb 12, 2019 at 3:47 PM Jonathan Cameron <jic23@xxxxxxxxxx> wrote:
>
>
> Good question on whether it is guaranteed to read in increasing register
> order (I didn't actually check the addresses are in increasing order
> but assume they are or you would have pointed that out ;)
>
> That strikes me as behaviour that should probably be documented as long
> as it is true currently.
>

Looking at the datasheet closely... this chip doesn't seem to support
bulk/raw accesses. At least, it's not documented. So maybe we should
steer clear of that, and tell the regmap core, via .use_single_read
and .use_single_write in regmap_config.

So once these flags are set, we could call regmap_bulk_read() on the
LO/HI combo registers, and it would probably work. But do we really
want to? The LO register, when read, cause side-effects
in the corresponding HI register. That's weird / unexpected for developers.
Maybe it makes sense to make this explicit, not implicit. In addition,
bulk_read() behaviour changes in the future may break the register
reads ?

> >
>
> If we clear just the right one, (which I think we can do from
> the datasheet
> "1: Software clear after writing 1 into address 0x01 each bit#"
> However the code isn't writing a 3 in that clear, so I'm not
> sure if the datasheet is correct or not...
>
> and it is a level interrupt (which I think it is?) then we would
> be safe against this miss.
>
> If either we can only globally clear or it's not a level interrupt
> there isn't much we can do to avoid a miss, it's just a bad hardware
> design.
>

I think we're in luck, and this would work ! Note 1 on page 12 of the
datasheet:

"Note1. The INT pin will be set low and set INT status bit when ALS or
PS or (ALS+PS) interrupt event occurrence.
User can clear INT bit and individual status bits when reading the
register 0xD(ALS) , 0xF(PS) and 0xD+0xF(ALS+PS) respectively."

>
>

I'm sorry to wear out your patience, but I think there are more
synchronization issues lurking here.

The iio_push_event(THRESH) tells interested userspace processes:
"hey, maybe you want to check if a threshold is crossed", right?
Is my understanding correct?

If so, I think it's possible that a threshold is crossed, without
userspace knowing. To see why, let's look at the handler again:

static irqreturn_t ap3216c_event_handler(int irq, void *p)
{
/* imagine ALS interrupt came in */
regmap_read(data->regmap, AP3216C_INT_STATUS, &status);
if (status & als) iio_push_event(LIGHT);

/* imagine schedule happens here */
msleep(...);
/* while we are not running, userspace reacts to the event,
* reads the new ALS value.
* Next, imagine a _new_ ALS interrupt comes in, while we are
* still sleeping. the irq does not get re-scheduled, as it's
* still running !
* Next, we proceed:
*/
ap3216c_clear_als(data);
/* at this point there has been a new ALS interrupt without
* userspace knowing about it.
*/
}

The _chance_ of this happening is very low, as ALS/PS events are quite
widely spaced. But we're not an RTOS. Question is: do we want to take
the risk? Idk, perhaps it's ok to trade simplicity for the low chance of
missing a threshold.

+static int ap3216c_write_event_config(struct iio_dev *indio_dev,
+ const struct iio_chan_spec *chan,
+ enum iio_event_type type,
+ enum iio_event_direction dir, int state)
+{
+ struct ap3216c_data *data = iio_priv(indio_dev);
+
+ switch (chan->type) {
+ case IIO_LIGHT:
+ data->als_thresh_en = state;
+ return 0;
+
+ case IIO_PROXIMITY:
+ data->prox_thresh_en = state;
+ return 0;
+
+ default:
+ return -EINVAL;
+ }

I think this may not work as intended. One thread (userspace) writes
a variable, another thread (threaded irq handler) checks it. but there
is no explicit or implicit memory barrier. So when userspace activates
thresholding, it may take a long time for the handler to 'see' it !

One way to guarantee that the irq handler 'sees' this immediately
is to grab a mutex, which issues an implicit memory barrier.

>
>

Allow me to suggest a simple, straightforward way to make all the above
issues go away (if indeed they are issues) :

First, disable fine-grained regmap locking, by setting .disable_locking
in regmap_config.

Next, read ALS and PS _exclusively_ in the irq handler, guard it with
a mutex:

static irqreturn_t ap3216c_event_handler(int irq, void *p)
{
int ret = IRQ_NONE;

mutex_lock(&data->m);
regmap_read(data->regmap, AP3216C_INT_STATUS, &status);
if (status & als) {
/* mutex m ensures LO and HI are read non-interleaved */
regmap_read(data->regmap, ALS_LO, &als_lo);
regmap_read(data->regmap, ALS_HI, &als_hi);
/* mutex m's memory barrier lets userspace 'see' data->als change */
data->als = als_lo | (als_hi<<8);
/* mutex m's memory barrier lets us 'see' do_thresholding change */
if (data->do_thresholding)
iio_push_event();
/* mutex m prevents userspace from reading the cached value
* in between iio_push_event() and als_clear(), which means
* userspace never 'misses' interrupts.
*/
als_clear(data);
ret = IRQ_HANDLED;
}
( ... ps case left out for brevity ... )
mutex_unlock(&data->m);

return ret;
}

Now, ap3216c_read_raw becomes:

ap3216c_read_raw()
{
mutex_lock(&data->m);
switch (mask) {
IIO_CHAN_INFO_PROCESSED:
switch (chan->type) {
case IIO_LIGHT:
*val = some_function_of(data->als);
}
}
mutex_unlock(&data->m);
}

and ap3216c_write_event_config becomes:

ap3216c_write_event_config()
{
mutex_lock(&data->m);
switch (chan->type) {
case IIO_LIGHT:
data->als_thresh_en = state;
goto out;
}
out:
mutex_unlock(&data->m);
}

In fact, we'd need mutex_lock around any function that touches the
regmap, cached data, or threshold enable/disable flags.