Re: [PATCH] iio/chemical/bme680: Fix SPI read interface

From: Jonathan Cameron
Date: Wed Feb 20 2019 - 06:33:44 EST


On Mon, 18 Feb 2019 07:03:10 +0000
Mike Looijmans <mike.looijmans@xxxxxxxx> wrote:

> On 16-02-19 12:27, Himanshu Jha wrote:
> > Hello Mike,
> >
> > On Fri, Feb 15, 2019 at 02:47:55PM +0100, Mike Looijmans wrote:
> >> The SPI interface implementation was completely broken.
> >
> > My apologies!
> >
> > SPI interface caused me a lot of trouble in my project timeline. I tried
> > many different ways to try playing with acpi dsl, printks etc. One
> > of the problem was also that we don't have any facility such `i2cdetect`
> > or instantiate a `new_device` from userspace.
> >
> > Are there any methods that one should try to debug SPIs ?
> > -- excluding the use of Oscilloscope
>
> Just persisting usually makes it work... Once you're able to read a single
> register, the rest is peanuts...
Agreed on that assessment. It's the first read that sometimes takes ours
of pulling your hair out.

A few comments inline, on the discussion rather than the patch.
Nice work Mike. We'll definitely want to push this to stable. I'll do
that for v2 if it looks good. Nothing much will go upstream until
after the merge window closes now anyway which will be about 3 weeks
time.

Jonathan

>
> > With all these problems, I referred other drivers from the Bosch family,
> > particularly BME280 which has exactly the same behavior and decided to
> > do the same. I believed that it would also work cleanly for BME680 and
> > now I see the page selection fuzz was the main problem.
> >
> > I tried to test it on Beaglebone but everytime something new pops-up
> > with beagle setup such as they moved from capemgr -> uboot overlays.
> > Lack of latest documentation on "What new in this release?" troubled me
> > + most of blogs on setup are outdated.
> >
> >> When using the SPI interface, there are only 7 address bits, the upper bit
> >> is controlled by a page select register.
> >
> > Isn't it that upper bit is controlled R/W ?
>
> I meant the upper 'address' bit here. Bit 7 of the first byte indeed controls
> read/write like on many register-controlled SPI devices (see regmap's standard
> SPI implementation).
>
> > And one question that I have in mind:
> >
> > Initially after poweron we are in page 0, and hence we can't access
> > Page 1 which actually has the `spi_mem_page` bit. So, how can we switch
> > to Page 1 eventually since Page 1 covers all the relevant data/config
> > registers ?
>
> My approach was to just assume that the register would work in either page,
> and that turned out to be correct. Most 'paging' chips work that way. Though
> they typically use register 0 or 255 for that.
>
> > I assume if you're in Page 0, then you can't access Page 1 registers or
> > simply put those registers are not visible.
> >
> > Some inconsistency in datasheet:
> >
> > Page 24/50:
> >
> > "After power-on, spi_mem_page is in its reset state and page 0 (0x80 to 0xFF)
> > will be active. Page 1 (0x00 to 0x7F) will be
> > active on setting spi_mem_page to 1."
> >
> > OTOH,
> >
> > Page 26/50:
> >
> > "5.3.1.4 SPI memory map page selection â spi_mem_page
> >
> > In SPI mode complete memory page is accessed using page 0 & page 1.
> > Register spi_mem_page is used for page selection.
> > After power-on, spi_mem_page is in its reset state and
> > page 0(0x00 to 0x7F) will be active."
> >
> >
> > That's contradicting!
>
> They fell into their own trap of calling the high range "page 0" I guess.
>
> In this context, it's also meaningless. Upon probe, you cannot really be sure
> which page is active anyway, and the chip doesn't have a reset GPIO to enforce
> a known startup state. So "assume nothing" is the way forward.
>
>
> > page 0 (0x80 to 0xFF) Vs page 0(0x00 to 0x7F) ?
> >
> >> The core needs access to both
> >> ranges, so implement register read/write for both regions. The regmap
> >> paging functionality didn't agree with a register that needs to be read
> >> and modified, so I implemented a custom paging algorithm.
> >>
> >> This fixes that the device wouldn't even probe in SPI mode.
> >
> > Most importantly: Does it now work for you ?
>
> Yes, with these changes the chip works (I have it connected to SPI only).
>
> > As I tried you patch and it is not working for me in my QEMU setup.
> >
> > /sys/bus/iio/devices # lsmod
> > Module Size Used by Tainted: G
> > bme680_spi 16384 0
> > bme680_core 24576 1 bme680_spi
> > /sys/bus/iio/devices # ls
> >
> > SPI didn't probe and no logs for any errors/warnings as well.
>
> Probably it isn't in the devicetree then. Mine looks like this:
>
> +&spi1 {
> + status = "okay";
> + num-cs = <2>;
> +
> + /* Environmental Sensor */
> + bme680_env: bme680_env@0 {
> + compatible = "bme680";
> + reg = <0>;
> + spi-max-frequency = <10000000>;
> + };
>
>
> > Maybe I'll give it try again later on a different board or a native
> > build+boot on the current board.
> >
> > Anyway, does anyone has any idea/clues for writing a dt-overlay for
> > am335x-boneblack-wireless.dts to add just a compatible property to
> > probe bme680_spi driver ?
> >
> >> The SPI interface then isn't different from I2C, merged them into the core,
> >> and the I2C/SPI named registers are no longer needed.
> >>
> >> Implemented register value caching for the registers to reduce the I2C/SPI
> >> data transfers considerably.
> >>
> >> The calibration set reads as all zeroes until some undefined point in time,
> >> and I couldn't determine what makes it valid. The datasheet mentions these
> >> registers but does not provide any hints on when they become valid, and they
> >> aren't even enumerated in the memory map. So check the calibration and
> >> retry reading it from the device after each measurement until it provides
> >> something valid.
> >
> > Hmm. One point I would point out here is that: if calib constants are
> > invalid then without any doubt readings would wrong/disastrous. But I
> > never saw this issue when testing I2C interface.
>
> It might be timing related. The SPI bus runs at 10MHz, and my platform is a
> ZynqMP.
>
> My initial idea was to just read everything from page 0 (only chip-id, reset,
> and calibration data is there) and then switch to page 1 and remain there. But
> the calibration data always read zero at probe time. Reading the registers
> again at a later time made them valid.
>
> > As for missing info from datasheet, I mean Bosch doesn't even care to
> > reply or involve in the discussions. This driver was also ported to
> > Zephyr project and Pull request lying stale since November. In the end,
> > developers from Nordic Semiconductor who majorly work on BLE and some
> > other stuff took over their work and are now trying to complete it.
>
> Yeah, there are only few manufacturers who care about drivers. Most seem to
> think that a bare-metal app or a dedicated 3.1 kernel is all the world will
> ever need.
>
> Most of our hardware developers have learned that software support is often
> more important than price or features.

Absolutely!

>
> > Also, look around BSEC(https://www.bosch-sensortec.com/bst/products/all_products/bsecreveals)
> > for more info that you may want to know. And go ahead reverse engineering[1]
> > their code to know more OR work around and rely on the heuristics[2] that you
> > observe while testing.
> >
> > [1] I saw a code snippet in BSEC which says after measurement is completed in
> > FORCED mode and data is ready to read, we need to wait for current
> > mode go in SLEEP mode before actually reading the data.
> >
> > [2] One fellow developer few moths ago explained to play with those calib
> > constants as well:
> > https://lore.kernel.org/linux-iio/20181215191743.GA1263@himanshu-Vostro-3559/
> >
> > Then there are many things to rant about. But currently I'm busy with
> > university exams, so I might not be able to work on this till 10th March.
> >
> >> Report temperature in millidegrees Celcius instead of degrees.
> >
> > Jonathan, I didn't know about this but is there any rationale to report
> > in millidegress ?
> >
> > I was under the impression that degC is the standard and you would
> > always want you Fitbit/Smart watch to report in degC. Ofcourse,
> > userspace program can convert millideg -> degC
>
> All IIO drivers report millidegrees. It's probably related to hwmon.

Indeed. Good guess :)

>
>
> >> Signed-off-by: Mike Looijmans <mike.looijmans@xxxxxxxx>
> >> ---
> >> drivers/iio/chemical/bme680.h | 6 +-
> >> drivers/iio/chemical/bme680_core.c | 54 ++++++++++++++---
> >> drivers/iio/chemical/bme680_i2c.c | 21 -------
> >> drivers/iio/chemical/bme680_spi.c | 116 +++++++++++++++++++++++++------------
> >> 4 files changed, 126 insertions(+), 71 deletions(-)
> >>
> >> diff --git a/drivers/iio/chemical/bme680.h b/drivers/iio/chemical/bme680.h
> >> index 0ae89b87..4edc5d21 100644
> >> --- a/drivers/iio/chemical/bme680.h
> >> +++ b/drivers/iio/chemical/bme680.h
> >> @@ -2,11 +2,9 @@
> >> #ifndef BME680_H_
> >> #define BME680_H_
> >>
> >> -#define BME680_REG_CHIP_I2C_ID 0xD0
> >> -#define BME680_REG_CHIP_SPI_ID 0x50
> >> +#define BME680_REG_CHIP_ID 0xD0
> >> #define BME680_CHIP_ID_VAL 0x61
> >> -#define BME680_REG_SOFT_RESET_I2C 0xE0
> >> -#define BME680_REG_SOFT_RESET_SPI 0x60
> >> +#define BME680_REG_SOFT_RESET 0xE0
> >> #define BME680_CMD_SOFTRESET 0xB6
> >> #define BME680_REG_STATUS 0x73
> >> #define BME680_SPI_MEM_PAGE_BIT BIT(4)
> >> diff --git a/drivers/iio/chemical/bme680_core.c b/drivers/iio/chemical/bme680_core.c
> >> index 70c1fe4..ccde4c6 100644
> >> --- a/drivers/iio/chemical/bme680_core.c
> >> +++ b/drivers/iio/chemical/bme680_core.c
> >> @@ -63,9 +63,23 @@ struct bme680_data {
> >> s32 t_fine;
> >> };
> >>
> >> +static const struct regmap_range bme680_volatile_ranges[] = {
> >> + regmap_reg_range(BME680_REG_MEAS_STAT_0, BME680_REG_GAS_R_LSB),
> >> + regmap_reg_range(BME680_REG_STATUS, BME680_REG_STATUS),
> >> + regmap_reg_range(BME680_T2_LSB_REG, BME680_GH3_REG),
> >> +};
> >> +
> >> +static const struct regmap_access_table bme680_volatile_table = {
> >> + .yes_ranges = bme680_volatile_ranges,
> >> + .n_yes_ranges = ARRAY_SIZE(bme680_volatile_ranges),
> >> +};
> >> +
> >> const struct regmap_config bme680_regmap_config = {
> >> .reg_bits = 8,
> >> .val_bits = 8,
> >> + .max_register = 0xef,
> >
> > Any reason for 0xEF ?
>
> It's what I calculated the highest possibly address to be. It mostly affects
> debugging (/sys/kernel/debug/regmap).
>
>
> >> + .volatile_table = &bme680_volatile_table,
> >> + .cache_type = REGCACHE_RBTREE,
> >> };
> >> EXPORT_SYMBOL(bme680_regmap_config);
> >>
> >> @@ -316,6 +330,10 @@ static s16 bme680_compensate_temp(struct bme680_data *data,
> >> s64 var1, var2, var3;
> >> s16 calc_temp;
> >>
> >> + /* If the calibration is invalid, attempt to reload it */
> >> + if (!calib->par_t2)
> >> + bme680_read_calib(data, calib);
> >
> > How did you observe this behavior ? Was the readings not correct before ?
>
> I always read back either zeroes or valid data from the calibration data. If
> par_t2 is zero, it nulls the temperature readout, so it must have a non-zero
> value to be valid.
>
> I think the chips needs at least a 'forced' mode before it actually loads
> these, or maybe they're just in a memory region that so slow that it takes the
> chip seconds to read them into its register space.
>
> The datasheet doesn't mention much about these registers at all, so this is
> the best I could come up with.
>
>
> >
> >> var1 = (adc_temp >> 3) - (calib->par_t1 << 1);
> >> var2 = (var1 * calib->par_t2) >> 11;
> >> var3 = ((var1 >> 1) * (var1 >> 1)) >> 12;
> >> @@ -583,8 +601,7 @@ static int bme680_gas_config(struct bme680_data *data)
> >> return ret;
> >> }
> >>
> >> -static int bme680_read_temp(struct bme680_data *data,
> >> - int *val, int *val2)
> >> +static int bme680_read_temp(struct bme680_data *data, int *val)
> >> {
> >> struct device *dev = regmap_get_device(data->regmap);
> >> int ret;
> >> @@ -617,10 +634,9 @@ static int bme680_read_temp(struct bme680_data *data,
> >> * compensate_press/compensate_humid to get compensated
> >> * pressure/humidity readings.
> >> */
> >> - if (val && val2) {
> >> - *val = comp_temp;
> >> - *val2 = 100;
> >> - return IIO_VAL_FRACTIONAL;
> >> + if (val) {
> >> + *val = comp_temp * 10; /* Centidegrees to millidegrees */
> >> + return IIO_VAL_INT;
> >> }
> >>
> >> return ret;
> >> @@ -635,7 +651,7 @@ static int bme680_read_press(struct bme680_data *data,
> >> s32 adc_press;
> >>
> >> /* Read and compensate temperature to get a reading of t_fine */
> >> - ret = bme680_read_temp(data, NULL, NULL);
> >> + ret = bme680_read_temp(data, NULL);
> >> if (ret < 0)
> >> return ret;
> >>
> >> @@ -668,7 +684,7 @@ static int bme680_read_humid(struct bme680_data *data,
> >> u32 comp_humidity;
> >>
> >> /* Read and compensate temperature to get a reading of t_fine */
> >> - ret = bme680_read_temp(data, NULL, NULL);
> >> + ret = bme680_read_temp(data, NULL);
> >> if (ret < 0)
> >> return ret;
> >>
> >> @@ -761,7 +777,7 @@ static int bme680_read_raw(struct iio_dev *indio_dev,
> >> case IIO_CHAN_INFO_PROCESSED:
> >> switch (chan->type) {
> >> case IIO_TEMP:
> >> - return bme680_read_temp(data, val, val2);
> >> + return bme680_read_temp(data, val);
> >> case IIO_PRESSURE:
> >> return bme680_read_press(data, val, val2);
> >> case IIO_HUMIDITYRELATIVE:
> >> @@ -867,8 +883,28 @@ int bme680_core_probe(struct device *dev, struct regmap *regmap,
> >> {
> >> struct iio_dev *indio_dev;
> >> struct bme680_data *data;
> >> + unsigned int val;
> >> int ret;
> >>
> >> + ret = regmap_write(regmap, BME680_REG_SOFT_RESET,
> >> + BME680_CMD_SOFTRESET);
> >> + if (ret < 0) {
> >> + dev_err(dev, "Failed to reset chip\n");
> >> + return ret;
> >> + }
> >
> > FYI Bosch commented to me earlier:
> >
> > "It is recommended but not mandatory. The idea behind the soft
> > reset is to get the driver in sync with the sensor and the
> > fastest way to do so is via a soft reset."
> >
> > But its good to keep it, nevermind ..
>
> This initialization block was just moved from the SPI and I2C parts into the
> core, no change in behaviour.
>
> >
> >> + ret = regmap_read(regmap, BME680_REG_CHIP_ID, &val);
> >> + if (ret < 0) {
> >> + dev_err(dev, "Error reading chip ID\n");
> >> + return ret;
> >> + }
> >> +
> >> + if (val != BME680_CHIP_ID_VAL) {
> >> + dev_err(dev, "Wrong chip ID, got %x expected %x\n",
> >> + val, BME680_CHIP_ID_VAL);
> >> + return -ENODEV;
> >> + }
> >> +
> >> indio_dev = devm_iio_device_alloc(dev, sizeof(*data));
> >> if (!indio_dev)
> >> return -ENOMEM;
> >> diff --git a/drivers/iio/chemical/bme680_i2c.c b/drivers/iio/chemical/bme680_i2c.c
> >> index 06d4be5..cfc4449 100644
> >> --- a/drivers/iio/chemical/bme680_i2c.c
> >> +++ b/drivers/iio/chemical/bme680_i2c.c
> >> @@ -23,8 +23,6 @@ static int bme680_i2c_probe(struct i2c_client *client,
> >> {
> >> struct regmap *regmap;
> >> const char *name = NULL;
> >> - unsigned int val;
> >> - int ret;
> >>
> >> regmap = devm_regmap_init_i2c(client, &bme680_regmap_config);
> >> if (IS_ERR(regmap)) {
> >> @@ -33,25 +31,6 @@ static int bme680_i2c_probe(struct i2c_client *client,
> >> return PTR_ERR(regmap);
> >> }
> >>
> >> - ret = regmap_write(regmap, BME680_REG_SOFT_RESET_I2C,
> >> - BME680_CMD_SOFTRESET);
> >> - if (ret < 0) {
> >> - dev_err(&client->dev, "Failed to reset chip\n");
> >> - return ret;
> >> - }
> >> -
> >> - ret = regmap_read(regmap, BME680_REG_CHIP_I2C_ID, &val);
> >> - if (ret < 0) {
> >> - dev_err(&client->dev, "Error reading I2C chip ID\n");
> >> - return ret;
> >> - }
> >> -
> >> - if (val != BME680_CHIP_ID_VAL) {
> >> - dev_err(&client->dev, "Wrong chip ID, got %x expected %x\n",
> >> - val, BME680_CHIP_ID_VAL);
> >> - return -ENODEV;
> >> - }
> >> -
> >> if (id)
> >> name = id->name;
> >>
> >> diff --git a/drivers/iio/chemical/bme680_spi.c b/drivers/iio/chemical/bme680_spi.c
> >> index c9fb05e..e130715 100644
> >> --- a/drivers/iio/chemical/bme680_spi.c
> >> +++ b/drivers/iio/chemical/bme680_spi.c
> >> @@ -11,28 +11,94 @@
> >>
> >> #include "bme680.h"
> >>
> >> +struct bme680_spi_bus_context {
> >> + struct spi_device *spi;
> >> + u8 current_page;
> >> +};
> >> +
> >> +/*
> >> + * In SPI mode there are only 7 address bits, a "page" register determines
> >> + * which part of the 8-bit range is active. This function looks at the address
> >> + * and writes the page selection bit if needed
> >> + */
> >> +static int bme680_regmap_spi_select_page(
> >> + struct bme680_spi_bus_context *ctx, u8 reg)
> >> +{
> >> + struct spi_device *spi = ctx->spi;
> >> + int ret;
> >> + u8 buf[2];
> >> + u8 page = (reg & 0x80) ? 0 : 1; /* Page "1" is low range */
> >> +
> >> + if (page == ctx->current_page)
> >> + return 0;
> >> +
> >> + /*
> >> + * Data sheet claims we're only allowed to change bit 4, so we must do
> >> + * a read-modify-write on each and every page select
> >> + */
> >> + buf[0] = BME680_REG_STATUS;
> >> + ret = spi_write_then_read(spi, buf, 1, buf + 1, 1);
> >> + if (ret < 0) {
> >> + dev_err(&spi->dev, "failed to set page %u\n", page);
> >> + return ret;
> >> + }
> >> +
> >> + buf[0] = BME680_REG_STATUS;
> >> + if (page)
> >> + buf[1] |= BME680_SPI_MEM_PAGE_BIT;
> >> + else
> >> + buf[1] &= ~BME680_SPI_MEM_PAGE_BIT;
> >> +
> >> + ret = spi_write(spi, buf, 2);
> >> + if (ret < 0) {
> >> + dev_err(&spi->dev, "failed to set page %u\n", page);
> >> + return ret;
> >> + }
> >> +
> >> + ctx->current_page = page;
> >> +
> >> + return 0;
> >> +}
> >
> > I will take another look at this later as my head is spinning now ...
> >
> >> static int bme680_regmap_spi_write(void *context, const void *data,
> >> size_t count)
> >> {
> >> - struct spi_device *spi = context;
> >> + struct bme680_spi_bus_context *ctx = context;
> >> + struct spi_device *spi = ctx->spi;
> >> + int ret;
> >> u8 buf[2];
> >>
> >> memcpy(buf, data, 2);
> >> +
> >> + ret = bme680_regmap_spi_select_page(ctx, buf[0]);
> >> + if (ret)
> >> + return ret;
> >> +
> >> /*
> >> * The SPI register address (= full register address without bit 7)
> >> * and the write command (bit7 = RW = '0')
> >> */
> >> buf[0] &= ~0x80;
> >>
> >> - return spi_write_then_read(spi, buf, 2, NULL, 0);
> >> + return spi_write(spi, buf, 2);
> >> }
> >>
> >> static int bme680_regmap_spi_read(void *context, const void *reg,
> >> size_t reg_size, void *val, size_t val_size)
> >> {
> >> - struct spi_device *spi = context;
> >> + struct bme680_spi_bus_context *ctx = context;
> >> + struct spi_device *spi = ctx->spi;
> >> + int ret;
> >> + u8 addr = *(const u8 *)reg;
> >> + u8 addr7;
> >
> > Unused variable.
>
> Indeed. Will need a v2.
>
> >
> >> + ret = bme680_regmap_spi_select_page(ctx, addr);
> >> + if (ret)
> >> + return ret;
> >>
> >> - return spi_write_then_read(spi, reg, reg_size, val, val_size);
> >> + addr |= 0x80; /* bit7 = RW = '1' */
> >> +
> >> + return spi_write_then_read(spi, &addr, 1, val, val_size);
> >> }
> >>
> >> static struct regmap_bus bme680_regmap_bus = {
> >> @@ -45,8 +111,8 @@ static int bme680_regmap_spi_read(void *context, const void *reg,
> >> static int bme680_spi_probe(struct spi_device *spi)
> >> {
> >> const struct spi_device_id *id = spi_get_device_id(spi);
> >> + struct bme680_spi_bus_context *bus_context;
> >> struct regmap *regmap;
> >> - unsigned int val;
> >> int ret;
> >>
> >> spi->bits_per_word = 8;
> >> @@ -56,45 +122,21 @@ static int bme680_spi_probe(struct spi_device *spi)
> >> return ret;
> >> }
> >>
> >> + bus_context = devm_kzalloc(&spi->dev, sizeof(*bus_context), GFP_KERNEL);
> >> + if (!bus_context)
> >> + return -ENOMEM;
> >> +
> >> + bus_context->spi = spi;
> >> + bus_context->current_page = 0xff; /* Undefined on warm boot */
> >
> > OK. This is what you observed.
> >
> > If this patch works as expected then I think a "Fixes:" tag should be
> > added while marking it for stable ?
> >
> > Thanks for the patch Mike and apologies again for the whole mess.
> >
>
> You're welcome.

Nice work Mike.

Jonathan