Re: [PATCH v3 2/2] tpm: in tpm2_del_space check if ops pointer is still valid

From: Lino Sanfilippo
Date: Tue Feb 09 2021 - 06:54:02 EST


Hi Jason,

On 05.02.21 18:25, Jason Gunthorpe wrote:
> On Fri, Feb 05, 2021 at 08:48:11AM -0800, James Bottomley wrote:
>>> Thanks for pointing this out. I'd strongly support Jason's proposal:
>>>
>>> https://lore.kernel.org/linux-integrity/20201215175624.GG5487@xxxxxxxx/
>>>
>>> It's the best long-term way to fix this.
>>
>> Really, no it's not. It introduces extra mechanism we don't need.
>
>> To recap the issue: character devices already have an automatic
>> mechanism which holds a reference to the struct device while the
>> character node is open so the default is to release resources on final
>> put of the struct device.
>
> The refcount on the struct device only keeps the memory alive, it
> doesn't say anything about the ops. We still need to lock and check
> the ops each and every time they are used.
>
> The fact cdev goes all the way till fput means we don't need the extra
> get/put I suggested to Lino at all.
>
>> The practical consequence of this model is that if you allocate a chip
>> structure with tpm_chip_alloc() you have to release it again by doing a
>> put of *both* devices.
>
> The final put of the devs should be directly after the
> cdev_device_del(), not in a devm. This became all confused because the
> devs was created during alloc, not register. Having a device that is
> initialized but will never be added is weird.
>
> See sketch below.
>
>> Stefan noticed the latter, so we got the bogus patch 8979b02aaf1d
>> ("tpm: Fix reference count to main device") applied which simply breaks
>> the master/slave model by not taking a reference on the master for the
>> slave. I'm not sure why I didn't notice the problem with this fix at
>> the time, but attention must have been elsewhere.
>
> Well, this is sort of OK because we never use the devs in TPM1, so we
> end up freeing the chip with a positive refcount on the devs, which is
> weird but not a functional bug.
>
> Jason
>
> diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
> index ddaeceb7e10910..e07193a0dd4438 100644
> --- a/drivers/char/tpm/tpm-chip.c
> +++ b/drivers/char/tpm/tpm-chip.c
> @@ -344,7 +344,6 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
> chip->dev_num = rc;
>
> device_initialize(&chip->dev);
> - device_initialize(&chip->devs);
>
> chip->dev.class = tpm_class;
> chip->dev.class->shutdown_pre = tpm_class_shutdown;
> @@ -352,29 +351,12 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
> chip->dev.parent = pdev;
> chip->dev.groups = chip->groups;
>
> - chip->devs.parent = pdev;
> - chip->devs.class = tpmrm_class;
> - chip->devs.release = tpm_devs_release;
> - /* get extra reference on main device to hold on
> - * behalf of devs. This holds the chip structure
> - * while cdevs is in use. The corresponding put
> - * is in the tpm_devs_release (TPM2 only)
> - */
> - if (chip->flags & TPM_CHIP_FLAG_TPM2)
> - get_device(&chip->dev);
> -
> if (chip->dev_num == 0)
> chip->dev.devt = MKDEV(MISC_MAJOR, TPM_MINOR);
> else
> chip->dev.devt = MKDEV(MAJOR(tpm_devt), chip->dev_num);
>
> - chip->devs.devt =
> - MKDEV(MAJOR(tpm_devt), chip->dev_num + TPM_NUM_DEVICES);
> -
> rc = dev_set_name(&chip->dev, "tpm%d", chip->dev_num);
> - if (rc)
> - goto out;
> - rc = dev_set_name(&chip->devs, "tpmrm%d", chip->dev_num);
> if (rc)
> goto out;
>
> @@ -382,9 +364,7 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
> chip->flags |= TPM_CHIP_FLAG_VIRTUAL;
>
> cdev_init(&chip->cdev, &tpm_fops);
> - cdev_init(&chip->cdevs, &tpmrm_fops);
> chip->cdev.owner = THIS_MODULE;
> - chip->cdevs.owner = THIS_MODULE;
>
> rc = tpm2_init_space(&chip->work_space, TPM2_SPACE_BUFFER_SIZE);
> if (rc) {
> @@ -396,7 +376,6 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
> return chip;
>
> out:
> - put_device(&chip->devs);
> put_device(&chip->dev);
> return ERR_PTR(rc);
> }
> @@ -445,13 +424,33 @@ static int tpm_add_char_device(struct tpm_chip *chip)
> }
>
> if (chip->flags & TPM_CHIP_FLAG_TPM2) {
> + device_initialize(&chip->devs);
> + chip->devs.parent = pdev;
> + chip->devs.class = tpmrm_class;
> + rc = dev_set_name(&chip->devs, "tpmrm%d", chip->dev_num);
> + if (rc)
> + goto out_put_devs;
> +
> + /*
> + * get extra reference on main device to hold on behalf of devs.
> + * This holds the chip structure while cdevs is in use. The
> + * corresponding put is in the tpm_devs_release.
> + */
> + get_device(&chip->dev);
> + chip->devs.release = tpm_devs_release;
> +
> + chip->devs.devt =
> + MKDEV(MAJOR(tpm_devt), chip->dev_num + TPM_NUM_DEVICES);
> + cdev_init(&chip->cdevs, &tpmrm_fops);
> + chip->cdevs.owner = THIS_MODULE;
> +
> rc = cdev_device_add(&chip->cdevs, &chip->devs);
> if (rc) {
> dev_err(&chip->devs,
> "unable to cdev_device_add() %s, major %d, minor %d, err=%d\n",
> dev_name(&chip->devs), MAJOR(chip->devs.devt),
> MINOR(chip->devs.devt), rc);
> - return rc;
> + goto out_put_devs;
> }
> }
>
> @@ -460,6 +459,10 @@ static int tpm_add_char_device(struct tpm_chip *chip)
> idr_replace(&dev_nums_idr, chip, chip->dev_num);
> mutex_unlock(&idr_lock);
>
> +out_put_devs:
> + put_device(&chip->devs);
> +out_del_dev:
> + cdev_device_del(&chip->cdev);
> return rc;
> }
>
> @@ -640,8 +643,10 @@ void tpm_chip_unregister(struct tpm_chip *chip)
> if (IS_ENABLED(CONFIG_HW_RANDOM_TPM))
> hwrng_unregister(&chip->hwrng);
> tpm_bios_log_teardown(chip);
> - if (chip->flags & TPM_CHIP_FLAG_TPM2)
> + if (chip->flags & TPM_CHIP_FLAG_TPM2) {
> cdev_device_del(&chip->cdevs, &chip->devs);
> + put_device(&chip->devs);
> + }
> tpm_del_char_device(chip);
> }
> EXPORT_SYMBOL_GPL(tpm_chip_unregister);
>

I tested the solution you sketched and it fixes the issue for me. Will you send a (real) patch for this?

Best regards,
Lino