Re: [PATCH] tpm: don't destroy chip device prematurely
From: Jarkko Sakkinen
Date: Wed Oct 05 2016 - 06:09:37 EST
On Tue, Oct 04, 2016 at 10:47:38AM -0600, Jason Gunthorpe wrote:
> On Tue, Oct 04, 2016 at 08:19:46AM +0300, Jarkko Sakkinen wrote:
>
> > > Make the driver uncallable first. The worst race that can happen is that
> > > open("/dev/tpm0", ...) returns -EPIPE. I do not consider this fatal at
> > > all.
> >
> > No responses for this reasonable proposal so I'll show what I mean:
>
> How is this any better than what Thomas proposed? It seems much worse
> to me since now we have even more stuff in the wrong order.
>
> There are three purposes to the ordering as it stands today
> 1) To guarantee that tpm2_shutdown is the last command delivered to
> the TPM. When it is issued all other ways to access the device
> are hard fenced off.
> 2) To hard fence the tpm subsystem for the 'platform' driver. Once
> tpm_del_char_device completes no callback into the driver
> is possible *at all*. The driver can destroy everything
> (iounmap, dereg irq, etc) and the driver module can be unloaded.
> 3) To prevent oopsing with the sysfs code. Recall this comment
>
> /* The sysfs routines rely on an implicit tpm_try_get_ops, device_del
> * is called before ops is null'd and the sysfs core synchronizes this
> * removal so that no callbacks are running or can run again
> */
>
> device_del is what eliminates the sysfs access path, so
> ordering device_del after ops = null is just unconditionally
> wrong.
>
> I still haven't heard an explanation why Thomas's other patches need
> this, or why trying to change this ordering makes any sense at
> all considering how the subsystem is constructed.
>
> Further, if tpm_crb now needs a registered device, how on earth do all
> the chip ops we call work *before* registration? Or is that another
> bug?
>
> Why can't tpm_crb return to the pre-registration operating state
> in the driver remove function before calling unregister?
>
> None of this makes any sense to me.
>
> This whole thing was very carefully constructed to work *correctly*
> during unregister. Many other subsystems have races and bugs during
> remove (eg see the securityfs discussion). TPM has a hard requirement
> to support safe unregister due to the vtpm stuff, so we don't get to
> screw it up just to support one driver.
Obviously a device is needed because it's required by the PM runtime
FW. I'm not following what you're saying about tpm2_shutdown(). With
the change I proposed it's the *very last* command delivered to the
device (because it's fenced by write lock).
> Jason
/Jarkko