Re: [PATCH 2/2] coresight: tmc: Add shutdown callback for TMC ETR/ETF

From: Mike Leach
Date: Wed Jun 03 2020 - 09:51:27 EST


Hi,


On Wed, 3 Jun 2020 at 14:34, Robin Murphy <robin.murphy@xxxxxxx> wrote:
>
> On 2020-06-03 14:22, Mike Leach wrote:
> > Hi Sai,
> >
> > On Wed, 3 Jun 2020 at 13:14, Sai Prakash Ranjan
> > <saiprakash.ranjan@xxxxxxxxxxxxxx> wrote:
> >>
> >> Hi Mike,
> >>
> >> On 2020-06-03 16:57, Mike Leach wrote:
> >>> Hi,
> >>>
> >>> On Wed, 3 Jun 2020 at 11:24, Sai Prakash Ranjan
> >>> <saiprakash.ranjan@xxxxxxxxxxxxxx> wrote:
> >>>>
> >>>> Hi Mike,
> >>>>
> >>>> Thanks again for looking at this.
> >>>>
> >>>> On 2020-06-03 03:42, Mike Leach wrote:
> >>>> [...]
> >>>>
> >>>>>>
> >>>>>> SMMU/IOMMU won't be able to do much here as it is the client's
> >>>>>> responsiblity to
> >>>>>> properly shutdown and SMMU device link just makes sure that
> >>>>>> SMMU(supplier) shutdown is
> >>>>>> called only after its consumers shutdown callbacks are called.
> >>>>>
> >>>>> I think this use case can be handled slightly differently than the
> >>>>> general requirements for modular CoreSight drivers.
> >>>>>
> >>>>> What is needed here is a way of stopping the underlying ETR hardware
> >>>>> from issuing data to the SMMU, until the entire device has been shut
> >>>>> down, in a way that does not remove the driver, breaking existing
> >>>>> references and causing a system crash.
> >>>>>
> >>>>> We could introduce a new mode to the ETR driver - e.g.
> >>>>> CS_MODE_SHUTDOWN.
> >>>>>
> >>>>> At the end of the block tmc_shutdown(struct amba_device *adev), set
> >>>>> drvdata->mode to CS_MODE_SHUTDOWN & remove the coresight_unregister().
> >>>>> This new mode can be used to prevent the underlying hardware from
> >>>>> being able to restart until the device is re-powered.
> >>>>>
> >>>>> This mode can be detected in the code that enables / disables the ETR
> >>>>> and handled appropriately (updates to tmc_enable_etr_sink and
> >>>>> tmc_disable_etr_sink).
> >>>>> This mode will persist until the device is re-started - but because we
> >>>>> are on the device shutdown path this is not an issue.
> >>>>>
> >>>>> This should leave the CoreSight infrastructure stable until the
> >>>>> drivers are shut down normally as part of the device power down
> >>>>> process.
> >>>>>
> >>>>
> >>>> Sounds good to me, but if the coresight_unregister() is the trouble
> >>>> point
> >>>> causing these crashes, then can't we just remove that from
> >>>> tmc_shutdown()
> >>>> callback? This would be like maintaining the same behaviour as now
> >>>> where
> >>>> on reboot/shutdown we basically don't do anything except for disabling
> >>>> ETR.
> >>>
> >>> No - the new mode prevents race conditions where the thread shutting
> >>> down the SMMU does the ETR shutdown, but then another thread happens
> >>> to be trying to start trace and restarts the ETR.
> >>> It also prevents the condition Mathieu discussed where a thread might
> >>> be attempting to shutdown trace - this could try to disable the
> >>> hardware again re-releasing resources/ re-flushing and waiting for
> >>> stop.
> >>>
> >>
> >> I do not think there will a race between SMMU shutdown and ETR shutdown.
> >> Driver core takes care of calling SMMU shutdown after its consumer
> >> shutdown callbacks via device link, otherwise there would already be
> >> bugs in all other client drivers.
> >>
> >
> > I am not saying there could be a race between tmc_shutdowm and
> > Smmu_shutdown - there may be a case if the coresight_disable_path
> > sequence is running and gets to the point of disabling the ETR after
> > the SMMU callback has disabled it.
>
> I'm confused now - there is no "SMMU callback", we're talking about the
> system-wide cleanup from kernel_shutdown_prepare() or
> kernel_restart_prepare(). As far as I'm aware userspace should be long
> gone by that point, so although trace may have been left running ||
((offset >= TRCCIDCVRn(0)) && (offset <= TRCVMIDCVRn(7)), the
> chance of racing against other driver operations seems pretty unlikely.
>

Sorry - bad choice of terminology. I was referring to the SMMU
ensuring that it had all its clients shut-down before if shut down. To
quote Sai...

>>>>> SMMU device link just makes sure that
> >>>>>> SMMU(supplier) shutdown is
> >>>>>> called only after its consumers shutdown callbacks are called.

I agree it is unlikely, but if removing the device from the CoreSight
infrastructure via coresight_unregister() is a potential source of a
crash, it would seem that there is a potential path where some
CoreSight driver side work might be possible. therefore a mode to
prevent this crash, and ensure that the device hardware remains off
and not sending trace to SMMU until such time as shutdown / reboot
restart occurs, seemed prudent.

Mike

> Robin.



--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK