Re: [PATCHv2] of: Add generic handling for ePAPR 1.1 fail-sss states

From: Frank Rowand
Date: Thu Sep 08 2016 - 15:05:46 EST


On 09/08/16 06:38, Rob Herring wrote:
> On Wed, Aug 31, 2016 at 4:41 PM, Tony Lindgren <tony@xxxxxxxxxxx> wrote:
>> * Frank Rowand <frowand.list@xxxxxxxxx> [160831 13:51]:
>>> On 08/29/16 15:35, Tony Lindgren wrote:
>>>> if (of_device_is_incomplete(pdev->dev.of_node, status)) {
>>>> if (!strcmp("hw-incomplete-pins", status)) {
>>>> dev_info(&pdev->dev,
>>>> "Unusable hardware: Not pinned out\n");
>>>> err = -ENODEV;
>>>> goto out;
>>>> }
>>>> if (!strcmp("hw-missing-daughter-card")) {
>>>> err = -EPROBE_DEFER;
>>>> goto out;
>>>> }
>>>> if (!strcmp("hw-buggy-dma")) {
>>>> dev_warn(&pdev->dev,
>>>> "Replace hardware for working DMA\n");
>>>> }
>>>> }
>>>
>>> What if the device has two issues to be reported? You can not
>>> specify two different values for the status property.
>>
>> That's a good point.
>>
>>> What if the firmware wants to report that the hardware failed
>>> self-test (thus status = "fail-sss") but is already using
>>> status to describe the hardware?
>>
>> Yeah that's true. Do you know what the "sss" stands for here?
>> Status Self teSt, or Side Scan Sonar? :)
>
> String String String!!!?
>
> No clue for me.
>
>>
>>>> - Make more generic as suggested by Frank but stick with
>>>> "operational status of a device" approch most people seem
>>>> to prefer that
>>>
>>> I am still opposed to using the status property for this purpose.
>>>
>>> The status property is intended to report an operational problem with
>>> a device or a device that the kernel can cause to be operational (such
>>> as a quiescent cpu being enabled). It is the only property I am aware
>>> of to report _state_.
>
> Yes, in theory a device can go from disabled to okay, but that's
> generally never been supported. Linux takes the simple approach of
> "disabled" means ignore it. I think we'll see that change with
> overlays.
>
>>> It is unfortunate that Linux has adopted the practice of overloading status
>>> to determine whether a piece of hardware exists or does not exist. This
>>> is extremely useful for the way we structure the .dts and .dtsi files but
>>> should have used a new property name. We are stuck with that choice of
>>> using the status property for two purposes, first the state of a device,
>>> and secondly the hardware description of existing or not existing.
>
> I don't agree. Generally, disabled means the h/w is there, but don't
> use it. There may be some cases where the hardware doesn't exist for
> the convenience of having a single dts, but that's the exception.

That it is not an exception, but instead a frequent pattern for .dtsi files.
A quick look in arm:

$ grep status *.dtsi | wc -l
4842

$ grep status *.dtsi | grep '"disabled"' | wc -l
3431


>>> Why not just create a new property that describes the hardware?
>>> Perhaps something like:
>>>
>>> incomplete = "pins_output", "buggy_dma";
>>
>> New property for incomplete works for me. Rob, got any comments here?
>
> Pins not muxed out or connected on the board has to be the #1 reason
> for disabled status. I don't think we need or want another way to
> express that.

How is that expressed now?


> We may have discussed this, but why can't the driver that checks fail
> state just check whatever was used to set the device to fail in the
> first place?
>
> Rob
>