Re: [RFC] ITS fails to allocate on rk3568/rk3566

From: Kever Yang
Date: Tue Apr 20 2021 - 21:41:23 EST


Hi Marc,

On 2021/4/16 下午11:23, Marc Zyngier wrote:
On Fri, 16 Apr 2021 02:13:38 +0100,
Kever Yang <kever.yang@xxxxxxxxxxxxxx> wrote:
Hi Marc,

On 2021/4/15 下午4:11, Marc Zyngier wrote:
Hi Kever,

On Thu, 15 Apr 2021 08:24:33 +0100,
Kever Yang <kever.yang@xxxxxxxxxxxxxx> wrote:
Hi Marc, Peter,

    RK356x GIC has two issues:

1. GIC only support 32bit address while rk356x supports 8GB DDR SDRAM,
so we use ZONE_DMA32 to fix this issue;
What transactions does this affect exactly?
The GIC on rk356x is a 32bit master, which means all the space its
logic need to access should be in the 4GB range.
Well, at least this is consistently broken.

There are many legacy IPs which only support 32bit bus, we have to use them as is in the new

64bit SoCs, I think the 32bit GIC can be considered the same as those case, can we add CONFIG_ZONE_DMA32

support in GIC driver?

eg. Other peripheral driver use dma_alloc_coherent() instead of alloc_pages(), so then can support

ZONE_DMA32 without any code change.


Only some ITS tables? Or
all of them, including the command queue? What about the configuration
and pending tables associated with the redistributors?

2. GIC version is r1p6-00rel0, RK356x interconnect does not support
GIC and CPU snoop to each other, hence the GIC does not support the
shareability feature.  The read of register value for shareability
feature does not return as expect in GICR and GITS, so we have to
workaround for it.
How about the cacheability attribute? Can you please provide the exact
set of attributes that this system actually supports for each of the
ITS and redistributor base registers?
The shareability attributes in GICR_PENDBASEER, GICR_PROPBASER,
GITS_BASERn, GITS_CBASER default value is 0b00, when we set 0b01 then
read returns 0b01.
And I claim that this is a perfectly broken behaviour. How do you
expect software to find about the gory details of the integration?
That's the only way for SW to find out what the HW is capable of...

As a software engineer, I totally agree with you on this point, but when we back to the truth,

we should know that these registers has never been defined as "module capability for shareability"

in hardware. The software use it as capability detect, and it works before, but not means this

match the original hardware design, in this case the new hardware may not follow the legacy

design because it's not a standard or a clear enough guide for it.

The capability register should always read-only instead of read-write, the understanding from

software engineer and from hardware engineer always have a gap.

I would proposal to add a optional dts property for driver to identify if the board's GIC support

shareability or not, which is a method used by many other module drivers.

Since there is no ACE coherency interface for this GIC controller, all
the cacheability in the GIC is not support in hardware.

Also, please provide errata numbers for these two issues so that we
can properly document them and track the workarounds.
What kind of errata do you need, could you please share any kind of
example close to this case?
I would like something that says:

"ROCKCHIP_ERRATUM_123456: The GIC600 integration in RK356x doesn't
support any of the shareability or cacheability attributes, and
requires both values to be set to 0b00 for all the ITS and
Redistributor tables."

This is pretty similar to the bug affecting ThunderX with its "erratum
24313" (covered by CONFIG_CAVIUM_ERRATUM_22375), where the tables have
to be flagged as non-cacheable. The Rockchip one is just worse.

We need an official erratum number so that we can refer to it in the
source, commit log and documentation, as well as cross-reference it
with the TRM. This number will be part of a configuration symbol that
will make the compilation conditional so that people don't have to
carry the extra burden generated by this bug if they don't need to.

Same thing goes for the 32bit bug.

We consider this as a SoC implement design instead of a bug, so we
will add document in RK356X  TRM to describe the GIC design, but no
idea how to provide the errata.

Here is the shareabily attribute from ARM GIC architecture specification:
Shareability, bits [11:10] (from GITS_CBASER)
Indicates the Shareability attributes of accesses to the command
queue. The possible values of this field are:
0b00 Non-shareable.
0b01 Inner Shareable.
0b10 Outer Shareable.
0b11 Reserved. Treated as 0b00.
It is IMPLEMENTATION DEFINED whether this field has a fixed value or
can be programmed by software. Implementing this field with a fixed
value is deprecated.
On a Warm reset, this field resets to an architecturally UNKNOWN value

As you can see, "Implementing this field with a fixed value is
deprecated", so software should program this field to '0b00
Non-shareable' if the SoC design does not support the cache
shareability.
[I really feel special when people quote the GIC spec at me]

That isn't what it says. Hardcoding the field with a fixed value is
indeed deprecated, but that doesn't mean this field should accept
values that the HW cannot support. If anything, what this says is "try
and implement the options that SW is going to use".

But you need to give SW an indication of what is usable, because there
is no other way to *discover* what the SoC is capable of at runtime.
Otherwise, we would need to carry a per-SoC list of what the HW
supports. I don't think that's the right thing to do (and you're about
8 years too late anyway).

Other GIC600 integrations got it perfectly right, by the way. Same for
other GIC implementations, with the notable exception of Cavium and
their first GIC in ThunderX, as described above.

I'm not sure if you still work for ARM or not, but we do have double check again and again

with IP vendor about this point, the GIC500 works because it's hardware bind to a fix value inside the IP,

but GIC600 does not do the same binding, and there is no any config option to hardcode the reg field

to a fixed value in the IP when we don't need the ACE-lite.

This is why I insist that this is a different SoC implementation instead of a SoC bug, we should add the

support in the GIC driver for different hardware case.

When you say we're about 8 years too late, I think there still not much SoCs using GIC600, maybe the GIC600 IP

is available for user at mid of 2017?


Thanks,
- Kever

Thanks,

M.