Re: [PATCH V2 1/2] riscv: Add RISC-V svpbmt extension

From: Nick Kossifidis
Date: Tue Sep 28 2021 - 09:19:20 EST

On 9/28/21 7:26 AM, Atish Patra wrote:
> On Mon, Sep 27, 2021 at 8:50 PM Anup Patel <anup@xxxxxxxxxxxxxx> wrote:
>> On Tue, Sep 28, 2021 at 6:32 AM Nick Kossifidis <mick@xxxxxxxxxxxx> wrote:
>>> Στις 2021-09-27 23:13, Atish Patra έγραψε:
>>>>> We need to decide whether we should support the upstream kernel for
>>>>> D1. Few things to consider.
>>>>> – Can it be considered as an errata ?
>>> It's one thing to follow the spec and have an error in the
>>> implementation, and another to not follow the spec.
>>>>> – Does it set a bad precedent and open can of worms in future ?
>>> IMHO yes, I'm thinking of Kendryte 210 devs for example coming up and
>>> asking for MMU support, they 've also shipped many chips already. I can
>>> also imagine other vendors in the future coming up with implementations
>>> that violate the spec in which case handling the standard stuff will
>>> become messy and complex, and hurt performance/security. We'll end up
>>> filling the code with exceptions and tweaks all over the place. We need
>>> to be strict about what is "riscv" and what's "draft riscv" or "riscv
>>> inspired", and what we are willing to support upstream. I can understand
>>> supporting vendor extensions upstream but they need to fit within the
>>> standard spec, we can't have for example extensions that use encoding
>>> space/csrs/fields etc reserved for standard use, they may only use
>>> what's reserved for custom/vendor use. At least let's agree on that.
>> Totally agree with Nick here. It's a slippery slope.
>> Including D1 PTE bits (or Kendryte K210 MMU) part of the Linux RISC-V
>> means future hardware which intentionally violates specs will also have to
>> be merged and the RISC-V patch acceptance policy will have no significance.
>>>>> – Can we just ignore D1 given the mass volume ?
>>> IMHO no, we need to find a way to support it upstream but I believe
>>> there is another question to answer:
>>> Do we also guarantee "one image to rule them all" approach, required by
>>> binary distros, for implementations that violate the spec ? Are we ok
>>> for example to support Allwinner D1 upstream but require a custom
>>> configuration/build instead of supporting it with the "generic" image ?
>>> In one case we need to handle the violation at runtime and introduce
>>> overhead for everyone (like looking up __riscv_svpbmt every time we set
>>> a PTE in this case), in the other it's an #ifdef.
>> At least, we should not have hardware violating specs as part of the
>> unified kernel image instead have these intentional deviations/violations
>> under separate kconfig which will not be enabled by default. This means
>> vendors (of such hardware) and distros will have to explicitly enable
>> support for such violations/deviations.
> If we merge the code and are not enabled by default, it would be a
> maintenance nightmare in future.
> These part of the kernel will not be regularly tested but we have to
> carry the changes for a long time.

I don't see a difference between having these features as part of the
generic image vs having them as custom configs/builds. The code will get
executed only on boards that support the custom/non-compliant
implementation anyway. To the contrary we'll have more code to test if
we are doing things at runtime vs at compile time.

> Similar changes will only grow over time causing a lot of custom
> configs that are not enabled by default.

We'll have a lot of custom configs that will only get used on boards
that use them, vs runtime code that will run for no reason on every
board and choose the default/standard-compliant implementation most of
the time. In the end the code will only get tested on specific hardware

> IMHO, if we want to support this board in upstream, we should just
> clearly state that it is one time special exception
> for this board only because of the following reasons
> 1. The board design predates the patch acceptance policy.
> 2. We don't have enough affordable Linux compatible platforms today.
> 3. Allowing running an upstream kernel on D1 helps the RISC-V software
> ecosystem to grow.

The same can be said for Kendryte as well, are we willing to also
support their MMU implementation on the generic image if a patch comes
in ? To be clear I'm not saying we shouldn't support D1 or Kendryte
upstream, I'm just saying that we shouldn't sacrifice the compleity and
performance of the code path for standard-compliant implementations, to
support non-compliant implementations, and instead support non-compliant
implementations with custom kernel builds using compile time options. It
still counts as upstream support, they won't have to maintain their own
forks. It'll also allow custom implementations to have more flexibility
on what they can do since they will be able to use completely
different/custom code paths, instead of trying to fit in the standard
code path (which will become a mess over time). I think this approach is
much more flexible and will allow more customizations to be supported
upstream in the future.