Re: [GIT PULL 2/3] ARM: dts: samsung: DTS for v5.12

From: Bjorn Andersson
Date: Tue Feb 09 2021 - 12:12:52 EST


On Tue 09 Feb 08:27 CST 2021, Rob Herring wrote:

> On Mon, Feb 8, 2021 at 5:10 PM Alexandre Belloni
> <alexandre.belloni@xxxxxxxxxxx> wrote:
> >
> > On 08/02/2021 23:14:02+0100, Arnd Bergmann wrote:
> > > On Mon, Feb 8, 2021 at 10:35 PM Alexandre Belloni
> > > <alexandre.belloni@xxxxxxxxxxx> wrote:
> > > > On 08/02/2021 20:52:37+0100, Arnd Bergmann wrote:
> > > > > On Mon, Feb 8, 2021 at 7:42 PM Krzysztof Kozlowski <krzk@xxxxxxxxxx> wrote:
> > > > > > Let me steer the discussion to original topic - it's about old kernel
> > > > > > and new DTB, assuming that mainline kernel bisectability is not
> > > > > > affected.
> > > > > >
> > > > > > Flow looks like this:
> > > > > >
> > > > > > 0. You have existing bidings and drivers.
> > > > > > 1. Patch changing bindings (with new compatible) and drivers gets
> > > > > > accepted by maintainer.
> > > > > > 2. Patch above (bindings+drivers) goes during merge window to v5.11-rc1.
> > > > > > 3. Patch changing in-tree DTS to new compatible gets accepted by
> > > > > > maintainer and it is sent as v5.12-rc1 material to SoC maintainers.
> > > > > >
> > > > > > So again: old kernel, using old bindings, new DTB.
> > > > > >
> > > >
> > > > I don't think forward compatibility was ever considered. I've seen it
> > > > being mentioned a few times on #armlinux but honestly this simply can't
> > > > be achieved. This would mean being able to write complete DT bindings
> > > > for a particular SoC at day 0 which will realistically never happen. You
> > > > may noteven have a complete datasheet and even if you have a datasheet,
> > > > it may not be complete or it may be missing hw errata that are
> > > > discovered later on and need a new binding to handle.
> > >
> > > You do not have to write the correct DT for this, the only requirement
> > > is that any changes to a node are backward-compatible, which is
> > > typically the case if you add properties or compatible strings without
> > > removing the old one. A bugfix in this case is also backward-compatible.
> > >
> > > The part that can not happen instead is to write a DT that can expose
> > > features that any future kernel will use.
> > >
> >
> > But I think we are speaking about the other way around were you would be
> > e.g. removing properties or splitting a node is multiple different
> > nodes following a different understanding of the hardware.
> > And in this case, any rework of the bindings will be forbidden, like
> > 32b7cfbd4bb2 ("ARM: dts: at91: remove deprecated ADC properties") will
> > break older kernels trying to use the new dtb.
> > 761f6ed85417 ("ARM: dts: at91: sama5d4: use correct rtc compatible") is
> > an other case.
> > I'm not sure want to keep the older properties or the older compatible
> > string as a fallback for this use case.
> >
> > > > > However, once the firmware is updated, it may no longer be possible to
> > > > > go back to the old kernel in case the new one is busted.
> > > > >
> > > >
> > > > Any serious update strategy will update both the kernel and device tree
> > > > at the same time, exactly like you already have to update the initramfs
> > > > with the kernel as soon as it is including kernel modules.
> > > > I would expect any embedded platform to actually use a container format,
> > > > like a FIT image that will ship the kernel, DT and intiramfs in a single
> > > > image and will allow to sign all parts.
> > >
> > > Embedded systems that do this have no requirement for backward
> > > or forward compatibility at all, the only requirement for these is bisectability
> > > of git commits.
> > >
> >
> > Yes and I can't see any drawbacks in this approach.
> >
> > > > > A similar problem can happen with the EBBR boot flow that relies on
> > > > > a uefi-enabled firmware such as a u-boot, while using grub2 as the
> > > > > actual boot loader. This is commonly supported across distros. While
> > > > > grub2 can load a matching set of kernel+initrd+dtb from disk and run
> > > > > that, this often fails in practice because u-boot needs to fill a
> > > > > board specific set of DT properties (bootargs, detected memory,
> > > > > mac address, ...). The usual way this gets handled is that u-boot loads
> > > > > grub2 and the dtb from disk and then passes the modified dtb to grub,
> > > > > which picks only kernel+initrd from disk and boots this with the dtb.
> > > > >
> > > > > The result is similar to case with dtb built into the firmware: after
> > > > > upgrading the dtb that gets loaded by u-boot, grub can still pick
> > > > > old kernels but they may not work as they did in the past. There are
> > > > > obviously ways to work around it, but it does lead to user frustration.
> > > > >
> > > >
> > > > Are there really any platforms with the dtb built into the firmware?
> > > > I feel like this is a mythical creature used to scare people into keeping
> > > > the DTB ABI stable. Aren't all the distribution already able to cope
> > > > with keeping DTB and kernel in sync?
> > >
> > > I think most traditional PowerPC systems fall into this category, most
> >
> > My understanding was that the traditional PPC systems had a small device
> > tree and usually are not affected by driver changes but I may be wrong.
> >
> > > systems that boot using UEFI+grub (as I explained), and anyone who
> > > uses a distro kernel on custom hardware with their own dtb.
> > >
> >
> > Aren't the ones using a distro kernel with a custom dtb more concerned
> > by backward compatibility (i.e. new kernel with old dtb) rather than old
> > kernel on new dtb? If they have an old dtb, an old kernel, and update to
> > a new kernel, backward compatibility will ensure this continues to work.
> > If then they work on updating their dtb, they still have the old one and
> > can make the distribution match dtb and kernel. This is already handled
> > properly by debian and I guess the other distributions as it is anyway
> > already matching kernel and initramfs.
>
> SUSE is doing the opposite AIUI. This is a bit harder because adding
> any new provider breaks compatibility as the old kernel will wait for
> a non-existent driver for the new provider. That was the motivation
> for deferred probe timeouts. Of course, I wouldn't really call a
> platform stable if you are still adding clock, pinctrl, power-domain,
> etc. providers.
>

IMHO "stable" in this context means that we've hit the point in
development when these questions are no longer relevant. Either because
the development is _done_ or more likely it's too old for anyone to
care.

Unfortunately this is the state that we're optimizing for and we're
simply relying on luck to boot Linux on a reasonably complex machine.

Regards,
Bjorn