Re: [RFC PATCH 00/11] Removing Calxeda platform support

From: André Przywara
Date: Wed Feb 19 2020 - 20:39:28 EST


On 19/02/2020 22:54, Olof Johansson wrote:

Hi,

> On Tue, Feb 18, 2020 at 10:14 AM Andre Przywara <andre.przywara@xxxxxxx> wrote:
>>
>> On Tue, 18 Feb 2020 11:13:10 -0600
>> Rob Herring <robh@xxxxxxxxxx> wrote:
>>
>> Hi,
>>
>>> Calxeda has been defunct for 6 years now. Use of Calxeda servers carried
>>> on for some time afterwards primarily as distro builders for 32-bit ARM.
>>> AFAIK, those systems have been retired in favor of 32-bit VMs on 64-bit
>>> hosts.
>>>
>>> The other use of Calxeda Midway I'm aware of was testing 32-bit ARM KVM
>>> support as there are few or no other systems with enough RAM and LPAE. Now
>>> 32-bit KVM host support is getting removed[1].
>>>
>>> While it's not much maintenance to support, I don't care to convert the
>>> Calxeda DT bindings to schema nor fix any resulting errors in the dts files
>>> (which already don't exactly match what's shipping in firmware).
>>
>> While every kernel maintainer seems always happy to take patches with a negative diffstat, I wonder if this is really justification enough to remove a perfectly working platform. I don't really know about any active users, but experience tells that some platforms really are used for quite a long time, even if they are somewhat obscure. N900 or Netwinder, anyone?
>
> One of the only ways we know to confirm whether there are active users
> or not, is to propose removing a platform.
>
> The good news is that if/when you do, and someone cares enough about
> it to want to keep it alive, they should also have access to hardware
> and can help out in maintaining it and keeping it in a working state.
>
> For some hardware platforms, at some point in time it no longer makes
> sense to keep the latest kernel available on them, especially if
> maintainers and others no longer have easy access to hardware and
> resources/time to keep it functional.
>
> It's really more about "If you care about this enough to keep it
> going, please speak up and help out".

I understand that, hence this email ;-)

I just wanted to avoid the impression, from looking at the replies on
the list, that *everybody* is happy with the removal and that it simply
goes ahead. I have no idea how many actual *users* read this list and
this email.

>> So to not give the impression that actually *everyone* (from that small subset of people actively reading the kernel list) is happy with that, I think that having support for at least Midway would be useful. On the one hand it's a decent LPAE platform (with memory actually exceeding 4GB), and on the other hand it's something with capable I/O (SATA) and networking, so one can actually stress test the system. Which is the reason I was using that for KVM testing, but even with that probably going away now there remain still some use cases, and be it for general ARM(32) testing.
>
> How many bugs have you found on this platform that you would not have
> on a more popular one? And, how many of those bugs only affected this
> platform, i.e. just adding onto the support burden without positive
> impact to the broader community?

I have found and helped fix (or fixed myself) multiple bugs on the
Midway in the past. The mixture of decent I/O and 8GB of DRAM seemed to
be unique enough to spot bugs that didn't easily show on other systems.
Most were in KVM, but some were generic, and I remember at least one
that was LPAE-related. And some bugs only showed under stress, because
you can actually run something useful on that machine before it is
brought to its knees.

>> I don't particularly care about the more optional parts like EDAC, cpuidle, or cpufreq, but I wonder if keeping in at least the rather small SATA and XGMAC drivers and basic platform support is feasible.
>
> At what point are you better off just running under QEMU/virtualization?

For many of the things we are looking at, that's not really an option.
If it were very involved or painful to keep the support alive (as in
the KVM/arm32 case), I would see your point, but a few isolated and
mostly quite small drivers don't justify a removal, IMHO. I think we
have far worse and older code in the kernel to worry about.

>> If YAML DT bindings are used as an excuse, I am more than happy to convert those over.
>>
>> And if anyone has any particular gripes with some code, maybe there is a way to fix that instead of removing it? I was always wondering if we could get rid of the mach-highbank directory, for instance. I think most of it is Highbank (Cortex-A9) related.
>
> Again, how do you fix it if nobody has signed up for maintaining and
> keeping it working? Doing blind changes that might or might not work
> is not a way to keep a platform supported.
>
> Just because code is removed, it doesn't mean it can't be reintroduced
> when someone comes along and wants to do that. Look at some of the
> recent additions of old OLPC hardware support, for example. But
> there's a difference between this and keeping the code around hoping
> that someone will care about it. It's not lost, and it's easy to bring
> back.

OK, maybe I should have been more explicit: If Rob does not want to
maintain it anymore, I am happy to throw my hat in the ring.

I have a working Midway system under my desk, with at least four working
nodes, two of which have an SSD connected and are running an
off-the-shelf Ubuntu 18.04 or Debian userland. I mostly run mainline
kernels, but try the distro kernels as well from time to time.
I routinely test at least every -rc1 for regressions.

I also have updates to the A-15 firmware parts (U-Boot and PSCI runtime,
including PSCI 1.0 support and a Spectre V2 workaround), and have a
working setup to either chainload or actually update the firmware on the
flash. Happy to share that if someone is interested. For U-Boot I wanted
to send updates anyway.
I also have an old Highbank system lying around, but haven't turned that
on in years.

So would just a patch to MAINTAINERS be a solution?

Cheers,
Andre