Re: [PATCH] vfs: keep inodes with page cache off the inode shrinker LRU

From: Arnd Bergmann
Date: Thu Feb 13 2020 - 11:53:11 EST

Next message: Chris Paterson: "RE: [PATCH 4.19 00/52] 4.19.104-stable review"
Previous message: Hannes Reinecke: "Re: [PATCH] scsi: advansys: Replace zero-length array with flexible-array member"
In reply to: Lucas Stach: "Re: [PATCH] vfs: keep inodes with page cache off the inode shrinker LRU"
Next in thread: Geert Uytterhoeven: "Re: [PATCH] vfs: keep inodes with page cache off the inode shrinker LRU"

On Wed, Feb 12, 2020 at 9:50 AM Russell King - ARM Linux admin
<linux@xxxxxxxxxxxxxxx> wrote:
>
> On Tue, Feb 11, 2020 at 05:03:02PM -0800, Linus Torvalds wrote:
> > On Tue, Feb 11, 2020 at 4:47 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > What's the situation with highmem on ARM?
> >
> > Afaik it's exactly the same as highmem on x86 - only 32-bit ARM ever
> > needed it, and I was ranting at some people for repeating all the
> > mistakes Intel did.
> >
> > But arm64 doesn't need it, and while 32-bit arm is obviously still
> > selling, I think that in many ways the switch-over to 64-bit has been
> > quicker on ARM than it was on x86. Partly because it happened later
> > (so all the 64-bit teething pains were dealt with), but largely
> > because everybody ended up actively discouraging 32-bit on the Android
> > side.
> >
> > There were a couple of unfortunate early 32-bit arm server attempts,
> > but they were - predictably - complete garbage and nobody bought them.
> > They don't exist any more.

I'd generally agree with that. The systems with more than 4GB tended to
be high-end systems predating the Cortex-A53/A57 that quickly got
replaced once there were actual 64-bit parts. This would include axm5516
(replaced with x86-64 cores after sale to Intel), hip04 (replaced
with arm64), or ecx-2000 (Calxeda bankruptcy).

The one 32-bit SoC that I can think of that can actually drive lots of
RAM and is still actively marketed is TI Keystone-2/AM5K2.
The embedded AM5K2 is listed as supporting up to 8GB of RAM, but
the version in the HPE ProLiant m800 server could take up to 32GB (!).

I added Santosh and Kishon to Cc, they can probably comment on how
long they think users will keep upgrading kernels on these. I suspect these
devices can live for a very long time in things like wireless base stations,
but it's possible that they all run on old kernels anyway by now (and are
not worried about y2038).

> > So at least my gut feel is that the arm people don't have any big
> > reason to push for maintaining HIGHMEM support either.
> >
> > But I'm adding a couple of arm people and the arm list just in case
> > they have some input.
> >
> > [ Obvious background for newly added people: we're talking about
> > making CONFIG_HIGHMEM a deprecated feature and saying that if you want
> > to run with lots of memory on a 32-bit kernel, you're doing legacy
> > stuff and can use a legacy kernel ]
>
> Well, the recent 32-bit ARM systems generally have more than 1G
> of memory, so make use of highmem as a rule. You're probably
> talking about crippling support for any 32-bit ARM system produced
> in the last 8 to 10 years.

What I'm observing in the newly added board support is that memory
configurations are actually going down, driven by component cost.
512MB is really cheap (~$4) these days with a single 256Mx16 DDR3
chip or two 128Mx16. Going beyond 1GB is where things get expensive
with either 4+ chips or LPDDR3/LPDDR4 memory.

For designs with 1GB, we're probably better off just using
CONFIG_VMSPLIT_3G_OPT (without LPAE) anyway, completely
avoiding highmem. That is particularly true on systems with a custom
kernel configuration.

2GB machines are less common, but are definitely important, e.g.
MT6580 based Android phones and some industrial embedded machines
that will live a long time. I've recently seen reports of odd behavior
with CONFIG_VMSPLIT_2G plus CONFIG_HIGHMEM and a 7:1
ratio of lowmem to highmem that apparently causes OOM despite lots
of lowmem being free. I suspect a lot of those workloads would still be
better off with a CONFIG_VMSPLIT_2G_OPT (1.75 GB user, 2GB
linear map). That config unfortunately has a few problems, too:
- nobody has implemented it
- it won't work with LPAE and therefore cannot support hardware
  that relies on high physical addresses for RAM or MMIO
  (those could run CONFIG_VMSPLIT_2G at the cost of wasting
  12.5% of RAM).
- any workload that requires the full 3GB of virtual address space won't
  work at all. This might be e.g. MAP_FIXED users, or build servers
  linking large binaries.
It will take a while to find out what kinds of workloads suffer the most
from a different vmsplit and what can be done to address that, but we
could start by changing the kernel defconfig and distro builds to see
who complains ;-)
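
To make the arithmetic behind the 7:1 ratio and the 12.5% figure concrete,
here is a rough sketch of how the lowmem/highmem split follows from the
chosen vmsplit. The layout constants (VMALLOC_END, a 240MB vmalloc reserve,
an 8MB gap) and the PAGE_OFFSET values are my assumptions about the 32-bit
ARM defaults, the 0x70000000 value for the not-yet-existing
CONFIG_VMSPLIT_2G_OPT is only inferred from the "1.75 GB user" number above,
and the helper name is made up for illustration. It ignores the module area
and any holes in physical memory.

```python
MiB = 1 << 20

# Assumed 32-bit ARM layout constants (illustrative, not authoritative).
VMALLOC_END = 0xFF800000
VMALLOC_RESERVE = 240 * MiB   # default vmalloc area size
VMALLOC_OFFSET = 8 * MiB      # gap between linear map and vmalloc

# Assumed PAGE_OFFSET per split; the last entry is hypothetical.
PAGE_OFFSET = {
    "VMSPLIT_3G": 0xC0000000,
    "VMSPLIT_3G_OPT": 0xB0000000,
    "VMSPLIT_2G": 0x80000000,
    "VMSPLIT_2G_OPT (hypothetical)": 0x70000000,
}


def lowmem_highmem_mib(ram_mib, vmsplit):
    """Return (lowmem, highmem) in MiB for a RAM size and vmsplit choice."""
    # Highest virtual address usable by the linear map.
    linear_end = VMALLOC_END - VMALLOC_RESERVE - VMALLOC_OFFSET
    # Amount of RAM that fits into the linear map.
    linear_limit = linear_end - PAGE_OFFSET[vmsplit]
    ram = ram_mib * MiB
    lowmem = min(ram, linear_limit)
    return lowmem // MiB, (ram - lowmem) // MiB


if __name__ == "__main__":
    for ram_mib in (1024, 2048):
        for name in PAGE_OFFSET:
            low, high = lowmem_highmem_mib(ram_mib, name)
            pct = 100.0 * high / ram_mib
            print(f"{ram_mib} MiB RAM, {name}: lowmem {low} MiB, "
                  f"highmem {high} MiB ({pct:.1f}% of RAM)")
```

Under these assumptions a 2GB machine with CONFIG_VMSPLIT_2G ends up with
1792MB of lowmem and 256MB of highmem, which is the 7:1 ratio, and the same
256MB is the 12.5% that would be lost without highmem. A 1GB machine fits
entirely into lowmem with CONFIG_VMSPLIT_3G_OPT.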

I think 32-bit ARM machines with 3GB or more are getting very rare,
but some still exist:
- The Armada XP development board had a DIMM slot that could take
  large memory (possibly up to 8GB with LPAE). This never shipped as
  a commercial product, but distro build servers sometimes still run on
  this, or on the old Calxeda or Keystone server systems.
- a few early i.MX6 boards (e.g. HummingBoard) came with 4GB of
  RAM, though none of these seem to be available any more.
- High-end phones from 2013/2014 had 3GB LPDDR3 before getting
  obsoleted by 64-bit phones. Presumably none of these ever ran
  Linux-4.x or newer.
- My main laptop is a RK3288 based Chromebook with 4GB that just
  got updated to linux-4.19 by Google. Official updates apparently
  stop this summer, but it could easily run Debian later on.
- Some people run 32-bit kernels on a 64-bit Raspberry Pi 4 or on
  arm64 KVM with lots of RAM. These should probably all
  migrate to 64-bit kernels with compat user space anyway.
In theory these could also run on a VMSPLIT_4G_4G-like setup,
but I don't think anyone wants to go there. Deprecating highmem
definitely impacts any such users significantly, though staying on
an LTS kernel may be an option if there are only a few of them.

Arnd
