Re: [PATCH v2 0/6] kernfs: proposed locking and concurrency improvement
From: Rick Lindsley
Date: Wed Jun 24 2020 - 05:06:29 EST
Thanks, Tejun, appreciate the feedback.
On 6/23/20 4:13 PM, Tejun Heo wrote:
> The problem is fitting that into an interface which wholly doesn't fit that
> particular requirement. It's not that difficult to imagine different ways to
> represent however many memory slots, right?

Perhaps we have different views of how this is showing up. systemd is
the primary instigator of the boot issues.

Systemd runs

    /usr/lib/systemd/system/systemd-udev-trigger.service

which does a udev trigger, specifically

    /usr/bin/udevadm trigger --type=devices --action=add

as part of its post-initramfs coldplug. It then waits for that to
finish, under the watch of a timer.

So, the kernel itself is reporting these devices to systemd. It gets
that information from talking to the hardware. That means, then, that
the obfuscation must either start in the kernel itself (it lies to
systemd), or start in systemd when it handles the devices it got from
the kernel. If the kernel lies, then the actual granularity is not
available to any user utilities.
Unless you're suggesting a new interface be created that would allow
utilities to determine the "real" memory addresses available for
manipulation. But the changes you describe cannot be limited to the
unknown number of auxiliary utilities.
Having one subsystem lie to another seems like the start of a bad idea,
anyway. When the hardware management console, separate from Linux,
reports a memory error, or adds or deletes memory in a guest system,
it's not going to be manipulating spoofed addresses that are only a
Linux construct.
In contrast, the provided patch fixes the observed problem with no
ripple effect to other subsystems or utilities.

Greg had suggested:

> Treat the system as a whole please, don't go for a short-term
> fix that we all know is not solving the real problem here.

Your solution affects multiple subsystems; this one affects one. Which
is the whole-system approach in terms of risk? You mentioned you
support 30k SCSI disks, but only because they are slow, so the
inefficiencies of kernfs don't show. That doesn't bother you?

Rick