On Thu, Nov 21, 2024 at 11:15 PM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
On Thu, Nov 21, 2024 at 10:31:12PM +0800, Yafang Shao wrote:
On Thu, Nov 21, 2024 at 4:51 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
On 20.11.24 13:28, Yafang Shao wrote:
When CONFIG_STRICT_DEVMEM is enabled, writing to /dev/mem to override
kernel data for debugging purposes is prohibited. This configuration is
always enabled on our production servers. However, there are times when we
need to use the crash utility to modify kernel data to analyze complex
issues.
As suggested by Ingo, we can add a boot time knob of soft-enabling it.
Therefore, a new parameter "strict_devmem=" is added. The reuslt are as
follows,
- Before this change
crash> wr panic_on_oops 0
wr: cannot write to /proc/kcore <<<< failed
- After this change
- default
crash> wr panic_on_oops 0
wr: cannot write to /proc/kcore <<<< failed
- strict_devmem=off
crash> p panic_on_oops
panic_on_oops = $1 = 1
crash> wr panic_on_oops 0
crash> p panic_on_oops
panic_on_oops = $2 = 0 <<<< succeeded
- strict_devmem=invalid
[ 0.230052] Invalid option string for strict_devmem: 'invalid'
crash> wr panic_on_oops 0
wr: cannot write to /proc/kcore <<<< failed
Suggested-by: Ingo Molnar <mingo@xxxxxxxxxx>
Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
---
.../admin-guide/kernel-parameters.txt | 16 ++++++++++++++
drivers/char/mem.c | 21 +++++++++++++++++++
2 files changed, 37 insertions(+)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 1518343bbe22..7fe0f66d0dfb 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -6563,6 +6563,22 @@
them frequently to increase the rate of SLB faults
on kernel addresses.
+ strict_devmem=
+ [KNL] Under CONFIG_STRICT_DEVMEM, whether strict devmem
+ is enabled for this boot. Strict devmem checking is used
+ to protect the userspace (root) access to all of memory,
+ including kernel and userspace memory. Accidental access
+ to this is obviously disastrous, but specific access can
+ be used by people debugging the kernel. Note that with
+ PAT support enabled, even in this case there are
+ restrictions on /dev/mem use due to the cache aliasing
+ requirements.
+ on If IO_STRICT_DEVMEM=n, the /dev/mem file only allows
+ userspace access to PCI space and the BIOS code and data
+ regions. This is sufficient for dosemu and X and all
+ common users of /dev/mem. (default)
+ off Disable strict devmem checks.
+
sunrpc.min_resvport=
sunrpc.max_resvport=
[NFS,SUNRPC]
This will allow to violate EXCLUSIVE_SYSTEM_RAM, and I am afraid I don't
enjoy seeing devmem handling+config getting more complicated.
That poses a challenge. Perhaps we should also consider disabling
functions that rely on EXCLUSIVE_SYSTEM_RAM when strict_devmem=off,
but implementing such a change seems overly complex.
Our primary goal is to temporarily bypass STRICT_DEVMEM for live
kernel debugging. In an earlier version, I proposed making the
fucntion devmem_is_allowed() error-injectable, but Ingo pointed out
that it violates the principles of STRICT_DEVMEM.
I think that "primary goal" is the problem here. We don't want to do
that, at all, for all the reasons why we implemented STRICT_DEVMEM and
for why people enable it.
Either you enable it because you want the protection and "security" it
provides, or you do not. Don't try to work around it please.
Do you have any suggestions on enabling write access to /dev/mem in
debugging tools like the crash utility, while maintaining
compatibility with the existing rules?
I think you just don't provide write access to /dev/mem for debugging
tools as it's a huge security hole that people realized and have plugged
up. If you want to provide access to this for "debugging" then just
don't enable that option and live with the risk involved, I don't see
how you can have it both ways.
I don’t quite see how STRICT_DEVMEM could pose a significant security
concern. If you’re root, you already have the ability to do whatever
you want on the system if you’re determined to. This option primarily
serves to prevent reckless or accidental writes to kernel memory.
As I understand it, STRICT_DEVMEM is more about enabling functionality
for features like page table checking and virtio_mem than about
enforcing security.