On Wed, 17 Feb 2021 at 10:42, Will Deacon <will@xxxxxxxxxx> wrote:
[Please include arm64 and kvm folks for threads involving the stage-2 MMU]
On Tue, Feb 16, 2021 at 03:47:52PM +0530, Preeti Nagar wrote:
> The changes introduce a new security feature, RunTime Integrity Check
> (RTIC), designed to protect Linux Kernel at runtime. The motivation
> behind these changes is:
> 1. The system protection offered by Security Enhancements(SE) for
> Android relies on the assumption of kernel integrity. If the kernel
> itself is compromised (by a perhaps as yet unknown future vulnerability),
> SE for Android security mechanisms could potentially be disabled and
> rendered ineffective.
> 2. Qualcomm Snapdragon devices use Secure Boot, which adds cryptographic
> checks to each stage of the boot-up process, to assert the authenticity
> of all secure software images that the device executes. However, due to
> various vulnerabilities in SW modules, the integrity of the system can be
> compromised at any time after device boot-up, leading to un-authorized
> SW executing.
>
> The feature's idea is to move some sensitive kernel structures to a
> separate page and monitor further any unauthorized changes to these,
> from higher Exception Levels using stage 2 MMU. Moving these to a
> different page will help avoid getting page faults from un-related data.
> The mechanism we have been working on removes the write permissions for
> HLOS in the stage 2 page tables for the regions to be monitored, such
> that any modification attempts to these will lead to faults being
> generated and handled by handlers. If the protected assets are moved to
> a separate page, faults will be generated corresponding to change attempts
> to these assets only. If not moved to a separate page, write attempts to
> un-related data present on the monitored pages will also be generated.
>
> Using this feature, some sensitive variables of the kernel which are
> initialized after init or are updated rarely can also be protected from
> simple overwrites and attacks trying to modify these.
Although I really like the idea of using stage-2 to protect the kernel, I
think the approach you outline here is deeply flawed. Identifying "sensitive
variables" of the kernel to protect is subjective and doesn't scale.
Furthermore, the triaging of what constitues a valid access is notably
absent from your description and is assumedly implemented in an opaque blob
at EL2.
I think a better approach would be along the lines of:
1. Introduce the protection at stage-1 (like we already have for mapping
e.g. the kernel text R/O)
2. Implement the handlers in the kernel, so the heuristics are clear.
3. Extend this to involve KVM, so that the host can manage its own
stage-2 to firm-up the stage-1 protections.
Agree here. Making an arbitrary set of data structures r/o behind the
OS's back doesn't seem like an easy thing to maintain or reason about,
especially if this r/o-ness is only enforced on a tiny subset of
devices. If something needs to be writable only at boot, we have
__ro_after_init, and having hypervisor assisted enforcement of /that/
might be a worthwhile thing to consider, including perhaps ways to do
controlled patching of this region at runtime.
I also think we should avoid tying this to specific data structures.
Rather, we should introduce a mechanism to make arbitrary data read-only.
I've CC'd Ard and Marc, as I think they've both been thinking about this
sort of thing recently as well.
Will