Re: RFC v2: post-init-read-only protection for data allocated dynamically

From: Michal Hocko
Date: Thu May 04 2017 - 10:01:46 EST

On Thu 04-05-17 16:37:55, Igor Stoppa wrote:
> On 04/05/17 16:11, Michal Hocko wrote:
> > On Thu 04-05-17 15:14:10, Igor Stoppa wrote:
> > I believe that this is a fundamental question. Sealing sounds useful
> > for after-boot usecases as well and it would change the approach
> > considerably. Coming up with an ad-hoc solution for the boot only way
> > seems like a wrong way to me. And as you've said SELinux which is your
> > target already does the thing after the early boot.
> I didn't spend too many thoughts on this so far, because the zone-based
> approach seemed almost doomed, so I wanted to wait for the evolution of
> the discussion :-)
> The main question here is granularity, I think.
> At least, as first cut, the simpler approach would be to have a master
> toggle: when some legitimate operation needs to happen, the seal is
> lifted across the entire range, then it is put back in place, once the
> operation has concluded.
> Simplicity is the main advantage.
> The disadvantage is that anything can happen, undetected, while the seal
> is lifted.

Yes, and I think this makes it basically pointless.

> OTOH the amount of code that could backfire should be fairly limited, so
> it doesn't seem a huge issue to me.
> The alternative would be to somehow know what a write will change and
> make only the appropriate page(s) writable. But it seems overkill to me.
> Especially because in some cases, with huge pages, everything would fit
> anyway in one page.
> One more option that comes to mind - but I do not know how realistic it
> would be - is to have multiple slabs, to be used for different purposes.
> Ex: one for the monolithic kernel and one for modules.
> It wouldn't help for livepatch, though, as it can modify both, so both
> would have to be unprotected.

Just to make my proposal clearer, I suggest the following workflow:

cache = kmem_cache_create(foo, object_size, ..., SLAB_SEAL);

obj = kmem_cache_alloc(cache, gfp_mask);
[more allocations]

kmem_cache_seal(cache);

Sealing the cache would write-protect all slab pages belonging to it.
All new allocations from this cache would then go to new slab pages,
and a later kmem_cache_seal() call would write-protect only those new
pages.
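
To illustrate the workflow above, here is a minimal userspace sketch. It
is an analogy only, not kernel code: mmap() and mprotect() stand in for
slab pages and kernel page protection, and every name in it (struct
seal_cache, seal_cache_init, seal_cache_alloc, seal_cache_seal,
MAX_PAGES) is hypothetical. Neither SLAB_SEAL nor kmem_cache_seal exists
in the kernel; they are only being proposed here.

```c
/* Userspace analogy of the proposed seal workflow. All names are
 * hypothetical; mmap()/mprotect() stand in for slab pages. */
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAX_PAGES 16

struct seal_cache {
	size_t obj_size;          /* object size, rounded up to 16 bytes */
	size_t page_size;
	void  *pages[MAX_PAGES];  /* backing pages, in allocation order */
	int    nr_pages;
	int    nr_sealed;         /* pages[0..nr_sealed) are read-only */
	size_t used;              /* bytes handed out from the newest page */
};

static int seal_cache_init(struct seal_cache *c, size_t obj_size)
{
	long ps = sysconf(_SC_PAGESIZE);

	if (ps <= 0 || obj_size == 0 || obj_size > (size_t)ps - 15)
		return -1;
	c->obj_size = (obj_size + 15) & ~(size_t)15;
	c->page_size = (size_t)ps;
	c->nr_pages = 0;
	c->nr_sealed = 0;
	c->used = 0;
	return 0;
}

static void *seal_cache_alloc(struct seal_cache *c)
{
	void *obj;

	/* A fresh page is needed if there is none yet, if the newest
	 * page is already sealed, or if the newest page is full. */
	if (c->nr_pages == 0 || c->nr_sealed == c->nr_pages ||
	    c->used + c->obj_size > c->page_size) {
		void *p;

		if (c->nr_pages == MAX_PAGES)
			return NULL;
		p = mmap(NULL, c->page_size, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (p == MAP_FAILED)
			return NULL;
		c->pages[c->nr_pages++] = p;
		c->used = 0;
	}
	obj = (char *)c->pages[c->nr_pages - 1] + c->used;
	c->used += c->obj_size;
	return obj;
}

/* Write-protect only the pages added since the previous seal. */
static int seal_cache_seal(struct seal_cache *c)
{
	int i;

	for (i = c->nr_sealed; i < c->nr_pages; i++)
		if (mprotect(c->pages[i], c->page_size, PROT_READ) != 0)
			return -1;
	c->nr_sealed = c->nr_pages;
	return 0;
}
```

A caller would initialize the cache, allocate and fill its objects, and
then call seal_cache_seal(). Objects allocated afterwards land on a new
writable page and stay writable until the next seal, while any write to
an already sealed object faults.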

The main discomfort with this approach is that you have to create those
caches in advance, obviously. We could help by creating some general
purpose caches for common sizes, but this sounds like overkill to me.
The caller will know which objects need the protection, so the
appropriate cache can be created on demand. But this really depends on
potential users...

Michal Hocko