Re: disable-cap-mlock

From: Andrea Arcangeli
Date: Thu Apr 01 2004 - 13:53:03 EST


On Thu, Apr 01, 2004 at 10:34:25AM -0800, Andrew Morton wrote:
> William Lee Irwin III <wli@xxxxxxxxxxxxxx> wrote:
> >
> > On Thu, Apr 01, 2004 at 08:48:25AM -0800, William Lee Irwin III wrote:
> > >> Something like this would have the minor advantage of zero core impact.
> > >> Testbooted only. vs. 2.6.5-rc3-mm4
> >
> > On Thu, Apr 01, 2004 at 06:59:52PM +0200, Andrea Arcangeli wrote:
> > > I certainly like this too (despite it's more complicated but it might
> > > avoid us to have to add further sysctl in the future), Andrew what do
> > > you prefer to merge? I don't mind either ways.
>
> What is the Oracle requirement in detail?

being able to shmget(SHM_HUGETLB) as normal user. However I cannot
disable the CAP_IPC_LOCK from hugetlbfs/inode.c since that would break
local security w.r.t. mlock.

If I've to disable that single check in hugetlbfs/inode.c then I prefer
to disable all CAP_IPC_LOCK so they can as well use mlock.

> If it's for access to hugetlbfs then there are the uid= and gid= mount
> options.

sure we know, that's not the problem, disabling mlock checks is easy
with hugetlbfs marking the mountpoint 777.

> If it's for access to SHM_HUGETLB then there was some discussion about
> extending the uid= thing to shm, but nothing happened. This could be
> resurrected.

never heard of those discussions sorry, though I'm not completely sure
that one single uid would work, a 777 ownership would certain work
instead, but how to give 777 ownership to not an fs. I mean, the sysctl
is an order of magnitude simpler and it covers the mlock and
shmclt(SHM_LOCK/UNLOCK) usages as well that cannot be covered by rlimit
either.

>
> If it's just generally for the ability to mlock lots of memory then
> RLIMIT_MEMLOCK would be preferable. I don't see why we'd need the sysctl
> when `ulimit -m' is available? (Where is that patch btw?)

the real reason is shmget(SHM_HUGETLB) but if I disable that
specific CAP_IPC_LOCK I can disable them all, and then nobody will have
to use rlimit anymore for this.

Also consider the same user can shmget multiple segments and that can
exceed the 4G limit of rlimit, since that's not address space but
physical memory (either ram or swap).

> I guess we could live with sysctl which simply nukes CAP_IPC_LOCK, but it
> has to be the when-all-else-failed option, yes?

Probably the main advantage is that it doesn't giveup all righs to the
user with fsuid == 0, it only gives it mlock. Though I can imagine some
user may not like to giveup mlock to fsuid == 0 either, so even that may
need to be a config option, but OTOH overcommit and other vm bits (again
stuff that can't be used to do real bad stuff) are already available to
fsuid == 0. So I think the sysctl-cap-mlock is reasonable feature at
least until there's a more finegriend way to give the
shmget(SHM_HUGETLB)/shmctl(SHM_LOCK/UNLOCK) rights.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/