Re: RFC: COW for hugepages

From: David Gibson
Date: Wed Apr 07 2004 - 22:27:30 EST


On Wed, Apr 07, 2004 at 08:21:11AM -0700, Dave Hansen wrote:
> On Wed, 2004-04-07 at 00:42, David Gibson wrote:
> > + /* XXX are there races with checking cpu_vm_mask? - Anton */
> > + tmp = cpumask_of_cpu(smp_processor_id());
> > + if (cpus_equal(vma->vm_mm->cpu_vm_mask, tmp))
> > + local = 1;
> > +
> > + cow = (vma->vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
>
> You're under the pagetable lock for that mm, so you at least don't have
> to worry about preemption. But, that definitely at least deserves a
> comment on why you didn't disable preemption.
>
> Also, for racing with other users of cpu_vm_mask, there aren't any other
> users that clear other cpus' bits other than initialization.

Yes. That comment was actually copied from code in the analagous
normal page path (from where it has since been removed). I've altered
it appropriately.

> One thing that's I didn't see in this patch is any check of capabilities
> or permissions. What if a privileged user sets up a single page huge
> page, then does a setuid() to a lower privilege level? Is that not a
> valid way to get hugetlb pages to an unprivileged user? That other user
> is free to go fork and write to the pages to their heart's content,
> consuming as many huge pages as are present in the system.

That seems to me like a "don't do that" situation (you can avoid the
security problem by only giving MAP_SHARED pages to the unprivileged
process). I don't see a better way to solve this without implementing
some sort of rlimit system for hugepages, which seems like overkill:
As far as I can tell all the real applications (huge databases, HPC,
benchmarks) for hugepages tend to be the sorts of cases where the
hugepage-using program is the only thing you really care about on the
system .

> It looks to me like dup_mmap() drops VM_LOCKED on mlock()'d VMAs at fork
> time. Do we need to do the same for hugetlb pages?

Given that all huge pages are unswappable and therefore effectively
mlock()ed, I don't think so.

--
David Gibson | For every complex problem there is a
david AT gibson.dropbear.id.au | solution which is simple, neat and
| wrong.
http://www.ozlabs.org/people/dgibson
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/