Re: Could not mount sysfs when enable userns but disable netns

From: Eric W. Biederman
Date: Mon Jul 14 2014 - 13:27:19 EST


"chenhanxiao@xxxxxxxxxxxxxx" <chenhanxiao@xxxxxxxxxxxxxx> writes:

>> -----Original Message-----
>> From: Eric W. Biederman [mailto:ebiederm@xxxxxxxxxxxx]
>> Sent: Saturday, July 12, 2014 12:29 AM
>> To: Serge E. Hallyn
>> Cc: Chen, Hanxiao; Serge Hallyn (serge.hallyn@xxxxxxxxxx); Greg
>> Kroah-Hartman; containers@xxxxxxxxxxxxxxxxxxxxxxxxxx;
>> linux-kernel@xxxxxxxxxxxxxxx
>> Subject: Re: Could not mount sysfs when enable userns but disable netns
>>
>> "Serge E. Hallyn" <serge@xxxxxxxxxx> writes:
>>
>> > Quoting chenhanxiao@xxxxxxxxxxxxxx (chenhanxiao@xxxxxxxxxxxxxx):
>> >> Hello,
>> >>
>> >> How to reproduce:
>> >> 1. Prepare a container, enable userns and disable netns
>> >> 2. use libvirt-lxc to start a container
>> >> 3. libvirt could not mount sysfs then failed to start.
>> >>
>> >> Then I found that
>> >> commit 7dc5dbc879bd0779924b5132a48b731a0bc04a1e says:
>> >> "Don't allow mounting sysfs unless the caller has CAP_SYS_ADMIN rights
>> >> over the net namespace."
>> >>
>> >> But why should we check sysfs mount permission over net namespace?
>> >> We've already checked CAP_SYS_ADMIN though.
>>
>> We already checked capable(CAP_SYS_ADMIN) and it failed.
>
> But on my machine, capable(CAP_SYS_ADMIN) passed
> but failed in kobj_ns_current_may_mount.

No. capable(CAP_SYS_ADMIN) did not pass.
fs_fully_visible did pass.

There is a significant distinction. If capable(CAP_SYS_ADMIN) had
passed, kobj_ns_current_may_mount (which is a fancy way of saying
ns_capable(net->user_ns, CAP_SYS_ADMIN)) would also have passed.
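
To make the distinction concrete, here is a minimal userspace sketch of
the two checks quoted from sysfs_mount. It is a model, not kernel code:
struct mount_ctx and sysfs_mount_check are hypothetical names, and the
three booleans stand in for the results of the real kernel helpers.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the caller's state; not a kernel type. */
struct mount_ctx {
    bool capable_sys_admin;     /* result of capable(CAP_SYS_ADMIN) */
    bool fs_fully_visible;      /* result of fs_fully_visible(fs_type) */
    bool ns_capable_over_netns; /* result of ns_capable(net->user_ns, CAP_SYS_ADMIN) */
};

/*
 * Mirrors the order of the two checks in the quoted sysfs_mount hunk.
 * Returns 0 if the mount would be allowed, -EPERM otherwise.
 * The model does not enforce that capable_sys_admin implies
 * ns_capable_over_netns; the caller must set both consistently.
 */
int sysfs_mount_check(const struct mount_ctx *c)
{
    /* First check: fails only if BOTH conditions are false. */
    if (!c->capable_sys_admin && !c->fs_fully_visible)
        return -EPERM;

    /* Second check: CAP_SYS_ADMIN over the owner of the net namespace. */
    if (!c->ns_capable_over_netns)
        return -EPERM;

    return 0;
}
```

With capable_sys_admin false, fs_fully_visible true and
ns_capable_over_netns false, the first check is passed by
fs_fully_visible alone and the second check returns -EPERM, which
matches the printk output in the report.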

> I added some printks in sysfs_mount:
>     if (!(flags & MS_KERNMOUNT)) {
> -       if (!capable(CAP_SYS_ADMIN) && !fs_fully_visible(fs_type))
> +       if (!capable(CAP_SYS_ADMIN) && !fs_fully_visible(fs_type)) {
> +           printk(KERN_WARNING "Failed in capable\n");
>             return ERR_PTR(-EPERM);
> +       }
>
> -       if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET))
> +       if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET)) {
> +           printk(KERN_WARNING "Failed in kobj_ns_current_may_mount\n");
>             return ERR_PTR(-EPERM);
> +       }
>
> And found:
> Jul 14 09:55:26 localhost systemd: Starting Container lxc-chx.
> Jul 14 09:55:26 localhost systemd-machined: New machine lxc-chx.
> Jul 14 09:55:26 localhost systemd: Started Container lxc-chx.
> Jul 14 09:55:26 localhost kernel: [ 784.044709] Failed in kobj_ns_current_may_mount
> Jul 14 09:55:26 localhost systemd-machined: Machine lxc-chx terminated.
>
>>
>> >> What is the relationship between sysfs and the net namespace,
>> >> or is this check a little redundant?
>>
>> You want a bind mount not a new fresh mount.
>>
>
> Yes, we need to modify libvirt's code to deal with sysfs
> when userns is enabled but netns is disabled.

Please go for it. I don't have any insight into libvirt so I can't help
you there.
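
For the generic part, a bind mount reuses an already-mounted sysfs
instead of asking sysfs_mount for a fresh one, so the checks above are
not run. A rough sketch with mount(2) follows. The helper name
bind_mount_sysfs is hypothetical, and using MS_REC is an assumption
about what a container setup would want. The caller still needs
CAP_SYS_ADMIN in the user namespace that owns its mount namespace.

```c
#include <stdio.h>
#include <sys/mount.h>

/*
 * Hypothetical helper: bind an existing sysfs mount at src (for example
 * the one inherited from the host) onto dst.
 * Returns 0 on success, -1 on failure with errno set by mount(2).
 */
int bind_mount_sysfs(const char *src, const char *dst)
{
    /* For a bind mount the filesystem type and data arguments are ignored. */
    if (mount(src, dst, NULL, MS_BIND | MS_REC, NULL) < 0) {
        perror("bind mount");
        return -1;
    }
    return 0;
}
```

A caller would invoke it as bind_mount_sysfs("/sys", target), where
target is the sysfs mount point inside the container root, and treat a
-1 return as a failed setup.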

Eric