Re: pivot_root(".", ".") and the fchdir() dance

From: Eric W. Biederman
Date: Tue Oct 08 2019 - 18:17:55 EST


"Michael Kerrisk (man-pages)" <mtk.manpages@xxxxxxxxx> writes:

> On 10/8/19 9:40 PM, Eric W. Biederman wrote:
>> "Michael Kerrisk (man-pages)" <mtk.manpages@xxxxxxxxx> writes:
>>
>>> Hello Eric,
>>>
>>>>>> Creation of a mount namespace in a user namespace automatically does
>>>>>> 'mount("", "/", NULL, MS_SLAVE | MS_REC, NULL);' if the starting mount
>>>>>> namespace was not created in that user namespace. AKA creating
>>>>>> a mount namespace in a user namespace does the unshare for you.
>>>>>
>>>>> Oh -- I had forgotten that detail. But it is documented
>>>>> (by you, I think) in mount_namespaces(7):
>>>>>
>>>>> * A mount namespace has an owner user namespace. A
>>>>> mount namespace whose owner user namespace is different
>>>>> from the owner user namespace of its parent mount
>>>>> namespace is considered a less privileged mount
>>>>> namespace.
>>>>>
>>>>> * When creating a less privileged mount namespace,
>>>>> shared mounts are reduced to slave mounts. (Shared
>>>>> and slave mounts are discussed below.) This ensures
>>>>> that mappings performed in less privileged mount
>>>>> namespaces will not propagate to more privileged mount
>>>>> namespaces.
>>>>>
>>>>> There's one point in that description that troubles me. There is a
>>>>> reference to "parent mount namespace", but as I understand things
>>>>> there is no parental relationship among mount namespace instances
>>>>> (or am I wrong?). Should that wording not be rather something
>>>>> like "the mount namespace of the process that created this mount
>>>>> namespace"?
>>>>
>>>> How about "the mount namespace this mount namespace started as a copy of"?
>>>>
>>>> You are absolutely correct there is no relationship between mount
>>>> namespaces. There is just the propagation tree between mounts. (Which
>>>> acts similarly to a parent/child relationship but is not at all the same
>>>> thing).
>>>
>>> Thanks. I made the text as follows:
>>>
>>> * Each mount namespace has an owner user namespace. As noted
>>> above, when a new mount namespace is created, it inherits a
>>> copy of the mount points from the mount namespace of the
>>> process that created the new mount namespace. If the two mount
>>> namespaces are owned by different user namespaces, then the new
>>> mount namespace is considered less privileged.
>>
>> I hate to nitpick,
>
> I love it when you nitpick. Thanks for your attention to the details
> of my wording.
>
>> but I am going to say that when I read the text above
>> the phrase "mount namespace of the process that created the new mount
>> namespace" feels wrong.
>>
>> Either you use unshare(2) and the mount namespace of the process that
>> created the mount namespace changes.
>>
>> Or you use clone(2) and you could argue it is the new child that created
>> the mount namespace.
>>
>> Having a different mount namespace at the end of the creation operation
>> feels like it makes your phrase confusing about what the starting
>> mount namespace is. I hate to use references that are ambiguous when
>> things are changing.
>>
>> I agree that the term parent is also wrong.
>
> I see what you mean. My wording is imprecise.
>
> So, I tweaked text earlier in the page so that it now reads
> as follows:
>
> A new mount namespace is created using either clone(2) or
> unshare(2) with the CLONE_NEWNS flag. When a new mount namespace
> is created, its mount point list is initialized as follows:
>
> * If the namespace is created using clone(2), the mount point
> list of the child's namespace is a copy of the mount point list
> in the parent's namespace.
>
> * If the namespace is created using unshare(2), the mount point
> list of the new namespace is a copy of the mount point list in
> the caller's previous mount namespace.
>
> And then I tweaked the text that we are currently discussing to read:
>
> * Each mount namespace has an owner user namespace. As explained
> above, when a new mount namespace is created, its mount point
> list is initialized as a copy of the mount point list of
> another mount namespace. If the new namespace and the namespace
> from which the mount point list was copied are owned by
> different user namespaces, then the new mount namespace is
> considered less privileged.
>
> How does this look to you now?

Much better, thank you.

Eric
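
The shared-to-slave reduction quoted above from mount_namespaces(7) can be
observed from user space by looking at the optional fields of
/proc/self/mountinfo, where a "shared:N" tag marks a shared mount and a
"master:N" tag marks a slave mount. What follows is only a minimal sketch
and not part of the original thread: the helper names propagation() and
report() are hypothetical, and the parsing assumes the field layout
described in proc(5).

```python
import os


def propagation(mountinfo_line):
    """Classify one mountinfo line by its propagation type.

    Hypothetical helper: assumes the proc(5) layout, where optional
    "tag[:value]" fields start at index 6 and end at a lone "-".
    """
    fields = mountinfo_line.split()
    # Start the search at index 6 so the mount options are never scanned.
    separator = fields.index("-", 6)
    tags = {field.split(":", 1)[0] for field in fields[6:separator]}
    kinds = []
    if "shared" in tags:
        kinds.append("shared")
    if "master" in tags:
        kinds.append("slave")
    if "unbindable" in tags:
        kinds.append("unbindable")
    # No propagation tag at all means the mount is private.
    return "+".join(kinds) or "private"


def report(path="/proc/self/mountinfo"):
    """Print the propagation type and mount point of every mount."""
    with open(path) as mountinfo:
        for line in mountinfo:
            # Field 4 is the mount point (spaces are escaped as \040).
            print(propagation(line), line.split()[4])


if __name__ == "__main__" and os.path.exists("/proc/self/mountinfo"):
    report()
```

Running the script once in the initial namespace and once inside a shell
started with "unshare -Urm --propagation unchanged" should let the two
listings be compared. The "--propagation unchanged" option matters here
because unshare(1) otherwise marks the new mount tree private itself, which
would hide what the kernel did.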