Re: [PATCH v17 07/10] mm: introduce memfd_secret system call to create "secret" memory areas
From: Michal Hocko
Date: Mon Feb 15 2021 - 04:15:08 EST
On Sun 14-02-21 11:21:02, James Bottomley wrote:
> On Sun, 2021-02-14 at 10:58 +0100, David Hildenbrand wrote:
> [...]
> > > And here we come to the question "what are the differences that
> > > justify a new system call?" and the answer to this is very
> > > subjective. And as such we can continue bikeshedding forever.
> >
> > I think this fits into the existing memfd_create() syscall just fine,
> > > and I heard no compelling argument why it shouldn't. That's all I can
> > say.
>
> OK, so let's review history. In the first two incarnations of the
> patch, it was an extension of memfd_create(). The specific objection
> by Kirill Shutemov was that it doesn't share any code in common with
> memfd and so should be a separate system call:
>
> https://lore.kernel.org/linux-api/20200713105812.dnwtdhsuyj3xbh4f@box/
Thanks for the pointer. But this argument hasn't been challenged at all.
It hasn't been brought up that the overlap would be considerably higher
with hugetlb/sealing support. And so far nobody has claimed those
combinations are unviable.
> The other objection raised offlist is that if we do use memfd_create,
> then we have to add all the secret memory flags as an additional ioctl,
> whereas they can be specified on open if we do a separate system call.
> The container people violently objected to the ioctl because it can't
> be properly analysed by seccomp and much preferred the syscall version.
>
> Since we're dumping the uncached variant, the ioctl problem disappears
> but so does the possibility of ever adding it back if we take on the
> container peoples' objection. This argues for a separate syscall
> because we can add additional features and extend the API with flags
> without causing anti-ioctl riots.
I am sorry but I do not understand this argument. What kind of flags are
we talking about and why would that be a problem with the memfd_create
interface? Could you be more specific please?
--
Michal Hocko
SUSE Labs