On Thu 11-02-21 09:13:19, Mike Rapoport wrote:
On Tue, Feb 09, 2021 at 02:17:11PM +0100, Michal Hocko wrote:
[...]
On Tue 09-02-21 11:09:38, Mike Rapoport wrote:
Citing my older email:
I've hesitated whether to continue to use new flags to memfd_create() or to
add a new system call and I've decided to use a new system call after I've
started to look into man pages update. There would have been two completely
independent descriptions and I think it would have been very confusing.
Could you elaborate? Unmapping from the kernel address space can work
for both sealed and hugetlb memfds, no? Those features are completely
orthogonal AFAICS. With a dedicated syscall you will need to introduce
this functionality on top if that is required. Have you considered that?
I mean hugetlb pages are used to back guest memory very often. Is this
something that will be a secret memory usecase?
Please be really specific when giving arguments to back a new syscall
decision.
Isn't "syscalls have completely independent description" specific enough?
No, it's not, as you can see from the questions I've had above. More on that
below.
We are talking about API here, not the implementation details whether
secretmem supports large pages or not.
The purpose of memfd_create() is to create a file-like access to memory.
The purpose of memfd_secret() is to create a way to access memory hidden
from the kernel.
I don't think overloading memfd_create() with the secretmem flags because
they happen to return a file descriptor will be better for users, but
rather will be more confusing.
This is quite a subjective conclusion. I could very well argue that it
would be much better to have a single syscall to get a fd backed memory
with specific requirements (sealing, unmapping from the kernel address
space). Neither of us would be clearly right or wrong. A more important
point is future extensibility and usability, though. So let's just
think of the few usecases I have outlined above. Is it unrealistic to expect
that secret memory should be sealable? What about hugetlb? Because if
the answer is no then a new API is a clear win as the combination of
flags would never work and then we would just suffer from the syscall
multiplexing without much gain. On the other hand, if a combination of the
functionality is to be expected then you will have to jam it into
memfd_create and copy the interface, likely causing more confusion. See
what I mean?
I by no means insist one way or the other but from what I have
seen so far I have a feeling that the interface hasn't been thought
through enough. Sure, you have landed on an fd based approach and that
seems fair. But how to get that fd seems to still have some gaps IMHO.