Re: [PATCH v8 0/4] Introduce mseal

From: Linus Torvalds
Date: Fri Feb 02 2024 - 15:37:20 EST


On Fri, 2 Feb 2024 at 11:32, Theo de Raadt <deraadt@xxxxxxxxxxx> wrote:
>
> Unix system calls must be atomic.
>
> They either return an error, and that is a promise they made no changes.

That's actually not true, and never has been.

It's a good thing to aim for, but several errors mean "some or all
may have been done".

EFAULT (for various system calls), ENOMEM and other errors are all
things that can happen after some of the system call has already been
done, and the rest failed.
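
A minimal userspace sketch of that "error after partial work" case,
assuming Linux behavior for mprotect() over a range with a hole in it.
The names mprotect_across_hole(), write_would_fault() and struct
partial_result are made up for illustration. The code probes the first
page afterwards rather than assuming what the kernel did to it:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical probe: 1 if storing a byte at p would fault, 0 if p is
 * writable, -1 if the probe could not be set up. The kernel is asked to
 * copy one byte from a pipe into p, so a non-writable page shows up as
 * EFAULT from read() instead of a SIGSEGV in this process. */
static int write_would_fault(void *p)
{
    int fds[2];
    char c = 0;
    int ret = -1;

    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], &c, 1) == 1) {
        ssize_t n = read(fds[0], p, 1);
        if (n == 1)
            ret = 0;
        else if (n == -1 && errno == EFAULT)
            ret = 1;
    }
    close(fds[0]);
    close(fds[1]);
    return ret;
}

struct partial_result {
    int rc;        /* return value of mprotect() */
    int err;       /* errno right after it */
    int first_ro;  /* 1 if the first page lost write access anyway */
};

static struct partial_result mprotect_across_hole(void)
{
    struct partial_result r = { -1, 0, -1 };
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    char *base = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        r.err = errno;
        return r;
    }

    /* Unmap the second page so the range below contains a hole. */
    munmap(base + page, page);

    /* One call over both pages: the first is mapped, the second is not. */
    r.rc = mprotect(base, 2 * page, PROT_READ);
    r.err = errno;

    /* Did the "failed" call change the first page anyway? */
    r.first_ro = write_would_fault(base);

    munmap(base, page);
    return r;
}
```

On Linux I would expect rc == -1 with err == ENOMEM, and first_ro == 1
because the first page was already handled before the hole was
noticed. That last part is implementation behavior, not something any
standard promises either way.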

There are lots of examples, but to pick one obvious VM example,
something like mlock() may well return an error after the area has
been successfully locked, but then the population of said pages failed
for some reason.
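
If a caller cares about ending up in a known state after such a
failure, one option is to unwind explicitly. A rough sketch, with
mlock_or_unwind() being a hypothetical wrapper and not an existing API:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical wrapper: either the whole range is locked (returns 0),
 * or we make a best-effort attempt to leave none of it locked and
 * return -1 with errno set by the failed mlock(). */
static int mlock_or_unwind(const void *addr, size_t len)
{
    if (mlock(addr, len) == 0)
        return 0;

    int saved = errno;          /* keep the original failure reason */
    (void)munlock(addr, len);   /* drop whatever part did get locked */
    errno = saved;
    return -1;
}
```

Note the assumption baked in here: munlock() also drops locks that an
earlier, successful mlock() placed on the same pages, so this only
makes sense when the caller owns the whole range.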

Of course, implementations can differ, and POSIX sometimes has insane
language that is actively incorrect.

Furthermore, the definition of "atomic" is unclear. For example, POSIX
claims that a "write()" system call is one atomic thing for regular
files, and some people think that means that you see all or nothing.
That's simply not true, and you'll see the write progress in various
indirect ways (look at intermediate file size with 'stat', look at
intermediate contents with 'mmap' etc etc).
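
The 'stat' observation can be tried with something like the sketch
below. It assumes a writable /tmp and a filesystem that updates the
file size as the write proceeds. watch_one_write() is a made-up name,
and the count it returns depends entirely on timing, so zero is a
perfectly legitimate result:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns how many times the parent saw a file size strictly between
 * 0 and len while the child was inside a single write() call, or -1
 * if the experiment could not be set up. */
static long watch_one_write(size_t len)
{
    char path[] = "/tmp/write-watch-XXXXXX";  /* hypothetical location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                 /* file goes away when fd is closed */

    char *buf = calloc(1, len);
    if (!buf) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        free(buf);
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: exactly one write() call, no retry loop. */
        ssize_t n = write(fd, buf, len);
        _exit(n == (ssize_t)len ? 0 : 1);
    }

    /* Parent: poll the size until the child has exited. */
    long seen = 0;
    int status;
    struct stat st;
    for (;;) {
        if (fstat(fd, &st) == 0 && st.st_size > 0 &&
            (size_t)st.st_size < len)
            seen++;
        pid_t w = waitpid(pid, &status, WNOHANG);
        if (w == pid)
            break;
        if (w < 0) {
            seen = -1;
            break;
        }
    }

    free(buf);
    close(fd);
    return seen;
}
```

Calling watch_one_write() with a few tens of megabytes gives the poll
loop a reasonable chance to catch an intermediate size. Any nonzero
result means an observer saw the write partway through.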

So I agree that atomicity is something that people should always
*strive* for, but it's not some kind of final truth or absolute
requirement.

In the specific case of mseal(), I suspect there are very few reasons
ever *not* to be atomic, so in this particular context atomicity is
likely always something that should be guaranteed. But I just wanted
to point out that it's most definitely not a black-and-white issue in
the general case.

Linus