Re: [PATCH 00/45] C++: Convert the kernel to C++

From: Arsen Arsenović
Date: Fri Jan 12 2024 - 17:53:32 EST

Next message: Barry Song: "Re: [RFC PATCH v1] mm/filemap: Allow arch to request folio size for exec memory"
Previous message: John Stultz: "Re: [PATCH v4 3/7] dma-buf: heaps: restricted_heap: Add private heap ops"
In reply to: David Howells: "Re: [PATCH 00/45] C++: Convert the kernel to C++"
Next in thread: David Howells: "Re: [PATCH 00/45] C++: Convert the kernel to C++"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

David Howells <dhowells@xxxxxxxxxx> writes:

> Arsen Arsenović <arsen@xxxxxxxxx> wrote:
>
>> > (2) Constructors and destructors. Nests of implicit code makes the code less
>> > obvious, and the replacement of static initialisation with constructor
>> > calls would make the code size larger.
>>
>> This also disallows the primary benefit of C++ (RAII), though. A lot of
>> static initialization can be achieved using constexpr and consteval,
>> too.
>
> Okay, let me downgrade that to "I wouldn't allow it at first". The primary
> need for destructors, I think, is exception handling.

I'm not sure I agree; the number of 'goto err' constructs in the kernel
seems to indicate otherwise to me. Those do the exact same job as
destructors, except by hand, which is more error prone.
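
To illustrate the comparison, here is a minimal sketch of cleanup driven
by destructors instead of 'goto err' labels. Everything in it (Resource,
setup, the log) is made up for illustration and is not kernel code; it
only assumes that a resource can be released from a destructor.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource that records acquire/release events in a log.
struct Resource {
    std::vector<std::string>& log;
    std::string name;

    Resource(std::vector<std::string>& l, std::string n)
        : log(l), name(std::move(n)) {
        log.push_back("acquire " + name);
    }
    ~Resource() { log.push_back("release " + name); }

    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;
};

// Returns 0 on success and -1 on failure. Every exit path releases
// whatever was acquired so far, in reverse order, without any labels.
int setup(std::vector<std::string>& log, bool fail_midway) {
    Resource a(log, "a");
    if (fail_midway)
        return -1;  // 'a' is released automatically here
    Resource b(log, "b");
    return 0;       // 'b' is released first, then 'a'
}
```

The early return plays the role of 'goto err_a', but the unwinding order
is derived from the declarations rather than maintained by hand.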

> And don't get me wrong, I like the idea of exception handling - so
> many bugs come because we mischeck or forget to check the error.

C++ also provides possible alternative avenues for solving such
problems, such as, for instance, an expected type with monadic
operations: https://en.cppreference.com/w/cpp/utility/expected

IIRC, using std::expected in managarm (where we previously used the IMO
far less nice Frigg expected type) is what initially prompted me to
start enabling the use of a lot of libstdc++ in kernel contexts, and
indeed, it is enabled there:
https://gcc.gnu.org/cgit/gcc/tree/libstdc++-v3/include/Makefile.am#n25
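
As a rough sketch of what monadic error propagation looks like, here is
a deliberately tiny stand-in for an expected-like type. It is not
std::expected (which needs C++23) and not the Frigg type; Expected,
parse_digit, percent_of_inverse and the error codes are all hypothetical
names chosen for this example, and T is assumed to be default
constructible to keep it short.

```cpp
#include <cassert>
#include <utility>

// Minimal expected-like type: holds either a value or an int error code.
template <typename T>
class Expected {
    bool ok_;
    T value_;
    int error_;
    Expected(bool ok, T v, int e) : ok_(ok), value_(std::move(v)), error_(e) {}

public:
    static Expected success(T v) { return Expected(true, std::move(v), 0); }
    static Expected failure(int e) { return Expected(false, T{}, e); }

    bool has_value() const { return ok_; }
    const T& value() const { return value_; }
    int error() const { return error_; }

    // Run f on the value if there is one; otherwise pass the error along.
    template <typename F>
    auto and_then(F&& f) const {
        using Next = decltype(f(value_));
        if (ok_)
            return f(value_);
        return Next::failure(error_);
    }
};

Expected<int> parse_digit(char c) {
    if (c < '0' || c > '9')
        return Expected<int>::failure(-22);  // made-up "invalid" code
    return Expected<int>::success(c - '0');
}

Expected<int> percent_of_inverse(int d) {
    if (d == 0)
        return Expected<int>::failure(-33);  // made-up "domain" code
    return Expected<int>::success(100 / d);
}

// No explicit error check here: a failure in either step falls through.
Expected<int> compute(char c) {
    return parse_digit(c).and_then(percent_of_inverse);
}
```

The point is that the caller cannot get at the value without going
through the type, so a forgotten check becomes harder to write.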

>> It is incredibly useful to be able to express resource ownership in
>> terms of automatic storage duration.
>
> Oh, indeed, yes - but you also have to be careful:
>
> (1) You don't always want to wait till the end of the scope before releasing
> resources.

One could move a resource out, or call a function akin to the 'reset()'
method of std::unique_ptr.
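
A small sketch of both options using std::unique_ptr. Tracked, sink and
the 'live' counter are hypothetical and exist only to make the release
point observable.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical resource that counts how many instances are alive.
struct Tracked {
    int& live;
    explicit Tracked(int& l) : live(l) { ++live; }
    ~Tracked() { --live; }
};

// Option 1: release explicitly, before the end of the scope.
bool released_early(int& live) {
    auto p = std::make_unique<Tracked>(live);
    p.reset();         // destructor runs here, not at the closing brace
    return live == 0;  // still inside the scope, already released
}

// Takes ownership; the resource dies when the parameter is destroyed.
void sink(std::unique_ptr<Tracked>) {}

// Option 2: move the resource out to someone else.
bool hand_off(int& live) {
    auto p = std::make_unique<Tracked>(live);
    sink(std::move(p));  // ownership transferred to sink()
    return p == nullptr && live == 0;
}
```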

> (2) Expressing ownership of something like a lock so that it is automatically
> undone may require extra memory that is currently unnecessary:
>
> struct foo {
> struct rwsem sem;
> };
>
>
> myfunc(struct foo *foo)
> {
> ...
> struct foo_shared_lock mylock(foo->sem);
> ...
> }
>
> This looks like a nice way to automatically take and hold a lock, but I
> don't think it can be done without storing the address of the semaphore
> in mylock - something that isn't strictly necessary since we can find sem
> from foo.

The compiler can often get rid of it. Here's an example:
https://godbolt.org/z/1W7bnYY7a

Simple enough wrapper classes like these, combined with a modern
compiler's IPA (interprocedural analysis) and inlining, can really do
magic :-)
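
For readers who do not follow the link, a guard of this kind could look
roughly like the sketch below. This is not a copy of the linked example:
rwsem here is a hypothetical stand-in that only counts readers, and
shared_lock_guard and read_value are illustrative names. The guard does
store a reference, but once the constructor and destructor are inlined
the optimizer is usually free to recompute the address from 'f' rather
than keep a separate copy. That is a typical outcome, not a guarantee.

```cpp
#include <cassert>

// Hypothetical stand-in for a reader/writer semaphore.
struct rwsem {
    int readers = 0;
    void down_read() { ++readers; }
    void up_read() { --readers; }
};

struct foo {
    rwsem sem;
    int value = 42;
};

// Takes the lock in the constructor and drops it in the destructor.
class shared_lock_guard {
    rwsem& sem_;

public:
    explicit shared_lock_guard(rwsem& s) : sem_(s) { sem_.down_read(); }
    ~shared_lock_guard() { sem_.up_read(); }

    shared_lock_guard(const shared_lock_guard&) = delete;
    shared_lock_guard& operator=(const shared_lock_guard&) = delete;
};

int read_value(foo& f) {
    shared_lock_guard lock(f.sem);  // lock held until the closing brace
    assert(f.sem.readers == 1);
    return f.value;
}
```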

> (3) We could implement a magic pointer class that automatically does
> reference wangling (kref done right) - but we would have to be very
> careful using it because we want to do the minimum number of atomic ops
> on its refcount that we can manage, firstly because atomic ops are slow
> and secondly because the atomic counter must not overflow.

With move semantics, this could be quite effective and general. The
shared_ptr from the standard library, for instance, won't bump
reference counts if moved. And temporaries are automatically moved.

You could make the class move-only so that *all* reference incrementing
requires a method call (and hence, is clear and obvious), while still
permitting auto-decrementing and preventing reference leakage.
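
A minimal sketch of such a move-only handle follows. Counted, Ref,
adopt() and clone() are hypothetical names, and the count is a plain int
for brevity where real code would need an atomic, overflow-checked
counter. It assumes clone() is only called on a handle that still owns
an object.

```cpp
#include <cassert>
#include <utility>

// Hypothetical refcounted object; 'freed' lets a caller observe the free.
struct Counted {
    int refs;
    int* freed;
};

class Ref {
    Counted* obj_;
    explicit Ref(Counted* o) : obj_(o) {}

    void drop() {
        if (obj_ && --obj_->refs == 0) {
            *obj_->freed = 1;
            delete obj_;
        }
        obj_ = nullptr;
    }

public:
    // Takes over the reference the object was created with.
    static Ref adopt(Counted* o) { return Ref(o); }

    Ref(const Ref&) = delete;             // no implicit increments
    Ref& operator=(const Ref&) = delete;

    // Moving transfers the reference without touching the count.
    Ref(Ref&& other) noexcept : obj_(std::exchange(other.obj_, nullptr)) {}
    Ref& operator=(Ref&& other) noexcept {
        if (this != &other) {
            drop();
            obj_ = std::exchange(other.obj_, nullptr);
        }
        return *this;
    }

    ~Ref() { drop(); }                    // automatic decrement

    // The only place where the count goes up, and it is spelled out.
    Ref clone() const {
        ++obj_->refs;
        return Ref(obj_);
    }

    int refs() const { return obj_ ? obj_->refs : 0; }
};

int demo(int& freed) {
    Ref a = Ref::adopt(new Counted{1, &freed});
    Ref b = std::move(a);        // count stays at 1
    int after_move = b.refs();
    Ref c = b.clone();           // count becomes 2
    return after_move * 10 + c.refs();
}
```

With this shape, grepping for clone() finds every increment, while the
decrements cannot be forgotten.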

>> > (5) Function overloading (except in special inline cases).
>>
>> Generic code, another significant benefit of C++, requires function
>> overloading, though.
>
> I know. But I was thinking that we might want to disable name mangling if we
> can so as not to bloat the size of the kernel image. That said, I do like the
> idea of being able to have related functions of the same name with different
> arguments rather than having to name each one differently.

Hmm, I can understand the symbol table size being an issue.

>> > (7) 'class', 'private', 'namespace'.
>>
>> 'class' does nothing that struct doesn't do, private and namespace serve
>> simply for encapsulation, so I don't see why banning these is useful.
>
> Namespaces would lead to image bloat as they make the symbols bigger.
> Remember, the symbol list uses up unswappable memory.

Ah, I was not aware of this restriction of the kernel (my understanding
was that the symbol table is outside of the kernel image). That poses a
problem, yes. I wonder if a big part of the symbol table (or even the
entirety of it) could be dropped from the kernel. I must say, I do not
know why the kernel has it, so I cannot speak on this issue.

> We use class and private a lot as symbols already, so to get my stuff to
> compile I had to #define them. Granted there's nothing intrinsically
> different about classes and we could rename every instance of the symbol in
> the kernel first.

I see. That is quite understandable then, especially if temporary.

> When it comes to 'private', actually, I might withdraw my objection to it: it
> would help delineate internal fields - but we would then have to change
> out-of-line functions that use it to be members of the class - again
> potentially increasing the size of the symbol table.

This is what I like about it too.

>> > (8) 'virtual'. Don't want virtual base classes, though virtual function
>> > tables might make operations tables more efficient.
>>
>> Virtual base classes are seldom useful, but I see no reason to
>> blanket-ban them (and I suspect you'll never notice that they're not
>> banned).
>
> You can end up increasing the size of your structure as you may need multiple
> virtual method pointer tables - and we have to be very careful about that as
> some structures (dentry, inode and page for example) we have a *lot* of
> instances of in a running kernel.

I retract what I said about virtual classes - I had, indeed, forgotten
about that issue (but, again, I doubt anyone will miss them ;-) ).

>> > (2) Direct assignment of pointers to/from void* isn't allowed by C++, though
>> > g++ grudgingly permits it with -fpermissive. I would imagine that a
>> > compiler option could easily be added to hide the error entirely.
>>
>> This should never be useful.
>
> It's not a matter of whether it should be useful - we do this an awful lot and
> every case of assigning to/from a void pointer would require some sort of
> cast.

I see. That could pose significant trouble.

Ideally, nearly all uses of void* could be dropped sooner or later, as
C++ has a more flexible (despite being stricter) type system.
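
As one hedged illustration of that idea, a callback context that C
would pass as void* can keep its type through a template parameter.
for_each_typed, Sum and add_to_sum are hypothetical names. Note that
each distinct Ctx instantiates its own copy of the function, which ties
back to the image size concerns above.

```cpp
#include <cassert>

// The context type travels with the call instead of being erased to
// void*, so passing a mismatched context is a compile-time error.
template <typename Ctx>
void for_each_typed(const int* data, int n, void (*fn)(int, Ctx&), Ctx& ctx) {
    for (int i = 0; i < n; ++i)
        fn(data[i], ctx);
}

struct Sum {
    long total = 0;
};

// No cast from void* is needed: the callback receives a Sum& directly.
void add_to_sum(int v, Sum& s) { s.total += v; }

long sum_typed(const int* data, int n) {
    Sum s;
    for_each_typed(data, n, add_to_sum, s);
    return s.total;
}
```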

Have a lovely day!

> David

--
Arsen Arsenović

Attachment: signature.asc
Description: PGP signature
