Re: Rust and Kernel Vendoring [Was Re: [PATCH v2 1/8] lib/printbuf: New data structure for heap-allocated strings]
From: Kent Overstreet
Date: Sun Apr 24 2022 - 16:36:39 EST
On Sat, Apr 23, 2022 at 10:16:37AM -0400, James Bottomley wrote:
> You stripped the nuance of that. I said many no_std crates could be
> used in the kernel. I also said that the async crate couldn't because
> the rust compiler itself would have to support the kernel threading
I just scanned through that thread and that's not what you said. What you said
> The above is also the rust crate problem in miniature: the crates grow
> API features the kernel will never care about and importing them
> wholesale is going to take forever because of the internal kernel
> support issue. In the end, to take rust async as an example, it will
> be much better to do for rust what we've done for zlib: take the core
> that can support the kernel threading model and reimplement that in the
> kernel crate. The act of doing that will a) prove people care enough
> about the functionality and b) allow us to refine it nicely.
> I also don't think rust would really want to import crates wholesale.
> The reason for no_std is that rust is trying to adapt to embedded
> environments, which the somewhat harsh constraints of the kernel is
> very similar to.
But maybe your position has changed somewhat? It sounds like you've been
arguing against just directly depending on foreign reposotories and for the
staus quo of just ad-hoc copying of code.
I'll help by stating my own position: I think we should be coming up with a
process for how dependencies on other git repositories are going to work,
something better than just cut and paste. Whether or not we vendorize code isn't
really that important, but I'd say that if we are vendorizing code and we're not
including entire sub-repositories (like cargo vendor does) we ought to still
make this a scripted process that takes as an input a list of files we're
pulling and a remote repository we're pulling from, and the file list and the
remote repo (and commit ID we're pulling from) should all be checked in.
I think using cargo would be _great_ because it would handle this part for us
(perhaps minus pulling of individual files? haven't checked) instead of
home-growing our own. However, I'd like something for C repositories too, and I
don't think we want to start depending on cargo for non Rust development... :)
But much more important that the technical details of how we import code is just
having good answers for people who aren't embedded in Linux kernel development
culture, so we aren't just telling them "no" by default because we haven't
thought about this stuff and having them walk away frustrated.
> > I think Linus said recently that Rust in the kernel is something that
> > could fail, and he's right - but if it fails, it won't just be the
> > failure of the Rust people to do the required work, it'll be _our_
> > failure too, a failure to work with them.
> The big risk is that rust needs to adapt to the kernel environment.
> This isn't rust specific, llvm had similar issues as an alternative C
> compiler. I think rust in the kernel would fail if it were only the
> rust kernel people asking. Fortunately the pressure to support rust in
> embedded leading to the rise in no_std crates is a force which can also
> get rust in the kernel over the finish line because of the focus it
> puts on getting the language and crates to adapt to non standard
It's both! It's on all of us to make this work.
> > The kernel community has a lot of that going on here. Again, sorry to
> > pick on you James, but I wanted to make the argument that - maybe the
> > kernel _should_ be adopting a more structured way of using code from
> > outside repositories, like cargo, or git submodules (except I've
> > never had a positive experience with git submodules, so ignore that
> > suggestion, unless I've just been using them wrong, in which case
> > someone please teach me). To read you and Greg saying "nah, just
> > copy code from other repos, it's fine" - it felt like being back in
> > the old days when we were still trying to get people to use source
> > control, and having that one older colleague who _insisted_ on not
> > using source control of any kind, and that's a bit disheartening.
> Even in C terms, the kernel is a nostdlib environment. If a C project
> has too much libc dependency it's not going to work directly in the
> kernel, nor should it. Let's look at zstd (which is pretty much a
> nostdlib project) as a great example: the facebook people didn't
> actually port the top of their tree (1.5) to the kernel, they
> backported bug fixes to the 1.4 branch and made a special release
> (1.4.10) just for us. Why did they do this? It was because the 1.5
> version vastly increased stack use to the extent it would run off the
> end of the limited kernel stack so couldn't be ported directly into the
> kernel. A lot of C libraries that are nostdlib have problems like this
> as well (you can use recursion, but not in the kernel). There's no
> easy way of shimming environmental constraints like this.
I wonder if we might have come up with a better solution if there'd been more
cross-project communication and less siloing. Small stacks aren't particular to
the kernel - it's definitely not unheard of to write userspace code where you
want to have a lot of small stacks (especially if you're doing some kind of
coroutine style threading; I've done stuff like this in the past) - and to me,
as someone who's been incrementing on and maintaining a codebase in active use
for 10 years, having multiple older versions in active use that need bugfixes
gives me cold shivers.
I wouldn't be surprised if at some point the zstd people walk back some of their
changes or make it configurable at some point :)