Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?

From: Chris Mason
Date: Thu Sep 16 2021 - 12:51:23 EST



> On Sep 15, 2021, at 3:15 PM, James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, 2021-09-15 at 18:41 +0000, Chris Mason wrote:
>>> On Sep 15, 2021, at 2:20 PM, Theodore Ts'o <tytso@xxxxxxx> wrote:
>>>
>>> On Wed, Sep 15, 2021 at 02:03:46PM -0400, James Bottomley wrote:
>>>> On Wed, 2021-09-15 at 13:42 -0400, Theodore Ts'o wrote:
>>>> [...]
>>>>> Would this be helpful? (Or Linus could pull either the folio
>>>>> or pageset branch, and make this proposal obsolete, which would
>>>>> be great. :-)
>>>>
>>>> This is a technical rather than process issue isn't it? You
>>>> don't have enough technical people at the Maintainer summit to
>>>> help meaningfully. The ideal location, of course, was LSF/MM
>>>> which is now not happening.
>>>>
>>>> However, we did offer the Plumbers BBB infrastructure to willy
>>>> for a MM gathering which could be expanded to include this.
>>>
>>> Well, that's why I was suggesting doing this as a LPC BOF, and
>>> using an LPC BOF session on Friday --- I'm very much aware we don't
>>> have the right technical people at the Maintainer Summit.
>>>
>>> It's not clear we will have enough MM folks at the LPC, and I agree
>>> LSF/MM would be a better venue --- but as you say, it's not
>>> happening. We could also use the BBB infrastructure after the LPC
>>> as well, if we can't get everyone lined up and available on short
>>> notice. There are a lot of different possibilities; I'm for
>>> anything where all of the stakeholders agree will work, so we can
>>> make forward progress.
>>
>> I think the two different questions are:
>>
>> * What work is left for merging folios?
>
> My reading of the email threads is that they're iterating to an actual
> conclusion (I admit, I'm surprised) ... or at least the disagreements
> are getting less. Since the merge window closed this is now a 5.16
> thing, so there's no huge urgency to getting it resolved next week.
>

I think the urgency is mostly around clarity for others with out-of-tree work, or who are depending on folios in some other way. Setting up a clear set of conditions for the path forward should also be part of saying not-yet to merging them.

>> * What process should we use to make the overall development of folio
>> sized changes more predictable and rewarding for everyone involved?
>
> Well, the current one seems to be working (admittedly eventually, so
> achieving faster resolution next time might be good) ... but I'm sure
> you could propose alternatives ... especially in the time to resolution
> department.

It feels like these patches are moving forward, but with a pretty heavy emotional cost for the people involved. I'll definitely agree this has been our process for a long time, but I'm struggling to understand why we'd call it working.

In general, we've all come to terms with huge changes being a slog through consensus building, design compromise, the actual technical work, and the rebase/test/fix iteration cycle. It's stressful, both because of technical difficulty and because the whole process is filled with uncertainty.

With folios, we don't have general consensus on:

* Which problems are being solved? Kent's writeup makes it pretty clear filesystems and memory management developers have diverging opinions on this. Our process in general is to put this into patch 0. It mostly works, but there's an intermediate step between patch 0 and the full LWN article that would be really nice to have.

* Who is responsible for accepting the design, and which acks must be obtained before it goes upstream? Our process here is pretty similar to waiting for answers to messages in bottles. We consistently leave it implicit and poorly defined.

* What work is left before it can go upstream? Our process could be effectively modeled by Post-it notes on one person's monitor, which they may or may not share with the group. Also, since we don't have agreement on which acks are required, there's no way to have any certainty about what work is left. It leaves authors feeling derailed when discussion shifts and reviewers feeling frustrated and ignored.

* How do we divide up the long term future direction into individual steps that we can merge? This also goes back to consensus on the design. We can't decide which parts are going to get layered in future merge windows until we know if we're building a car or a banana stand.

* What tests will we use to validate it all? Work this spread out is too big for one developer to test alone. We need ways for people to sign up and agree on which tests/benchmarks provide meaningful results.

The end result of all of this is that missing a merge window isn't just about a time delay. You add N months of total uncertainty, where every new email could result in having to start over from scratch. Willy's do-whatever-the-fuck-you-want-I'm-going-on-vacation email is probably the least surprising part of the whole thread.

Internally, we tend to use a simple shared document to nail all of this down. A two-page Google doc for folios could probably have avoided a lot of pain here, especially if we're able to agree on stakeholders.

-chris