Re: [RFC patch] checkpatch: Add a test for long function definitions (>200 lines)

From: Luc Van Oostenryck
Date: Sun Dec 17 2017 - 17:33:09 EST


On Sun, Dec 17, 2017 at 01:46:45PM -0800, Linus Torvalds wrote:
> On Sat, Dec 16, 2017 at 5:26 PM, Joe Perches <joe@xxxxxxxxxxx> wrote:
> >>
> >> I'm not expecting you to be able to write a perl script that checks
> >> the first line, but we have way too many 200-plus line functions in
> >> the kernel. I'd like a warning on anything over 200 lines (a factor
> >> of 4 over Linus's stated goal).
> >
> > In response to Matthew's request:
> >
> > This is a possible checkpatch warning for long
> > function definitions.
>
> So I'm not sure a line count makes sense.
>
> Sometimes long functions can be sensible, if they are basically just
> one big case-statement or similar.
>
> Looking at one of your examples: futex_requeue() is indeed a long
> function, but that's mainly because it has a lot of comments about
> exactly what is going on, and while it only has one (fairly small)
> case statement, the rest of it is very similar (ie "in this case, do
> XYZ").
>
> Another case I looked at - try_to_unmap_one() - had very similar
> behavior. It's long, but it's not long for the wrong reasons.
>
> And yes, "copy_process()" is disgusting, and probably _could_ be split
> up a bit, but at the same time the bulk of the lines there really is
> just the "initialize all the parts of the struct task_struct".
>
> And other times, I suspect even a 50-line function is way too dense,
> just because it's doing crazy things.
>
> So I have a really hard time with some arbitrary line limit. At the
> very least, I think it should ignore comments and whitespace lines.
>
> And yes, some real "complexity analysis" might give a much more sane
> limit, but I don't even know what that would be or how it would work.
>

It would be very easy to let sparse calculate the cyclomatic complexity
of each function (and then either print it or warn if it is too high), but:
- a warning would also need a hard limit
- the cyclomatic complexity of a function with a big (but simple) switch
  will also be high.

I'm far from sure that the cyclomatic complexity is very useful, but maybe
some variation of it (like counting a switch as a single edge) could
have some value here.
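
To make the difference concrete, here is a rough sketch of the two counts.
This is not sparse code: the FuncStats structure and both function names
are made up for illustration, and the "classic" count assumes the common
convention that every case label is one decision point.

```python
from dataclasses import dataclass, field

@dataclass
class FuncStats:
    # Hypothetical per-function counts; sparse does not expose this structure.
    branches: int                                      # if / loop decision points
    switch_cases: list = field(default_factory=list)   # case labels per switch

def cyclomatic(f):
    # Classic count: 1 + decision points, each case label counted separately.
    return 1 + f.branches + sum(f.switch_cases)

def cyclomatic_flat_switch(f):
    # Variation: each switch counts once, however many cases it has.
    return 1 + f.branches + len(f.switch_cases)

# A function that is basically one big 40-case switch plus two ifs.
dispatcher = FuncStats(branches=2, switch_cases=[40])
print(cyclomatic(dispatcher))              # 43
print(cyclomatic_flat_switch(dispatcher))  # 4
```

With the classic count the big-but-simple switch dominates the result;
with the variation the same function scores about as low as it reads.
Either way a warning still needs some hard limit to compare against.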

-- Luc Van Oostenryck