Re: [PATCH] scripts: add a tool to produce a compile_commands.json file

From: Tom Roeder
Date: Mon Dec 17 2018 - 16:40:30 EST


On Sat, Dec 15, 2018 at 06:37:49PM +0900, Masahiro Yamada wrote:
> On Fri, Dec 7, 2018 at 7:24 AM Tom Roeder <tmroeder@xxxxxxxxxx> wrote:
> >
> > The LLVM/Clang project provides many tools for analyzing C source code.
> > Many of these tools are based on LibTooling
> > (https://clang.llvm.org/docs/LibTooling.html), which depends on a
> > database of compiler flags. The standard container for this database is
> > compile_commands.json, which consists of a list of JSON objects, each
> > with "directory", "file", and "command" fields.
> >
> > Some build systems, like cmake or bazel, produce this compilation
> > information directly. Naturally, Makefiles don't. However, the kernel
> > makefiles already create .<target>.o.cmd files that contain all the
> > information needed to build a compile_commands.json file.
> >
> > So, this commit adds scripts/gen_compile_commands.py, which recursively
> > searches through a directory for .<target>.o.cmd files and extracts
> > appropriate compile commands from them. It writes a
> > compile_commands.json file that LibTooling-based tools can use.
> >
> > By default, gen_compile_commands.py starts its search in its working
> > directory and (over)writes compile_commands.json in the working
> > directory. However, it also supports --output and --directory flags for
> > out-of-tree use.
> >
> > Note that while gen_compile_commands.py enables the use of clang-based
> > tools, it does not require the kernel to be compiled with clang. E.g.,
> > the following sequence of commands produces a compile_commands.json file
> > that works correctly with LibTooling.
> >
> > make defconfig
> > make
> > scripts/gen_compile_commands.py
> >
> > Also note that this script is written to work correctly in both Python 2
> > and Python 3, so it does not specify the Python version in its first
> > line.
> >
> > For an example of the utility of this script: after running
> > gen_compile_commands.py on the latest kernel version, I was able to
> > use Vim + the YouCompleteMe plugin + clangd to automatically jump to
> > definitions and declarations. Obviously, cscope and ctags provide some
> > of this functionality; the advantage of supporting LibTooling is that it
> > opens the door to many other clang-based tools that understand the code
> > directly and do not rely on regular expressions and heuristics.
> >
> > Tested: Built several recent kernel versions and ran the script against
> > them, testing tools like clangd (for editor/LSP support) and clang-check
> > (for static analysis). Also extracted some test .cmd files from a kernel
> > build and wrote a test script to check that the script behaved correctly
> > with all permutations of the --output and --directory flags.
> >
> > Signed-off-by: Tom Roeder <tmroeder@xxxxxxxxxx>
>
>
> I am fine with this,
> but I have one question.
>
> The generated compile_commands.json
> contains $(pound)

To make sure we're talking about the same thing: the instances that I've
seen of "#" occur in macro definitions in the "command" field in some of
the JSON objects. For example, I see things like
-D\"KBUILD_STR(s)=\\#s\".
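
If it helps to compare notes, here is a rough sketch of how one could list
the entries that contain a given string. It is not part of the patch: the
helper name commands_containing and the sample entries are made up for
illustration, and in practice the list would come from json.load() on the
generated compile_commands.json.

```python
def commands_containing(entries, needle):
    """Return the "file" of each entry whose "command" contains needle."""
    return [e["file"] for e in entries if needle in e.get("command", "")]


# Hypothetical entries standing in for the contents of compile_commands.json.
entries = [
    {
        "directory": "/tmp/linux",
        "file": "a.c",
        "command": 'gcc -D"KBUILD_STR(s)=\\#s" -c a.c',
    },
    {
        "directory": "/tmp/linux",
        "file": "b.c",
        "command": "gcc -c b.c",
    },
]

# Files whose command contains a literal '#'.
print(commands_containing(entries, "#"))
# Files whose command contains the literal text '$(pound)'.
print(commands_containing(entries, "$(pound)"))
```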

>
> How is it handled?

The Python json module takes care of escaping the output to make a valid
JSON string for the "command" field. The gen_compile_commands.py script
doesn't take any special action for that or any other character in its
output.
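
As a small illustration of what I mean (a sketch only, not the script
itself; the directory, file name, and command are made up), the json module
escapes the quotes and backslashes in the command, leaves "#" alone, and
gives back the original string when the result is parsed:

```python
import json

# Hypothetical compile command, modeled on the KBUILD_STR example above.
# The Python string holds: gcc -D"KBUILD_STR(s)=\#s" -c -o foo.o foo.c
command = 'gcc -D"KBUILD_STR(s)=\\#s" -c -o foo.o foo.c'

entry = {"directory": "/tmp/linux", "file": "foo.c", "command": command}

# json.dumps escapes '"' as '\"' and '\' as '\\'; '#' needs no escaping.
encoded = json.dumps([entry], indent=2)
print(encoded)

# Parsing the output recovers the original command unchanged.
decoded = json.loads(encoded)
assert decoded[0]["command"] == command
```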

> Should it be replaced with '\#' ?

I don't think it needs to be changed, given my experience with this
script and its testing so far: the output seems to work for me. However,
are you running into problems due to the presence of this character or
inadequate escaping? Please let me know, and I'd be happy to look into
it.

Tom