Re: get_maintainers.pl subsystem output

From: Duda, Sebastian
Date: Tue Jul 23 2019 - 03:30:07 EST


Hi Joe,

When I analyze the patch `<20150128012747.824898918@xxxxxxxxxxxxxxxxxxx>` [1] with `get_maintainer.pl --subsystem --status --separator , /tmp/patch`, I get the following output:

Chris Mason <clm@xxxxxx> (maintainer:BTRFS FILE SYSTEM),Josef Bacik <jbacik@xxxxxx> (maintainer:BTRFS FILE SYSTEM),David Sterba <dsterba@xxxxxxx> (maintainer:BTRFS FILE SYSTEM),Alexander Viro <viro@xxxxxxxxxxxxxxxxxx> (maintainer:FILESYSTEMS (VFS and infrastructure)),"Theodore Ts'o" <tytso@xxxxxxx> (maintainer:EXT4 FILE SYSTEM),Andreas Dilger <adilger.kernel@xxxxxxxxx> (maintainer:EXT4 FILE SYSTEM),Jaegeuk Kim <jaegeuk@xxxxxxxxxx> (maintainer:F2FS FILE SYSTEM),Changman Lee <cm224.lee@xxxxxxxxxxx> (maintainer:F2FS FILE SYSTEM),Miklos Szeredi <miklos@xxxxxxxxxx> (maintainer:FUSE: FILESYSTEM IN USERSPACE),Steven Whitehouse <swhiteho@xxxxxxxxxx> (supporter:GFS2 FILE SYSTEM),Anton Altaparmakov <anton@xxxxxxxxxx> (supporter:NTFS FILESYSTEM),Hugh Dickins <hughd@xxxxxxxxxx> (maintainer:TMPFS (SHMEM FILESYSTEM)),linux-btrfs@xxxxxxxxxxxxxxx (open list:BTRFS FILE SYSTEM),linux-kernel@xxxxxxxxxxxxxxx (open list),linux-fsdevel@xxxxxxxxxxxxxxx (open list:FILESYSTEMS (VFS and infrastructure)),linux-ext4@xxxxxxxxxxxxxxx (open list:EXT4 FILE SYSTEM),linux-f2fs-devel@xxxxxxxxxxxxxxxxxxxxx (open list:F2FS FILE SYSTEM),fuse-devel@xxxxxxxxxxxxxxxxxxxxx (open list:FUSE: FILESYSTEM IN USERSPACE),cluster-devel@xxxxxxxxxx (open list:GFS2 FILE SYSTEM),linux-ntfs-dev@xxxxxxxxxxxxxxxxxxxxx (open list:NTFS FILESYSTEM),linux-mm@xxxxxxxxx (open list:MEMORY MANAGEMENT)
Maintained,Buried alive in reporters,Supported
BTRFS FILE SYSTEM,THE REST,FILESYSTEMS (VFS and infrastructure),EXT4 FILE SYSTEM,F2FS FILE SYSTEM,FUSE: FILESYSTEM IN USERSPACE,GFS2 FILE SYSTEM,NTFS FILESYSTEM,MEMORY MANAGEMENT,TMPFS (SHMEM FILESYSTEM)

How can I parse this output automatically? Or how can I generate parsable output?

I need the tuples of subsystems and status:
(THE REST, Buried alive in reporters)
(TMPFS, Maintained)
(BTRFS FILE SYSTEM, Maintained)
...
(GFS2 FILE SYSTEM, Supported)

I don't know how to reliably assign the statuses to the subsystems, because each status is printed only once even when it applies to several subsystems.
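
One possible workaround, sketched here under assumptions rather than taken from the script itself: skip the status line entirely and look up each subsystem's status in the MAINTAINERS file. The sketch assumes the usual MAINTAINERS layout (an unindented section title followed by tag lines such as `M:`, `L:` and `S:`), assumes that the names on the subsystem line match the section titles exactly, and assumes that no section title contains the separator. The function names `parse_maintainers` and `pair_subsystems` are made up for this example.

```python
import re

# Tag lines in MAINTAINERS look like "S:<tab>Maintained".
# A single capital letter followed by a colon marks a tag line.
TAG_LINE = re.compile(r"^([A-Z]):\s*(.*)$")


def parse_maintainers(text):
    """Map each section title to the value of its S: (status) line."""
    statuses = {}
    title = None
    for line in text.splitlines():
        tag = TAG_LINE.match(line)
        if tag:
            if tag.group(1) == "S" and title is not None:
                statuses[title] = tag.group(2).strip()
        elif line.strip() and not line[0].isspace():
            # Any other unindented, non-empty line is treated as a title.
            title = line.strip()
    return statuses


def pair_subsystems(subsystem_line, statuses, separator=","):
    """Return (subsystem, status) tuples; status is None if not found."""
    names = [name.strip() for name in subsystem_line.split(separator)]
    return [(name, statuses.get(name)) for name in names if name]


if __name__ == "__main__":
    # Tiny hand-written sample in MAINTAINERS style (addresses are fake).
    sample = (
        "BTRFS FILE SYSTEM\n"
        "M:\tSomeone <someone@example.com>\n"
        "S:\tMaintained\n"
        "\n"
        "GFS2 FILE SYSTEM\n"
        "S:\tSupported\n"
        "\n"
        "THE REST\n"
        "S:\tBuried alive in reporters\n"
    )
    table = parse_maintainers(sample)
    for pair in pair_subsystems("BTRFS FILE SYSTEM,THE REST,GFS2 FILE SYSTEM", table):
        print(pair)
```

Using a separator that cannot occur in a section title would make the split safer than a plain comma. For a historical analysis, the MAINTAINERS text passed in should come from the same tree the script was run against.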

Thank you in advance
Kind regards

Sebastian Duda

[1] https://lore.kernel.org/patchwork/patch/537252/

On 2019-07-19 10:50, Joe Perches wrote:
On Fri, 2019-07-19 at 07:35 +0000, Duda, Sebastian wrote:
Hi Joe,

I'm conducting a large-scale patch analysis of the LKML with 1.8 million
patch emails. I'm using the `get_maintainer.pl` script to know which
patch is related to which subsystem.

The MAINTAINERS file is updated frequently.

Are you also using the MAINTAINERS file used
at the time each patch was submitted?

I ran into two issues while using the script:

1. When I use the script the trivial way

$ scripts/get_maintainer.pl --subsystem --status --separator ,
drivers/media/i2c/adv748x/
Kieran Bingham <kieran.bingham@xxxxxxxxxxxxxxxx> (maintainer:ANALOG
DEVICES INC ADV748X DRIVER),Mauro Carvalho Chehab <mchehab@xxxxxxxxxx>
(maintainer:MEDIA INPUT INFRASTRUCTURE
(V4L/DVB)),linux-media@xxxxxxxxxxxxxxx (open list:ANALOG DEVICES INC
ADV748X DRIVER),linux-kernel@xxxxxxxxxxxxxxx (open list)
Maintained,Buried alive in reporters
ANALOG DEVICES INC ADV748X DRIVER,MEDIA INPUT INFRASTRUCTURE
(V4L/DVB),THE REST

the output is hard to parse because the status `Maintained` is displayed
only once but applies to two subsystems.

I'd prefer a more table-like representation, like this:

Kieran Bingham <kieran.bingham@xxxxxxxxxxxxxxxx> (maintainer:ANALOG
DEVICES INC ADV748X DRIVER),linux-media@xxxxxxxxxxxxxxx (open
list:ANALOG DEVICES INC ADV748X DRIVER),ANALOG DEVICES INC ADV748X
DRIVER,Maintained
Mauro Carvalho Chehab <mchehab@xxxxxxxxxx> (maintainer:MEDIA INPUT
INFRASTRUCTURE (V4L/DVB)),MEDIA INPUT INFRASTRUCTURE
(V4L/DVB),Maintained
linux-kernel@xxxxxxxxxxxxxxx (open list),THE REST,Buried alive in
reporters


2. I want to analyze multiple patches; currently I am calling the script
once per patch. When calling the script with multiple files, the files'
output is merged:

$ scripts/get_maintainer.pl --subsystem --status --separator ','
drivers/media/i2c/adv748x/ include/uapi/linux/wmi.h
Kieran Bingham <kieran.bingham@xxxxxxxxxxxxxxxx> (maintainer:ANALOG
DEVICES INC ADV748X DRIVER),Mauro Carvalho Chehab <mchehab@xxxxxxxxxx>
(maintainer:MEDIA INPUT INFRASTRUCTURE
(V4L/DVB)),linux-media@xxxxxxxxxxxxxxx (open list:ANALOG DEVICES INC
ADV748X DRIVER),linux-kernel@xxxxxxxxxxxxxxx (open
list),platform-driver-x86@xxxxxxxxxxxxxxx (open list:ACPI WMI DRIVER)
Maintained,Buried alive in reporters,Orphan
ANALOG DEVICES INC ADV748X DRIVER,MEDIA INPUT INFRASTRUCTURE
(V4L/DVB),THE REST,ACPI WMI DRIVER

I'd like to run the script with all files but separated output, like
this:

$ scripts/get_maintainer.pl --subsystem --status --separator ','
--separate-files drivers/media/i2c/adv748x/ include/uapi/linux/wmi.h
Kieran Bingham <kieran.bingham@xxxxxxxxxxxxxxxx> (maintainer:ANALOG
DEVICES INC ADV748X DRIVER),Mauro Carvalho Chehab <mchehab@xxxxxxxxxx>
(maintainer:MEDIA INPUT INFRASTRUCTURE
(V4L/DVB)),linux-media@xxxxxxxxxxxxxxx (open list:ANALOG DEVICES INC
ADV748X DRIVER),linux-kernel@xxxxxxxxxxxxxxx (open list)
Maintained,Buried alive in reporters
ANALOG DEVICES INC ADV748X DRIVER,MEDIA INPUT INFRASTRUCTURE
(V4L/DVB),THE REST

platform-driver-x86@xxxxxxxxxxxxxxx (open list:ACPI WMI
DRIVER),linux-kernel@xxxxxxxxxxxxxxx (open list)
Orphan,Buried alive in reporters
ACPI WMI DRIVER,THE REST


My questions are:
1. How can I make get_maintainer.pl's output more table-like?

I suggest adding --nogit --nogit-fallback --roles --norolestats

2. How can I make get_maintainer.pl separate each file's output?

Run the script with multiple invocations, once for each file
modified by the patch.
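
The per-file approach could be scripted roughly as follows. This is only a sketch: it combines the options already mentioned in this thread, assumes it is run from the top of a kernel tree where `scripts/get_maintainer.pl` exists, and the helper name `run_per_file` and its `runner` parameter (there only so the function can be exercised without a kernel tree) are invented for illustration.

```python
import subprocess

# Options taken from this thread; the script path is relative to the
# top of the kernel tree.
BASE_CMD = [
    "scripts/get_maintainer.pl",
    "--subsystem", "--status", "--separator", ",",
    "--nogit", "--nogit-fallback", "--roles", "--norolestats",
]


def run_per_file(paths, kernel_tree=".", runner=subprocess.run):
    """Invoke the script once per path and keep each output separate.

    Returns a dict mapping every path to the list of output lines
    produced by its own invocation.
    """
    results = {}
    for path in paths:
        proc = runner(
            BASE_CMD + [path],
            cwd=kernel_tree,
            capture_output=True,
            text=True,
            check=True,  # raise if the script exits with an error
        )
        results[path] = proc.stdout.splitlines()
    return results


if __name__ == "__main__":
    files = ["drivers/media/i2c/adv748x/", "include/uapi/linux/wmi.h"]
    for path, lines in run_per_file(files).items():
        print(path)
        for line in lines:
            print("    " + line)
```

Keeping one invocation per file means the output never has to be split after the fact, at the cost of starting the Perl script once for every file.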