Re: [RFC][PATCH v2 06/11] perf: core, export pmus via sysfs
From: Ingo Molnar
Date: Thu May 20 2010 - 16:15:25 EST
* Greg KH <greg@xxxxxxxxx> wrote:
> [...]
>
> I can always knock up a eventfs for you do mount at /sys/kernel/events/ or
> something if you want :)
eventfs was my first idea, until Peter convinced me that we want sysfs :-)
One important aspect would be to move it into the physical topology. Graphics
card? It might have events. PCI device? It might have events. Southbridge? It
might have a PMU and events. CPU? It has a PMU.
Especially when it comes to complex physical topologies on larger systems, we
eventually want to visualize things in tooling as well - as a tree of the
physical topology. Also, physical topologies will only become more complex, so
we don't want to detach events from them.
> sysfs exports single values just fine. If you are starting to do more
> complex things, like you currently are, maybe you shouldn't be in sysfs...
This is really like a read-only attribute, and it would be multi-line only
for the event format descriptor - a genuinely new aspect: a flexible ABI
descriptor.
It's an attribute for a very good purpose: flexible ABI with a user-space that
interprets new format descriptions automatically. This is not just theory: for
example perf trace does this today, and you can write scripts with old tools
for a new event that shows up in a new kernel, without rebuilding the tools.
Here is an example of a format descriptor:
# cat /debug/tracing/events/sched/sched_wakeup/format
name: sched_wakeup
ID: 59
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:int common_lock_depth; offset:8; size:4; signed:1;
field:char comm[TASK_COMM_LEN]; offset:12; size:16; signed:1;
field:pid_t pid; offset:28; size:4; signed:1;
field:int prio; offset:32; size:4; signed:1;
field:int success; offset:36; size:4; signed:1;
field:int target_cpu; offset:40; size:4; signed:1;
print fmt: "comm=%s pid=%d prio=%d success=%d target_cpu=%03d", REC->comm, REC->pid, REC->prio, REC->success, REC->target_cpu
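To make the "user-space interprets the descriptor automatically" point concrete, here is a minimal sketch of a parser for the format text shown above. It is an illustration only, not how perf trace actually does it: the names parse_format() and read_int_field() are hypothetical, the sample is a shortened copy of the output above, and the byte order of the raw record is assumed to be the host's.

```python
import re
import sys

# One "field:" line of the descriptor: C declaration, offset, size, signedness.
FIELD_RE = re.compile(
    r"field:(?P<decl>[^;]+);\s*offset:(?P<offset>\d+);\s*"
    r"size:(?P<size>\d+);\s*signed:(?P<signed>[01]);"
)

# Shortened copy of the sched_wakeup descriptor quoted above.
SAMPLE = """\
name: sched_wakeup
ID: 59
format:
field:int common_pid; offset:4; size:4; signed:1;
field:char comm[TASK_COMM_LEN]; offset:12; size:16; signed:1;
field:pid_t pid; offset:28; size:4; signed:1;
field:int prio; offset:32; size:4; signed:1;
print fmt: "comm=%s pid=%d prio=%d", REC->comm, REC->pid, REC->prio
"""


def parse_format(text):
    """Parse a format descriptor into a dict (hypothetical helper)."""
    desc = {"name": None, "id": None, "fields": [], "print_fmt": None}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("name:"):
            desc["name"] = line.split(":", 1)[1].strip()
        elif line.startswith("ID:"):
            desc["id"] = int(line.split(":", 1)[1])
        elif line.startswith("print fmt:"):
            desc["print_fmt"] = line.split(":", 1)[1].strip()
        else:
            m = FIELD_RE.match(line)
            if m is None:
                continue  # "format:" header, blank lines, anything unknown
            decl = m.group("decl").strip()
            # The last token of the declaration is the field name; drop "[N]".
            name = decl.split()[-1].split("[")[0]
            desc["fields"].append({
                "name": name,
                "decl": decl,
                "offset": int(m.group("offset")),
                "size": int(m.group("size")),
                "signed": m.group("signed") == "1",
            })
    return desc


def read_int_field(desc, raw, name, byteorder=sys.byteorder):
    """Decode one integer field from a raw record using the descriptor."""
    for field in desc["fields"]:
        if field["name"] == name:
            start = field["offset"]
            chunk = raw[start:start + field["size"]]
            return int.from_bytes(chunk, byteorder, signed=field["signed"])
    raise KeyError(name)


desc = parse_format(SAMPLE)
```

The point of the sketch is that nothing in it knows about sched_wakeup: a new event with new fields would be decoded by the same code, because offsets and sizes come from the descriptor rather than from the tool.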
Also, we already have quite a few multi-line files in sysfs, for example:
$ cat /sys/devices/pnp0/00:09/options
Dependent: 00 - Priority preferred
port 0x378-0x378, align 0x0, size 0x8, 16-bit address decoding
port 0x778-0x778, align 0x0, size 0x8, 16-bit address decoding
irq 7 High-Edge
dma 3 8-bit compatible
Dependent: 01 - Priority acceptable
port 0x378-0x378, align 0x0, size 0x8, 16-bit address decoding
port 0x778-0x778, align 0x0, size 0x8, 16-bit address decoding
irq 3,4,5,6,7,10,11,12 High-Edge
dma 0,1,2,3 8-bit compatible
Dependent: 02 - Priority acceptable
port 0x278-0x278, align 0x0, size 0x8, 16-bit address decoding
port 0x678-0x678, align 0x0, size 0x8, 16-bit address decoding
irq 3,4,5,6,7,10,11,12 High-Edge
dma 0,1,2,3 8-bit compatible
Dependent: 03 - Priority acceptable
port 0x3bc-0x3bc, align 0x0, size 0x4, 16-bit address decoding
port 0x7bc-0x7bc, align 0x0, size 0x4, 16-bit address decoding
irq 3,4,5,6,7,10,11,12 High-Edge
dma 0,1,2,3 8-bit compatible
$ cat /sys/devices/pci0000:00/0000:00:1a.7/pools
poolinfo - 0.1
ehci_sitd 0 0 96 0
ehci_itd 0 0 160 0
ehci_qh 4 42 96 1
ehci_qtd 4 42 96 1
buffer-2048 0 0 2048 0
buffer-512 0 0 512 0
buffer-128 0 0 128 0
buffer-32 1 128 32 1
In fact uevents have multi-line attributes as well:
$ cat /sys/devices/pci0000:00/0000:00:1a.1/usb4/uevent
MAJOR=189
MINOR=384
DEVNAME=bus/usb/004/001
DEVTYPE=usb_device
DRIVER=usb
DEVICE=/proc/bus/usb/004/001
PRODUCT=1d6b/1/206
TYPE=9/0/0
BUSNUM=004
DEVNUM=001
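Such multi-line attributes are still trivial for user-space to consume. As a rough sketch (parse_uevent() is a hypothetical helper name, and the sample is a shortened copy of the output above), the uevent file reduces to a dictionary in a few lines:

```python
SAMPLE_UEVENT = """\
MAJOR=189
MINOR=384
DEVNAME=bus/usb/004/001
DEVTYPE=usb_device
DRIVER=usb
"""


def parse_uevent(text):
    """Turn KEY=VALUE lines into a dict, skipping lines without '='."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            result[key.strip()] = value.strip()
    return result


uevent = parse_uevent(SAMPLE_UEVENT)
```

On a real system the text would come from reading the uevent file itself instead of the embedded sample.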
Thanks,
Ingo