Re: [PATCH 0/8] perf: add ability to sample physical data addresses

From: Stephane Eranian
Date: Tue Jul 30 2013 - 09:09:21 EST


On Tue, Jul 30, 2013 at 11:02 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Tue, Jul 30, 2013 at 10:51:46AM +0200, Stephane Eranian wrote:
>> On Tue, Jul 30, 2013 at 10:37 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> > On Tue, Jul 30, 2013 at 10:02:01AM +0200, Stephane Eranian wrote:
>> >> > Ahh. We don't put the useful bits in the mmap event; we'll need to fix
>> >> > that too then ;-)
>> >> >
>> >> > Doing so is going to be a bit of a bother since we use the tail of
>> >> > PERF_RECORD_MMAP for filenames and thus aren't particularly extensible.
>> >> >
>> >> > This would mean doing something like PERF_RECORD_MMAP2 and some means
>> >> > for userspace to request the new events instead of the old one.
>> >> >
>> >> Tracking mmaps even for shmat() won't cover the paging cases. When you page a
>> >> page back in, it most likely gets a different physical page. How would
>> >> we track that case too using the same approach?
>> >
>> > It doesn't matter. Even if a page ends up being a different physical
>> > page, it will always be the same sb:inode:pgoffset. You should be able
>> > to always uniquely identify a (shared) page by that triplet.
>> >
>> Ok, so you're saying that triplet uniquely identifies a virtual page
>> regardless of the physical page it is mapped onto. If the physical page
>> changes because of paging, we keep the same triplet and therefore we can
>> still detect the false sharing.
>
> Exactly.
>
I see this for my program:

7f0a59cbe000-7f0a59cc1000 rw-p 00000000 00:00 0
7f0a59cd3000-7f0a59cd4000 rw-p 00000000 00:00 0
7f0a59cd4000-7f0a59cd5000 rw-s 00000000 00:04 458753 /SYSV00000000 (deleted)
7f0a59cd5000-7f0a59cd6000 rw-s 00000000 00:04 425984 /SYSV00000000 (deleted)
7f0a59cd6000-7f0a59cd7000 rw-s 00000000 00:04 425984 /SYSV00000000 (deleted)

The first 2 lines are heap. Their maj:min and ino fields are all zero, so
there is nothing useful coming out of them. For the shared segments, however,
we can use the ino number. A shared memory segment appears as a file in the
vma, so the kernel does fill in the ino, maj and min numbers. In my program
I map the same segment twice, and we see that the last two mappings carry the
same maj:min and ino.
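
As a rough illustration of the observation above (not part of the patch
series), the fields of a maps line can be pulled apart in userspace and
compared. The sketch assumes the usual /proc/PID/maps layout (hex addresses
and offset, hex maj:min, decimal inode). struct maps_entry, parse_maps_line()
and same_backing_object() are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the fixed fields of one /proc/PID/maps line. */
struct maps_entry {
	unsigned long long start;  /* first virtual address of the mapping */
	unsigned long long end;    /* one past the last virtual address */
	char perms[5];             /* e.g. "rw-s"; 's' marks a shared mapping */
	unsigned long long offset; /* byte offset into the backing object */
	unsigned int maj, min;     /* device major:minor, printed in hex */
	unsigned long long ino;    /* inode number, printed in decimal */
};

/* Parse the fixed fields of a maps line. Returns 0 on success, -1 otherwise.
 * Any trailing path such as "/SYSV00000000 (deleted)" is ignored. */
static int parse_maps_line(const char *line, struct maps_entry *e)
{
	assert(line != NULL && e != NULL);
	int n = sscanf(line, "%llx-%llx %4s %llx %x:%x %llu",
		       &e->start, &e->end, e->perms, &e->offset,
		       &e->maj, &e->min, &e->ino);
	return n == 7 ? 0 : -1;
}

/* Two mappings share a backing object when device and inode both match.
 * An inode of 0 (anonymous memory) carries no identity, so it never matches. */
static int same_backing_object(const struct maps_entry *a,
			       const struct maps_entry *b)
{
	return a->ino != 0 && a->maj == b->maj &&
	       a->min == b->min && a->ino == b->ino;
}
```

Fed the last two lines of the listing above, same_backing_object() would
report a match even though the virtual ranges differ. Either heap line
compared with anything reports no match because its inode is 0.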

In the case of regular paging, though, there is no useful info there. But
then I suspect that for a private heap page we only care about the
multi-threaded case, and there the physical page is irrelevant.
So it seems all we care about is covering the shared segment case, and
we can get the info from the vma and create a MMAP2 record for it.

Do we agree?


>> > So if we create a new MMAP record that includes the device (substitute
>> > for the superblock) and inode information we should be good.
>>
>> I will try that. I am not familiar with mm, so where do we find the
>> device? Inside the vma?
>
> Take a peek at fs/proc/task_mmu.c:show_map_vma(), it's the code used to
> print /proc/$PID/maps and displays all stuff we want.

That is what I see in that function:

	if (file) {
		struct inode *inode = file_inode(vma->vm_file);
		dev = inode->i_sb->s_dev;
		ino = inode->i_ino;
		pgoff = ((loff_t)vma->vm_pgoff) << PAGE_SHIFT;
	}

It works for anything associated with a file.