Re: [PATCH v3 2.6.39-rc1-tip 12/26] 12: uprobes: slot allocation for uprobes

From: Srikar Dronamraju
Date: Tue Apr 19 2011 - 02:40:44 EST


* Peter Zijlstra <peterz@xxxxxxxxxxxxx> [2011-04-18 18:46:11]:

> On Fri, 2011-04-01 at 20:04 +0530, Srikar Dronamraju wrote:
> > Every task is allocated a fixed slot. When a probe is hit, the original
> > instruction corresponding to the probe hit is copied to per-task fixed
> > slot. Currently we allocate one page of slots for each mm. Bitmaps are
> > used to know which slots are free. Each slot is made of 128 bytes so
> > that it's cache aligned.
> >
> > TODO: On massively threaded processes (or if a huge number of processes
> > share the same mm), there is a possibility of running out of slots.
> > One alternative could be to extend the slots as and when slots are required.
>
> As long as you're single stepping things and not using boosted probes
> you can fully serialize the slot usage. Claim a slot on trap and release
> the slot on finish. Claiming can wait on a free slot since you already
> have the whole SLEEPY thing.
>

Yes, that's certainly one approach, but it makes every breakpoint hit
contend for a spinlock. (In fact we would have to change it to a mutex,
as you rightly pointed out, so that threads can wait when no slot is
free.) Assuming a 4K page (32 slots of 128 bytes each), we would be
taxing applications that have fewer than 32 threads (which is probably
the default case). If we continue with the current approach, then we
would need to add additional page(s) only for apps which have more than
32 threads, and only when more than 32 __live__ threads have actually
hit a breakpoint.
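
To make the arithmetic above concrete, here is a rough user-space sketch
of a one-page slot bitmap. It is not the patch code: the struct and the
xol_*_slot helper names are made up for illustration, and the 4K page
size is an assumption. It only shows that one page of 128-byte slots
gives 32 slots, and that the 33rd claim is the point where we would
either have to wait or add another page.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: a 4K page and the 128-byte slots from the patch description. */
#define XOL_PAGE_SIZE	4096u
#define XOL_SLOT_SIZE	128u
#define XOL_NSLOTS	(XOL_PAGE_SIZE / XOL_SLOT_SIZE)	/* 32 slots per page */

/* Hypothetical stand-in for the per-mm slot area; one bit per slot. */
struct xol_area_sketch {
	unsigned long vaddr;	/* base address of the slot page */
	uint32_t bitmap;	/* bit n set => slot n is in use */
};

/* Claim the lowest free slot; return its index, or -1 if the page is full. */
static int xol_claim_slot(struct xol_area_sketch *area)
{
	unsigned int n;

	for (n = 0; n < XOL_NSLOTS; n++) {
		if (!(area->bitmap & ((uint32_t)1 << n))) {
			area->bitmap |= (uint32_t)1 << n;
			return (int)n;
		}
	}
	return -1;	/* caller must wait for a slot or extend the area */
}

/* Mark slot n free again. */
static void xol_release_slot(struct xol_area_sketch *area, int n)
{
	area->bitmap &= ~((uint32_t)1 << n);
}

/* Address at which the copied instruction for slot n would live. */
static unsigned long xol_slot_addr(const struct xol_area_sketch *area, int n)
{
	return area->vaddr + (unsigned long)n * XOL_SLOT_SIZE;
}
```

The sketch has no locking at all; the real code would need the bitmap
update to be atomic or protected by a lock.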

>
> > +static int xol_add_vma(struct uprobes_xol_area *area)
> > +{
> > +	struct vm_area_struct *vma;
> > +	struct mm_struct *mm;
> > +	struct file *file;
> > +	unsigned long addr;
> > +	int ret = -ENOMEM;
> > +
> > +	mm = get_task_mm(current);
> > +	if (!mm)
> > +		return -ESRCH;
> > +
> > +	down_write(&mm->mmap_sem);
> > +	if (mm->uprobes_xol_area) {
> > +		ret = -EALREADY;
> > +		goto fail;
> > +	}
> > +
> > +	/*
> > +	 * Find the end of the top mapping and skip a page.
> > +	 * If there is no space for PAGE_SIZE above
> > +	 * that, mmap will ignore our address hint.
> > +	 *
> > +	 * We allocate a "fake" unlinked shmem file because
> > +	 * anonymous memory might not be granted execute
> > +	 * permission when the selinux security hooks have
> > +	 * their way.
> > +	 */
>
> That just annoys me, so we're working around some stupid sekurity crap,
> executable anonymous maps are perfectly fine, also what do JITs do?

Yes, we are working around the selinux security hooks, but do we have a
choice?

James, can you please comment on this?

>
> > +	vma = rb_entry(rb_last(&mm->mm_rb), struct vm_area_struct, vm_rb);
> > +	addr = vma->vm_end + PAGE_SIZE;
> > +	file = shmem_file_setup("uprobes/xol", PAGE_SIZE, VM_NORESERVE);
> > +	if (!file) {
> > +		printk(KERN_ERR "uprobes_xol failed to setup shmem_file "
> > +			"while allocating vma for pid/tgid %d/%d for "
> > +			"single-stepping out of line.\n",
> > +			current->pid, current->tgid);
> > +		goto fail;
> > +	}
> > +	addr = do_mmap_pgoff(file, addr, PAGE_SIZE, PROT_EXEC, MAP_PRIVATE, 0);
> > +	fput(file);
> > +