Re: [1/3 PATCH] Kprobes: User space probes support- base interface

From: Andrew Morton
Date: Mon Mar 20 2006 - 05:43:17 EST


Prasanna S Panchamukhi <prasanna@xxxxxxxxxx> wrote:
>
> This patch provides two interfaces to insert and remove
> user space probes. Each probe is uniquely identified by
> inode and offset within that executable/library file.
> Insertion of a probe involves getting the code page for
> a given offset, mapping it into the memory and then insert
> the breakpoint at the given offset. Also the probe is added
> to the uprobe_table hash list. A uprobe_module data strcture
> is allocated for every probed application/library image on disk.
> Removal of a probe involves getting the code page for a given
> offset, mapping that page into the memory and then replacing
> the breakpoint instruction with a the original opcode.
> This patch also provides aggrigate probe handler feature,
> where user can define multiple handlers per probe.
>
> +/**
> + * Finds a uprobe at the specified user-space address in the current task.
> + * Points current_uprobe at that uprobe and returns the corresponding kprobe.
> + */
> +struct kprobe __kprobes *get_uprobe(void *addr)
> +{
> + struct mm_struct *mm = current->mm;
> + struct vm_area_struct *vma;
> + struct inode *inode;
> + unsigned long offset;
> + struct kprobe *p, *kpr;
> + struct uprobe *uprobe;
> +
> + vma = find_vma(mm, (unsigned long)addr);
> +
> + BUG_ON(!vma); /* this should not happen, not in our memory map */
> +
> + offset = (unsigned long)addr - (vma->vm_start +
> + (vma->vm_pgoff << PAGE_SHIFT));
> + if (!vma->vm_file)
> + return NULL;

All this appears to be happening without mmap_sem held?

> +/**
> + * Wait for the page to be unlocked if someone else had locked it,
> + * then map the page and insert or remove the breakpoint.
> + */
> +static int __kprobes map_uprobe_page(struct page *page, struct uprobe *uprobe,
> + process_uprobe_func_t process_kprobe_user)
> +{
> + int ret = 0;
> + kprobe_opcode_t *uprobe_address;
> +
> + if (!page)
> + return -EINVAL; /* TODO: more suitable errno */
> +
> + wait_on_page_locked(page);
> + /* could probably retry readpage here. */
> + if (!PageUptodate(page))
> + return -EINVAL; /* TODO: more suitable errno */
> +
> + lock_page(page);

That's a lot of fuss and might be racy with truncate.

Why not just run lock_page()?

> + uprobe_address = (kprobe_opcode_t *)kmap(page);
> + uprobe_address = (kprobe_opcode_t *)((unsigned long)uprobe_address +
> + (uprobe->offset & ~PAGE_MASK));
> + ret = (*process_kprobe_user)(uprobe, uprobe_address);
> + kunmap(page);

kmap_atomic() is preferred.

> +/**
> + * Gets exclusive write access to the given inode to ensure that the file
> + * on which probes are currently applied does not change. Use the function,
> + * deny_write_access_to_inode() we added in fs/namei.c.
> + */
> +static inline int ex_write_lock(struct inode *inode)
> +{
> + return deny_write_access_to_inode(inode);
> +}

hm, this code likes to poke around in VFS internals. It would be nice to
have an overall description of what it's trying to do, why and how.

> +/**
> + * unregister_uprobe: Disarms the probe, removes the uprobe
> + * pointers from the hash list and unhooks readpage routines.
> + */
> +void __kprobes unregister_uprobe(struct uprobe *uprobe)
> +{
> + struct address_space *mapping;
> + struct uprobe_module *umodule;
> + struct page *page;
> + unsigned long flags;
> + int ret = 0;
> +
> + if (!uprobe->inode)
> + return;
> +
> + mapping = uprobe->inode->i_mapping;
> +
> + page = find_get_page(mapping, uprobe->offset >> PAGE_CACHE_SHIFT);
> +
> + spin_lock_irqsave(&uprobe_lock, flags);
> + ret = remove_uprobe(uprobe);
> + spin_unlock_irqrestore(&uprobe_lock, flags);
> +
> + mutex_lock(&uprobe_mutex);
> + if (!(umodule = get_module_by_inode(uprobe->inode)))
> + goto out;
> +
> + hlist_del(&uprobe->ulist);
> + if (hlist_empty(&umodule->ulist_head)) {
> + list_del(&umodule->mlist);
> + ex_write_unlock(uprobe->inode);
> + path_release(&umodule->nd);
> + kfree(umodule);
> + }
> +
> +out:
> + mutex_unlock(&uprobe_mutex);
> + if (ret)
> + ret = map_uprobe_page(page, uprobe, remove_kprobe_user);
> +
> + if (ret == -EINVAL)
> + return;
> + /*
> + * TODO: unregister_uprobe should not fail, need to handle
> + * if it fails.
> + */
> + flush_vma(mapping, page, uprobe);
> +
> + if (page)
> + page_cache_release(page);
> +}

That's some pretty awkward coding. Buggy too. It doesn't drop the
refcount on the page if map_uprobe_page() returned -EINVAL because it's
assuming that EINVAL meant "there was no page". But there are multiple
reasons for map_uprobe_page() to return -EINVAL. If that page isn't
uptodate, we leak a ref.

This function should be doing the checking for a find_get_page() failure,
not map_uprobe_page().

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/