Re: [PATCH v3 09/10] kallsyms: Hide layout

From: Jann Horn
Date: Wed Jun 24 2020 - 06:22:31 EST


On Tue, Jun 23, 2020 at 7:26 PM Kristen Carlson Accardi
<kristen@xxxxxxxxxxxxxxx> wrote:
> This patch makes /proc/kallsyms display alphabetically by symbol
> name rather than sorted by address in order to hide the newly
> randomized address layout.
[...]
> +static int sorted_show(struct seq_file *m, void *p)
> +{
> + struct list_head *list = m->private;
> + struct kallsyms_iter_list *iter;
> + int rc;
> +
> + if (list_empty(list))
> + return 0;
> +
> + iter = list_first_entry(list, struct kallsyms_iter_list, next);
> +
> + m->private = iter;
> + rc = s_show(m, p);
> + m->private = list;
> +
> + list_del(&iter->next);
> + kfree(iter);

Does anything like this kfree() happen if someone only reads the start
of kallsyms and then closes the file? IOW, does "while true; do head
-n1 /proc/kallsyms; done" leak memory?

> + return rc;
> +}
[...]
> +static int kallsyms_list_cmp(void *priv, struct list_head *a,
> + struct list_head *b)
> +{
> + struct kallsyms_iter_list *iter_a, *iter_b;
> +
> + iter_a = list_entry(a, struct kallsyms_iter_list, next);
> + iter_b = list_entry(b, struct kallsyms_iter_list, next);
> +
> + return strcmp(iter_a->iter.name, iter_b->iter.name);
> +}

This sorts only by name, but kallsyms prints more information (module
names and types). This means that if there are elements whose names
are the same, but whose module names or types are different, then some
amount of information will still be leaked by the ordering of elements
with the same name. In practice, since list_sort() is stable, this
means you can see the ordering of many modules, and you can see the
ordering of symbols with same name but different visibility (e.g. "t
user_read" from security/selinux/ss/policydb.c vs "T user_read" from
security/keys/user_defined.c, and a couple other similar cases).

[...]
> +#if defined(CONFIG_FG_KASLR)
> +/*
> + * When fine grained kaslr is enabled, we need to
> + * print out the symbols sorted by name rather than by
> + * by address, because this reveals the randomization order.
> + */
> +static int kallsyms_open(struct inode *inode, struct file *file)
> +{
> + int ret;
> + struct list_head *list;
> +
> + list = __seq_open_private(file, &kallsyms_sorted_op, sizeof(*list));
> + if (!list)
> + return -ENOMEM;
> +
> + INIT_LIST_HEAD(list);
> +
> + ret = kallsyms_on_each_symbol(get_all_symbol_name, list);
> + if (ret != 0)
> + return ret;
> +
> + list_sort(NULL, list, kallsyms_list_cmp);

This permits running an algorithm (essentially mergesort) with
secret-dependent branches and memory addresses on essentially secret
data, triggerable and arbitrarily repeatable (although with partly
different addresses on each run) by the attacker, and probably a
fairly low throughput (comparisons go through indirect function calls,
which are slowed down by retpolines, and linked list iteration implies
slow pointer chases). Those are fairly favorable conditions for
typical side-channel attacks.

Do you have estimates of how hard it would be to leverage such side
channels to recover function ordering (both on old hardware that only
has microcode fixes for Spectre and such, and on newer hardware with
enhanced IBRS and such)?

> + return 0;
> +}