Re: use-after-free in sock_wake_async
From: Hannes Frederic Sowa
Date: Thu Nov 26 2015 - 12:03:26 EST
On Thu, Nov 26, 2015, at 16:51, Eric Dumazet wrote:
> On Thu, 2015-11-26 at 14:32 +0100, Hannes Frederic Sowa wrote:
> > Hannes Frederic Sowa <hannes@xxxxxxxxxxxxxxxxxxx> writes:
> >
> >
> > > I have seen filesystems already doing so in .destroy_inode, that's why I
> > > am asking. The allocation happens the same way as we do with sock_alloc,
> > > e.g. shmem. I actually thought that struct inode already provides an
> > > rcu_head for exactly that reason.
> >
> > E.g.:
>
> > +static void sock_destroy_inode(struct inode *inode)
> > +{
> > + call_rcu(&inode->i_rcu, sock_cache_free_rcu);
> > +}
>
> I guess you missed few years back why we had to implement
> SLAB_DESTROY_BY_RCU for TCP sockets to not destroy performance.
I think I wasn't even subscribed to netdev@ at that time, so I probably
missed it. A 'few years back' is seven by now. :}
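(For anyone else who missed that discussion: as far as I understand it,
the pattern is roughly the sketch below. The foo_* names and helpers are
made up for illustration, not the real TCP hash code. The point is that
objects from a SLAB_DESTROY_BY_RCU cache may be freed and reused for a
new object of the same type without waiting for a grace period, so
lookups take a reference and revalidate instead.)

struct foo {
	atomic_t	refcnt;
	int		key;
};

static struct kmem_cache *foo_cachep;	/* created with SLAB_DESTROY_BY_RCU */

static struct foo *foo_lookup(int key)
{
	struct foo *f;

	rcu_read_lock();
	f = foo_hash_find(key);			/* hypothetical hash lookup */
	if (f && !atomic_inc_not_zero(&f->refcnt))
		f = NULL;			/* object is being freed, lost the race */
	rcu_read_unlock();

	if (f && f->key != key) {		/* memory was reused for a new object */
		foo_put(f);			/* hypothetical release helper */
		f = NULL;
	}
	return f;
}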
> By adding RCU grace period before reuse of this inode (about 640 bytes
> today), you are asking the CPU to evict from its cache precious content,
> and slow down some workloads, adding lot of ram pressure, as the cpu
> allocating a TCP socket will have to populate its cache for a cold
> inode.
My rationale was this: we already use RCU to free the wq, so we wouldn't
add any more callbacks than the current code has. struct socket_alloc is
1136 bytes right now, which is huge, about 18 cachelines, so I wouldn't
think it matters a lot as we thrash the cache anyway. tcp_sock is about
45 cachelines right now, whew.
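To make the first point concrete, this is roughly what the destroy path
looks like today (simplified, from memory of net/socket.c), next to the
variant I am speculating about; sock_cache_free_rcu is the hypothetical
callback from my snippet above:

/* today: one RCU callback, but only the small wq object waits for it */
static void sock_destroy_inode(struct inode *inode)
{
	struct socket_alloc *ei;
	struct socket_wq *wq;

	ei = container_of(inode, struct socket_alloc, vfs_inode);
	wq = rcu_dereference_protected(ei->socket.wq, 1);
	kfree_rcu(wq, rcu);
	kmem_cache_free(sock_inode_cachep, ei);
}

/* speculated variant: still one callback, but the whole socket_alloc
 * (inode included) now waits for the grace period before reuse
 */
static void sock_cache_free_rcu(struct rcu_head *head)
{
	struct inode *inode = container_of(head, struct inode, i_rcu);
	struct socket_alloc *ei = container_of(inode, struct socket_alloc,
						vfs_inode);

	kfree(rcu_dereference_protected(ei->socket.wq, 1));
	kmem_cache_free(sock_inode_cachep, ei);
}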
Also, isn't the reason slub exists that it can track memory regions
per-cpu?
Anyway, I am only speculating about why this could be worth trying. I
probably need to do some performance experiments.
> The reason we put in a small object the RCU protected fields should be
> pretty clear.
Yes, I thought about that.
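(That small object is struct socket_wq; from memory it is roughly the
following, so only the wait queue, the fasync list and the rcu_head pay
for the grace period, not the whole inode:)

struct socket_wq {
	/* Note: wait MUST be first field of socket_wq */
	wait_queue_head_t	wait;
	struct fasync_struct	*fasync_list;
	struct rcu_head		rcu;
} ____cacheline_aligned_in_smp;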
> Do not copy code that people wrote in other layers without understanding
> the performance implications.
Duuh. :)
Bye,
Hannes