Re: [BUG]kernel softlockup due to sidtab_search_context run for long time because of too many sidtab context node

From: Stephen Smalley
Date: Thu Dec 14 2017 - 08:17:30 EST


On Thu, 2017-12-14 at 03:19 +0000, yangjihong wrote:
> Hello,
>
> > So, does docker just keep allocating a unique category set for
> > every new container, never reusing them even if the container is
> > destroyed?
> > That would be a bug in docker IMHO.  Or are you creating an
> > unbounded number of containers and never destroying the older ones?
>
> I create a container, then destroy it, then create a second one,
> destroy it, and so on.
> When a container is created, docker mounts an overlay fs; because
> every container has a different selinux context, a new sidtab node
> is generated and inserted into the sidtab list.
> When the container is destroyed, docker unmounts the overlay fs, but
> the umount operation does not appear to call any hook that deletes
> the node, so the sidtab list grows longer and longer.
> I think that after umount, its selinux context will never reuse, so
> the sidtab node is useless and it is best to delete it.

The "selinux context will never reuse" is IMHO a bug in docker; if you
truly destroy the container (i.e. don't just stop its execution, but
delete it entirely), then the context should be reusable.
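
To illustrate what "reusable" means here, a minimal userspace sketch of
a category-pair allocator that returns a pair to the pool when its
container is deleted. This is not docker's actual code: NUM_CATS,
cat_pair_alloc() and cat_pair_release() are hypothetical names, and the
small category range is chosen only to keep the example short.

```c
#include <stdbool.h>

#define NUM_CATS 16 /* hypothetical; real MCS policies use a larger range */

struct cat_pair {
	int lo, hi; /* category numbers, lo < hi, as in "s0:cLO,cHI" */
};

/* in_use[i][j] is true while the pair (i, j), i < j, belongs to a container */
static bool in_use[NUM_CATS][NUM_CATS];

/* Hand out the lowest free pair; returns 0 on success, -1 if exhausted. */
static int cat_pair_alloc(struct cat_pair *out)
{
	for (int i = 0; i < NUM_CATS; i++) {
		for (int j = i + 1; j < NUM_CATS; j++) {
			if (!in_use[i][j]) {
				in_use[i][j] = true;
				out->lo = i;
				out->hi = j;
				return 0;
			}
		}
	}
	return -1;
}

/* Called when the container is deleted, so the pair can be handed out again. */
static void cat_pair_release(struct cat_pair p)
{
	if (p.lo >= 0 && p.lo < p.hi && p.hi < NUM_CATS)
		in_use[p.lo][p.hi] = false;
}
```

With release wired into container deletion, a create/destroy loop keeps
getting the same few pairs back, so the kernel sees a bounded set of
contexts and the sidtab stops growing.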

> > sidtab_search_context() could no doubt be optimized for the
> > negative case; there was an earlier optimization for the positive
> > case by adding a cache to sidtab_context_to_sid() prior to calling
> > it.  It's a reverse lookup in the sidtab.
>
> I think adding a cache may not be very useful: because every
> container has a different selinux context, creating a container
> searches the whole sidtab list up to the last node. Whenever a new
> node arrives, it must first be compared against all existing nodes
> and only then inserted.
> As long as nodes are never deleted from the list, the list keeps
> growing and the search takes longer and longer, eventually leading
> to a softlockup.
>
>
> Is there any solution to this problem?

On the kernel side, we could certainly implement a reverse lookup hash
table. And there could be a faster way to quickly check whether a
given category set has ever been used if we wanted to specialize in
that manner. But that won't fix the fact that docker is allocating
unbounded security contexts.
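
As a rough illustration of the reverse lookup idea, here is a
userspace sketch, not kernel code and not a proposed patch. It keys on
the context string for simplicity, whereas the kernel compares struct
context; rev_table, rev_init(), rev_context_to_sid(), ctx_hash() and
REV_BUCKETS are all hypothetical names. A miss only walks one bucket
chain instead of the whole table.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define REV_BUCKETS 1024u /* hypothetical size; must be a power of two */

struct rev_node {
	char *ctx;             /* context string; stands in for struct context */
	uint32_t sid;
	struct rev_node *next; /* collision chain */
};

struct rev_table {
	struct rev_node *buckets[REV_BUCKETS];
	uint32_t next_sid;     /* 0 is reserved here to mean "no SID" */
};

static void rev_init(struct rev_table *t)
{
	memset(t, 0, sizeof(*t));
	t->next_sid = 1;
}

/* 32-bit FNV-1a over the context string */
static uint32_t ctx_hash(const char *s)
{
	uint32_t h = 2166136261u;

	while (*s) {
		h ^= (unsigned char)*s++;
		h *= 16777619u;
	}
	return h;
}

/* Return the SID for ctx, allocating a new one on a miss; 0 on failure. */
static uint32_t rev_context_to_sid(struct rev_table *t, const char *ctx)
{
	uint32_t b = ctx_hash(ctx) & (REV_BUCKETS - 1);
	struct rev_node *n;
	size_t len;

	/* Positive and negative lookups both scan a single chain. */
	for (n = t->buckets[b]; n; n = n->next)
		if (strcmp(n->ctx, ctx) == 0)
			return n->sid;

	len = strlen(ctx) + 1;
	n = malloc(sizeof(*n));
	if (!n)
		return 0;
	n->ctx = malloc(len);
	if (!n->ctx) {
		free(n);
		return 0;
	}
	memcpy(n->ctx, ctx, len);
	n->sid = t->next_sid++;
	n->next = t->buckets[b];
	t->buckets[b] = n;
	return n->sid;
}
```

Note that nothing is ever removed from this table either: it makes each
lookup cheaper, but memory use still grows with every new context,
which is why the allocation side needs fixing regardless.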