Re: Process Hang in __read_seqcount_begin

From: Peter LaDow
Date: Fri Oct 26 2012 - 17:25:45 EST


On Fri, Oct 26, 2012 at 2:05 PM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
> Do you know what is per cpu data in linux kernel ?

I sorta did. But since your response, I did more reading, and now I
see what you mean. But I don't think this is a per cpu issue. More
below.

> Because its not needed. Really I dont know why you want that.
>
> Once you are sure a thread cannot be interrupted by a softirq, and
> cannot migrate to another cpu, access to percpu data doesnt need other
> synchronization at all.

Because there are multiple entry points on the same CPU. In
net/ipv4/netfilter/ip_tables, there are two entries to
xt_write_recseq_begin(). The first is in ipt_do_table and the other
is in get_counters. Where we are seeing the lockup is with a
getsockopt syscall leading to do_counters. The other path is through
ipt_do_table, which is installed as a hook. I'm not sure from what
context the hooks are called, but it is certainly from a different
context than the syscall.

> Following sequence is safe :
>
> addend = (__this_cpu_read(xt_recseq.sequence) + 1) & 1;
> /*
> * This is kind of a write_seqcount_begin(), but addend is 0 or 1
> * We dont check addend value to avoid a test and conditional jump,
> * since addend is most likely 1
> */
> __this_cpu_add(xt_recseq.sequence, addend);

If this were safe, we wouldn't be seeing this lockup and your patch
wouldn't be needed. So it seems that your patch doesn't really
address the issue that we are not "sure a thread cannot be interrupted
by a softirq, and cannot migrate to another cpu". Well, we know it
cannot migrate to another CPU, because there isn't another CPU. So
apparently, it can be interrupted by a softirq. So local_bh_disable
isn't doing anything useful in the RT patches with regard to this.

As I mentioned earlier, I think perhaps what your patch did was ensure
an atomic update of the sequence counter. But it does nothing to
synchronize two writers. If they were already synchronized (such as
via the call to local_bh_disable), then we wouldn't see sequence
counter corruption, no?

Pete
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/