panic in rb_next during pick_next_task_fair

From: 蒋雄伟(蒋冲)
Date: Tue Nov 08 2016 - 05:02:57 EST


Hi all,

We encountered the following panic in 2.6.32 and 3.10.0:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [] rb_next+0x1/0x50

Call Trace:
? pick_next_task_fair+0x129/0x1d0
[] __schedule+0x12a/0x910
[] schedule+0x29/0x70
[] futex_wait_queue_me+0xc4/0x120
[] futex_wait+0x179/0x280
[] ? hrtimer_get_res+0x50/0x50
[] ? futex_wait_queue_me+0xa2/0x120
[] do_futex+0xfe/0x5b0
[] ? blk_finish_plug+0x14/0x40
[] ? SyS_madvise+0x3c5/0x820
[] SyS_futex+0x80/0x180
[] system_call_fastpath+0x16/0x1b


It seems that even after the check "if (!cfs_rq->nr_running)" passes,
__pick_first_entity() sometimes still returns NULL, which then leads to
the panic.
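
To make the suspicion concrete, here is a minimal user-space sketch, not
kernel code. All the *_sketch types and the helper names are hypothetical
stand-ins, and the layout (two unsigned long words in front of run_node) is
an assumption about struct sched_entity. It models a runqueue whose
nr_running counter is non-zero while the tree is empty, and shows where
&se->run_node would point if se were NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the real kernel structures have more fields. */
struct rb_node_sketch {
    unsigned long parent_color;
    struct rb_node_sketch *right;
    struct rb_node_sketch *left;
};

struct sched_entity_sketch {
    unsigned long weight;      /* assumed: two words precede run_node */
    unsigned long inv_weight;
    struct rb_node_sketch run_node;
};

struct cfs_rq_sketch {
    unsigned long nr_running;         /* counter checked by the caller */
    struct rb_node_sketch *leftmost;  /* cached leftmost tree node */
};

/* Modeled on __pick_first_entity(): NULL when there is no leftmost node. */
static struct sched_entity_sketch *pick_first(const struct cfs_rq_sketch *rq)
{
    if (!rq->leftmost)
        return NULL;
    return (struct sched_entity_sketch *)
        ((char *)rq->leftmost - offsetof(struct sched_entity_sketch, run_node));
}

/* Offset at which &se->run_node would land if se were NULL. */
static size_t null_entity_run_node_offset(void)
{
    return offsetof(struct sched_entity_sketch, run_node);
}

/* The counter check alone does not guarantee a non-NULL entity. */
static struct sched_entity_sketch *pick_first_checked(const struct cfs_rq_sketch *rq)
{
    if (!rq->nr_running)
        return NULL;
    /* May still be NULL if nr_running and the tree disagree. */
    return pick_first(rq);
}

/* Small self-check of the two properties the sketch is meant to show. */
static void self_check(void)
{
    struct cfs_rq_sketch rq = { .nr_running = 1, .leftmost = NULL };

    assert(pick_first_checked(&rq) == NULL);
    assert(null_entity_run_node_offset() == 2 * sizeof(unsigned long));
}
```

On a 64-bit build the offset in this sketch is 16 bytes (0x10), which would
be consistent with the faulting address in the oops if the real layout
matches the assumption. That is only one possible reading of the trace.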

Could you please give us any advice?
Thanks a lot.

Best regards.