RE: sched/deadline: warning in migrate_enable for boosted tasks

From: Ma, Jiping
Date: Fri Mar 14 2025 - 06:39:31 EST

-----Original Message-----
From: juri.lelli@xxxxxxxxxx <juri.lelli@xxxxxxxxxx>
Sent: Friday, March 14, 2025 6:06 PM
To: Ma, Jiping <Jiping.Ma2@xxxxxxxxxxxxx>
Cc: gregkh@xxxxxxxxxxxxxxxxxxx; peterz@xxxxxxxxxxxxx; sashal@xxxxxxxxxx; stable@xxxxxxxxxxxxxxx; wander@xxxxxxxxxx; mingo@xxxxxxxxxx; vincent.guittot@xxxxxxxxxx; dietmar.eggemann@xxxxxxx; rostedt@xxxxxxxxxxx; bsegall@xxxxxxxxxx; mgorman@xxxxxxx; vschneid@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; jiping.ma2@xxxxxxxxxxxxxx; Tao, Yue <Yue.Tao@xxxxxxxxxxxxx>
Subject: Re: sched/deadline: warning in migrate_enable for boosted tasks

Hello,

On 14/03/25 04:02, Ma, Jiping wrote:
> Hi, All
>
> We have encountered the kernel warning below. It looks similar to the
> one you are discussing in "[PATCH 6.6 331/356] sched/deadline: Fix
> warning in migrate_enable for boosted tasks" (Greg Kroah-Hartman):
> https://lore.kernel.org/all/20241212144257.639344223@xxxxxxxxxxxxxxxxxxx/
> Do you have any idea what causes it?
>
> kernel: warning [ 998.494702] WARNING: CPU: 19 PID: 217 at
> kernel/sched/deadline.c:277 dequeue_task_dl+0x16c/0x1f0
> kernel: warning [ 998.494705] Modules linked in: iptable_nat ceph
> netfs macvlan igb_uio(O) uio nbd rbd libceph dns_resolver
> nf_conntrack_netlink nfnetlink_queue xt_NFQUEUE xt_set xt_multiport
> ipt_rpfilter ip6t_rpfilter ip_set_hash_net ip_set_hash_ip ip_set veth
> wireguard libchacha20poly1305 chacha_x86_64 poly1305_x86_64
> curve25519_x86_64 libcurve25519_generic libchacha xt_nat xt_MASQUERADE
> xt_mark ipt_REJECT nf_reject_ipv4 nft_chain_nat nf_nat xt_conntrack
> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_comment vfio_pci
> vfio_pci_core binfmt_misc iscsi_target_mod target_core_mod drbd
> dm_crypt trusted asn1_encoder xt_addrtype nft_compat nf_tables
> nfnetlink br_netfilter bridge virtio_net net_failover failover nfsd
> auth_rpcgss nfs_acl lockd grace overlay 8021q garp stp mrp llc xfs
> vfio_iommu_type1 vfio sctp ip6_udp_tunnel udp_tunnel xprtrdma(O)
> svcrdma(O) rpcrdma(O) nvmet_rdma(O) nvme_rdma(O) ib_srp(O) ib_isert(O)
> ib_iser(O) rdma_rxe(O) mlx5_ib(O) mlx5_core(O) mlxfw(O) mlxdevm(O)
> psample tls macsec rdma_ucm(O) rdma_cm(O) iw_cm(O)
> kernel: warning [ 998.494748] ib_uverbs(O) ib_ucm(O) ib_cm(O) lru_cache libcrc32c fuse drm sunrpc efivarfs ip_tables ext4 mbcache jbd2 dm_multipath dm_mod sd_mod t10_pi crc64_rocksoft_generic crc64_rocksoft crc64 iTCO_wdt wmi_bmof iTCO_vendor_support dell_smbios dell_wmi_descriptor ledtrig_audio rfkill video intel_uncore_frequency intel_uncore_frequency_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul bnxt_re(O) crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 ib_core(O) rapl mlx_compat(O) uas intel_cstate intel_uncore acpi_ipmi mei_me ahci i2c_i801 bnxt_en(O) usb_storage i2c_smbus intel_pch_thermal mei libahci intel_vsec wmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter iavf(O) i40e(O) ice(O) [last unloaded: drbd]
> kernel: warning [ 998.494781] CPU: 19 PID: 217 Comm: ktimers/19 Kdump: loaded Tainted: G W O 6.6.0-1-rt-amd64 #1 Debian 6.6.52-1.stx.95
> kernel: warning [ 998.494783] Hardware name: Dell Inc. PowerEdge
> XR11/0P2RNT, BIOS 1.15.2 09/10/2024
> kernel: warning [ 998.494784] RIP: 0010:dequeue_task_dl+0x16c/0x1f0
> kernel: warning [ 998.494786] Code: 48 c7 c7 e0 eb 89 ae c6 05 fd a2
> 6d 01 01 e8 9b ac f9 ff 0f 0b eb 81 48 c7 c7 c5 39 83 ae c6 05 e7 a2
> 6d 01 01 e8 84 ac f9 ff <0f> 0b 48 8b 83 28 09 00 00 49 39 c5 0f 83 53
> ff ff ff 48 c7 83 28
> kernel: warning [ 998.494788] RSP: 0000:ff32fa3140adbca8 EFLAGS:
> 00010082
> kernel: warning [ 998.494790] RAX: 0000000000000000 RBX:
> ff2ef1f93faf23c0 RCX: ff2ef1f93fae0608
> kernel: warning [ 998.494791] RDX: 00000000ffffffd8 RSI:
> 0000000000000027 RDI: ff2ef1f93fae0600
> kernel: warning [ 998.494793] RBP: ff2ef1e9c19eaf80 R08:
> 0000000000000000 R09: ff32fa3140adbc30
> kernel: warning [ 998.494794] R10: 0000000000000001 R11:
> 0000000000000015 R12: 000000000000000e
> kernel: warning [ 998.494795] R13: 0000000000000000 R14:
> 00000000ffffffff R15: 000000000000000e
> kernel: warning [ 998.494796] FS: 0000000000000000(0000)
> GS:ff2ef1f93fac0000(0000) knlGS:0000000000000000
> kernel: warning [ 998.494798] CS: 0010 DS: 0000 ES: 0000 CR0:
> 0000000080050033
> kernel: warning [ 998.494799] CR2: 00000000181b50a0 CR3:
> 000000063db68005 CR4: 0000000000771ee0
> kernel: warning [ 998.494801] DR0: 0000000000000000 DR1:
> 0000000000000000 DR2: 0000000000000000
> kernel: warning [ 998.494802] DR3: 0000000000000000 DR6:
> 00000000fffe0ff0 DR7: 0000000000000400
> kernel: warning [ 998.494803] PKRU: 55555554
> kernel: warning [ 998.494804] Call Trace:
> kernel: warning [ 998.494805] <TASK>
> kernel: warning [ 998.494805] ? __warn+0x89/0x140
> kernel: warning [ 998.494808] ? dequeue_task_dl+0x16c/0x1f0
> kernel: warning [ 998.494810] ? report_bug+0x198/0x1b0
> kernel: warning [ 998.494814] ? handle_bug+0x3c/0x70
> kernel: warning [ 998.494816] ? exc_invalid_op+0x18/0x70
> kernel: warning [ 998.494818] ? asm_exc_invalid_op+0x1a/0x20
> kernel: warning [ 998.494821] ? dequeue_task_dl+0x16c/0x1f0
> kernel: warning [ 998.494823] ? dequeue_task_dl+0x16c/0x1f0
> kernel: warning [ 998.494825] rt_mutex_setprio+0x240/0x460
> kernel: warning [ 998.494828] rt_mutex_slowunlock+0x143/0x280
> kernel: warning [ 998.494831] ? __pfx_mce_timer_fn+0x10/0x10
> kernel: warning [ 998.494833] mce_timer_fn+0x90/0xe0
> kernel: warning [ 998.494835] ? __pfx_mce_timer_fn+0x10/0x10
> kernel: warning [ 998.494837] call_timer_fn+0x24/0x130
> kernel: warning [ 998.494840] expire_timers+0xd3/0x1c0

Not sure it's the same issue; the stack trace looks different. Anyway, could you verify whether the issue still reproduces with the latest v6.6-rt stable (v6.6.78-rt51 at the time of writing)?

It looks like you are on v6.6-rt, hence the question.

[Ma, Jiping] Yes, we are on v6.6.52-rt. The warning is not easy to reproduce, but we will try to test with v6.6.78-rt51.

Thanks,
Juri
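
For anyone checking whether a machine is older than the suggested v6.6.78-rt51, here is a minimal sketch in Python. It is only an illustration: the helper names parse_rt_release() and is_older() are made up for this example, and it compares only the base version (major.minor.patch). Note that on the reporter's system the `uname -r` style string is "6.6.0-1-rt-amd64", which does not carry the stable patch level; the Debian package version from the log banner ("6.6.52-1.stx.95") is assumed to be the string to compare.

```python
import re


def parse_rt_release(release):
    """Parse strings such as 'v6.6.78-rt51' or '6.6.52-1.stx.95'.

    Returns (major, minor, patch, rt); rt is None when the string has
    no explicit '-rtN' suffix right after the base version.
    """
    m = re.match(r"v?(\d+)\.(\d+)\.(\d+)(?:-rt(\d+))?", release)
    if m is None:
        raise ValueError(f"unrecognised release string: {release!r}")
    major, minor, patch, rt = m.groups()
    return int(major), int(minor), int(patch), (int(rt) if rt else None)


def is_older(running, target):
    """True if the base version of 'running' is lower than that of 'target'.

    Only (major, minor, patch) is compared; the -rt revision is ignored.
    """
    return parse_rt_release(running)[:3] < parse_rt_release(target)[:3]


if __name__ == "__main__":
    # Package version taken from the log banner vs. the suggested release.
    print(is_older("6.6.52-1.stx.95", "v6.6.78-rt51"))
```

The -rt revision is deliberately left out of the comparison, since a vendor package string may not include it.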