On Tue, Sep 24, 2024 at 5:00 AM Aboorva Devarajan
<aboorvad@xxxxxxxxxxxxx> wrote:
On Tue, 2024-09-24 at 10:03 +0200, Alexei Starovoitov wrote:
On Mon, Sep 23, 2024 at 8:21 PM Tejun Heo <tj@xxxxxxxxxx> wrote:
Hello,
(cc'ing Alexei and Andrii for the BPF part)
On Mon, Sep 23, 2024 at 08:26:32PM +0530, Aboorva Devarajan wrote:
Sharing the crash logs observed in PowerPC here for general reference, FYI:
[ 8638.891964] Kernel attempted to read user page (a8) - exploit attempt? (uid: 0)
[ 8638.892002] BUG: Kernel NULL pointer dereference on read at 0x000000a8
[ 8638.892019] Faulting instruction address: 0xc0000000004e7cc0
[ 8638.892038] Oops: Kernel access of bad area, sig: 11 [#1]
[ 8638.892060] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
[ 8638.892080] Modules linked in: nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype
br_netfilter xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp
ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6
nf_defrag_ipv4 ebtable_filter ebtables vhost_vsock vmw_vsock_virtio_transport_common
ip6table_filter ip6_tables vhost vhost_iotlb iptable_filter vsock bridge stp llc kvm_hv kvm joydev
input_leds mac_hid at24 ofpart cmdlinepart uio_pdrv_genirq ibmpowernv opal_prd ipmi_powernv
powernv_flash uio binfmt_misc sch_fq_codel nfsd mtd ipmi_devintf ipmi_msghandler auth_rpcgss
jc42 ramoops reed_solomon ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov
async_memcpy async_pq async_xor async_tx raid1 raid0 dm_mirror dm_region_hash dm_log mlx5_ib ib_uverbs
ib_core mlx5_core hid_generic usbhid hid ast i2c_algo_bit drm_shmem_helper drm_kms_helper
vmx_crypto drm mlxfw crct10dif_vpmsum crc32c_vpmsum psample tls tg3 ahci libahci
drm_panel_orientation_quirks
[ 8638.892621] CPU: 62 UID: 0 PID: 5591 Comm: kworker/62:2 Not tainted 6.11.0-rc4+ #2
[ 8638.892663] Hardware name: 8335-GTW POWER9 0x4e1203 opal:skiboot-v6.5.3-35-g1851b2a06 PowerNV
[ 8638.892693] Workqueue: events bpf_prog_free_deferred
[ 8638.892735] NIP: c0000000004e7cc0 LR: c0000000004e7bbc CTR: c0000000003a9b30
[ 8638.892798] REGS: c000000ea4cbf7f0 TRAP: 0300 Not tainted (6.11.0-rc4+)
[ 8638.892862] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 42a00284 XER: 00000000
[ 8638.892915] CFAR: c0000000004e7bb8 DAR: 00000000000000a8 DSISR: 40000000 IRQMASK: 1
[ 8638.892915] GPR00: c0000000004e7bbc c000000ea4cbfa90 c000000002837f00 0000000000000005
[ 8638.892915] GPR04: 0000000000000015 0000000000000009 0000000000000009 c000000004840b00
[ 8638.892915] GPR08: ffffffffffffffff 00000000ffffe000 ffffffffffffffff 000001937b55db50
[ 8638.892915] GPR12: 0000000000200000 c000007ffdfac300 c0000000031b1fc8 0000000000010000
[ 8638.892915] GPR16: c00000000000018e 000000007fffffff 0000000000000000 000000000000e1c0
[ 8638.892915] GPR20: 61c8864680b583eb 0000000000000000 0000000000000000 00000000000de1d5
[ 8638.892915] GPR24: 0000000000000000 c000000003da4408 c000000003da4400 c000000003da43f8
[ 8638.892915] GPR28: 0000000000000000 0000000000000000 0000000000000000 c000000ea4cbfa90
[ 8638.893350] NIP [c0000000004e7cc0] walk_to_pmd+0x80/0x240
With "BUG: Kernel NULL pointer dereference on read at 0x000000a8" (from above),
it appears bpf_arch_text_invalidate() is racing with text_area_cpu_down_mm(),
which sets cpu_patching_context.mm to NULL?
Am I going in the right direction?
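
To spell out the reasoning behind reading 0xa8 as a NULL base pointer: a load of a struct field through a pointer touches base + offset of that field, so with a NULL base the faulting address is just the field offset. Below is a minimal sketch of that arithmetic. The FakeMM layout, its padding, and the field name pgd are hypothetical and chosen only so the offset comes out to 0xa8; this is not the real struct mm_struct layout, and the actual offset would need to be checked against the build (for example with pahole).

```python
import ctypes

# DAR value reported in the oops above.
FAULT_ADDR = 0xA8


class FakeMM(ctypes.Structure):
    # Hypothetical layout, NOT the real struct mm_struct. The padding is
    # sized so that the pointer field lands at offset 0xa8 for illustration.
    _fields_ = [
        ("_pad", ctypes.c_ubyte * FAULT_ADDR),
        ("pgd", ctypes.c_void_p),
    ]


def fault_address(base, field_offset):
    """Address a load touches when reading a field through `base`."""
    return base + field_offset


# With base == 0 (NULL), the faulting address equals the field offset.
null_base_fault = fault_address(0, FakeMM.pgd.offset)
print(hex(null_base_fault))
```

If the real offset of the field read in walk_to_pmd() matches 0xa8 on this build, that would be consistent with the mm pointer being NULL at the time of the access; it would not by itself confirm which path cleared it.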