In the Linux kernel, the following vulnerability has been resolved:
amd/amdkfd: enhance kfd process check in switch partition
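The current partition switch only checks whether kfd_processes_table is empty. A kfd_processes_table entry is deleted in kfd_process_notifier_release, but the kfd_process itself is torn down in kfd_process_wq_release, so an empty table does not guarantee that every process has finished its teardown.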
Consider two processes:
Process A (workqueue) -> kfd_process_wq_release -> accesses a kfd_node member
Process B (switch partition) -> amdgpu_xcp_pre_partition_switch -> amdgpu_amdkfd_device_fini_sw -> kfd_node teardown
Processes A and B may race, as shown in the dmesg log below.
This patch resolves the race by adding an atomic kfd_process counter, kfd_processes_count, which is incremented when a kfd process is created and decremented when kfd_process_wq_release finishes.
v2: Make kfd_processes_count per kfd_dev, move the decrement to kfd_process_destroy_pdds, and fix a bug. (Philip Yang)
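The difference between the two checks can be illustrated with a small userspace sketch in C11. This is not the kernel patch: `struct demo_dev` and every `demo_*` function are hypothetical stand-ins, the two release functions only mimic the split between kfd_process_notifier_release and the deferred kfd_process_wq_release, and the walk-through is single-threaded, so it shows the window rather than the actual race or any locking the real driver uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-device state; not the real struct kfd_dev. */
struct demo_dev {
    atomic_int processes_count; /* processes created and not yet fully torn down */
    int table_entries;          /* entries still visible in the lookup table
                                   (plain int: this sketch is single-threaded) */
};

static void demo_dev_init(struct demo_dev *dev)
{
    atomic_init(&dev->processes_count, 0);
    dev->table_entries = 0;
}

/* Process creation: publish the entry and count the process. */
static void demo_process_create(struct demo_dev *dev)
{
    dev->table_entries++;
    atomic_fetch_add(&dev->processes_count, 1);
}

/* Phase 1 (notifier-release analogue): the entry leaves the table,
 * but teardown that touches device state is still pending. */
static void demo_notifier_release(struct demo_dev *dev)
{
    assert(dev->table_entries > 0);
    dev->table_entries--;
}

/* Phase 2 (deferred-release analogue): device state is no longer
 * accessed after this point, so the count can be dropped. */
static void demo_deferred_release(struct demo_dev *dev)
{
    int prev = atomic_fetch_sub(&dev->processes_count, 1);
    assert(prev > 0);
    (void)prev; /* avoid an unused-variable warning when NDEBUG is set */
}

/* Old-style check: only looks at the table. */
static bool demo_old_check_allows_switch(struct demo_dev *dev)
{
    return dev->table_entries == 0;
}

/* New-style check: waits until every counted process has finished teardown. */
static bool demo_new_check_allows_switch(struct demo_dev *dev)
{
    return atomic_load(&dev->processes_count) == 0;
}
```

Calling `demo_process_create` followed by `demo_notifier_release` leaves the table empty while the counter is still 1: the table-only check would permit the switch, whereas the counter-based check refuses until `demo_deferred_release` has run. The sketch collapses the decrement into a single deferred step; per the v2 note, the patch places it in kfd_process_destroy_pdds.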
[3966658.307702] divide error: 0000 [#1] SMP NOPTI
[3966658.350818] i10nm_edac
[3966658.356318] CPU: 124 PID: 38435 Comm: kworker/124:0 Kdump: loaded Tainted
[3966658.356890] Workqueue: kfd_process_wq kfd_process_wq_release [amdgpu]
[3966658.362839] nfit
[3966658.366457] RIP: 0010:kfd_get_num_sdma_engines+0x17/0x40 [amdgpu]
[3966658.366460] Code: 00 00 e9 ac 81 02 00 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 48 8b 4f 08 48 8b b7 00 01 00 00 8b 81 58 26 03 00 99 be b8 01 00 00 80 b9 70 2e 00 00 00 74 0b 83 f8 02 ba 02 00 00
[3966658.380967] x86_pkg_temp_thermal
[3966658.391529] RSP: 0018:ffffc900a0edfdd8 EFLAGS: 00010246
[3966658.391531] RAX: 0000000000000008 RBX: ffff8974e593b800 RCX: ffff888645900000
[3966658.391531] RDX: 0000000000000000 RSI: ffff888129154400 RDI: ffff888129151c00
[3966658.391532] RBP: ffff8883ad79d400 R08: 0000000000000000 R09: ffff8890d2750af4
[3966658.391532] R10: 0000000000000018 R11: 0000000000000018 R12: 0000000000000000
[3966658.391533] R13: ffff8883ad79d400 R14: ffffe87ff662ba00 R15: ffff8974e593b800
[3966658.391533] FS: 0000000000000000(0000) GS:ffff88fe7f600000(0000) knlGS:0000000000000000
[3966658.391534] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[3966658.391534] CR2: 0000000000d71000 CR3: 000000dd0e970004 CR4: 0000000002770ee0
[3966658.391535] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[3966658.391535] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[3966658.391536] PKRU: 55555554
[3966658.391536] Call Trace:
[3966658.391674] deallocate_sdma_queue+0x38/0xa0 [amdgpu]
[3966658.391762] process_termination_cpsch+0x1ed/0x480 [amdgpu]
[3966658.399754] intel_powerclamp
[3966658.402831] kfd_process_dequeue_from_all_devices+0x5b/0xc0 [amdgpu]
[3966658.402908] kfd_process_wq_release+0x1a/0x1a0 [amdgpu]
[3966658.410516] coretemp
[3966658.434016] process_one_work+0x1ad/0x380
[3966658.434021] worker_thread+0x49/0x310
[3966658.438963] kvm_intel
[3966658.446041] ? process_one_work+0x380/0x380
[3966658.446045] kthread+0x118/0x140
[3966658.446047] ? __kthread_bind_mask+0x60/0x60
[3966658.446050] ret_from_fork+0x1f/0x30
[3966658.446053] Modules linked in: kpatch_20765354(OEK)
[3966658.455310] kvm
[3966658.464534] mptcp_diag xsk_diag raw_diag unix_diag af_packet_diag netlink_diag udp_diag act_pedit act_mirred act_vlan cls_flower kpatch_21951273(OEK) kpatch_18424469(OEK) kpatch_19749756(OEK)
[3966658.473462] idxd_mdev
[3966658.482306] kpatch_17971294(OEK) sch_ingress xt_conntrack amdgpu(OE) amdxcp(OE) amddrm_buddy(OE) amd_sched(OE) amdttm(OE) amdkcl(OE) intel_ifs iptable_mangle tcm_loop target_core_pscsi tcp_diag target_core_file inet_diag target_core_iblock target_core_user target_core_mod coldpgs kpatch_18383292(OEK) ip6table_nat ip6table_filter ip6_tables ip_set_hash_ipportip ip_set_hash_ipportnet ip_set_hash_ipport ip_set_bitmap_port xt_comment iptable_nat nf_nat iptable_filter ip_tables ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 sn_core_odd(OE) i40e overlay binfmt_misc tun bonding(OE) aisqos(OE) aisqo [truncated]
| Name | Vendor | Start Version | End Version |
|---|---|---|---|
| Linux-allwinner-5.19 | Ubuntu | jammy | * |
| Linux-allwinner-5.19 | Ubuntu | upstream | * |
| Linux-aws-5.0 | Ubuntu | esm-infra/bionic | * |
| Linux-aws-5.0 | Ubuntu | upstream | * |
| Linux-aws-5.11 | Ubuntu | esm-infra/focal | * |
| Linux-aws-5.11 | Ubuntu | upstream | * |
| Linux-aws-5.13 | Ubuntu | esm-infra/focal | * |
| Linux-aws-5.13 | Ubuntu | upstream | * |
| Linux-aws-5.19 | Ubuntu | jammy | * |
| Linux-aws-5.19 | Ubuntu | upstream | * |
| Linux-aws-5.3 | Ubuntu | esm-infra/bionic | * |
| Linux-aws-5.3 | Ubuntu | upstream | * |
| Linux-aws-5.8 | Ubuntu | esm-infra/focal | * |
| Linux-aws-5.8 | Ubuntu | upstream | * |
| Linux-aws-6.2 | Ubuntu | jammy | * |
| Linux-aws-6.2 | Ubuntu | upstream | * |
| Linux-aws-6.5 | Ubuntu | jammy | * |
| Linux-aws-6.5 | Ubuntu | upstream | * |
| Linux-azure | Ubuntu | esm-infra/bionic | * |
| Linux-azure-5.11 | Ubuntu | esm-infra/focal | * |
| Linux-azure-5.11 | Ubuntu | upstream | * |
| Linux-azure-5.13 | Ubuntu | esm-infra/focal | * |
| Linux-azure-5.13 | Ubuntu | upstream | * |
| Linux-azure-5.19 | Ubuntu | jammy | * |
| Linux-azure-5.19 | Ubuntu | upstream | * |
| Linux-azure-5.3 | Ubuntu | esm-infra/bionic | * |
| Linux-azure-5.3 | Ubuntu | upstream | * |
| Linux-azure-5.8 | Ubuntu | esm-infra/focal | * |
| Linux-azure-5.8 | Ubuntu | upstream | * |
| Linux-azure-6.11 | Ubuntu | noble | * |
| Linux-azure-6.11 | Ubuntu | upstream | * |
| Linux-azure-6.2 | Ubuntu | jammy | * |
| Linux-azure-6.2 | Ubuntu | upstream | * |
| Linux-azure-6.5 | Ubuntu | jammy | * |
| Linux-azure-6.5 | Ubuntu | upstream | * |
| Linux-azure-edge | Ubuntu | esm-infra/bionic | * |
| Linux-azure-edge | Ubuntu | upstream | * |
| Linux-azure-fde | Ubuntu | esm-infra/focal | * |
| Linux-azure-fde-5.19 | Ubuntu | jammy | * |
| Linux-azure-fde-5.19 | Ubuntu | upstream | * |
| Linux-azure-fde-6.2 | Ubuntu | jammy | * |
| Linux-azure-fde-6.2 | Ubuntu | upstream | * |
| Linux-gcp | Ubuntu | esm-infra/bionic | * |
| Linux-gcp-5.11 | Ubuntu | esm-infra/focal | * |
| Linux-gcp-5.11 | Ubuntu | upstream | * |
| Linux-gcp-5.13 | Ubuntu | esm-infra/focal | * |
| Linux-gcp-5.13 | Ubuntu | upstream | * |
| Linux-gcp-5.19 | Ubuntu | jammy | * |
| Linux-gcp-5.19 | Ubuntu | upstream | * |
| Linux-gcp-5.3 | Ubuntu | esm-infra/bionic | * |
| Linux-gcp-5.3 | Ubuntu | upstream | * |
| Linux-gcp-5.8 | Ubuntu | esm-infra/focal | * |
| Linux-gcp-5.8 | Ubuntu | upstream | * |
| Linux-gcp-6.11 | Ubuntu | noble | * |
| Linux-gcp-6.11 | Ubuntu | upstream | * |
| Linux-gcp-6.2 | Ubuntu | jammy | * |
| Linux-gcp-6.2 | Ubuntu | upstream | * |
| Linux-gcp-6.5 | Ubuntu | jammy | * |
| Linux-gcp-6.5 | Ubuntu | upstream | * |
| Linux-gke | Ubuntu | esm-infra/focal | * |
| Linux-gke-4.15 | Ubuntu | esm-infra/bionic | * |
| Linux-gke-4.15 | Ubuntu | upstream | * |
| Linux-gke-5.15 | Ubuntu | esm-infra/focal | * |
| Linux-gke-5.15 | Ubuntu | upstream | * |
| Linux-gke-5.4 | Ubuntu | esm-infra/bionic | * |
| Linux-gke-5.4 | Ubuntu | upstream | * |
| Linux-gkeop | Ubuntu | esm-infra/focal | * |
| Linux-gkeop-5.15 | Ubuntu | esm-infra/focal | * |
| Linux-gkeop-5.4 | Ubuntu | esm-infra/bionic | * |
| Linux-gkeop-5.4 | Ubuntu | upstream | * |
| Linux-hwe | Ubuntu | esm-infra/bionic | * |
| Linux-hwe-5.11 | Ubuntu | esm-infra/focal | * |
| Linux-hwe-5.11 | Ubuntu | upstream | * |
| Linux-hwe-5.13 | Ubuntu | esm-infra/focal | * |
| Linux-hwe-5.13 | Ubuntu | upstream | * |
| Linux-hwe-5.19 | Ubuntu | jammy | * |
| Linux-hwe-5.19 | Ubuntu | upstream | * |
| Linux-hwe-5.8 | Ubuntu | esm-infra/focal | * |
| Linux-hwe-5.8 | Ubuntu | upstream | * |
| Linux-hwe-6.11 | Ubuntu | noble | * |
| Linux-hwe-6.11 | Ubuntu | upstream | * |
| Linux-hwe-6.2 | Ubuntu | jammy | * |
| Linux-hwe-6.2 | Ubuntu | upstream | * |
| Linux-hwe-6.5 | Ubuntu | jammy | * |
| Linux-hwe-6.5 | Ubuntu | upstream | * |
| Linux-hwe-edge | Ubuntu | esm-infra/bionic | * |
| Linux-hwe-edge | Ubuntu | esm-infra/xenial | * |
| Linux-hwe-edge | Ubuntu | upstream | * |
| Linux-intel-5.13 | Ubuntu | esm-infra/focal | * |
| Linux-intel-5.13 | Ubuntu | upstream | * |
| Linux-intel-iot-realtime | Ubuntu | jammy | * |
| Linux-lowlatency-hwe-5.19 | Ubuntu | jammy | * |
| Linux-lowlatency-hwe-5.19 | Ubuntu | upstream | * |
| Linux-lowlatency-hwe-6.11 | Ubuntu | noble | * |
| Linux-lowlatency-hwe-6.11 | Ubuntu | upstream | * |
| Linux-lowlatency-hwe-6.2 | Ubuntu | jammy | * |
| Linux-lowlatency-hwe-6.2 | Ubuntu | upstream | * |
| Linux-lowlatency-hwe-6.5 | Ubuntu | jammy | * |
| Linux-lowlatency-hwe-6.5 | Ubuntu | upstream | * |
| Linux-nvidia-6.11 | Ubuntu | noble | * |
| Linux-nvidia-6.11 | Ubuntu | upstream | * |
| Linux-nvidia-6.2 | Ubuntu | jammy | * |
| Linux-nvidia-6.2 | Ubuntu | upstream | * |
| Linux-nvidia-6.5 | Ubuntu | jammy | * |
| Linux-nvidia-6.5 | Ubuntu | upstream | * |
| Linux-oem | Ubuntu | esm-infra/bionic | * |
| Linux-oem | Ubuntu | upstream | * |
| Linux-oem-5.10 | Ubuntu | esm-infra/focal | * |
| Linux-oem-5.10 | Ubuntu | upstream | * |
| Linux-oem-5.13 | Ubuntu | esm-infra/focal | * |
| Linux-oem-5.13 | Ubuntu | upstream | * |
| Linux-oem-5.14 | Ubuntu | esm-infra/focal | * |
| Linux-oem-5.14 | Ubuntu | upstream | * |
| Linux-oem-5.17 | Ubuntu | jammy | * |
| Linux-oem-5.17 | Ubuntu | upstream | * |
| Linux-oem-5.6 | Ubuntu | esm-infra/focal | * |
| Linux-oem-5.6 | Ubuntu | upstream | * |
| Linux-oem-6.0 | Ubuntu | jammy | * |
| Linux-oem-6.0 | Ubuntu | upstream | * |
| Linux-oem-6.1 | Ubuntu | jammy | * |
| Linux-oem-6.1 | Ubuntu | upstream | * |
| Linux-oem-6.11 | Ubuntu | noble | * |
| Linux-oem-6.11 | Ubuntu | upstream | * |
| Linux-oem-6.5 | Ubuntu | jammy | * |
| Linux-oem-6.5 | Ubuntu | upstream | * |
| Linux-oem-6.8 | Ubuntu | noble | * |
| Linux-oem-6.8 | Ubuntu | upstream | * |
| Linux-oracle-5.0 | Ubuntu | esm-infra/bionic | * |
| Linux-oracle-5.0 | Ubuntu | upstream | * |
| Linux-oracle-5.11 | Ubuntu | esm-infra/focal | * |
| Linux-oracle-5.11 | Ubuntu | upstream | * |
| Linux-oracle-5.13 | Ubuntu | esm-infra/focal | * |
| Linux-oracle-5.13 | Ubuntu | upstream | * |
| Linux-oracle-5.3 | Ubuntu | esm-infra/bionic | * |
| Linux-oracle-5.3 | Ubuntu | upstream | * |
| Linux-oracle-5.8 | Ubuntu | esm-infra/focal | * |
| Linux-oracle-5.8 | Ubuntu | upstream | * |
| Linux-oracle-6.5 | Ubuntu | jammy | * |
| Linux-oracle-6.5 | Ubuntu | upstream | * |
| Linux-raspi-realtime | Ubuntu | noble | * |
| Linux-raspi2 | Ubuntu | esm-infra/focal | * |
| Linux-raspi2 | Ubuntu | upstream | * |
| Linux-realtime | Ubuntu | jammy | * |
| Linux-realtime | Ubuntu | noble | * |
| Linux-riscv | Ubuntu | esm-infra/focal | * |
| Linux-riscv | Ubuntu | jammy | * |
| Linux-riscv | Ubuntu | noble | * |
| Linux-riscv-5.11 | Ubuntu | esm-infra/focal | * |
| Linux-riscv-5.11 | Ubuntu | upstream | * |
| Linux-riscv-5.19 | Ubuntu | jammy | * |
| Linux-riscv-5.19 | Ubuntu | upstream | * |
| Linux-riscv-5.8 | Ubuntu | esm-infra/focal | * |
| Linux-riscv-5.8 | Ubuntu | upstream | * |
| Linux-riscv-6.5 | Ubuntu | jammy | * |
| Linux-riscv-6.5 | Ubuntu | upstream | * |
| Linux-starfive-5.19 | Ubuntu | jammy | * |
| Linux-starfive-5.19 | Ubuntu | upstream | * |
| Linux-starfive-6.2 | Ubuntu | jammy | * |
| Linux-starfive-6.2 | Ubuntu | upstream | * |
| Linux-starfive-6.5 | Ubuntu | jammy | * |
| Linux-starfive-6.5 | Ubuntu | upstream | * |