In the Linux kernel, the following vulnerability has been resolved:
s390/pci: Avoid deadlock between PCI error recovery and mlx5 crdump
Do not block PCI config accesses through pci_cfg_access_lock() when executing the s390 variant of PCI error recovery: acquire just device_lock() instead of pci_dev_lock(), as powerpc's EEH and generic PCI AER processing do.
During error recovery testing a pair of tasks was reported to be hung:

    mlx5_core 0000:00:00.1: mlx5_health_try_recover:338:(pid 5553): health recovery flow aborted, PCI reads still not working
    INFO: task kmcheck:72 blocked for more than 122 seconds.
          Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1
    echo 0 > /proc/sys/kernel/hung_task_timeout_secs disables this message.
    task:kmcheck state:D stack:0 pid:72 tgid:72 ppid:2 flags:0x00000000
    Call Trace:
     [<000000065256f030>] __schedule+0x2a0/0x590
     [<000000065256f356>] schedule+0x36/0xe0
     [<000000065256f572>] schedule_preempt_disabled+0x22/0x30
     [<0000000652570a94>] __mutex_lock.constprop.0+0x484/0x8a8
     [<000003ff800673a4>] mlx5_unload_one+0x34/0x58 [mlx5_core]
     [<000003ff8006745c>] mlx5_pci_err_detected+0x94/0x140 [mlx5_core]
     [<0000000652556c5a>] zpci_event_attempt_error_recovery+0xf2/0x398
     [<0000000651b9184a>] __zpci_event_error+0x23a/0x2c0
    INFO: task kworker/u1664:6:1514 blocked for more than 122 seconds.
          Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1
    echo 0 > /proc/sys/kernel/hung_task_timeout_secs disables this message.
    task:kworker/u1664:6 state:D stack:0 pid:1514 tgid:1514 ppid:2 flags:0x00000000
    Workqueue: mlx5_health0000:00:00.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
    Call Trace:
     [<000000065256f030>] __schedule+0x2a0/0x590
     [<000000065256f356>] schedule+0x36/0xe0
     [<0000000652172e28>] pci_wait_cfg+0x80/0xe8
     [<0000000652172f94>] pci_cfg_access_lock+0x74/0x88
     [<000003ff800916b6>] mlx5_vsc_gw_lock+0x36/0x178 [mlx5_core]
     [<000003ff80098824>] mlx5_crdump_collect+0x34/0x1c8 [mlx5_core]
     [<000003ff80074b62>] mlx5_fw_fatal_reporter_dump+0x6a/0xe8 [mlx5_core]
     [<0000000652512242>] devlink_health_do_dump.part.0+0x82/0x168
     [<0000000652513212>] devlink_health_report+0x19a/0x230
     [<000003ff80075a12>] mlx5_fw_fatal_reporter_err_work+0xba/0x1b0 [mlx5_core]

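No kernel log of the exact same error is available from an upstream kernel, but the very same deadlock can be constructed there, too. In the traces above, kmcheck runs PCI error recovery and blocks on a mutex in mlx5_unload_one(), while the kworker collecting the crdump waits in pci_cfg_access_lock() for config access to be unblocked.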
A similar deadlock situation can be reproduced by requesting a crdump with
devlink health dump show pci/ reporter fw_fatal
while PCI error recovery is executed on the same physical function by mlx5_core's pci_error_handlers. On s390 this can be injected with
zpcictl --reset-fw
Tests with this patch failed to reproduce that second deadlock situation. The devlink command is rejected with "kernel answers: Permission denied", and we get a kernel log message of:
mlx5_core 1ed0:00:00.1: mlx5_crdump_collect:50:(pid 254382): crdump: failed to lock vsc gw err -5
because the config read of VSC_SEMAPHORE is rejected by the underlying hardware.
Two prior attempts to address this issue have been discussed and ultimately rejected [see link], with the primary argument that s390's implementation of PCI error recovery is imposing restrictions that neither powerpc's EEH nor PCI AER handling need. Tests show that PCI error recovery on s390 runs to completion even without blocking access to PCI config space.
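The lock interaction described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: three Python locks stand in for device_lock(), pci_cfg_access_lock() and the mutex that mlx5_unload_one() waits on, and the names `simulate`, `driver_mutex`, `error_recovery` and `crdump` are hypothetical. Timeouts replace the indefinite blocking of the real tasks so that the model terminates.

```python
import threading


def simulate(block_cfg_access: bool, timeout: float = 0.5) -> dict:
    """Model the two tasks from the hung-task report.

    block_cfg_access=True  models recovery taking pci_dev_lock()
                           (device_lock() plus pci_cfg_access_lock()).
    block_cfg_access=False models recovery taking only device_lock().
    All names here are illustrative stand-ins, not kernel symbols.
    """
    device_lock = threading.Lock()    # stands in for device_lock()
    cfg_access = threading.Lock()     # stands in for pci_cfg_access_lock()
    driver_mutex = threading.Lock()   # mutex mlx5_unload_one() blocks on
    both_started = threading.Barrier(2)  # force the problematic interleaving
    both_tried = threading.Barrier(2)    # keep first locks held until both tried
    result = {}

    def error_recovery():
        # Recovery path: take the device-level locks first ...
        device_lock.acquire()
        if block_cfg_access:
            cfg_access.acquire()
        both_started.wait()
        # ... then the driver callback needs the driver mutex.
        got = driver_mutex.acquire(timeout=timeout)
        result["recovery"] = got
        if got:
            driver_mutex.release()
        both_tried.wait()
        if block_cfg_access:
            cfg_access.release()
        device_lock.release()

    def crdump():
        # Dump path: holds the driver mutex, then wants config access.
        driver_mutex.acquire()
        both_started.wait()
        got = cfg_access.acquire(timeout=timeout)
        result["crdump"] = got
        if got:
            # No longer blocked: finish and drop both locks right away.
            cfg_access.release()
            driver_mutex.release()
        both_tried.wait()
        if not got:
            # The real task would wait forever; the model gives up here.
            driver_mutex.release()

    threads = [threading.Thread(target=error_recovery),
               threading.Thread(target=crdump)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result


if __name__ == "__main__":
    for variant in (True, False):
        print("block_cfg_access =", variant, "->", simulate(variant))
```

In this model both tasks time out when recovery also blocks config access, and both proceed when recovery takes only the device lock. A `True` for `crdump` means only that the task is no longer stuck waiting for config access; on real hardware the subsequent config read can still be rejected, as the `err -5` message above shows.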
| Name | Vendor | Start Version | End Version |
|---|---|---|---|
| Linux-allwinner-5.19 | Ubuntu | jammy | * |
| Linux-allwinner-5.19 | Ubuntu | upstream | * |
| Linux-aws-5.0 | Ubuntu | esm-infra/bionic | * |
| Linux-aws-5.0 | Ubuntu | upstream | * |
| Linux-aws-5.11 | Ubuntu | esm-infra/focal | * |
| Linux-aws-5.11 | Ubuntu | upstream | * |
| Linux-aws-5.13 | Ubuntu | esm-infra/focal | * |
| Linux-aws-5.13 | Ubuntu | upstream | * |
| Linux-aws-5.19 | Ubuntu | jammy | * |
| Linux-aws-5.19 | Ubuntu | upstream | * |
| Linux-aws-5.3 | Ubuntu | esm-infra/bionic | * |
| Linux-aws-5.3 | Ubuntu | upstream | * |
| Linux-aws-5.8 | Ubuntu | esm-infra/focal | * |
| Linux-aws-5.8 | Ubuntu | upstream | * |
| Linux-aws-6.2 | Ubuntu | jammy | * |
| Linux-aws-6.2 | Ubuntu | upstream | * |
| Linux-aws-6.5 | Ubuntu | jammy | * |
| Linux-aws-6.5 | Ubuntu | upstream | * |
| Linux-azure | Ubuntu | esm-infra/bionic | * |
| Linux-azure-5.11 | Ubuntu | esm-infra/focal | * |
| Linux-azure-5.11 | Ubuntu | upstream | * |
| Linux-azure-5.13 | Ubuntu | esm-infra/focal | * |
| Linux-azure-5.13 | Ubuntu | upstream | * |
| Linux-azure-5.19 | Ubuntu | jammy | * |
| Linux-azure-5.19 | Ubuntu | upstream | * |
| Linux-azure-5.3 | Ubuntu | esm-infra/bionic | * |
| Linux-azure-5.3 | Ubuntu | upstream | * |
| Linux-azure-5.8 | Ubuntu | esm-infra/focal | * |
| Linux-azure-5.8 | Ubuntu | upstream | * |
| Linux-azure-6.11 | Ubuntu | noble | * |
| Linux-azure-6.11 | Ubuntu | upstream | * |
| Linux-azure-6.2 | Ubuntu | jammy | * |
| Linux-azure-6.2 | Ubuntu | upstream | * |
| Linux-azure-6.5 | Ubuntu | jammy | * |
| Linux-azure-6.5 | Ubuntu | upstream | * |
| Linux-azure-edge | Ubuntu | esm-infra/bionic | * |
| Linux-azure-edge | Ubuntu | upstream | * |
| Linux-azure-fde | Ubuntu | esm-infra/focal | * |
| Linux-azure-fde-5.19 | Ubuntu | jammy | * |
| Linux-azure-fde-5.19 | Ubuntu | upstream | * |
| Linux-azure-fde-6.2 | Ubuntu | jammy | * |
| Linux-azure-fde-6.2 | Ubuntu | upstream | * |
| Linux-gcp | Ubuntu | esm-infra/bionic | * |
| Linux-gcp-5.11 | Ubuntu | esm-infra/focal | * |
| Linux-gcp-5.11 | Ubuntu | upstream | * |
| Linux-gcp-5.13 | Ubuntu | esm-infra/focal | * |
| Linux-gcp-5.13 | Ubuntu | upstream | * |
| Linux-gcp-5.19 | Ubuntu | jammy | * |
| Linux-gcp-5.19 | Ubuntu | upstream | * |
| Linux-gcp-5.3 | Ubuntu | esm-infra/bionic | * |
| Linux-gcp-5.3 | Ubuntu | upstream | * |
| Linux-gcp-5.8 | Ubuntu | esm-infra/focal | * |
| Linux-gcp-5.8 | Ubuntu | upstream | * |
| Linux-gcp-6.11 | Ubuntu | noble | * |
| Linux-gcp-6.11 | Ubuntu | upstream | * |
| Linux-gcp-6.2 | Ubuntu | jammy | * |
| Linux-gcp-6.2 | Ubuntu | upstream | * |
| Linux-gcp-6.5 | Ubuntu | jammy | * |
| Linux-gcp-6.5 | Ubuntu | upstream | * |
| Linux-gke | Ubuntu | esm-infra/focal | * |
| Linux-gke-4.15 | Ubuntu | esm-infra/bionic | * |
| Linux-gke-4.15 | Ubuntu | upstream | * |
| Linux-gke-5.15 | Ubuntu | esm-infra/focal | * |
| Linux-gke-5.15 | Ubuntu | upstream | * |
| Linux-gke-5.4 | Ubuntu | esm-infra/bionic | * |
| Linux-gke-5.4 | Ubuntu | upstream | * |
| Linux-gkeop | Ubuntu | esm-infra/focal | * |
| Linux-gkeop-5.15 | Ubuntu | esm-infra/focal | * |
| Linux-gkeop-5.4 | Ubuntu | esm-infra/bionic | * |
| Linux-gkeop-5.4 | Ubuntu | upstream | * |
| Linux-hwe | Ubuntu | esm-infra/bionic | * |
| Linux-hwe-5.11 | Ubuntu | esm-infra/focal | * |
| Linux-hwe-5.11 | Ubuntu | upstream | * |
| Linux-hwe-5.13 | Ubuntu | esm-infra/focal | * |
| Linux-hwe-5.13 | Ubuntu | upstream | * |
| Linux-hwe-5.19 | Ubuntu | jammy | * |
| Linux-hwe-5.19 | Ubuntu | upstream | * |
| Linux-hwe-5.8 | Ubuntu | esm-infra/focal | * |
| Linux-hwe-5.8 | Ubuntu | upstream | * |
| Linux-hwe-6.11 | Ubuntu | noble | * |
| Linux-hwe-6.11 | Ubuntu | upstream | * |
| Linux-hwe-6.2 | Ubuntu | jammy | * |
| Linux-hwe-6.2 | Ubuntu | upstream | * |
| Linux-hwe-6.5 | Ubuntu | jammy | * |
| Linux-hwe-6.5 | Ubuntu | upstream | * |
| Linux-hwe-edge | Ubuntu | esm-infra/bionic | * |
| Linux-hwe-edge | Ubuntu | esm-infra/xenial | * |
| Linux-hwe-edge | Ubuntu | upstream | * |
| Linux-intel-5.13 | Ubuntu | esm-infra/focal | * |
| Linux-intel-5.13 | Ubuntu | upstream | * |
| Linux-intel-iot-realtime | Ubuntu | jammy | * |
| Linux-lowlatency-hwe-5.19 | Ubuntu | jammy | * |
| Linux-lowlatency-hwe-5.19 | Ubuntu | upstream | * |
| Linux-lowlatency-hwe-6.11 | Ubuntu | noble | * |
| Linux-lowlatency-hwe-6.11 | Ubuntu | upstream | * |
| Linux-lowlatency-hwe-6.2 | Ubuntu | jammy | * |
| Linux-lowlatency-hwe-6.2 | Ubuntu | upstream | * |
| Linux-lowlatency-hwe-6.5 | Ubuntu | jammy | * |
| Linux-lowlatency-hwe-6.5 | Ubuntu | upstream | * |
| Linux-nvidia-6.11 | Ubuntu | noble | * |
| Linux-nvidia-6.11 | Ubuntu | upstream | * |
| Linux-nvidia-6.2 | Ubuntu | jammy | * |
| Linux-nvidia-6.2 | Ubuntu | upstream | * |
| Linux-nvidia-6.5 | Ubuntu | jammy | * |
| Linux-nvidia-6.5 | Ubuntu | upstream | * |
| Linux-oem | Ubuntu | esm-infra/bionic | * |
| Linux-oem | Ubuntu | upstream | * |
| Linux-oem-5.10 | Ubuntu | esm-infra/focal | * |
| Linux-oem-5.10 | Ubuntu | upstream | * |
| Linux-oem-5.13 | Ubuntu | esm-infra/focal | * |
| Linux-oem-5.13 | Ubuntu | upstream | * |
| Linux-oem-5.14 | Ubuntu | esm-infra/focal | * |
| Linux-oem-5.14 | Ubuntu | upstream | * |
| Linux-oem-5.17 | Ubuntu | jammy | * |
| Linux-oem-5.17 | Ubuntu | upstream | * |
| Linux-oem-5.6 | Ubuntu | esm-infra/focal | * |
| Linux-oem-5.6 | Ubuntu | upstream | * |
| Linux-oem-6.0 | Ubuntu | jammy | * |
| Linux-oem-6.0 | Ubuntu | upstream | * |
| Linux-oem-6.1 | Ubuntu | jammy | * |
| Linux-oem-6.1 | Ubuntu | upstream | * |
| Linux-oem-6.11 | Ubuntu | noble | * |
| Linux-oem-6.11 | Ubuntu | upstream | * |
| Linux-oem-6.5 | Ubuntu | jammy | * |
| Linux-oem-6.5 | Ubuntu | upstream | * |
| Linux-oem-6.8 | Ubuntu | noble | * |
| Linux-oem-6.8 | Ubuntu | upstream | * |
| Linux-oracle-5.0 | Ubuntu | esm-infra/bionic | * |
| Linux-oracle-5.0 | Ubuntu | upstream | * |
| Linux-oracle-5.11 | Ubuntu | esm-infra/focal | * |
| Linux-oracle-5.11 | Ubuntu | upstream | * |
| Linux-oracle-5.13 | Ubuntu | esm-infra/focal | * |
| Linux-oracle-5.13 | Ubuntu | upstream | * |
| Linux-oracle-5.3 | Ubuntu | esm-infra/bionic | * |
| Linux-oracle-5.3 | Ubuntu | upstream | * |
| Linux-oracle-5.8 | Ubuntu | esm-infra/focal | * |
| Linux-oracle-5.8 | Ubuntu | upstream | * |
| Linux-oracle-6.5 | Ubuntu | jammy | * |
| Linux-oracle-6.5 | Ubuntu | upstream | * |
| Linux-raspi-realtime | Ubuntu | noble | * |
| Linux-raspi2 | Ubuntu | esm-infra/focal | * |
| Linux-raspi2 | Ubuntu | upstream | * |
| Linux-realtime | Ubuntu | jammy | * |
| Linux-realtime | Ubuntu | noble | * |
| Linux-riscv | Ubuntu | esm-infra/focal | * |
| Linux-riscv | Ubuntu | jammy | * |
| Linux-riscv | Ubuntu | noble | * |
| Linux-riscv-5.11 | Ubuntu | esm-infra/focal | * |
| Linux-riscv-5.11 | Ubuntu | upstream | * |
| Linux-riscv-5.19 | Ubuntu | jammy | * |
| Linux-riscv-5.19 | Ubuntu | upstream | * |
| Linux-riscv-5.8 | Ubuntu | esm-infra/focal | * |
| Linux-riscv-5.8 | Ubuntu | upstream | * |
| Linux-riscv-6.5 | Ubuntu | jammy | * |
| Linux-riscv-6.5 | Ubuntu | upstream | * |
| Linux-starfive-5.19 | Ubuntu | jammy | * |
| Linux-starfive-5.19 | Ubuntu | upstream | * |
| Linux-starfive-6.2 | Ubuntu | jammy | * |
| Linux-starfive-6.2 | Ubuntu | upstream | * |
| Linux-starfive-6.5 | Ubuntu | jammy | * |
| Linux-starfive-6.5 | Ubuntu | upstream | * |