[PATCH AUTOSEL 7.0-5.10] drm/amdgpu: validate fence_count in wait_fences ioctl

From: Sasha Levin

Date: Mon Apr 20 2026 - 11:46:09 EST


From: "Jesse.Zhang" <Jesse.Zhang@xxxxxxx>

[ Upstream commit 2cef848812a071991c20090cbe051a0a96c50a0c ]

Add an early parameter check in amdgpu_cs_wait_fences_ioctl() to reject
a zero fence_count with -EINVAL.

dma_fence_wait_any_timeout() requires count > 0. When userspace passes
fence_count == 0, the call propagates down to dma_fence core which does
not expect a zero-length array and triggers a WARN_ON.

Return -EINVAL immediately so the caller gets a clear error instead of
hitting an unexpected warning in the DMA fence subsystem.

No functional change for well-formed userspace callers.

v2:
- Reworked commit message to clarify the parameter validation rationale
- Removed verbose crash log from commit description
- Simplified inline code comment

Reviewed-by: Vitaly Prosyak <vitaly.prosyak@xxxxxxx>
Reviewed-by: Christian König <christian.koenig@xxxxxxx>
Signed-off-by: Jesse Zhang <jesse.zhang@xxxxxxx>
Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
---

LLM Generated explanations, may be completely bogus:

---

## PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1: Subject Line**
- Subsystem: `drm/amdgpu`
- Action verb: "validate" (adding missing input validation = bug fix)
- Summary: Validates `fence_count` parameter in `wait_fences` ioctl to
reject zero.
Record: [drm/amdgpu] [validate] [Adding missing input validation for
fence_count in wait_fences ioctl]

**Step 1.2: Tags**
- Reviewed-by: Vitaly Prosyak <vitaly.prosyak@xxxxxxx> (AMD developer)
- Reviewed-by: Christian König <christian.koenig@xxxxxxx> (DMA-fence
subsystem maintainer)
- Signed-off-by: Jesse Zhang <jesse.zhang@xxxxxxx> (author)
- Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx> (AMD DRM
maintainer, committer)
- No Fixes: tag (expected for autosel review)
- No Cc: stable (expected)
Record: Strong review signal — Christian König is the DRM scheduler /
DMA-fence maintainer. Alex Deucher is the AMD GPU maintainer.

**Step 1.3: Commit Body**
Bug: Userspace passes `fence_count == 0` to `AMDGPU_WAIT_FENCES` ioctl.
This propagates to `dma_fence_wait_any_timeout()`, which has
`WARN_ON(!count)`. The WARN_ON fires in the DMA fence subsystem.
Symptom: Kernel WARN_ON triggered from userspace input. On
`panic_on_warn` systems, this causes a kernel crash.
Fix: Return `-EINVAL` early when `fence_count == 0`.
Record: [Bug: WARN_ON trigger from userspace-controlled input] [Symptom:
kernel warning/crash] [Author's root cause: dma_fence_wait_any_timeout
requires count > 0]

**Step 1.4: Hidden Bug Fix Detection**
"Validate" = adding missing parameter check. This IS a bug fix: it
prevents a WARN_ON (and potential crash) from userspace-controlled
input.
Record: [Yes, this is a bug fix — adds missing input validation to
prevent WARN_ON from ioctl with zero count]

---

## PHASE 2: DIFF ANALYSIS

**Step 2.1: Inventory**
- Single file modified: `drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c`
- +7 lines (4-line comment + 2 lines of code + 1 blank line)
- Function modified: `amdgpu_cs_wait_fences_ioctl()`
Record: [1 file, +7 lines, single function, surgical fix]

**Step 2.2: Code Flow Change**
BEFORE: `fence_count == 0` passes through to `memdup_array_user`
(returns ZERO_SIZE_PTR), then reaches `dma_fence_wait_any_timeout()`
which fires `WARN_ON(!count)`.
AFTER: `fence_count == 0` is caught at ioctl entry, returns `-EINVAL`
immediately.
Record: [Before: WARN_ON triggered. After: clean EINVAL return]
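
The reason a zero count survives the allocation step can be modelled in
userspace. The sketch below is not kernel code: the `zsp_*` names are
hypothetical, and the constants are assumed to mirror the kernel's
`ZERO_SIZE_PTR` and `MAX_ERRNO`. It only illustrates that a zero-size
sentinel is non-NULL and falls outside the error-pointer range, so an
`IS_ERR()`-style check after the duplication does not reject it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-ins; values assumed to mirror the kernel's
 * ZERO_SIZE_PTR and MAX_ERRNO definitions. */
#define ZSP_ZERO_SIZE_PTR ((void *)16)
#define ZSP_MAX_ERRNO     4095

/* Same range test as an IS_ERR()-style check: only the top
 * ZSP_MAX_ERRNO addresses encode an errno value. */
bool zsp_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-ZSP_MAX_ERRNO;
}

/* Hypothetical model of an array duplicator: a zero-byte request
 * yields the zero-size sentinel instead of an error pointer. */
void *zsp_memdup_array(const void *src, size_t n, size_t size)
{
	size_t bytes = n * size;	/* overflow handling omitted in this model */
	void *copy;

	if (bytes == 0)
		return ZSP_ZERO_SIZE_PTR;
	copy = malloc(bytes);
	if (!copy)
		return (void *)(intptr_t)-ENOMEM;
	memcpy(copy, src, bytes);
	return copy;
}

/* Self-check: the sentinel is returned for n == 0 and is not an error. */
void zsp_selfcheck(void)
{
	void *p = zsp_memdup_array(NULL, 0, 16);

	assert(p == ZSP_ZERO_SIZE_PTR);
	assert(!zsp_is_err(p));
	assert(zsp_is_err((void *)(intptr_t)-EINVAL));
}
```

Calling `zsp_selfcheck()` from a test harness exercises the model; under
these assumptions the zero-count request reaches the wait call unchanged.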

**Step 2.3: Bug Mechanism**
Category: Missing input validation / parameter check.
Mechanism: The ioctl fails to validate a user-controlled parameter
before passing it to a core kernel API that has a `WARN_ON`
precondition. Verified at line 894 of `dma-fence.c`:

```c
if (WARN_ON(!fences || !count || timeout < 0))
```

Record: [Missing input validation] [User-controlled count==0 triggers
WARN_ON in dma_fence_wait_any_timeout]
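
A minimal userspace model of this mechanism, assuming the precondition
quoted above and assuming the core helper returns `-EINVAL` after
warning. All `chk_*` names are hypothetical stand-ins, not driver or
dma-fence symbols; the `validate` flag switches between the pre-fix and
post-fix handler behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

static int chk_warn_count;		/* number of simulated WARN_ON hits */
static void *chk_dummy_fences[1];	/* non-NULL placeholder fence array */

/* Stand-in for the core wait helper, with the same precondition. */
long chk_wait_any_timeout(void **fences, uint32_t count, long timeout)
{
	if (!fences || !count || timeout < 0) {
		chk_warn_count++;	/* the kernel would log a stack trace */
		return -EINVAL;
	}
	return timeout;			/* pretend a fence signalled at once */
}

/* Handler model: validate == 0 is pre-fix, validate == 1 is post-fix. */
long chk_wait_fences_ioctl(uint32_t fence_count, int validate)
{
	if (validate && !fence_count)
		return -EINVAL;		/* rejected before reaching the core */
	return chk_wait_any_timeout(chk_dummy_fences, fence_count, 1);
}

/* Self-check: same error code either way, but only pre-fix warns. */
void chk_selfcheck(void)
{
	chk_warn_count = 0;
	assert(chk_wait_fences_ioctl(0, 0) == -EINVAL);
	assert(chk_warn_count == 1);
	assert(chk_wait_fences_ioctl(0, 1) == -EINVAL);
	assert(chk_warn_count == 1);
	assert(chk_wait_fences_ioctl(1, 1) == 1);
}
```

In this model the caller-visible result for a zero count is unchanged;
the difference is that the warning path is never entered.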

**Step 2.4: Fix Quality**
- Obviously correct: a simple zero-check before further processing.
- Minimal/surgical: 7 lines total including comments.
- Regression risk: Very low. `fence_count == 0` is meaningless ("wait for
zero fences"). On the wait-any path the ioctl already failed (with WARN)
in this case; on the wait-all path it previously returned success (see
Step 5.5), so only a caller relying on that no-op would see a change.
Returning `-EINVAL` is the correct behavior.
- No API change for well-formed callers (as stated in the commit
message).
Record: [Fix is obviously correct, minimal, very low regression risk]

---

## PHASE 3: GIT HISTORY INVESTIGATION

**Step 3.1: Blame**
`amdgpu_cs_wait_fences_ioctl` was introduced by commit `eef18a827a9ec5`
(Junwei Zhang, 2016-11-04, "drm/amdgpu: add the interface of waiting
multiple fences (v4)"). This is v4.10-era code, present in ALL active
stable trees.
Record: [Bug introduced in eef18a827a9ec5, v4.10 timeframe, present in
all stable trees]

**Step 3.2: Fixes Tag**
No Fixes: tag present. The implicit target is `eef18a827a9ec5` which
introduced the ioctl without the validation.
Record: [No Fixes: tag. Original code from 2016.]

**Step 3.3: File History**
Recent changes to `amdgpu_cs.c` include `dea75df7afe14`
(memdup_array_user conversion) and `69050f8d6d075` (kzalloc_objs
treewide change). These are cosmetic/API modernizations that don't
affect the bug or fix logic.
Record: [Recent changes are cosmetic. Fix is standalone.]

**Step 3.4: Author**
Jesse Zhang is a regular AMD GPU contributor with multiple fix commits
in the subsystem (SDMA fixes, out-of-bounds fixes, etc.).
Record: [Active AMD subsystem contributor]

**Step 3.5: Dependencies**
The fix adds a simple `if` check at the start of the function, before
any recently-changed code. It does NOT depend on patches 2/3 in the
series (which touch different files/functions entirely). The series
patches are independent input validation improvements.
Record: [Standalone fix, no dependencies on other patches]

---

## PHASE 4: MAILING LIST RESEARCH

**Step 4.1: Original Submission**
Found at
https://lists.freedesktop.org/archives/amd-gfx/2026-March/140748.html.
This is v2 of the patch, with a reworked commit message. No NAKs found.
Record: [Patch submitted March 2026. v2 incorporated review feedback on
commit message clarity.]

**Step 4.2: Reviewers**
Reviewed by Vitaly Prosyak (AMD) and Christian König (DMA-fence/DRM
scheduler maintainer). Committed by Alex Deucher (AMD DRM maintainer).
Record: [Subsystem maintainer reviewed and approved]

**Step 4.3: Bug Report**
No separate bug report link. The bug was found by code inspection (the
WARN_ON contract in `dma_fence_wait_any_timeout` is explicit).
Record: [Found by code review, not user report]

**Step 4.4: Series Context**
3-patch series, all independent input validation improvements. Patch 2/3
changes WARN to DRM_ERROR in `amdgpu_sched_ioctl` (separate
file/function). Each is standalone.
Record: [Independent patches in the series. This one is self-contained.]

**Step 4.5: Stable Discussion**
No explicit stable nomination found in the thread.
Record: [No explicit stable discussion, which is expected for autosel
candidates.]

---

## PHASE 5: CODE SEMANTIC ANALYSIS

**Step 5.1: Functions Modified**
Only `amdgpu_cs_wait_fences_ioctl()`.

**Step 5.2: Callers**
This is a DRM ioctl handler registered via
`DRM_IOCTL_DEF_DRV(AMDGPU_WAIT_FENCES, ...)` with
`DRM_AUTH|DRM_RENDER_ALLOW`. It is reachable by any process with access
to `/dev/dri/renderDNN` — no special privileges required beyond
DRM_AUTH.
Record: [Ioctl handler, reachable from unprivileged userspace via render
node]

**Step 5.3-5.4: Call Chain**
Userspace ioctl -> `drm_ioctl` -> `amdgpu_cs_wait_fences_ioctl` -> (if
!wait_all) `amdgpu_cs_wait_any_fence` -> `dma_fence_wait_any_timeout` ->
`WARN_ON(!count)`.
Record: [Direct ioctl path, user-controlled trigger, WARN_ON reached
with fence_count=0]

**Step 5.5: Similar Patterns**
The `amdgpu_cs_wait_all_fences` path with count==0 doesn't hit a WARN_ON
(the for loop simply doesn't execute), but returns success for a
meaningless request. The fix correctly catches both paths by validating
at the ioctl entry point.
Record: [Fix covers both wait_all and wait_any paths]
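
The difference between the two paths can be sketched with a small
userspace model. The `path_*` functions are hypothetical simplifications
(fences are reduced to an array of "signalled" flags, and return values
are illustrative), not the driver's actual helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

static int path_warns;	/* number of simulated WARN_ON hits */

/* Wait-all model: an empty set is vacuously "all signalled". */
int path_wait_all(const int *signalled, uint32_t count)
{
	for (uint32_t i = 0; i < count; i++)
		if (!signalled[i])
			return 0;
	return 1;
}

/* Wait-any model: a zero count trips the precondition. */
int path_wait_any(const int *signalled, uint32_t count)
{
	if (!count) {
		path_warns++;
		return -EINVAL;
	}
	for (uint32_t i = 0; i < count; i++)
		if (signalled[i])
			return 1;
	return 0;
}

/* Entry-point model with the check placed before the branch. */
int path_ioctl(const int *signalled, uint32_t count, int wait_all)
{
	if (!count)
		return -EINVAL;	/* covers both branches below */
	return wait_all ? path_wait_all(signalled, count)
			: path_wait_any(signalled, count);
}

/* Self-check: only the unguarded wait-any call warns. */
void path_selfcheck(void)
{
	path_warns = 0;
	assert(path_wait_all(NULL, 0) == 1);
	assert(path_warns == 0);
	assert(path_wait_any(NULL, 0) == -EINVAL);
	assert(path_warns == 1);
	assert(path_ioctl(NULL, 0, 1) == -EINVAL);
	assert(path_ioctl(NULL, 0, 0) == -EINVAL);
	assert(path_warns == 1);
}
```

Placing the check at the entry point, rather than inside the wait-any
helper, is what makes the zero-count result the same for both modes in
this model.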

---

## PHASE 6: STABLE TREE ANALYSIS

**Step 6.1: Buggy Code in Stable**
The `amdgpu_cs_wait_fences_ioctl` function has existed since
`eef18a827a9ec5` (v4.10, November 2016). It exists in ALL active stable
trees.
Record: [Bug exists in all stable trees (v5.4, v5.10, v5.15, v6.1, v6.6,
v6.12)]

**Step 6.2: Backport Complications**
The fix adds a check at the very start of the function body, before any
code that has been recently modified. In older stable trees, the
`memdup_array_user` line would be `memdup_user` with a manual size
calculation instead, but the added check comes BEFORE that line. Minor
context adjustment may be needed for the surrounding `memdup` call, but
the fix itself is trivially applicable.
Record: [Clean or near-clean apply expected. Minor context difference in
older trees.]

**Step 6.3: Related Fixes in Stable**
No prior fix for this zero-count issue was found.
Record: [No prior fix exists in stable.]

---

## PHASE 7: SUBSYSTEM AND MAINTAINER CONTEXT

**Step 7.1: Subsystem**
DRM/AMDGPU — a widely used GPU driver (AMD is one of two major discrete
GPU vendors on Linux). Criticality: IMPORTANT.
Record: [drm/amdgpu, IMPORTANT — widely used GPU driver]

**Step 7.2: Activity**
Actively developed (many recent commits). The file has had multiple
changes since v6.6.
Record: [Very active subsystem]

---

## PHASE 8: IMPACT AND RISK ASSESSMENT

**Step 8.1: Affected Users**
All users with AMD GPUs (a very large population). Any process with DRM
render node access.
Record: [All AMD GPU users, unprivileged trigger]

**Step 8.2: Trigger Conditions**
Any process that opens `/dev/dri/renderDNN` and issues the
`AMDGPU_WAIT_FENCES` ioctl with `fence_count == 0`. This is trivially
triggerable — no special timing, no race, no specific hardware needed. A
single malformed ioctl call from any render-node-capable process
triggers it.
Record: [Trivially triggerable from unprivileged userspace.
Deterministic, no race needed.]

**Step 8.3: Failure Mode**
- `WARN_ON` fires in `dma_fence_wait_any_timeout()`: produces stack
trace in kernel log.
- On systems with `panic_on_warn=1` (common in syzbot testing, some
hardened deployments): full kernel panic.
- Even without `panic_on_warn`, repeated triggering can flood kernel
logs and potentially be used for denial of service.
- Severity: **MEDIUM** by default (WARN from unprivileged userspace),
**CRITICAL** with `panic_on_warn` (kernel panic)
Record: [WARN_ON from userspace, CRITICAL with panic_on_warn, MEDIUM
otherwise]

**Step 8.4: Risk-Benefit**
- BENEFIT: High — prevents kernel warning/crash from trivial
unprivileged userspace input on all AMD GPU systems.
- RISK: Very low — 2 lines of actual code (a simple zero-check),
obviously correct, no functional change for valid callers, reviewed by
subsystem maintainer.
Record: [High benefit, very low risk]

---

## PHASE 9: FINAL SYNTHESIS

**Step 9.1: Evidence**
FOR backporting:
- Fixes a WARN_ON triggerable from unprivileged userspace via ioctl
- Trivial to trigger (deterministic, single ioctl call)
- Crash on `panic_on_warn` systems
- Reviewed by DMA-fence subsystem maintainer (Christian König) and AMD
maintainer
- 7 lines total, 2 lines of logic — minimal and obviously correct
- Bug exists since v4.10 (2016), affects all stable trees
- Standalone fix with no dependencies
- No functional change for well-formed callers

AGAINST backporting:
- Not a crash for default kernel configuration (WARN, not BUG)
- No user report (found by code inspection)
- Minor context may differ in older stable trees

**Step 9.2: Stable Rules Checklist**
1. Obviously correct and tested? **YES** — trivial zero-check, reviewed
by maintainer
2. Fixes a real bug? **YES** — WARN_ON from userspace input
3. Important issue? **YES** — userspace-triggerable warning, crash with
panic_on_warn
4. Small and contained? **YES** — 7 lines in one function
5. No new features or APIs? **YES** — purely input validation
6. Can apply to stable? **YES** — should apply cleanly or near-cleanly

**Step 9.3: Exception Categories**
N/A — this is a straightforward bug fix, no exception needed.

**Step 9.4: Decision**
The fix is small, surgical, obviously correct, reviewed by the subsystem
maintainer, and prevents a WARN_ON (potential crash) triggerable from
unprivileged userspace. The bug has existed in all stable trees since
2016. The risk is negligible and the benefit is clear.

---

## Verification

- [Phase 1] Parsed tags: Reviewed-by Christian König (DMA-fence
maintainer), Reviewed-by Vitaly Prosyak, Signed-off Alex Deucher (AMD
DRM maintainer)
- [Phase 2] Diff analysis: +7 lines in `amdgpu_cs_wait_fences_ioctl()`,
adds `if (!wait->in.fence_count) return -EINVAL` before allocation
- [Phase 2] Confirmed WARN_ON at dma-fence.c:894: `WARN_ON(!fences ||
!count || timeout < 0)` — verified by reading the source
- [Phase 2] Confirmed `memdup_array_user(ptr, 0, size)` returns
ZERO_SIZE_PTR (not error), so zero count passes through to
`dma_fence_wait_any_timeout`
- [Phase 3] git blame: `amdgpu_cs_wait_fences_ioctl` introduced in
eef18a827a9ec5 (2016-11-04), confirmed present in p-5.10, p-5.15 tags
- [Phase 3] git log: no prior fix for this zero-count issue in file
history
- [Phase 4] Found original submission at lists.freedesktop.org amd-gfx
March 2026. v2 with reworked commit message. No NAKs.
- [Phase 4] Series is 3 independent patches; patch 2/3 touches different
file (amdgpu_sched.c). This patch is standalone.
- [Phase 5] Ioctl registered with DRM_AUTH|DRM_RENDER_ALLOW — confirmed
reachable from unprivileged userspace
- [Phase 5] Traced call chain: ioctl -> wait_any_fence ->
dma_fence_wait_any_timeout -> WARN_ON(!count)
- [Phase 6] Bug exists in all active stable trees (code from 2016)
- [Phase 6] Fix applies before any recently-changed code; near-clean
apply expected
- [Phase 8] Severity: WARN_ON from unprivileged userspace, crash with
panic_on_warn; benefit high, risk very low

**YES**

drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 24e4b4fc91564..142022295fe15 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1747,6 +1747,13 @@ int amdgpu_cs_wait_fences_ioctl(struct drm_device *dev, void *data,
 	struct drm_amdgpu_fence *fences;
 	int r;
 
+	/*
+	 * fence_count must be non-zero; dma_fence_wait_any_timeout()
+	 * does not accept an empty fence array.
+	 */
+	if (!wait->in.fence_count)
+		return -EINVAL;
+
 	/* Get the fences from userspace */
 	fences = memdup_array_user(u64_to_user_ptr(wait->in.fences),
 				   wait->in.fence_count,
--
2.53.0