Re: [PATCH] drbd: fix a null-pointer dereference when the request event in drbd_request_endio() is READ_COMPLETED_WITH_ERROR

From: Christoph Böhmwalder

Date: Thu Feb 19 2026 - 09:54:42 EST


On 1/4/26 17:53, Tuo Li wrote:
> In drbd_request_endio(), the request event what can be set to
> READ_COMPLETED_WITH_ERROR. In this case, __req_mod() is invoked with a NULL
> peer_device:
>
>     __req_mod(req, what, NULL, &m);
>
> When handling READ_COMPLETED_WITH_ERROR, __req_mod() unconditionally calls
> drbd_set_out_of_sync():
>
>     case READ_COMPLETED_WITH_ERROR:
>         drbd_set_out_of_sync(peer_device, req->i.sector, req->i.size);
>
> The drbd_set_out_of_sync() macro expands to __drbd_change_sync():
>
>     #define drbd_set_out_of_sync(peer_device, sector, size) \
>         __drbd_change_sync(peer_device, sector, size, SET_OUT_OF_SYNC)
>
> However, __drbd_change_sync() assumes a valid peer_device and immediately
> dereferences it:
>
>     struct drbd_device *device = peer_device->device;
>
> If peer_device is NULL, this results in a NULL-pointer dereference.
>
> Fix this by adding a NULL check in __req_mod() before calling
> drbd_set_out_of_sync().

Thank you for the report and patch.
The bug analysis is correct, but the fix is not.


> Signed-off-by: Tuo Li <islituo@xxxxxxxxx>
> ---
>  drivers/block/drbd/drbd_req.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
> index d15826f6ee81..aa3da2733f14 100644
> --- a/drivers/block/drbd/drbd_req.c
> +++ b/drivers/block/drbd/drbd_req.c
> @@ -621,7 +621,8 @@ int __req_mod(struct drbd_request *req, enum drbd_req_event what,
>  		break;
>  	case READ_COMPLETED_WITH_ERROR:
> -		drbd_set_out_of_sync(peer_device, req->i.sector, req->i.size);
> +		if (peer_device)
> +			drbd_set_out_of_sync(peer_device, req->i.sector, req->i.size);
>  		drbd_report_io_error(device, req);
>  		__drbd_chk_io_error(device, DRBD_READ_ERROR);
>  		fallthrough;

In this code path, peer_device is *always* NULL -- the only caller that
sets READ_COMPLETED_WITH_ERROR is drbd_request_endio(), which always
passes NULL for peer_device. So this NULL check effectively turns the
drbd_set_out_of_sync() call into dead code.

Silently skipping the call here means we lose out-of-sync tracking
for local read errors, which is a data consistency problem.

The proper fix is to obtain the peer_device via first_peer_device(device),
as the similar path in drbd_req_destroy() does (drbd_req.c:125).

	case READ_COMPLETED_WITH_ERROR:
		drbd_set_out_of_sync(first_peer_device(device),
				     req->i.sector, req->i.size);
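
To illustrate the difference between the two variants, here is a minimal
userspace sketch. It is not kernel code: the struct layouts, the
model_* helpers and the out_of_sync_calls counter are hypothetical
stand-ins, and it assumes a device with exactly one peer. It only shows
that the NULL check never marks anything out of sync on this path, while
looking the peer up from the device does.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the DRBD structures. */
struct drbd_device;

struct drbd_peer_device {
	struct drbd_device *device;
};

struct drbd_device {
	struct drbd_peer_device *peer;   /* assumption: a single peer */
	unsigned long out_of_sync_calls; /* counts out-of-sync markings */
};

/* Model of first_peer_device(): return the device's only peer. */
static struct drbd_peer_device *
model_first_peer_device(struct drbd_device *device)
{
	return device->peer;
}

/* Model of __drbd_change_sync(): dereferences peer_device immediately. */
static void model_set_out_of_sync(struct drbd_peer_device *peer_device)
{
	struct drbd_device *device = peer_device->device;

	device->out_of_sync_calls++;
}

/* Variant from the submitted patch: with peer_device == NULL, as passed
 * by the endio path, the marking is silently skipped. */
static void model_read_error_null_check(struct drbd_peer_device *peer_device)
{
	if (peer_device)
		model_set_out_of_sync(peer_device);
}

/* Suggested variant: obtain the peer from the device instead. */
static void model_read_error_first_peer(struct drbd_device *device)
{
	model_set_out_of_sync(model_first_peer_device(device));
}
```

Calling model_read_error_null_check(NULL) leaves the counter untouched,
whereas model_read_error_first_peer(&dev) increments it once for a
device whose peer pointer has been set up.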

Regards,
Christoph

--
Christoph Böhmwalder
LINBIT | Keeping the Digital World Running
DRBD HA — Disaster Recovery — Software defined Storage