Re: [PATCH v2 1/7] tools lib api: Add io_dir an allocation free readdir alternative
From: Namhyung Kim
Date: Fri Feb 21 2025 - 01:31:15 EST
On Wed, Feb 19, 2025 at 02:21:45PM -0800, Ian Rogers wrote:
> On Wed, Feb 19, 2025 at 1:51 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> >
> > On Fri, Feb 07, 2025 at 03:24:42PM -0800, Ian Rogers wrote:
> > > glibc's opendir allocates a minimum of 32kb, when called recursively
> > > for a directory tree the memory consumption can add up - nearly 300kb
> > > during perf start-up when processing modules. Add a stack allocated
> > > variant of readdir sized a little more than 1kb.
> > >
> > > As getdents64 may be missing from libc, add support using syscall.
> > > Note, an earlier version of this patch had a feature test for
> > > > getdents64 but there were problems on certain distros where
> > > > getdents64 would be #define renamed to getdents breaking the code. The
> > > > syscall use was made unconditional to work around this. There is
> > > context in:
> > > https://lore.kernel.org/lkml/20231207050433.1426834-1-irogers@xxxxxxxxxx/
> > >
> > > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> > > ---
> > > tools/lib/api/Makefile | 2 +-
> > > tools/lib/api/io_dir.h | 93 ++++++++++++++++++++++++++++++++++++++++++
> > > 2 files changed, 94 insertions(+), 1 deletion(-)
> > > create mode 100644 tools/lib/api/io_dir.h
> > >
> > > diff --git a/tools/lib/api/Makefile b/tools/lib/api/Makefile
> > > index 7f6396087b46..8665c799e0fa 100644
> > > --- a/tools/lib/api/Makefile
> > > +++ b/tools/lib/api/Makefile
> > > @@ -95,7 +95,7 @@ install_lib: $(LIBFILE)
> > > $(call do_install_mkdir,$(libdir_SQ)); \
> > > cp -fpR $(LIBFILE) $(DESTDIR)$(libdir_SQ)
> > >
> > > -HDRS := cpu.h debug.h io.h
> > > +HDRS := cpu.h debug.h io.h io_dir.h
> > > FD_HDRS := fd/array.h
> > > FS_HDRS := fs/fs.h fs/tracing_path.h
> > > INSTALL_HDRS_PFX := $(DESTDIR)$(prefix)/include/api
> > > diff --git a/tools/lib/api/io_dir.h b/tools/lib/api/io_dir.h
> > > new file mode 100644
> > > index 000000000000..c84738923c96
> > > --- /dev/null
> > > +++ b/tools/lib/api/io_dir.h
> > > @@ -0,0 +1,93 @@
> > > +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
> > > +/*
> > > + * Lightweight directory reading library.
> > > + */
> > > +#ifndef __API_IO_DIR__
> > > +#define __API_IO_DIR__
> > > +
> > > +#include <dirent.h>
> > > +#include <fcntl.h>
> > > +#include <stdlib.h>
> > > +#include <unistd.h>
> > > +#include <sys/stat.h>
> > > +#include <sys/syscall.h>
> > > +
> > > +#if !defined(SYS_getdents64)
> > > +#if defined(__x86_64__)
> > > +#define SYS_getdents64 217
> > > +#elif defined(__aarch64__)
> > > +#define SYS_getdents64 61
> > > +#endif
> > > +#endif
> > > +
> > > > +static inline ssize_t perf_getdents64(int fd, void *dirp, size_t count)
> > > > +{
> > > > +#ifdef MEMORY_SANITIZER
> > > > +	memset(dirp, 0, count);
> > > > +#endif
> > > > +	return syscall(SYS_getdents64, fd, dirp, count);
> >
> > Unfortunately this fails to build on my i386 vm (and probably other old
> > archs don't have SYS_getdents64 yet).
> >
> > In file included from util/pmus.c:6:
> > /build/libapi/include/api/io_dir.h: In function 'perf_getdents64':
> > /build/libapi/include/api/io_dir.h:28:24: error: 'SYS_getdents64' undeclared (first use in this function); did you mean 'perf_getdents64'?
> > 28 | return syscall(SYS_getdents64, fd, dirp, count);
> > | ^~~~~~~~~~~~~~
> > | perf_getdents64
> >
> > > +}
> > > +#endif
> >
> > > Maybe this #endif is mismatched.
>
> So even on 32-bit systems we want getdents64 as getdents encodes the
> d_type at the end of dirent making it hard to index. On i386 we know
> the number of the syscall for perf trace:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/arch/x86/entry/syscalls/syscall_32.tbl?h=perf-tools-next#n235
> So we can presumably change:
> ```
> #if !defined(SYS_getdents64)
> #if defined(__x86_64__)
> #define SYS_getdents64 217
> #elif defined(__aarch64__)
> #define SYS_getdents64 61
> #endif
> #endif
> ```
> to also have:
> ```
> #elif defined(__i386__)
> #define SYS_getdents64 220
> ```
> Could you test this so that I don't need to resend 7 patches for each
> architecture you test upon? The man page says <sys/syscall.h> and
> <unistd.h> should be sufficient for the code to work, so I think
> addressing this is adding workarounds for distros that aren't
> conformant - i.e. it's the distro's fault the code fails to compile and
> not the tool's.
It fixes the issue on my machine, but I'm afraid others will see the same
issue on other archs. I think <sys/syscall.h> should provide the syscall
number, but the problem is old distros that don't ship recent headers. So
it's a matter of how long the tool needs to support such old distros. :(
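
For illustration only, here is a minimal sketch of how the fallback block
might look with the i386 number added. The three numbers are the ones
quoted in this thread. The run-time ENOSYS branch for other archs and the
name sketch_getdents64() are assumptions for this sketch and are not part
of the posted patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback numbers for old headers; values as discussed in this thread. */
#if !defined(SYS_getdents64)
#if defined(__x86_64__)
#define SYS_getdents64 217
#elif defined(__i386__)
#define SYS_getdents64 220
#elif defined(__aarch64__)
#define SYS_getdents64 61
#endif
#endif

/* Hypothetical name; the patch calls its wrapper perf_getdents64(). */
static inline ssize_t sketch_getdents64(int fd, void *dirp, size_t count)
{
#ifdef SYS_getdents64
	return syscall(SYS_getdents64, fd, dirp, count);
#else
	/* Assumption: fail at run time instead of breaking the build. */
	(void)fd;
	(void)dirp;
	(void)count;
	errno = ENOSYS;
	return -1;
#endif
}
```

With something like this, an arch without a known number would still
build, and callers would see the read fail as they would for any other
getdents64 error.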
Thanks,
Namhyung
> >
> > > +
> > > > +struct io_dirent64 {
> > > > +	ino64_t d_ino; /* 64-bit inode number */
> > > > +	off64_t d_off; /* 64-bit offset to next structure */
> > > > +	unsigned short d_reclen; /* Size of this dirent */
> > > > +	unsigned char d_type; /* File type */
> > > > +	char d_name[NAME_MAX + 1]; /* Filename (null-terminated) */
> > > > +};
> > > > +
> > > > +struct io_dir {
> > > > +	int dirfd;
> > > > +	ssize_t available_bytes;
> > > > +	struct io_dirent64 *next;
> > > > +	struct io_dirent64 buff[4];
> > > > +};
> > > > +
> > > > +static inline void io_dir__init(struct io_dir *iod, int dirfd)
> > > > +{
> > > > +	iod->dirfd = dirfd;
> > > > +	iod->available_bytes = 0;
> > > > +}
> > > > +
> > > > +static inline void io_dir__rewinddir(struct io_dir *iod)
> > > > +{
> > > > +	lseek(iod->dirfd, 0, SEEK_SET);
> > > > +	iod->available_bytes = 0;
> > > > +}
> > > > +
> > > > +static inline struct io_dirent64 *io_dir__readdir(struct io_dir *iod)
> > > > +{
> > > > +	struct io_dirent64 *entry;
> > > > +
> > > > +	if (iod->available_bytes <= 0) {
> > > > +		ssize_t rc = perf_getdents64(iod->dirfd, iod->buff, sizeof(iod->buff));
> > > > +
> > > > +		if (rc <= 0)
> > > > +			return NULL;
> > > > +		iod->available_bytes = rc;
> > > > +		iod->next = iod->buff;
> > > > +	}
> > > > +	entry = iod->next;
> > > > +	iod->next = (struct io_dirent64 *)((char *)entry + entry->d_reclen);
> > > > +	iod->available_bytes -= entry->d_reclen;
> > > > +	return entry;
> > > > +}
> > > > +
> > > > +static inline bool io_dir__is_dir(const struct io_dir *iod, struct io_dirent64 *dent)
> > > > +{
> > > > +	if (dent->d_type == DT_UNKNOWN) {
> > > > +		struct stat st;
> > > > +
> > > > +		if (fstatat(iod->dirfd, dent->d_name, &st, /*flags=*/0))
> > > > +			return false;
> > > > +
> > > > +		if (S_ISDIR(st.st_mode)) {
> > > > +			dent->d_type = DT_DIR;
> > > > +			return true;
> > > > +		}
> > > > +	}
> > > > +	return dent->d_type == DT_DIR;
> > > > +}
> > > +
> > > +#endif
> > > --
> > > 2.48.1.502.g6dc24dfdaf-goog
> > >