Re: [RFC PATCH 0/3] vfs: convert inode cache to hlist-bl

From: Kent Overstreet
Date: Tue Apr 06 2021 - 18:03:52 EST


On Tue, Apr 06, 2021 at 10:33:40PM +1000, Dave Chinner wrote:
> Hi folks,
>
> Recently I've been doing some scalability characterisation of
> various filesystems, and one of the limiting factors that has
> prevented me from exploring filesystem characteristics is the
> inode hash table. namely, the global inode_hash_lock that protects
> it.
>
> This has long been a problem, but I personally haven't cared about
> it because, well, XFS doesn't use it and so it's not a limiting
> factor for most of my work. However, in trying to characterise the
> scalability boundaries of bcachefs, I kept hitting against VFS
> limitations first. bcachefs hits the inode hash table pretty hard
> and it becaomse a contention point a lot sooner than it does for
> ext4. Btrfs also uses the inode hash, but it's namespace doesn't
> have the capability to stress the indoe hash lock due to it hitting
> internal contention first.
>
> Long story short, I did what should have been done a decade or more
> ago - I converted the inode hash table to use hlist-bl to split up
> the global lock. This is modelled on the dentry cache, with one
> minor tweak. That is, the inode hash value cannot be calculated from
> the inode, so we have to keep a record of either the hash value or a
> pointer to the hlist-bl list head that the inode is hashed into so
> taht we can lock the corect list on removal.
>
> Other than that, this is mostly just a mechanical conversion from
> one list and lock type to another. None of the algorithms have
> changed and none of the RCU behaviours have changed. But it removes
> the inode_hash_lock from the picture and so performance for bcachefs
> goes way up and CPU usage for ext4 halves at 16 and 32 threads. At
> higher thread counts, we start to hit filesystem and other VFS locks
> as the limiting factors. Profiles and performance numbers are in
> patch 3 for those that are curious.
>
> I've been running this in benchmarks and perf testing across
> bcachefs, btrfs and ext4 for a couple of weeks, and it passes
> fstests on ext4 and btrfs without regressions. So now it needs more
> eyes and testing and hopefully merging....

These patches have been in the bcachefs repo for a bit with no issues, and they
definitely do help with performance - thanks, Dave!