On Tue, 2024-07-09 at 12:17 +0200, Mateusz Guzik wrote:
Right, forgot to respond.
I think this approach based on next_fd quick check is more generic and scalable.
I suspect the different result is either because of mere variance
between reboots or blogbench using significantly less than 100 fds at
any given time -- I don't have an easy way to test at your scale at
the moment. You could probably test that by benching both approaches
while switching them at runtime with a static_branch. However, I don't
know if that effort is warranted atm.
So happens I'm busy with other stuff and it is not my call to either
block or let this in, so I'm buggering off.
On Tue, Jul 9, 2024 at 10:32 AM Ma, Yu <yu.ma@xxxxxxxxx> wrote:
On 7/5/2024 3:56 PM, Ma, Yu wrote:
I had something like this in mind:
diff --git a/fs/file.c b/fs/file.c
index a3b72aa64f11..4d3307e39db7 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -489,6 +489,16 @@ static unsigned int find_next_fd(struct fdtable *fdt, unsigned int start)
 	unsigned int maxfd = fdt->max_fds; /* always multiple of BITS_PER_LONG */
 	unsigned int maxbit = maxfd / BITS_PER_LONG;
 	unsigned int bitbit = start / BITS_PER_LONG;
+	unsigned int bit;
+
+	/*
+	 * Try to avoid looking at the second level map.
+	 */
+	bit = find_next_zero_bit(&fdt->open_fds[bitbit], BITS_PER_LONG,
+				start & (BITS_PER_LONG - 1));
+	if (bit < BITS_PER_LONG) {
+		return bit + bitbit * BITS_PER_LONG;
+	}
It just so happens that for blogbench, checking only the first 64 bits
allows a quicker skip to the two-level search, whereas with this approach
next_fd may be left in a 64-bit word that actually has no free bits, and
we then do a useless search in find_next_zero_bit(). Perhaps we should
check full_fds_bits to make sure there are empty slots before we take the
find_next_zero_bit() fast path. Something like:
	if (!test_bit(bitbit, fdt->full_fds_bits)) {
		bit = find_next_zero_bit(&fdt->open_fds[bitbit], BITS_PER_LONG,
					start & (BITS_PER_LONG - 1));
		if (bit < BITS_PER_LONG)
			return bit + bitbit * BITS_PER_LONG;
	}
Tim
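To make the guarded fast path above easier to follow outside the kernel, here is a minimal userspace sketch of a two-level bitmap search. It is not the code in fs/file.c: the toy_* names, the fixed four-word table, and the hand-rolled bit scans are assumptions made for illustration, and only the overall shape (check full_fds_bits first, then scan the word containing start, else fall back to the two-level search) follows the snippet above.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define TOY_WORDS 4u                        /* toy table size (assumption) */
#define TOY_MAX_FDS (TOY_WORDS * BITS_PER_LONG)

/* Hypothetical stand-in for struct fdtable, reduced to the two bitmaps. */
struct toy_fdtable {
	unsigned long open_fds[TOY_WORDS];  /* bit set: fd is in use */
	unsigned long full_fds_bits;        /* bit i set: open_fds[i] is full */
};

static int toy_is_open(const struct toy_fdtable *t, unsigned int fd)
{
	return (t->open_fds[fd / BITS_PER_LONG] >> (fd % BITS_PER_LONG)) & 1UL;
}

/* Mark fd as used and keep the first-level summary bitmap in sync. */
static void toy_set_open(struct toy_fdtable *t, unsigned int fd)
{
	unsigned int w = fd / BITS_PER_LONG;

	assert(fd < TOY_MAX_FDS);
	t->open_fds[w] |= 1UL << (fd % BITS_PER_LONG);
	if (t->open_fds[w] == ~0UL)
		t->full_fds_bits |= 1UL << w;
}

/* Return the lowest free fd >= start, or TOY_MAX_FDS if there is none. */
static unsigned int toy_find_next_fd(const struct toy_fdtable *t,
				     unsigned int start)
{
	unsigned int w = start / BITS_PER_LONG;
	unsigned int fd;

	if (w >= TOY_WORDS)
		return TOY_MAX_FDS;

	/* Fast path: scan the word holding start only if it is not full. */
	if (!(t->full_fds_bits & (1UL << w))) {
		unsigned int end = (w + 1) * BITS_PER_LONG;

		for (fd = start; fd < end; fd++)
			if (!toy_is_open(t, fd))
				return fd;
	}

	/* Slow path: skip full words via the summary, then scan onward. */
	while (w < TOY_WORDS && (t->full_fds_bits & (1UL << w)))
		w++;
	if (w >= TOY_WORDS)
		return TOY_MAX_FDS;
	if (w * BITS_PER_LONG > start)
		start = w * BITS_PER_LONG;
	for (fd = start; fd < TOY_MAX_FDS; fd++)
		if (!toy_is_open(t, fd))
			return fd;
	return TOY_MAX_FDS;
}
```

With fds 0 through BITS_PER_LONG + 1 in use, a search starting at 3 skips the first word through the summary bit instead of scanning it, and lands on BITS_PER_LONG + 2.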
Hi Guzik, Honza,
Drat, you're right. I missed that Ma did not add the proper offset to
open_fds. *This* is what I meant :)
Honza
Just tried this on v6.10-rc6; the improvement on top of patch 1 and
patch 2 is 7% for read and 3% for write, less than just checking the
first word.
Per my understanding, its performance would be better if we could find a
free bit in the same word as next_fd with high probability, but
next_fd only represents the lowest possible free bit. If fds are
opened and closed frequently and randomly, that might not always be the
case, since next_fd may be distributed randomly. For example, say 0-65
are occupied and fd=3 is freed, so next_fd is set to 3. The next time, 3
is allocated and next_fd is set to 4, while the actual first free bit
is 66. When 66 is allocated and fd=5 is then freed, the same process
is gone through again.
Yu
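The 0-65 example above can be replayed with a small userspace simulation. This is only a sketch under stated assumptions: the sim_* names and the missed_fast_paths counter are hypothetical, the next_fd update rules are a simplified model of the behaviour described above (allocation sets next_fd to fd + 1, freeing a lower fd pulls next_fd down), and the fast path here is the unguarded one that always scans the word containing next_fd.

```c
#include <assert.h>
#include <limits.h>

#define WORD_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define SIM_WORDS 4u                       /* toy table size (assumption) */
#define SIM_MAX_FDS (SIM_WORDS * WORD_BITS)

struct sim_table {
	unsigned long open_fds[SIM_WORDS]; /* bit set: fd is in use */
	unsigned int next_fd;              /* lower bound on the lowest free fd */
	unsigned int missed_fast_paths;    /* first-word scans that found nothing */
};

static int sim_is_open(const struct sim_table *t, unsigned int fd)
{
	return (t->open_fds[fd / WORD_BITS] >> (fd % WORD_BITS)) & 1UL;
}

/* Allocate the lowest free fd >= next_fd; SIM_MAX_FDS if the table is full. */
static unsigned int sim_alloc(struct sim_table *t)
{
	unsigned int start = t->next_fd;
	unsigned int word_end = (start / WORD_BITS + 1) * WORD_BITS;
	unsigned int fd = start;

	/* Unguarded fast path: look only at the word containing next_fd. */
	while (fd < word_end && fd < SIM_MAX_FDS && sim_is_open(t, fd))
		fd++;

	if (fd >= word_end && start < SIM_MAX_FDS) {
		/* The word had no free bit at or after next_fd: wasted scan. */
		t->missed_fast_paths++;
		while (fd < SIM_MAX_FDS && sim_is_open(t, fd))
			fd++;
	}
	if (fd >= SIM_MAX_FDS)
		return SIM_MAX_FDS;

	t->open_fds[fd / WORD_BITS] |= 1UL << (fd % WORD_BITS);
	t->next_fd = fd + 1;
	return fd;
}

/* Free fd and pull next_fd down if fd is below it. */
static void sim_free(struct sim_table *t, unsigned int fd)
{
	assert(fd < SIM_MAX_FDS);
	t->open_fds[fd / WORD_BITS] &= ~(1UL << (fd % WORD_BITS));
	if (fd < t->next_fd)
		t->next_fd = fd;
}
```

After filling fds 0 through WORD_BITS + 1, each round of "free a low fd, allocate twice" leaves next_fd inside the fully occupied first word, so the second allocation in every round counts one missed fast path before it finds the real free bit further up.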
Are there any more comments or ideas regarding the fast path? Thanks
for your time and any feedback :)
Regards
Yu