Re: [PATCH V9 0/4] fuse: Add support for passthrough read/write
From: Alessio Balsini
Date: Fri Oct 02 2020 - 09:38:13 EST
On Wed, Sep 30, 2020 at 05:33:30PM +0200, Miklos Szeredi wrote:
> On Thu, Sep 24, 2020 at 3:13 PM Alessio Balsini <balsini@xxxxxxxxxxx> wrote:
>
> > The first benchmarks were done by running FIO (fio-3.21) with:
> > - bs=4Ki;
> > - file size: 50Gi;
> > - ioengine: sync;
> > - fsync_on_close: true.
> > The target file has been chosen large enough to avoid it being entirely
> > loaded into the page cache.
> > Results are presented in the following table:
> >
> > +-----------+--------+-------------+--------+
> > | Bandwidth | FUSE | FUSE | Bind |
> > | (KiB/s) | | passthrough | mount |
> > +-----------+--------+-------------+--------+
> > | read | 468897 | 502085 | 516830 |
> > +-----------+--------+-------------+--------+
> > | randread | 15773 | 26632 | 21386 |
>
>
> Have you looked into why passthrough is faster than native?
>
> Thanks,
> Miklos
Hi Miklos,
Thank you for bringing this to my attention; I probably missed it because I
was focusing on the comparison between FUSE and FUSE passthrough.
I jumped back to benchmarking right after you sent this email.
At first glance I thought I had made a stupid copy-paste mistake, but looking
at a bunch of the partial results I'm collecting, I realized that the Vi550 S3
SSD I'm using sometimes has unstable performance, especially when dealing
with random offsets. I also realized that SSD performance might change
depending on previous operations.
To address these issues, each test is now being run 10 times, and at
post-processing time I'm planning to take the median to remove possible
outliers.
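Just to make the post-processing concrete, this is roughly what I have in
mind, assuming each run is saved with fio's --output-format=json and
--output=run-N.json, and that jq is available (file names and the exact JSON
field are just what I'm using here, not anything special):

  # middle value of the sorted per-run read bandwidths (KiB/s); for an even
  # number of runs this picks the lower of the two central values
  jq '.jobs[0].read.bw' run-*.json | sort -n \
          | awk '{ v[NR] = $1 } END { print v[int((NR + 1) / 2)] }'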
I also noticed that the performance noise increases after the SSD has been
busy for a few minutes. This made me think of some kind of SSD thermal
throttling I had totally overlooked.
This might be the reason why passthrough is performing better than native in
the numbers you highlighted.
Unfortunately, the SMART registers of my SSD always report 33 degrees
Celsius regardless of the workload, so to work around this I'm now applying
a 5-minute cooldown between each run.
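For reference, this is how I'm reading that temperature between runs (the
device node is just an example, and the attribute name differs between
drives):

  smartctl -A /dev/sda | grep -i temperature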
This time I'm also removing fsync_on_close and reducing the file size to 25
GiB to improve caching and limit the interaction with the SSD during
writes. Also for caching reasons, I am separating the creation of the fio
target file from the actual execution of the benchmark by first running fio
with create_only=1. In the above benchmark I was just sync-ing and dropping
the page cache before triggering fio; now I also drop slab objects,
including inodes and dentries:
echo 3 > /proc/sys/vm/drop_caches
which I suspect won't make any difference, but shouldn't hurt either.
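To make the new procedure explicit, here is a rough sketch of what each
benchmark now looks like; the directory, job name and parameters below are
placeholders for the actual jobs, not the exact configuration:

  # job parameters roughly matching the setup described above
  FIO_JOB="--name=randread --directory=/mnt/fuse --rw=randread \
           --bs=4k --size=25g --ioengine=sync"

  # lay out the target file once, separately from the measured runs
  fio $FIO_JOB --create_only=1

  for i in $(seq 1 10); do
          sync
          echo 3 > /proc/sys/vm/drop_caches    # needs root
          fio $FIO_JOB --output-format=json --output="run-$i.json"
          sleep 300    # 5-minute cooldown against the suspected throttling
  done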
Please let me know if you have any suggestions on how to improve my
benchmarks, or if you recommend tools other than fio (which I actually
really like) for making comparisons.
Thanks,
Alessio