Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
From: Ming Lei
Date: Tue Dec 10 2019 - 23:01:26 EST
On Tue, Dec 10, 2019 at 09:41:37PM -0500, Theodore Y. Ts'o wrote:
> On Tue, Dec 10, 2019 at 04:05:50PM +0800, Ming Lei wrote:
> > > > The path[2] is expected behaviour. Not sure path [1] is correct,
> > > > given
> > > > ext4_release_file() is supposed to be called when this inode is
> > > > released. That means the file is closed 4358 times during 1GB file
> > > > copying to usb storage.
> > > >
> > > > [1] insert requests when returning to user mode from syscall
> > > >
> > > > b'blk_mq_sched_request_inserted'
> > > > b'blk_mq_sched_request_inserted'
> > > > b'dd_insert_requests'
> > > > b'blk_mq_sched_insert_requests'
> > > > b'blk_mq_flush_plug_list'
> > > > b'blk_flush_plug_list'
> > > > b'io_schedule_prepare'
> > > > b'io_schedule'
> > > > b'rq_qos_wait'
> > > > b'wbt_wait'
> > > > b'__rq_qos_throttle'
> > > > b'blk_mq_make_request'
> > > > b'generic_make_request'
> > > > b'submit_bio'
> > > > b'ext4_io_submit'
> > > > b'ext4_writepages'
> > > > b'do_writepages'
> > > > b'__filemap_fdatawrite_range'
> > > > b'ext4_release_file'
> > > > b'__fput'
> > > > b'task_work_run'
> > > > b'exit_to_usermode_loop'
> > > > b'do_syscall_64'
> > > > b'entry_SYSCALL_64_after_hwframe'
> > > > 4358
>
> I'm guessing that your workload is repeatedly truncating a file (or
> calling open with O_TRUNC) and then writing data to it. When you do
> this, then when the file is closed, we assume that since you were
> replacing the previous contents of a file with new contents, that you
> would be unhappy if the file contents was replaced by a zero length
> file after a crash. That's because ten years, ago there were a *huge*
> number of crappy applications that would replace a file by reading it
> into memory, truncating it, and then write out the new contents of the
> file. This could be a high score file for a game, or a KDE or GNOME
> state file, etc.
>
> So if someone does open, truncate, write, close, we still immediately
> writing out the data on the close, assuming that the programmer really
> wanted open, truncate, write, fsync, close, but was too careless to
> actually do the right thing.
>
> Some workaround[1] like this is done by all of the major file systems,
> and was fallout the agreement from the "O_PONIES"[2] controversy.
> This was discussed and agreed to at the 2009 LSF/MM workshop. (See
> the "rename, fsync, and ponies" section.)
>
> [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45
> [2] https://blahg.josefsipek.net/?p=364
> [3] https://lwn.net/Articles/327601/
>
> So if you're seeing a call to filemap_fdatawrite_range as the result
> of a fput, that's why.
>
> In any case, this behavior has been around for a decade, and it
> appears to be incidental to your performance difficulties with your
> USB thumbdrive and block-mq.
I didn't reproduce the issue in my test environment, and follows
Andrea's test commands[1]:
mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
SECONDS=0
cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
umount /mnt/pendrive 2>&1 |tee -a $logfile
The 'cp' command supposes to open/close the file just once, however
ext4_release_file() & write pages is observed to run for 4358 times
when executing the above 'cp' test.
[1] https://marc.info/?l=linux-kernel&m=157486689806734&w=2
Thanks,
Ming