Re: [PATCH] ext4: make __ext4_get_inode_loc plug

From: Jan Kara
Date: Wed Jun 19 2019 - 08:30:07 EST


On Wed 19-06-19 19:34:00, Zhangjs Jinshui wrote:
> You can see it with blktrace:
>
> 8,80 31 11 0.296373038 2885275 Q RA 8279571464 + 8 [xxxx]
> 8,80 31 12 0.296374017 2885275 G RA 8279571464 + 8 [xxxx]
> 8,80 31 13 0.296375468 2885275 I RA 8279571464 + 8 [xxxx]
> 8,80 31 14 0.296382099 3886 D RA 8279571464 + 8 [kworker/31:1H]
> 8,80 31 15 0.296391907 2885275 Q RA 8279571472 + 8 [xxxx]
> 8,80 31 16 0.296392275 2885275 G RA 8279571472 + 8 [xxxx]
> 8,80 31 17 0.296393305 2885275 I RA 8279571472 + 8 [xxxx]
> 8,80 31 18 0.296395844 3886 D RA 8279571472 + 8 [kworker/31:1H]
> 8,80 31 19 0.296399685 2885275 Q RA 8279571480 + 8 [xxxx]
> 8,80 31 20 0.296400025 2885275 G RA 8279571480 + 8 [xxxx]
> 8,80 31 21 0.296401232 2885275 I RA 8279571480 + 8 [xxxx]
> 8,80 31 22 0.296403422 3886 D RA 8279571480 + 8 [kworker/31:1H]
> 8,80 31 23 0.296407375 2885275 Q RA 8279571488 + 8 [xxxx]
> 8,80 31 24 0.296407721 2885275 G RA 8279571488 + 8 [xxxx]
> 8,80 31 25 0.296408904 2885275 I RA 8279571488 + 8 [xxxx]
> 8,80 31 26 0.296411127 3886 D RA 8279571488 + 8 [kworker/31:1H]
> 8,80 31 27 0.296414779 2885275 Q RA 8279571496 + 8 [xxxx]
> 8,80 31 28 0.296415119 2885275 G RA 8279571496 + 8 [xxxx]
> 8,80 31 29 0.296415744 2885275 I RA 8279571496 + 8 [xxxx]
> 8,80 31 30 0.296417779 3886 D RA 8279571496 + 8 [kworker/31:1H]
>
> These RA I/Os were caused by ext4_inode_readahead_blks; none of them are merged because the task is unplugged.
> The backtrace below was captured with the systemtap ioblock.request probe, filtered by "opf & 1 << 19".
>
> 0xffffffff8136fb20 : generic_make_request+0x0/0x2f0 [kernel]
> 0xffffffff8136fe7e : submit_bio+0x6e/0x130 [kernel]
> 0xffffffff812971e6 : submit_bh_wbc+0x156/0x190 [kernel]
> 0xffffffff81297bca : ll_rw_block+0x6a/0xb0 [kernel]
> 0xffffffff81297cc0 : __breadahead+0x40/0x70 [kernel]
> 0xffffffffa0392c9a : __ext4_get_inode_loc+0x37a/0x3d0 [ext4]
> 0xffffffffa0396a6c : ext4_iget+0x8c/0xc00 [ext4]
> 0xffffffffa03ad98a : ext4_lookup+0xca/0x1d0 [ext4]
> 0xffffffff8126b814 : path_openat+0xcb4/0x1250 [kernel]
> 0xffffffff8126dc41 : do_filp_open+0x91/0x100 [kernel]
> 0xffffffff8125ad86 : do_sys_open+0x126/0x210 [kernel]
> 0xffffffff81003864 : do_syscall_64+0x74/0x1a0 [kernel]
> 0xffffffff81800081 : entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [kernel]
>
> I have applied the patch on our online servers, and it improves performance.

Ah, OK, directory lookup code... Makes sense. Thanks for sharing!

Honza

>
> > On Jun 19, 2019, at 19:08, Jan Kara <jack@xxxxxxx> wrote:
> >
> > On Mon 17-06-19 23:57:12, jinshui zhang wrote:
> >> From: zhangjs <zachary@xxxxxxxxxxxxxxxx>
> >>
> >> If the task is not plugged when __ext4_get_inode_loc() is called, the
> >> inode_readahead_blks requests may not be merged, which results in many
> >> small I/Os. The I/O submission should be plugged.
> >>
> >> Signed-off-by: zhangjs <zachary@xxxxxxxxxxxxxxxx>
> >
> > Out of curiosity, on which path do you see __ext4_get_inode_loc() being
> > called without IO already plugged?
> >
> > Otherwise the patch looks good to me. You can add:
> >
> > Reviewed-by: Jan Kara <jack@xxxxxxx>
> >
> > Honza
> >
> >> ---
> >> fs/ext4/inode.c | 6 ++++++
> >> 1 file changed, 6 insertions(+)
> >>
> >> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> >> index c7f77c6..8fe046b 100644
> >> --- a/fs/ext4/inode.c
> >> +++ b/fs/ext4/inode.c
> >> @@ -4570,6 +4570,7 @@ static int __ext4_get_inode_loc(struct inode *inode,
> >>  	struct buffer_head *bh;
> >>  	struct super_block *sb = inode->i_sb;
> >>  	ext4_fsblk_t block;
> >> +	struct blk_plug plug;
> >>  	int inodes_per_block, inode_offset;
> >>
> >>  	iloc->bh = NULL;
> >> @@ -4654,6 +4655,8 @@ static int __ext4_get_inode_loc(struct inode *inode,
> >>  		}
> >>
> >>  make_io:
> >> +		blk_start_plug(&plug);
> >> +
> >>  		/*
> >>  		 * If we need to do any I/O, try to pre-readahead extra
> >>  		 * blocks from the inode table.
> >> @@ -4688,6 +4691,9 @@ static int __ext4_get_inode_loc(struct inode *inode,
> >>  		get_bh(bh);
> >>  		bh->b_end_io = end_buffer_read_sync;
> >>  		submit_bh(REQ_OP_READ, REQ_META | REQ_PRIO, bh);
> >> +
> >> +		blk_finish_plug(&plug);
> >> +
> >>  		wait_on_buffer(bh);
> >>  		if (!buffer_uptodate(bh)) {
> >>  			EXT4_ERROR_INODE_BLOCK(inode, block,
> >> --
> >> 1.8.3.1
> >>
> > --
> > Jan Kara <jack@xxxxxxxx>
> > SUSE Labs, CR
>
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR