Re: [RFC PATCH v3] ceph: ceph: add remote object copies to fs client metrics

From: Luís Henriques
Date: Thu Oct 28 2021 - 10:29:55 EST


On Thu, Oct 28, 2021 at 09:27:08PM +0800, Xiubo Li wrote:
>
> On 10/28/21 7:48 PM, Luís Henriques wrote:
> > This patch adds latency and size metrics for remote object copies
> > operations ("copyfrom"). For now, these metrics will be available on the
> > client only, they won't be sent to the MDS.
> >
> > Cc: Patrick Donnelly<pdonnell@xxxxxxxxxx>
> > Signed-off-by: Luís Henriques<lhenriques@xxxxxxx>
> > ---
> > This patch is still an RFC because it is... ugly. Although it now
> > provides nice values (latency and size) using the metrics infrastructure,
> > it actually needs to extend the ceph_osdc_copy_from() function to add 2
> > extra args! That's because we need to get the timestamps stored in
> > ceph_osd_request, which is handled within that function.
> >
> > The alternative is to ignore those timestamps and collect new ones in
> > ceph_do_objects_copy():
> >
> > start_req = ktime_get();
> > ceph_osdc_copy_from(...);
> > end_req = ktime_get();
> >
> > These would be more coarse-grained, of course. Any other suggestions?
> >
> > Cheers,
> > -- Luís fs/ceph/debugfs.c | 19 ++++++++++++++++++ fs/ceph/file.c | 7
> > ++++++- fs/ceph/metric.c | 35 +++++++++++++++++++++++++++++++++
> > fs/ceph/metric.h | 14 +++++++++++++ include/linux/ceph/osd_client.h | 3
> > ++- net/ceph/osd_client.c | 8 ++++++-- 6 files changed, 82
> > insertions(+), 4 deletions(-) diff --git a/fs/ceph/debugfs.c
> > b/fs/ceph/debugfs.c index 55426514491b..b657170d6bc3 100644 ---
> > a/fs/ceph/debugfs.c +++ b/fs/ceph/debugfs.c @@ -203,6 +203,16 @@ static
> > int metrics_latency_show(struct seq_file *s, void *p)
> > spin_unlock(&m->metadata_metric_lock); CEPH_LAT_METRIC_SHOW("metadata",
> > total, avg, min, max, sq); + spin_lock(&m->copyfrom_metric_lock); +
> > total = m->total_copyfrom; + sum = m->copyfrom_latency_sum; + avg =
> > total > 0 ? DIV64_U64_ROUND_CLOSEST(sum, total) : 0; + min =
> > m->copyfrom_latency_min; + max = m->copyfrom_latency_max; + sq =
> > m->copyfrom_latency_sq_sum; + spin_unlock(&m->copyfrom_metric_lock); +
> > CEPH_LAT_METRIC_SHOW("copyfrom", total, avg, min, max, sq); + return 0;
> > } @@ -234,6 +244,15 @@ static int metrics_size_show(struct seq_file *s,
> > void *p) spin_unlock(&m->write_metric_lock);
> > CEPH_SZ_METRIC_SHOW("write", total, avg_sz, min_sz, max_sz, sum_sz); +
> > spin_lock(&m->copyfrom_metric_lock); + total = m->total_copyfrom; +
> > sum_sz = m->copyfrom_size_sum; + avg_sz = total > 0 ?
> > DIV64_U64_ROUND_CLOSEST(sum_sz, total) : 0; + min_sz =
> > m->copyfrom_size_min; + max_sz = m->copyfrom_size_max; +
> > spin_unlock(&m->copyfrom_metric_lock); + CEPH_SZ_METRIC_SHOW("copyfrom",
> > total, avg_sz, min_sz, max_sz, sum_sz); + return 0; } diff --git
> > a/fs/ceph/file.c b/fs/ceph/file.c index e61018d9764e..d1139bbcd58d
> > 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -2208,6 +2208,7 @@
> > static ssize_t ceph_do_objects_copy(struct ceph_inode_info *src_ci, u64
> > *src_off struct ceph_object_locator src_oloc, dst_oloc; struct
> > ceph_object_id src_oid, dst_oid; size_t bytes = 0; + ktime_t start_req,
> > end_req; u64 src_objnum, src_objoff, dst_objnum, dst_objoff; u32
> > src_objlen, dst_objlen; u32 object_size = src_ci->i_layout.object_size;
> > @@ -2242,7 +2243,11 @@ static ssize_t ceph_do_objects_copy(struct
> > ceph_inode_info *src_ci, u64 *src_off CEPH_OSD_OP_FLAG_FADVISE_DONTNEED,
> > dst_ci->i_truncate_seq, dst_ci->i_truncate_size, -
> > CEPH_OSD_COPY_FROM_FLAG_TRUNCATE_SEQ); +
> > CEPH_OSD_COPY_FROM_FLAG_TRUNCATE_SEQ, + &start_req, &end_req); +
> > ceph_update_copyfrom_metrics(&fsc->mdsc->metric, + start_req, end_req, +
> > object_size, ret);

(Ugh! Your mail client completely messed-up the patch and took me a while
to figure out what you're suggesting :-) )

> Maybe you can move this to ceph_osdc_copy_from() by passing the object_size
> to it ?

I think this would mean to push into net/ceph/ more details about cephfs
(such as the knowledge about metrics). Which I think we should avoid.
I've just suggested something different in my reply to Jeff, maybe that's
a better approach (basically, get the OSD request struct from
ceph_osdc_copy_from()).

Cheers,
--
Luís