Re: [RFC PATCH v3] ceph: ceph: add remote object copies to fs client metrics

From: Luís Henriques
Date: Thu Oct 28 2021 - 10:25:37 EST


On Thu, Oct 28, 2021 at 08:41:52AM -0400, Jeff Layton wrote:
> On Thu, 2021-10-28 at 12:48 +0100, Luís Henriques wrote:
> > This patch adds latency and size metrics for remote object copies
> > operations ("copyfrom").  For now, these metrics will be available on the
> > client only, they won't be sent to the MDS.
> >
> > Cc: Patrick Donnelly <pdonnell@xxxxxxxxxx>
> > Signed-off-by: Luís Henriques <lhenriques@xxxxxxx>
> > ---
> > This patch is still an RFC because it is... ugly.  Although it now
> > provides nice values (latency and size) using the metrics infrastructure,
> > it actually needs to extend the ceph_osdc_copy_from() function to add 2
> > extra args!  That's because we need to get the timestamps stored in
> > ceph_osd_request, which is handled within that function.
> >
> > The alternative is to ignore those timestamps and collect new ones in
> > ceph_do_objects_copy():
> >
> > start_req = ktime_get();
> > ceph_osdc_copy_from(...);
> > end_req = ktime_get();
> >
> > These would be more coarse-grained, of course.  Any other suggestions?
> >
>
> Not really. It is definitely ugly, I'll grant you that though...
>
> The cleaner method might be to just inline ceph_osdc_copy_from in
> ceph_do_objects_copy so that you deal with the req in there.

Yeah, but the reason for having these 2 functions was to keep net/ceph/
code free from cephfs-specific code. Inlining ceph_osdc_copy_from would
need to bring some extra FS knowledge into libceph.ko. Right now the
funcion in osd_client receives only the required args for doing a copyfrom
operation. (But TBH it's possible that libceph already contains several
bits that are cephfs or rbd specific.)

However, I just realized that I do have some code here that changes
ceph_osdc_copy_from() to return the OSD req struct. The caller would then
be responsible for doing the ceph_osdc_wait_request(). This code was from
my copy_file_range parallelization patch (which I should revisit one of
these days), but could be reused here. Do you think it would be
acceptable?

<...>
> > + spinlock_t copyfrom_metric_lock;
> > + u64 total_copyfrom;
> > + u64 copyfrom_size_sum;
> > + u64 copyfrom_size_min;
> > + u64 copyfrom_size_max;
> > + ktime_t copyfrom_latency_sum;
> > + ktime_t copyfrom_latency_sq_sum;
> > + ktime_t copyfrom_latency_min;
> > + ktime_t copyfrom_latency_max;
> > +
>
> Not a comment about your patch, specifically, but we have a lot of
> copy/pasted code to deal with different parts of ceph_client_metric.
>
> It might be nice to eventually turn each of the read/write/copy metric
> blocks in this struct into an array, and collapse a lot of the other
> helper functions together.
>
> If you feel like doing that cleanup, I'd be happy to review. Otherwise,
> I'll plan to look at it in the near future.

Yeah, sure. I can have a look at that too.

Cheers,
--
Luís