Re: [RFC PATCH v3] ceph: add remote object copies to fs client metrics

From: Jeff Layton
Date: Thu Oct 28 2021 - 10:38:07 EST


On Thu, 2021-10-28 at 15:25 +0100, Luís Henriques wrote:
> On Thu, Oct 28, 2021 at 08:41:52AM -0400, Jeff Layton wrote:
> > On Thu, 2021-10-28 at 12:48 +0100, Luís Henriques wrote:
> > > This patch adds latency and size metrics for remote object copy
> > > operations ("copyfrom").  For now, these metrics will be available on the
> > > client only; they won't be sent to the MDS.
> > >
> > > Cc: Patrick Donnelly <pdonnell@xxxxxxxxxx>
> > > Signed-off-by: Luís Henriques <lhenriques@xxxxxxx>
> > > ---
> > > This patch is still an RFC because it is... ugly.  Although it now
> > > provides nice values (latency and size) using the metrics infrastructure,
> > > it actually needs to extend the ceph_osdc_copy_from() function to add 2
> > > extra args!  That's because we need to get the timestamps stored in
> > > ceph_osd_request, which is handled within that function.
> > >
> > > The alternative is to ignore those timestamps and collect new ones in
> > > ceph_do_objects_copy():
> > >
> > > start_req = ktime_get();
> > > ceph_osdc_copy_from(...);
> > > end_req = ktime_get();
> > >
> > > These would be more coarse-grained, of course.  Any other suggestions?
> > >
> >
> > Not really. It is definitely ugly, I'll grant you that though...
> >
> > The cleaner method might be to just inline ceph_osdc_copy_from in
> > ceph_do_objects_copy so that you deal with the req in there.
>
> Yeah, but the reason for having these 2 functions was to keep net/ceph/
> code free from cephfs-specific code. Inlining ceph_osdc_copy_from would
> need to bring some extra FS knowledge into libceph.ko. Right now the
> function in osd_client receives only the required args for doing a copyfrom
> operation. (But TBH it's possible that libceph already contains several
> bits that are cephfs or rbd specific.)
>


Oh, I was just suggesting that you copy the guts out of
ceph_osdc_copy_from() and paste them into its only caller
(ceph_do_objects_copy). That would give you access to the OSD req field
in ceph_do_objects_copy and you could just copy the appropriate fields
there.


> However, I just realized that I do have some code here that changes
> ceph_osdc_copy_from() to return the OSD req struct. The caller would then
> be responsible for doing the ceph_osdc_wait_request(). This code was from
> my copy_file_range parallelization patch (which I should revisit one of
> these days), but could be reused here. Do you think it would be
> acceptable?
>

Yeah, that would work too. The problem you have is that the OSD request
is driven by ceph_osdc_copy_from, and you probably want to do that in
ceph_do_objects_copy instead so you can get to the timestamp fields.
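For the archives, here's roughly what that ownership change looks like, modeled as a standalone userspace sketch rather than real kernel code (the names echo ceph_osdc_wait_request() and the request's start/end latency timestamps, but every type, value, and function below is a simplified stand-in):

```c
#include <stdint.h>
#include <stdlib.h>

typedef int64_t ktime_t; /* stand-in for the kernel's ktime_t */

/* Simplified stand-in for struct ceph_osd_request: only the fields
 * this sketch needs. The kernel stamps start/end times on the
 * request for its latency metrics. */
struct osd_request {
	ktime_t r_start_latency;
	ktime_t r_end_latency;
	int r_result;
};

/* Hypothetical reworked copy_from: submit the copy and hand the
 * request back to the caller instead of waiting on it internally. */
static struct osd_request *osdc_copy_from_submit(void)
{
	struct osd_request *req = calloc(1, sizeof(*req));

	if (!req)
		return NULL;
	req->r_start_latency = 100; /* stamped at submission (fake value) */
	return req;
}

/* Hypothetical wait helper, mirroring ceph_osdc_wait_request():
 * completes the request and stamps the end time. */
static int osdc_wait_request(struct osd_request *req)
{
	req->r_end_latency = 350; /* stamped at completion (fake value) */
	return req->r_result;
}

/* The caller (ceph_do_objects_copy() in the real code) now owns the
 * request across the wait, so the timestamps are right there for the
 * metrics update. */
static int do_object_copy(ktime_t *latency)
{
	struct osd_request *req = osdc_copy_from_submit();
	int ret;

	if (!req)
		return -1;
	ret = osdc_wait_request(req);
	*latency = req->r_end_latency - req->r_start_latency;
	free(req);
	return ret;
}
```

The point is just the split of submit and wait: once the caller holds the request, no extra out-args need to be bolted onto the copy_from function.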

> <...>
> > > + spinlock_t copyfrom_metric_lock;
> > > + u64 total_copyfrom;
> > > + u64 copyfrom_size_sum;
> > > + u64 copyfrom_size_min;
> > > + u64 copyfrom_size_max;
> > > + ktime_t copyfrom_latency_sum;
> > > + ktime_t copyfrom_latency_sq_sum;
> > > + ktime_t copyfrom_latency_min;
> > > + ktime_t copyfrom_latency_max;
> > > +
> >
> > Not a comment about your patch, specifically, but we have a lot of
> > copy/pasted code to deal with different parts of ceph_client_metric.
> >
> > It might be nice to eventually turn each of the read/write/copy metric
> > blocks in this struct into an array, and collapse a lot of the other
> > helper functions together.
> >
> > If you feel like doing that cleanup, I'd be happy to review. Otherwise,
> > I'll plan to look at it in the near future.
>
> Yeah, sure. I can have a look at that too.
>
> Cheers,
> --
> Luís
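As an aside, the array cleanup suggested above might end up looking something like this userspace sketch (all names are invented for illustration; the real struct would keep a per-block spinlock and initialize the minimums to the max value rather than using zero as "unset"):

```c
#include <stdint.h>

typedef int64_t ktime_t; /* stand-in for the kernel's ktime_t */

/* One generic latency/size block replaces the copy/pasted
 * read/write/copyfrom field groups in ceph_client_metric. */
enum metric_type {
	METRIC_READ,
	METRIC_WRITE,
	METRIC_COPYFROM,
	METRIC_MAX,
};

struct metric_block {
	uint64_t total;
	uint64_t size_sum;
	uint64_t size_min;
	uint64_t size_max;
	ktime_t latency_sum;
	ktime_t latency_sq_sum; /* for stdev, as in the quoted patch */
	ktime_t latency_min;
	ktime_t latency_max;
};

struct client_metric {
	struct metric_block blocks[METRIC_MAX];
};

/* A single update helper collapses the per-op copies of the same
 * min/max/sum bookkeeping. Zero means "unset" here for simplicity. */
static void metric_update(struct client_metric *m, enum metric_type t,
			  uint64_t size, ktime_t lat)
{
	struct metric_block *b = &m->blocks[t];

	b->total++;
	b->size_sum += size;
	if (!b->size_min || size < b->size_min)
		b->size_min = size;
	if (size > b->size_max)
		b->size_max = size;
	b->latency_sum += lat;
	b->latency_sq_sum += lat * lat;
	if (!b->latency_min || lat < b->latency_min)
		b->latency_min = lat;
	if (lat > b->latency_max)
		b->latency_max = lat;
}
```

With that shape, the debugfs dump and the reset path can loop over the array instead of repeating the same code per metric.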

--
Jeff Layton <jlayton@xxxxxxxxxx>