Re: [patch 6/6] statistics infrastructure - exploitation: zfcp

From: Martin Peschke
Date: Wed Dec 14 2005 - 22:53:33 EST


Christoph Hellwig wrote:

+ atomic_t read_num;
+ atomic_t write_num;
+ struct statistic_interface *stat_if;
+ struct statistic *stat_sizes_scsi_write;
+ struct statistic *stat_sizes_scsi_read;
+ struct statistic *stat_sizes_scsi_nodata;
+ struct statistic *stat_sizes_scsi_nofit;
+ struct statistic *stat_sizes_scsi_nomem;
+ struct statistic *stat_sizes_timedout_write;
+ struct statistic *stat_sizes_timedout_read;
+ struct statistic *stat_sizes_timedout_nodata;
+ struct statistic *stat_latencies_scsi_write;
+ struct statistic *stat_latencies_scsi_read;
+ struct statistic *stat_latencies_scsi_nodata;
+ struct statistic *stat_pending_scsi_write;
+ struct statistic *stat_pending_scsi_read;
+ struct statistic *stat_erp;
+ struct statistic *stat_eh_reset;



NACK. pretty much all of this is generic and doesn't belong into an LLDD.
We already had this statistics things with emulex and they added various
bits to the core in response.




Agreed. It's not necessarily up to LLDDs to keep track of request sizes, request latencies, I/O queue utilization, and error recovery conditions by means of statistics. This could or maybe should be done in a more central spot.

With regard to latencies, it might make some difference, though, how many layers are in between that cause additional delays. Then the question is which latency one wants to measure.
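To make the "which latency" question concrete, here is a minimal user-space sketch in C. It assumes a request carries three timestamps; the struct, its field names, and the helper functions are hypothetical and appear in neither the zfcp patch nor the SCSI stack. The same request yields different latencies depending on where measurement starts.

```c
#include <stdint.h>

/* Hypothetical per-request timestamps, in nanoseconds. */
struct req_stamps {
	uint64_t queued_ns;    /* request enters an upper layer (assumption) */
	uint64_t issued_ns;    /* LLDD passes the request to the adapter */
	uint64_t completed_ns; /* completion is seen by the LLDD */
};

/* Latency as seen from the upper layer: includes queueing delays
 * added by every layer in between. */
static uint64_t latency_total_ns(const struct req_stamps *s)
{
	return s->completed_ns - s->queued_ns;
}

/* Latency as seen from the LLDD: adapter, fabric and device only. */
static uint64_t latency_lldd_ns(const struct req_stamps *s)
{
	return s->completed_ns - s->issued_ns;
}
```

The difference between the two values is the delay contributed by the layers above the LLDD.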

There is some very basic measurement data on FC-4 or FCP level in the FC transport class code:

&class_device_attr_host_fcp_input_requests.attr,
&class_device_attr_host_fcp_output_requests.attr,
&class_device_attr_host_fcp_control_requests.attr,
&class_device_attr_host_fcp_input_megabytes.attr,
&class_device_attr_host_fcp_output_megabytes.attr,
&class_device_attr_host_reset_statistics.attr,

Looks like
- counters for the number of read and write requests and requests without any data
- counters for the number of megabytes read and written
- a counter for one out of several recovery conditions

The gaps between the statistics I posted and the FC transport class statistics are:
- no information about the actual traffic pattern generated by Linux
(no information - e.g. histogram - about request size,
no information - e.g. histogram - about latencies)
- no information about command timeouts
- no information about I/O concurrency caused by TCQ
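As an illustration of the kind of histogram meant above, here is a minimal user-space sketch in C with power-of-two buckets. The names (size_hist, hist_index, hist_add) and the bucket count are hypothetical and are not taken from the posted statistics code. It only shows how a distribution of request sizes or latencies can be kept in a handful of counters.

```c
#include <stdint.h>

#define HIST_BUCKETS 16 /* hypothetical bucket count */

/* Hypothetical histogram: bucket 0 counts the value 0, bucket i (i > 0)
 * counts values in [2^(i-1), 2^i - 1]; the last bucket also collects
 * everything larger. */
struct size_hist {
	uint64_t bucket[HIST_BUCKETS];
};

/* Bucket index = number of significant bits in value, clamped. */
static unsigned int hist_index(uint64_t value)
{
	unsigned int idx = 0;

	while (value != 0) {
		value >>= 1;
		idx++;
	}
	return idx < HIST_BUCKETS ? idx : HIST_BUCKETS - 1;
}

/* Account one sample, e.g. a request size in bytes. */
static void hist_add(struct size_hist *h, uint64_t value)
{
	h->bucket[hist_index(value)]++;
}
```

A plain request counter cannot distinguish many small requests from a few large ones; the per-bucket counts can.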

I am not sure whether the transport statistics refer to the overall utilization of an actual FC port (let's call it the physical HBA), or to the share of an FC port used by one of several OS instances sharing it (let's call these shares, carved out of an FC port, virtual HBAs).

I won't object to moving some of this. But I don't think a transport class would be the right place for latencies etc., nor would I like to give up certain functionality, such as histograms.
Would it be fine with you to move such statistics to the SCSI mid-layer, provided I can get lkml's approval for some form of generic statistics code like the one that comes with my zfcp patch?

Martin
