What do the FC statistics under /sys/class/fc_host/hostN/statistics mean?

Issue

  • What do the FC statistics under /sys/class/fc_host/host*/statistics mean?
  • Is it possible to change parameters like fcp_output_megabytes to report bytes instead?
  • Where are the definitions for the /sys/class/fc_host/host*/statistics parameters found?
  • Why don't all storage adapters, such as hpsa or megasas, provide these statistics under sysfs?

Environment

  • Red Hat Enterprise Linux 5, 6, 7, 8, 9
  • FC HBA such as Qlogic (qla2xxx driver for example) or Emulex (lpfc driver) adapters

Resolution

  • The fc_host statistics are FC port-level statistics that, if implemented, are maintained by the HBA's firmware.
  • The fields and their definitions are defined by the [T11](www.t11.org) standard ["Fibre Channel HBA API (FC-HBA)"](https://webstore.ansi.org/Standards/INCITS/INCITS3862004S2014).
    • *"INCITS 386-2004[S2014] A standard Application Programming Interface (API) defines a scope within which, and a grammar by which, it is possible to write application software without attention to vendor-specific infrastructure behavior. This standard defines a standard Application Programming Interface the scope of which is management of Fibre Channel Host Bus Adapters (HBAs) and use of certain Fibre Channel facilities for discovery and management of the components of a Fibre Channel Storage Area Network. This standard is to be used with the Fibre Channel and SCSI families of standards."*
  • The most common FC HBA drivers implement this standard, including qla2xxx, lpfc, and fnic, as well as several others. However, some may implement only part of the set of port-level statistics. The actual implementation, including which fields are implemented, depends upon the vendor, the card model, and the firmware version running on that card.
  • If you have any specific questions on what is implemented, contact the storage vendor for assistance.
  • **NOTE:** The [reset_statistics](#NOTE-4) entry is the only writable entry. However, the standard defines this function as obsolete and states that "*it shall have no effect and return no value*". So while present, it may or may not actually do anything; that depends on whether the driver and/or HBA firmware implemented the reset.

Within the kernel's FC transport implementation is a structure called fc_host_statistics that is defined within /include/scsi/scsi_transport_fc.h. The structure members are defined by the "Fibre Channel HBA API (FC-HBA)" specification. The specification defines optional port-level statistics that can be maintained by the HBA's firmware and provided to the HBA's driver when the driver queries the HBA for the information.

As such, only FC adapters will have these parameters, and their definitions are fixed by the standard (so changing fcp_output_megabytes to fcp_output_bytes would break the standard and is therefore unlikely to be implemented). Each FC HBA's firmware implements the counters; reading the sysfs parameters returns the values held by that firmware.

/include/scsi/scsi_transport_fc.h

/*
 * FC Local Port (Host) Statistics
 */

/* FC Statistics - Following FC HBAAPI v2.0 guidelines */
struct fc_host_statistics {
        /* port statistics */
        u64 seconds_since_last_reset;
        u64 tx_frames;
        u64 tx_words;
        u64 rx_frames;
        u64 rx_words;
        u64 lip_count;
        u64 nos_count;
        u64 error_frames;
        u64 dumped_frames;
        u64 link_failure_count;
        u64 loss_of_sync_count;
        u64 loss_of_signal_count;
        u64 prim_seq_protocol_err_count;
        u64 invalid_tx_word_count;
        u64 invalid_crc_count;

        /* fc4 statistics  (only FCP supported currently) */
        u64 fcp_input_requests;
        u64 fcp_output_requests;
        u64 fcp_control_requests;
        u64 fcp_input_megabytes;
        u64 fcp_output_megabytes;
};

The above parameters, and a few more, appear within the sysfs directory at /sys/class/fc_host/host*/statistics for each FC adapter that implements the FC HBA API. As shown below, most of these parameters are defined by the T11 standard FC-HBA API specification. Reading these sysfs parameters causes the kernel to call the driver, which interrogates the adapter's firmware for the current counts and returns the current values. The output below from /sys/class/fc_host/host*/statistics shows

  • the parameter name as found within sysfs,
  • an example value for each, and
  • the referenced standard section number which defines the meaning of each parameter within the standard.
    • In this case a *draft* of the standard is referenced: 12/2003 T11/1568-D Revision 13 Draft of Fibre Channel HBA API (FC-HBAAPI) specification

Note:
Annex A.5 End Port Statistics of the standard details which of these parameters are Mandatory [M] and which are Optional [O]. Unavailable optional parameters have a designated value of -1 (0xffffffffffffffff hex). Not every FC HBA's driver/firmware is necessarily compliant with the FC HBAAPI specification, and some or all of the optional parameters may not be available from a specific adapter model's implementation. For example, see Bugzilla Bug 642424 - [RFE] Support performance statistics in /sys/class/fc_host/host?/statistics for QLogic FibreChannel HBAs. In that case the specific model's firmware didn't support collection and reporting of the desired FC HBAAPI statistics and therefore couldn't be "fixed". Both firmware and driver support are required in order to present these statistics within the sysfs statistics directory.
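
As a minimal sketch of how the "not available" value can be handled when reading these files, the bash function below prints every counter in a statistics directory in decimal and flags 0xffffffffffffffff. The name fc_stats_dump is a hypothetical helper chosen for illustration and is not part of any package; the sketch assumes the files contain hex values in the form shown in this article.

```shell
#!/bin/bash
# Sketch only: fc_stats_dump is a hypothetical helper name.
# Usage: fc_stats_dump /sys/class/fc_host/hostN/statistics
fc_stats_dump() {
    local dir=$1 f name val
    for f in "$dir"/*; do
        name=${f##*/}
        # Skip anything that is not a readable regular file.
        [ -f "$f" ] && [ -r "$f" ] || continue
        # reset_statistics is write-only, so there is nothing to read.
        [ "$name" = reset_statistics ] && continue
        val=$(cat "$f" 2>/dev/null) || continue
        case $val in
            # -1 means the driver/firmware does not provide this counter.
            0xffffffffffffffff) printf '%-28s %s\n' "$name" "not available" ;;
            # bash printf accepts C-style hex constants for %u.
            0x*)                printf '%-28s %u\n' "$name" "$val" ;;
            *)                  printf '%-28s %s\n' "$name" "$val" ;;
        esac
    done
}
```

Looping over /sys/class/fc_host/host* and calling fc_stats_dump on each host's statistics directory would list the counters for every FC host on the system.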

The parameters below from seconds_since_last_reset through fcp_output_megabytes are the ones defined within the fc_host_statistics structure within /include/scsi/scsi_transport_fc.h.

/sys/class/fc_host/host5/statistics:

                            Current         Optional/  FC-HBAAPI
Parameter                   Value           Mandatory  Section #[1]    Notes
seconds_since_last_reset   :0x1d68            [O]      6.5.2.2
tx_frames                  :0x39ce            [O]      6.5.2.3
tx_words                   :0x0               [O]      6.5.2.5
rx_frames                  :0x6e22            [O]      6.5.2.4
rx_words                   :0x8b0848          [O]      6.5.2.6
lip_count                  :0x0               [O]      6.5.2.7
nos_count                  :0x0               [O]      6.5.2.8
error_frames               :0x0               [O]      6.5.2.9 
dumped_frames              :0x0               [O]      6.5.2.10 
link_failure_count         :0x0               [M]      6.5.2.11        [2]
loss_of_sync_count         :0x0               [M]      6.5.2.12        [2] 
loss_of_signal_count       :0x0               [M]      6.5.2.13        [2] 
prim_seq_protocol_err_count:0x0               [M]      6.5.2.14        [2] 
invalid_tx_word_count      :0x0               [M]      6.5.2.15        [2] 
invalid_crc_count          :0x0               [M]      6.5.2.16        [2] 

fcp_input_requests         :0x2cb4            [O]      6.5.2.17        [3]
fcp_output_requests        :0x0               [O]      6.5.2.18        [3]
fcp_control_requests       :0x2d              [O]      6.5.2.19        [3]
fcp_input_megabytes        :0x8               [O]      6.5.2.20        [3]
fcp_output_megabytes       :0x0               [O]      6.5.2.21        [3]

reset_statistics           :<write-only>                               [4]

fc_no_free_exch            :0xffffffffffffffff
fc_no_free_exch_xid        :0xffffffffffffffff
fc_non_bls_resp            :0xffffffffffffffff
fcp_frame_alloc_failures   :0xffffffffffffffff
fcp_packet_aborts          :0xffffffffffffffff
fcp_packet_alloc_failures  :0xffffffffffffffff
fc_seq_not_found           :0xffffffffffffffff
fc_xid_busy                :0xffffffffffffffff
fc_xid_not_found           :0xffffffffffffffff
Notes:
[1] The kernel's structure definition references T11 standard FC-HBA API v2.0, whereas the section numbers above are from a draft version of that release. So, those section numbers may be different in the released FC standard covering these parameters.

[2] These parameters are all part of the Link Error Status Block (LESB) as defined in the Fibre Channel Framing and Signaling specification (FC-FS). The LESB tracks low-level link error statistics at a port and is required by the T11 FC specifications, which is why these fields are the only mandatory ones listed in the structure. The LESB description says these counters represent "errors accumulated {that} provide a coarse measure of the integrity of the link over time". See FC-FS for more detailed information. Within FC-FS, "no means are provided" to reset the LESB counters back to zero. This seems to conflict with the FC-HBAAPI reset_statistics function, except that FC-HBAAPI defines this function as obsolete and having no effect. However, a driver can implement a reset outside of the FC-HBAAPI specification, which is what the lpfc driver does. When a statistics reset is requested, the driver saves all of the current counters from the adapter and subtracts this baseline from the statistics read during any later request, thus simulating a firmware reset of these statistics. Not all drivers do it this way; for example, qla2xxx driver versions that do not implement a reset function treat the write as a no-op.

[3] These parameters fall under the "fc4 statistics (only FCP supported currently)" comment within the Linux include file. The FC-4 layer is the protocol layer of FC (FCP-SCSI, FCP-IP, et al.). This means these counts cover the commands and data associated with these upper layer protocols, as opposed to all commands and data transfers on the link, which include other things like FC-level general services. So for an FCP-SCSI HBA, the command counts are for scsi commands and the data megabytes are for "user" data transferred to, and received from, a scsi target or device.

[4] $ echo 1 > /sys/class/fc_host/hostN/statistics/reset_statistics requests a reset of the statistic counters for the given host back to zero. The FC-HBAAPI indicates this function is obsolete, so it may disappear in the future or may be retained but not actually reset the statistics. As described in Note [2] above, each driver may or may not implement this, and the reset may or may not result in all stats being "reset".

    "7.2.11 HBA_ResetStatistics ; 7.2.11.2 Description The HBA_ResetStatistics function is obsolete. It is retained for compatibility with previous implementations of HBA API clients. It shall have no effect and return no value."
  • Since this is declared obsolete by the FC HBA API standard, an HBA's firmware may or may not implement a reset function, and that implementation can vary by HBA model even from the same vendor.
  • The following drivers define a FC reset_statistics function within their drivers:
      File                Function            Line
    0 zfcp_scsi.c                      972 .reset_fc_host_stats = zfcp_scsi_reset_fc_host_stats,
    1 bfad_attr.c                      637 .reset_fc_host_stats = bfad_im_reset_stats,
    2 bfad_attr.c                      694 .reset_fc_host_stats = bfad_im_reset_stats,
    3 fnic_main.c                      170 .reset_fc_host_stats = fnic_reset_host_stats,
    4 lpfc_attr.c                     7155 .reset_fc_host_stats = lpfc_reset_stats,
    5 lpfc_attr.c                     7224 .reset_fc_host_stats = lpfc_reset_stats,
    6 qla_attr.c                      3055 .reset_fc_host_stats = qla2x00_reset_host_stats,
    7 qla_attr.c                      3102 .reset_fc_host_stats = qla2x00_reset_host_stats,
    
  • However, a larger number of HBA drivers define the function to read the FC statistics from the HBA:
      File                Function            Line
    0 zfcp_scsi.c                      971 .get_fc_host_stats = zfcp_scsi_get_fc_host_stats,
    1 bfad_attr.c                      636 .get_fc_host_stats = bfad_im_get_stats,
    2 bfad_attr.c                      693 .get_fc_host_stats = bfad_im_get_stats,
    3 bnx2fc_fcoe.c                   2869 .get_fc_host_stats = bnx2fc_get_host_stats,
    4 bnx2fc_fcoe.c                   2909 .get_fc_host_stats = fc_get_host_stats,
    5 csio_attr.c                      745 .get_fc_host_stats = csio_get_stats,
    6 csio_attr.c                      790 .get_fc_host_stats = csio_get_stats,
    7 fcoe.c                           219 .get_fc_host_stats = fc_get_host_stats,
    8 fcoe.c                           267 .get_fc_host_stats = fc_get_host_stats,
    9 fnic_main.c                      169 .get_fc_host_stats = fnic_get_stats,
    a lpfc_attr.c                     7154 .get_fc_host_stats = lpfc_get_stats,
    b lpfc_attr.c                     7223 .get_fc_host_stats = lpfc_get_stats,
    c qedf_main.c                     2071 .get_fc_host_stats = qedf_fc_get_host_stats,
    d qedf_main.c                     2104 .get_fc_host_stats = fc_get_host_stats,
    e qla_attr.c                      3054 .get_fc_host_stats = qla2x00_get_fc_host_stats,
    f qla_attr.c                      3101 .get_fc_host_stats = qla2x00_get_fc_host_stats,
  • Check with the hardware vendor of your HBA to determine if the HBA model of interest implements the reset function.
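
Where a driver does not implement the reset, the baseline-and-subtract approach that Note [2] describes for lpfc can be imitated in user space. The following is a rough bash sketch under that assumption; fc_stats_snapshot and fc_stats_delta are hypothetical helper names, and the baseline file location is left to the caller.

```shell
#!/bin/bash
# Sketch only: both function names are hypothetical helpers.

# fc_stats_snapshot DIR FILE
# Save the current counters as "name value" pairs (decimal) in FILE.
fc_stats_snapshot() {
    local dir=$1 out=$2 f name val
    : > "$out"
    for f in "$dir"/*; do
        name=${f##*/}
        [ "$name" = reset_statistics ] && continue
        val=$(cat "$f" 2>/dev/null) || continue
        # Leave "not available" (-1) counters out of the baseline.
        [ "$val" = 0xffffffffffffffff ] && continue
        printf '%s %u\n' "$name" "$val" >> "$out"
    done
}

# fc_stats_delta DIR FILE
# Print each current counter minus the value saved in FILE.
fc_stats_delta() {
    local dir=$1 base=$2 name old cur
    while read -r name old; do
        cur=$(cat "$dir/$name" 2>/dev/null) || continue
        # bash arithmetic understands the 0x... form stored in sysfs.
        printf '%s %d\n' "$name" "$(( cur - old ))"
    done < "$base"
}
```

Taking a snapshot plays the role of the "reset", and every later fc_stats_delta call reports counts accumulated since that snapshot without touching the adapter's own counters.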

All of the following is a paraphrasing of the information within FC HBAAPI section "6.5.2 End Port Statistics". Refer to the current released FC-HBAAPI standard for complete descriptions and more information.

seconds_since_last_reset

The number of seconds since the statistics were last reset. The statistics are reset at system boot time and, where the driver implements it, when you write to the reset_statistics parameter within the /sys/class/fc_host/host*/statistics directory.

tx_frames

The total number of FC frames that have been transmitted by the HBA.
This count is across all protocols and classes, which includes general services etc. as well as frames associated with scsi traffic.

rx_frames

The total number of FC frames that have been received by the HBA.
This count is across all protocols and classes, which includes general services etc. as well as frames associated with scsi traffic.

tx_words

The total number of transmitted words by the HBA. (A word is a string of 4 bytes)
This count is across all protocols and classes, which includes general services etc. as well as frames associated with scsi traffic. Also this counts all transmitted words, frame headers, crc, etc. -- not just "user" data bytes.

rx_words

The total number of words received by the HBA. (A word is a string of 4 bytes)
This count is across all protocols and classes, which includes general services etc. as well as frames associated with scsi traffic. Also this counts all received words, frame headers, crc, etc. -- not just "user" data bytes.
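
Since a word is 4 bytes, a word counter converts to bytes by multiplying by 4. A minimal bash sketch, where fc_words_to_bytes is a hypothetical helper name:

```shell
#!/bin/bash
# Sketch only: fc_words_to_bytes is a hypothetical helper name.
# Convert an FC word count (hex or decimal) to bytes: 1 word = 4 bytes.
fc_words_to_bytes() {
    printf '%u\n' "$(( $1 * 4 ))"
}
```

For the example rx_words value 0x8b0848 shown earlier, fc_words_to_bytes 0x8b0848 gives 36446496 bytes. As noted above, this total includes frame headers, CRC and so on, not just "user" data.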

lip_count

The number of LIP (loop initialization FC primitive) resets that have occurred.
This should be the number initiated by the HBA unless there are other ports on the link the HBA is connected to that can also issue the LIP primitive. One way of sending a LIP is to:

$ echo 1 > /sys/class/fc_host/hostN/issue_lip
A LIP will cause temporary loss of link (link down/link up events).

nos_count

The number of NOS (not operational) FC primitives that have occurred on the switched fabric.
This would typically be received by the HBA during link initialization between the HBA port and the switch if the switch detected a problem; NOS is typically sent by a port that is offline or has detected a link problem or failure of some type. A non-zero value implies problems at the link level or with the switch port the HBA is connected to.

error_frames

The number of FC frames that have been received "in error".
The quoted "in error" is what the specification says; my interpretation is that these frames were not meant for the HBA's port(?).

dumped_frames

The count of FC frames that were lost due to lack of local resources (buffers).
A frame arrives at the HBA N_Port, but there is no place to capture it due to lack of available buffers within the adapter. The frame is "dumped", i.e. dropped, and the firmware never sees it. One guess as to why dumped frames could occur is that something isn't working with buffer credits between ports at a lower FC link level.


link_failure_count (LESB)

This count is the value of LINK FAILURE COUNT field of the Link Error Status Block (LESB, FC-FS).
This counter reflects miscellaneous link errors. See FC-FS for more information.

loss_of_sync_count (LESB)

This count is the value of LOSS-OF-SYNCHRONIZATION COUNT field of the Link Error Status Block (LESB, FC-FS).
This counter reflects the count of "confirmed and persistent synchronization losses" on the link, a specific type of link failure where the port can no longer tell the start/stop points needed to correctly form/receive words off of the link. See FC-FS for more information.

loss_of_signal_count (LESB)

This count is the value of the LOSS-OF-SIGNAL COUNT field of the Link Error Status Block (LESB, FC-FS).
This counter reflects the number of loss of signal occurrences detected by the HBA, a specific type of link failure.

prim_seq_protocol_err_count (LESB)

This count is the value of the PRIMITIVE SEQUENCE PROTOCOL ERROR field of the Link Error Status Block (LESB, FC-FS).
This counter reflects the number of times an FC primitive ("command") sequence was received that was invalid for the current port state. This is considered a type of link level failure.

invalid_tx_word_count (LESB)

This count is the value of the INVALID TRANSMISSION WORD field of the Link Error Status Block (LESB, FC-FS).
There are several things that can cause this count to get incremented including invalid 8bit/10bit transmission code values. See the FC-FS for the full list, but these are "bad data combination" detections in general, i.e. link level issues which prevent transfer of commands and information.

invalid_crc_count (LESB)

This count is the value of the INVALID CRC COUNT field of the Link Error Status Block (LESB, FC-FS).
The CRC calculated for the frame as received doesn't match the CRC field within the frame.
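
Because the six LESB counters above give a coarse measure of link integrity, a quick check for non-zero values can be useful. The bash function below is a sketch of such a check; fc_lesb_check is a hypothetical helper name, and a non-zero counter is only a hint to investigate further, not a diagnosis.

```shell
#!/bin/bash
# Sketch only: fc_lesb_check is a hypothetical helper name.
# Usage: fc_lesb_check /sys/class/fc_host/hostN/statistics
# Prints each non-zero LESB counter; returns 1 if any were non-zero.
fc_lesb_check() {
    local dir=$1 name val bad=0
    for name in link_failure_count loss_of_sync_count loss_of_signal_count \
                prim_seq_protocol_err_count invalid_tx_word_count \
                invalid_crc_count; do
        val=$(cat "$dir/$name" 2>/dev/null) || continue
        # -1 means the counter is not provided; nothing to report.
        [ "$val" = 0xffffffffffffffff ] && continue
        if [ "$(( val ))" -ne 0 ]; then
            printf '%s %u\n' "$name" "$val"
            bad=1
        fi
    done
    return "$bad"
}
```

These counters are cumulative since the last reset, so comparing two readings taken some time apart shows whether errors are still occurring.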


fcp_input_requests (FC-4, e.g. FCP-SCSI)

The input_requests count tracks the number of scsi commands that cause data to be transferred from scsi target device to scsi initiator (host). So this would count scsi READ commands, INQUIRY commands, etc. -- any scsi command that sends data back to the host. Some scsi commands could cause both data to be sent to the scsi target, outside of the scsi command itself, as well as causing data to be returned to the host. For these scsi commands both input_requests and output_requests would be bumped by one. But honestly, I can't think of a scsi command of this type at the moment.

fcp_output_requests (FC-4, e.g. FCP-SCSI)

The output_requests count tracks the number of scsi commands that cause data to be transferred from scsi initiator (host) to scsi target device. So this counts scsi WRITE commands, MODE SELECT, etc. -- any scsi command that sends data, outside of the scsi command itself, to the scsi target. Some scsi commands could cause both data to be sent to the scsi target as well as causing data to be returned to the host. For these scsi commands both input_requests and output_requests could be bumped by one. But honestly, I can't think of a scsi command of this type at the moment.

fcp_control_requests (FC-4, e.g. FCP-SCSI)

The control_request count tracks the number of scsi commands that cause no data to be transferred between host and target. So this counts scsi TEST UNIT READY (TUR) and similar commands that only return standard scsi status and sense information but that involve no data transfers.

fcp_input_megabytes (FC-4, e.g. FCP-SCSI)

The number of megabytes received by the host for scsi commands.
The FC-HBAAPI specifies a megabyte as 1,000,000 bytes. However, implementations seem to vary. The qla2xxx driver converts the raw byte count returned by the firmware into megabytes via:
pfc_host_stat->fcp_input_megabytes = vha->qla_stats.input_bytes >> 20;
That is, in this instance megabytes = 2^20 and not 10^6.
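
To make the difference between the two definitions concrete, the bash sketch below converts a raw byte count both ways. fc_bytes_to_mb is a hypothetical helper name, and the byte count is supplied by the caller since sysfs exposes only the already-converted megabyte value.

```shell
#!/bin/bash
# Sketch only: fc_bytes_to_mb is a hypothetical helper name.
# Show a byte count as megabytes using 2^20 (the ">> 20" conversion)
# and using 10^6 (the FC-HBAAPI definition).
fc_bytes_to_mb() {
    local bytes=$1
    printf 'binary=%u decimal=%u\n' \
        "$(( bytes >> 20 ))" "$(( bytes / 1000000 ))"
}
```

For example, fc_bytes_to_mb 10000000 prints "binary=9 decimal=10", so the same traffic reads about 5% lower under the 2^20 conversion.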

Having this field in megabytes only allows a rough view of data transfers per time interval. For example, on a lightly loaded system it could take 20-30s or more before a full additional megabyte was transferred. The intent of this counter seems to be to provide a rough guide as to what is happening in terms of data transfer on the link. The counter only becomes reasonably useful when the link is under significant and heavy load, or for determining long term balancing across multiple links/ports.

fcp_output_megabytes (FC-4, e.g. FCP-SCSI)

The number of megabytes sent by the host for scsi commands.
The FC-HBAAPI specifies a megabyte as 1,000,000 bytes. However, implementations seem to vary. The qla2xxx driver converts the raw byte count returned by the firmware into megabytes via:
pfc_host_stat->fcp_output_megabytes = vha->qla_stats.output_bytes >> 20;
That is, in this instance megabytes = 2^20 and not 10^6.

Having this field in megabytes only allows a rough view of data transfers per time interval. For example, on a lightly loaded system it could take 20-30s or more before a full additional megabyte was transferred. The intent of this counter seems to be to provide a rough guide as to what is happening in terms of data transfer on the link. The counter only becomes reasonably useful when the link is under significant and heavy load, or for determining long term balancing across multiple links/ports.