Motivation:
------------

dladm(1M) has traditionally been used for administering data links.
dladm also includes options to monitor dynamic network traffic passing
through a link. Integration of Crossbow (/PSARC/2006/357) in snv_105
affected dladm in several ways:

- In order to administer Crossbow-introduced data links such as VNICs
  and etherstubs, new subcommands were introduced, viz.
  {create|delete|show}-{vnic|etherstub}.
- Since dladm show-link reported dynamic network traffic statistics
  with the -s suboption, a similar suboption was added to
  show-{vnic|etherstub}.
- Furthermore, dladm show-link -s itself was extended to support the
  newly introduced data links.
- A new subcommand, show-usage, was introduced to report historical
  network traffic usage information.

We would like to point out the following aspects of the existing
design:

- Querying dynamic traffic statistics is significantly different from
  the configuration capability provided by dladm. Thus, segregating
  dynamic network traffic statistics from the link configuration
  interface will result in a cleaner design and an easier-to-use
  interface. We therefore propose dlstat(1M) for querying data link
  traffic statistics.
- Consider a system with two VNICs, say vnic1 and vnic2, carved out of
  a physical NIC that is connected to the external world. Let us
  further assume that each of these VNICs has a couple of dedicated Rx
  rings associated with it. Traffic flowing into vnic1 might arrive
  from the external world via one of the hardware Rx rings of the
  physical NIC, or it could originate from vnic2 and be switched by the
  virtual switch on the same host. Thus, not all traffic destined for
  vnic1 is seen by its hardware Rx rings. A system administrator,
  network performance analyst, or programmer needs a mechanism to
  distinguish between these types of traffic. Moreover, a device driver
  writer can benefit from knowing how many packets from a particular Rx
  ring were picked up by MAC layer polling vs. how many packets raised
  an interrupt. The current dladm show-link -s does not provide
  statistics at such fine granularity. Thus, we need a mechanism to
  provide per-hardware-lane and per-software-lane statistics for poll
  count, interrupt count, chain lengths, and packet/byte counts. We
  believe that this functionality is substantial enough to warrant a
  split into a new command rather than overloading dladm.

Crossbow also introduced a flow abstraction whereby bandwidth limits,
CPUs, and priorities can be associated with a subset of the network
traffic received and sent through a NIC, link aggregation, or VNIC.
Flows can be defined on the basis of IP addresses, well-known port
numbers, protocol types, etc. flowadm(1M), modeled on the lines of
dladm, was introduced to administer flows. Similar to the -s suboption
of dladm show-{link|vnic|etherstub}, flowadm supports show-flow -s to
print dynamic network traffic statistics.

The preceding paragraphs outlined the motivation for splitting dlstat
from dladm. In order to maintain symmetry in the design, we also
propose to split flowadm into flowadm and flowstat. The current
Crossbow implementation (phase I, as integrated in snv_105) does not
leverage the layer 3 hardware classification capability of a NIC. Thus,
one cannot yet assign dedicated hardware ring(s) to a flow. However, we
plan to introduce hardware ring support for flows in the future.
Per-hardware-lane querying as supported by flowstat would become
effective then.

In summary, with Crossbow's integration, a rich set of new counters and
interfaces needs to be introduced to gain better visibility into
network traffic, which will assist debugging and performance tuning.
dlstat and flowstat will address this need.

Proposed changes:
------------------

- In order to separate traffic monitoring functionality from dladm, the
  -s suboption of show-{link|vnic|etherstub} will be replaced by dlstat
  show-link. Similarly, flowadm show-flow -s will be replaced by
  flowstat.
- The interface for reporting historical usage information will move
  out of dladm/flowadm as well. dladm show-usage becomes dlstat
  show-link -h. Similarly, flowadm show-usage becomes flowstat
  show-history.
- Crossbow can assign dedicated hardware resources (Rx/Tx rings) to
  VNICs to form hardware lanes. dlstat will maintain counters to
  monitor packets/bytes flowing in and out of each hardware lane.
- When a continuous stream of packets is fired at a host, Crossbow can
  switch the ring receiving that stream from interrupt mode to polling
  mode independently of the other rings on the NIC. It can then pick up
  long chains of packets via polling, significantly boosting
  performance. dlstat will display the interrupt count, the poll count,
  and the average length of the packet chains polled.
- On a multiprocessor system, the incoming load (packets delivered via
  the polling/interrupt path) can be further distributed across
  multiple CPUs for parallel processing (software fanout). dlstat will
  maintain and report per-software-fanout statistics.
- On occasion, one might be interested not in the actual counts but in
  patterns or percentages. For example, for a steady influx of data,
  one would want to know whether a high percentage of packets is indeed
  processed via polling, rather than the actual count. Several other
  parameters, such as minimum/maximum/average queue length or packet
  size, give a high-level view of the system's behavior from a network
  perspective. An aggregate suboption in dlstat and flowstat will
  provide this feature.
- One might often want to reset the statistics counters. Currently, the
  only way to achieve that is via unplumb-replumb (or module
  unload-reload if stats are being read directly from the device).
  dlstat provides a way to reset statistics without such disruption.
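
The aggregate figures mentioned in the list above (the percentage of
packets delivered via polling, and the average length of polled packet
chains) are simple ratios over raw counters. The following Python
sketch shows the arithmetic only. The LaneCounters type, its field
names, and the sample values are hypothetical and chosen for
illustration; they are not the counters or output format that dlstat
actually uses.

```python
from dataclasses import dataclass


@dataclass
class LaneCounters:
    """Hypothetical per-lane counters; names are illustrative only."""
    intr_pkts: int   # packets delivered via the interrupt path
    poll_pkts: int   # packets delivered via the polling path
    poll_count: int  # number of poll operations performed


def poll_percentage(c: LaneCounters) -> float:
    """Percentage of all received packets that arrived via polling."""
    total = c.intr_pkts + c.poll_pkts
    return 100.0 * c.poll_pkts / total if total else 0.0


def avg_chain_length(c: LaneCounters) -> float:
    """Average number of packets picked up per poll operation."""
    return c.poll_pkts / c.poll_count if c.poll_count else 0.0


# Made-up sample values for a single Rx lane.
lane = LaneCounters(intr_pkts=500, poll_pkts=9500, poll_count=950)
print(f"polled: {poll_percentage(lane):.1f}%")
print(f"average chain length: {avg_chain_length(lane):.1f} packets")
```

Reporting ratios rather than raw counts makes lanes with very different
traffic volumes directly comparable.
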
Examples of Use
------------------

In the preceding sections we touched on several dlstat/flowstat
features that would be useful for a system administrator, network
performance analyst, kernel programmer, or device driver writer. Let us
consider concrete examples to elaborate:

I. From a performance tuning perspective

Let us consider a system under network load. We are poking around to
see if we can improve its network performance. dlstat show-link -r
lists per-hardware-lane packet/byte counts as well as poll vs.
interrupt counts. Let us say we find those counts satisfactory (say,
traffic is nicely distributed across hardware Rx rings and more than
95% of packets are delivered via polling). We turn our attention to the
software fanout. dlstat show-link -r -F gives us a breakdown of
per-software-fanout statistics, while mpstat shows us the % utilization
of the software fanout CPUs. If we find that a large number of packets
are being delivered while the software fanout is small, so that the
corresponding CPUs are fully consumed almost all the time, we can
choose to bump up the software fanout. Currently, this can be achieved
by associating a list of CPUs with a link:

    dladm set-linkprop -p cpus=<...>

II. From a programmer's perspective

While trying to debug a performance issue, it is useful to know how
many hardware rings are assigned to the link under investigation.
dlstat show-phys -r and dlstat show-phys -t provide this information
for the Rx and Tx sides respectively. Moreover, knowing whether each of
the assigned rings is indeed contributing (sending/receiving traffic)
can provide a further pointer to what might be missing. A similar
argument applies to software fanout as well.
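
As a rough illustration of the check described in example II, the
Python sketch below compares two snapshots of per-ring packet counters
and reports which rings carried no traffic in the interval, along with
each ring's share of the total. The ring ids, the counter values, and
the dictionary layout are assumptions made for this example; they do
not reflect how dlstat stores or prints its statistics.

```python
def ring_deltas(before: dict, after: dict) -> dict:
    """Packets seen on each ring between two counter snapshots."""
    return {ring: after[ring] - before[ring] for ring in before}


def idle_rings(deltas: dict) -> list:
    """Rings that carried no packets during the interval."""
    return sorted(ring for ring, d in deltas.items() if d == 0)


def traffic_share(deltas: dict) -> dict:
    """Each ring's percentage of the packets seen in the interval."""
    total = sum(deltas.values())
    return {ring: (100.0 * d / total if total else 0.0)
            for ring, d in deltas.items()}


# Hypothetical Rx packet counters for four rings, sampled twice.
before = {0: 1000, 1: 2000, 2: 500, 3: 0}
after = {0: 4000, 1: 5000, 2: 500, 3: 4000}

deltas = ring_deltas(before, after)
print("idle rings:", idle_rings(deltas))
for ring, share in sorted(traffic_share(deltas).items()):
    print(f"ring {ring}: {share:.1f}% of packets")
```

In this made-up sample, ring 2 is assigned but idle, which is the kind
of pointer the example above refers to. The same comparison could be
applied to per-fanout counters.
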