A few notes first: the case looks large, but is mostly straightforward.
This project has Patch/Micro release binding. Nemo has not yet shipped,
and thus the stability and binding of the administrative interfaces
(Evolving) isn't affected. This case adds new functionality, and does
not take away existing functionality. Existing third-party software
need not be modified, but new features are possible if it is. A summary
of the changes is at the bottom.

Overview
========

This case proposes a small set of changes to Nemo (PSARC/2004/571) to
provide a seamless customer transition to the administrative
enhancements proposed by Clearview (see http://clearview.east), which
itself is targeting Nevada. These changes also simplify the
administrative model and enhance observability for current Nemo
devices.

To implement those administrative changes, this case also proposes
several changes to the Nemo MAC Client interfaces, which are currently
Contracted Consolidation Private. However, the Nemo team indicates this
was due to a section numbering oversight in their materials. Instead,
only the interfaces in sections 3 and 4.1 of the Nemo commitment
materials are intended to be Contracted Consolidation Private, and the
MAC Client interfaces (section 4.2) are intended to be Project Private.
Indeed, none of the current Nemo contracts makes use of the MAC Client
interfaces. As such, this case asserts that MAC Client interfaces are
Project Private.

Since the proposed dladm changes are not backwards-compatible with the
current Nemo behavior, which itself is "Evolving", we are requesting
patch binding, so that we can integrate into the same update release as
Nemo itself -- S10U2. Thus, to the customer, the change to these
evolving interfaces will not be visible.

This case has been written in cooperation with the Nemo I-Team.

Details
=======

The changes are in four basic areas:

  1. Changes to the dladm command.
  2. Changes to the MAC Client Support routines for multiple-open MAC.
  3. Changes to DLPI to support multiple-open MAC.
  4. Changes to libdlpi to support multiple-open MAC.

Each of these areas is detailed below.

1. Changes to the dladm command
-------------------------------

In some situations Solaris currently requires the administrator to
explicitly manage the /dev links for Nemo network devices through the
dladm create-link and delete-link subcommands. Specifically:

  * When creating an aggregation, any /dev links associated with
    devices that make up the aggregation must be removed with
    delete-link before the aggregation can be created.
  * Further, after creating the aggregation, its /dev link must be
    manually created by the administrator via dladm create-link.
  * Similarly, before deleting the aggregation, its /dev link must be
    manually deleted with delete-link, and before the underlying device
    can be used again, its link must be recreated using create-link.
  * In order to modunload a MAC device, the links associated with the
    device must be explicitly removed. This is because the act of
    creating the link creates a reference to the underlying MAC,
    holding it open even though it does not yet have any consumers.

This management is annoying and error-prone, and differs from the
behavior of non-Nemo devices, which always have /dev links so long as
the hardware is present in the system. As such, we propose to restore
"classic" behavior by having the system automatically manage the links
in all situations. That is:

  * The system no longer requires (or allows) a delete-link prior to
    adding a device to an aggregation. However, if the device is in use
    (see (2)), the dladm command will fail with an informative error
    message.
  * The system automatically creates or deletes the /dev links for an
    aggregation when the aggregation is created or deleted. Note that,
    as before, the name of the /dev link is based on the aggregation
    key: e.g., an aggregation with key 42 is /dev/aggr42.
  * To permit module unloading, the creation of /dev links no longer
    holds a reference to the underlying MAC. Instead, the reference is
    held when the MAC is actually opened, e.g., by an application
    opening /dev/bge0. As with non-Nemo devices, opening the /dev link
    associated with a module that has been unloaded will cause it to be
    automatically reloaded.

Also, we propose to remove the undocumented dladm "down-link"
subcommand. This subcommand is currently used when stopping the
network/datalink service to remove any configured /dev links. This
subcommand has never been required for correctness, and further does
not match the classic model, which ensures that the /dev link exists as
long as the hardware is present in the system, regardless of what
services are running.

All that said, there is one remaining use of create-link and
delete-link: when combined with the -v option, they enable VLANs to be
created and destroyed. As such, to match their new purpose, we propose
to rename "create-link" to "create-vlan" and "delete-link" to
"delete-vlan".

In summary, here are the proposed changes to dladm:

    Current           | Proposed          | Purpose
    ------------------|-------------------|----------------------------------
    create-link -k    | [removed]         | Creates an aggregation /dev link
    create-link -d    | [removed]         | Creates a device /dev link
    create-link -d -v | create-vlan -d -v | Creates a VLAN for a device
    delete-link       | [removed]         | Deletes an aggregation /dev link
    delete-link       | [removed]         | Deletes a device /dev link
    delete-link       | delete-vlan -d -v | Deletes a VLAN for a device
    down-link         | [removed]         | Deletes all configured /dev links

2. Changes to the MAC Client Support routines for multiple-open MAC
-------------------------------------------------------------------

One implication of the above changes is that it's now possible for a
MAC to be opened more than once.
Specifically, when a device is part of an aggregation, the device's MAC
is opened both by the aggregation driver and also by the DLS layer when
an application opens the device's /dev link. In fact, avoiding this
situation is precisely why a delete-link was previously required prior
to adding a device to an aggregation. However, there is nothing
inherently wrong with this situation, and indeed allowing it is
essential to allowing tools such as snoop(1M) to observe the traffic on
each device that comprises an aggregation, and also to observe the
aggregation's own LACP traffic.

That said, a mechanism is needed to ensure that there is only one
"active" user of a MAC at a time. An "active" user of a MAC is a
consumer who either configures the MAC (e.g., by setting its hardware
address), or who ties other network stack state to the MAC
configuration (e.g., IP). To support this, we propose two new
functions:

    boolean_t mac_active_set(mac_handle_t);
    void mac_active_clear(mac_handle_t);

The first attempts to upgrade the MAC access to be active, and returns
B_TRUE if it's successful. The second downgrades a previously-active
MAC. The aggregation driver will immediately upgrade its MACs to
active, and will not downgrade them until it is done using them. The
DLS layer will upgrade and downgrade its MACs as required by its DLPI
consumers, as discussed in (3) below.

Now that MACs can be opened more than once, several other functions
need to be generalized. First, mac_rx_set(), which sets the function to
pass received packets to, needs to be enhanced to allow multiple
receive functions. We propose to replace mac_rx_set() with two
functions:

    mac_rx_handle_t mac_rx_add(mac_handle_t, mac_rx_t, void *);
    void mac_rx_remove(mac_handle_t, mac_rx_handle_t);

The first is identical to the old mac_rx_set() function, except that it
does not remove any existing receive functions, and that it returns a
handle that refers to the registered receive function.
The second removes a previously-registered receive function. Uses of
mac_rx_set() will be replaced with mac_rx_add() and mac_rx_remove(), as
appropriate.

The other set of functions that need to be generalized concern handling
of looped-back traffic when operating in promiscuous mode.
Specifically, promiscuous-mode loopback is currently implemented
strictly in the DLS layer, rather than in the MAC layer. Since the
aggregation driver directly uses the MAC layer, this makes packets sent
by it inaccessible to DLS consumers (such as snoop(1M)) that are also
using that MAC. To move loopback functionality to the MAC layer, we
propose two new MAC functions:

    typedef void (*mac_txloop_t)(void *, mblk_t *);

    mac_txloop_handle_t mac_txloop_add(mac_handle_t, mac_txloop_t, void *);
    void mac_txloop_remove(mac_handle_t, mac_txloop_handle_t);

These functions exactly mirror the receive functions defined above,
except that registered functions will be called back when a packet is
transmitted to a MAC that is in promiscuous mode. This allows the DLS
layer to register functions that will be invoked whenever either it or
the aggregation driver sends a packet to the underlying MAC.

Finally, a new MAC transmit function to perform the loopback is
proposed:

    mblk_t *mac_txloop(void *, mblk_t *);

This function will call all the registered loopback functions in an
unspecified order. As with the old DLS-level loopback implementation,
for performance reasons this function is only used when there are
registered loopback listeners. Otherwise, the MAC client directly
invokes the transmit function of the underlying MAC.

3. Changes to DLPI to support multiple-open MAC
-----------------------------------------------

As mentioned in (2), it's important to ensure that there's at most one
active MAC consumer at a time -- while having multiple active MAC
consumers will not result in a panic, it will cause
difficult-to-diagnose system misconfigurations.
Unfortunately, DLPI consumers run the gamut from benign packet sniffers
to IP itself. As such, we need a mechanism for DLPI consumers to
indicate whether or not they will configure the MAC or configure other
network stack state based on the MAC configuration. We propose the
following changes:

  * By default, all DLPI consumers are assumed to be passive.

  * Unless it has successfully issued a DL_PASSIVE_REQ primitive
    (discussed below), if a DLPI consumer successfully uses one of the
    following primitives, it is deemed to be active:

        DL_BIND_REQ
        DL_ENABMULTI_REQ
        DL_PROMISCON_REQ
        DL_AGGR_REQ
        DL_UNAGGR_REQ
        DL_CONTROL_REQ
        DL_SET_PHYS_ADDR_REQ

    While processing one of these primitives, dld will notify dls of
    the new active status of the dld client, and dls will in turn
    notify the MAC by calling mac_active_set() as described above. If
    mac_active_set() fails because there is a link aggregation using
    the MAC, the processing of the DLPI message will result in an
    error.

  * A new DL_PASSIVE_REQ primitive is created. It requests the DLS
    provider to let the user remain passive even after issuing one of
    the active DLPI primitives mentioned above. This allows special
    applications such as snoop(1M) to issue all DLPI messages without
    becoming active. This message is valid in any DLPI state less than
    or equal to DL_UNBOUND.

We propose to update snoop(1M) to use the DL_PASSIVE_REQ DLPI message
to allow it to observe packets on individual links that are part of a
link aggregation.

Please note that none of these proposed changes constitutes a
regression: currently, no DLPI application can open a device in an
aggregation at all. After this change, unmodified applications can open
such devices and issue DLPI messages not mentioned in the active list
above, allowing them to attach to device instances and obtain
information about the device using DL_INFO_REQ or DL_PHYS_ADDR_REQ
without binding (for example).
Modified snoop-like utilities have the added benefit of being allowed
to interact with such devices using the full set of DLPI messages.
Further, as before, unmodified DLPI applications can continue to open a
device that is not in an aggregation, or the aggregation itself.
However, we will also work to update other non-ON DLPI consumers that
we know to be safe, such as libpcap and packet shell.

4. Changes to libdlpi to support DLPI changes
---------------------------------------------

PSARC/2003/375 introduced a convenience library for DLPI applications,
libdlpi. This library was subsequently extended by Nemo as part of
PSARC/2004/571. All the functions in this library are currently
Consolidation Private.

In order to allow ON applications that use libdlpi the convenience of
using the new DL_PASSIVE_REQ message, we propose adding the following
dlpi_passive() function to this library:

    int dlpi_passive(int, int);

The snoop(1M) command will be modified to link with libdlpi and use
this new function. No other ON commands need to use this function in
order to continue to function properly.

Summary of Proposed Changes
===========================

Changes to dladm:

    Classification: EVOLVING

    Current           | Proposed          | Purpose
    ------------------|-------------------|----------------------------------
    create-link -k    | [removed]         | Creates an aggregation /dev link
    create-link -d    | [removed]         | Creates a device /dev link
    create-link -d -v | create-vlan -d -v | Creates a VLAN for a device
    delete-link       | [removed]         | Deletes an aggregation /dev link
    delete-link       | [removed]         | Deletes a device /dev link
    delete-link       | delete-vlan -d -v | Deletes a VLAN for a device
    down-link         | [removed]         | Deletes all configured /dev links

Changes to MAC client interfaces:

    Classification: was CONTRACTED CONSOLIDATION PRIVATE, but by
                    accident; now PROJECT PRIVATE.
    Removed:        mac_rx_set()
    Added:          mac_active_set(), mac_active_clear()
                    mac_rx_add(), mac_rx_remove()
                    mac_txloop_t, mac_txloop_add(), mac_txloop_remove(),
                    mac_txloop()

Changes to DLPI:

    Classification: EVOLVING
    Added:          DL_PASSIVE_REQ primitive

Changes to libdlpi:

    Classification: CONSOLIDATION PRIVATE
    Added:          dlpi_passive()
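
Illustrative sketch (not part of the proposed interfaces)
=========================================================

To make the multiple-open semantics of section 2 concrete, the
following is a minimal user-space model, not kernel code. It assumes a
MAC tracks a single "active" flag and a small table of receive
callbacks. Every name in it (toy_mac_t, toy_active_set(), toy_rx_add(),
and so on) is hypothetical and exists only for this illustration; the
real mac_active_set() and mac_rx_add() operate on opaque kernel handles
whose implementation this case does not describe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_RX 4    /* arbitrary table size for this sketch */

/* Hypothetical stand-in for mac_rx_t: called once per received packet. */
typedef void (*toy_rx_fn_t)(void *arg, const char *pkt);

/* One registered receive callback; its address doubles as the handle. */
typedef struct toy_rx_slot {
    toy_rx_fn_t fn;
    void *arg;
} toy_rx_slot_t;

/* Hypothetical stand-in for a MAC that may be opened more than once. */
typedef struct toy_mac {
    bool active;                    /* true while some consumer is active */
    toy_rx_slot_t rx[TOY_MAX_RX];   /* registered receive callbacks */
} toy_mac_t;

/* Models mac_active_set(): succeeds only if no consumer is active yet. */
bool toy_active_set(toy_mac_t *m)
{
    if (m->active)
        return false;
    m->active = true;
    return true;
}

/* Models mac_active_clear(): downgrades the MAC back to passive-only. */
void toy_active_clear(toy_mac_t *m)
{
    m->active = false;
}

/* Models mac_rx_add(): adds a callback without disturbing existing ones.
 * Returns a handle, or NULL if the (arbitrary) table is full. */
toy_rx_slot_t *toy_rx_add(toy_mac_t *m, toy_rx_fn_t fn, void *arg)
{
    for (size_t i = 0; i < TOY_MAX_RX; i++) {
        if (m->rx[i].fn == NULL) {
            m->rx[i].fn = fn;
            m->rx[i].arg = arg;
            return &m->rx[i];
        }
    }
    return NULL;
}

/* Models mac_rx_remove(): unregisters one previously added callback. */
void toy_rx_remove(toy_mac_t *m, toy_rx_slot_t *h)
{
    (void)m;
    assert(h != NULL);
    h->fn = NULL;
    h->arg = NULL;
}

/* Hands one packet to every registered callback; returns how many ran. */
int toy_rx_deliver(toy_mac_t *m, const char *pkt)
{
    int called = 0;
    for (size_t i = 0; i < TOY_MAX_RX; i++) {
        if (m->rx[i].fn != NULL) {
            m->rx[i].fn(m->rx[i].arg, pkt);
            called++;
        }
    }
    return called;
}

/* Example callback: increments the int counter passed as its argument. */
void toy_count_rx(void *arg, const char *pkt)
{
    (void)pkt;
    (*(int *)arg)++;
}
```

In this model, an aggregation-like consumer would call toy_active_set()
once and register a callback, while a snoop-like consumer would only
register a callback and never request active status. A second
toy_active_set() call fails until toy_active_clear() is called, which
mirrors the "at most one active user" rule; both callbacks still see
every delivered packet.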