Nemo Changes for Binary Compatibility
=====================================

release binding: patch

1 Introduction
==============

This fast-track case is part of the Clearview umbrella case
(PSARC/2005/132). It modifies and makes additions to the interfaces
defined in PSARC/2004/571 (Nemo - a.k.a. GLD v3). The justification for
this work and its relationship to the rest of the Clearview project is
discussed in that case.

This case has two goals.

1. Address the problem that the GLDv3 framework cannot be extended in a
   binary compatible way due to the design of its MAC driver interfaces.

2. While modifying the MAC driver interfaces, add support to those
   interfaces for the MAC-Type plugin architecture described in
   PSARC/2006/248. That case requires that drivers specify which MAC-Type
   plugin they wish to use upon registering, and register some plugin
   specific data. This case specifies how this will be done using
   mac_register(). This case and PSARC/2006/248 are therefore
   interdependent, and cannot integrate separately.

The GLDv3 framework requires the size of internal data-structures (mac_t
and mac_info_t) to be hard-coded within each driver. This will make it
nearly impossible to extend the capabilities of the framework in the
future without breaking binary compatibility with drivers written to the
framework. This problem is outlined in bugids 6242059 ("nemo drivers must
not know the size of the mac_t structure") and 6226635 ("MAC stats
interface could cause problems with binary compatibility").

This is addressed by changing the way that drivers register and interact
with the GLDv3 mac module. Note that these changes are not backward
compatible, but because the GLDv3 interfaces are consolidation private,
this type of change is acceptable. This work is one step toward making it
possible for the interfaces to become public in the future. This case
does not upgrade the stability of these interfaces.

Note that throughout this case, the terminology "previous model" refers
to the GLDv3 implementation before the implementation of this case, as
defined in PSARC/2004/571.

2 Changes to mac_register()
===========================

The cornerstone of the GLDv3 MAC driver interface is the mac_register()
function, which is used by GLDv3 drivers to register MAC devices. In
order to create a binary compatible interface, driver registration will
be modified such that drivers will use the following four functions:

    mac_register_t *mac_alloc(uint_t mac_version);
    int mac_register(mac_register_t *, mac_handle_t *);
    void mac_free(mac_register_t *);
    int mac_unregister(mac_handle_t);

In order to register, drivers will first allocate a mac_register_t
structure using mac_alloc(). The sole argument to mac_alloc() _must_ be
MAC_VERSION (defined in ). Passing in MAC_VERSION allows mac_alloc() to
verify if the driver was compiled against a version of the GLDv3
framework which is compatible with what is running on the system. In the
case of an incompatible version, mac_alloc() fails and returns NULL. If
the version is compatible, it will allocate a mac_register_t structure
and return a pointer to it. The m_version field of the returned structure
will have been automatically set to the requested version by mac_alloc().
The mac_alloc() function does a sleeping memory allocation, and therefore
is not safe to call from contexts that are sensitive to blocking.

Drivers will then fill in the rest of the mac_register_t structure with
the required information and pass it into mac_register(), which will
return an opaque MAC handle for use in all other driver interface
functions. Possible return values for mac_register() include:

    0       On success.
    EINVAL  The mac_register_t contains an invalid field. This can
            include a MAC-Type plugin that couldn't be found or loaded,
            missing callbacks, or a missing source address.
    EEXIST  The device being registered was already registered.

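The version check and framework-side allocation described above can be
sketched in user space as follows. This is a minimal sketch under stated
assumptions, not the mac module's code: the sketch_ names, the trimmed
structure layout, and the version value are hypothetical, and calloc()
stands in for the kernel's sleeping allocation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical version constant; the real MAC_VERSION value is not given. */
#define SKETCH_MAC_VERSION 1u

/* Hypothetical, trimmed stand-in for mac_register_t. */
typedef struct sketch_register {
    unsigned int m_version;     /* filled in by the allocator */
    const char  *m_type_ident;  /* filled in by the driver */
    void        *m_driver;      /* filled in by the driver */
    /* A later framework version could append optional fields here. */
} sketch_register_t;

/*
 * Stand-in for mac_alloc(): reject a version the framework does not
 * understand, otherwise return a zeroed structure with m_version set.
 * The size is chosen here, in the framework, never in the driver.
 */
static sketch_register_t *
sketch_mac_alloc(unsigned int version)
{
    sketch_register_t *mregp;

    if (version != SKETCH_MAC_VERSION)
        return (NULL);

    /* Unlike a sleeping kernel allocation, calloc() can return NULL. */
    mregp = calloc(1, sizeof (*mregp));
    if (mregp != NULL)
        mregp->m_version = version;
    return (mregp);
}

/* Stand-in for mac_free(). */
static void
sketch_mac_free(sketch_register_t *mregp)
{
    free(mregp);
}
```

Because the structure is zeroed before it is returned, any field the
driver was not compiled to know about stays at 0 or NULL, which is what
lets optional fields be added later.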
The registration structure can be freed after registration using the
mac_free() function. When the device detaches and needs to unregister, it
passes this handle into mac_unregister(). The mac_unregister() function
may fail and will have the following possible return values:

    0       On success.
    EBUSY   The MAC is in use by a data-link and cannot be unregistered.

The benefit of this model is that drivers are shielded from knowing the
real size of the mac_register_t structure. The GLDv3 framework can grow
to include more optional information in mac_register_t (thus returning a
larger structure in mac_alloc()), and drivers will only fill in the parts
of the structure that they were compiled to know about, thus preserving
binary compatibility.

2.1 mac_register_t
------------------

The structure of mac_register_t is:

    typedef struct mac_register_s {
            uint_t          m_version;
            const char      *m_type_ident;
            void            *m_driver;
            dev_info_t      *m_dip;
            uint_t          m_instance;
            uint8_t         *m_src_addr;
            uint8_t         *m_dst_addr;
            mac_callbacks_t *m_callbacks;
            uint_t          m_min_sdu;
            uint_t          m_max_sdu;
            void            *m_pdata;
            size_t          m_pdata_size;
    } mac_register_t;

* m_version will be automatically set to the requested version by the
  mac_alloc() function. This allows the mac module to know what version
  of the MAC driver interface the driver was compiled against. Drivers do
  not need to explicitly set this field.

* m_type_ident is the string identifying which MAC type plugin the driver
  needs to use. For Ethernet, this would be set to MAC_PLUGIN_IDENT_ETHER
  (as defined in PSARC/2006/248).

* m_driver is set to the driver's instance private data. This will be
  passed as an argument to all driver callbacks.

* m_dip is the device's devinfo pointer.

* m_instance is set to 0 unless the driver wishes to associate a MAC
  instance number with this MAC that is different from
  ddi_get_instance(m_dip). This is used by drivers such as "aggr" that
  have one devinfo pointer, but register multiple MACs. In that case, the
  driver can register multiple MACs, each with the same m_dip, but each
  having a unique instance number.

* m_src_addr is the unicast address of the MAC at the time mac_register()
  is called.

* m_dst_addr is the destination address of the MAC at the time
  mac_register() is called. This field is meant to be used by MACs that
  have fixed destinations. It is thus optional and may be set to NULL.

* m_callbacks defines the list of driver entry points or callbacks that
  GLDv3 will use. The structure of mac_callbacks_t and the definition of
  each callback is described in section 2.2.

* m_min_sdu is set to the minimum payload size that can be conveyed by
  the media.

* m_max_sdu is set to the maximum payload size that can be conveyed by
  the media.

* m_pdata is an optional field (may be set to NULL if the plugin requires
  no data to function) and is used to pass state information to the MAC
  type plugin in use by the device. The structure of the data (if any) is
  defined by the plugin's documentation, and the validity of the
  registered data is verified by the plugin itself. If the data is
  invalid, mac_register() will fail. This data is copied by GLDv3, so no
  reference to data pointed to by m_pdata is kept by the framework. If
  this is set to NULL, then m_pdata_size must be set to 0. If this field
  is non-NULL and the requested plugin does not support MAC plugin data,
  then mac_register() will fail.

* m_pdata_size is the size of the data pointed to by m_pdata. This allows
  the GLDv3 framework to copy the data. If m_pdata is NULL (no plugin
  data is supplied), then this must be set to 0.

2.2 mac_callbacks_t
-------------------

Drivers use this structure in mac_register_t to enumerate the set of
driver callbacks that GLDv3 will use. The first argument to all callbacks
is a pointer to the m_driver field of mac_register_t as passed in through
mac_register().

The mac_callbacks_t structure is:

    typedef struct mac_callbacks_s {
            uint_t           mc_callbacks;
            mac_getstat_t    mc_getstat;    /* Get the value of a statistic */
            mac_start_t      mc_start;      /* Start the device */
            mac_stop_t       mc_stop;       /* Stop the device */
            mac_setpromisc_t mc_setpromisc; /* Enable or disable promiscuous mode */
            mac_multicst_t   mc_multicst;   /* Enable or disable a multicast addr */
            mac_unicst_t     mc_unicst;     /* Set the unicast MAC address */
            mac_tx_t         mc_tx;         /* Transmit a packet */
            mac_resources_t  mc_resources;  /* Get the device resources */
            mac_ioctl_t      mc_ioctl;      /* Process an unknown ioctl */
            mac_getcapab_t   mc_getcapab;   /* Get capability information */
    } mac_callbacks_t;

mc_callbacks is a flags field that drivers use to denote which _optional_
callbacks are set in the structure. The last three callbacks defined in
this structure are currently optional (mc_resources, mc_ioctl, and
mc_getcapab), and thus the only three possible flags are:

    MC_RESOURCES
    MC_IOCTL
    MC_GETCAPAB

Drivers that define none of mc_resources, mc_ioctl, or mc_getcapab set
mc_callbacks to 0. This flags field allows future additions to this
structure to not affect existing binaries, as existing binaries will not
set those future flags associated with new callbacks. As such, any
additions to this structure _must_ be accompanied by an associated
mc_callbacks flag.

In the previous GLDv3 model, callback functions (prefixed with "m_") were
set directly in mac_t. These callbacks map to the new mac_callbacks_t
callbacks as follows:

    m_stat      -> mc_getstat
    m_start     -> mc_start
    m_stop      -> mc_stop
    m_promisc   -> mc_setpromisc
    m_multicst  -> mc_multicst
    m_unicst    -> mc_unicst
    m_resources -> mc_resources
    m_ioctl     -> mc_ioctl

Except for mc_getstat and the new mc_getcapab (discussed below), the
semantics and signature of the callbacks in mac_callbacks_t are identical
to those in the previous model as set in mac_t.

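The role of the mc_callbacks flags field can be illustrated with a small
user-space sketch. Everything here is an assumption made for
illustration: the flag values, the trimmed sketch_callbacks_t layout, and
the sketch_ and xx_ names are hypothetical, and int stands in for
boolean_t. It shows only the idea that the framework consults the flag
before touching an optional callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values; this case names the flags but not their values. */
#define MC_RESOURCES 0x001u
#define MC_IOCTL     0x002u
#define MC_GETCAPAB  0x004u

typedef int (*sketch_getcapab_t)(void *arg, int capab, void *data);

/* Trimmed stand-in for mac_callbacks_t with a single optional callback. */
typedef struct sketch_callbacks {
    unsigned int      mc_callbacks;  /* which optional callbacks are set */
    sketch_getcapab_t mc_getcapab;   /* meaningful only if MC_GETCAPAB set */
} sketch_callbacks_t;

/*
 * Framework side: invoke the optional callback only when its flag is set.
 * A driver that does not set the flag is treated as supporting nothing,
 * and its mc_getcapab field is never read.
 */
static int
sketch_capab_get(const sketch_callbacks_t *mcp, void *arg, int capab,
    void *data)
{
    if (!(mcp->mc_callbacks & MC_GETCAPAB))
        return (0);
    return (mcp->mc_getcapab(arg, capab, data));
}

/* Driver side: a callback that claims every capability, for illustration. */
static int
xx_getcapab(void *arg, int capab, void *data)
{
    (void) arg;
    (void) capab;
    (void) data;
    return (1);
}
```

A driver that provides the callback would set mc_callbacks to MC_GETCAPAB
and point mc_getcapab at xx_getcapab; a driver that provides no optional
callbacks leaves mc_callbacks at 0.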
2.2.1 mc_getstat
----------------

    typedef int (*mac_getstat_t)(void *arg, uint_t stat, uint64_t *val);

This entry point is called to retrieve a value for a statistic. There are
two possible sets of statistics. One is the set of generic MAC statistics
defined in the mac_driver_stat enumeration in . The other set of
statistics is defined by the MAC type plugin in use by the driver. Some
plugins may define no statistics. The Ethernet plugin described in
PSARC/2006/248 does define statistics in , in the ether_stat enumeration.

The stat argument is one of these statistics. The function must either
return 0 and set the statistic's value in the "val" argument upon
success, or return a non-zero errno upon failure. For example, ENOTSUP
would be an acceptable return value if the statistic passed in was not
supported by the driver.

Note that this callback replaces what was the m_stat callback in the
previous model. That model had binary compatibility problems described in
bug 6226635 ("MAC stats interface could cause problems with binary
compatibility"). This new callback fixes those binary compatibility
problems.

2.2.2 mc_getcapab
-----------------

    typedef boolean_t (*mac_getcapab_t)(void *arg, mac_capab_t capab,
        void *data);

    typedef enum {
            MAC_CAPAB_HCKSUM,
            MAC_CAPAB_POLL
            /* new capabilities are defined here */
    } mac_capab_t;

This optional entry point is used to obtain the MAC's capabilities and
associated data from the driver. Capabilities are defined by the GLDv3
framework, as is the format of their associated data. The requested
capability is passed in as the second argument, and the function is
expected to return B_TRUE if the device supports that capability, or
B_FALSE if it does not. If it returns B_TRUE and the capability requires
associated data, the function is expected to fill in the data as stated
in the capability's documentation.

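A driver's mc_getcapab callback might look like the following user-space
sketch. It assumes the data formats given in the capability list in this
section (a uint32_t of transmit flags for MAC_CAPAB_HCKSUM, no data for
MAC_CAPAB_POLL). The xx_ names, the xx_state_t structure, and the HCKSUM
flag values are hypothetical, and bool stands in for boolean_t so the
example builds outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Capability identifiers as listed in this case. */
typedef enum {
    MAC_CAPAB_HCKSUM,
    MAC_CAPAB_POLL
} mac_capab_t;

/* Hypothetical flag values, for illustration only. */
#define HCKSUM_INET_PARTIAL 0x01u
#define HCKSUM_IPHDRCKSUM   0x10u

/* Hypothetical per-instance driver state (what m_driver would point to). */
typedef struct xx_state {
    bool xx_polling;    /* does this instance support GLDv3 polling? */
} xx_state_t;

static bool
xx_getcapab(void *arg, mac_capab_t capab, void *data)
{
    xx_state_t *xxp = arg;

    switch (capab) {
    case MAC_CAPAB_HCKSUM: {
        /* The data points to a uint32_t of transmit checksum flags. */
        uint32_t *txflags = data;

        *txflags = HCKSUM_INET_PARTIAL | HCKSUM_IPHDRCKSUM;
        return (true);
    }
    case MAC_CAPAB_POLL:
        /* No associated data; only report whether polling is supported. */
        return (xxp->xx_polling);
    default:
        /* A capability this driver was not compiled to know about. */
        return (false);
    }
}
```

Returning false from the default case is what keeps an older driver
correct when the framework later asks about a capability added after the
driver was built.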
In the previous model, capabilities of devices were communicated to GLDv3
as fields of the mac_info_t structure, which was embedded in the mac_t
used in mac_register(). This method caused problems with binary
compatibility, as no new capabilities could be defined in mac_info_t
without affecting the alignment of the rest of the fields in the mac_t.
This new capability mechanism addresses this binary compatibility problem
by allowing new capabilities to be defined without affecting the ABI.

There are currently only two capabilities defined:

* MAC_CAPAB_HCKSUM

  This is the capability of the device to perform TCP/IP checksum
  offload. The data points to a uint32_t which corresponds to the
  transmit flags associated with the capability. The driver must set
  these flags to represent the offload capability. The flags are defined
  in , and are:

      HCKSUM_ENABLE
      HCKSUM_INET_PARTIAL
      HCKSUM_INET_FULL_V4
      HCKSUM_INET_FULL_V6
      HCKSUM_IPHDRCKSUM

  These flags and their semantics were introduced in PSARC/2004/106 and
  2003/264. This capability replaces the mi_hcksum field in the
  mac_info_t structure used in the previous mac_register() model.

* MAC_CAPAB_POLL

  This capability has no associated data. The driver simply returns
  B_TRUE if it supports GLDv3 polling, or B_FALSE if it does not. This
  capability replaces the mi_poll field in the mac_info_t structure used
  in the previous mac_register() model.

3 New Driver Interfaces
=======================

The following two driver interfaces are introduced by this case.

3.1 mac_pdata_update
--------------------

    void mac_pdata_update(mac_handle_t mh, void *pdata, size_t datasize);

MAC-Type plugins may require, or optionally use, MAC plugin data in order
to perform their functions. This data is registered by drivers by setting
the m_pdata and m_pdata_size fields of the mac_register_t used in
mac_register(). When drivers need to update the data due to some
administrative interaction or some other event, they can use the
mac_pdata_update() function.

The data is copied by the GLDv3 framework and the new data is passed into
the MAC-Type callbacks of the plugin in use by the driver.

Because such MAC plugin data will cause the MAC-Type plugins to alter the
headers generated by the plugins, headers cached by IP fast-path need to
be flushed. As such, calling this function will also cause the GLDv3 mac
module to generate a MAC_NOTE_FASTPATH_FLUSH notification (introduced by
this case). The GLDv3 dld module will then generate a DL_NOTIFY_IND
message containing DL_NOTE_FASTPATH_FLUSH to affected DLPI consumers. The
DL_NOTE_FASTPATH_FLUSH mechanism was previously introduced by
PSARC/2000/285 (DLPI M_DATA fastpath flush notification).

3.2 mac_dest_update
-------------------

    void mac_dest_update(mac_handle_t mh, const uint8_t *addr);

Because some MAC-Types may require the configuration of a destination
address (IP Tunnels for example), the mac_register_t driver registration
structure allows drivers to register such an address. This function
allows drivers to modify the address after registration.

Calling this function will cause the GLDv3 mac module to generate a
MAC_NOTE_DEST notification (introduced by this case). The GLDv3 dld
module will then generate a DL_NOTIFY_IND message containing
DL_NOTE_PHYS_ADDR of type DL_CURR_DEST_ADDR (also introduced by this
case).

3.3 Changes to Other Driver Interfaces
--------------------------------------

As noted in section 2, drivers will now use an opaque mac_handle_t as the
first argument to driver interfaces, as opposed to the mac_t as in the
previous model.
The functions affected are:

    void mac_rx(mac_handle_t, mac_resource_handle_t, mblk_t *);
    void mac_link_update(mac_handle_t, link_state_t);
    void mac_unicst_update(mac_handle_t, const uint8_t *);
    void mac_tx_update(mac_handle_t);
    void mac_resource_update(mac_handle_t);
    mac_resource_handle_t mac_resource_add(mac_handle_t, mac_resource_t *);
    void mac_pdata_update(mac_handle_t, void *, size_t);
    void mac_multicst_refresh(mac_handle_t, mac_multicst_t, void *,
        boolean_t);
    void mac_unicst_refresh(mac_handle_t, mac_unicst_t, void *);
    void mac_promisc_refresh(mac_handle_t, mac_setpromisc_t, void *);

The semantics of these functions are unmodified by this case.

4 MAC Client Interface Changes
==============================

In order to give access to some of the new functionality defined above,
new MAC client interfaces must be defined. These will give the dld, dls
and aggr modules (the only three existing consumers of the MAC client
interfaces) access to the new functionality. In addition, modifications
to existing client interfaces need to be made to accommodate driver
interface changes.

4.1 New MAC Client Interfaces
-----------------------------

The following client interfaces are being introduced by this case.

4.1.1 mac_capab_get
-------------------

    boolean_t mac_capab_get(mac_handle_t mh, mac_capab_t cap,
        void *cap_data);

This function is called by a client that wishes to obtain MAC
capabilities from the driver. The set of capabilities is discussed in the
context of the mc_getcapab driver callback in section 2.2.2. This
functional interface replaces the previous use of the mac_info_t as a
repository of capability state.

4.1.2 mac_dest_get
------------------

    void mac_dest_get(mac_handle_t mh, uint8_t *dest_addr);

This function is analogous to the mac_unicst_get() function already
defined, but it instead obtains the current destination MAC address.

4.2 Modifications to Existing MAC Client Interfaces
---------------------------------------------------

The following MAC client interfaces are being modified by this case:

4.2.1 mac_open
--------------

    int mac_open(const char *macname, uint_t ddi_instance,
        mac_handle_t *mhp);

The function signature of mac_open() does not change, but the semantics
of the first two arguments of mac_open() are being modified similarly to
dls_create(). These semantic changes are described in section 5.1.

4.2.2 mac_stat_get
------------------

    uint64_t mac_stat_get(mac_handle_t mh, uint_t stat);

The modified mac_stat_get() interface takes a uint_t as a second argument
instead of an "enum mac_stat". The reason for this change is that prior
to this case, all statistics were defined in in "enum mac_stat". This
case introduces plugin-defined statistics that are defined in
plugin-specific header files. Therefore, the statistics that can be
requested through the mac_stat_get() interface cannot be confined to a
single enum type.

5 DLS Client Interface Changes
==============================

5.1 dls_create
--------------

    int dls_create(const char *linkname, const char *macname,
        uint_t ddi_instance);

The function signature of dls_create() does not change, but the semantics
of the last two arguments are being modified here. In PSARC 2004/571, the
second argument was described as the "driver instance name", and the
third as the "m_port value" of the MAC associated with the new link.
Because the new mac_register() mechanism described in section 2 does away
with the concept of port numbers, the MAC is now described using its "MAC
name" and the associated driver's DDI instance number.

The MAC name is constructed by the mac module when a MAC registers using
mac_register(). Its format is , where instance is either the DDI instance
number of the registered DIP if m_instance is 0, or m_instance if
m_instance is non-zero. The semantics of m_instance are described in
section 2.1.

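The instance-selection rule for the MAC name can be written down as a
one-line helper. This is only a sketch of the rule as stated above; the
function name is hypothetical and the construction of the name string
itself is deliberately left out.

```c
#include <assert.h>

/*
 * Pick the instance number used in the MAC name: m_instance when the
 * driver set it to a non-zero value, otherwise the DDI instance number
 * of the registered devinfo node.
 */
static unsigned int
sketch_mac_instance(unsigned int m_instance, unsigned int ddi_instance)
{
    return (m_instance != 0 ? m_instance : ddi_instance);
}
```

One consequence of using 0 as the "not set" value is that a driver cannot
use m_instance to request instance 0 explicitly; a value of 0 always
selects the DDI instance number.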
6 Interface Table
=================

_____________________________________________________________________________
|                            Interfaces Exported                            |
|_________________________|_______________________|_________________________|
| Interface               | Classification        | Comments                |
|_________________________|_______________________|_________________________|
| mac_alloc()             | Consolidation Private |                         |
| mac_free()              | Consolidation Private |                         |
| mac_register()          | Consolidation Private | (modified)              |
| mac_unregister()        | Consolidation Private | (modified)              |
| mac_rx()                | Consolidation Private | (modified)              |
| mac_link_update()       | Consolidation Private | (modified)              |
| mac_unicst_update()     | Consolidation Private | (modified)              |
| mac_tx_update()         | Consolidation Private | (modified)              |
| mac_resource_update()   | Consolidation Private | (modified)              |
| mac_resource_add()      | Consolidation Private | (modified)              |
| mac_multicst_refresh()  | Consolidation Private | (modified)              |
| mac_unicst_refresh()    | Consolidation Private | (modified)              |
| mac_promisc_refresh()   | Consolidation Private | (modified)              |
| mac_open()              | Consolidation Private | (modified)              |
| mac_pdata_update()      | Consolidation Private |                         |
| mac_dest_update()       | Consolidation Private |                         |
|                         |                       |                         |
| MC_RESOURCES            | Consolidation Private |                         |
| MC_IOCTL                | Consolidation Private |                         |
| MC_GETCAPAB             | Consolidation Private |                         |
| MAC_VERSION             | Consolidation Private |                         |
| MAC_NOTE_FASTPATH_FLUSH | Consolidation Private |                         |
| MAC_NOTE_DEST           | Consolidation Private |                         |
|                         |                       |                         |
| mac_register_t          | Consolidation Private |                         |
| mac_callbacks_t         | Consolidation Private |                         |
| mac_capab_t             | Consolidation Private |                         |
|                         |                       |                         |
| dls_create()            | Consolidation Private | (modified)              |
|                         |                       |                         |
| DL_CURR_DEST_ADDR       | Evolving              |                         |
|_________________________|_______________________|_________________________|