#ident "@(#)shp-userland.txt 1.6 09/06/05 SMI"
Copyright 2009 Sun Microsystems

Title:    Userland Components of Solaris Hotplug
Date:     June 5, 2009
Author:   scott.carter@sun.com

Abstract:

  This document highlights the userland components of the Solaris
  Hotplug Framework, and is an extension of the main proposal for that
  project.

Contents:

  1. Introduction
  2. Architecture
  3. Technical Details
  4. Interfaces

1.0 Introduction

This document describes only the userland components of the Solaris
Hotplug Framework project, and specifically only phase 1 of the
project.  Kernel details are documented separately.

The first phase of the userland implementation supports the following
new features introduced by the Solaris Hotplug Framework: virtual
hotplug support, and an improved SHPC state model.

1.1 Scope and Roadmap

The Solaris Hotplug Framework project is a multi-phase project.  The
first phase introduces the following features into userland:

- New hotplug(1M) CLI to support physical and virtual hotplugging.
- New cfgadm plugin to support physical hotplugging of connectors.
- New libhotplug(3LIB) library shared by hotplug(1M) and cfgadm.
- New hotplugd(1M) daemon to centrally manage hotplug operations.

Userland features that are planned for a second phase include:

- Management of DDI hotplug events for third party consumers.
- Management of user friendly aliases to hotplug connectors and ports.
- Management of blacklisted components.
- RBAC based authentication.

A third phase of userland functionality would add a GUI built upon the
libhotplug(3LIB) interfaces.
1.2 References

- PSARC/2008/181 Solaris Hotplug Framework: Architecture and Design
- PSARC/1998/460 RCM Framework 2.0 Architecture

2.0 Architecture

Here is a block diagram of the various userland components:

  +-------------+          +-------------------------------+
  | hotplug(1M) |          |           cfgadm(1M)          |
  |  (SHP CLI)  |          +-------------------------------+
  +-------------+          |           libcfgadm           |
  | libhotplug  |          +------------+------------------+
  +-------------+          | SHP plugin | Other Plugins... |
        |                  +------------+------+-----------+
        |                  | libhotplug |      |  librcm   |
        | Socket           +------------+      +-----------+
        |                        |                   |
        |                        | Socket            |
        |                        |                   |
  +-----------------------------------+              |
  |            hotplugd(1M)           |              |
  |            (SHP Daemon)           |              |
  +---------------------+-------------+              |
  | libdevinfo / modctl |   librcm    |              V
  +---------------------+-------------+        +------------+
        |                     |                | RCM Daemon |
        |                     +--------------->+------------+
        |                                            |
        |   Device             +----------------+    |
        |   Contract           | Other Consumer |<---+
        |   Events    +------->|  Applications  |
        |             |        +----------------+
        |             |
        |             |                                   Userland
  . . . | . . . . . . | . . . . . . . . . . . . . . . . . . . . . .
        V             |                                   Kernel

The architecture includes the hotplugd(1M) daemon that centrally
manages all ongoing hotplug operations, both synchronous and
asynchronous.  An asynchronous operation is one detected by the kernel
(e.g. a surprise removal or an ATTN button event).  A synchronous
operation is one that a user initiated through a CLI or GUI.

The daemon serializes operations to ensure there are no conflicting
operations.  It fully sequences all operations by coordinating with
the in-kernel portions of the SHP framework through modctl APIs, and
with other consumer applications through the RCM framework.

Users can initiate operations, manage hotplug connections, or view the
status and dependency relationships of hotplug connections through the
hotplug(1M) CLI or the legacy cfgadm(1M) CLI.
In either case, the application uses the libhotplug(3LIB) library to
communicate through a private socket based IPC mechanism to
hotplugd(1M), where the operations are actually implemented.

3.0 Technical Details

3.1 Hotplug CLI

The hotplug(1M) CLI allows a user to:

- View a list of defined hotplug connectors and ports, their status,
  dependencies, and usage.
- Initiate state change operations on hotplug connectors and ports.
- Perform private, bus specific functions on a hotplug connector or
  port.

This section gives a high level summary of the capabilities of the
hotplug(1M) CLI.  Refer to the hotplug(1M) man page for full details.

3.1.1 Displaying Hotplug Connectors

Consider the following about hotplug connections:

- Each hotplug connector has a name and a current state.
- Each hotplug connector has one or more dependent hotplug ports.
- Each hotplug port also has a name and a current state.
- Each hotplug port has one or more dependent device nodes.
- There is a hierarchy of connectors and ports, some dependent upon
  others, depending upon the physical composition of the hotpluggable
  components in the hardware.
- Connectors and ports are integrated in the device tree hierarchy,
  each one being associated with a specific device node.
- Beyond device tree dependencies, additional layers of dependency
  arise from other subsystems (e.g. filesystem mounts, networks).

All of these details are of interest to a system administrator when
evaluating the impact of a hotplug operation on a system, and all of
this necessary information is displayed by the hotplug(1M) CLI.

The natural way to represent such hierarchical relationships is as a
graph or tree.  The hotplug(1M) CLI displays these details as a tree,
in a manner consistent with other existing CLIs (such as prtconf(1M)
and prtpicl(1M)), by indenting each subsequent layer to show the
dependency relationships.  In reality the structure is a graph and not
a tree, because multi-pathed resources have multiple parents.
In these cases, the hotplug(1M) CLI will display some resources
multiple times, once per path.

A user may display the status of:

1) an individual connection,
2) a subtree of an individual connection and its dependents, or
3) a full tree of all defined connectors and ports with their
   dependencies.

For each hotplug connector or port, the hotplug(1M) CLI displays:
1) its name, and 2) its current state.

The hotplug(1M) CLI displays information with varying levels of
verbosity.  It can show just the hotplug connectors and ports; it can
also include all their dependent device nodes; and it can further
include detailed usage information.  Detailed usage information is
gathered from the RCM framework, and includes higher level uses of
devices (e.g. filesystem mounts, plumbed network interfaces, etc.).

3.1.2 Initiating State Change Operations

In general, initiating a state change operation involves specifying
the following:

o Target device path.
o Target hotplug connector or port associated with the device path.
o The hotplug state the target should be transitioned to.

Initiating a state change operation on a hotplug connection will
affect the full hierarchy of dependents below the target.

3.1.3 Virtual Hotplug Support

There are physical hotplug connectors, and physical components that
can be inserted or removed in those connectors.  Traditionally the
hotplug features in Solaris were centered on this physical style of
hotplugging.  The terminology used "attachment points" to describe
physical receptacles and their occupants.

But except for the physical actions of inserting or removing the
components, the remainder of a hotplug operation is quite generic.  It
mostly entails probing or de-probing devices, attaching or detaching
device drivers, and reconfiguring the higher levels of resource
consumption.  It is not necessary to limit hotplugging to the
boundaries of physical components.
This is especially true when components may be multi-function devices
whose resources could be divided and managed separately.  Virtual
hotplugging improves the situation by giving an administrator finer
grained control over the system's configuration, and also allows
allocating resources individually to virtualized environments.

Virtual hotplugging introduces new terminology.  Hotplug connectors
describe physical locations where hotpluggable components can be
inserted or removed.  Devices in the Solaris device tree represent the
logical hardware functions that each have their own attached driver.
A hotplug port manages the connection of a device to the system.

There exists a hotplug port for each device, regardless of whether it
can be physically hotplugged.  Each bus nexus is associated with one
or more hotplug ports to represent its dependent devices, and virtual
hotplug operations can be performed on each port.  If physical
hotplugging is possible, then the bus nexus will also have an extra
layer of hotplug connectors upon which the ports depend, to manage the
related physical hotplug operations.

The hotplug(1M) CLI can perform hotplug operations on both connectors
and ports.  The legacy cfgadm(1M) CLI is limited to physical hotplug
operations on connectors only.

3.1.4 Private Bus Functions

There will always be extra functionality implemented privately by bus
controllers that just doesn't fit well in a generic state model for
hotplugging.  Therefore the Solaris Hotplug Framework provides a
mechanism to initiate private, bus-specific functions on hotplug
connectors.

The hotplug(1M) CLI includes a passthru command which allows a user to
specify a comma-separated list of private options, such as could be
parsed by getsubopt(3C).  These options are arbitrary name/value
pairs.  The hotplug(1M) CLI parses the options and puts them into an
nvlist_t data structure, which it then transmits to the hotplugd(1M)
daemon.
The hotplugd(1M) daemon in turn passes the nvlist_t data structure to
the driver that controls the target hotplug connector.

3.2 Hotplug Library

The hotplug library connects administrative commands to the
hotplugd(1M) daemon, where all hotplug operations are then
implemented.  The library exports a management API to its clients,
allowing them to locate, list, and initiate operations on hotplug
connectors and ports.  A private socket based IPC mechanism is then
used internally to communicate with the daemon to actually retrieve
hotplug status information and perform commands.

3.2.1 Authentication

All requests for hotplug status information are granted.  But to
perform a state change operation, root privilege is required.
Conversion to RBAC will occur in a later phase of implementation.

3.2.2 Management API

The information about hotplug connectors, ports, and their devices is
represented as a graph.  The interfaces to get and process all of this
information are as follows:

- An application first gets a snapshot of hotplug information.  A path
  can be specified to select a subset, and flags are used to indicate
  the level of detail included.

- There are functions to traverse the nodes of the snapshot.  The
  caller can manually traverse through child and sibling nodes, or
  perform an automated traversal with a callback function.

- For each node in the snapshot, there are accessor functions to get
  individual data items, including the node's name, current state,
  what type of node it is, and any verbose usage description
  associated with the node.

- There is a control function to perform an action on a node, to
  initiate a state change operation.

- Once finished, a final interface exists to clean up and destroy the
  snapshot.

3.2.3 Socket Based IPC Mechanism

The hotplug library establishes socket connections to the hotplug
daemon to get information snapshots, and to initiate actions on
hotplug connectors and ports.
Each socket connection exists only temporarily, until the snapshot is
retrieved or the action is completed.  The socket protocol includes:

1. An initial phase to negotiate and establish the socket, exchanging
   details about the caller's identity and what locale should be used
   to internationalize any resulting error messages.

2. A set of commands sent over an established socket session to either
   request a snapshot with specific attributes, or initiate a control
   operation on a specific target.

3. An intermediate phase in which messages may be sent by the hotplug
   daemon to indicate forward progress or report on errors related to
   the current command.

4. Finally, termination of the session.

If a caller first requests a snapshot, a separate session will be
established just to retrieve that information.  If the caller later
initiates a control operation on a specific hotplug connection while
traversing the snapshot, a new separate session will be established
just to send the relevant command to the daemon and to get the results
of that operation.  Sessions have a short lifespan.

3.3 Hotplug Daemon (hotplugd(1M))

The hotplug daemon is an SMF managed service, which operates as a
socket server that accepts incoming libhotplug.so connections only
from the localhost.  (It could be expanded to accept external
connections later, if desired.)  It also receives system events from
the kernel which indicate when any asynchronous hotplug operations
occur, such as ATTN button events or surprise removals.

There are certain architectural reasons why a hotplug daemon is
actually required, rather than just implementing shared functionality
in the hotplug library.  These reasons are:

- If a client application crashes while performing a hotplug
  operation, the system may be left in an inconsistent state.  Because
  the hotplug daemon is managed by SMF, and because all operations are
  implemented by the hotplug daemon, recovery scenarios can then be
  ensured and automated.
- All RCM interactions strictly require root privilege.  The hotplug
  daemon runs with root privilege and performs all RCM interactions.
  Client applications run in separate address spaces and communicate
  with the hotplug daemon through a socket.  The requirement for full
  root privilege therefore does not have to extend to the clients.
  Finer grained access control (through RBAC) can then be implemented
  in the hotplug library before it opens a socket to the daemon.

3.3.1 Service Management Facility (SMF)

The hotplugd(1M) daemon is managed by SMF, the Service Management
Facility.  There is only one service instance per running domain, and
its FMRI is:

- svc:/system/hotplug:default

Using SMF simplifies the effort to manage the daemon as a long running
service: it takes care of automatically starting and restarting the
service.  Even though the hotplug daemon is a socket server, it will
be directly managed by SMF instead of inetd.

To make the hotplug feature more robust, SMF will automatically
restart the hotplug daemon if it ever crashes.  A transaction log of
ongoing operations will be maintained by the hotplug daemon so that it
can restore the system to a consistent state if it ever fails in the
middle of an operation.

3.3.2 Socket Based IPC Mechanism

The IPC mechanism for libhotplug.so clients to communicate with the
hotplugd(1M) daemon is based on sockets.  This has certain benefits:

- Detecting when client applications unexpectedly terminate is easy.

- Detecting when the server is unavailable or unexpectedly terminates
  is equally simple on the client side.

- Existing features from libnvpair simplify the task of packing data
  for transport over a socket in an efficient manner.

- Although this is not in the current scope of the Solaris Hotplug
  Framework, expanding support to remote client applications would be
  trivial if desired in the future.

Alternatives such as RPC and doors are too heavyweight and too
complicated.
Not enough data will be transmitted between the daemon and its clients
to justify the extra complexity in exchange for the minor performance
improvements that might result from alternatives, such as a doors
based IPC mechanism.

3.3.3 Serialization of Hotplug Operations

Internally, the hotplugd(1M) daemon serializes incoming requests to
perform synchronous hotplug operations.  It does the necessary locking
to avoid interference between simultaneous operations that collide or
overlap.  It is the central arbiter for all hotplug operations that
ultimately go through a libdevinfo(3LIB) and modctl based interface to
the in-kernel portions of the Solaris Hotplug Framework.  Higher level
sequencing with other frameworks is managed either through the new
device contract event interfaces directly consumed by other consumer
applications, or by interactions with the RCM framework.

3.3.4 RCM Interactions

The hotplugd(1M) daemon interacts with the RCM framework for two
separate reasons.  First, it collects detailed resource usage
information from the RCM clients, which is then integrated with other
hotplug connection information when providing verbose listings.
Second, it sequences RCM offline operations on the resources affected
by any hotplug connector/port state change operations.

The RCM framework must be informed of the root of an operation before
the hotplugd(1M) daemon initiates a state change operation in the
kernel.  RCM operations are also transactional, which means that
hotplugd(1M) must further interact with the RCM framework at the end
of an operation to notify RCM whether the operation succeeded or
failed.  RCM clients take specific actions to restore the use of a
resource if a hotplug attempt has failed.  Conversely, when a hotplug
operation succeeds, RCM clients need to be informed that it is now
safe to discard the information they retained in case restoration was
necessary.
3.3.5 Gathering Hotplug Information

Information about hotplug connectors, ports, and their states is
managed in the kernel and exported to userland by libdevinfo(3LIB).
Because the information is distributed throughout the device tree as
additional properties of device nodes, this technique naturally
supports gathering the hierarchical relationships between hotplug
connectors, ports, and their dependent device nodes.

The interfaces added to libdevinfo(3LIB) for this purpose are:

- di_init(3DEVINFO): specify the flag DINFOHP to include hotplug
  connector information in the device tree snapshot.

- Each device node (di_node_t) will have a list of related hotplug
  connectors or ports (di_hp_t) that are associated with the node.

- To walk the list of hotplug connections associated with a device:

      int di_walk_hp(di_node_t node, (*hp_callback), void *arg);
      int (*hp_callback)(di_node_t node, di_hp_t hp, void *arg);

- Alternatively, to traverse the list manually:

      di_hp_t di_hp_next(di_node_t node, di_hp_t hp);

  Calling di_hp_next() with 'hp' equal to DI_HP_NIL returns the head
  of the list.  When the end of the list is reached, di_hp_next() will
  return the value DI_HP_NIL.

- To access specific fields of information about a hotplug connection:

      char *di_hp_name(di_hp_t hp);
          (Returns the name of the hotplug connector/port.)

      int di_hp_physical(di_hp_t hp);
          (Returns 1 if physical, 0 if virtual.)

      char *di_hp_type(di_hp_t hp);
          (Returns a string describing the hotplug connector/port
          type.)

      int di_hp_physnum(di_hp_t hp);
          (Returns the physical slot number associated with the
          connector/port.)

      int di_hp_virtnum(di_hp_t hp);
          (Returns the virtual slot number associated with the
          connector/port.)

      int di_hp_state(di_hp_t hp);
          (Returns the current state of the hotplug connector/port.)

The valid states of a hotplug connector or port are defined in a
system header file.
The states include the following:

- DDI_HP_STATE_EMPTY
- DDI_HP_STATE_PRESENT
- DDI_HP_STATE_POWERED
- DDI_HP_STATE_ENABLED
- DDI_HP_STATE_PORT_EMPTY
- DDI_HP_STATE_PORT_PRESENT
- DDI_HP_STATE_OFFLINE
- DDI_HP_STATE_ONLINE
- DDI_HP_STATE_MAINTENANCE

The libdevinfo(3LIB) interfaces allow the hotplugd(1M) daemon to get
all information about hotplug connectors and their dependent device
nodes, which can then be packaged and transmitted to a client as the
separate type of snapshot used by libhotplug.so.

If a client requests additional details about how devices are used,
then the hotplugd(1M) daemon must also gather information from the RCM
framework and integrate the resulting RCM information tuples into
their rightful place within the hotplug information snapshot.  Since
an RCM information tuple could be reached through multiple device
paths, this is the point at which the hotplug information snapshot
becomes a graph instead of just a tree.

3.3.6 modctl API

A libhotplug.so caller initiates a state change operation by
referencing a node in its hotplug information snapshot.  The snapshot
includes a name and path associated with each hotplug connector.  This
path and name are transmitted by the libhotplug.so client over the
socket based IPC mechanism to the hotplugd(1M) daemon to identify the
target of a state change operation.  The hotplugd(1M) daemon then uses
a modctl API to initiate the operation in the kernel, referencing the
target of the operation by the given hotplug connector name.  The
modctl API is described in other documentation that defines the kernel
interfaces of the Solaris Hotplug Framework.

3.4 Libcfgadm Plugin (cfgadm_shp.so)

Existing libcfgadm plugins for other DR and hotplug features will
remain unchanged.  But to integrate support for the new hotplug
connectors defined by the Solaris Hotplug Framework into the existing
cfgadm framework, a new libcfgadm plugin is required.
Like the hotplug(1M) CLI, it will use libhotplug.so to interact with
the hotplugd(1M) daemon.  It will operate only upon physical
connectors, and not virtual ports; virtual hotplugging is not
supported through cfgadm.  Its interactions with hotplugd(1M) will
include gathering connector information which can then be integrated
into the cfgadm(1M) output, and initiating state change operations on
those connectors.

In addition to the plugin, other changes are also required in the
generic libcfgadm.so library.  Currently, libcfgadm.so searches the
device tree for "attachment points", and dispatches operations on
these targets to class-specific plugins.  This information is found in
libdevinfo(3LIB).  The libcfgadm.so library is modified to also
recognize the new hotplug connectors in libdevinfo(3LIB), and to
dispatch operations on these new targets to the SHP specific plugin.

3.5 Other Consumer Applications

High level consumers of hotpluggable devices include features such as
the Solaris network stack, filesystem mounts, clustering features,
storage volumes, and various multipathing features for storage and
networks.  Many of these applications already integrate with the RCM
framework to synchronize their reconfiguration with DR or hotplug
operations.  The hotplugd(1M) daemon interacts when appropriate with
the RCM framework to synchronize these consumers, and to gather their
usage information for verbose hotplug information displays.

4.0 Interfaces

4.1 Exported Interfaces

Deliverables:

Interface              Stability      Comments
----------------------------------------------------------------------
/usr/sbin/hotplug      Consol. Priv.  hotplug(1M) CLI (See manpage)
/usr/sbin/hotplugd     Consol. Priv.  hotplugd(1M) daemon
/lib/libhotplug.so.1   Consol. Priv.  libhotplug(3LIB) (See manpage)

Management Interfaces (libhotplug(3LIB)):

Interface              Stability      Comments
----------------------------------------------------------------------
HP_NODE_DEVICE         Consol. Priv.  Node type for a device node.
HP_NODE_CONNECTOR      Consol. Priv.  Node type for a physical
                                      connector.
HP_NODE_PORT           Consol. Priv.  Node type for a virtual port.
HP_NODE_USAGE          Consol. Priv.  Node type for a usage record.
hp_node_t              Consol. Priv.  Structure of a node in a
                                      snapshot.
hp_init()              Consol. Priv.  Initialize a hotplug snapshot.
hp_fini()              Consol. Priv.  Cleanup/remove a hotplug
                                      snapshot.
hp_traverse()          Consol. Priv.  Traverse nodes in a snapshot.
hp_name()              Consol. Priv.  Get a node's name.
hp_is_physical()       Consol. Priv.  Boolean test if node is a
                                      physical hotplug connector.
hp_type()              Consol. Priv.  Get a node's type description.
hp_state()             Consol. Priv.  Get a node's current state.
hp_usage()             Consol. Priv.  Get a node's usage description.
hp_parent()            Consol. Priv.  Get a node's parent.
hp_child()             Consol. Priv.  Get a node's first child.
hp_sibling()           Consol. Priv.  Get a node's next sibling.
hp_set_state()         Consol. Priv.  Command to initiate a state
                                      change.

4.2 Imported Interfaces

From libdevinfo(3LIB):

Interface              Stability      Comments
----------------------------------------------------------------------
DINFOHP                Consol. Priv.  Flag for di_init() to include
                                      hotplug information in snapshot.
DI_HP_NIL              Consol. Priv.  A NULL di_hp_t structure.
di_hp_t                Consol. Priv.  Structure of a hotplug connector
                                      associated with a di_node_t.
di_walk_hp()           Consol. Priv.  Traverse hotplug connectors
                                      associated with a di_node_t.
di_hp_next()           Consol. Priv.  Get next di_hp_t in a list.
di_hp_name()           Consol. Priv.  Get name of a di_hp_t connector.
di_hp_state()          Consol. Priv.  Get state of a di_hp_t
                                      connector.
di_hp_physical()       Consol. Priv.  1 if physical, 0 if virtual.
di_hp_type()           Consol. Priv.  Return description of the
                                      hotplug handle (e.g. "PCI
                                      Slot").
di_hp_child()          Consol. Priv.  Return child device node of a
                                      virtual hotplug port.
di_hp_physnum()        Consol. Priv.  Return physical slot number
                                      associated with a connection.
di_hp_virtnum()        Consol. Priv.  Return virtual slot number
                                      associated with a connection.