7. Daemon

Version 0.18, 2009-Mar-17

The NWAM daemon (nominally nwamd) is the policy component in NWAM. Given the current reality, it determines whether there is a configuration that can be automatically applied and, if not, asks the UI to query the user for further data. In order to do this, nwamd needs to integrate information from multiple sources, and in doing so fulfill multiple roles. These are described in the next few sections.

7.1 Daemon Overview

nwamd is the daemon that controls network autoconfiguration; it does this by

7.2 Daemon Startup and Shutdown Behavior

The daemon supports two different startup modes: hard reset and soft reset. When the daemon starts, it checks for the /etc/svc/volatile/nwam/nwamd_soft_reset file. This token is used to indicate that on restart, nwamd should carry out a soft reset. If the nwamd_soft_reset file does not exist, nwamd creates it (so that future starts will be soft resets), and performs some specific hard reset initialization steps; if the file does exist, nwamd skips the hard reset initialization steps and continues in soft reset mode.

When nwamd exits, it will remove the soft reset marker. The marker will also be automatically removed on boot, as it is in the /etc/svc/volatile filesystem. This ensures that on the first launch after boot, and on any launch that follows a clean shutdown, nwamd will run in hard reset mode. Any launch that follows an unexpected shutdown will result in nwamd launching in soft reset mode.
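
The marker logic above can be sketched as follows. This is a minimal Python model, not the daemon's actual code: the function names are hypothetical, and a temporary directory stands in for /etc/svc/volatile/nwam so the example can run anywhere.

```python
import os
import tempfile


def choose_startup_mode(marker_path):
    """Return 'soft' if the marker exists; otherwise create it and return 'hard'."""
    if os.path.exists(marker_path):
        return "soft"
    # No marker: this is a hard reset. Create the marker so that a restart
    # after an unexpected exit is treated as a soft reset.
    os.makedirs(os.path.dirname(marker_path), exist_ok=True)
    with open(marker_path, "w"):
        pass
    return "hard"


def clean_shutdown(marker_path):
    """On a clean exit, remove the marker so the next start is a hard reset."""
    try:
        os.remove(marker_path)
    except FileNotFoundError:
        pass


def demo():
    # The temporary directory plays the role of the volatile filesystem.
    with tempfile.TemporaryDirectory() as volatile:
        marker = os.path.join(volatile, "nwam", "nwamd_soft_reset")
        modes = [choose_startup_mode(marker)]       # first launch after boot
        modes.append(choose_startup_mode(marker))   # restart after a crash
        clean_shutdown(marker)
        modes.append(choose_startup_mode(marker))   # restart after clean exit
        return modes
```

Calling demo() walks through a first launch, a restart after an unexpected exit (no clean_shutdown() call), and a restart after a clean exit.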

The hard reset initialization steps are:

Work then continues with the steps common to both hard and soft reset:

When nwamd is shut down, the following steps are performed:

These steps will be described in more detail in section 7.5, where the complete nwamd state machine is outlined. Next we discuss the specifics of event handling and event propagation.

7.3 nwamd as Event Handler

The NWAM daemon needs to handle many different events triggered both externally by the system or user and internally by its different threads. For example, it needs to handle the removal of a link, which causes the IP interface that depends on that link to become unavailable. It also needs to handle internally generated events such as completion of the thread responsible for gathering interface information. Here we explain the different event sources monitored and how the events are retrieved and handled by the NWAM daemon. These events then trigger NWAM events that interested parties (including nwamd) can capture and process (NWAM events are discussed in section 7.4).

Event sources:

7.4 nwamd as Event Dispatcher

nwamd is the network configuration policy engine of the system. There are also UI components which inform the user about how the configuration of the system is changing (link up/down) and which query the user when information is needed (connect to an essid).

Below is a table of events that nwamd captures or creates and turns into event messages of type NWAM_EVENT_TYPE_... for consumption by interested processes. Event listeners register interest in NWAM events by calling the libnwam function nwam_events_register() and supplying a callback. The callback is free to ignore events that are not of interest. nwamd itself will often carry out actions in response to events it detects, as well as signaling to interested parties that the event has occurred. The doors IPC mechanism is used for event dispatch from nwamd to registered event listeners.

A simple example of an event listener is the NWAM GUI, which listens for the NWAM_EVENT_TYPE_WLAN_SCAN_REPORT event. This event provides the GUI with a list of available wireless networks gathered by nwamd via dladm_wlan_scan().

Event Trigger            Meaning                                Event source       NWAM_EVENT_TYPE_...
RTM_NEWADDR              New address for IP interface           Routing socket     IF_ACTION
RTM_DELADDR              Address removed                        Routing socket     IF_ACTION
DL_NOTE_LINK_UP          Wired link is up                       DLPI notification  LINK_STATE
                         Wireless link is connected to AP
DL_NOTE_LINK_DOWN        Wired link is down                     DLPI notification  LINK_STATE
                         Wireless link is disconnected from AP
EC_DEV_ADD               NIC hotplug inserted                   sysevent           LINK_ACTION
EC_DEV_REMOVE            NIC hotplug removed                    sysevent           LINK_ACTION
WLAN scan done           dladm_wlan_scan()                      nwamd              WLAN_SCAN_REPORT
WLAN connection state    dladm_wlan_getlinkattr()               nwamd              WLAN_CONNECTION_REPORT
more information needed  no recognized WLAN available           nwamd              WLAN_NEED_CHOICE
more information needed  WEP or WPA key needed                  nwamd              WLAN_NEED_KEY
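
As an illustration of the capture-and-dispatch pattern, the sketch below maps the externally generated triggers from the table onto NWAM event types and fans them out to registered callbacks. It is a simplified in-process model in Python: the Dispatcher class and its methods are hypothetical stand-ins, and the real mechanism is nwam_events_register() with delivery over doors.

```python
# Raw trigger -> NWAM event suffix, transcribed from the first six table rows.
TRIGGER_TO_EVENT = {
    "RTM_NEWADDR": "IF_ACTION",
    "RTM_DELADDR": "IF_ACTION",
    "DL_NOTE_LINK_UP": "LINK_STATE",
    "DL_NOTE_LINK_DOWN": "LINK_STATE",
    "EC_DEV_ADD": "LINK_ACTION",
    "EC_DEV_REMOVE": "LINK_ACTION",
}


class Dispatcher:
    """Hypothetical in-process stand-in for nwamd's door-based dispatch."""

    def __init__(self):
        self._callbacks = []

    def register(self, callback):
        # Plays the role of nwam_events_register(): remember the callback.
        self._callbacks.append(callback)

    def handle_trigger(self, trigger, detail=None):
        # Convert a raw trigger into an NWAM event and send it to everyone.
        event_type = "NWAM_EVENT_TYPE_" + TRIGGER_TO_EVENT[trigger]
        for callback in self._callbacks:
            callback(event_type, detail)
        return event_type


def make_link_state_listener(log):
    """Build a callback that records LINK_STATE events and ignores the rest."""
    def on_event(event_type, detail):
        if event_type != "NWAM_EVENT_TYPE_LINK_STATE":
            return  # callbacks are free to ignore uninteresting events
        log.append(detail)
    return on_event
```

A listener built with make_link_state_listener() sees every dispatched event but only records link state changes, which mirrors the rule that a callback may ignore events it does not care about.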

7.4.1 IP Interface Events

The first two events pertain to IP interfaces, and occur when an IP interface is assigned an address (via DHCP or statically) or an address is removed (for example, when a DHCP lease expires). These events are delivered via a routing socket and converted into NWAM events.

7.4.2 Datalink Events

The next four events provide information about datalinks. First are the two LINK_STATE events, which indicate layer 2 connection to or disconnection from a network: either the cable has been plugged in/unplugged in the case of an 802.3 link, or a connection to a wlan has been made/lost in the case of an 802.11 link. nwamd receives these events via a libdlpi(3LIB) notification callback.

The LINK_ACTION event is generated when a sysevent is received indicating that a NIC has been inserted or removed. NIC hotplug events are reported by EC_DEV_ADD and EC_DEV_REMOVE events of subclass ESC_NETWORK.

7.4.3 Wireless Notification Events

Event clients that wish to receive information regarding wireless configuration should use the *_REPORT events. The SCAN_REPORT event supplies the results of wireless scans, while the CONNECTION_REPORT event is a notification of successful connection to a wireless network. The connection report includes information, such as signal strength, about the connected wlan.

7.4.4 More Information Events

The WLAN_NEED_* events indicate that nwamd cannot determine the next action to take in configuring a specific NCU; thus, it needs more information from the user.

One example would be the case where nwamd finds that there is one wireless AP, broadcasting essid "ap7", which is not in the preferred wlan list. Thus nwamd cannot connect to this wlan. In this case, nwamd will send a WLAN_NEED_CHOICE event to advertise the fact that it needs more information from the user (is it okay to connect to this AP?) in order to proceed. If the user chooses to go forward with the connection, the UI can update the preferred essid list in the configuration repository; this update will cause NWAM to reevaluate its configuration and determine that it can now proceed with configuration of the wireless NCU.

Our philosophy for dealing with this situation is that nwamd will send the WLAN_NEED_CHOICE event to any registered listeners (which will typically be UIs) indicating that nwamd is trying to configure an NCU, but has reached a point where it cannot proceed without additional information. The name of the NCU in question will be sent with the event. After successfully sending the event, nwamd will mark the NCU as 'waiting-for-response' and go on with its work without blocking on or otherwise waiting for a response to the WLAN_NEED_* event.

Multiple clients can register and receive WLAN_NEED_* events, including GUI and CLI programs. How they go about getting the information needed is up to the client. After information is obtained from the user, the client will simply update the NWAM configuration database and poke nwamd to reread its configuration. At this point, with the updated information, nwamd may be able to make more progress in configuring the NCU.

Returning to the original example: a GUI receiving the WLAN_NEED_CHOICE event indicating that no preferred wlan is available may choose to pop up a window notifying the user of this condition. The user may simply dismiss the window and ignore the event; or the user may choose to go ahead and connect to ap7. In the former case, no configuration updates will take place, and the NCU will not progress out of the 'waiting-for-response' state. In the latter case, the configuration update will cause nwamd to restart configuration of the NCU, and this time will find ap7 in the preferred list, and will be able to connect to this wlan.
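
The ap7 example can be modeled in a few lines. The sketch below is illustrative only: PolicySketch, the "wlan0" NCU name and the "connected" label are assumptions made for the example, and an in-memory list stands in for the preferred wlan list in the configuration repository.

```python
class PolicySketch:
    """Toy model of the non-blocking 'need more information' flow."""

    def __init__(self, preferred_essids):
        self.preferred = list(preferred_essids)  # stands in for the repository
        self.listeners = []
        self.ncu_state = {}

    def configure_wlan(self, ncu, visible_essids):
        for essid in self.preferred:
            if essid in visible_essids:
                self.ncu_state[ncu] = "connected"  # illustrative label
                return essid
        # Nothing preferred is visible: notify listeners, mark the NCU,
        # and return immediately without waiting for a response.
        for listener in self.listeners:
            listener("WLAN_NEED_CHOICE", ncu)
        self.ncu_state[ncu] = "waiting-for-response"
        return None


def run_ap7_example(user_accepts):
    events = []
    nwamd = PolicySketch(preferred_essids=[])
    nwamd.listeners.append(lambda kind, ncu: events.append((kind, ncu)))
    nwamd.configure_wlan("wlan0", ["ap7"])
    if user_accepts:
        nwamd.preferred.append("ap7")            # client updates configuration
        nwamd.configure_wlan("wlan0", ["ap7"])   # the "poke": re-evaluate
    return events, nwamd.ncu_state["wlan0"]
```

If the user dismisses the prompt, the NCU stays in 'waiting-for-response'. If the user accepts, the second configuration pass finds ap7 in the preferred list and no further event is sent.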

7.5 nwamd as Policy Engine

The reason nwamd needs to collect, dispatch and handle events relating to links, locations and ENMs is of course to implement as much of the desired configuration policy as is possible given current system state. The structures NWAM deals in, NCUs, Locations and ENMs, are used to describe desired configurations and the circumstances under which they should be (de-)activated. Of course, the system realities may preclude a complete application of policy (a NIC may have been removed, or a cable unplugged, etc). More significantly, the goal of NWAM is to respond to changes in circumstances and reconfigure as needed; for example, if a VPN tunnel appears, we may wish to change to a "work" profile, and the location used will effect the necessary changes, enabling the appropriate name services, etc.

Examples of policy engine behavior include:

To accomplish this, nwamd maintains a separate state machine for each NCU. When a minimal set of NCUs (the definition of which depends on the NCP requirements) is "online", nwamd can proceed with Location selection. If that minimal set of NCUs goes offline for any reason, nwamd will revert to the "NoNet" Location.

7.5.1 NCU State Machines

Each NCU is configured by an independent state machine. The details of each state machine will vary, depending on the NCU type. The following tables define the Link and Interface NCU state machines.

Link NCU States

State       Notes
Init        The initial state
Auth/Assoc  Authentication and/or Association (e.g. wlan connection) is underway: the wifi state machine is running.
Down        Link is enabled but is not "connected"
Up          Link is connected/authenticated/associated
Remove      Link is being removed

Link NCU State Transitions

State       Event                                    Next State
Init        link enabled, wireless link              Auth/Assoc
Init        link enabled, unconnected physical link  Down
Init        link enabled, connected physical link    Up
Auth/Assoc  authentication and association complete  Up
Down        link up notification                     Up
Down        link removed                             Removed
Up          link down notification                   Down
Up          authentication/association lost          Auth/Assoc
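
One way to encode the Link NCU transitions is as a lookup table keyed by (state, event). The Python sketch below transcribes the rows as written, including the "Removed" target name from the transition table. Leaving the state unchanged for an unlisted (state, event) pair is an assumption of this sketch, since the table does not say what happens in that case.

```python
# (current state, event) -> next state, transcribed from the table above.
LINK_TRANSITIONS = {
    ("Init", "link enabled, wireless link"): "Auth/Assoc",
    ("Init", "link enabled, unconnected physical link"): "Down",
    ("Init", "link enabled, connected physical link"): "Up",
    ("Auth/Assoc", "authentication and association complete"): "Up",
    ("Down", "link up notification"): "Up",
    ("Down", "link removed"): "Removed",
    ("Up", "link down notification"): "Down",
    ("Up", "authentication/association lost"): "Auth/Assoc",
}


def next_link_state(state, event):
    # Assumption: an event with no matching row leaves the state unchanged.
    return LINK_TRANSITIONS.get((state, event), state)


def run_link_events(events, start="Init"):
    """Feed a sequence of events through the table and return the final state."""
    state = start
    for event in events:
        state = next_link_state(state, event)
    return state
```

For example, a wireless link that is enabled, associates, and later loses its association ends up back in Auth/Assoc.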

Interface NCU States

State          Notes
Down           Interface is down
Config_Static  Static addresses are being configured
Config_DHCP    DHCP lease is being requested
Up             Interface is up with at least one IP address assigned

Interface NCU State Transitions

State          Event                                              Next State
Down           Static address(es) to be assigned                  Config_Static
Down           No static addrs; DHCP needed                       Config_DHCP
Config_Static  Failed to assign any addresses, no DHCP requested  Down
Config_Static  Failed to assign any addresses, DHCP requested     Config_DHCP
Config_Static  Assigned address(es), DHCP requested               Config_DHCP
Config_Static  Assigned address(es), no DHCP requested            Up
Config_DHCP    At least one IP addr assigned                      Up
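
The four Config_Static rows reduce to two questions: was any static address assigned, and was DHCP requested? The Python sketch below expresses the Interface NCU transitions as small decision functions. The function names are hypothetical, and the behavior for cases the table does not list (no static addresses and no DHCP, or DHCP not yet yielding an address) is an assumption.

```python
def leave_down(has_static_addrs, dhcp_needed):
    """Next state from Down, per the first two transition rows."""
    if has_static_addrs:
        return "Config_Static"
    if dhcp_needed:
        return "Config_DHCP"
    return "Down"  # assumption: nothing to configure, so stay Down


def leave_config_static(assigned_any, dhcp_requested):
    """Next state from Config_Static, covering all four rows."""
    if dhcp_requested:
        # DHCP runs next whether or not the static assignment succeeded.
        return "Config_DHCP"
    return "Up" if assigned_any else "Down"


def leave_config_dhcp(addr_assigned):
    """Next state from Config_DHCP."""
    # Assumption: keep waiting in Config_DHCP until an address arrives.
    return "Up" if addr_assigned else "Config_DHCP"
```

Written this way, it is easy to see that a requested DHCP configuration always follows static configuration, whether or not any static address was assigned.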

7.5.2 Wireless Authentication/Association State Machine

The Auth/Assoc state in the Link NCU state machine encompasses an additional, link-type-specific state machine. Currently, the only types of authentication and association supported by NWAM are WEP- and WPA-authenticated WLAN connections. The state machine for this type of authentication and association is described in the following tables.

WiFi Connection States

State       Notes
Init        Initial state, gathering known info
WlanWait    WLAN unknown, need user input
KeyReq      Selected WLAN uses WEP or WPA
KeyWait     Key unknown, need user input
Connecting  Have required info, attempting to connect
Connected   Successfully connected

WiFi Connection State Transitions

State       Event                       Next State
Init        WLAN unknown                WlanWait
Init        WLAN known, is secured      KeyReq
Init        WLAN known, is not secured  Connecting
WlanWait    Got WLAN, is secured        KeyReq
WlanWait    Got WLAN, is not secured    Connecting
KeyReq      key lookup failed           KeyWait
KeyReq      key lookup succeeded        Connecting
KeyWait     Got key                     Connecting
Connecting  connection succeeded        Connected
Connecting  connection failed           Init
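
To show how the states chain together, the following Python sketch lists the states visited on the way to Connected for a given starting situation. It assumes the user supplies any requested input and that the connection attempt succeeds; the failure path back to Init is deliberately left out. The function name and parameters are hypothetical.

```python
def wifi_success_path(wlan_known, secured, key_known):
    """States visited from Init to Connected when nothing fails."""
    path = ["Init"]
    if not wlan_known:
        path.append("WlanWait")     # wait for the user to choose a WLAN
    if secured:
        path.append("KeyReq")       # WLAN uses WEP or WPA: look up the key
        if not key_known:
            path.append("KeyWait")  # lookup failed: wait for the user's key
    path.append("Connecting")
    path.append("Connected")
    return path
```

The shortest path (known, unsecured WLAN) goes straight from Init to Connecting, while an unknown, secured WLAN with no stored key passes through every state.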

7.5.3 Preventing Automatic Unload of WiFi Drivers

At present, NWAM plumbs IP on all available datalinks on the system. This is done in part to prevent WiFi drivers from being automatically unloaded if their module reference count is zero, which will happen periodically. Unloading can occur even after a WiFi device has successfully connected to and associated with an AP, and it has the unfortunate side effect of discarding all WiFi configuration preferences associated with the device, so a successful association with an AP can be lost.

Moving forward, we would like to remove this dependency on IP plumbing, as it violates one of the key aims of Clearview, which is to separate the datalink and IP layer functionality cleanly.

CR 6684460 ("wifi drivers must not unload while connected") ultimately needs to be fixed, with dladm_wlan_connect() holding a reference to the driver which is released by dladm_wlan_disconnect(). Another possible approach mentioned was to have the detach entry points for WiFi drivers fail with EBUSY if connected, but this would involve modifying all WiFi driver source, so a framework-based approach seems more feasible.

Some application-level options are available in the short term:

  1. We will be using DLPI for wired links to get link status notifications via dlpi_enabnotify() anyway, so we could simply dlpi_open() the WiFi datalinks too for their lifetime. This should be enough to ensure the drivers aren't unloaded.
  2. We make nwamd explicitly load the drivers for all WiFi devices via modctl(), which is the equivalent of modload'ing the drivers. This has the side effect of setting MOD_NOAUTOUNLOAD for the driver in question, which prevents auto-unloading from occurring. This would give us the flexibility of not plumbing IP on all datalinks, while being sure that there is no danger of losing WiFi connections.

Option 1 is the simplest, since we will need to dlpi_open() wired datalinks anyway in order to get link notifications, so that will be the short-term approach taken.

7.5.4 Location Activation Logic

In order to begin installing and activating locations, two objectives must first be met: the root filesystem must be writable, and the minimal set of NCUs must be online.

The first requirement, that of a writable root filesystem, is an issue because svc:/network/physical:nwam starts very early during boot, before this is the case. Creating a dependency on the root filesystem being writable is not an option, as that would create a dependency loop. Instead, the new svc:/network/location service will do the actual installation and activation of locations; it will be refreshed or disabled/enabled as needed by nwamd.

The second requirement is that a minimal set of NCUs be "online." In this case, "online" is defined as having reached the "Up" state of the NCU state machine. In the case of the Automatic NCP, the minimal set of NCUs is defined as one link NCU and its associated interface NCU. The minimal set of NCUs for a user-configured NCP will be determined based on the NCP configuration, but will consist of at least one interface NCU and any link NCUs that interface NCU depends on.
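
The "online" test, together with the fallback to the NoNet Location described at the start of section 7.5, can be sketched as below. This is a simplified Python model: the NCU names used in the usage note are hypothetical, and treating an empty minimal set as not online is an assumption of the sketch.

```python
def minimal_set_online(ncu_states, minimal_set):
    """True when every NCU in the minimal set has reached the 'Up' state."""
    # Assumption: an empty minimal set never counts as online.
    return bool(minimal_set) and all(
        ncu_states.get(name) == "Up" for name in minimal_set
    )


def choose_location(ncu_states, minimal_set, selected):
    """Use the selected Location only while the minimal set is online."""
    if minimal_set_online(ncu_states, minimal_set):
        return selected
    return "NoNet"
```

For the Automatic NCP, minimal_set would hold one link NCU and its interface NCU, for example ["link:net0", "interface:net0"].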

Thus location activation will work as follows. The location service will have a dependency on filesystem/usr, as well as on the existence of /etc/svc/volatile/nwam/location_ready. When ready to install a location (either the NoNet location on startup, or a location that the location state machine has selected based on the current network conditions), nwamd will set the needed service properties, and copy any related files, plus the location_ready signal file, into /etc/svc/volatile/nwam. As file dependencies are not yet fully supported in the SMF framework, nwamd will at this point disable and re-enable the location service; this will cause it to re-check its dependencies and see that the file dependency is now met.

The start method for the location service will then activate the location by copying any needed files from the volatile directory into the appropriate filesystem location, and refreshing or restarting the services whose properties were updated.
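
The file hand-off between nwamd and the location service can be modeled as two steps: staging into the volatile directory, then activation by the start method. The Python sketch below covers only the file copying; setting SMF properties and the disable/enable cycle are omitted. The function names, file names and directories are hypothetical, and temporary directories stand in for /etc/svc/volatile/nwam and the final install location.

```python
import os
import shutil
import tempfile


def stage_location(volatile_dir, files):
    """nwamd side: write the location's files, then the location_ready marker."""
    os.makedirs(volatile_dir, exist_ok=True)
    for name, content in files.items():
        with open(os.path.join(volatile_dir, name), "w") as f:
            f.write(content)
    # The marker is written last so it only appears once staging is complete.
    with open(os.path.join(volatile_dir, "location_ready"), "w"):
        pass


def activate_location(volatile_dir, dest_dir):
    """Start-method side: install staged files if the marker is present."""
    if not os.path.exists(os.path.join(volatile_dir, "location_ready")):
        return []  # dependency not met; nothing to install
    os.makedirs(dest_dir, exist_ok=True)
    installed = []
    for name in sorted(os.listdir(volatile_dir)):
        if name == "location_ready":
            continue
        shutil.copy(os.path.join(volatile_dir, name),
                    os.path.join(dest_dir, name))
        installed.append(name)
    return installed


def demo():
    with tempfile.TemporaryDirectory() as root:
        volatile = os.path.join(root, "volatile", "nwam")
        dest = os.path.join(root, "etc")
        before = activate_location(volatile, dest)   # marker not yet present
        stage_location(volatile, {"resolv.conf": "nameserver 192.0.2.1\n"})
        after = activate_location(volatile, dest)
        return before, after
```

Writing the marker after the other files means the start method never sees a partially staged location, which is the property the location_ready dependency is meant to provide.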

Revision History

Revision  Date         Changes
0.1       2007-Sep-05  placeholder
0.2       2007-Sep-07  initial draft
0.3       2007-Sep-13  updates; expand overview and policy engine sections
0.4       2007-Oct-18  updated to match prototype
0.5       2007-Oct-22  minor formatting tweaks; back-fill version table
0.6       2007-Dec-03  system-wide config profiles are now called Locations
0.7       2008-Jan-15  nit: link up/down events are triggered by dlpi notifications
0.8       2008-Feb-20  we're not using SMF as the repository
0.9       2008-Mar-05  clean up editing nits; update cold/warm start behavior; add some ToDo items
0.10      2008-Mar-06  can't do soft_reset marker in start/stop methods (can't distinguish between stop->maintenance vs. stop->disabled); move to daemon.
0.11      2008-Mar-12  clarify automatic ncp details (additional details added to overview and repository pages as well)
0.12      2008-Apr-15  clarifications in intro and §7.4; identify source of tunnel create/destroy events; §7.6 (Privileges and Roles) moves to its own page
0.13      2008-Apr-16  added wifi plumbing discussion
0.14      2008-Apr-28  add NO_MAGIC overview
0.15      2008-Nov-24  draft state machine details
0.16      2008-Dec-23  update based on implementation details: new rules about when legacy location is re-installed; new location startup process.
0.17      2009-Jan-27  miscellaneous clean-up
0.18      2009-Mar-17  pre-psarc review feedback