Version 0.10, 2009-Mar-17
Network configuration is currently scattered across several SMF services:
In NWAM Phase 1, a large portion of the network/service service will be eliminated, and the behavior of the network/physical legacy (:default) and NWAM (:nwam) instances will be clarified.
The NWAM Architecture discussed the elimination of all four of the SMF services listed above. After some implementation experience, and with delivery targets in mind, we have narrowed down the scope for Phase 1 to only eliminate network/service.
Elimination of this service will be considered when all network configuration is unified under SMF; at that time, it is likely the loopback interfaces will be represented as instances of the IP service, in the same way as other IP interfaces.
Implementation experience from NWAM Phase 0 indicates that keeping this service, with separate instances for NWAM and legacy mode, is a reasonable design. Doing so avoids disrupting the many services that depend on network/physical. Switching between legacy and NWAM modes does not affect those dependencies either: they are generally on the service itself rather than on a specific instance, so having either instance online satisfies them.
Elimination of this service is not directly related to the NWAM Phase 1 work; in fact, other projects are currently working on removing at least parts of the network/initial functionality. A future phase of NWAM may address completely cleaning up this service, but NWAM Phase 1 will not change it.
Completely eliminating this service would be difficult, as many other services have dependencies on it. This work will be postponed until a more comprehensive re-work of the network-related SMF services can be performed. However, this service will be modified in NWAM Phase 1.
It currently performs three tasks:
The service will continue to perform task two, but only for the non-NWAM case (i.e. network/physical:default is enabled, rather than network/physical:nwam). When NWAM is enabled, the nwam service will handle task two, as described in section 8.1.4.1. This distinction is necessary to avoid conflicts when NWAM is controlling network configuration. Conflicts can occur because NWAM allows the network configuration to change dynamically, and thus potentially to receive new DHCP leases from different servers. The DNS information stored in resolv.conf and nsswitch.conf must then be updated more often than during the boot sequence, which is typically the only time when network/service would be enabled.
The distinction between the non-NWAM and NWAM cases will be managed with a simple check at the beginning of the network/service start method, which will return success immediately if network/physical:nwam is online.
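The check described above can be sketched as follows. This is a minimal illustration rather than the actual start method: the function name and the two callables (`get_state`, `run_legacy_tasks`) are hypothetical, and service state is injected so that the logic can be exercised without SMF.

```python
from typing import Callable

NWAM_INSTANCE = "svc:/network/physical:nwam"
SMF_EXIT_OK = 0  # exit value defined in /lib/svc/share/smf_include.sh


def network_service_start(get_state: Callable[[str], str],
                          run_legacy_tasks: Callable[[], None]) -> int:
    """Sketch of the guard at the top of the network/service start method.

    get_state and run_legacy_tasks are hypothetical stand-ins for the
    SMF state query and the existing body of the start method.
    """
    # When the nwam instance is online, nwamd owns the resolv.conf and
    # nsswitch.conf updates, so return success without doing anything.
    if get_state(NWAM_INSTANCE) == "online":
        return SMF_EXIT_OK
    # Legacy mode: perform the updates as before.
    run_legacy_tasks()
    return SMF_EXIT_OK
```

In a real method script, the state would presumably be read with something like `svcs -H -o state svc:/network/physical:nwam`; that choice is an assumption, not something this design specifies.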
As explained above, given the dynamic configuration made possible by NWAM, updates to /etc/resolv.conf and /etc/nsswitch.conf need to happen more dynamically as well. The code to perform those updates currently exists in a script that 1) is generally executed only during boot and 2) performs other work that isn't necessary or even desirable at the same time the file updates are needed.
Phase 0 took the approach of simply exec'ing the network/service start method when needed. This is not an appropriate solution, as the start method wound up needing to change its behavior depending on whether it was executed by nwamd or by a service (re)start.
In Phase 1, nwamd will perform the file updates when it
detects lease acquisition events (via RTM_NEWADDR messages). As noted
in section 8.1.4, the network/service start method will continue to do
the updates for the legacy case when NWAM is not enabled.
Since network/physical:nwam manages all network configuration, stopping this service has the effect of shutting down the system's network capabilities. This is accomplished by tearing down all interfaces and links, and changing to the "NoNet" Location. This behavior would be drastic, though, in the case where the service went into the maintenance state, and was then returned to the online state after the error condition was corrected. It would also be heavy-handed in the case of a service refresh.
Thus there are two approaches to service restart that the NWAM service should take, depending on the circumstances of the restart. It could do a "hard reset," in which all network configuration is torn down and created from scratch, based on the active NCP; or it could do a "soft reset," where it only makes the changes needed to restore the active NCP.
The following table defines the type of reset desired under different service conditions.
| Service State | Action | Behavior |
|---|---|---|
| offline | start | hard reset |
| online | stop | hard reset (tear down only) |
| online | refresh | soft reset |
| maintenance | clear | soft reset |
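
Read as code, the table is a lookup from (service state, action) to reset type. The sketch below is illustrative only; the names `RESET_TABLE` and `reset_for` are hypothetical, and combinations that the table does not define are deliberately rejected rather than guessed.

```python
# (service state, action) -> reset behavior, copied from the table above.
RESET_TABLE = {
    ("offline", "start"): "hard reset",
    ("online", "stop"): "hard reset (tear down only)",
    ("online", "refresh"): "soft reset",
    ("maintenance", "clear"): "soft reset",
}


def reset_for(state: str, action: str) -> str:
    """Return the reset behavior for a state/action pair.

    Raises KeyError for combinations the table does not cover.
    """
    return RESET_TABLE[(state, action)]
```
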
This behavior will be accomplished by performing specific actions on normal start and stop of the NWAM service.
On startup, nwamd will first determine if it is doing a hard or soft
reset, depending on the presence of
/etc/svc/volatile/nwam/nwamd_soft_reset. For a soft reset it
will simply activate the current NCP, making whatever changes are needed
to the link/interface configuration to make it match the NCP. For a hard
reset, it will perform the following steps:
/etc/svc/volatile/nwam/nwamd_soft_reset
The function and use of the Legacy Location is discussed in more detail in section 8.3.
A service property will be provided to allow the administrator to request that nwamd do a soft shutdown, resulting in no disruption of network configuration and no removal of the marker file.
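The marker-file handling can be sketched as below. The startup check follows the description above directly. The shutdown half is an inference: the soft shutdown is described as leaving the marker file in place, so a normal stop is assumed to remove it. When the marker file is created is not covered by this sketch, and the function names are hypothetical.

```python
import os

MARKER = "/etc/svc/volatile/nwam/nwamd_soft_reset"


def startup_reset_type(marker_path: str = MARKER) -> str:
    """Choose the reset type on nwamd startup from the marker file."""
    # Marker present: the previous instance did not shut down hard, so
    # only restore the active NCP instead of rebuilding everything.
    return "soft" if os.path.exists(marker_path) else "hard"


def on_shutdown(soft_shutdown: bool, marker_path: str = MARKER) -> None:
    """Handle the marker file when the service is stopped.

    Assumption: a normal stop removes the marker so that the next start
    is a hard reset; a soft shutdown leaves it untouched.
    """
    if soft_shutdown:
        return
    try:
        os.remove(marker_path)
    except FileNotFoundError:
        pass  # already absent; the next start will be a hard reset
```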
With NWAM Phase 1, there will be two separate repositories for network
configuration data: the legacy repository and the NWAM repository. The
legacy repository is made up of /etc/hostname.<intf>
and /etc/dladm.conf files, various other configuration files
associated with network services, and the enabled/disabled state and
properties of network-related SMF services; while the NWAM repository
is made up of the files under /etc/nwam.
The components of the legacy repository are consumed by different SMF services; some by svc:/network/physical:default, others by one of the other svc:/network services discussed in section 8.1, and still others by their own associated SMF service. The legacy repository components consumed by network/physical:default are not problematic; when NWAM is enabled, that service instance is disabled, so the configuration data it consumes will be ignored. Legacy configuration that is consumed by other services is more problematic, however; clear rules need to be established to define how the settings are handled under the different modes.
For the most part, legacy mode link/interface configuration and NWAM NCP configuration are easily separated; almost all of the legacy mode work is done in the network/physical:default start method, so when that instance is disabled, that configuration is not performed.
Tunnel configuration, however, is explicitly omitted from network/physical:default; it is currently performed in the start method of the network/initial service, and will be broken out into a new network/iptun service with the integration of the IP Tunneling Device Driver component of the Clearview project.
This will not be an issue for NWAM Phase 1. When the IP Tunneling Device Driver project and NWAM Phase 1 have both integrated, NWAM will be updated to handle tunnel links. At that time, NWAM will simply change the dependency of this service; instead of depending on network/physical, it will depend explicitly on network/physical:default, ensuring that it will only be enabled in legacy mode. Thus NWAM will be able to create and manage IP tunnels according to its NCP, without interfering with legacy mode configuration.
The distinction between the legacy and NWAM repositories becomes less clear for Location configuration. While NWAM has a distinct repository that stores Location settings, a Location is activated by placing that data into the standard file/service property used by the legacy repository. It will thus be necessary for NWAM to keep track of the legacy Location settings, which can be restored when switching to legacy mode.
Activation of a location consists of two distinct phases:
The installation step will be performed by a helper app of some sort, either a shell script or a C program; instantiation will be performed by a newly-created SMF service, svc:/network/location. In order to activate a location, nwamd will first invoke the helper app, and then restart network/location.
As installation of a location involves writing to the filesystem, it cannot be performed as early as network/physical, when nwamd is started. Thus installation will be broken into two stages: updating of SMF service properties and copying of files to /etc/svc/volatile, which is writable when network/physical comes online; and then copying from /etc/svc/volatile into the appropriate location in the filesystem after filesystem/usr has come online. The first step will be performed by the helper app; the second by network/location, which will have a dependency on filesystem/usr.
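The file-copying half of this two-stage installation can be sketched as follows. This is an illustration under stated assumptions: the function names and the staging layout (files stored under the staging directory by their destination-relative path) are hypothetical, and the update of SMF service properties in the first stage is omitted.

```python
import os
import shutil
from typing import Dict, List


def stage_location_files(files: Dict[str, bytes], staging_dir: str) -> None:
    """Stage 1 (helper app): write location files under the staging
    directory, which stands in for a directory in /etc/svc/volatile."""
    for rel_path, data in files.items():
        staged = os.path.join(staging_dir, rel_path)
        os.makedirs(os.path.dirname(staged), exist_ok=True)
        with open(staged, "wb") as f:
            f.write(data)


def complete_installation(staging_dir: str, root: str) -> List[str]:
    """Stage 2 (network/location): copy staged files into place under
    root once the target filesystem is writable."""
    installed = []
    for dirpath, _dirs, names in os.walk(staging_dir):
        for name in names:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, staging_dir)
            dst = os.path.join(root, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)
            installed.append(rel)
    return sorted(installed)
```

Keeping the stages separate means the first can run as soon as network/physical comes online, while the second waits for filesystem/usr through the dependency on network/location.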
The security-related services upon which milestone/network currently depends will be given a dependency on network/location, thus ensuring that a location will be activated before services which use networking are allowed to come online.
During boot, these dependencies will be sufficient to ensure that things happen in the correct order: the network elements will be configured by network/physical and location selection made by nwamd, the location data installation will be completed by network/location (installing the legacy location if network/physical:default is enabled, the NoNet location if network/physical:nwam is enabled but nwamd has not yet selected a location), and then the location-related services will come online with the appropriate configuration in place.
This sequence must also happen if NWAM is disabled and legacy mode enabled, or vice versa; and when NWAM is running and chooses to change location. This will be accomplished by adding restarts/refreshes as needed to network/location's start method. Additionally, network/location will have a restart_on restart dependency on network/physical.
Thus if either the nwam or default instance of network/physical is disabled, network/location will be restarted, as will the location-related services. When NWAM is running and changes location, it will do so by performing the installation step (via the helper app), and then refreshing the network/location service, which will complete the installation and perform the appropriate service restarts/refreshes, thus instantiating the location.
When NWAM is disabled, it will perform a hard reset shutdown; that is, all links and interfaces will be torn down, the NoNet Location will be activated, and the legacy location will be partially installed. Legacy mode can then be turned on by enabling network/physical:default; when network/location restarts (due to its restart_on restart dependency on network/physical), it will complete the installation of the legacy location, which will then be instantiated as the location-related services are restarted/refreshed.
Note that the record of the legacy location must be deleted after it has been installed; subsequent starts of the network/location service (as on reboot) should not change the location configuration if the system remains in legacy mode.
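The install-once behavior can be sketched as below. The function name, the use of a single file as the "record", and the `install` callable are all hypothetical; the point is only that the record is consumed by the first successful installation, so later starts become no-ops.

```python
import os
from typing import Callable


def install_legacy_location_once(record_path: str,
                                 install: Callable[[str], None]) -> bool:
    """Install the legacy location if a record of it exists.

    Returns True if an installation was performed. The record is deleted
    afterwards so that subsequent starts of network/location (as on
    reboot) leave the location configuration alone.
    """
    if not os.path.exists(record_path):
        return False  # nothing pending; keep the current configuration
    install(record_path)
    os.remove(record_path)
    return True
```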
In order to minimize the changes to legacy mode behavior, no changes will be made to network/physical:default's existing stop method. Therefore, the network configuration will not be modified at all when that service is stopped. When NWAM is enabled, it will perform hard reset behavior (assuming that either NWAM has not been enabled since boot, or that it was stopped prior to legacy mode being enabled); this will result in the creation of the Legacy Location based on the current settings, the installation of the NoNet Location, and then a restart of the network/location service to instantiate the NoNet Location. As the active NCP is instantiated, the Location will be adjusted as needed, according to the Location selection logic; changes to the Location will result in additional restarts of the network/location service.
Section 8.3 clearly assumes that Legacy and NWAM modes are mutually
exclusive; that is, there will not be a case where both are enabled.
The obvious way to enforce that exclusivity would be to create EXCLUDE
dependencies, where each service instance depends on the other being
disabled. Unfortunately, exclude dependencies only work in one direction;
that is, one service can have an exclude dependency on another, but they
cannot be mutually exclusive. If service A and B each have exclude
dependencies on each other, and service A is online, enabling service B
results in
The next best alternative would be for each service to put itself into maintenance state if enabled while the other is already enabled. Going into maintenance state should be a very clear red flag to the administrator that something is wrong; and messaging explaining the problem and resolution will be easily obtained via the standard SMF diagnostic tools (error message in the service log file, information in the svcs -xv output). Appropriate changes will be made to each service instance's start method to implement this.
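The start-method check could look roughly like the sketch below. It assumes the method signals the problem by returning `SMF_EXIT_ERR_CONFIG` (96 in /lib/svc/share/smf_include.sh), which causes the restarter to place the instance in maintenance; the function name, the injected `enabled` map, and the wording of the message are hypothetical.

```python
from typing import Callable, Dict

SMF_EXIT_OK = 0
SMF_EXIT_ERR_CONFIG = 96  # unrecoverable configuration error

LEGACY = "svc:/network/physical:default"
NWAM = "svc:/network/physical:nwam"


def check_exclusive(this_fmri: str,
                    enabled: Dict[str, bool],
                    log: Callable[[str], None]) -> int:
    """Refuse to start if the other network/physical instance is enabled."""
    other = NWAM if this_fmri == LEGACY else LEGACY
    if enabled.get(other, False):
        # The message lands in the service log, where the administrator
        # will find it when following up on the maintenance state.
        log(f"{this_fmri} cannot start while {other} is enabled; "
            f"disable one instance, then run 'svcadm clear {this_fmri}'")
        return SMF_EXIT_ERR_CONFIG
    return SMF_EXIT_OK
```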
| Revision | Date | Changes |
|---|---|---|
| 0.1 | 2007-Sep-05 | initial draft with topic outline |
| 0.2 | 2008-Mar-04 | we're not using SMF as the repository; first cut at details |
| 0.3 | 2008-Mar-06 | added rationale for not using exclude_all dependencies |
| 0.4 | 2008-Mar-14 | added additional notes on restart_on attribute question (§8.3.2.1) |
| 0.5 | 2008-Apr-08 | complete §8.3.1 question |
| 0.6 | 2008-Apr-17 | complete §8.1.4 question |
| 0.7 | 2008-Apr-29 | update §8.3.2 with more details on location activation |
| 0.8 | 2008-Dec-23 | update based on implementation experience: tunnel support and dhcp event notifications are post-phase-1; clarified legacy location handling on shutdown; reorganized location activation behavior. |
| 0.9 | 2009-Jan-27 | miscellaneous clean-up |
| 0.10 | 2009-Mar-17 | pre-psarc review feedback |