Version 0.7, 2008-Apr-29
Network configuration is currently scattered across several SMF services:
In NWAM Phase 1, a large portion of the network/service service will be eliminated, and the behavior of the network/physical legacy (:default) and NWAM (:nwam) instances will be clarified.
The NWAM Architecture discussed the elimination of all four of the SMF services listed above. After some implementation experience, and with delivery targets in mind, we have narrowed down the scope for Phase 1 to only eliminate network/service. (The external architecture document linked above is meant to be informative, not normative for this project.)
Elimination of the loopback service (network/loopback) will be considered when all network configuration is unified under SMF; at that time, the loopback interfaces will likely be represented as instances of the IP service, in the same way as other IP interfaces.
Implementation experience from NWAM Phase 0 indicates that keeping network/physical, with separate instances for NWAM and legacy mode, is a reasonable design. Doing so prevents disruption to the many services that depend on network/physical. Switching between legacy and NWAM mode does not affect those dependencies either: they are generally on the service itself rather than on a specific instance, so having either instance online satisfies them.
Elimination of network/initial is not directly related to the NWAM Phase 1 work; in fact, other projects are currently working on removing at least parts of the network/initial functionality. A future phase of NWAM may address completely cleaning up this service, but NWAM Phase 1 will not change it.
Completely eliminating network/service would be difficult, as many other services have dependencies on it. That work will be postponed until a more comprehensive rework of the network-related SMF services can be performed. However, this service will be modified in NWAM Phase 1.
It currently performs three tasks:
The service will continue to perform task two, but only for the non-NWAM case (i.e. network/physical:default is enabled, rather than network/physical:nwam). When NWAM is enabled, the nwam service will handle task two, as described in section 8.1.4.1. This distinction is necessary to avoid conflicts when NWAM is controlling network configuration. Conflicts can occur because NWAM allows the network configuration to change dynamically, so the system may receive new DHCP leases from different servers. The DNS information stored in resolv.conf and nsswitch.conf must then be updated more often than during the boot sequence, which is typically the only time when network/service would be enabled.
The distinction between the non-NWAM and NWAM cases will be managed with a simple check at the beginning of the network/service start method, which will return success immediately if network/physical:nwam is online.
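The check described above might look roughly like the following sketch. It is written in Python purely for illustration (the real start method is a script delivered with the service), and the function names `smf_state` and `network_service_start` are hypothetical. The state lookup is passed in as a parameter so that the logic can be exercised without an SMF repository.

```python
import subprocess

SMF_EXIT_OK = 0  # conventional success exit code for an SMF method
NWAM_FMRI = "svc:/network/physical:nwam"


def smf_state(fmri):
    """Return the current SMF state of fmri (e.g. 'online') via svcs(1)."""
    result = subprocess.run(
        ["svcs", "-H", "-o", "state", fmri],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip()


def network_service_start(do_legacy_work, get_state=smf_state):
    """Hypothetical body of the network/service start method."""
    if get_state(NWAM_FMRI) == "online":
        # NWAM is in control, so it owns the resolv.conf and
        # nsswitch.conf updates; report success without doing anything.
        return SMF_EXIT_OK
    # Legacy mode: perform the existing work of the start method.
    do_legacy_work()
    return SMF_EXIT_OK
```

Calling `network_service_start(work, get_state=lambda fmri: "online")` returns success without invoking `work`, which is the early-exit behavior the text describes.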
As explained above, given the dynamic configuration made possible by NWAM, updates to /etc/resolv.conf and /etc/nsswitch.conf need to happen more dynamically as well. The code to perform those updates currently exists in a script that 1) is generally executed only during boot and 2) performs other work that isn't necessary or even desirable at the same time the file updates are needed; obviously, some changes need to be made.
Several approaches have been considered:
1. The NWAM Phase 0 solution: nwamd simply re-execs the start method for network/service, which checks to see if it has been invoked by nwamd and exits before doing the additional, unrelated work in that case. This works, but is not a clean solution to the problem.
2. Move the update functionality into dhcpagent(1M), or into a library that can be called by dhcpagent, when it obtains a lease. Though simple, this is not architecturally appropriate; dhcpagent should not be in the business of deciding what configuration it receives should be applied, or where it should be used.
3. Extend the existing scripting mechanisms provided by dhcpagent(1M): services could provide scripts that would be invoked by dhcpagent on state changes; thus NWAM could provide a script that performed the appropriate updates when a lease was obtained.
4. Provide a mechanism for dhcpagent(1M) to signal other processes when specific events occur; nwamd could then receive those signals and do the appropriate file updates when the lease was obtained. This would also provide a more reliable notification of lease acquisition than the RTM_NEWADDR messages currently being used in NWAM Phase 0.
The signaling mechanism provided by dhcpagent will be designed, taken through ARC review, and implemented separately from the NWAM Phase 1 project, but it will be a requirement for NWAM Phase 1 integration and will be resourced by the NWAM team. This design assumes the existence of a "Lease Acquired" event that nwamd will receive from dhcpagent.
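As an illustration of what nwamd might do on receipt of a "Lease Acquired" event, here is a minimal Python sketch. The event mechanism itself is not yet designed, so the handler name `on_lease_acquired` and the shape of the `lease` argument (a dict with optional `domain` and `nameservers` keys) are assumptions. Only the resolv.conf half of the update is shown; nsswitch.conf would need similar handling.

```python
import os
import tempfile


def render_resolv_conf(domain, nameservers):
    """Build resolv.conf text from DHCP-supplied DNS parameters."""
    lines = []
    if domain:
        lines.append(f"domain {domain}")
    lines.extend(f"nameserver {ns}" for ns in nameservers)
    return "\n".join(lines) + "\n"


def on_lease_acquired(lease, resolv_path="/etc/resolv.conf"):
    """Hypothetical nwamd handler for a "Lease Acquired" event.

    Returns True if resolv_path was rewritten, False if the lease
    carried no name server information.
    """
    nameservers = lease.get("nameservers", [])
    if not nameservers:
        return False  # nothing to apply; leave the existing file alone
    text = render_resolv_conf(lease.get("domain"), nameservers)
    # Write a temporary file in the same directory and rename it into
    # place, so readers never see a partially written resolv.conf.
    directory = os.path.dirname(resolv_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        tmp.write(text)
    os.chmod(tmp_path, 0o644)  # mkstemp creates the file with mode 0600
    os.replace(tmp_path, resolv_path)
    return True
```

Because the handler runs on every lease acquisition rather than once at boot, it covers the case where the system moves between networks with different DHCP servers.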
Since network/physical:nwam manages all network configuration, stopping this service has the effect of shutting down the system's network capabilities. This is accomplished by tearing down all interfaces and links, and changing to the "No-network" Location. This behavior would be drastic, though, in the case where the service went into the maintenance state, and was then returned to the online state after the error condition was corrected. It would also be heavy-handed in the case of a service refresh.
Thus there are two approaches to service restart that the NWAM service should take, depending on the circumstances of the restart. It could do a "hard reset," in which all network configuration is torn down and created from scratch, based on the active NCP; or it could do a "soft reset," where it only makes the changes needed to restore the active NCP.
The following table defines the type of reset desired under different service conditions.
| Service State | Action | Behavior |
|---|---|---|
| offline | start | hard reset |
| online | stop | hard reset (tear down only) |
| online | refresh | soft reset |
| maintenance | clear | soft reset |
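The table can be read as a simple lookup keyed on the service state and the requested action. The following Python sketch encodes it directly; the names `RESET_BEHAVIOR` and `reset_behavior` are illustrative only and do not appear in the design.

```python
# Direct encoding of the table above: (service state, action) -> behavior.
RESET_BEHAVIOR = {
    ("offline", "start"): "hard reset",
    ("online", "stop"): "hard reset (tear down only)",
    ("online", "refresh"): "soft reset",
    ("maintenance", "clear"): "soft reset",
}


def reset_behavior(state, action):
    """Look up the reset type for a (service state, action) pair."""
    try:
        return RESET_BEHAVIOR[(state, action)]
    except KeyError:
        # Combinations the table does not list are treated as errors
        # here; the design does not say how they should be handled.
        raise ValueError(f"no reset behavior defined for {state!r}/{action!r}")
```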
This behavior will be accomplished by performing specific actions on normal start and stop of the NWAM service.
On startup, nwamd will first determine if it is doing a hard or soft
reset, depending on the presence of /var/run/nwam_soft_reset.
For a soft reset it will simply activate the current NCP, making whatever
changes are needed to the link/interface configuration to make it match
the NCP. For a hard reset, it will perform the following steps:
The function and use of the Legacy Location are discussed in more detail in section 8.3.
A service property will be provided to allow the administrator to request that nwamd do a soft shutdown, resulting in no disruption of network configuration and no removal of the marker file.
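The interaction between the marker file and the two kinds of shutdown can be sketched as follows. This is a Python illustration under stated assumptions: that nwamd creates the marker once a hard reset completes, and that a normal (hard) stop removes it. The text only states the second point indirectly, by saying that a soft shutdown does not remove the marker. The function names are hypothetical.

```python
import os

SOFT_RESET_MARKER = "/var/run/nwam_soft_reset"


def startup_reset_type(marker=SOFT_RESET_MARKER):
    """Return 'soft' if the marker file is present, otherwise 'hard'."""
    return "soft" if os.path.exists(marker) else "hard"


def finish_hard_reset(marker=SOFT_RESET_MARKER):
    """Assumed final step of a hard reset: leave the marker behind so a
    later start without an intervening stop is treated as a soft reset."""
    with open(marker, "w"):
        pass


def shutdown(soft_shutdown, tear_down, marker=SOFT_RESET_MARKER):
    """Hypothetical stop logic.

    soft_shutdown models the administrator-set service property: when
    true, network configuration and the marker file are left untouched.
    """
    if soft_shutdown:
        return
    tear_down()  # tear down all links and interfaces
    if os.path.exists(marker):
        os.remove(marker)  # the next start will be a hard reset
```

Under these assumptions, clearing a service from maintenance finds the marker still in place and performs a soft reset, while a start after a normal stop performs a hard reset, which matches the table.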
With NWAM Phase 1, there will be two separate repositories for network
configuration data: the legacy repository and the NWAM repository. The
legacy repository is made up of /etc/hostname.<intf>
and /etc/dladm.conf files, various other configuration files
associated with network services, and the enabled/disabled state and
properties of network-related SMF services; while the NWAM repository
is made up of the files under /etc/nwam.
The components of the legacy repository are consumed by different SMF services; some by svc:/network/physical:default, others by one of the other svc:/network services discussed in section 8.1, and still others by their own associated SMF service. The legacy repository components consumed by network/physical:default are not problematic; when NWAM is enabled, that service instance is disabled, so the configuration data it consumes will be ignored. Legacy configuration that is consumed by other services is more problematic, however; clear rules need to be established to define how the settings are handled under the different modes.
For the most part, legacy mode link/interface configuration and NWAM NCP configuration are easily separated: almost all of the legacy mode work is done in the network/physical:default start method, so when that instance is disabled, that configuration is not performed. However, tunnel configuration is explicitly omitted from network/physical:default; it is currently performed in the start method of the network/initial service, and will be broken out into a new network/iptun service with the integration of the IP Tunneling Device Driver component of the Clearview project.
Since the new network/iptun service will be in place prior to the integration of NWAM Phase 1, NWAM will simply change the dependency of this service; instead of depending on network/physical, it will depend explicitly on network/physical:default, ensuring that it will only be enabled in legacy mode. Thus NWAM will be able to create/manage IP tunnels according to its NCP.
The distinction between the legacy and NWAM repositories becomes less clear for Location configuration. While NWAM has a distinct repository that stores Location settings, a Location is activated by placing that data into the standard file/service property used by the legacy repository. It will thus be necessary for NWAM to keep track of the legacy Location settings, which can be restored when switching to legacy mode.
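Activation of a location consists of two distinct phases: installation, in which the location's settings are written to the files and service properties read by the location-related services, and instantiation, in which those services are restarted so that the new settings take effect.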
When it activates a location, nwamd will explicitly perform the installation step; SMF service dependencies will be used to perform instantiation.
As installation of a location involves writing to the filesystem, it cannot be performed as early as network/physical. Thus a new service, svc:/network/location, will be introduced. This service will depend on svc:/system/filesystem/usr, and a dependency on svc:/network/location will be added to each location-related service. The new service will also depend on network/physical.
During boot, these dependencies will be sufficient to ensure that things happen in the correct order: the network elements will be configured by network/physical, the location data will be installed by network/location (installing the legacy location if network/physical:default is enabled, the no-net location if network/physical:nwam is enabled), and then the location-related services will come online with the appropriate configuration in place.
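This sequence must also happen if NWAM is disabled and legacy mode enabled, or vice versa, and when NWAM is running and chooses to change location. This will be accomplished by giving the dependencies an appropriate restart_on attribute: the dependency of network/location on network/physical will use restart_on=restart, and the dependencies of the location-related services on network/location will use restart_on=refresh.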
Thus if either the nwam or default instance of network/physical is disabled, network/location will be restarted, as will the location-related services. When NWAM is running and changes location, it will do so by performing the installation step, and then refreshing the network/location service, which will cause the location-related services to be restarted, thus instantiating the location.
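The propagation described above can be modeled with a small Python sketch. The trigger table reflects the editor's reading of the restart_on semantics in smf(5) and is simplified; `dns/client` is only a stand-in for "a location-related service", and the function and variable names are hypothetical.

```python
from collections import deque

# Dependency events that cause a dependent to restart, per restart_on
# value. "restart" here covers any non-error stop or restart.
TRIGGERS = {
    "none": set(),
    "error": {"error"},
    "restart": {"error", "restart"},
    "refresh": {"error", "restart", "refresh"},
}

# (dependent, dependency, restart_on)
DEPENDENCIES = [
    ("network/location", "network/physical", "restart"),
    ("dns/client", "network/location", "refresh"),
]


def restarted_services(service, event, dependencies=DEPENDENCIES):
    """Return, in order, the services restarted when `service` sees `event`."""
    restarted = []
    queue = deque([(service, event)])
    while queue:
        current, kind = queue.popleft()
        for dependent, dependency, restart_on in dependencies:
            if dependency != current or dependent in restarted:
                continue
            if kind in TRIGGERS[restart_on]:
                restarted.append(dependent)
                # A restarted dependent is itself a "restart" event for
                # the services that depend on it.
                queue.append((dependent, "restart"))
    return restarted
```

In this model, stopping or restarting network/physical restarts network/location and then the location-related services, while a refresh of network/location (the path NWAM uses when it changes location) restarts only the location-related services.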
When NWAM is disabled, it will perform a hard reset shutdown; that is, all links and interfaces will be torn down, and the No-Net Location will be activated. Legacy mode can then be turned on by enabling network/physical:default; when network/location restarts (due to its restart_on restart dependency on network/physical), it will install the legacy location, which will then be instantiated as the location-related services are restarted due to their restart_on refresh dependencies on network/location.
Note that the record of the legacy location must be deleted after it has been installed; subsequent starts of the network/location service (as on reboot) should not change the location configuration if the system remains in legacy mode.
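A minimal Python sketch of the resulting network/location start logic follows. The representation of the legacy location record as a directory, the `mode` argument, and the `install` callback are all assumptions made for illustration; the design does not specify them.

```python
import os
import shutil


def network_location_start(mode, legacy_record, install):
    """Hypothetical start logic for network/location.

    mode: 'legacy' if network/physical:default is enabled, 'nwam' if
          network/physical:nwam is enabled.
    legacy_record: directory holding the saved legacy location
          (assumed layout).
    install: callable that performs the installation step.
    Returns the name of the location installed, or None.
    """
    if mode == "nwam":
        install("no-net")
        return "no-net"
    if os.path.isdir(legacy_record):
        install(legacy_record)
        # Delete the record so that later starts in legacy mode (for
        # example on reboot) leave the location configuration alone.
        shutil.rmtree(legacy_record)
        return "legacy"
    # Legacy mode with no saved record: nothing to install.
    return None
```

The second start in legacy mode finds no record and installs nothing, which is the behavior the note above requires.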
In order to minimize the changes to legacy mode behavior, no changes will be made to network/physical:default's existing stop method. Therefore, the network configuration will not be modified at all when that service is stopped. When NWAM is enabled, it will perform hard reset behavior (assuming that either NWAM has not been enabled since boot, or that it was stopped prior to legacy mode being enabled); this will result in the creation of the Legacy Location based on the current settings, the installation of the No-Net Location, and then a refresh of the network/location service to instantiate the No-Net Location. As the active NCP is instantiated, the Location will be adjusted as needed, according to the Location selection logic; changes to the Location will result in additional refreshes of the network/location service.
Section 8.3 clearly assumes that Legacy and NWAM modes are mutually
exclusive; that is, there will not be a case where both are enabled.
The obvious way to enforce that exclusivity would be to create EXCLUDE
dependencies, where each service instance depends on the other being
disabled. Unfortunately, exclude dependencies only work in one direction;
that is, one service can have an exclude dependency on another, but they
cannot be mutually exclusive. If service A and B each have exclude
dependencies on each other, and service A is online, enabling service B
results in
The next best alternative would be for each service to put itself into maintenance state if enabled while the other is already enabled. Going into maintenance state should be a very clear red flag to the administrator that something is wrong; and messaging explaining the problem and resolution will be easily obtained via the standard SMF diagnostic tools (error message in the service log file, information in the svcs -xv output). Appropriate changes will be made to each service instance's start method to implement this.
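The start method change could take a shape like the following Python sketch. It relies on the SMF convention that a start method exiting with `SMF_EXIT_ERR_CONFIG` places the instance in maintenance. How the other instance's enabled state is queried is left as a caller-supplied function, and `exclusivity_check` is a hypothetical name.

```python
SMF_EXIT_OK = 0
SMF_EXIT_ERR_CONFIG = 96  # value defined in /lib/svc/share/smf_include.sh

OTHER_INSTANCE = {
    "svc:/network/physical:default": "svc:/network/physical:nwam",
    "svc:/network/physical:nwam": "svc:/network/physical:default",
}


def exclusivity_check(this_fmri, is_enabled, log):
    """Return the exit code a start method should use for this check.

    is_enabled: callable taking an FMRI and returning True if that
        instance is enabled (assumed helper).
    log: callable that writes a line to the service log.
    """
    other = OTHER_INSTANCE[this_fmri]
    if is_enabled(other):
        log(f"{this_fmri} cannot start while {other} is enabled; "
            "disable one of the two instances, then clear this service.")
        return SMF_EXIT_ERR_CONFIG  # SMF moves the instance to maintenance
    return SMF_EXIT_OK
```

Running the check at the top of each instance's start method means whichever instance is enabled second is the one that lands in maintenance, with the explanation recorded in its log.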
| Revision | Date | Changes |
|---|---|---|
| 0.1 | 2007-Sep-05 | initial draft with topic outline |
| 0.2 | 2008-Mar-04 | we're not using SMF as the repository; first cut at details |
| 0.3 | 2008-Mar-06 | added rationale for not using exclude_all dependencies |
| 0.4 | 2008-Mar-14 | added additional notes on restart_on attribute question (§8.3.2.1) |
| 0.5 | 2008-Apr-08 | complete §8.3.1 question |
| 0.6 | 2008-Apr-17 | complete §8.1.4 question |
| 0.7 | 2008-Apr-29 | update §8.3.2 with more details on location activation |