Events For SMF Transitions
==========================

1 Event Classes
---------------

We introduce 6 new classes ireport.os.smf.state-transition.<state>:

    ireport.os.smf.state-transition.maintenance
    ireport.os.smf.state-transition.uninitialized
    ireport.os.smf.state-transition.online
    ireport.os.smf.state-transition.offline
    ireport.os.smf.state-transition.degraded
    ireport.os.smf.state-transition.disabled

Event class names are all Committed.

2 Event Payload
---------------

An ireport event has a standard payload as described in portfolio
2010.006. The class-specific component is an nvlist named 'attr'
within the event payload. We define the 'attr' nvlist for our six
events as follows.

Attr Member      Type     Stability
---------------  -------  -----------------------------------
svc              fmri     Committed, (fmri)
svc-string       string   Committed
from-state       string   Committed
to-state         string   Committed
reason-version   uint32   Committed
reason-short     string   Committed, Committed by version
reason-long      string   Committed, Volatile

Attr Member      Description
---------------  ------------------------------------------
svc              "svc" scheme FMRI
svc-string       Short style service instance string for
                 the affected service as used in SMF
                 documentation and command lines (i.e.,
                 svc:/foo/bar instead of svc:///foo/bar).
from-state       State transitioning from. One of
                 { "maintenance", "uninitialized", "online",
                   "offline", "degraded", "disabled" }
to-state         State transitioning to. One of
                 { "maintenance", "uninitialized", "online",
                   "offline", "degraded", "disabled" }
reason-version   Revision number of the reason string
                 namespace used for reason-short. See below.
reason-short     Reason for transition, short form - see
                 below. This text is not localized.
reason-long      Wordy reason for transition - see below.
                 Localized.
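
As an illustration of the 'attr' members listed above, the following
Python sketch models the nvlist as a plain dict and checks member
presence, member types and state names. It is not part of any SMF or
FMA interface: the dict representation, the names SMF_STATES,
ATTR_MEMBERS and validate_attr, and the contents of the example "svc"
member are all assumptions made for the example. The state set
includes "disabled" because the transition descriptions in this
document refer to that state.

```python
# Hypothetical model of the 'attr' nvlist as a Python dict.
# Nothing here is a real SMF/FMA API; all names are illustrative.

SMF_STATES = frozenset({
    "uninitialized", "offline", "online",
    "degraded", "maintenance", "disabled",
})

# Member name -> Python type standing in for the nvlist type.
ATTR_MEMBERS = {
    "svc": dict,            # fmri (an nvlist), approximated as a dict
    "svc-string": str,
    "from-state": str,
    "to-state": str,
    "reason-version": int,  # uint32, range-checked below
    "reason-short": str,
    "reason-long": str,
}


def validate_attr(attr):
    """Return a list of problems found in an 'attr' dict (empty if none)."""
    problems = []
    for name, expected in ATTR_MEMBERS.items():
        if name not in attr:
            problems.append(f"missing member: {name}")
        elif not isinstance(attr[name], expected):
            problems.append(f"wrong type for {name}")
    for name in ("from-state", "to-state"):
        value = attr.get(name)
        if isinstance(value, str) and value not in SMF_STATES:
            problems.append(f"unknown state in {name}: {value}")
    version = attr.get("reason-version")
    if isinstance(version, int) and not 0 <= version < 2**32:
        problems.append("reason-version out of uint32 range")
    return problems


# Example payload; the layout of the "svc" member is assumed.
example = {
    "svc": {"scheme": "svc"},
    "svc-string": "svc:/foo/bar",
    "from-state": "online",
    "to-state": "offline",
    "reason-version": 1,
    "reason-short": "restart_request",
    "reason-long": "a restart was requested",
}
```

Calling validate_attr(example) returns an empty list; a payload with a
misspelled state name or a missing member yields one message per
problem.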

3 Transition Reasons
--------------------

Restarter Dependence

Note that the reason-short and reason-long included in a transition
event are restarter-dependent: the reason for the transition is
provided to the graph engine by the restarter, which may choose to
provide no reason, select from the set we document below, or use a
custom set.

We define a standard set for the system-provided restarters in the
table below. The standard system restarter svc.startd will always
use this set, and the inetd restarter will use a subset of these
reasons. If some other restarter is in use for a particular service
then we make no commitment as to what reasons will be provided in
events for services with that custom restarter. See the final section
of this document for a list of which reasons are used by each
restarter.

reason-long

The reason-long member provides a wordy description of the reason for
the transition. The phrasing for each may change over time (Volatile),
but we do assure that the reason-long will always be usable in the
following contexts:

    An instance transitioned state: %s
    A service failed: %s
    Reason: %s
    The service transitioned state (%s) and ...

For all but a reason-short of "none" we also assure:

    An instance transitioned because %s, and ...
    An instance transitioned to <state> because %s, and ...

The reason-long strings do not start with a capital letter, and do not
end in a period (or full stop).

[Aside: I think that is to say that reason-long, with the exception of
that for a reason-short of "none", forms an independent clause, i.e.,
one having a subject, a predicate, and a complete thought (a simple
sentence once initial capitalization and terminating punctuation are
added).]

The reason-long is localized; that is, it can be translated to other
locales and so be presented in the selected language.

reason-short

The reason-short string is intended to be used in filtering events in
order to monitor particular circumstances.
For example, to monitor all service restarts one would look for
transition events with a reason-short of "restart_request". That this
reason-short describes restarts, and that all restarts will use this
reason-short, is assured by the Committed attribute (but see
reason-version below). The reason-short is not localized.

reason-version

Each version/revision of the set of reason-short strings has Committed
stability attributes, meaning that any given revision will not change
incompatibly. This means that within a given version:

  o the semantics described for any given reason-short string are
    fixed: that reason-short will always imply those semantics, and
    any transition with those semantics will use that reason-short.

  o it follows that a given reason-short string cannot be
    split/refined into multiple new reason codes without requiring a
    version change.

  o new reason-short strings can be added to a version without
    requiring a version change, provided the additions are orthogonal
    to the rest of the set; other additions must change the revision.

A consumer is required to check that the version in use for the events
it is receiving is one it understands. A consumer must also be
prepared for new reasons being added to the set without a change in
revision (as above).

It is anticipated that the reason-version will change infrequently.

reason-short Set, Version 1

We list here what is defined to be version 1 of the reason-short
namespace: all possible transition reasons in both short and long
form, each with a description. In the description we write 's1 -> s2'
to denote a transition from state s1 to state s2, and
's1 -> s2 -> s3' to denote a transition from state s1 to state s3 via
an intermediate state of s2 (up to two events can be generated, one
for s1 -> s2 and one for s2 -> s3; both events will have the same
reason even if the transition s2 -> s3 would normally have a different
reason).
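
As an illustration of the consumer requirements stated under
reason-version above, the following Python sketch filters events for
restarts while checking the version and tolerating reasons it does not
recognise. The event representation (a dict keyed by 'attr' member
name) and the names UNDERSTOOD_VERSIONS, RESTART_REASON and classify
are hypothetical; how events are delivered to a consumer is outside
the scope of this sketch.

```python
# Hypothetical consumer-side filter; not a real SMF/FMA interface.

UNDERSTOOD_VERSIONS = frozenset({1})   # versions this consumer knows
RESTART_REASON = "restart_request"


def classify(attr):
    """Classify one 'attr' dict as 'restart', 'other' or 'unknown-version'."""
    # The version must be checked first: the meaning of reason-short
    # is only committed within a given reason-version.
    if attr.get("reason-version") not in UNDERSTOOD_VERSIONS:
        return "unknown-version"
    if attr.get("reason-short") == RESTART_REASON:
        return "restart"
    # Anything else, including reasons added to this version later,
    # is passed over rather than treated as an error.
    return "other"


events = [
    # A restart generates up to two events with the same reason.
    {"from-state": "online", "to-state": "offline",
     "reason-version": 1, "reason-short": "restart_request"},
    {"from-state": "offline", "to-state": "online",
     "reason-version": 1, "reason-short": "restart_request"},
    # A made-up reason standing in for a later addition to version 1.
    {"from-state": "online", "to-state": "offline",
     "reason-version": 1, "reason-short": "some_future_reason"},
    # A made-up later version that this consumer does not understand.
    {"from-state": "online", "to-state": "offline",
     "reason-version": 2, "reason-short": "restart_request"},
]

results = [classify(e) for e in events]
```

The version check comes before the reason comparison so that a
reason-short string from an unrecognised version is never interpreted
with version 1 semantics.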

The format of the list below is as follows:

"reason-short"
    "reason-long"
    Description

This table describes version 1 of the reason-short namespace.

"none"
    "the restarter gave no reason"
    Any transition for which the restarter has not provided a reason.

"administrative_request"
    "maintenance was requested by an administrator"
    A transition to maintenance state due to a
    'svcadm mark maintenance <fmri>'. *Not* used if the libscf
    interface smf_maintain_instance(3SCF) is used to request
    maintenance.

"bad_repo_state"
    "an SMF repository inconsistency exists"
    A transition to maintenance state if a repository inconsistency
    exists when the service/instance state is first read by startd
    into the graph engine (this can also happen during startd
    restart).

"clear_request"
    "maintenance clear was requested by an administrator"
    A transition 'maintenance -> uninitialized' resulting always from
    'svcadm clear <fmri>'. *Not* used if the libscf interface
    smf_restore_instance(3SCF) is used.

"ct_ev_core"
    "a process dumped core"
    A transition 'online -> offline' due to a process core dump.

"ct_ev_exit"
    "all processes in the service have exited"
    A transition 'online -> offline' due to an empty process contract,
    i.e., the last process in a contract type service has exited.

"ct_ev_hwerr"
    "a process was killed due to uncorrectable hardware error"
    A transition 'online -> offline' due to a hardware error.

"ct_ev_signal"
    "a process received a fatal signal from outside the service"
    A transition 'online -> offline' due to a process in the service
    having received a fatal signal originating from outside the
    service process contract.

"dependencies_satisfied"
    "all dependencies have been satisfied"
    A transition 'offline -> online' when all dependencies for the
    service have been met.

"dependency_activity"
    "a dependency activity required a stop"
    A transition 'online -> offline' because some dependency for the
    service is no longer met.

"dependency_cycle"
    "a dependency cycle exists"
    A transition to maintenance state due to a cycle in the service
    dependencies.

"disable_request"
    "a disable was requested"
    A transition 'online -> offline -> disabled' due to a
    'svcadm disable [-t] <fmri>' or smf_disable_instance(3SCF) call.

"enable_request"
    "an enable was requested"
    A transition 'disabled -> offline' due to a
    'svcadm enable [-t] <fmri>' or smf_enable_instance(3SCF) call.

"fault_threshold_reached"
    "a method is failing in a retryable manner but too often"
    A transition to maintenance state when a method fails repeatedly
    for a retryable reason.

"insert_in_graph"
    "the instance was inserted in the graph"
    A transition to uninitialized state when startd reads the service
    configuration and inserts it into the graph engine.

"invalid_dependency"
    "a service has an invalid dependency"
    A transition to maintenance state due to an invalid dependency
    declared for the service.

"invalid_restarter"
    "the service restarter is invalid"
    A transition to maintenance state because the service-declared
    restarter is invalid.

"method_failed"
    "a start, stop or refresh method failed"
    A transition to maintenance state because a restarter method
    exited with one of SMF_EXIT_ERR_CONFIG, SMF_EXIT_ERR_NOSMF,
    SMF_EXIT_ERR_PERM, or SMF_EXIT_ERR_FATAL.

"per_configuration"
    "the SMF repository configuration specifies this state"
    A transition 'uninitialized -> {disabled|offline}' after
    "insert_in_graph" to match the state configured in the
    repository.

"restart_request"
    "a restart was requested"
    A transition 'online -> offline -> online' due to a
    'svcadm restart <fmri>' or equivalent libscf API call. Both the
    'online -> offline' and 'offline -> online' transitions specify
    this reason.

"restarting_too_quickly"
    "the instance is restarting too quickly"
    A transition to maintenance state because the start method is
    being executed successfully but too frequently.

"service_request"
    "maintenance was requested by another service"
    A transition to maintenance state due to a service requesting
    'svcadm mark maintenance <fmri>' or equivalent libscf API call.
    A command line 'svcadm mark maintenance <fmri>' does not produce
    this reason - it produces administrative_request instead.

Restarter Use of Reason Strings
-------------------------------

Restarter        Set of reason-short version 1 strings used
---------------  -----------------------------------------------------
svc.startd       All from above table, but excluding "none"
inetd            "none", "service_request", "administrative_request"
(custom)         Any set, not necessarily from the above table