Example 1
=========

We will simulate a service restarting too frequently by killing a process
in the service contract. The service restarts OK each time, but if we kill
it repeatedly then SMF will fail it.

We start with a clean fmd and SMF state:

    root@parity:/var/fm# svcs -x
    root@parity:/var/fm# fmadm faulty

We will pick on the intrd service:

    root@parity:/var/fm# svcs intrd
    STATE          STIME    FMRI
    online         22:44:10 svc:/system/intrd:default

Online, but trouble is brewing:

    root@parity:/var/fm# pkill intrd
    root@parity:/var/fm# pkill intrd
    root@parity:/var/fm# pkill intrd
    root@parity:/var/fm# pkill intrd

Console
-------

The list.suspect is being rendered by syslog (the default); we see:

    SUNW-MSG-ID: SMF-8000-YX, TYPE: defect, VER: 1, SEVERITY: major
    EVENT-TIME: Wed May 12 22:52:47 PDT 2010
    PLATFORM: Sun-Fire-V40z, CSN: XG051535088, HOSTNAME: parity
    SOURCE: software-diagnosis, REV: 0.1
    EVENT-ID: 915cb64b-e16b-4f49-efe6-de81ff96fce7
    DESC: A service failed - it is restarting too quickly.  Refer to http://sun.com/msg/SMF-8000-YX for more information.
    AUTO-RESPONSE: The service has been placed into the maintenance state.
    IMPACT: svc:/system/intrd:default is unavailable.
    REC-ACTION: Run 'svcs -xv svc:/system/intrd:default' to determine why the service failed and the location of logfiles, if any.

fmadm faulty
------------

    root@parity:/var/fm# fmadm faulty
    --------------- ------------------------------------ -------------- ---------
    TIME            EVENT-ID                             MSG-ID         SEVERITY
    --------------- ------------------------------------ -------------- ---------
    May 12 22:52:47 915cb64b-e16b-4f49-efe6-de81ff96fce7 SMF-8000-YX    major

    Host        : parity
    Platform    : Sun-Fire-V40z     Chassis_id  : XG051535088
    Product_sn  :

    Fault class : defect.sunos.smf.svc.maintenance
    Affects     : svc:///system/intrd:default
                      faulted and taken out of service
    Problem in  : svc:///system/intrd:default
                      faulted and taken out of service

    Description : A service failed - it is restarting too quickly.
                  Refer to http://sun.com/msg/SMF-8000-YX for more information.

    Response    : The service has been placed into the maintenance state.

    Impact      : svc:/system/intrd:default is unavailable.

    Action      : Run 'svcs -xv svc:/system/intrd:default' to determine why the service failed and the location of logfiles, if any.

svcs -xv
--------

Running svcs -xv as instructed above:

    root@parity:/var/fm# svcs -xv svc:/system/intrd:default
    svc:/system/intrd:default (interrupt balancer)
     State: maintenance since Wed May 12 22:52:47 2010
    Reason: Restarting too quickly.
       See: http://sun.com/msg/SMF-8000-L5
       See: man -M /usr/share/man -s 1M intrd
       See: /var/svc/log/system-intrd:default.log
    Impact: This service is not running.

Note that SMF-8000-YX is the generic "service in maintenance" article id,
and is not customized for different maintenance reasons (the .po dictionary
entry for it is customized to insert the affected FMRI, as above, but the
static knowledge article cannot do that). The svcs -xv output points to
a more specific article, SMF-8000-L5.

fmdump -m
---------

    root@parity:/var/fm# fmdump -m -u 915cb64b-e16b-4f49-efe6-de81ff96fce7
    SUNW-MSG-ID: SMF-8000-YX, TYPE: defect, VER: 1, SEVERITY: major
    EVENT-TIME: Wed May 12 22:52:47 PDT 2010
    PLATFORM: Sun-Fire-V40z, CSN: XG051535088, HOSTNAME: parity
    SOURCE: software-diagnosis, REV: 0.1
    EVENT-ID: 915cb64b-e16b-4f49-efe6-de81ff96fce7
    DESC: A service failed - it is restarting too quickly.  Refer to http://sun.com/msg/SMF-8000-YX for more information.
    AUTO-RESPONSE: The service has been placed into the maintenance state.
    IMPACT: svc:/system/intrd:default is unavailable.
    REC-ACTION: Run 'svcs -xv svc:/system/intrd:default' to determine why the service failed and the location of logfiles, if any.

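
The UUID given to fmdump -m -u is the EVENT-ID from the message. As a
minimal sketch, assuming the message text is available on standard input,
the id can be pulled out with sed rather than copied by hand. The helper
name event_id is hypothetical and not part of any FMA tool; the sample
lines are copied from the output shown in this example.

```shell
# Hypothetical helper: print the UUID that follows "EVENT-ID:" on stdin.
event_id() {
    sed -n 's/.*EVENT-ID: \([0-9a-f-]*\).*/\1/p'
}

# Sample input copied from the message in this example.
printf '%s\n' \
    'SOURCE: software-diagnosis, REV: 0.1' \
    'EVENT-ID: 915cb64b-e16b-4f49-efe6-de81ff96fce7' \
    'DESC: A service failed - it is restarting too quickly.' | event_id
```

The printed value could then be supplied as the argument to fmdump -m -u,
as in the transcript above.
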
fmdump -Vp
----------

    root@parity:/var/fm# fmdump -Vp -u 915cb64b-e16b-4f49-efe6-de81ff96fce7
    TIME                           UUID                                 SUNW-MSG-ID
    May 12 2010 22:52:47.492298000 915cb64b-e16b-4f49-efe6-de81ff96fce7 SMF-8000-YX

      TIME                 CLASS                                       ENA
      May 12 22:52:47.4459 ireport.os.smf.state-transition.maintenance 0x0000000000000000

    nvlist version: 0
            version = 0x0
            class = list.suspect
            uuid = 915cb64b-e16b-4f49-efe6-de81ff96fce7
            code = SMF-8000-YX
            diag-time = 1273729967 454310
            de = fmd:///module/software-diagnosis
            fault-list-sz = 0x1
            fault-list = (array of embedded nvlists)
            (start fault-list[0])
            nvlist version: 0
                    version = 0x0
                    class = defect.sunos.smf.svc.maintenance
                    certainty = 0x64
                    asru = svc:///system/intrd:default
                    resource = svc:///system/intrd:default
                    reason-short = restarting_too_quickly
                    reason-long = it is restarting too quickly
                    svc-string = svc:/system/intrd:default
            (end fault-list[0])

            fault-status = 0x3
            severity = major
            __ttl = 0x1
            __tod = 0x4beb93af 0x1d57df10

fmdump -IVp
-----------

    root@parity:/var/fm# fmdump -IVp
    TIME                           UUID
    May 12 2010 22:52:47.445951257 915cb64b-e16b-4f49-efe6-de81ff96fce7

    nvlist version: 0
            version = 0x0
            class = ireport.os.smf.state-transition.maintenance
            uuid = 915cb64b-e16b-4f49-efe6-de81ff96fce7
            detector = sw:///:path=/lib/svc/bin/svc.startd#:file=graph.c:line=4743
            pri = high
            attr = (embedded nvlist)
            nvlist version: 0
                    svc = svc:///system/intrd:default
                    svc-string = svc:/system/intrd:default
                    from-state = offline
                    to-state = maintenance
                    reason-version = 0x1
                    reason-short = restarting_too_quickly
                    reason-long = it is restarting too quickly
            (end attr)

            __ttl = 0x1
            __tod = 0x4beb93af 0x1a94ad19
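
The two __tod words in the dumps above can be checked against the TIME
columns. The sketch below assumes the first word is seconds since the epoch
and the second is nanoseconds; that reading is an inference from the values
in this example (the seconds word equals the diag-time seconds, and the
nanosecond words match the fractional parts of the TIME columns), not
something the output itself states. The function name tod_decode is
hypothetical.

```shell
# Hypothetical helper: convert a __tod pair from hex to "seconds.nanoseconds".
# Assumption: word 1 is seconds since the epoch, word 2 is nanoseconds.
tod_decode() {
    printf '%d.%09d\n' "$1" "$2"
}

tod_decode 0x4beb93af 0x1d57df10   # list.suspect: prints 1273729967.492298000
tod_decode 0x4beb93af 0x1a94ad19   # ireport:      prints 1273729967.445951257
```

Under that assumption the ireport (.445951257) precedes the list.suspect
(.492298000) within the same second, which is the order expected for
a diagnosis that follows the state-transition report.
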