1 System Administration Commands in.mpathd(1M)
2
3
4
5 NAME
6 in.mpathd - daemon for network adapter (NIC) failure detec-
7 tion, recovery, automatic failover and failback
8
9 SYNOPSIS
10 /usr/lib/inet/in.mpathd
11
12
13 DESCRIPTION
14 The in.mpathd daemon performs Network Interface Card (NIC)
15 failure and repair detection. In the event of a NIC failure,
16 it causes IP network access from the failed NIC to fail over
17 to a standby NIC, if available, or to any other operational
18 NIC that has been configured as part of the same network
19 multipathing group. Once the failed NIC is repaired,
20 all network access is restored to the repaired NIC.
21
22
23 The in.mpathd daemon can detect NIC failure and repair
24 through two methods: by monitoring the IFF_RUNNING flag for
25 each NIC (link-based failure detection), and by sending and
26 receiving ICMP echo requests and replies on each NIC
27 (probe-based failure detection). Link-based failure detec-
28 tion requires no explicit configuration and thus is always
29 enabled (provided the NIC driver supports the feature);
30 probe-based failure detection must be enabled through the
31 configuration of one or more test addresses (described
32 below), but has the benefit of testing the entire NIC send
33 and receive path.
34
35
36 If only link-based failure detection is enabled, then the
37 health of the interface is determined solely from the state
38 of the IFF_RUNNING flag. Otherwise, the interface is
39 considered failed if either of the two methods indicates a
40 failure, and repaired once both methods indicate the failure
41 has been corrected. Not all interfaces in a group need to be
42 configured with the same failure detection methods.
43
44
45 As mentioned above, in order to perform probe-based failure
46 detection in.mpathd needs a special test address on each NIC
47 for the purpose of sending and receiving probes on the NIC.
48 Use the -failover option of ifconfig to configure these
49 test addresses. See ifconfig(1M). The test address must
50 belong to a subnet that is known to the hosts and routers on
51 the link.
52
53
54 The in.mpathd daemon can detect NIC failure and repair by
55 two methods, by sending and receiving ICMP echo requests and
56 replies on each NIC, and by monitoring the IFF_RUNNING flag
57
58
59
60 SunOS 5.11 Last change: 8 Sep 2006 1
71 for each NIC. The link state on some models of NIC is indi-
72 cated by the IFF_RUNNING flag, allowing for faster failure
73 detection when the link goes down. The in.mpathd daemon con-
74 siders a NIC to have failed if either of the above two
75 methods indicates failure. A NIC is considered to be
76 repaired only if both methods indicate the NIC is repaired.
77
78
79 The in.mpathd daemon sends the ICMP echo request probes to
80 on-link routers. If no routers are available, it sends the
81 probes to neighboring hosts. Thus, for network failure
82 detection and repair, there must be at least one neighbor on
83 each link that responds to ICMP echo request probes.
84
85
86 in.mpathd works on both IPv4 and IPv6. If IPv4 is plumbed on
87 a NIC, an IPv4 test address is configured on the NIC, and the
88 NIC is configured as part of a network multipathing group,
89 then in.mpathd will start sending ICMP probes on the NIC
90 using IPv4.
91
92
93 In the case of IPv6, the link-local address must be config-
94 ured as the test address. The in.mpathd daemon will not
95 accept a non-link-local address as a test address. If the
96 NIC is part of a multipathing group, and the test address
97 has been configured, then in.mpathd will probe the NIC for
98 failures using IPv6.
99
100
101 Even if both the IPv4 and IPv6 protocol streams are plumbed,
102 it is sufficient to configure only one of the two, that is,
103 either an IPv4 test address or an IPv6 test address on a
104 NIC. If only an IPv4 test address is configured, it probes
105 using only ICMPv4. If only an IPv6 test address is configured,
106 it probes using only ICMPv6. If both types of test
107 addresses are configured, it probes using both ICMPv4 and
108 ICMPv6.
109
110
111 The in.mpathd daemon accesses three variable values in
112 /etc/default/mpathd: FAILURE_DETECTION_TIME, FAILBACK and
113 TRACK_INTERFACES_ONLY_WITH_GROUPS.
114
115
116 The FAILURE_DETECTION_TIME variable specifies the NIC
117 failure detection time for the ICMP echo request probe
118 method of detecting NIC failure. The shorter the failure
119 detection time, the greater the volume of probe traffic. The
120 default value of FAILURE_DETECTION_TIME is 10 seconds. This
121 means that NIC failure will be detected by in.mpathd within
122 10 seconds. NIC failures detected by the IFF_RUNNING flag
137 being cleared are acted on as soon as the in.mpathd daemon
138 notices the change in the flag. The NIC repair detection
139 time cannot be configured; however, it is defined as double
140 the value of FAILURE_DETECTION_TIME.
141
142
143 By default, in.mpathd does failure detection only on NICs
144 that are configured as part of a multipathing group. You can
145 set TRACK_INTERFACES_ONLY_WITH_GROUPS to no to enable
146 failure detection by in.mpathd on all NICs, even if they are
147 not part of a multipathing group. However, in.mpathd cannot
148 do failover from a failed NIC if it is not part of a mul-
149 tipathing group.
150
151
152 The in.mpathd daemon will restore network traffic back to
153 the previously failed NIC, after it has detected a NIC
154 repair. To disable this, set the value of FAILBACK to no in
155 /etc/default/mpathd.
156
157 FILES
158 /etc/default/mpathd Contains default values used by the
159 in.mpathd daemon.
160
161
162 ATTRIBUTES
163 See attributes(5) for descriptions of the following attri-
164 butes:
165
166
167
168 ____________________________________________________________
169 | ATTRIBUTE TYPE | ATTRIBUTE VALUE |
170 |_____________________________|_____________________________|
171 | Availability | SUNWcsr |
172 |_____________________________|_____________________________|
173
174
175 SEE ALSO
176 ifconfig(1M), attributes(5), icmp(7P), icmp6(7P)
177
178
179 DIAGNOSTICS
180 Test address address is not unique; disabling probe based
181 failure detection on interface_name
182 Description:
183
184
185 For in.mpathd to perform probe-based failure detection,
186 each test address in the group must be unique. Since the
187 IPv6 test address is a link-local address derived from
188 the MAC address, each IP interface in the group must
203 have a unique MAC address.
204
205
206
207 NIC interface_name of group group_name is not plumbed for
208 IPv[4|6] and may affect failover capability
209 Description:
210
211
212 All NICs in a multipathing group must be homogeneously
213 plumbed. For example, if a NIC is plumbed for IPv4, then
214 all NICs in the group must be plumbed for IPv4. The
215 streams modules pushed on all NICs must be identical.
216
217
218
219 No test address configured on interface interface_name disa-
220 bling probe-based failure detection on it
221 Description:
222
223
224 In order for in.mpathd to perform probe-based failure
225 detection on a NIC, it must be configured with a test
226 address: IPv4, IPv6, or both.
227
228
229
230 The link has come up on interface_name more than 2 times in
231 the last minute; disabling failback until it stabilizes.
232 Description:
233
234
235 In order to prevent interfaces with intermittent
236 hardware, such as a bad cable, from causing repeated
237 failovers and failbacks, in.mpathd does not failback to
238 interfaces with frequently fluctuating link states.
239
240
241
242 Invalid failure detection time assuming default 10000
243 Description:
244
245
246 An invalid value was encountered for
247 FAILURE_DETECTION_TIME in the /etc/default/mpathd file.
248
249
250
251 Too small failure detection time of time assuming minimum
252 100
253 Description:
254
255
269 The minimum value that can be specified for
270 FAILURE_DETECTION_TIME is currently 100 milliseconds.
271
272
298 The round trip time for ICMP probes is higher than
299 necessary to maintain the current failure detection
300 time. The network is probably congested or the probe
301 targets are loaded. in.mpathd automatically increases
302 the failure detection time to whatever it can achieve
303 under these conditions.
304
305
306
307 Improved failure detection time time ms on (inet[6]
308 interface_name) for group group_name
309 Description:
310
311
312 The round trip time for ICMP probes has now decreased
313 and in.mpathd has lowered the failure detection time
314 correspondingly.
315
316
317
318 NIC failure detected on interface_name
319 Description:
320
321
335 in.mpathd has detected NIC failure on interface_name,
336 and has set the IFF_FAILED flag on NIC interface_name.
337
338
339
340 Successfully failed over from NIC interface_name1 to NIC
341 interface_name2
342 Description:
343
344
345 in.mpathd has caused the network traffic to fail over
346 from NIC interface_name1 to NIC interface_name2, which
347 is part of the multipathing group.
348
349
350
351 NIC repair detected on interface_name
352 Description:
353
354
355 in.mpathd has detected that NIC interface_name is
356 repaired and operational. If the IFF_FAILED flag on the
357 NIC was previously set, it will be reset.
358
359
360
361 Successfully failed back to NIC interface_name
362 Description:
363
364
365 in.mpathd has restored network traffic back to NIC
366 interface_name, which is now repaired and operational.
367
368
369
370 The link has gone down on interface_name
371 Description:
372
373
374 in.mpathd has detected that the IFF_RUNNING flag for NIC
375 interface_name has been cleared, indicating the link has
376 gone down.
377
378
379
380 The link has come up on interface_name
381 Description:
382
383
384 in.mpathd has detected that the IFF_RUNNING flag for NIC
385 interface_name has been set, indicating the link has
386 come up.
|
1 System Administration Commands in.mpathd(1M)
2
3
4
5 NAME
6 in.mpathd - IP multipathing daemon
7
8 SYNOPSIS
9 /usr/lib/inet/in.mpathd
10
11
12 DESCRIPTION
13
14 The *in.mpathd* daemon performs failure and repair detection
15 for IP interfaces that have been placed into an IPMP group
16 (or optionally, for all IP interfaces on the system). It
17 also controls which IP interfaces in an IPMP group are
18 "active" (being used by the system to send or receive IP
19 data traffic) in a manner which is consistent with the
20 administrator's configured policy.
21
22
23 The *in.mpathd* daemon can detect IP interface failure and
24 repair through two methods: by monitoring the *IFF_RUNNING*
25 flag for each IP interface (link-based failure detection),
26 and by sending and receiving ICMP probes on each IP
27 interface (probe-based failure detection). Link-based
28 failure detection is instantaneous and is always enabled
29 (provided the network driver supports the feature);
30 probe-based failure detection must be enabled through the
31 configuration of one or more test addresses (described
32 below), but tests the entire IP interface send and receive
33 path. The *ipmpstat(1M)* utility can be used to check which
34 failure detection methods are enabled.
35
36
37 If only link-based failure detection is enabled, then the
38 health of the interface is determined solely from the state
39 of the IFF_RUNNING flag. Otherwise, the interface is
40 considered failed if either of the two methods indicates a
41 failure, and repaired once both methods indicate the failure
42 has been corrected. Not all interfaces in a group need to be
43 configured with the same failure detection methods.
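
The combination rule above can be sketched as follows. This is an illustrative model only, not taken from the in.mpathd source; the function name and the use of None to mean "method not enabled" are assumptions made for the example.

```python
from typing import Optional


def interface_failed(link_up: Optional[bool], probes_ok: Optional[bool]) -> bool:
    """Return True if the interface should be considered failed.

    Each argument is None when that detection method is not enabled
    (hypothetical encoding), True when the method reports a healthy
    interface, and False when it reports a failure.
    """
    verdicts = [v for v in (link_up, probes_ok) if v is not None]
    # Failed as soon as any enabled method reports a failure; repaired
    # only once every enabled method reports that the failure is gone.
    return any(v is False for v in verdicts)
```

With only link-based detection enabled, interface_failed(False, None) yields a failure based solely on the link state; with both methods enabled, a healthy link does not mask failing probes.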
44
45
46 As mentioned above, to perform probe-based failure detection,
47 *in.mpathd* requires a test address on each IP interface for
48 the purpose of sending and receiving probes. Each address
49 must be marked *NOFAILOVER* (see *ifconfig(1M)*) and
50 *in.mpathd* will be limited to probing targets on the same
51 subnet. Each address may be configured statically or
52 acquired via DHCP. To find targets, *in.mpathd* first
53 consults the routing table for routes on the same subnet,
54 and uses the specified next-hop. If no routes match, it
55 sends all-hosts ICMP probes and selects a subset of the
56 systems that respond. Thus, for probe-based failure
57 detection to operate, there must be at least one neighbor on
58 each subnet that responds to ICMP echo request probes. The
59 *ipmpstat(1M)* utility can be used to display both the
60 current probe target information and the status of sent
61 probes.
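
The first step of target selection, keeping only next hops that lie on the test address's subnet, might look like the sketch below. It is a simplified model under stated assumptions: the function name, the CIDR string input, and the list of next-hop strings are all hypothetical, and the real daemon reads the kernel routing table rather than a Python list.

```python
import ipaddress


def on_subnet_next_hops(test_address_cidr, route_next_hops):
    """Return the next hops that are on the same subnet as the test address.

    An empty result corresponds to the case where no routes match and
    the daemon would fall back to all-hosts ICMP probes.
    """
    subnet = ipaddress.ip_interface(test_address_cidr).network
    return [hop for hop in route_next_hops
            if ipaddress.ip_address(hop) in subnet]
```

For example, with a test address of 192.0.2.10/24, a next hop of 192.0.2.1 is kept while 198.51.100.1 is discarded.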
62
63
64 Both IPv4 and IPv6 are supported. If an IP interface is
65 plumbed for IPv4 and an IPv4 test address is configured then
66 *in.mpathd* will start sending ICMPv4 probes over that IP
67 interface. Similarly, if an IP interface is plumbed for
68 IPv6 and an IPv6 test address is configured then *in.mpathd*
69 will start sending ICMPv6 probes over that IP interface.
70 However, note that *in.mpathd* will ignore IPv6 test
71 addresses that are not link-local. If both IPv4 and IPv6
72 are plumbed, it is sufficient to configure only one of the
73 two, that is, either an IPv4 test address or an IPv6 test
74 address. If both IPv4 and IPv6 test addresses are
75 configured, *in.mpathd* probes using both ICMPv4 and ICMPv6.
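
As a rough illustration of the rules in this paragraph, the sketch below derives which probe protocols would be used from a list of configured test addresses. The function and parameter names are assumptions for the example; only the rules themselves (plumbed protocol, configured test address, link-local requirement for IPv6) come from the text above.

```python
import ipaddress


def probe_protocols(test_addresses, v4_plumbed=True, v6_plumbed=True):
    """Return the set of ICMP versions that would be used for probing."""
    protocols = set()
    for text in test_addresses:
        addr = ipaddress.ip_address(text)
        if addr.version == 4 and v4_plumbed:
            protocols.add("ICMPv4")
        elif addr.version == 6 and v6_plumbed and addr.is_link_local:
            # IPv6 test addresses that are not link-local are ignored.
            protocols.add("ICMPv6")
    return protocols
```

A global IPv6 address such as 2001:db8::1 contributes nothing, so an interface configured with only that test address would not be probed at all.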
76
77
78 As mentioned above, *in.mpathd* also controls which IP
79 interfaces in an IPMP group are "active" (used by the system
80 to send and receive IP data traffic). Specifically,
81 *in.mpathd* tracks the administrative configuration of each
82 IPMP group and attempts to keep the number of active IP
83 interfaces in each group consistent with that configuration.
84 Therefore, if an active IP interface fails, *in.mpathd* will
85 activate an *INACTIVE* interface in the group, provided one
86 exists (it will prefer *INACTIVE* interfaces that are also
87 marked *STANDBY*). Likewise, if an IP interface repairs and
88 the resulting repair leaves the IPMP group with more active
89 interfaces than the administrative configuration specifies,
90 *in.mpathd* will deactivate one of the interfaces
91 (preferably one marked *STANDBY*), except when the
92 *FAILBACK* variable is used, as described below. Similar
93 adjustments will be made by *in.mpathd* when offlining IP
94 interfaces (for instance, in response to *if_mpadm(1M)*).
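
The preference for STANDBY interfaces when replacing a failed active interface can be modelled as below. This is a hedged sketch: representing an interface as a (name, flags) pair and the function name are assumptions, and tie-breaking among equally suitable interfaces (here, list order) is not specified by this page.

```python
def pick_to_activate(group):
    """Choose an INACTIVE interface to activate, preferring STANDBY ones.

    `group` is a list of (name, flags) pairs, where flags is a set of
    flag names such as {"INACTIVE", "STANDBY"} (hypothetical encoding).
    Returns the chosen interface name, or None if no candidate exists.
    """
    candidates = [(name, flags) for name, flags in group
                  if "INACTIVE" in flags and "FAILED" not in flags]
    if not candidates:
        return None
    # False sorts before True, so STANDBY interfaces come first; the
    # sort is stable, so list order breaks ties.
    candidates.sort(key=lambda item: "STANDBY" not in item[1])
    return candidates[0][0]
```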
95
96
97 The in.mpathd daemon accesses three variable values in
98 /etc/default/mpathd: FAILURE_DETECTION_TIME, FAILBACK and
99 TRACK_INTERFACES_ONLY_WITH_GROUPS.
100
101
102 The *FAILURE_DETECTION_TIME* variable specifies the
103 probe-based failure detection time. The shorter the failure
104 detection time, the more probe traffic. The default value
105 of *FAILURE_DETECTION_TIME* is 10 seconds. This means that
106 IP interface failure will be detected by *in.mpathd* within
107 10 seconds. The IP interface repair detection time is
108 always twice the value of *FAILURE_DETECTION_TIME*. Note
109 that failures and repairs detected by link-based failure
110 detection are acted on immediately, though *in.mpathd* may
111 ignore link state changes if it suspects that the link state
112 is flapping due to defective hardware; see DIAGNOSTICS.
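
To make the timing relationship concrete, the sketch below derives the failure and repair detection times from the contents of a defaults file. Several points are assumptions rather than facts from this page: that the file uses KEY=value lines with '#' comments, that the value is expressed in milliseconds, and that any non-integer value counts as invalid. The 10000 ms default and 100 ms minimum are the values quoted in DIAGNOSTICS.

```python
DEFAULT_FDT_MS = 10000  # default quoted in DIAGNOSTICS
MIN_FDT_MS = 100        # minimum quoted in DIAGNOSTICS


def parse_defaults(text):
    """Parse KEY=value lines (assumed format), skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def detection_times_ms(settings):
    """Return (failure_detection_ms, repair_detection_ms)."""
    raw = settings.get("FAILURE_DETECTION_TIME", str(DEFAULT_FDT_MS))
    try:
        fdt = int(raw)
    except ValueError:
        fdt = DEFAULT_FDT_MS      # invalid value: assume the default
    fdt = max(fdt, MIN_FDT_MS)    # too small: assume the minimum
    return fdt, 2 * fdt           # repair time is always twice as long
```

With the default settings this gives a failure detection time of 10 seconds and a repair detection time of 20 seconds.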
113
114
115 By default, *in.mpathd* limits failure and repair detection
116 to IP interfaces that are configured as part of a named IPMP
117 group. Setting *TRACK_INTERFACES_ONLY_WITH_GROUPS* to *no*
118 enables failure and repair detection on all IP interfaces,
119 even if they are not part of a named IPMP group. IP
120 interfaces that are tracked but not part of a named IPMP
121 group are considered to be part of the "anonymous" IPMP
122 group. In addition to having no name, this IPMP group is
123 special in that its IP interfaces are not equivalent and
124 thus cannot take over for one another in the event of an IP
125 interface failure. That is, the anonymous IPMP group can
126 only be used for failure and repair detection, and provides
127 no high-availability or load-spreading.
128
129
130 As described above, when *in.mpathd* detects that an IP
131 interface has repaired, it activates it so that it will
132 again be used to send and receive IP data traffic. However,
133 if *FAILBACK* is set to *no*, then the IP interface will
134 only be activated if no other active IP interfaces in the
135 group remain. The interface may still be activated
136 later if another IP interface in the group fails.
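
The effect of FAILBACK on a repaired interface reduces to a small decision, sketched here under simplifying assumptions: the function name and arguments are hypothetical, and the separate adjustment that deactivates surplus interfaces is left out.

```python
def activate_on_repair(failback, other_active_count):
    """Decide whether a repaired interface is activated immediately.

    `failback` models FAILBACK=yes (True) or FAILBACK=no (False);
    `other_active_count` is the number of other active IP interfaces
    in the same group.
    """
    # With FAILBACK=no the repaired interface is activated only when
    # no other active interface remains in the group.
    return failback or other_active_count == 0
```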
137
138 FILES
139 /etc/default/mpathd Contains default values used by the
140 in.mpathd daemon.
141
142
143 ATTRIBUTES
144 See attributes(5) for descriptions of the following attri-
145 butes:
146
147
148
149 ____________________________________________________________
150 | ATTRIBUTE TYPE | ATTRIBUTE VALUE |
151 |_____________________________|_____________________________|
152 | Availability | SUNWcsr |
153 |_____________________________|_____________________________|
154
155
156 SEE ALSO
157 ifconfig(1M), ipmpstat(1M), if_mpadm(1M), icmp(7P), icmp6(7P)
158
159
160 DIAGNOSTICS
161 IP interface *interface_name* has a hardware address which
162 is not unique in group *group_name*; offlining
163 Description:
164
165 For probe-based failure detection, load-spreading, and
166 other core IPMP features to work properly, each IP
167 interface in an IPMP group must have a unique hardware
168 address. If this requirement is not met, *in.mpathd*
169 will automatically offline all but one of the IP
170 interfaces with duplicate hardware addresses.
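
The "all but one" rule can be illustrated with the sketch below. It is not the daemon's implementation: the function name and the dict input are assumptions, and which of the duplicates stays online (here, the first by interface name) is a choice made purely for the example.

```python
def interfaces_to_offline(group):
    """Return the interface names to offline because of duplicate addresses.

    `group` maps interface name to hardware address string. For each
    set of interfaces sharing an address, one is kept and the rest are
    returned in name order.
    """
    seen = {}
    offline = []
    for name in sorted(group):
        mac = group[name].lower()  # compare addresses case-insensitively
        if mac in seen:
            offline.append(name)
        else:
            seen[mac] = name
    return offline
```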
171
172 IP interface *interface_name* now has a unique hardware
173 address in group *group_name*; onlining
174 Description:
175
176 The previously-detected duplicate hardware address is
177 now unique, and therefore *in.mpathd* has brought
178 *interface_name* back online.
179
180
181 Test address *address* is not unique in group; disabling
182 probe-based failure detection on *interface_name*
183 Description:
184
185
186 For in.mpathd to perform probe-based failure detection,
187 each test address in the group must be unique.
188
189
201 No test address configured on interface *interface_name*;
202 disabling probe-based failure detection on it
203 Description:
204
205
206 For *in.mpathd* to perform probe-based failure detection
207 on an IP interface, it must be configured with a test
208 address: IPv4, IPv6, or both.
209
210
211 NIC interface_name of group group_name is not plumbed for
212 IPv[4|6] and may affect failover capability
213 Description:
214
215
216 All NICs in a multipathing group must be homogeneously
217 plumbed. For example, if a NIC is plumbed for IPv4, then
218 all NICs in the group must be plumbed for IPv4. The
219 STREAMS modules pushed on all NICs must also be identical.
220
221
222 The link has come up on interface_name more than 2 times in
223 the last minute; disabling repair until it stabilizes.
224 Description:
225
226
227 To limit the impact of interfaces with intermittent
228 hardware (such as a bad cable), *in.mpathd* will not
229 consider an IP interface with a frequently changing link
230 state as repaired until the link state stabilizes.
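
One way to detect the condition named in this message, more than 2 link-up events within the last minute, is a sliding window over event timestamps. The class below is a sketch under that assumption; its name, its parameters, and the exact handling of the window boundary are illustrative and not taken from in.mpathd.

```python
from collections import deque


class LinkFlapTracker:
    """Track link-up events and report when the link is flapping."""

    def __init__(self, max_ups=2, window_s=60.0):
        self.max_ups = max_ups
        self.window_s = window_s
        self._ups = deque()

    def link_up(self, now):
        """Record a link-up at time `now` (seconds); return True if flapping."""
        self._ups.append(now)
        # Drop events that have aged out of the window.
        while self._ups and now - self._ups[0] > self.window_s:
            self._ups.popleft()
        return len(self._ups) > self.max_ups
```

Once enough time passes without further link-up events, older timestamps age out and the tracker stops reporting the link as flapping, which corresponds to the link state stabilizing.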
231
232
233
234 Invalid failure detection time of *time*, assuming default
235 of 10000 ms
236 Description:
237
238
239 An invalid value was encountered for
240 FAILURE_DETECTION_TIME in the /etc/default/mpathd file.
241
242
243
244 Too small failure detection time of *time*, assuming minimum
245 of 100 ms
246 Description:
247
248
262 The minimum value that can be specified for
263 FAILURE_DETECTION_TIME is currently 100 milliseconds.
264
265
291 The round trip time for ICMP probes is higher than
292 necessary to maintain the current failure detection
293 time. The network is probably congested or the probe
294 targets are loaded. in.mpathd automatically increases
295 the failure detection time to whatever it can achieve
296 under these conditions.
297
298
299
300 Improved failure detection time time ms on (inet[6]
301 interface_name) for group group_name
302 Description:
303
304
305 The round trip time for ICMP probes has now decreased
306 and in.mpathd has lowered the failure detection time
307 correspondingly.
308
309
310
311 IP interface failure detected on interface_name
312 Description:
313
314
315
316
317 SunOS 5.11 Last change: 8 Sep 2006 5
318
319
320
321
322
323
324 System Administration Commands in.mpathd(1M)
325
326
327
328 *in.mpathd* has detected a failure on *interface_name*,
329 and has set the *IFF_FAILED* flag on *interface_name*,
330 ensuring that it will not be used for IP data traffic.
331
332
333 IP interface repair detected on *interface_name*
334 Description:
335
336
337 *in.mpathd* has detected a repair on *interface_name*,
338 and has cleared the *IFF_FAILED* flag. Depending on the
339 administrative configuration, the *interface_name* may
340 again be used for IP data traffic.
341
342
343 The link has gone down on interface_name
344 Description:
345
346
347 *in.mpathd* has detected that the *IFF_RUNNING* flag for
348 *interface_name* has been cleared, indicating the link
349 has gone down.
350
351
352
353 The link has come up on interface_name
354 Description:
355
356
357 *in.mpathd* has detected that the *IFF_RUNNING* flag for
358 *interface_name* has been set, indicating the link has
359 come up.
|