System Administration Commands                          in.mpathd(1M)

NAME
     in.mpathd - daemon for network adapter (NIC) failure detection,
     recovery, automatic failover and failback

SYNOPSIS
     /usr/lib/inet/in.mpathd

DESCRIPTION
     The in.mpathd daemon performs Network Interface Card (NIC)
     failure and repair detection. In the event of a NIC failure,
     it causes IP network access from the failed NIC to fail over
     to a standby NIC, if available, or to any other operational
     NIC that has been configured as part of the same network
     multipathing group. Once the failed NIC is repaired, all
     network access is restored to the repaired NIC.

     The in.mpathd daemon can detect NIC failure and repair
     through two methods: by monitoring the IFF_RUNNING flag for
     each NIC (link-based failure detection), and by sending and
     receiving ICMP echo requests and replies on each NIC
     (probe-based failure detection). Link-based failure detection
     requires no explicit configuration and thus is always
     enabled (provided the NIC driver supports the feature);
     probe-based failure detection must be enabled through the
     configuration of one or more test addresses (described
     below), but has the benefit of testing the entire NIC send
     and receive path.

     If only link-based failure detection is enabled, then the
     health of the interface is determined solely from the state
     of the IFF_RUNNING flag. Otherwise, the interface is
     considered failed if either of the two methods indicates a
     failure, and repaired once both methods indicate the failure
     has been corrected. Not all interfaces in a group need to be
     configured with the same failure detection methods.

     As mentioned above, in order to perform probe-based failure
     detection, in.mpathd needs a special test address on each NIC
     for the purpose of sending and receiving probes on the NIC.
     Use the -failover option of the ifconfig command to configure
     these test addresses. See ifconfig(1M). The test address must
     belong to a subnet that is known to the hosts and routers on
     the link.

     The in.mpathd daemon can detect NIC failure and repair by
     two methods: by sending and receiving ICMP echo requests and
     replies on each NIC, and by monitoring the IFF_RUNNING flag
     for each NIC. The link state on some models of NIC is
     indicated by the IFF_RUNNING flag, allowing for faster failure
     detection when the link goes down. The in.mpathd daemon
     considers a NIC to have failed if either of the above two
     methods indicates failure. A NIC is considered to be
     repaired only if both methods indicate the NIC is repaired.

     The in.mpathd daemon sends the ICMP echo request probes to
     on-link routers. If no routers are available, it sends the
     probes to neighboring hosts. Thus, for network failure
     detection and repair, there must be at least one neighbor on
     each link that responds to ICMP echo request probes.

     in.mpathd works on both IPv4 and IPv6. If IPv4 is plumbed on
     a NIC, an IPv4 test address is configured on the NIC, and the
     NIC is configured as part of a network multipathing group,
     then in.mpathd will start sending ICMP probes on the NIC
     using IPv4.

     In the case of IPv6, the link-local address must be
     configured as the test address. The in.mpathd daemon will not
     accept a non-link-local address as a test address. If the
     NIC is part of a multipathing group, and the test address
     has been configured, then in.mpathd will probe the NIC for
     failures using IPv6.
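
     The IPv6 test address rule above can be illustrated with a
     short Python sketch. It is not part of in.mpathd: the function
     name accepts_test_address is hypothetical, and the sketch
     models only the link-local requirement for IPv6. It assumes
     that any IPv4 address passes, because the requirement that the
     subnet be known to hosts and routers on the link cannot be
     checked locally.

```python
import ipaddress


def accepts_test_address(addr: str) -> bool:
    """Return True if addr could serve as a test address under the
    rules described above (hypothetical helper, illustration only)."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 6:
        # Only a link-local IPv6 address is accepted as a test address.
        return ip.is_link_local
    # Assumption: any IPv4 address passes here; whether its subnet is
    # known to the hosts and routers on the link is not modeled.
    return True


if __name__ == "__main__":
    # Example addresses are made up for illustration.
    for candidate in ("fe80::1", "2001:db8::1", "192.168.85.21"):
        print(candidate, accepts_test_address(candidate))
```

     A check of this kind could be run over planned test addresses
     before they are configured with ifconfig(1M).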

     Even if both the IPv4 and IPv6 protocol streams are plumbed,
     it is sufficient to configure only one of the two, that is,
     either an IPv4 test address or an IPv6 test address on a
     NIC. If only an IPv4 test address is configured, in.mpathd
     probes using only ICMPv4. If only an IPv6 test address is
     configured, it probes using only ICMPv6. If both types of
     test address are configured, it probes using both ICMPv4 and
     ICMPv6.

     The in.mpathd daemon accesses three variable values in
     /etc/default/mpathd: FAILURE_DETECTION_TIME, FAILBACK and
     TRACK_INTERFACES_ONLY_WITH_GROUPS.

     The FAILURE_DETECTION_TIME variable specifies the NIC
     failure detection time for the ICMP echo request probe
     method of detecting NIC failure. The shorter the failure
     detection time, the greater the volume of probe traffic. The
     default value of FAILURE_DETECTION_TIME is 10 seconds (10000
     milliseconds). This means that NIC failure will be detected
     by in.mpathd within 10 seconds. NIC failures detected by the
     IFF_RUNNING flag being cleared are acted on as soon as the
     in.mpathd daemon notices the change in the flag. The NIC
     repair detection time cannot be configured; however, it is
     defined as double the value of FAILURE_DETECTION_TIME.

     By default, in.mpathd does failure detection only on NICs
     that are configured as part of a multipathing group. You can
     set TRACK_INTERFACES_ONLY_WITH_GROUPS to no to enable
     failure detection by in.mpathd on all NICs, even if they are
     not part of a multipathing group. However, in.mpathd cannot
     fail over from a failed NIC if it is not part of a
     multipathing group.

     The in.mpathd daemon will restore network traffic back to
     the previously failed NIC, after it has detected a NIC
     repair.
     To disable this, set the value of FAILBACK to no in
     /etc/default/mpathd.

FILES
     /etc/default/mpathd     Contains default values used by the
                             in.mpathd daemon.

ATTRIBUTES
     See attributes(5) for descriptions of the following
     attributes:

     ___________________________________________________________
    |       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
    |_____________________________|_____________________________|
    | Availability                | SUNWcsr                     |
    |_____________________________|_____________________________|

SEE ALSO
     ifconfig(1M), attributes(5), icmp(7P), icmp6(7P)

DIAGNOSTICS
     Test address address is not unique; disabling probe based
     failure detection on interface_name
         Description:
             For in.mpathd to perform probe-based failure
             detection, each test address in the group must be
             unique. Since the IPv6 test address is a link-local
             address derived from the MAC address, each IP
             interface in the group must have a unique MAC
             address.

     NIC interface_name of group group_name is not plumbed for
     IPv[4|6] and may affect failover capability
         Description:
             All NICs in a multipathing group must be
             homogeneously plumbed. For example, if a NIC is
             plumbed for IPv4, then all NICs in the group must be
             plumbed for IPv4. The streams modules pushed on all
             NICs must be identical.

     No test address configured on interface interface_name
     disabling probe-based failure detection on it
         Description:
             In order for in.mpathd to perform probe-based
             failure detection on a NIC, it must be configured
             with a test address: IPv4, IPv6, or both.

     The link has come up on interface_name more than 2 times in
     the last minute; disabling failback until it stabilizes.
         Description:
             In order to prevent interfaces with intermittent
             hardware problems, such as a bad cable, from causing
             repeated failovers and failbacks, in.mpathd does not
             fail back to interfaces with frequently fluctuating
             link states.

     Invalid failure detection time assuming default 10000
         Description:
             An invalid value was encountered for
             FAILURE_DETECTION_TIME in the /etc/default/mpathd
             file.

     Too small failure detection time of time assuming minimum
     100
         Description:
             The minimum value that can be specified for
             FAILURE_DETECTION_TIME is currently 100
             milliseconds.

     Invalid value for FAILBACK value
         Description:
             Valid values for the boolean variable FAILBACK are
             yes or no.

     Invalid value for TRACK_INTERFACES_ONLY_WITH_GROUPS value
         Description:
             Valid values for the boolean variable
             TRACK_INTERFACES_ONLY_WITH_GROUPS are yes or no.

     Cannot meet requested failure detection time of time ms on
     (inet[6] interface_name) new failure detection time for
     group group_name is time ms
         Description:
             The round trip time for ICMP probes is too high to
             maintain the current failure detection time. The
             network is probably congested or the probe targets
             are loaded. in.mpathd automatically increases the
             failure detection time to whatever it can achieve
             under these conditions.

     Improved failure detection time time ms on (inet[6]
     interface_name) for group group_name
         Description:
             The round trip time for ICMP probes has now
             decreased and in.mpathd has lowered the failure
             detection time correspondingly.

     NIC failure detected on interface_name
         Description:
             in.mpathd has detected NIC failure on
             interface_name, and has set the IFF_FAILED flag on
             NIC interface_name.

     Successfully failed over from NIC interface_name1 to NIC
     interface_name2
         Description:
             in.mpathd has caused the network traffic to fail
             over from NIC interface_name1 to NIC
             interface_name2, which is part of the multipathing
             group.

     NIC repair detected on interface_name
         Description:
             in.mpathd has detected that NIC interface_name is
             repaired and operational. If the IFF_FAILED flag on
             the NIC was previously set, it will be reset.

     Successfully failed back to NIC interface_name
         Description:
             in.mpathd has restored network traffic back to NIC
             interface_name, which is now repaired and
             operational.

     The link has gone down on interface_name
         Description:
             in.mpathd has detected that the IFF_RUNNING flag for
             NIC interface_name has been cleared, indicating the
             link has gone down.

     The link has come up on interface_name
         Description:
             in.mpathd has detected that the IFF_RUNNING flag for
             NIC interface_name has been set, indicating the link
             has come up.

SunOS 5.11                Last change: 8 Sep 2006
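
     The rules this page gives for /etc/default/mpathd (a default
     of 10000 and a minimum of 100 milliseconds for
     FAILURE_DETECTION_TIME, yes or no for the two boolean
     variables, and a repair detection time of double the failure
     detection time) can be sketched in Python as follows. This is
     an illustration, not the daemon's implementation. The
     NAME=value file format with '#' comments, the fallback to the
     default on an invalid boolean, the case-insensitive matching
     of yes and no, and the names parse_mpathd_defaults and
     REPAIR_DETECTION_TIME are all assumptions.

```python
DEFAULT_FDT_MS = 10000  # default named in the "Invalid failure detection time" diagnostic
MIN_FDT_MS = 100        # minimum named in the "Too small failure detection time" diagnostic

# Defaults implied by the text: failback happens, and only grouped NICs are tracked.
BOOLEAN_DEFAULTS = {"FAILBACK": True, "TRACK_INTERFACES_ONLY_WITH_GROUPS": True}


def parse_mpathd_defaults(text):
    """Parse NAME=value lines (assumed format) and apply the documented rules.

    Returns (settings, diagnostics); hypothetical helper for illustration.
    """
    raw = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if "=" in line:
            name, value = line.split("=", 1)
            raw[name.strip()] = value.strip()

    settings, diagnostics = {}, []

    fdt = DEFAULT_FDT_MS
    if "FAILURE_DETECTION_TIME" in raw:
        try:
            fdt = int(raw["FAILURE_DETECTION_TIME"])
        except ValueError:
            diagnostics.append(
                "Invalid failure detection time assuming default %d" % DEFAULT_FDT_MS)
            fdt = DEFAULT_FDT_MS
        if fdt < MIN_FDT_MS:
            diagnostics.append(
                "Too small failure detection time of %d assuming minimum %d"
                % (fdt, MIN_FDT_MS))
            fdt = MIN_FDT_MS
    settings["FAILURE_DETECTION_TIME"] = fdt
    # Repair detection time is not configurable: double the failure detection time.
    settings["REPAIR_DETECTION_TIME"] = 2 * fdt

    for name, default in BOOLEAN_DEFAULTS.items():
        value = raw.get(name)
        if value is None:
            settings[name] = default
        elif value.lower() in ("yes", "no"):
            settings[name] = value.lower() == "yes"
        else:
            diagnostics.append("Invalid value for %s %s" % (name, value))
            settings[name] = default  # assumption: fall back to the default
    return settings, diagnostics


if __name__ == "__main__":
    sample = "FAILURE_DETECTION_TIME=50\nFAILBACK=no\n"
    print(parse_mpathd_defaults(sample))
```

     Returning the diagnostics alongside the settings keeps the
     sketch side-effect free; the daemon itself reports such
     messages as described under DIAGNOSTICS.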