1 System Administration Commands                      in.mpathd(1M)
   2 
   3 
   4 
   5 NAME
   6      in.mpathd - daemon for network adapter (NIC) failure  detec-
   7      tion, recovery, automatic failover and failback
   8 
   9 SYNOPSIS
  10      /usr/lib/inet/in.mpathd
  11 
  12 
  13 DESCRIPTION
  14      The in.mpathd daemon performs Network Interface  Card  (NIC)
  15      failure and repair detection. In the event of a NIC failure,
  16      it causes IP network access from the failed NIC to fail over
  17      to a standby NIC, if available, or to any other operational
  18      NIC that has been configured as part of the same network
  19      multipathing group. Once the failed NIC is repaired,
  20      all network access is restored to the repaired NIC.
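
The failover preference described above (a standby NIC first, otherwise any other operational NIC in the same group) can be sketched as follows. This is only an illustration of the stated rule, not the daemon's implementation; the Nic record, the function name, and the interface names in the usage note are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Nic:
    name: str
    group: str             # multipathing group name
    standby: bool = False  # configured as a standby NIC
    failed: bool = False   # currently considered failed


def pick_failover_target(failed_nic: Nic, nics: Iterable[Nic]) -> Optional[Nic]:
    """Return the NIC that should take over, or None if there is none."""
    candidates = [
        nic for nic in nics
        if nic.group == failed_nic.group
        and nic.name != failed_nic.name
        and not nic.failed
    ]
    # Standby NICs sort first; the sort is stable, so input order breaks ties.
    candidates.sort(key=lambda nic: not nic.standby)
    return candidates[0] if candidates else None
```

With a failed hme0, an operational hme1, and a standby hme2 in the same group, the function returns hme2; without the standby it returns hme1.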
  21 
  22 
  23      The in.mpathd daemon  can  detect  NIC  failure  and  repair
  24      through  two methods: by monitoring the IFF_RUNNING flag for
  25      each NIC (link-based failure detection), and by sending  and
  26      receiving  ICMP  echo  requests  and  replies  on  each  NIC
  27      (probe-based failure detection). Link-based  failure  detec-
  28      tion  requires  no explicit configuration and thus is always
  29      enabled (provided the  NIC  driver  supports  the  feature);
  30      probe-based  failure  detection  must be enabled through the
  31      configuration of  one  or  more  test  addresses  (described
  32      below),  but  has the benefit of testing the entire NIC send
  33      and receive path.
  34 
  35 
  36      If only link-based failure detection is  enabled,  then  the
  37      health  of the interface is determined solely from the state
  38      of the IFF_RUNNING flag. Otherwise, the interface is considered
  39      failed if either of the two methods indicates a
  40      failure, and repaired once both methods indicate the failure
  41      has been corrected. Not all interfaces in a group need to be
  42      configured with the same failure detection methods.
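
The rule for combining the two detection methods can be written as a small truth function. The sketch below only restates the paragraph above; the function name and parameters are hypothetical, and None is used here to mean that probe-based detection is not enabled on the interface.

```python
from typing import Optional


def nic_failed(link_running: bool, probes_ok: Optional[bool] = None) -> bool:
    """Return True if the NIC should be considered failed.

    link_running: state of the IFF_RUNNING flag (link-based detection).
    probes_ok:    result of probe-based detection, or None when no test
                  address is configured (link-based detection only).
    """
    if probes_ok is None:
        # Only link-based detection is enabled: IFF_RUNNING alone decides.
        return not link_running
    # Both methods enabled: failed if either one reports a failure, so the
    # NIC counts as repaired only when both report that it is healthy.
    return (not link_running) or (not probes_ok)
```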
  43 
  44 
  45      As mentioned above, to perform probe-based failure detection,
  46      in.mpathd needs a special test address on each NIC for
  47      sending and receiving probes on that NIC. Use the -failover
  48      option of the ifconfig command to configure these
  49      test addresses. See  ifconfig(1M).  The  test  address  must
  50      belong to a subnet that is known to the hosts and routers on
  51      the link.
  52 
  53 
  54      The in.mpathd daemon can detect NIC failure and repair by
  55      two methods: by sending and receiving ICMP echo requests and
  56      replies on each NIC, and by monitoring the IFF_RUNNING  flag
  57 
  58 
  59 
  60 SunOS 5.11           Last change: 8 Sep 2006                    1
  71      for  each NIC. The link state on some models of NIC is indi-
  72      cated by the IFF_RUNNING flag, allowing for  faster  failure
  73      detection when the link goes down. The in.mpathd daemon con-
  74      siders a NIC to have failed  if  either  of  the  above  two
  75      methods  indicates  failure.  A  NIC  is  considered  to  be
  76      repaired only if both methods indicate the NIC is repaired.
  77 
  78 
  79      The in.mpathd daemon sends the ICMP echo request  probes  to
  80      on-link  routers.  If no routers are available, it sends the
  81      probes to  neighboring  hosts.  Thus,  for  network  failure
  82      detection and repair, there must be at least one neighbor on
  83      each link that responds to ICMP echo request probes.
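
The target preference just described (on-link routers first, neighboring hosts only when no router is available) amounts to the following. This is a sketch of the stated rule only; the function name is hypothetical, and how the daemon discovers routers and hosts is not modeled.

```python
from typing import List, Sequence


def choose_probe_targets(routers: Sequence[str], hosts: Sequence[str]) -> List[str]:
    """Prefer on-link routers; fall back to neighboring hosts."""
    if routers:
        return list(routers)
    # No routers on the link: probe neighboring hosts instead.
    return list(hosts)
```

An empty result corresponds to a link with no neighbor that can answer probes, in which case probe-based detection has nothing to test against.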
  84 
  85 
  86      in.mpathd works on both IPv4 and IPv6. If IPv4 is plumbed on
  87      a NIC, an IPv4 test address is configured on the NIC, and the
  88      NIC is configured as part of a network  multipathing  group,
  89      then  in.mpathd  will  start  sending ICMP probes on the NIC
  90      using IPv4.
  91 
  92 
  93      In the case of IPv6, the link-local address must be  config-
  94      ured  as  the  test  address.  The in.mpathd daemon will not
  95      accept a non-link-local address as a test  address.  If  the
  96      NIC  is  part  of a multipathing group, and the test address
  97      has been configured, then in.mpathd will probe the  NIC  for
  98      failures using IPv6.
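
DIAGNOSTICS notes that the IPv6 test address is a link-local address derived from the MAC address and that test addresses in a group must be unique. As an illustration, the sketch below derives a link-local address with the modified EUI-64 scheme and reports duplicates. That this is the exact derivation used by the system is an assumption, and both function names are hypothetical.

```python
import ipaddress
from collections import Counter
from typing import Iterable, List


def link_local_from_mac(mac: str) -> ipaddress.IPv6Address:
    """Build fe80::/64 + modified EUI-64 interface ID from a MAC address."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac}")
    octets[0] ^= 0x02  # flip the universal/local bit
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return ipaddress.IPv6Address(b"\xfe\x80" + b"\x00" * 6 + interface_id)


def duplicate_test_addresses(addresses: Iterable[str]) -> List[str]:
    """Return the test addresses that appear more than once."""
    counts = Counter(ipaddress.ip_address(text) for text in addresses)
    return sorted(str(addr) for addr, n in counts.items() if n > 1)
```

Two interfaces that share a MAC address produce the same link-local address under this scheme, which is the situation the uniqueness diagnostic describes.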
  99 
 100 
 101      Even if both the IPv4 and IPv6 protocol streams are plumbed,
 102      it  is sufficient to configure only one of the two, that is,
 103      either an IPv4 test address or an IPv6  test  address  on  a
 104      NIC. If only an IPv4 test address is configured, in.mpathd
 105      probes using only ICMPv4. If only an IPv6 test address is
 106      configured, it probes using only ICMPv6. If both types of test
 107      addresses are configured, it probes using both ICMPv4 and
 108      ICMPv6.
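
The mapping from configured test addresses to probe protocols, including the link-local requirement for IPv6, can be sketched with the standard ipaddress module. The function name is hypothetical, and raising an error for a rejected address is a simplification chosen for this example.

```python
import ipaddress
from typing import Iterable, Set


def probe_protocols(test_addresses: Iterable[str]) -> Set[str]:
    """Return the ICMP versions used for probing, given the test addresses."""
    protocols = set()
    for text in test_addresses:
        addr = ipaddress.ip_address(text)
        if addr.version == 4:
            protocols.add("ICMPv4")
        elif addr.is_link_local:
            protocols.add("ICMPv6")
        else:
            # Only a link-local address is accepted as an IPv6 test address.
            raise ValueError(f"IPv6 test address {addr} is not link-local")
    return protocols
```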
 109 
 110 
 111      The in.mpathd  daemon  accesses  three  variable  values  in
 112      /etc/default/mpathd:  FAILURE_DETECTION_TIME,  FAILBACK  and
 113      TRACK_INTERFACES_ONLY_WITH_GROUPS.
 114 
 115 
 116      The  FAILURE_DETECTION_TIME  variable  specifies   the   NIC
 117      failure  detection  time  for  the  ICMP  echo request probe
 118      method of detecting NIC failure.  The  shorter  the  failure
 119      detection time, the greater the volume of probe traffic. The
 120      default value of FAILURE_DETECTION_TIME is 10000 milliseconds
 121      (10 seconds), so by default in.mpathd detects a NIC failure
 122      within 10 seconds. NIC failures detected by the IFF_RUNNING flag
 137      being  cleared  are acted on as soon as the in.mpathd daemon
 138      notices the change in the flag.  The  NIC  repair  detection
 139      time  cannot be configured; however, it is defined as double
 140      the value of FAILURE_DETECTION_TIME.
 141 
 142 
 143      By default, in.mpathd does failure detection  only  on  NICs
 144      that are configured as part of a multipathing group. You can
 145      set  TRACK_INTERFACES_ONLY_WITH_GROUPS  to  no   to   enable
 146      failure detection by in.mpathd on all NICs, even if they are
 147      not part of a multipathing group. However, in.mpathd  cannot
 148      do  failover  from  a failed NIC if it is not part of a mul-
 149      tipathing group.
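
The effect of TRACK_INTERFACES_ONLY_WITH_GROUPS can be summarized in two predicates. This is a sketch of the paragraph above, with hypothetical function names; a group of None stands for a NIC that is not part of any multipathing group.

```python
from typing import Optional


def is_monitored(group: Optional[str], track_only_with_groups: bool = True) -> bool:
    """Failure detection runs on grouped NICs, or on all NICs when the
    TRACK_INTERFACES_ONLY_WITH_GROUPS setting is 'no' (False here)."""
    return group is not None or not track_only_with_groups


def can_fail_over(group: Optional[str]) -> bool:
    """Failover is possible only for NICs in a multipathing group."""
    return group is not None
```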
 150 
 151 
 152      After it detects a NIC repair, the in.mpathd daemon restores
 153      network traffic to the previously failed NIC. To disable
 154      this failback, set the value of FAILBACK to no in
 155      /etc/default/mpathd.
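
As a rough model of how the three variables interact, the sketch below parses text in NAME=value form and applies the defaults and limits named in this page (default 10000, minimum 100 milliseconds, repair time double the failure detection time). The NAME=value syntax, the handling of '#' comments, and keeping the default when a boolean is invalid are assumptions; the class and function names are hypothetical.

```python
from dataclasses import dataclass

DEFAULT_FDT_MS = 10000  # default named in DIAGNOSTICS
MIN_FDT_MS = 100        # minimum named in DIAGNOSTICS


@dataclass
class MpathdSettings:
    failure_detection_time_ms: int = DEFAULT_FDT_MS
    failback: bool = True
    track_interfaces_only_with_groups: bool = True

    @property
    def repair_detection_time_ms(self) -> int:
        # Not configurable: double the failure detection time.
        return 2 * self.failure_detection_time_ms


def parse_mpathd(text: str) -> MpathdSettings:
    settings = MpathdSettings()
    booleans = {"yes": True, "no": False}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumed comment syntax
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "FAILURE_DETECTION_TIME":
            try:
                fdt = int(value)
            except ValueError:
                fdt = DEFAULT_FDT_MS  # invalid value: assume the default
            settings.failure_detection_time_ms = max(fdt, MIN_FDT_MS)
        elif key == "FAILBACK" and value in booleans:
            settings.failback = booleans[value]
        elif key == "TRACK_INTERFACES_ONLY_WITH_GROUPS" and value in booleans:
            settings.track_interfaces_only_with_groups = booleans[value]
    return settings
```

For example, parsing "FAILURE_DETECTION_TIME=5000" gives a failure detection time of 5000 ms and a repair detection time of 10000 ms.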
 156 
 157 FILES
 158      /etc/default/mpathd    Contains default values used  by  the
 159                             in.mpathd daemon.
 160 
 161 
 162 ATTRIBUTES
 163      See attributes(5) for descriptions of the  following  attri-
 164      butes:
 165 
 166 
 167 
 168      ____________________________________________________________
 169     |       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
 170     |_____________________________|_____________________________|
 171     | Availability                | SUNWcsr                     |
 172     |_____________________________|_____________________________|
 173 
 174 
 175 SEE ALSO
 176      ifconfig(1M), attributes(5), icmp(7P), icmp6(7P),
 177 
 178 
 179 DIAGNOSTICS
 180      Test address address is not unique;  disabling  probe  based
 181      failure detection on interface_name
 182          Description:
 183 
 184 
 185          For in.mpathd to perform probe-based failure  detection,
 186          each test address in the group must be unique. Since the
 187          IPv6 test address is a link-local address  derived  from
 188          the  MAC  address,  each  IP interface in the group must
 203          have a unique MAC address.
 204 
 205 
 206 
 207      NIC interface_name of group group_name is  not  plumbed  for
 208      IPv[4|6] and may affect failover capability
 209          Description:
 210 
 211 
 212          All NICs in a multipathing group must  be  homogeneously
 213          plumbed. For example, if a NIC is plumbed for IPv4, then
 214          all NICs in the group must  be  plumbed  for  IPv4.  The
 215          streams modules pushed on all NICs must be identical.
 216 
 217 
 218 
 219      No test address configured on interface interface_name
 220      disabling probe-based failure detection on it
 221          Description:
 222 
 223 
 224          In order for in.mpathd to  perform  probe-based  failure
 225          detection  on  a  NIC, it must be configured with a test
 226          address: IPv4, IPv6, or both.
 227 
 228 
 229 
 230      The link has come up on interface_name more than 2 times  in
 231      the last minute; disabling failback until it stabilizes.
 232          Description:
 233 
 234 
 235          In  order  to  prevent  interfaces   with   intermittent
 236          hardware,  such  as  a  bad cable, from causing repeated
 237          failovers and failbacks, in.mpathd does not fail back to
 238          interfaces with frequently fluctuating link states.
 239 
 240 
 241 
 242      Invalid failure detection time assuming default 10000
 243          Description:
 244 
 245 
 246          An     invalid     value     was     encountered     for
 247          FAILURE_DETECTION_TIME in the /etc/default/mpathd file.
 248 
 249 
 250 
 251      Too small failure detection time of  time  assuming  minimum
 252      100
 253          Description:
 254 
 255 
 269          The  minimum   value   that   can   be   specified   for
 270          FAILURE_DETECTION_TIME is currently 100 milliseconds.
 271 
 272 
 273 
 274      Invalid value for FAILBACK value
 275          Description:
 276 
 277 
 278          Valid values for the boolean variable FAILBACK  are  yes
 279          or no.
 280 
 281 
 282 
 283      Invalid value for TRACK_INTERFACES_ONLY_WITH_GROUPS value
 284          Description:
 285 
 286 
 287          Valid    values     for     the     boolean     variable
 288          TRACK_INTERFACES_ONLY_WITH_GROUPS are yes or no.
 289 
 290 
 291 
 292      Cannot meet requested failure detection time of time  ms  on
 293      (inet[6]  interface_name)  new  failure  detection  time for
 294      group group_name is time ms
 295          Description:
 296 
 297 
 298          The round trip time for ICMP probes is too high to
 299          maintain the current failure detection
 300          time. The network is probably congested or the probe
 301          targets are heavily loaded. in.mpathd automatically increases
 302          the failure detection time to whatever  it  can  achieve
 303          under these conditions.
 304 
 305 
 306 
 307      Improved  failure  detection  time  time  ms   on   (inet[6]
 308      interface_name) for group group_name
 309          Description:
 310 
 311 
 312          The round trip time for ICMP probes  has  now  decreased
 313          and  in.mpathd  has  lowered  the failure detection time
 314          correspondingly.
 315 
 316 
 317 
 318      NIC failure detected on interface_name
 319          Description:
 320 
 321 
 335          in.mpathd has detected NIC  failure  on  interface_name,
 336          and has set the IFF_FAILED flag on NIC interface_name.
 337 
 338 
 339 
 340      Successfully failed over from  NIC  interface_name1  to  NIC
 341      interface_name2
 342          Description:
 343 
 344 
 345          in.mpathd has caused the network traffic to fail over
 346          from  NIC  interface_name1 to NIC interface_name2, which
 347          is part of the multipathing group.
 348 
 349 
 350 
 351      NIC repair detected on interface_name
 352          Description:
 353 
 354 
 355          in.mpathd  has  detected  that  NIC  interface_name   is
 356          repaired  and operational. If the IFF_FAILED flag on the
 357          NIC was previously set, it will be reset.
 358 
 359 
 360 
 361      Successfully failed back to NIC interface_name
 362          Description:
 363 
 364 
 365          in.mpathd has  restored  network  traffic  back  to  NIC
 366          interface_name, which is now repaired and operational.
 367 
 368 
 369 
 370      The link has gone down on interface_name
 371          Description:
 372 
 373 
 374          in.mpathd has detected that the IFF_RUNNING flag for NIC
 375          interface_name has been cleared, indicating the link has
 376          gone down.
 377 
 378 
 379 
 380      The link has come up on interface_name
 381          Description:
 382 
 383 
 384          in.mpathd has detected that the IFF_RUNNING flag for NIC
 385          interface_name  has  been  set,  indicating the link has
 386          come up.
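
The link-stability diagnostic earlier in this section ("more than 2 times in the last minute") suggests a sliding-window count of link-up events. The sketch below shows one way such a check could work. It is not taken from the daemon: the class name, method names, and bookkeeping are hypothetical, and only the threshold of 2 and the one-minute window come from the message text.

```python
from collections import deque


class LinkFlapTracker:
    """Count link-up events in a sliding window to decide on failback."""

    def __init__(self, max_ups: int = 2, window_s: float = 60.0) -> None:
        self.max_ups = max_ups
        self.window_s = window_s
        self._ups = deque()  # timestamps (seconds) of recent link-up events

    def record_link_up(self, now: float) -> None:
        self._ups.append(now)

    def failback_allowed(self, now: float) -> bool:
        # Forget link-up events that fall outside the window.
        while self._ups and now - self._ups[0] > self.window_s:
            self._ups.popleft()
        # Failback is withheld while the link came up more than max_ups times.
        return len(self._ups) <= self.max_ups
```

Three link-up events within one minute disable failback in this model; once enough of them age out of the window, failback is allowed again.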