INTEGRATED LOAD BALANCER DESIGN DOCUMENT [Rev 1.1]

Authors: Sangeeta Misra
         Kacheong Poon
         Michael Schuster

1. Overview
----------------
This document describes the functional components and the overall design of
the ILB project (PSARC 2008/575). The project will deliver the basic
features needed to use Solaris on an x86/SPARC platform as an L3/L4 load
balancer. The project will deliver the following features:

o Stateless DSR (Direct Server Return) and NAT operation modes offering the
  following load balancing algorithms: round-robin, src-IP hash,
  src-IP-port hash, and src-IP-VIP hash. IPv4 and IPv6 support will be
  provided for both operation modes.

o A CLI and a configuration API to configure the various features as well
  as view statistics and configuration details.

o Simple server monitoring features

o High availability between load balancers in an active-standby
  configuration mode via the VRRP protocol (RFC 3768). Note that VRRP will
  be delivered in Solaris as a separate but parallel project [1].

The project includes kernel and userland components. ILB will therefore be
delivered in packages separate from the core stack, with the following
package names:

SUNWilbr     ILB kernel component
SUNWilb      components delivered in /usr, which are:
             o ilbadm
             o libilb
             o ilbd

2. Terms used in this document
-------------------------------
Stateless Direct Server Return - Direct Server Return (DSR) mode refers to
using the load balancer only to load balance incoming requests to the
back-end servers, letting the return traffic from the servers to the
clients bypass the load balancer. With stateless DSR, the load balancer
keeps no state for the packets it processes (load balances), except for
simple statistics.

Server group - A server group comprises a set of back-end servers. If an
ILB user wants to load balance HTTP requests, he/she will configure the
load balancer with a server group consisting of several servers. The load
balancer will balance the HTTP traffic across this set of servers.
Virtual service - A virtual service is what the world sees as VIP:port
(e.g. www.foo.com:80). Although the service is handled by a server group
consisting of several servers, the server group appears to the clients of
the virtual service as a single IP address:port. Note that a single server
can be included in multiple server groups and thus may serve multiple
virtual services.

VIP - The virtual IP address (VIP) is the IP address for the virtual
service.

Load balancing algorithm - The algorithm that the load balancer uses to
select back-end servers from a server group for incoming packets.

Load balancing rule - For the purposes of this document a load balancing
rule is defined by the following parameters:

o IP version: the IP version (IPv4 or IPv6) of a packet
o virtual IP address (VIP)
o transport protocol: TCP or UDP
o port number (or a port range)
o load balancing algorithm
o type of load balancing operation (DSR or NAT)
o a server group, and optional health checks to be executed for each server
  in the server group
o rule name

The load balancer uses the {VIP, transport protocol, port number} values to
determine whether an incoming packet matches a rule. If there is a match,
the load balancer uses the specified load balancing algorithm to select a
server from the server group.

3. Load balancer operation modes
-------------------------------
The ILB project will provide in-kernel implementations of the Direct Server
Return (DSR) and NAT (half and full) operation modes, with support for both
IPv4 and IPv6. Direct Server Return (DSR) mode refers to using the load
balancer (LB) only to load balance incoming requests to the back-end
servers, letting the return traffic from the servers to the clients bypass
the load balancer. NAT-based load balancing involves rewriting header
information, and handles both the request and the response traffic. Phase 1
will support single-legged and dual-legged topologies (see Appendix D).
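The rule matching and server selection just described can be sketched as
follows. This is a minimal illustrative model, not ILB code: the class and
field names are invented, only two of the algorithms are shown, and MD5
merely stands in for whatever hash function the kernel implementation
actually uses.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class Rule:
    """Hypothetical model of a load balancing rule."""
    vip: str
    protocol: str             # "TCP" or "UDP"
    ports: range              # matched port range, e.g. range(80, 81)
    algorithm: str            # "round-robin" or "hash-IP"
    servers: list = field(default_factory=list)
    _rr_next: int = 0         # round-robin cursor

    def matches(self, dst_ip, proto, dst_port):
        # A packet matches a rule on the {VIP, transport protocol, port}
        # triple, as described in section 2.
        return (dst_ip == self.vip and proto == self.protocol
                and dst_port in self.ports)

    def select_server(self, src_ip):
        if self.algorithm == "round-robin":
            server = self.servers[self._rr_next % len(self.servers)]
            self._rr_next += 1
        else:  # "hash-IP": the same client IP always maps to one server
            h = int(hashlib.md5(src_ip.encode()).hexdigest(), 16)
            server = self.servers[h % len(self.servers)]
        return server

rule = Rule(vip="10.0.0.1", protocol="TCP", ports=range(80, 81),
            algorithm="hash-IP", servers=["192.168.1.1", "192.168.1.2"])
assert rule.matches("10.0.0.1", "TCP", 80)
# hash-IP is deterministic per client:
assert rule.select_server("172.16.0.9") == rule.select_server("172.16.0.9")
```

The source-IP hash is what makes DSR workable without per-connection state:
every packet from a given client independently hashes to the same back-end.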
As part of proof-of-concept work we implemented DSR and half-NAT in the
kernel and compared the performance of our half-NAT load balancer with that
of IP Filter (to ensure that ours does not perform worse). The comparative
performance results are listed in Appendix A. After careful review of both
implementations, we decided to use the standalone NAT load balancer version
because it met the following criteria better than IP Filter's
implementation:

- Lightweight code, containing only the NAT-based load balancing feature,
  that can easily be extended with load balancing algorithms as requested
  by customers
- Fits well with the rest of the ILB code, so that the load balancing
  algorithms can be shared by DSR and NAT
- Minimizes conflict when the system is running NAT-based ILB and IP Filter
  NAT at the same time.

It is important to note that, unlike IP Filter NAT, the standalone version
is not a full-blown NAT implementation; it is strictly limited to load
balancing functionality.

4. Command-line interface
----------------------------------
The core functionality of load balancer administration will be implemented
in a library (libilb) for consumption by the CLI (ilbadm) and 3rd-party
applications. The location of the CLI will be /usr/sbin/ilbadm. The
location of the API will be /usr/lib/libilb. The CLI will include commands
to configure load balancing rules, server groups and health checks. In
addition, it will include various commands to display statistics as well as
configuration details. Users will require privileges to invoke the
configuration commands; the view commands can be invoked by regular users.
Configuration commands:
o create and destroy load balancing rules
o add and remove servers from a server group

View commands:
o view configured load balancing rules
o view packet forwarding statistics
o view the NAT connection table
o view health check results

A detailed list of commands is provided in Appendix B.

5. Server monitoring details
----------------------------------
The ILB project will offer an optional server monitoring feature and will
provide the following types of health checks:

o ping
o TCP
o UDP probes
o user-supplied tests to be run as health checks (the test can be a binary
  or a shell script)

The health checks are specified for the associated server group when
creating a load balancing rule. Only one health check can be configured per
load balancing rule. The following user-configurable parameters apply to
health check configuration:

o hc-test     - type of health check
o hc-timeout  - time after which a health check is considered to have
                failed if it has not completed
o hc-interval - interval between consecutive health checks. Note that the
                implementation will randomize the interval between
                0.5 * hc-interval and 1.5 * hc-interval to avoid
                synchronization [6]
o hc-count    - number of consecutive failed health checks before the
                server is considered to be down.

6. High availability and redundancy capability
------------------------------------------------
ILB will provide an optional HA capability in an active-standby redundancy
configuration. The active-standby configuration consists of a pair of load
balancers of which only one is active at a time (the primary load balancer)
while the other stays in standby mode. Should the primary fail, the standby
takes over the primary's job. The VRRP protocol is used to select the
primary load balancer [5]. Note that ILB will only provide redundancy for
machine failures; it will not handle switch failures. Existing mechanisms
such as link aggregation can be used to handle switch failures.
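The hc-interval randomization described in section 5 amounts to drawing
each probe delay uniformly from [0.5 * hc-interval, 1.5 * hc-interval]. A
sketch (illustrative only; ilbd's actual timers are C code driven by event
ports):

```python
import random

def next_check_delay(hc_interval):
    """Delay before the next health check probe, drawn uniformly from
    [0.5 * hc_interval, 1.5 * hc_interval] so that probes against many
    servers do not synchronize."""
    return random.uniform(0.5 * hc_interval, 1.5 * hc_interval)

# With hc-interval = 10s, every delay falls in [5s, 15s].
delays = [next_check_delay(10.0) for _ in range(1000)]
assert all(5.0 <= d <= 15.0 for d in delays)
```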
In order to make load balancer failover transparent to client applications,
the primary load balancer needs to synchronize its state (e.g. connection
information) with the standby load balancer. This is needed so that when
the primary fails and the standby takes over, the standby has the state of
most connections, and almost all connections can continue to access the
virtual service through it. The ILB project will not deliver this
synchronization capability. Note that HA without synchronization is still
valuable: upon the primary's failure, it gives clients a service to
reconnect to.

To set up the HA capability, the user will have to manually configure both
the primary and the standby via the VRRP CLI, use the export subcommand of
the ILB CLI (see Appendix B) to acquire an editable copy of the primary's
persistent configuration, modify it as necessary, and copy it over to the
standby.

7. Other capabilities
-------------------
Other capabilities include the following:

1. Ability for clients to ping the VIP address - The load balancer needs to
   be able to respond to ICMP echo requests to VIPs from clients. Both DSR
   and NAT will provide support for this feature.

2. Ability to add and remove servers from a server group without
   interrupting service - This capability allows one to dynamically add and
   remove servers from the server group of an active rule without
   interrupting existing connections established to back-end servers. NAT
   will provide support for this feature.

3. Session persistence - For many applications, it is important that a
   series of connections from the same client is sent to the same back-end
   server. Ideally, the addition or removal of a back-end server shouldn't
   interfere with established persistent sessions.
ILB will provide the admin with the capability to configure the following
   session persistence (also called "stickiness") mechanisms for the
   NAT-based load balancing operation mode:

   o src-IP sticky (Layer 3 stickiness): stick a client to a server based
     on the client IP address.
   o src-IP,dst-port sticky (Layer 4 stickiness): stick a client to a
     server based on both the client IP address and the server destination
     port number.

4. Connection draining - ILB will provide support for this feature only for
   servers of NAT-based virtual services that have session persistence
   enabled. This feature allows the administrator to mark a back-end server
   for draining. No new connections will be sent to this server, except for
   connections from clients with established session persistence to that
   server. As the session persistence timers expire, all clients will
   gradually be migrated off the selected server, which can then be taken
   down for maintenance. Once the server is ready to handle requests again,
   the admin will turn the feature off for the server so that the load
   balancer can forward new connections to it. This allows administrators
   to take down servers for maintenance without disrupting active
   connections/sessions.

5. Load balancing all ports - ILB will provide the ability to load balance
   all ports on a given IP address across the set of servers, without
   having to set up explicit rules for each port. This feature will be
   available for the NAT and DSR operation modes.

6. Independent ports for virtual services in the same pool - For NAT, it
   should be possible to specify different destination ports for different
   servers in the pool.

7. Load balancing a simple port range - This capability allows one to load
   balance a range of ports on the VIP to the given server group. It is
   sometimes convenient to conserve IP addresses by load balancing
   different port ranges on the same VIP to different sets of back-ends.
   Both DSR and NAT will provide support for this feature.
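The src-IP sticky mechanism can be sketched as a persistence table
consulted before the normal load balancing algorithm. Everything here is
illustrative; the class name, the timeout handling, and the fallback
selector are assumptions, not ILB's actual data structures:

```python
import itertools
import time

class StickyTable:
    """Sketch of Layer 3 (src-IP) stickiness: once a client is mapped to a
    back-end server, later connections reuse that mapping until the
    persistence entry times out."""
    def __init__(self, select, timeout=60.0):
        self._select = select    # fallback load balancing algorithm
        self._timeout = timeout
        self._map = {}           # client src IP -> (server, last-used time)

    def server_for(self, src_ip):
        now = time.monotonic()
        entry = self._map.get(src_ip)
        if entry is not None and now - entry[1] < self._timeout:
            server = entry[0]    # sticky hit: reuse the previous server
        else:
            server = self._select(src_ip)
        self._map[src_ip] = (server, now)   # refresh the persistence timer
        return server

rr = itertools.cycle(["serverA", "serverB"])   # stand-in round-robin
table = StickyTable(lambda ip: next(rr))
first = table.server_for("203.0.113.5")
assert table.server_for("203.0.113.5") == first   # sticks to one back-end
```

A src-IP,dst-port sticky table would simply key the map on the
(client IP, destination port) pair instead of the client IP alone.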
In addition, when session persistence is enabled for NAT-based load
   balancing, requests from the same client IP for different ports in the
   range should be sent to the same back-end server.

8. Port range shifting and collapsing - These features will be provided by
   the NAT operation mode.

   Port range shifting means the following:

   Rule: VIP(n:N) -> {IP1(n1:N1), IP2(n2:N2), ... }

   When the load balancer gets a packet with destination port m, where
   n <= m <= N, it will load balance the packet to IP1, IP2 etc., and
   re-write the destination of the packet sent to IP1 as IP1, port
   n1 + (m - n); that is, the offset into the port range is preserved.

   Port range collapsing means the following:

   Rule: VIP(n:N) -> {IP1:n1, IP2:n2, ... }

   When the load balancer gets a packet with destination port m, where
   n <= m <= N, it will load balance the packet to IP1, IP2 etc., and
   re-write the destination of the packet (assuming half-NAT) sent to IP1
   as IP1, port n1; that is, the whole range collapses onto a single port.

8. Architecture
----------------
The following diagram shows the major components of ILB:

     ---------------------
     |ilbadm CLI interface|
     ---------------------
               |
               V
     ----------  AF_UNIX sockets  -------
     | libilb |<------------------>|ilbd |
     ----------                    -------
                                      ^
                                      | ioctls
                                      V
     ------------------------------------
     |        Kernel ILB Engine         |
     ------------------------------------

The major components are:

ilbadm - This is the CLI of ILB. An admin will use this interface to
         configure load balancing and optional health checks, as well as
         view statistics.

libilb - This is the configuration library.

ilbd   - The ILB daemon has the following tasks:
         o manage the persistent configuration
         o serialize access to the kernel ILB module by processing
           configuration and statistics display requests from libilb and
           feeding them to kernel ILB for execution
         o perform health checks (built-in health checks as well as
           user-supplied test scripts) and notify the kernel ILB module of
           server health so that the load distribution is adjusted
           properly.

9. The specifics of the ilbd daemon
---------------------------------
The ilbd source code will reside in the /usr/src/cmd/cmd-inet/usr.lib
directory.
9.1 IPC details and privileges for the ilbd daemon

We will use an AF_UNIX socket (socket type SOCK_SEQPACKET) for IPC between
libilb and ilbd, as both processes run on the same machine. A subset of
ilbadm commands (specifically the configuration commands) will require
privileges, while others (the statistics and configuration display
commands) will not. The /var/run directory will hold the AF_UNIX rendezvous
files.

We propose that the project implement an "ilbadm" uid. The ilbd daemon will
be run by the "ilbadm" user with the PRIV_SYS_IP_CONFIG privilege and will
use ioctls to communicate with the kernel. The kernel should check the
ioctl credential to make sure it has PRIV_SYS_IP_CONFIG before servicing
it. Since the persistent config files can only be modified by the daemon,
the files will be owned by user "ilbadm" and will reside in the /etc/ilbadm
directory.

The ILB project will audit administration using the auditing interfaces
defined by PSARC 200/517.

9.2 ilbd daemon internals

The core of the ilbd daemon will be a single-threaded event loop using the
event completion framework; it will receive requests from libilb, handle
timeouts, perform health checks, and populate the kernel state. We chose
the event ports framework [2,3] over poll/select for ease of
implementation, for reasons including:

o Unlike with poll(), one does not need to walk the entire set of file
  descriptors to find out which one(s) had activity. Walking the list is an
  O(N) activity which does not scale well as N gets large.
o The necessity to handle timers via signals goes away; one can simply
  associate a timer with an event port.

To perform health checks, the daemon will create a timer for every health
check (this means that if there are 100 servers and the load balancer is
configured to run 3 health checks per server, a total of 300 timers will be
created). Each of these timers will be associated with the event port.
When a timer goes off, the daemon will open a pipe to a separate process
(using popen) to perform the specific health check. All health checks will
be implemented as external methods (binaries or scripts) executed by ilbd
as separate processes. The following arguments will be passed to external
methods:

$1 VIP (literal IPv4 or IPv6 address)
$2 Server IP (literal IPv4 or IPv6 address)
$3 Protocol (UDP, TCP as a string)
$4 Numeric port
$5 Maximum time (in seconds) the method should wait before returning
   failure. If the method runs for longer, it may be killed, and the test
   considered failed.

Return values: the process writes the RTT to stdout, or 0 if it does not
calculate it. A return value of 255 signifies failure.

To keep things simple, all the health checks that ILB provides will be run
with a specific set of privileges (one of them being PRIV_NET_ICMPACCESS,
to allow the ICMP echo health check). By default, user-supplied health
checks will also run with the same set of privileges. If an administrator
has a user-supplied script that requires a larger privilege set, he/she
will have to run it with setuid explicitly.

Each health check will have a timeout: if the health check process hangs,
it will be killed after the timeout interval and the daemon will notify the
kernel ILB engine of the server's unresponsiveness, so that the load
distribution can be adjusted appropriately. If, on the other hand, the
health check is successful, the timeout timer is cancelled.
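An external health check method, then, is just a program honoring the
argument and exit-code contract above. The following is a sketch of what a
user-supplied method might look like; it is not shipped ILB code, it
assumes Python is available on the system, and it implements only a TCP
connect probe (a real method could equally be a shell script or a C
binary):

```python
#!/usr/bin/env python3
# Sketch of an external health check method. Arguments ($1..$5):
#   VIP, server IP, protocol, numeric port, max wait in seconds.
# Prints the measured RTT (or 0) on stdout; exits 255 on failure.
import socket
import sys
import time

def tcp_check(server_ip, port, max_time):
    """TCP-connect probe: return the connect RTT in seconds, or None on
    failure (refused, unreachable, or timed out)."""
    start = time.monotonic()
    try:
        with socket.create_connection((server_ip, port), timeout=max_time):
            return time.monotonic() - start
    except OSError:
        return None

def main(argv):
    vip, server_ip, proto, port, max_time = argv[1:6]
    if proto.upper() != "TCP":
        print(0)          # this sample does not measure RTT for UDP/ping
        return 0
    rtt = tcp_check(server_ip, int(port), float(max_time))
    if rtt is None:
        return 255        # 255 signifies failure, per the contract
    print(rtt)
    return 0

if __name__ == "__main__" and len(sys.argv) >= 6:
    sys.exit(main(sys.argv))
```

ilbd would run such a method via popen, read the RTT from the pipe, and
treat exit status 255 (or the kill-on-timeout path) as a failed check.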
Here is the pseudo code:

    port_create()
    associate periodic timers for each health check with port
    associate socket to obtain requests from libilb
    forever() {
        port_get()
        switch (event type) {
        case data on socket:
            apply config change to kernel
            update internal state
            re-associate socket with event port
        case periodic timer for HC:
            FILEp = popen(HC test program)
            create timeout timer for this test
            associate timeout timer with port
            port_associate(fileno(FILEp))
        case return value for HC test:
            record RTT
            cancel associated timeout timer
        case timeout:
            kill the HC process
            update kernel with "serverX for load balancing rule A is dead"
        }
    }

9.3 Error handling and monitoring

Errors will be reported to syslog. In addition, the "monitor" option of the
ilbadm command can be used to monitor the ilbd daemon's execution of events
and its communication with the kernel. Note that one does not require
privileged access to run the "monitor" option. By default the output of the
"ilbadm monitor" command will be appended to a file. The verbosity of the
output can be dialed up with the -d option (useful for debugging purposes).

9.4 Signal handling

The ilbd daemon will handle SIGALRM and SIGTERM via event ports.

10. ILB kernel components
----------------------------
The ILB code resides in the IP module. It provides two load balancing
mechanisms, stateless DSR and NAT (half and full), for UDP and TCP traffic.
Userland applications can open a socket() and issue ioctl() calls on it to
communicate with the ILB code.

The ILB code intercepts incoming packets right before IP decides whether a
packet is destined locally or is to be forwarded. This is after the
"physical in" and before the "forwarding" Packet Filtering Hooks
(PSARC 2005/33) processing. If there is a load balancing rule, the ILB code
will be invoked to check whether the packet needs to be load balanced. Note
that the placement of the interception implies that the ILB code cannot
load balance local traffic.
We have chosen this design instead of extending the IP Filter hooks to
ensure that the ordering of ILB processing and IP Filter processing is
correct. Furthermore, should we in the future need an ILB hook on the
transmit side, that hook wouldn't belong where pfhooks sits on the transmit
side; we'd need the transmit hook to be before IRE lookup and
fragmentation, instead of at the bottom of the IP output code (where the
pfhooks transmit hook is).

If an incoming packet matches a load balancing rule, the rule's algorithm
will be used to select a back-end server. If the rule requires the use of
NAT, the header of the packet will be re-written with the NAT information.
After the server selection and header re-write, normal IP incoming packet
processing will continue, using the selected server's IP address as the
destination.

If an incoming packet is a fragment destined to the VIP of any load
balancing rule, ILB will drop it. Handling fragments is a potential RFE for
future phases of this project.

10.1 ICMP processing

The ILB code has some special handling for incoming ICMP packets destined
to one of the load balancing rules' VIPs. If the ICMP packet is an echo
request, the ILB code will reply to the request on behalf of the back-end
servers. Note that a VIP can be used in more than one rule, and an ICMP
echo message does not include enough information for ILB to decide which
rule should handle it, so the ILB code handles it itself.

If the ICMP message is "destination unreachable: fragmentation needed," the
ILB code checks the payload of the message to find out whether the message
should be forwarded to a back-end server. If the ICMP message needs to be
forwarded, the ILB code will re-write the ICMP IP header and the header
inside the ICMP message appropriately. This forwarding is possible for
rules using NAT or rules using DSR with persistence enabled. ILB will drop
all other ICMP messages destined to a VIP.

10.2 ioctl() interface

The ioctl() interface is Private to this project. Details TBD.

10.3
Interaction with other IP technologies

IPMP - The position of the interception point ensures that ILB works well
with IPMP.

IPsec - ILB cannot load balance IPsec-encrypted traffic, since ILB cannot
read the transport header.

Packet Filter Hooks - ILB does not interfere with any registered hook. For
example, it should work well with a firewall module using PF hooks. But
since ILB may modify header information, it can have unwanted interactions
with modules which also modify header information. Note that this
interaction is deterministic, since the position of the ILB interception is
fixed and the possible modification of a packet can be derived from the ILB
rules. A system administrator just needs to be careful when using ILB with
this kind of PF hooks module, such as IP Filter NAT.

11. References
---------------
1. http://www.opensolaris.org/os/project/vrrp/vrrp_design.pdf
2. http://developers.sun.com/solaris/articles/event_completion.html
3. Man pages port_get(3C), port_associate(3C), port_create(3C)
4. Man page privileges(5)
5. http://www.ietf.org/rfc/rfc3768.txt
6.
ftp://ftp.ee.lbl.gov/papers/sync_94.ps.Z

Appendix A: POC performance results
----------------------------------------

Test topology: an Ixia chassis mimicking 238 clients connects over subnet 1
to e1000g0 of the DUT (an x4200m2 acting as the L3/L4 LB); the DUT's
e1000g1 connects over subnet 2 to Ixia ports mimicking 4 back-end servers.

Hardware
---------
DUT: x4200 with e1000g NICs
Ixia details: Ixia 400T 8-port chassis with IxLoad version 3.30.42.143
Traffic: HTTP 1.0/1.1 requests
Page size for concurrent connections and CPS: 1-byte HTML file
Page size for throughput: 64-Kbyte HTML file

Performance Results
=====================
# of CPUs  Mode             CPS      Concurrent    Tput
                                     Connections   (Mbps)
========================================================
4          IPFNAT-RR        34,000   450,000       932
4          ILBNAT-RR        41,500   850,000       920
4          DSR(srcIP-hash)  -        -             2296

Appendix B: ILB Commands
--------------------------
NAME
    ilbadm - manipulate load balancing rules

SYNOPSIS
    ilbadm create-rule [-e] \
        -i vip=value,port=value[,protocol=value][,ipversion=value] \
        [,interface=ifname] \
        -m lbalg=value,type=value \
        [-h hc-name=value] -o servergroup=value name
    ilbadm list-rules [-a] [-p|-f] [-o key[,key ...]] [name ...]
    ilbadm destroy-rule -a | name ...
    ilbadm enable-rules [-t] [name ... ]
    ilbadm disable-rules [-t] [name ... ]
    ilbadm show-statistics [-thaAd] [-r rulename] [interval [count]]
    ilbadm show-nat [opts ...]
    ilbadm create-servergroup [-s server=hostspec[:portspec...]] groupname
    ilbadm destroy-servergroup groupname
    ilbadm list-servergroup [-s|-f|[-p] -o field[,field]] [name]
    ilbadm enable-server [-t] -s server=value[,value]
    ilbadm disable-server [-t] -s server=value[,value]
    ilbadm add-server -s server=value[,value ... ] name
    ilbadm remove-server -s server=value[,value ...
] name
    ilbadm create-healthcheck hc-test=value[,hc-timeout=value] \
        [,hc-count=value][,hc-interval=value][,hc-port=value] \
        hcname
    ilbadm destroy-healthcheck hcname
    ilbadm export-rules [filename]
    ilbadm import-rules [filename]
    ilbadm export-servergroups [filename]
    ilbadm import-servergroups [filename]
    ilbadm monitor filename

DESCRIPTION
    ilbadm manipulates or displays information about ILB rules using the
    subcommands outlined below.

    Rule names are case insensitive, but case is preserved as it is
    entered. Names are limited to 80 characters.

    Parsable output: all parsable output requires that the fields to be
    printed be given with the -o option. Fields will be printed in the same
    order they are encountered on the command line, separated by ':'
    characters (if there is more than one value). If this character occurs
    in the printed string itself, it will be preceded by a '\'; the same is
    done for the '\' character itself. No headers will be printed for
    parsable output.

    The synopses below contain only short options; long options are shown
    in the explanations.

Global Options:
    -t  causes the modification (rule creation ...) to be temporary: the
        modification will not be reflected in persistent storage, i.e. it
        will not persist across reboots/restarts of the daemon.

Subcommands:
    create-rule [-e] -i <match criteria> -m <handling> \
        -o <destination> [-h <hc-name>] name

        creates a rule "name" with the given characteristics. The match
        criteria and the handling keys are both specified as sets of
        "key=value" pairs. The following keys and values are valid:

        -i  introduces the matching criteria for incoming packets:
            vip         (virtual) destination IP address
            port[-port] port number or name (e.g., "telnet", "dns"); a
                        port can be specified by port number or symbolic
                        name (as in /etc/services); port ranges are also
                        supported (numeric only)
            protocol    the protocol: "TCP" (default), "UDP"
                        (see /etc/protocols)
            ipversion   IP version: "IPv4" (default), "IPv6" or "both" for
                        an unspecified vip.
            interface   optional, for the case when the interface to
                        "watch" for the VIP cannot be derived by the system

        -m  the keys describing how to handle a packet:
            lbalg       "round-robin" (default), "hash-IP",
                        "hash-IP-port", "hash-IP-VIP"
            type        aka topology: "DSR", "half-NAT", "NAT"

        -o  specifies the destinations among which packets matching the
            criteria specified with -i will be distributed:
            servergroup specify a single server group as target. The
                        server group must already have been created. If -t
                        is not used, the servergroup must also have been
                        created without -t.

        -h hc-name  specifies the name of a pre-defined health check method

        OPTIONS:
        -e  create the rule enabled (default: disabled)

        If "name" already exists, the command will fail. The command will
        also fail if a rule exists that matches the given vip.

    destroy-rule -a | name ...
        removes all information pertaining to rule "name". If "name"
        doesn't exist, the command will fail.
        -a  destroy all rules (name will be ignored)

    enable-rules [-t] [name ... ]
        enables the named rules (or all, if no names are given). Enabling
        rules that are already enabled is a no-op.

    disable-rules [-t] [name ... ]
        disables the named rules (or all, if no names are given). Disabling
        rules that are already disabled is a no-op.

    show-statistics [-thaAd] [-r rulename] [interval [count]]
        shows statistics (see examples below)

    show-nat [[-p] -o field[,field ...]] [count]
        displays NAT information (options, format TBD)

        displays "count" lines of NAT information, or a default number of
        lines if no count is given (currently 20). Specifying 0 for count
        means "all". Specifying an offset will cause the display to start
        at the specified position in the list. This offset should be less
        than the current number of elements in the table, or nothing will
        be printed. No assumptions should be made about the relative
        positions of elements in consecutive runs of this command, i.e.
        executing "show-nat 10" twice is not guaranteed to show the same 10
        items, especially on a busy system.
        -o  specifies which fields to print; legal values: in_local,
            in_global, out_local, out_global
        -p  print fields in a parsable manner (requires -o)

    list-rules [-f] [-d|-e] [[-p] -o field[,...]] [name ...]
        prints characteristics of the specified rules, or of all rules if
        none is specified.
        -o  lists fields to be printed.
        -p  print parsable output in the format explained above;
            requires -o
        -f  prints a full list.
        -e  print only enabled rules (default: all)
        -d  print only disabled rules
        -p and -f are mutually exclusive.
        For an example of the output, see examples, below.

    export-rules [filename]
        exports the complete set of rules in a way that can be re-imported
        using import-rules. Format TBD.

    import-rules [filename]
        reads rulesets from filename (or stdin) and applies them. The
        format (TBD) used will be the one created by export-rules.
        NOTE: existing rules are not destroyed first, so if a "clean slate"
        is required, rules need to be destroyed first.

    create-servergroup [-s server=hostspec[:portspec...]] \
        [-i interface=name|proxy-src=src] groupname
        creates a server group. Additional servers can later be added using
        the "add-server" subcommand. An optional server-facing interface
        can also be specified if desired. Server groups are the only entity
        that can be used during rule creation to indicate back-end servers.
        Options:
        -s  specifies a list of servers to add to the servergroup.
            hostspec: hostname|ip[-ip]
                IPv6 addresses must be enclosed in brackets "[]" to
                distinguish them from ":port"
            portspec: service|port[-port]
        -i  adds incoming options
            name: interface name (e.g. "e1000g0")
            src:  ip[-ip] (NAT only): src IP address to replace incoming
                packets' src address, or a range of addresses (if a second
                ip is given)
        For both IP and port ranges, the second value must be greater than
        the first, with IP addresses being interpreted as notated
        MSB-first. Ranges aren't supported when using hostnames.

    disable-server [-t] -s server=hostspec[:portspec ...]
        disables the given servers *for all servergroups*, i.e., if a given
        server's details are found in more than one servergroup, every one
        of these servergroups will be affected.
        -t  temporarily disable a list of servers
        -s  server (list)
            hostspec: ip|hostname (see "create-servergroup" for IPv6
                syntax rules)
            portspec: port#|service
            This is reduced from what can be given for servers with
            "create-servergroup" - we believe it only makes sense to
            disable one server at a time, or even only one port at a time,
            if one is given.
        This information is not persistent across reboots. To permanently
        remove a server from a servergroup, use "remove-server".

    enable-server [-t] -a|-s server=value[,value]
        (re)enables a disabled server with the given value. See
        "disable-server" for what information goes into "value". If no port
        is specified, all ports for the given server are enabled. See
        "disable-server" above for details on options.

    destroy-servergroup groupname
        destroys a server group.

    list-servergroup [-f|[-p] -o field[,field]] [name]
        lists a servergroup (or all, if no name is given)
        Options:
        -f  full (the default is names only)
        -o  print the specified fields
        -p  print fields in parsable format (see above); requires -o
        The options -f and -o (with or without -p) are mutually exclusive.

    add-server -s server=value[,value ...] servergroup
        adds the specified server(s) to servergroup. See
        "create-servergroup" for the definition of value and -s.

    remove-server -s server=value[,value ...] servergroup
        removes the specified server(s) from servergroup. See
        "create-servergroup" for the definition of -s.

    export-servergroups [filename]
    import-servergroups [filename]
        these subcommands behave in a fashion analogous to export-rules and
        import-rules, respectively.

    create-healthcheck hc-test=value[,hc-timeout=value][,hc-count=value] \
        [,hc-interval=value][,hc-port=value] hcname
        sets up health check information for rules to use. The hc-test is
        performed up to hc-count times until it succeeds or hc-timeout has
        expired.
        For this implementation, all servers for a rule are checked using
        the same test.
        hc-test      "PING", "TCP", or an external method (script,
                     binary ...)
        hc-timeout   time until a test is considered failed if hc-test
                     does not succeed. Optional; default TBD
        hc-count     number of attempts to run hc-test. Optional;
                     default TBD
        hc-interval  time between two tests (must be greater than
                     hc-timeout * hc-count)
        hc-port      Optional. Port to use for the test. When not used,
                     ilbd will determine which port to use.

        The following arguments are passed to external methods:
        $1 VIP (literal IPv4 or IPv6 address)
        $2 Server IP (literal IPv4 or IPv6 address)
        $3 Protocol (UDP, TCP as a string)
        $4 Numeric port
        $5 Maximum time (in seconds) the method should wait before
           returning failure. If the method runs for longer, it may be
           killed, and the test considered failed.

        External methods should return 0 for success and 255 for failure.
        All other return values are reserved for future use.

    destroy-healthcheck hcname

    monitor filename
        causes monitoring information to be appended to the named file.
        Use '-' for stdout.

Examples:

example: round-robin all DNS traffic
    ilbadm create-servergroup -s servers=dnsserver1,dnsserver2 dnsgroup
    ilbadm create-rule -e -i proto=UDP,ipversion=ipv4,vip=1.2.3.4,port=DNS \
        -m lbalg=round-robin,type=DSR \
        -o servergroup=dnsgroup dnsrule

example: add a server to the servergroup defined above:
    ilbadm add-server -s server=dnsserver22 dnsgroup

example: distribute http traffic between 4 servers
    ilbadm create-servergroup -s servers=webserv1,webserv2,webserv3 webgroup
    ilbadm add-server -s servers=webserv4 webgroup
    ilbadm create-rule -i port=80,vip=15.192.0.0,ipversion=IPv4 \
        -m lbalg=hash-IP-port,type=NAT \
        -o servergroup=webgroup webrule

example: prepare two sets of rules (notice there's an overlap here -
perhaps because 10.1.1.3 is a bigger box than the other ones):
    ilbadm create-servergroup -s servers=10.1.1.0,10.1.1.2,10.1.1.3 \
        websg
    ilbadm create-servergroup -s servers=10.1.1.3,ftpserv.our.org \
        ftpgroup
    ilbadm create-rule -e -i port=http -m lbalg=hash-IP-port,type=NAT \
        -o servergroup=websg webrule
    ilbadm create-rule -i port=ftp -m lbalg=hash-IP-port,type=NAT \
        -o servergroup=ftpgroup ftprule
    ilbadm create-rule -e -i port=ftp-data -m lbalg=hash-IP-port,type=NAT \
        -o servergroup=ftpgroup ftpdatarule

Example: print a list of rules ('$' prompt added for readability)

    $ ilbadm list-rules
    rule4
    rule3
    RULE-all

    $ ilbadm list-rules -f
    RULE     ACT IPv. PROTO VIP     PORT ALGORITHM  TYPE S.GROUP
    rule4    Y   IPv4 tcp   1.2.3.4 ftp  roundrobin DSR  ftpgroup
    rule3    N   IPv6 tcp   2003::1 ftp  roundrobin DSR  ftpgroup6
    RULE-all Y   IPv6 tcp   2002::1 http roundrobin DSR  webgrp_v6

In the following example, long lines are wrapped for easier reading:

Example: export rules (the output can be fed back to import-rules)

    $ ilbadm export-rules
    create-rule -e ipversion=IPv4,protocol=tcp,VIP=1.2.3.4,port=ftp \
        -m algorithm=roundrobin,type=DSR \
        -o servergroup=ftpgroup rule4
    create-rule ipversion=IPv6,protocol=tcp,VIP=2003::1,port=ftp \
        -m algorithm=roundrobin,type=DSR \
        -o servergroup=ftpgroup6 rule3
    create-rule -e ipversion=IPv6,protocol=tcp,VIP=2002::1,port=http \
        -m algorithm=roundrobin,type=DSR \
        -o servergroup=webgrp_v6 RULE-all

NAME
    ilbadm show-statistics

DESCRIPTION
    We define this set of kstats for the ilb project:

    module: "ilb"
    instance: 0

    class: "kstat"
    name: "global"
    statistic: "num_rules"

    class: "rulestat"
    name: <rule name>
    statistics: "create_time", "num_servers", "bytes_dropped",
                "pkt_dropped", "ip_frag_in", "ip_frag_dropped"

    class: "serverstat"
    name: <server name>
    statistics: "bytes_processed", "pkt_processed"

NAME
    ilbadm show-statistics

SYNOPSIS
    ilbadm show-statistics [-thaAd] [-r rule] [interval [count]]

    -t  print a timestamp with every header
    -d  print delta over the whole interval (default: changes per
        second)
    -a  print absolute numbers as well as deltas
    -A  print only absolute numbers (since module initialisation);
        if both -a and -A
are given, the last one takes precedence.

DESCRIPTION
    While for the most part the behaviour of show-statistics is
    intuitive and usage can be directly adapted from vmstat etc., a
    few points:

    - headers are printed once for every 10 samples. This is
      hard-coded.
    - timestamps, if chosen, are printed before the header. The
      format is fixed to the system's "date" format for the C locale.
    - currently, addition or removal of a rule is neither detected
      nor indicated.

EXAMPLES
    $ ilbadm show-statistics 1
         pkts      not               bytes      not
    processed   proc'd  dropped  processed   proc'd  dropped
          232       16        0      10286      738        0
            0        0        0          0        0        0
            0        0        0          0        0        0

    $ ilbadm show-nat
    inside:global    local            outside:local    global
    171.16.68.5:80   10.10.10.1:80    171.16.68.1:80   171.16.68.1:80
    171.16.68.5      10.10.10.1       ---              ---

The following passage is quoted from
http://www.cisco.com/en/US/tech/tk648/tk361/technologies_tech_note09186a0080094837.shtml:

==== begin quote

Cisco defines these terms as:

    * Inside local address: The IP address assigned to a host on the
      inside network. This is the address configured as a parameter
      of the computer OS or received via dynamic address allocation
      protocols such as DHCP. The address is likely not a legitimate
      IP address assigned by the Network Information Center (NIC) or
      service provider.

    * Inside global address: A legitimate IP address assigned by the
      NIC or service provider that represents one or more inside
      local IP addresses to the outside world.

    * Outside local address: The IP address of an outside host as it
      appears to the inside network. Not necessarily a legitimate
      address, it is allocated from an address space routable on the
      inside.

    * Outside global address: The IP address assigned to a host on
      the outside network by the host owner. The address is allocated
      from a globally routable address or network space.

These definitions still leave a lot to be interpreted. For this
example, this document redefines these terms by first defining local
address and global address.
Keep in mind that the terms inside and outside are NAT definitions.
Interfaces on a NAT router are defined as inside or outside with the
NAT configuration commands, ip nat inside and ip nat outside. Networks
to which these interfaces connect can then be thought of as inside
networks or outside networks, respectively.

    * Local address: A local address is any address that appears on
      the inside portion of the network.

    * Global address: A global address is any address that appears on
      the outside portion of the network.

==== end quote

Appendix C: Redundancy scenarios that ILB will be able to handle
----------------------------------------------------------------------

DSR Topology
============

                   Clients in the Internet
                              |
                              |
                         ----------
                         | ROUTER |
                         ----------
                              | 192.168.6.1
    ===================================================== 192.168.6.0/24
          |                                     |
          |      VIPs for virtual services      |
          | eth0 192.168.6.3                    | eth0 192.168.6.2
      ---------                             ---------
      |  LB1  |                             |  LB2  |
      |Primary|                             |Standby|
      ---------                             ---------
          |                                     |
          |                                     |
      ----------                           ----------
      |SWITCH 1|---------------------------|SWITCH 2|
      ----------                           ----------
               \                           /
    ===================================================== 10.0.0.0/24
          |                                     |
       Server1                               Server2

    Server IP addresses: 10.0.0.x/24
    Default router on servers = 192.168.6.1

All VIPs on the LBs are configured on interfaces facing subnet
192.168.6.0/24. LB1 runs a VRRP instance per VIP.
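One consequence of the stateless DSR algorithms is worth noting here:
server selection is a deterministic function of packet fields, so a
standby configured with the same rules maps each flow to the same
back-end server. A rough sketch of the idea in Python (illustrative
only - the function name and the crc32 hash are stand-ins, not the
in-kernel algorithm):

```python
import zlib

# Illustrative sketch only: the real ILB hash algorithms are in-kernel
# and not specified in this document; crc32 is a stand-in hash function.
def select_server(servers, src_ip, src_port=None):
    """Pick a back-end server as a pure function of packet fields
    (source IP, optionally source port), in the spirit of the
    src-IP-hash and hash-IP-port algorithms."""
    key = src_ip if src_port is None else "%s:%d" % (src_ip, src_port)
    return servers[zlib.crc32(key.encode()) % len(servers)]

servers = ["10.0.0.1", "10.0.0.2"]
# The same client always maps to the same server, on either LB:
print(select_server(servers, "192.168.6.50"))
```

Because no per-connection state is involved, nothing needs to be
synchronised between LB1 and LB2 for existing DSR flows to survive a
takeover.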
NAT Topology
============

                   Clients in the Internet
                              |
                              |
                         ----------
                         | ROUTER |
                         ----------
                              | 192.168.6.1
    ================================== 192.168.6.0/24
          |                                     |
          |      VIPs for virtual services      |
          | eth0 192.168.6.3                    | eth0 192.168.6.2
      --------                              --------
      | LB1  |                              | LB2  |
      |Master|                              |Backup|
      --------                              --------
          |                                     |
          |  Floating default gateway 10.0.0.1  |
          |                                     |
      ----------                           ----------
      |SWITCH 1|---------------------------|SWITCH 2|
      ----------                           ----------
               \                           /
    ===================================================== 10.0.0.0/24
          |                                     |
       Server1                               Server2

    Server IP addresses: 10.0.0.x/24
    Default router on servers = 10.0.0.1

All VIPs on the LBs are configured on interfaces facing subnet
192.168.6.0/24. LB1 runs a VRRP instance per VIP and for the floating
default gateway.

Failure scenario 1: LB1 is dead
Solution: LB2 will detect the failure and take over as the primary for
all the VIPs.

Failure scenario 2: LB1:eth0 and LB1:eth1 are down
Solution: LB1 continues to think it is the primary and sends VRRP
advertisements, which never reach LB2. LB2 becomes primary. So now LB1
and LB2 are *both* primary load balancers; this is fine, as nothing
from LB1 will reach the servers. When the links of LB1 are back up,
LB2 will receive the advertisements again, relinquish its role as
primary, and become a standby again.

Appendix D: Load balancer topologies
---------------------------------------

Single legged topology

                              ---------------
                              |Load Balancer|
                              ---------------
                                     |
             --------                |             ---------
 Internet----|Router|------- Local Network --------|Server1|
             --------                |             ---------
                                     |             ---------
                                     --------------|Server2|
                                                   ---------

Dual legged topology

             --------                      ---------------
 Internet----|Router|---- Local Network ---|Load Balancer|
             --------                      ---------------
                                                  |
                                           Target Network
                                              |        |
                                        ---------  ---------
                                        |Server1|  |Server2|
                                        ---------  ---------
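The takeover behaviour in both failure scenarios above is governed by
VRRP's advertisement timing (RFC 3768): a backup declares the master
down after missing advertisements for Master_Down_Interval seconds. A
small illustration of that arithmetic (the function name is ours; the
formula is from RFC 3768, section 6.1):

```python
def master_down_interval(adv_interval, priority):
    """Seconds a VRRP backup waits without hearing advertisements
    before transitioning to master (RFC 3768, section 6.1).

    adv_interval: the master's advertisement interval, in seconds
    priority:     the backup router's own priority (1-254)
    """
    skew_time = (256 - priority) / 256.0   # higher priority -> shorter wait
    return 3 * adv_interval + skew_time

# With the protocol default of 1-second advertisements and a backup
# priority of 100, LB2 takes over about 3.6 seconds after LB1 (or its
# links toward LB2) fail:
print(master_down_interval(1, 100))   # 3.609375
```

The skew term ensures that, when several backups exist, the
highest-priority one times out first and wins the election.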