*** spec.txt.1	Wed Dec  3 13:37:48 2008
--- PSARC.0107/spec.txt.new	Tue Jan 20 00:49:43 2009
***************
*** 3,11 ****
  This case will extend PSARC/2005/334, by adding the ability to intercept
  packets in MAC layer using the PFHooks infrastructure.
  
! This case only makes two changes, an addition, to the interfaces that were
! committed to by PSARC/2008/219 (see "hook_pkt_event_t" and "new NIC event"
! below for details)
  
  Release Biding
  --------------
--- 3,11 ----
  This case will extend PSARC/2005/334, by adding the ability to intercept
  packets in MAC layer using the PFHooks infrastructure.
  
! This case makes only a few changes, all additions, to the interfaces that were
! committed to by PSARC/2008/219 (see "net_getlifaddr", "hook_pkt_event_t",
! and "new NIC event" below for details).
  
  Release Biding
  --------------
***************
*** 75,80 ****
--- 75,125 ----
  hpe_mp  - points to the mblk_t that is the start of the packet.
  hpe_hpeinfo - points to mac_header_info_t which contains MAC header information
  
+ net_getlifaddr
+ --------------
+ The net_getlifaddr() function returns the address for a given interface.
+ For existing IP netinfo it returns the IP address, and for MAC layer netinfo
+ it returns the MAC address of the interface. The usage of this function
+ is slightly different in the two situations.
+ 
+ int net_getlifaddr(const net_data_t net, const phy_if_t ifp,
+     const net_if_t lif,  int const type, struct sockaddr *storage);
+ 
+     net
+          value returned from a successful call to net_protocol_lookup.
+ 
+     ifp
+          value returned from a successful call to net_phylookup
+          or net_phygetnext, indicating which network interface
+          the information should be returned from.
+ 
+     lif
+          indicating which logical interface to fetch the address from.
+ 
+     type
+          this indicates what type of address should be returned.
+ 
+     storage
+          pointer to an area of memory to store the address data.
+ 
+ This case introduces a slightly different usage for this function
+ when used to retrieve MAC layer information. Unlike IP, MAC has no
+ concept of a logical interface, so the caller should pass in the
+ physical interface as ifp, and pass in 0 as the lif because
+ there is no valid lif for MAC.
+ 
+ Each call to net_getlifaddr requires that the caller pass in
+ a pointer to an array of address information types to retrieve
+ and an accompanying pointer to an array of pointers to struct
+ sockaddr_dl structures into which the address information is
+ copied. See below for an example of how to use this function.
+ 
+ Each member of the address type array should be one of NA_ADDRESS,
+ NA_PEER, NA_BROADCAST or NA_NETMASK, and it is up to each layer 2
+ protocol to implement the address type. For Ethernet, NA_ADDRESS
+ and NA_BROADCAST are supported, and NA_BROADCAST always returns
+ ff:ff:ff:ff:ff:ff.
+ 
  hook_pkt_event_t
  ----------------
  In order to intercept IP packets at MAC layer, IPFilter needs to know the 
***************
*** 87,96 ****
  While adding a header length field to hook_pkt_info_t solves the problem above,
  down the road we may want to provide the ability to match wifi header, which
  requires information of the wifi header fields in IPFilter, not just the header
! length, thus we propose to add a pointer to hook_pkt_info_t, which points at
  a structure of mac_header_info_t, and pass this through the Hook framework,
  so the hook consumers, like IPFilter, can have the needed information for
! the MAC header. The new hook_pkt_info_t strucuture would look like:
  
  typedef struct hook_pkt_event {
          net_handle_t            hpe_protocol;
--- 132,141 ----
  While adding a header length field to hook_pkt_info_t solves the problem above,
  down the road we may want to provide the ability to match wifi header, which
  requires information of the wifi header fields in IPFilter, not just the header
! length, thus we propose to add a pointer to hook_pkt_event_t, which points at
  a structure of mac_header_info_t, and pass this through the Hook framework,
  so the hook consumers, like IPFilter, can have the needed information for
! the MAC header. The new hook_pkt_event_t structure would look like:
  
  typedef struct hook_pkt_event {
          net_handle_t            hpe_protocol;
***************
*** 108,148 ****
  For existing IP/ARP Hooks, the header format is self explained, so hpe_hdrinfo
  will be NULL and IPFilter does the header parsing itself as before.
  
  
! Name to interface resolution
! ----------------------------
! After Clearview UV all the data link related operations use link names, this
! applies to IPFilter as well. When the administrator wants to specify a rule
! that works on certain interface, link name is used to specify which interface
! this rule applies to. So link name consititutes the interface name for
! MAC layer netinfo.
  
- Since layer 2 filtering is based on the MAC client which Crossbow project is 
- introducing, in this project we'll introduce the MAC client index as the
- MAC layer interface pointer, to uniquely indentify a MAC layer interface
- in the kernel. This is similar to the existing ifindex that is used as IP
- layer interface pointer today.
- 
- Netinfo provides functions to translate from a interface name (link name)
- to the corresponding interface pointer (MAC client index) and back, via
- net_phylookup() and net_getifname(). And these functions can be called in
- data path so the existing procedures such as dls_mgmt_get_linkid() and
- dls_mgmt_get_linkinfo() cannot be used as they involve door calls.
- Thus we propose to add a link name <-> link id hash table in dls, and provide
- the following routines to translate between link name and mac name. The MAC
- layer netinfo will use these routines to implement mapping between link name
- and MAC client index.
- 
- +------------------------------------------------------------+
- | Interface                                 | Classification |
- |------------------------------------------------------------|
- | dls_devnet_macname2linkname(const char *, |                |
- |     char *, const size_t);                | consolidation  |
- | dls_devnet_linkname2macname(const char *, | private        |
- |     char *, const size_t);                |                |
- +------------------------------------------------------------+
-      Table: Fuctions for link name and mac name mapping
- 
  new NIC event
  -------------
  The status of network in the operating system often changes, from unplugging
--- 153,171 ----
  For existing IP/ARP Hooks, the header format is self explained, so hpe_hdrinfo
  will be NULL and IPFilter does the header parsing itself as before.
  
+ MAC client index
+ ----------------
+ L2 filtering is based on the MAC client, which is introduced by the Crossbow
+ project, and the filtering is done on a per MAC client basis. When users
+ specify a link name "net0", this corresponds to the traffic going through the
+ primary MAC client of net0, e.g. IP on top of that data link.
  
! The MAC client index is introduced in this project. It uniquely identifies
! a MAC client and is used by the layer 2 netinfo interface in the same way
! as the ifindex is used by the IP netinfo interface. Layer 2 netinfo
! provides the mapping between a data link name and the index of the primary
! MAC client of that data link, through net_getifname() and net_phylookup().
  
  new NIC event
  -------------
  The status of network in the operating system often changes, from unplugging
***************
*** 254,259 ****
--- 277,351 ----
  API project, and it will register the hook for that protocol so it can
  receive and match wifi packets.
  
+ Dynamic data path modification
+ ------------------------------
+ To make sure layer 2 filtering has no performance impact when disabled,
+ instead of inserting hook checks into the fast path code, we make use of
+ the function pointer driven approach provided by Crossbow where possible.
+ On the RX side the layer 2 filter implements its own receive function, and
+ replaces the default function with it when l2 filtering is enabled.
+ The l2 filter specific receive function does layer 2 firewall processing
+ before calling the original receive function, so when the l2 filter is
+ disabled there is zero additional processing on the RX path. On the TX side
+ the layer 2 filter forces packets off the fast path when filtering is
+ enabled, and adds the hook checks to the non fast path to avoid impacting
+ performance. In both cases the data path is modified dynamically when
+ filtering is enabled/disabled, and this is done on a per MAC client basis.
+ 
+ To do this a function needs to be called when the first hook is registered
+ on a specific hook event, and when the last hook is unregistered from the
+ event, to do the necessary data path setup. The hook_event_t structure is
+ changed to accommodate this so that hook providers, MAC plugins in this case,
+ can specify their own callbacks, which will be called from hooks_register/
+ hook_unregister().
+ 
+ typedef struct hook_event_s {
+          int             he_version;
+          char            *he_name;       /* name of this hook list */
+          int             he_flags;       /* 1 = multiple entries allowed */
+          boolean_t       he_interested;  /* true if callback exist */
+ +        void            (*he_enable_cb)(hook_event_token_t, hook_event_t *,
+ +                            void *);
+ +        void            (*he_disable_cb)(hook_event_token_t, hook_event_t *,
+ +                            void *);
+ +        void            *he_arg_cb;
+  } hook_event_t;
+ 
+ he_arg_cb points at a mactype_t structure, to identify which MAC plugin the
+ hook is registered on, as l2 filtering is enabled/disabled per MAC plugin.
+ The two callback functions, pointed to by he_enable_cb and he_disable_cb,
+ walk through the MAC clients in the system and do the necessary data path
+ setup/cleanup for the corresponding MAC clients, which are the primary MAC
+ clients on top of data links of the specific MAC plugin.
+ 
+ Relative Hooks ordering
+ =======================
+ Order with Bridging
+ -------------------
+ L2 filtering is done on a per MAC client basis. When users specify "net0",
+ this refers to the traffic going through the primary MAC client of net0, for
+ example IP on top of that data link. This is different from all traffic going
+ through the physical MAC instance, which is shared by multiple MAC clients.
+ L2 hooks intercept both traffic from/to the wire and traffic that passes
+ between multiple MAC clients defined on top of the same data link.
+ 
+ With regard to bridging, the L2 filter works on top of the bridge, instead of
+ underneath it, as the filtering is based on MAC clients instead of the MAC
+ instance that the bridge uses. This means in certain cases L2 hooks are not
+ able to see the actual interface used for transmit or receive, but only the
+ interface that the network layer believes it's using, as when IP sends a
+ packet on one interface, the bridge may end up transmitting that packet
+ on another interface - if that's the interface on which the destination
+ exists or if the destination is unknown. This is by design, as l2 filtering
+ aims more at controlling what packets a VM can send to the wire via the data
+ link it is using, instead of which physical link the packets actually get
+ sent out from.
+ 
+ Order with bandwidth limit
+ --------------------------
+ L2 filter sits underneath bandwidth shaping by Crossbow. On RX side, the
+ filtering is done before the bandwidth limit is applied; on TX side, it
+ is applied after the bandwidth limit. 
+ 
  IPFilter changes
  ================
  Users can use ipf(1M) to add ethernet filtering rules in addition to IP 
***************
*** 284,290 ****
  
  [INPUT] -> L2 firewall -> "layer2" IP NAT -> "layer2" IP firewall ->
  ... -> IP NAT -> IP firewall -> { IP }  -> IP firewall -> IP NAT -> ...
! -> L2 firewall -> "layer2" IP firewall -> "layer2" IP NAT -> [OUTPUT]
  
  Input processing
  ~~~~~~~~~~~~~~~~
--- 376,382 ----
  
  [INPUT] -> L2 firewall -> "layer2" IP NAT -> "layer2" IP firewall ->
  ... -> IP NAT -> IP firewall -> { IP }  -> IP firewall -> IP NAT -> ...
! -> "layer2" IP firewall -> "layer2" IP NAT -> L2 firewall -> [OUTPUT]
  
  Input processing
  ~~~~~~~~~~~~~~~~
***************
*** 308,325 ****
  When the packet reaches IP, IP layer filtering/NAT processing is invoked,
  and it works just as it does today.
  
- Design considerations
- ~~~~~~~~~~~~~~~~~~~~~
- This processing order is designed so that
- 
- - The processing order between IP NAT rules and IP Filtering rules is 
- consistent with existing IPFilter today;
- 
- - Since MAC level filtering rules is processed before layer2 IP rules,
- down the road it is possible to combine filtering at both level together,
- allow or block a packet based on a mixture of L2 and L3 criteria, thus
- providing more fine grained control.
- 
  Changes to output
  -----------------
  With layer 2 filtering, each type of rules have its own distinct orders,
--- 400,405 ----
***************
*** 423,450 ****
  
  Interfaces
  ==========
! +----------------------------------------+----------------+
! | Interface                              | Classification |
! +----------------------------------------+----------------+
! | dls_devnet_macname2linkname            |     Private    |
! | dls_devnet_linkname2macname            |     Private    |
! +----------------------------------------+----------------+
! | NE_NAME_CHANGE			 |    Committed   |
! | NHF_ETHER				 |    Committed   |
! | NHF_WIFI				 |    Committed   |
! | NHF_IB	 			 |    Committed   |
! +----------------------------------------+----------------+
! | "ipfilter_hook_eth_in" 		 |   Uncommitted  |
! | "ipfilter_hook_eth_out" 		 |   Uncommitted  |
! | "ipfilter_hook_wifi_in" 		 |   Uncommitted  |
! | "ipfilter_hook_wifi_out" 		 |   Uncommitted  |
! | "ipfilter_hook_ib_in" 		 |   Uncommitted  |
! | "ipfilter_hook_ib_out" 		 |   Uncommitted  |
! +----------------------------------------+----------------+
! | "family ether"			 |     Volatile   |
! | "layer2"			 	 |     Volatile   |
! +----------------------------------------+----------------+
! | IPN_LAYER2				 |     Volatile   |
! | /usr/include/netinet/ip_fil.h		 |   Uncommitted  |
! | /usr/include/netinet/ip_nat.h		 |   Uncommitted  |
! +----------------------------------------+----------------+
--- 503,530 ----
  
  Interfaces
  ==========
! +----------------------------------------+-------------------+
! | Interface                              |  Classification   |
! +----------------------------------------+-------------------+
! | NE_NAME_CHANGE                         |     Committed     |
! | NHF_ETHER                              |     Committed     |
! | NHF_WIFI                               |     Committed     |
! | NHF_IB                                 |     Committed     |
! | <sys/hook.h>                           |     Committed     |
! | <sys/hook_event.h>                     |     Committed     |
! | <sys/neti.h>                           |     Committed     |
! +----------------------------------------+-------------------+
! | "ipfilter_hook_eth_in"                 |    Uncommitted    |
! | "ipfilter_hook_eth_out"                |    Uncommitted    |
! | "ipfilter_hook_wifi_in"                |    Uncommitted    |
! | "ipfilter_hook_wifi_out"               |    Uncommitted    |
! | "ipfilter_hook_ib_in"                  |    Uncommitted    |
! | "ipfilter_hook_ib_out"                 |    Uncommitted    |
! +----------------------------------------+-------------------+
! | "family ether"                         |     Committed     |
! | "layer2"                               | Obsolete Volatile |
! +----------------------------------------+-------------------+
! | IPN_LAYER2                             |      Volatile     |
! | /usr/include/netinet/ip_fil.h          |    Uncommitted    |
! | /usr/include/netinet/ip_nat.h          |    Uncommitted    |
! +----------------------------------------+-------------------+
