1. What specifically is the proposal that we are reviewing?  

- What is the technical content of the project? 

  This project redesigns the existing attempts at IP Duplicate Address
  Detection (DAD) in Solaris and brings the system behavior in line
  with current expected norms.

  It also allows for future implementation of Apple's Rendezvous
  protocol and the IETF's "Detecting Network Attachment" features.

- Is this a new product, or a change to a pre-existing one? If it is
  a change, would you consider it a "major", "minor", or "micro" change?

  This is a Minor change to an existing product.

- If your project is an evolution of a previous project, what
  changed from one version to another?

  The previous design, which had separate and limited implementations
  of DAD scattered about the system, has grown unworkable over time.
  This new project centralizes the implementation and brings it up to
  industry standard.

- What is the motivation for it, in general as well as specific terms? 
  (Note that not everyone on the ARC will be an expert in the area.)  

  Duplicate addresses in IP can result from either administrative
  error or configuration protocol (DHCP, RARP) failure.  When
  duplicates happen, the results are usually catastrophic: the
  offender who caused the problem usually doesn't know that anything
  has gone wrong, the legitimate address holder is shut out, and the
  administrator has to look at unstable text messages in the system
  log file to know that anything's amiss.

- What are the expected benefits for Sun?

  Reduced reliance on "snoop" to discover and fix this sort of
  configuration error; system enters a predictable fail-safe mode and 
  the node that causes the problem logs useful messages.

  Reduced calls to service when networks are disrupted by accidental
  misconfiguration.

  Slight simplification for DHCPv6 client and DNA implementation.

- By what criteria will you judge its success?

  Duplicate addresses are detected and handled in some reasonable
  manner with a mix of old and new systems and with other vendors'
  boxes.

2. Describe how your project changes the user experience, upon
   installation and during normal operation.

- What does the user perceive when the system is upgraded from a
  previous release?

  Nothing.  This project affects operation only in error cases.

3. What is its plan?

- What is its current status? Has a design review been done?  Are there 
  multiple delivery phases?

  A prototype has been put together to aid detailed design.  A design
  review will be held once the design nears completion.

4. Are there related projects in Sun?  

- If so, what is the proposal's relationship to their work? Which
  not-yet-delivered Sun (or non-Sun) projects (libraries, hardware,
  etc.) does this project depend upon? What other projects, if any,
  depend on this one?

  Surya proposes to integrate the ARP module into IP.  If that work
  integrates before this project, this project will need to adapt by
  removing the inter-module interfaces used in the
  prototype.  (Discussions with the Surya project team indicate that
  their schedule is much longer than this one, and that the changes in
  this project should not affect them significantly, so this is
  unlikely to be an issue.)

  A future project to add Rendezvous into Solaris is being
  contemplated by the a13y team.  This feature will play a small part
  in that.

  Future projects to add DNA (Detecting Network Attachment) and DHCPv6
  will need to use the features provided by this project.

  The old ATM interfaces use the ARP/IP STREAMS interface via a
  contract outlined in LSARC 1993/101.  There's no chance that these
  EOL products will be enhanced with DAD support, so the existing
  interface will need to be preserved.

- Are you updating, copying or changing functional areas maintained
  by other groups?  How are you coordinating and communicating with
  them?  Do they "approve" of what you propose?  If not, please
  explain the areas of disagreement.

  As above; there's an interaction with the Surya team.  As the teams
  are divided along lines that aren't entirely functional, the
  responsibility for these bits is shared between the groups.

  We (well, *I*) have direct contact with them by email.  So far, they
  approve of the general ideas, and are on the review team.

5. How is the project delivered into the system?  

- Identify packages, directories, libraries, databases, etc.

  No new packages, directories, libraries, or other parts.  Delivered
  through enhancements to existing bits.

6. Describe the project's hardware platform dependencies.

- Explain any reasons why it would not work on both SPARC and Intel?

  None; it's fully platform-independent.

7. System administration

- How will the project's deliverables be installed and (re)configured?

  Pkgadd as a part of regular install.

- How will the project's deliverables be uninstalled?  

  Setting fire to the system.

- Does it use inetd to start itself? 
- Does it need installation within any global system tables?  
- Does it use a naming service such as NIS, NIS+ or LDAP? 

  No.

- What are its on-going maintenance requirements (e.g. Keeping global
  tables up to date, trimming files)?

  It uses regular syslog for logging, which has its own maintenance
  requirements.

  Internally, it has ARP resolution tables that are maintained by
  timers.  Only peers with which we are actively communicating are
  represented there.

- How do this project's administrative mechanisms fit into Sun's system
  administration strategies?  E.g., how does it fit under the Solaris
  Management Console (SMC) and Web-Based Enterprise Management (WBEM), how
  does it make use of roles, authorizations and rights profiles?
  Additionally, how does it provide for administrative audit in support of
  the Solaris BSM configuration?

  N/A

- What tunable parameters are exported? Can they be changed without
  rebooting the system?  Examples include, but are not limited to,
  entries in /etc/system and ndd(8) parameters. What ranges are
  appropriate for each tunable? What are the commitment levels
  associated with each tunable (these are interfaces)?

  The project introduces several ndd tunables.  They're all listed as
  Project Private for now, as we don't intend to document them for
  customer use unless they're needed to resolve specific deployment
  issues.

8. Reliability, Availability, Serviceability (RAS)

- Does the project make any material improvement to RAS?

  Yes.  It improves the ability to detect and resolve IP address
  misconfiguration, and provides a mechanism that allows true static
  ARP entries to be configured.

  Availability is enhanced by removing a possible (and unfortunately
  common) mechanism for taking down a crucial network server:
  accidental duplicate address configuration on a client.  Instead,
  the offender alone is shut down.

  Serviceability is enhanced by having the offender become aware of
  the problem he's caused (currently, no notice is provided; only the
  victim knows that something is wrong), and updating the standard
  interfaces (i.e. ifconfig(1M)) to show the status.

- How can users/administrators diagnose failures or determine
  operational state?  (For example, how could a user tell the
  difference between a failure and very slow performance?)

  Several ways:

	- The IFF_DUPLICATE flag will be set on interfaces that have
          failed due to duplicate address detection (DAD).  This flag
          will be visible in ifconfig(1M).

	- The "netstat -p" (aka "arp") output is enhanced to show
          flags representing the state of DAD on each entry.

	- The kernel will tear down misconfigured interfaces, and thus
          cause routing socket messages to be sent to interested
          clients when something has gone wrong.
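
  As a rough illustration of the first item, a monitoring script could
  look for the flag name in ifconfig(1M) output.  This is only a
  sketch: the sample lines and their numeric flag values are made up,
  the printed flag name DUPLICATE is an assumption based on the
  IFF_DUPLICATE name above, and parse_flags() and
  duplicate_interfaces() are hypothetical helpers.

```python
import re

# Hypothetical ifconfig-style lines; real output may differ.
SAMPLE = """\
hme0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
hme1: flags=4001000802<BROADCAST,MULTICAST,IPv4,DUPLICATE> mtu 1500 index 3
"""

# Matches "<name>: flags=<hex><FLAG,FLAG,...>" at the start of a line.
FLAG_LINE = re.compile(r"^(\S+): flags=[0-9a-fA-F]+<([^>]*)>", re.MULTILINE)


def parse_flags(text):
    """Return {interface name: set of flag names} for each flags line."""
    return {
        name: set(flags.split(",")) if flags else set()
        for name, flags in FLAG_LINE.findall(text)
    }


def duplicate_interfaces(text, flag_name="DUPLICATE"):
    """Return the sorted names of interfaces whose flags include flag_name."""
    return sorted(n for n, f in parse_flags(text).items() if flag_name in f)


print(duplicate_interfaces(SAMPLE))
```

  A program that needs the flag itself rather than its printed name
  would read it with SIOCGLIFFLAGS instead of parsing text.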

- What are the project's effects on boot time requirements?

  Not yet measured, but the IPv6 changes should speed boot time by
  about 1 second per interface because we no longer serialize IPv6
  interface configuration on DAD operation.

- How does the project handle dynamic reconfiguration (DR) events?

  N/A (DR handling is part of normal IP)

- What mechanisms are provided for continuous availability of service?

  N/A

- Does the project call panic()?  Explain why these panics
  cannot be avoided.

  No.

- How are significant administrative or error conditions
  transmitted? SNMP traps? Email notification?  

  Routing socket messages, interface flags (that can be read by
  SIOCGLIFFLAGS), and syslog.  (Sysevents are possible and under
  consideration.)

- How does the project deal with failure and recovery?

  The project is all about failure and recovery.  It makes sure that
  when a failure occurs, we limit the damage to as few systems as
  possible -- ideally just one.

- Does it ever require reboot?  If so, explain why this situation
  cannot be avoided.

  No.

- How does your project deal with network failures (including
  partition and re-integration)?  How do you handle the failure
  of hardware that your project depends on?

  For regular hardware failure, we restart DAD when the hardware comes
  back.  (Failed hardware cannot cause duplicate addresses on the
  network, so there's no good reason to restart DAD before repair.)

  Silent hardware failure, if it's possible, appears as a false DAD
  success.  This is no different from the situation today: there's no
  way to tell if the reason you're not hearing from anyone is because
  you're broken, they're broken, or they just don't exist.  The
  implication is that the local system *may* have a duplicate address
  if the network itself is broken or partitioned.

  For network partition and reintegration, we specifically handle this
  by defending our chosen addresses periodically, and scaling the
  defense based on the difficulty of obtaining a new address.  (I.e.,
  we try harder for static addresses than for DHCP or IPv6 temporary
  addresses.)
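
  One possible reading of that scaling can be sketched as follows.
  The address classes come from the text above, but the numeric
  budgets and the names DEFEND_ATTEMPTS and should_keep_defending()
  are invented for illustration; the real policy belongs to the
  design document.

```python
# Hypothetical policy: how many conflicts we tolerate before giving up
# an address.  Addresses that are harder to replace get more defense.
DEFEND_ATTEMPTS = {
    "static": 10,     # administrator-assigned; hard to replace
    "dhcp": 3,        # can ask the server for another lease
    "temporary": 1,   # IPv6 temporary address; cheap to regenerate
}


def should_keep_defending(addr_type, conflicts_seen):
    """True while the conflicts seen so far fit within the class's budget."""
    return conflicts_seen <= DEFEND_ATTEMPTS[addr_type]


print(should_keep_defending("static", 5))     # within the static budget
print(should_keep_defending("temporary", 2))  # over the temporary budget
```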

- Can it save/restore or checkpoint and recover?  

  N/A

- Can its files be corrupted by failures?  Does it clean up any
  locks/files after crashes?

  N/A

9. Observability

- Does the project export status, either via observable output
  (e.g., netstat) or via internal data structures (kstats)?

  "netstat -p" and equivalent "arp" output, as well as "ifconfig"
  output.

- How would a user or administrator tell that this subsystem
  is or is not behaving as anticipated? 

  Observing ifconfig output or system log messages.

- What statistics does the subsystem export, and by what mechanism?

  N/A (There probably should be a 'netstat -s' section for ARP, but
  there currently isn't one.)

- What state information is logged?

  Initial detection of a duplicate address and eventual interface
  failure caused by a duplicate are logged separately.

- In principle, would it be possible for a program to tune the
  activity of your project?

  Yes.

10. What are the security implications of this project?

- What security issues do you address in your project?

  This project adds "permanent" ARP entries, a long sought-after
  security-like feature.

  (Neither ARP nor Ethernet has any real security, so it's illusory
  at best, but many people seem to think that nailing down ARP
  provides "security."  This project humors them.)

- The Solaris BSM configuration carries a Common Criteria (CC) Controlled
  Access Protection Profile (CAPP) -- Orange Book C2 -- and a Role Based
  Access Control Protection Profile (RBAC) -- rating, does the addition
  of your project affect this rating?  E.g., does it introduce interfaces
  that make access or privilege decisions that are not audited, does it
  introduce removable media support that is not managed by the allocate
  subsystem, does it provide administration mechanisms that are not audited?

  No.

- Is system or subsystem security compromised in any way if your
  project's configuration files are corrupt or missing?

  No.

- Please justify the introduction of any (all) new setuid executables.

  None.

- Include a thorough description of the security assumptions,
  capabilities and any potential risks (possible attack points) being
  introduced by your project.  A separate Security Questionnaire
	http://sac.sfbay/cgi-bin/bp.cgi?NAME=Security.bp
  is provided for more detailed guidance on the necessary information.
  Cases are encouraged to fill out and include the Security
  questionnaire (leveraging references to existing documentation) in the
  case materials.

   Projects must highlight information for the following important areas:
   - What features are newly visible on the network and how are they
     protected from exploitation (e.g. unauthorized access, eavesdropping)
   - If the project makes decisions about which users, hosts, services, ...
     are allowed to access resources it manages, how is the requestor's
     identity determined and what data is used to determine if the access
     granted.  Also how this data is protected from tampering.
   - What privileges beyond what a common user (e.g. 'noaccess') can 
     perform does this project require and why those are necessary.
   - What parts of the project are active upon default install and how it 
     can be turned off.

  N/A -- system security posture is essentially the same as before.
  Neither ARP nor NDP has any real security.

11. What is its UNIX operational environment:

- Which Solaris release(s) does it run on?

  Nevada (Solaris 11)

- Environment variables? Exit status? Signals issued? 
  Signals caught? (See signal(3HEAD).)
- Device drivers directly used (e.g. /dev/audio)?
  .rc/defaults or other resource/configuration files or databases?
- Does it use any "hidden" (filename begins with ".") or temp files?  
- Does it use any locking files? 
- Command line or calling syntax:  
  What options are supported?  (please include man pages if available)
  Does it conform to getopt() parsing requirements? 
- Is there support for standard forms, e.g. "-display" for X programs? 
  Are these propagated to sub-environments?
- What shared libraries does it use?  (Hint: if you have code use "ldd"
  and "dump -Lv")? 
- Identify and justify the requirement for any static libraries.
- Does it depend on kernel features not provided in your packages and
  not in the default kernel (e.g. Berkeley compatibility package,
  /usr/ccs, /usr/ucblib, optional kernel loadable modules)?

  N/A

- Is your project 64-bit clean/ready?  If not, are there any 
  architectural reasons why it would not work in a 64-bit environment?
  Does it interoperate with 64-bit versions?

  Yes, fully 64-bit clean.

- Does the project depend on particular versions of supporting software 
  (especially Java virtual machines)?  If so, do you deliver a private
  copy?  What happens if a conflicting or incompatible version is
  already or subsequently installed on the system?

  No.

- Is the project internationalized and localized?

  N/A

- Is the project compatible with IPV6 interfaces and addresses?

  Yes.

12. What is its window/desktop operational environment?

- Is it ICCCM compliant (ICCCM is the standard protocol for
  interacting with window managers)?
- X properties: Which ones does it depend on?  Which ones does it
  export, and what are their types?
- Describe your project's support for User Interface facilities
  including Help, Undo, Cut/Paste, Drag and Drop, Props, Find, Stop?
- How do you respond to property change notification and ICCCM client
  messages (e.g. Do you respond to "save workspace")?
- Which window-system toolkit/desktop does your project depend on?
- Can it execute remotely? Is the user aware that the tool is
  executing remotely?  Does it matter?
- Which X extensions does it use (e.g. SHM, DGA, Multi-Buffering)?
  (Hint: use "xdpyinfo")
- How does it use colormap entries?  Can you share them? 
- Does it handle 24-bit operation?

  All N/A.

13. What interfaces does your project import and export?

- Please provide a table of imported and exported interfaces,
  including stability levels.  Pay close attention to the
  classification of these interfaces in the Interface Taxonomy --
  e.g., "Standard," "Stable," and "Evolving;" see:

	http://sac.sfbay/cgi-bin/bp.cgi?NAME=interface_taxonomy.bp


			Interfaces Exported
Interface 		Classification 		Comments
ARP/IP STREAMS msgs	Contracted Consolidation Private
						LSARC 1993/101
ndd tunables		Project Private		May become Unstable
ARP Probe		Standard		RFC 3927*

[*] final implementation _may_ differ in documented and compatible
ways from RFC-recommended procedure.
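
For readers unfamiliar with the ARP Probe named in the table, the
sketch below builds the 28-byte ARP payload that RFC 3927 describes:
an ARP request whose sender IP address is all zeros and whose target
IP address is the address being probed.  The function name
build_arp_probe() and the example addresses are hypothetical, and per
the footnote above the final implementation may differ from the
RFC-recommended procedure.

```python
import socket
import struct


def build_arp_probe(sender_mac: bytes, probed_ip: str) -> bytes:
    """Build the 28-byte ARP payload of an RFC 3927 ARP Probe."""
    if len(sender_mac) != 6:
        raise ValueError("Ethernet MAC address must be 6 bytes")
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6, 4,         # hardware / protocol address lengths
        1,            # opcode: request
        sender_mac,   # sender hardware address: our own MAC
        b"\x00" * 4,  # sender IP all zeros marks this as a probe
        b"\x00" * 6,  # target hardware address: zeros
        socket.inet_aton(probed_ip),  # target IP: address being probed
    )


# Example addresses are made up.
probe = build_arp_probe(bytes.fromhex("080020aabbcc"), "192.168.1.10")
print(probe.hex())
```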


- Exported public library APIs and ABIs
  Protocols (public or private)
  Drag and Drop
  ToolTalk
  Cut/Paste
- Other interfaces

  N/A

- What other applications should it interoperate with?  How will it do so?

  It must interoperate with the ATM "ba" (SAHI3) module until that
  product's EOL is completed.  (See LSARC 1993/101.)

- Is it "pipeable"?  How does it use stdin, stdout, stderr?
- Explain the significant file formats, names, syntax, and semantics.
- Is there a public namespace? (Can third parties create names in your 
  namespace?)  How is this administered?

  N/A

- Are the externally visible interfaces documented clearly enough for a
  non-Sun client to use them successfully?

  Yes.

14. What are its other significant internal interfaces
    (inter-subsystem and inter-invocation)?

- Protocols (public or private)
- Private ToolTalk usage
- Files
- Other
- Are the interfaces re-entrant?

    A private STREAMS-based interface is used between ARP and IP.  A
    new message is introduced that allows the enhanced ARP to detect
    that the client is IP -- if not, it disables DAD.  The new message
    also allows IP to detect if ARP is enhanced.  If it's not (i.e.,
    the old ATM module is in use), then it falls back to the previous
    behavior.
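
    The detection logic can be modeled in a few lines.  This is a toy
    model, not the STREAMS code: the message names ENHANCED_HELLO and
    ENHANCED_ACK, the explicit acknowledgement, and the class names
    are all assumptions made for the sketch.

```python
class OldArp:
    """Stands in for a legacy module; unknown messages are dropped."""

    def handle(self, msg):
        return None


class EnhancedArp:
    """Stands in for the enhanced ARP; DAD stays off until IP announces itself."""

    def __init__(self):
        self.dad_enabled = False

    def handle(self, msg):
        if msg == "ENHANCED_HELLO":   # hypothetical message name
            self.dad_enabled = True   # the client is IP, so DAD can run
            return "ENHANCED_ACK"     # hypothetical reply
        return None


def ip_negotiate(arp):
    """Return True if IP may rely on enhanced ARP behavior."""
    return arp.handle("ENHANCED_HELLO") == "ENHANCED_ACK"


print(ip_negotiate(EnhancedArp()))
print(ip_negotiate(OldArp()))
```

    In the model, a legacy module never answers, so IP keeps its
    previous behavior, and an enhanced ARP that never sees the
    announcement leaves DAD disabled.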

    A significant change is that ARP now keeps a copy of newly
    resolved entries, rather than just deleting them when sending the
    update to IP as the old code did.  This change was forced by the
    need to mitigate denial of service attacks by ARP hurricane.
    Previously, the code just discarded everything unrecognized.
    Since some unrecognized things now imply changes that IP must
    handle, this requires that ARP must know when to send a message to
    IP and when to avoid it.
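
    A minimal sketch of that decision, assuming a simple rule set: a
    packet is passed up only if it claims one of our own addresses or
    changes a mapping ARP already holds.  The class ArpFilter, its
    method names, and the exact rules are hypothetical, and the sketch
    assumes the packet did not come from the local system.

```python
class ArpFilter:
    """Toy model of ARP deciding whether IP needs to see a packet."""

    def __init__(self, local_ips=()):
        self.local_ips = set(local_ips)
        self.cache = {}  # resolved entries kept by ARP: ip -> mac

    def on_resolved(self, ip, mac):
        """Keep a copy of a newly resolved entry instead of deleting it."""
        self.cache[ip] = mac

    def on_packet(self, sender_ip, sender_mac):
        """Return 'notify' if IP must handle the packet, else 'discard'."""
        if sender_ip in self.local_ips:
            return "notify"            # someone else claims our address
        known = self.cache.get(sender_ip)
        if known is None:
            return "discard"           # unrecognized and not relevant
        if known != sender_mac:
            self.cache[sender_ip] = sender_mac
            return "notify"            # an existing mapping changed
        return "discard"               # nothing new to report


f = ArpFilter(local_ips=["10.0.0.1"])
f.on_resolved("10.0.0.2", "08:00:20:00:00:02")
print(f.on_packet("10.0.0.9", "08:00:20:00:00:09"))
print(f.on_packet("10.0.0.1", "08:00:20:00:00:99"))
```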

    Low-level testing on a SPARC blade system shows about 22
    microseconds between 'putnext' entry from the driver to return if
    ARP does the discard, and 96 microseconds if IP must discard it.
    Since the existing system gets by with the 22 microsecond time,
    that's the reference goal.
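
    To see why the difference matters, the per-packet times can be
    turned into a CPU fraction.  The packet rate of 10,000 per second
    is an invented example, and the calculation assumes all of the
    work lands on a single CPU; cpu_fraction() is a hypothetical
    helper.

```python
def cpu_fraction(packets_per_second, microseconds_per_packet):
    """Fraction of one CPU spent discarding packets at the given rate."""
    return packets_per_second * microseconds_per_packet / 1_000_000


RATE = 10_000  # hypothetical hurricane rate, packets per second

for label, cost in (("ARP discards", 22), ("IP discards", 96)):
    print(f"{label}: {cpu_fraction(RATE, cost):.0%} of one CPU")
```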

    One benefit of this change is that it's no longer necessary to
    send ARP messages on the wire for the cases where IP "forgets"
    about resolved entries, such as SO_DONTROUTE.

15. Is the interface extensible?  How will the interface evolve?

- How is versioning handled?  

    A new message announcing the enhanced abilities is used.  Old
    modules will discard the message.

- What was the commitment level of the previous version? 

    Contracted Consolidation Private

- Can this version co-exist with existing standards and with earlier
  and later versions or with alternative implementations (perhaps by
  other vendors)?

    Yes.

- What are the clients over which a change should be managed?

    ARP, IP, and ATM 'ba' driver.

- How is transition to a new version to be accomplished? What are the 
  consequences to ISVs and their customers?

    As documented elsewhere; the new message allows the old code to
    run.  No consequences to ISVs, as the interfaces were never made
    public.

16. How do the interfaces adapt to a changing world?

- What is its relationship with (or difficulties with) multimedia?
  3D desktops? Nomadic computers?  Storage-less clients? A networked
  file system model (i.e., a network-wide file manager)?

  N/A

17. Interoperability

- If applicable, explain your project's interoperability with the
  other major implementations in the industry.  In particular, does
  it interoperate with Microsoft's implementation, if one exists?

  Implementation is interoperable with Microsoft, Cisco, and Apple (at
  least).  Others will be tested.

  The implementation will likely differ from the newer link-local RFC
  in several ways due to our desire to reduce CPU usage (the RFC
  recommends a strategy that results in excess broadcasting) and to
  deal with network partitioning and repair (the RFC places no bounds
  on detecting this problem).  The differences will be documented and
  will be compatible with RFC-compliant systems.

- What would be different about installing your project in a
  heterogeneous site instead of a homogeneous one (such as Sun)?

  Nothing.

- Does your project assume that a Solaris-based system must be in
  control of the primary administrative node?

  No.

18. Performance

- How will the project contribute (positively or negatively) to
  "system load" and "perceived performance"?

  Configuring IPv6 interfaces via ifconfig will now be nearly
  instantaneous, rather than requiring a 1-second delay inside
  ifconfig itself.

- What are the performance goals of the project?  How were they
  evaluated? What is the test or reference platform?

  . Boot time for the "desktop system" must be as fast as it was
    before the change.

  . CPU load under ARP "hurricane" attack must be the same or less
    than before.

- Does the application pause for significant amounts of time?  
  Can the user interact with the application while it is performing
  long-duration tasks?

  N/A

- What is your project's MT model? How does it use threads internally? 
  How does it expect its client to use threads?  If it uses callbacks,
  can the called entity create a thread and recursively call back?

  IP itself is fully MT-hot.  ARP uses a single big lock around the
  module.  This project does not change that design.

- What is the impact on overall system performance?  What is the
  average working set of this component?  How much of this is
  shared/sharable by other apps?

  This project will cause ARP to cache more entries than it did
  before.  It's expected that this is at most on the order of hundreds
  of entries even on the worst-managed networks, and not more.

- Does this application "wake up" periodically?  How often and under
  what conditions?  What is the working set associated with this behavior?  

  Yes, it wakes up to defend idle addresses periodically.  This is
  driven by the existing ARP module kernel timer.

- Will it require large files/databases (for example, new fonts)?

  No.

- Do files, databases or heap space tend to grow with time/load? What 
  mechanisms does the user have to use to control this? What happens to 
  performance/system load?

  No.

19. Please identify any issues that you would like the ARC to address.

- Interface classification, deviations from standards, architectural
  conflicts, release constraints...
- Are there issues or related projects that the ARC should advise the 
  appropriate steering committees?

  N/A

20. Appendices to include

- One-Pager.
- Prototype specification.
- References to other documents.  (Place copies in case directory.)

  Draft design document (version 1.1); unreviewed.