PSARC Questions
Version 1.17

Here is a comprehensive list of questions that ARC members may ask of
project presenters. Providing this information in advance of your review
will greatly simplify the ARC's process of identifying the critical
information relevant to your project.

It is expected that many of these issues may be unresolved at the time of
an inception review. However, they should be answerable at commitment
review, and may be addressed at inception if significant.

Please make your answers concise! Most if not all of these questions will
be addressed in related documents such as 1-pagers, specs, design
documents, etc. There is no reason to duplicate effort, and "see section
3.2 in the design spec" is an excellent answer. Of course, the referenced
material must be provided with your submission.

Entire sets of questions may be N/A for your project. For example, device
drivers rarely have GUIs, and so the entire GUI section can just be
deleted. In such cases, PLEASE NOTE N/A FOR THE MAIN QUESTION, AND DELETE
THE REST OF THAT QUESTION SET.

This questionnaire is meant to provide the ARC with an overview of your
project, and it touches on the main areas of architectural interest. This
template will be revised based on its users' experiences; your comments
and suggestions are welcome. Send them to john.plocher@sun.com.

For advice about architectural choices, pointers to various SAC
guidelines, and other project considerations including Licensing and
Patents, see
    http://sac.sfbay/arc/ARC-Considerations.html

------------------------------------------------------------------------

1. What specifically is the proposal that we are reviewing?
    - What is the technical content of the project?

        This project introduces a new backup and restore method into
        ndmpd, zfs send and receive. It also introduces support for ZFS
        volume backup and restore, also using send and receive.

    - Is this a new product, or a change to a pre-existing one?
      If it is a change, would you consider it a "major", "minor", or
      "micro" change? See the Release Taxonomy in:

        This is a change to the existing ndmpd. Though the change does
        not directly impact current functionality (as it is an addition
        of a backup method), it may be considered a "major" change.

        The new backup/restore method will be made available for versions
        3 and 4 of the NDMP protocol (ndmpd currently supports versions
        2, 3, and 4).

    - If your project is an evolution of a previous project, what changed
      from one version to another?

        N/A

    - What is the motivation for it, in general as well as specific
      terms? (Note that not everyone on the ARC will be an expert in the
      area.)

        See one-pager as well as "zfs-based ndmpd backup" (henceforth
        "functional specification"), Section 2.

    - What are the expected benefits for Sun?

        See one-pager as well as functional specification, Section 2.

    - By what criteria will you judge its success?

        Usage in the field as a practical backup method.

2. Describe how your project changes the user experience, upon
   installation and during normal operation.

        There should be no difference upon installation. A new backup
        type will be available to the user. See functional specification
        for details.

    - What does the user perceive when the system is upgraded from a
      previous release?

        No difference except in the size of the ndmpd binary or other
        similar details of the SMF service, assuming the NDMP service is
        already available on the prior release.

3. What is its plan?
    - What is its current status? Has a design review been done? Are
      there multiple delivery phases?

        Two phases are planned for integration. The development work for
        the first phase (modulo several issues being investigated) is
        complete, and QC testing is underway. This includes testing with
        key data management applications (DMAs) certified to work with
        Solaris ndmpd.
        An initial design review was done early on and a reviewed
        functional specification (covering both phases) is available (and
        included with the case materials). Development work for the
        second phase is in progress.

        Additionally, we plan to speak with DMA vendors regarding the new
        backup type, though this may not occur prior to integration. Such
        conversations may occur once more extensive ISV testing is
        underway post-integration.

        The functional specification describes the interface for the
        initial release (covering both phases). Enhancements are expected
        to be made in subsequent releases.

        Regarding the two phases, the first phase will include all
        functionality described in the functional spec except for the
        server-side option (Section 4.4. of spec). The second phase will
        consist of the server-side option, which is to be used with DMAs
        that are unable to handle a new backup type (contrary to NDMP
        protocol).

4. Are there related projects in Sun?
    - If so, what is the proposal's relationship to their work? Which
      not-yet-delivered Sun (or non-Sun) projects (libraries, hardware,
      etc.) does this project depend upon? What other projects, if any,
      depend on this one?

        This project utilizes the zfs send/receive technology as a new
        backup method within ndmpd. (Please refer to the functional
        specification.)

    - Are you updating, copying or changing functional areas maintained
      by other groups? How are you coordinating and communicating with
      them? Do they "approve" of what you propose? If not, please explain
      the areas of disagreement.

        We are utilizing zfs send/recv, and will also be interfacing with
        the SS7000 functionality. (See spec.) A FW engineer is closely
        involved in the design of this feature. Specification drafts have
        been sent to key FW/ZFS engineers for review. General approval.

5. How is the project delivered into the system?
    - Identify packages, directories, libraries, databases, etc.

        No difference from current ndmpd delivery.
        (See answer from NDMP Service,
        PSARC/2007/397/final.materials/ndmp_20questions.)

        The functionality will require modification to the SS7000
        metadata plugin (see functional spec for more details). No plugin
        is needed when the functionality is used on Solaris Next. The
        SS7000 plugin will be delivered per SS7000 release mechanisms.

6. Describe the project's hardware platform dependencies.
    - Explain any reasons why it would not work on both SPARC and Intel?

        No difference from current ndmpd dependencies. See answer from
        NDMP Service, PSARC/2007/397/final.materials/ndmp_20questions.

7. System administration
    - How will the project's deliverables be installed and
      (re)configured?
    - How will the project's deliverables be uninstalled?
    - Does it use inetd to start itself?
    - Does it need installation within any global system tables?
    - Does it use a naming service such as NIS, NIS+ or LDAP?
    - What are its on-going maintenance requirements (e.g. Keeping global
      tables up to date, trimming files)?
    - How does this project's administrative mechanisms fit into Sun's
      system administration strategies? E.g., how does it fit under the
      Solaris Management Console (SMC) and Web-Based Enterprise
      Management (WBEM), how does it make use of roles, authorizations
      and rights profiles? Additionally, how does it provide for
      administrative audit in support of the Solaris BSM configuration?
    - What tunable parameters are exported? Can they be changed without
      rebooting the system? Examples include, but are not limited to,
      entries in /etc/system and ndd(8) parameters. What ranges are
      appropriate for each tunable? What are the commitment levels
      associated with each tunable (these are interfaces)?

        There is no difference from current ndmpd system administration
        in general. [See answer from NDMP Service,
        PSARC/2007/397/final.materials/ndmp_20questions.]

        Some new NDMP environment variables will be exported. In
        addition, two new SMF properties (linked to the existing ndmpd
        SMF service) will be introduced.
        (See the functional spec for more details, specifically sections
        4 and 5.) See also interface table below (Q13).

8. Reliability, Availability, Serviceability (RAS)
    - Does the project make any material improvement to RAS?
    - How can users/administrators diagnose failures or determine
      operational state? (For example, how could a user tell the
      difference between a failure and very slow performance?)
    - What are the project's effects on boot time requirements?
    - How does the project handle dynamic reconfiguration (DR) events?
    - What mechanisms are provided for continuous availability of
      service?
    - Does the project call panic()? Explain why these panics cannot be
      avoided.
    - How are significant administrative or error conditions transmitted?
      SNMP traps? Email notification?
    - How does the project deal with failure and recovery?
    - Does it ever require reboot? If so, explain why this situation
      cannot be avoided.
    - How does your project deal with network failures (including
      partition and re-integration)? How do you handle the failure of
      hardware that your project depends on?
    - Can it save/restore or checkpoint and recover?
    - Can its files be corrupted by failures? Does it clean up any
      locks/files after crashes?

        See answer from NDMP Service,
        PSARC/2007/397/final.materials/ndmp_20questions.

        Since the new functionality is based on zfs send/recv, for
        zfs-related concerns, see Zettabyte Filesystem,
        PSARC/2002/240/20questions.txt

        The /var/ndmp logs can also be used for diagnosis.

        See also the functional spec for a more detailed discussion on
        error handling (Section 6.3).

9. Observability
    - Does the project export status, either via observable output (e.g.,
      netstat) or via internal data structures (kstats)?
    - How would a user or administrator tell that this subsystem is or is
      not behaving as anticipated?
    - What statistics does the subsystem export, and by what mechanism?
    - What state information is logged?
    - In principle, would it be possible for a program to tune the
      activity of your project?

        See answer from NDMP Service,
        PSARC/2007/397/final.materials/ndmp_20questions.

        Since the new functionality is based on zfs send/recv, for
        zfs-related concerns, see Zettabyte Filesystem,
        PSARC/2002/240/20questions.txt

        The /var/ndmp logs can also be used for diagnosis.

        In addition, any diagnostic or statistical tools useful with ZFS
        might be useful on the NDMP server, where zfs send/recv are
        called. Any ZFS method to tune the performance of zfs send/recv
        might be applicable.

10. What are the security implications of this project?
    - What security issues do you address in your project?
    - The Solaris BSM configuration carries a Common Criteria (CC)
      Controlled Access Protection Profile (CAPP) -- Orange Book C2 --
      and a Role Based Access Control Protection Profile (RBAC) --
      rating, does the addition of your project affect this rating? E.g.,
      does it introduce interfaces that make access or privilege
      decisions that are not audited, does it introduce removable media
      support that is not managed by the allocate subsystem, does it
      provide administration mechanisms that are not audited?
    - Is system or subsystem security compromised in any way if your
      project's configuration files are corrupt or missing?
    - Please justify the introduction of any (all) new setuid
      executables.
    - Include a thorough description of the security assumptions,
      capabilities and any potential risks (possible attack points) being
      introduced by your project. A separate Security Questionnaire
          http://sac.sfbay/cgi-bin/bp.cgi?NAME=Security.bp
      is provided for more detailed guidance on the necessary
      information. Cases are encouraged to fill out and include the
      Security questionnaire (leveraging references to existing
      documentation) in the case materials.

      Projects must highlight information for the following important
      areas:
      - What features are newly visible on the network and how are they
        protected from exploitation (e.g.
        unauthorized access, eavesdropping)
      - If the project makes decisions about which users, hosts,
        services, ... are allowed to access resources it manages, how is
        the requestor's identity determined and what data is used to
        determine if the access is granted. Also how this data is
        protected from tampering.
      - What privileges beyond what a common user (e.g. 'noaccess') can
        perform does this project require and why those are necessary.
      - What parts of the project are active upon default install and how
        it can be turned off.

        See answer from NDMP Service,
        PSARC/2007/397/final.materials/ndmp_20questions.

11. What is its UNIX operational environment:
    - Which Solaris release(s) does it run on?
    - Environment variables? Exit status? Signals issued? Signals caught?
      (See signal(3HEAD).)
    - Device drivers directly used (e.g. /dev/audio)? .rc/defaults or
      other resource/configuration files or databases?
    - Does it use any "hidden" (filename begins with ".") or temp files?
    - Does it use any locking files?
    - Command line or calling syntax: What options are supported? (please
      include man pages if available) Does it conform to getopt() parsing
      requirements?
    - Is there support for standard forms, e.g. "-display" for X
      programs? Are these propagated to sub-environments?
    - What shared libraries does it use? (Hint: if you have code use
      "ldd" and "dump -Lv")?
    - Identify and justify the requirement for any static libraries.
    - Does it depend on kernel features not provided in your packages and
      not in the default kernel (e.g. Berkeley compatibility package,
      /usr/ccs, /usr/ucblib, optional kernel loadable modules)?
    - Is your project 64-bit clean/ready? If not, are there any
      architectural reasons why it would not work in a 64-bit
      environment? Does it interoperate with 64-bit versions?
    - Does the project depend on particular versions of supporting
      software (especially Java virtual machines)? If so, do you deliver
      a private copy?
      What happens if a conflicting or incompatible version is already or
      subsequently installed on the system?
    - Is the project internationalized and localized?
    - Is the project compatible with IPV6 interfaces and addresses?

        See answer from NDMP Service,
        PSARC/2007/397/final.materials/ndmp_20questions.

        For NDMP environment variables and error handling, see functional
        specification (NDMP env variables: Sections 4.2., 4.4., 5.1.1.,
        5.1.4., 5.1.5. Error handling: Section 6.3 and throughout).

        SIGCANCEL is sent upon abort to help terminate running threads.

12. What is its window/desktop operational environment?
    - Is it ICCCM compliant (ICCCM is the standard protocol for
      interacting with window managers)?
    - X properties: Which ones does it depend on? Which ones does it
      export, and what are their types?
    - Describe your project's support for User Interface facilities
      including Help, Undo, Cut/Paste, Drag and Drop, Props, Find, Stop?
    - How do you respond to property change notification and ICCCM client
      messages (e.g. Do you respond to "save workspace")?
    - Which window-system toolkit/desktop does your project depend on?
    - Can it execute remotely? Is the user aware that the tool is
      executing remotely? Does it matter?
    - Which X extensions does it use (e.g. SHM, DGA, Multi-Buffering)?
      (Hint: use "xdpyinfo")
    - How does it use colormap entries? Can you share them?
    - Does it handle 24-bit operation?

        See answer from NDMP Service,
        PSARC/2007/397/final.materials/ndmp_20questions.

13. What interfaces does your project import and export?
    - Please provide a table of imported and exported interfaces,
      including stability levels.
      Pay close attention to the classification of these interfaces in
      the Interface Taxonomy -- e.g., "Committed," "Uncommitted," and
      "*Private;" see:
          http://sac.sfbay/cgi-bin/bp.cgi?NAME=interface_taxonomy.bp

        Interfaces Imported

        Interface               Classification      Comments
        ---------------------------------------------------------------------
        NDMP protocol           Standard            V4 Draft Specification
                                                    4/2003
        zfs send/receive        Committed Private

        =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

        Interfaces Exported

        Interface               Classification      Comments
        ---------------------------------------------------------------------
        NDMP environment        Committed
        variables/semantics
        NDMP path semantics     Committed
        Backup stream header    Committed Private
        and overall structure
        Metadata plugin         Committed Private   Original i/f in
        interface changes                           PSARC/2008/331 NDMP
                                                    Metadata Plug-in API

    - Exported public library APIs and ABIs
        N/A
      Protocols (public or private)
        NDMP V4 protocol
      Drag and Drop
      ToolTalk
      Cut/Paste
        N/A

    - Other interfaces

    - What other applications should it interoperate with? How will it do
      so?

        (Answer from NDMP Service,
        PSARC/2007/397/final.materials/ndmp_20questions):
        A: Uses NDMP protocol to work with NDMP client/server
        applications such as NBU (Symantec), Tivoli (IBM), EBS (Legato),
        Netvault (Bakbone), Brightstor (CA).

    - Is it "pipeable"? How does it use stdin, stdout, stderr?

        No. Internal piping is used to write zfs send/recv to/from tape.
        No stdin. Stdout/stderr is used for printouts.

    - Explain the significant file formats, names, syntax, and semantics.

        See functional specification.

    - Is there a public namespace? (Can third parties create names in
      your namespace?) How is this administered?

        N/A

    - Are the externally visible interfaces documented clearly enough for
      a non-Sun client to use them successfully?

        See functional spec for externally visible interfaces. These are
        planned to be clearly documented in the product documentation.

14. What are its other significant internal interfaces (inter-subsystem
    and inter-invocation)?
    - Protocols (public or private)
    - Private ToolTalk usage
    - Files
    - Other
    - Are the interfaces re-entrant?

        See functional specification for description of internal
        functional interfaces.

15. Is the interface extensible? How will the interface evolve?
    - How is versioning handled?

        The zfs send/recv format is "committed" and versioning is handled
        by zfs (see zfs man page).

        (See functional specification for more information regarding
        versioning, specifically for
          - "zfs" backup type tape header
          - incremental snapshot ZFS properties.)

    - What was the commitment level of the previous version?

        N/A (no previous version for the "zfs" backup type)

    - Can this version co-exist with existing standards and with earlier
      and later versions or with alternative implementations (perhaps by
      other vendors)?

        The new ndmpd functionality (i.e. the new "zfs" backup type) must
        be used with a compatible version of zfs: the version it is
        introduced with, or a later one. The feature is not envisioned to
        be used with alternative implementations of ndmpd or with other
        file systems. The new ndmpd functionality can coexist with
        existing Sun ndmpd functionality.

    - What are the clients over which a change should be managed?

        The clients are data management applications (DMAs).

    - How is transition to a new version to be accomplished? What are the
      consequences to ISV's and their customers?

        A new version of the feature should ideally be transparent to DMA
        clients. If it is not, backward compatibility issues will be
        carefully considered and managed.

        Also see answer from NDMP Service,
        PSARC/2007/397/final.materials/ndmp_20questions for general
        information on versioning with respect to the NDMP protocol.

16. How do the interfaces adapt to a changing world?
    - What is its relationship with (or difficulties with) multimedia? 3D
      desktops? Nomadic computers? Storage-less clients? A networked file
      system model (i.e., a network-wide file manager)?
        See answer from NDMP Service,
        PSARC/2007/397/final.materials/ndmp_20questions.

        We will monitor customer requests, etc. to add enhancements to
        the "zfs" backup type as appropriate.

17. Interoperability
    - If applicable, explain your project's interoperability with the
      other major implementations in the industry. In particular, does it
      interoperate with Microsoft's implementation, if one exists?
    - What would be different about installing your project in a
      heterogeneous site instead of a homogeneous one (such as Sun)?
    - Does your project assume that a Solaris-based system must be in
      control of the primary administrative node?

        See answer from NDMP Service,
        PSARC/2007/397/final.materials/ndmp_20questions.

        As long as the ndmpd binary is installed on a Solaris system, it
        should make no difference whether the other systems in the site
        run Solaris.

18. Performance
    - How will the project contribute (positively or negatively) to
      "system load" and "perceived performance"?

        The project is expected to contribute positively to perceived
        performance. (See functional specification, Section 2 and
        References for details.) The effect on system load should be
        similar to that of zfs send/recv on the system. Backups and
        restores are often done during "down" times.

    - What are the performance goals of the project? How were they
      evaluated? What is the test or reference platform?

        See functional spec, Section 2 and References. No specific
        performance goals, though we are interested in capitalizing on
        any performance benefit from the use of zfs send/recv.
        Preliminary numbers appear promising but we need further
        research.

    - Does the application pause for significant amounts of time? Can the
      user interact with the application while it is performing
      long-duration tasks?

        See answer from NDMP Service,
        PSARC/2007/397/final.materials/ndmp_20questions.

    - What is your project's MT model? How does it use threads
      internally? How does it expect its client to use threads?
      If it uses callbacks, can the called entity create a thread and
      recursively call back?

        Pthreads are used. A thread is spawned from the main backup or
        restore handler to start the backup or restore. This allows the
        handler to return to the DMA client. The backup/restore spawns
        two threads, one a reader and the other a writer. Depending on
        whether the main operation is a backup or restore, one of these
        threads will interface with the tape layer and the other will
        execute the zfs send/recv. The two threads communicate via a
        pipe.

        An additional thread might be spawned if an abort is requested by
        the client DMA. This thread will cause a SIGCANCEL to be sent to
        the zfs send/recv thread.

        No callbacks are used.

    - What is the impact on overall system performance? What is the
      average working set of this component? How much of this is
      shared/sharable by other apps?

        Impact on overall system performance should be low. Average
        working set dependent on data size.

        [The effect on system load should be similar to that of zfs
        send/recv on the system. Backups and restores are often done
        during "down" times.]

    - Does this application "wake up" periodically? How often and under
      what conditions? What is the working set associated with this
      behavior?

        (Answer from NDMP Service,
        PSARC/2007/397/final.materials/ndmp_20questions. No waking up.
        The NDMP client can remotely schedule periodic backups but the
        "wake up" would happen on another system.)

        For "zfs"-based backup, an internal pipe is used (see answer to
        next bullet for description). There will be blocking due to the
        use of the pipe. The pipe transmits data to/from zfs
        send/receive.

    - Will it require large files/databases (for example, new fonts)?

        (Answer from NDMP Service,
        PSARC/2007/397/final.materials/ndmp_20questions. If debugging is
        enabled, it could create a large text file. Debugging is disabled
        by default and if it is enabled, it will disable across the
        boots.)
        A file descriptor is not passed directly to zfs_send/receive(),
        but rather a pipe is used so that tape management can be done.
        This internal pipe may contain large amounts of data during data
        transfer.

    - Do files, databases or heap space tend to grow with time/load? What
      mechanisms does the user have to use to control this? What happens
      to performance/system load?

        See above answer on the internal pipe. (Internal zfs send/recv
        behavior is not addressed here.)

19. Please identify any issues that you would like the ARC to address.
    - Interface classification, deviations from standards, architectural
      conflicts, release constraints...
    - Are there issues or related projects that the ARC should advise the
      appropriate steering committees?

        No issues.

20. Appendices to include
    - One-Pager.
    - Prototype specification.
    - References to other documents. (Place copies in case directory.)
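
Illustration for Question 18 (MT model and internal pipe). The sketch
below is not ndmpd code. It is a minimal Python stand-in, under stated
assumptions, for the arrangement described in the answers: during a
backup, one thread produces the stream (the role played by zfs send) and
writes it into a pipe, while a second thread drains the pipe and hands
fixed-size records to the tape layer. ndmpd itself uses Pthreads; the
names stream_producer, tape_writer, run_backup and the RECORD_SIZE value
are hypothetical and chosen only for illustration.

```python
import os
import threading

RECORD_SIZE = 4096  # hypothetical tape record size, for illustration only


def stream_producer(write_fd, payload):
    """Stand-in for the thread that runs zfs send: writes a stream into the pipe."""
    with os.fdopen(write_fd, "wb") as pipe_out:
        # This write blocks whenever the pipe is full, until the reader drains it.
        pipe_out.write(payload)
    # Leaving the 'with' block closes the write end, which signals end-of-stream.


def tape_writer(read_fd, tape):
    """Stand-in for the tape-side thread: drains the pipe into fixed-size records."""
    with os.fdopen(read_fd, "rb") as pipe_in:
        while True:
            record = pipe_in.read(RECORD_SIZE)  # blocks until a full record or EOF
            if not record:
                break
            tape.append(record)  # a real tape layer would write the record here


def run_backup(payload):
    """Run producer and tape writer concurrently, connected by a pipe."""
    read_fd, write_fd = os.pipe()
    tape = []
    writer = threading.Thread(target=tape_writer, args=(read_fd, tape))
    producer = threading.Thread(target=stream_producer, args=(write_fd, payload))
    writer.start()
    producer.start()
    producer.join()
    writer.join()
    return tape


if __name__ == "__main__":
    records = run_backup(os.urandom(10 * RECORD_SIZE + 100))
    print(len(records), "records written")
```

Because the pipe holds only a bounded amount of data, the producer stalls
whenever the tape side falls behind. This is the blocking behavior the
answers mention, and it is why the pipe, rather than a directly passed
file descriptor, gives the tape layer control over record boundaries. A
restore would reverse the roles: the tape-side thread writes into the
pipe and the other thread consumes the stream.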