PSARC Questions Version 1.17

1. What specifically is the proposal that we are reviewing?

    Tape Logical Block positioning.

- What is the technical content of the project?

    This project adds a new positioning mode to the st driver. The
    current st driver supports File Block positioning: the count of
    files from the beginning of the tape partition, plus the block
    within that file. The Logical Block position is simply a count of
    every entity recorded on the tape (data blocks or file marks),
    counting from the beginning of the partition.

    Example: in this diagram '-' are data blocks and '|' are file marks.

                        |
        File    1     2 V          3         45           6  7
        BOP-----|-----|------------|---------||-----------|--|EOD
        LB      5     11           24        34           47 50

    In the above example, File 2 block 2 is the same position as
    Logical Block 13.

    In basic terms, this project adds a new way of counting tape
    position. It does not remove the old File Block positioning.

    So what does this buy us? In File Block positioning, if the position
    becomes questionable for any reason (failed command, reset, etc.),
    the only way to validate the position is to rewind, forward space
    file, then forward space block to the desired position. If the
    logical count were tracked at the same time, as proposed in this
    project, a SCSI Read Position command could be issued when the
    position becomes questionable. If the value is what is expected, the
    position is still valid and no further commands are required. If the
    position is not as expected, it can be recovered with a single SCSI
    Locate command.

    To summarize, this project adds a position count in logical blocks,
    enabling st and the drive hardware to have a common concept of
    position. This project adds ioctls to perform the SCSI Read
    Position, Locate and portioning commands that could not be done
    because File Block positioning could not track their resulting
    position. This project does not include the recovery of failed
    commands.
    It adds the position tracking needed so that failed commands can be
    recovered.

- Is this a new product, or a change to a pre-existing one? If it is a
  change, would you consider it a "major", "minor", or "micro" change?
  See the Release Taxonomy in:

    This project modifies the current st driver and the mt CLI. The
    intent is to release this project as part of Solaris Nevada, so it
    falls under the minor release taxonomy. The project is confined to
    st, mt and a couple of header files, so it would be a micro change
    if backporting becomes necessary. However, that is not the current
    intent.

- If your project is an evolution of a previous project, what changed
  from one version to another?

    This project is based on the st tape driver and the mt command.

    The current implementation of st has forced higher-end backup and
    restore utilities to implement logical positioning in their
    applications using USCSI or SCSI generic drivers. This strategy
    works in the short term but creates a situation where the st driver
    maintains a position that is not really correct. These applications
    must keep st believing that its position is valid, regardless of the
    truth. Otherwise st will not allow any commands until the position
    has been validated by rewinding and positioning back. This
    relationship is far from honest.

    This project enables st not only to track position using logical
    blocks, but also to understand the USCSI positioning commands the
    applications are using, so that it knows what the real position is.
    Because of this greater understanding of position, st can make
    better choices based on the real position state.

    Though this change does not itself recover failures of any kind,
    the applications that use st already do. They already perform read
    position commands to verify they are where they want to be. When an
    error occurs, the position mode is set to invalid.
    If the application does a read position, the position mode is
    promoted to logical and the returned position data is snooped to
    glean the correct position. At this point, if the application
    thinks the position is correct, st will allow the application to
    continue without rewinding and repositioning.

- What is the motivation for it, in general as well as specific terms?
  (Note that not everyone on the ARC will be an expert in the area.)

    Being able to recover lost position requires the ability to
    accurately track position. The reason for doing it now is that
    logical block positioning is required for command-level error
    recovery. Both capabilities are required to successfully implement
    multi-path for tape.

- What are the expected benefits for Sun?

    Being able to position directly to the desired data would enhance
    performance. Being able to continue a backup that failed, rather
    than starting over, keeps customers' backup windows from growing
    larger. Having turnkey multi-path tape included as part of the OS
    would be a competitive advantage.

- By what criteria will you judge its success?

    Logical Block positioning will serve as a means to recover position
    and as an enabler of path failover for tape.

2. Describe how your project changes the user experience, upon
   installation and during normal operation.

    No change

- What does the user perceive when the system is upgraded from a
  previous release?

    After upgrading, users will notice that mt commands and ioctls are
    available to handle Logical Block positioning, as on other UNIX
    platforms.

3. What is its plan?

- What is its current status? Has a design review been done? Are there
  multiple delivery phases?

    Currently, I have a prototype that has been tested on many platforms
    and tape drives, and it passes all current qualification tests. Test
    cases have been written and successfully run against the prototype.
    Once this ARC case is approved, the qualification tests will be
    extended to test the interfaces added by this project.
    Logical Block positioning will be put back to Solaris Nevada as a
    single putback.

4. Are there related projects in Sun?

- If so, what is the proposal's relationship to their work? Which
  not-yet-delivered Sun (or non-Sun) projects (libraries, hardware,
  etc.) does this project depend upon? What other projects, if any,
  depend on this one?

    Tape Command-Level error recovery
        This would use logical block positioning to attempt to recover
        failed commands and to reestablish position after resets and
        cable pulls.

    Tape Multi-path
        This would aggregate multiple paths to a tape drive and provide
        path failover if a path were to fail.

- Are you updating, copying or changing functional areas maintained by
  other groups? How are you coordinating and communicating with them?
  Do they "approve" of what you propose? If not, please explain the
  areas of disagreement.

    All groups involved are under the same director, are well aware of
    this project, and are cooperating with it and the road map behind
    it.

5. How is the project delivered into the system?

- Identify packages, directories, libraries, databases, etc.

    SUNWckr    st
    SUNWcsu    mt
    SUNWckr    misc/scsi (only needed to decode/display added SCSI commands)
    SUNWhea    mtio.h stdef.h commands.h

6. Describe the project's hardware platform dependencies.

    Works on all platforms where tape currently works.

- Explain any reasons why it would not work on both SPARC and Intel?

    N/A

7. System administration

    N/A

8. Reliability, Availability, Serviceability (RAS)

    N/A

9. Observability

- Does the project export status, either via observable output (e.g.,
  netstat) or via internal data structures (kstats)?

    The MTIOCGETPOS ioctl exports the tapepos_t structure with all
    current position state.

- How would a user or administrator tell that this subsystem is or is
  not behaving as anticipated?

    No change

- What statistics does the subsystem export, and by what mechanism?

    Nothing new

- What state information is logged?

    Console messages are logged for failed locates and for invalid read
    position data.

- In principle, would it be possible for a program to tune the activity
  of your project?

    N/A

10. What are the security implications of this project?

    N/A

11. What is its UNIX operational environment:

- Which Solaris release(s) does it run on?

    Nevada

- Environment variables? Exit status? Signals issued? Signals caught?
  (See signal(3HEAD).)

    None

- Device drivers directly used (e.g. /dev/audio)? .rc/defaults or other
  resource/configuration files or databases?

    No change

- Does it use any "hidden" (filename begins with ".") or temp files?

    No

- Does it use any locking files?

    No

- Command line or calling syntax: What options are supported? (please
  include man pages if available) Does it conform to getopt() parsing
  requirements?

    See the included mt man page. Options have been added; the parsing
    has not changed.

- Is there support for standard forms, e.g. "-display" for X programs?
  Are these propagated to sub-environments?

    No

- What shared libraries does it use? (Hint: if you have code use "ldd"
  and "dump -Lv")?

    libc.so.1 and libm.so.2 are used. This has not changed.

- Identify and justify the requirement for any static libraries.

    N/A

- Does it depend on kernel features not provided in your packages and
  not in the default kernel (e.g. Berkeley compatibility package,
  /usr/ccs, /usr/ucblib, optional kernel loadable modules)?

    No

- Is your project 64-bit clean/ready? If not, are there any
  architectural reasons why it would not work in a 64-bit environment?
  Does it interoperate with 64-bit versions?

    Yes

- Does the project depend on particular versions of supporting software
  (especially Java virtual machines)? If so, do you deliver a private
  copy? What happens if a conflicting or incompatible version is
  already or subsequently installed on the system?

    No

- Is the project internationalized and localized?

    No

- Is the project compatible with IPV6 interfaces and addresses?

    N/A

12.
What is its window/desktop operational environment?

    N/A

13. What interfaces does your project import and export?

    Interfaces Exported

    Interface            Classification  Comments
    MTF_LOGICAL_BLOCK    Committed       Adds to mt_flags returned by MTIOCGET
    MTTELL               Committed       New mt_op to get position
    MTSEEK               Committed       New mt_op to go to position
    MTFSSF               Committed       New mt_op to forward space to
                                         sequential file marks
    MTBSSF               Committed       New mt_op to back space to
                                         sequential file marks
    MTLOCK               Committed       New mt_op to lock media
    MTUNLOCK             Committed       New mt_op to unlock media
    MTIOCLTOP            Committed       Same function as MTIOCTOP except
                                         passes 64 bit mt_count
    tapepos_t structure  Committed       Contains position state
    MTIOCGETPOS          Committed       Gets current position state
    MTIOCRESTPOS         Committed       Sets current position state

14. What are its other significant internal interfaces (inter-subsystem
    and inter-invocation)?

    N/A

15. Is the interface extensible? How will the interface evolve?

    This is an extension of the MTIOC interface.

- How is versioning handled?

    This project adds a new MTF_LOGICAL_BLOCK mt_flag so that
    applications can test whether the current st driver has the new
    interfaces.

- What was the commitment level of the previous version?

    Evolving

- Can this version co-exist with existing standards and with earlier
  and later versions or with alternative implementations (perhaps by
  other vendors)?

    Yes, it maintains backward compatibility for all old functionality.

- What are the clients over which a change should be managed?

    Backup products from Veritas, Legato and others.

- How is transition to a new version to be accomplished? What are the
  consequences to ISV's and their customers?

    The intent is that it will be seamless. By putting Logical Block
    positioning back to Nevada (OpenSolaris) well ahead of release,
    ISVs will have plenty of time to qualify their products before GA.
    However, it is very possible that utilities that use pass-through
    drivers to do positioning but transfer data using st might have
    issues.
    Because they are positioning the drive outside of what st is aware
    of, the positioning state that st is tracking might not be correct.
    Internal ISV groups have been contacted regarding the potential
    impact of this change. They are contacting their vendors to make
    them aware of it and are preparing to test their products with
    Nevada.

16. How do the interfaces adapt to a changing world?

- What is its relationship with (or difficulties with) multimedia? 3D
  desktops? Nomadic computers? Storage-less clients? A networked file
  system model (i.e., a network-wide file manager)?

    N/A

17. Interoperability

- If applicable, explain your project's interoperability with the other
  major implementations in the industry. In particular, does it
  interoperate with Microsoft's implementation, if one exists?

    No change

- What would be different about installing your project in a
  heterogeneous site instead of a homogeneous one (such as Sun)?

    No change

- Does your project assume that a Solaris-based system must be in
  control of the primary administrative node?

    No change

18. Performance

- How will the project contribute (positively or negatively) to "system
  load" and "perceived performance"?

    The impact of tracking logical block counts as well as file block
    counts should be negligible. The project does not add read position
    commands during data transfers. It does add read position commands
    after space commands so that the logical block count stays in sync.
    The ability to locate directly to a position will improve
    positioning time over using multiple space commands to reach the
    same destination. Being able to issue a Read Position command to
    verify position could save 5 minutes or more compared with rewinding
    to the beginning of the tape and counting files and blocks to
    validate position.

- What are the performance goals of the project? How were they
  evaluated? What is the test or reference platform?

    Benchmarks of data throughput showed no change when compared on the
    same hosts and tape drives.

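The positioning savings described under "system load" can be illustrated with a small model. This is only a sketch under stated assumptions, not st driver code: FakeDrive, revalidate_logical and revalidate_file_block are hypothetical names, and the strings in the command log are labels for illustration rather than real SCSI command blocks. It shows that validating a questionable position by logical block costs one command when the position is still correct and two when it is not, while File Block validation always costs a rewind plus two space commands.

```python
class FakeDrive:
    """Hypothetical stand-in for a tape drive that only tracks a logical block."""

    def __init__(self, position):
        self.position = position
        self.commands = []  # names of the commands issued, in order

    def read_position(self):
        self.commands.append("READ POSITION")
        return self.position

    def locate(self, logical_block):
        self.commands.append("LOCATE")
        self.position = logical_block


def revalidate_logical(drive, expected):
    """Check a questionable position; reposition only if it is wrong."""
    if drive.read_position() != expected:
        drive.locate(expected)  # a single locate recovers the position
    return drive.commands


def revalidate_file_block(drive, expected):
    """File Block validation: always rewind, then space over files and blocks."""
    drive.commands += ["REWIND", "SPACE (file marks)", "SPACE (blocks)"]
    drive.position = expected
    return drive.commands


print(revalidate_logical(FakeDrive(13), expected=13))
print(revalidate_logical(FakeDrive(20), expected=13))
print(revalidate_file_block(FakeDrive(20), expected=13))
```

The model deliberately ignores elapsed time; on a real drive the rewind and space commands are the slow ones, which is where the minutes quoted above would come from.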
- Does the application pause for significant amounts of time? Can the
  user interact with the application while it is performing
  long-duration tasks?

    N/A

- What is your project's MT model? How does it use threads internally?
  How does it expect its client to use threads? If it uses callbacks,
  can the called entity create a thread and recursively call back?

    Single threaded per instance.

- What is the impact on overall system performance? What is the average
  working set of this component? How much of this is shared/sharable by
  other apps?

    The impact is negligible in non-error scenarios. Applications that
    use read position rather than rewind-based recovery after errors
    will save the time of unnecessary positioning.

- Does this application "wake up" periodically? How often and under
  what conditions? What is the working set associated with this
  behavior?

    st doesn't initiate any actions on its own.

- Will it require large files/databases (for example, new fonts)?

    N/A

- Do files, databases or heap space tend to grow with time/load? What
  mechanisms does the user have to use to control this? What happens to
  performance/system load?

    No files are used. Heap space is used and freed only as needed.

19. Please identify any issues that you would like the ARC to address.

- Interface classification, deviations from standards, architectural
  conflicts, release constraints...

    N/A

- Are there issues or related projects that the ARC should advise the
  appropriate steering committees?

    N/A

20. Appendices to include

- One-Pager.

    Logical_block.txt

- Prototype specification.

- References to other documents. (Place copies in case directory.)

    mtio.man.diffs
    mt.man.diffs
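As a worked illustration of the counting described in question 1, the sketch below rebuilds the logical block numbers shown in the example diagram. It is a model only: BLOCKS_BEFORE_MARK, filemark_logical_blocks and to_logical_block are hypothetical names, not st interfaces. The file and block numbering in to_logical_block is an assumption read off the diagram (file N starts after file mark N, and block K is the K-th data block of that file); this document does not state the driver's own convention.

```python
from itertools import accumulate

# Data blocks recorded before each of the seven file marks in the diagram
# from question 1 (the 0 models the back-to-back file marks 4 and 5).
BLOCKS_BEFORE_MARK = [5, 5, 12, 9, 0, 11, 2]


def filemark_logical_blocks(blocks_before_mark):
    """Count the entities (data blocks and file marks) ahead of each file mark."""
    # Each file adds its data blocks plus one file mark to the running total;
    # subtracting 1 leaves the number of entities ahead of that file mark.
    totals = accumulate(n + 1 for n in blocks_before_mark)
    return [t - 1 for t in totals]


def to_logical_block(file_no, block_no, blocks_before_mark):
    """Map (file, block) to a logical block under the assumed numbering.

    Assumption: file N starts after file mark N, and block K is the K-th
    data block of that file, counting from 1.
    """
    marks = filemark_logical_blocks(blocks_before_mark)
    base = marks[file_no - 1] if file_no > 0 else -1
    return base + block_no


print(filemark_logical_blocks(BLOCKS_BEFORE_MARK))
print(to_logical_block(2, 2, BLOCKS_BEFORE_MARK))
```

Under these assumptions the first call reproduces the LB labels in the diagram (plus 35 for the second of the paired file marks), and the second reproduces the File 2 block 2 to Logical Block 13 correspondence.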