Draft PRD

Cheating from Sam's pNFS PRD, I'm going to create one much like it for ck3.

Executive Summary

NFS has historically relied on the Ethernet CRC and transport-level checksums (TCP checksums) to ensure data integrity and catch data corruption on the wire. Because the TCP checksum is a relatively weak 16-bit ones' complement sum, it can't detect all the data corruption that takes place on the wire. NFSv4 must therefore employ some form of checksumming at the NFS protocol layer to ensure that the data it delivers to the higher layers is actually "good" data.
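To illustrate the weakness mentioned above, here is a minimal Python sketch. It assumes the RFC 1071 style of Internet checksum used by TCP; the function name `internet_checksum` and the sample payloads are made up for this illustration. Because the checksum is a sum of 16-bit words, swapping two words in the payload leaves it unchanged, while a cryptographic hash such as SHA-256 notices the change.

```python
import hashlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"ABCDEFGH"
corrupted = b"CDABEFGH"  # the first two 16-bit words are swapped

# Addition is commutative, so the weak checksum cannot see the reordering.
same_weak = internet_checksum(original) == internet_checksum(corrupted)
# A cryptographic hash gives different digests for the two payloads.
same_strong = hashlib.sha256(original).digest() == hashlib.sha256(corrupted).digest()
```

Here `same_weak` is true and `same_strong` is false, which is the kind of undetected corruption a protocol-level checksum is meant to catch.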

This PRD covers the requirements for project ck3, which will introduce over-the-wire checksums at the NFSv4 protocol layer. ck3 plans to introduce checksums only for the data carried over the wire by the READ/WRITE operations. It doesn't, however, plan to introduce checksums for the entire NFSv4 packet (aka checksums at the RPC layer); that will be accomplished by a separate project.

Customer Analysis

Target Markets

All Sun customers using NFS but more importantly those that are using it in the data center.

Key Customers

End Users

Users with filesystems that are NFS mounted.

Product Requirements

To make things clearer, the requirements for the OTW protocol and for the client/server have been separated.

Requirements List for OTW protocol

# | Date | Category | Source | Customer Problem | Customer Need | Product Requirement
1 | 2005-05-04 | Interoperability | n/a | If I have NFSv4 servers from other vendors, you should be able to interoperate with them | Customers need to make sure they have interoperability in their environment | The checksum solution should gracefully transition between implementations that support checksums and those that don't
2 | 2005-05-04 | RAS | n/a | If I have failover mounts, the checksum protocol should be able to provide continuity in file serving along with data integrity | The service needs to work in high availability scenarios | In case of failover mounts, the checksum protocol should renegotiate with the failed over server and continue to provide data integrity even with the failed over server
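The two requirements above (graceful transition and renegotiation on failover) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Server` and `Mount` classes, their fields, and the algorithm names are hypothetical and are not part of any NFSv4 implementation or of the ck3 design.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Server:
    name: str
    # Empty tuple: this (hypothetical) server does not support checksums.
    algorithms: Tuple[str, ...] = ()


@dataclass
class Mount:
    client_algorithms: Tuple[str, ...]  # client preference order
    servers: List[Server]               # failover list
    active: int = 0
    algorithm: Optional[str] = None     # None means "no OTW checksum"

    def negotiate(self) -> Optional[str]:
        """Pick the first client algorithm the active server supports."""
        server = self.servers[self.active]
        self.algorithm = next(
            (alg for alg in self.client_algorithms if alg in server.algorithms),
            None,  # graceful transition: carry on without checksums
        )
        return self.algorithm

    def failover(self) -> Optional[str]:
        """Switch to the next server and renegotiate from scratch."""
        self.active = (self.active + 1) % len(self.servers)
        return self.negotiate()


mount = Mount(
    client_algorithms=("sha256", "sha1"),
    servers=[
        Server("primary", ("sha256", "sha1")),
        Server("replica", ("sha1",)),
        Server("legacy"),
    ],
)
first = mount.negotiate()
second = mount.failover()
third = mount.failover()
```

In this sketch the mount keeps working against a server without checksum support; whether that fallback is allowed would depend on the policy in force.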

Requirements Description

1. Marketing Input - The semantics of the client and server negotiation need to be clarified. It should be possible for the server to *prefer* a certain cksum algorithm but *require* a different algorithm. If the server prefers SHA256 and requires SHA1 at the very least, and the client cannot provide either of these cksum capabilities, the server should refuse the connection to such a client.

2. Input from the Sponsors - The OTW protocol should be flexible enough to allow for possible future extensions, i.e. there should be enough reserved fields to enable implementation of file level checksums in the future.
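The prefer/require semantics in item 1 could look something like the following Python sketch. It assumes the server keeps an ordered list of acceptable algorithms, with its preferred algorithm first and its required minimum last; the names `negotiate`, `ChecksumRefused`, and the algorithm strings are hypothetical and chosen only for this example.

```python
from typing import Iterable, Sequence


class ChecksumRefused(Exception):
    """Raised when the client offers nothing the server will accept."""


def negotiate(server_acceptable: Sequence[str], client_offered: Iterable[str]) -> str:
    """Return the first server-acceptable algorithm that the client offers.

    server_acceptable is ordered from preferred to required minimum,
    e.g. ["sha256", "sha1"] for "prefers SHA256, requires at least SHA1".
    """
    offered = set(client_offered)
    for algorithm in server_acceptable:
        if algorithm in offered:
            return algorithm
    # The client meets neither the preference nor the requirement.
    raise ChecksumRefused("client offers no acceptable checksum algorithm")


# Client that supports the preferred algorithm.
best = negotiate(["sha256", "sha1"], ["sha1", "sha256"])
# Client that only meets the required minimum.
minimum = negotiate(["sha256", "sha1"], ["sha1", "md5"])
# Client that meets neither: the server refuses.
try:
    negotiate(["sha256", "sha1"], ["md5"])
    refused = False
except ChecksumRefused:
    refused = True
```

Keeping the server's list ordered makes "prefer" and "require" two ends of a single setting, which may or may not match the semantics the project finally settles on.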

Requirements List for client/server

# | Date | Category | Source | Customer Problem | Customer Need | Product Requirement
1 | 2005-05-04 | Approachability | n/a | It should be easy to turn checksums on and off on a per fs/share basis | Ease of administration | We must make the changes to the mount/share command intuitive
2 | 2005-05-04 | Approachability | n/a | It should be easy to specify a checksum algorithm on a per fs/share basis | Ease of administration | We must make the changes to the mount/share command intuitive
3 | 2005-05-04 | Approachability | n/a | It should be possible to upgrade/downgrade the client or the server at will without significant performance degradation | Ease of administration | The checksum solution should adapt itself to changes in checksum capabilities at the client and the server
4 | 2005-05-24 | RAS | n/a | If I upgrade/downgrade the server, the client should just work | Upgrades/downgrades should be seamless | The client shouldn't need to unmount and remount after a server upgrade/downgrade
5 | 2005-05-04 | RAS | n/a | With checksums enabled the NFSv4 performance degradation should be minimal | Performance hit shouldn't be significant, otherwise it's a non-starter | We need to make sure the checksum protocol as well as the default algorithms are well optimised
6 | 2005-05-04 | Observability | n/a | With checksums enabled, we should be able to tell the relative performance degradation between various algorithms | It's useful to be able to characterize the performance hit so the client/server can be tuned appropriately | We need to be able to somehow get performance data and provide it to the user via some common nfs command
7 | 2005-06-08 | Approachability | Marketing Input | Administering checksums should fit in nicely with the new shareadm command | Ease of administration | There shouldn't be a new command to administer checksums
8 | 2005-06-08 | Approachability | Marketing Input | It should be easy to administer policies on the client and the server | Ease of administration | It should be easy to specify a recommended policy or a mandatory policy for checksum use
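For requirement 6 above (telling the relative cost of the various algorithms), a rough user-level measurement might look like the sketch below. It only times Python's standard `hashlib` and `zlib` routines on an in-memory buffer, so it says nothing about in-kernel NFS performance; the helper names and the choice of candidate algorithms (CRC32 in particular is not mentioned in this PRD) are assumptions made for the illustration.

```python
import hashlib
import time
import zlib

# Candidate algorithms; sha1 and sha256 appear in this PRD, crc32 is an
# illustrative addition.
CANDIDATES = {
    "crc32": zlib.crc32,
    "sha1": lambda buf: hashlib.sha1(buf).digest(),
    "sha256": lambda buf: hashlib.sha256(buf).digest(),
}


def throughput_mb_per_s(checksum, payload: bytes, rounds: int = 20) -> float:
    """Checksum the payload `rounds` times and return MB/s."""
    start = time.perf_counter()
    for _ in range(rounds):
        checksum(payload)
    elapsed = time.perf_counter() - start
    megabytes = len(payload) * rounds / (1024 * 1024)
    return megabytes / max(elapsed, 1e-9)  # guard against a zero reading


def compare(payload_size: int = 1 << 16) -> dict:
    """Return {algorithm name: throughput in MB/s} for every candidate."""
    payload = bytes(payload_size)
    return {name: throughput_mb_per_s(fn, payload) for name, fn in CANDIDATES.items()}


results = compare()
for name, rate in sorted(results.items(), key=lambda item: -item[1]):
    print(f"{name:8s} {rate:10.1f} MB/s")
```

The absolute numbers depend entirely on the machine; only the ratio between algorithms is of interest here.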

Requirements Description

1. Input from Sponsors - Checksumming should be on by default with a reasonable checksum algorithm otherwise no one will use it. For the rare case that the cksums need to be turned off, it should be easy to do so.

2. Input from Sponsors - There should be a single way to configure the default use and the algorithms for all shares and mounts. In a default configuration, the dfstab should be unchanged from what it is today and still be checksumming.

3. Marketing Input - Administering checksums should fit in nicely with the new command (shareadm?) that Doug is working on.

4. Marketing Input - Changes introduced by this project shouldn't preclude any work needed to enable file level checksums sometime in the future.

5. Marketing Input - At least one checksum algorithm should meet military requirements.

6. Marketing Input - This project should not preclude any future work required to specify a recommended/mandatory policy for checksum use. NOTE: Creating a way to "specify" policies is NOT a requirement on this specific project.

7. If the client has krb5i protection enabled, OTW checksums should be configured to be turned off so that the integrity-protection effort is not duplicated.
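Items 1, 2 and 7 together describe a small decision rule: checksums are on by default, can be switched off or given a specific algorithm, and are turned off when krb5i is in use. A minimal Python sketch of that rule follows. The option names (`sec`, `cksum`), the `"off"` value, and the default algorithm are all hypothetical; this PRD does not define the mount/share syntax or name a default algorithm.

```python
from typing import Optional

# Placeholder default; the PRD only asks for "a reasonable checksum algorithm".
DEFAULT_ALGORITHM = "sha1"

# Security flavors that already provide integrity protection. The PRD only
# names krb5i; whether other flavors belong here is left open.
INTEGRITY_FLAVORS = {"krb5i"}


def effective_checksum(sec: str = "sys", cksum: Optional[str] = None) -> Optional[str]:
    """Return the checksum algorithm to use, or None if checksums are off."""
    if cksum == "off":
        return None  # administrator explicitly disabled checksums
    if sec in INTEGRITY_FLAVORS:
        return None  # item 7: don't duplicate krb5i integrity protection
    if cksum is not None:
        return cksum  # administrator chose a specific algorithm
    return DEFAULT_ALGORITHM  # item 1: on by default with no configuration


default_case = effective_checksum()
krb5i_case = effective_checksum(sec="krb5i")
disabled_case = effective_checksum(cksum="off")
explicit_case = effective_checksum(cksum="sha256")
```

With no options at all the function still returns an algorithm, which mirrors the requirement that an unchanged dfstab should still be checksumming.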

Competitive Analysis

Sun will be one of the early NFSv4 vendors to implement checksums at the protocol level. I don't believe any of the other vendors implement cksums.

Competitive Listing

We will need to keep an eye on what Linux does in this space, as well as what vendors like NetApp do.

Competitive Assessment

Strategy Fit

It fits well with the Insight strategy of ensuring data integrity for the entire Solaris I/O stack.

Impact

Business Justification

Ensuring data integrity at the NFS protocol level will facilitate deployment of NFSv4 in the data center.

Issues/Risks and Proposed Mitigation

  • Solaris doesn't have an implementation of a cksum algorithm that this project might be able to use. This risk may be mitigated by providing a private implementation in the interim.
  • Performance degradation to the point where enabling cksums would be a significant performance hit needs to be kept in consideration (perhaps by doing a performance evaluation right upfront at the prototype stage).
  • An alternate cksum proposal that is vastly different from our implementation might come up at the IETF WG, making interoperability hard. This risk will be mitigated by watching the IETF WG closely and influencing it so that we don't paint ourselves into a corner.
Page last modified on September 09, 2005, at 07:20 PM