Reparse Points and Referrals Umbrella Case (PSARC 2009/399)

Alan Wright (amw@sun.com)
Draft 1, July 28th, 2009

1. Introduction

This project is the umbrella case to document the roadmap for reparse points and referrals. The roadmap will be delivered in phases and will be documented in subsequent PSARC cases. The intent of this case is to provide the overall context and motivation to add reparse points by describing an implementation (referrals) that will be built on the fundamental infrastructure provided by reparse points. This roadmap will improve both UNIX and Windows interoperability by adding support for features that are being delivered by other UNIX vendors and Microsoft (for Windows).

The explosive growth of data storage and the proliferation of file servers and NAS appliances have created a namespace management nightmare for corporate data centers and storage administrators. Administrators spend a great deal of time on file management and data movement tasks, which are tedious and time-consuming, disruptive to users, and expensive for companies. Companies are looking for better ways to scale and manage their file systems, and global namespaces provide a means to simplify the problem.

A namespace is a logical abstraction layer between clients and file systems; it provides a method of viewing and accessing files that is independent of the physical file locations. Administrators can use namespaces to logically arrange and present data to users, irrespective of where the data is located.

The two most popular file sharing protocols, NFS and SMB, each offer a mechanism to implement global namespaces. NFSv4.x offers it primarily for UNIX environments through referrals and SMB offers it primarily for Windows environments through the Distributed File System (DFS).

In this context, a namespace is a group of exports or shared folders, possibly located on different servers, that is presented as a virtual directory tree: it appears to users as if they are accessing a local file system on the server that is hosting the namespace.

A virtual namespace on a host server is typically composed of three components:

- A root folder that is exported or shared by the host server so that the namespace is visible over the network
- Regular folders
- Links that represent exports or shares [typically on other systems] that hold the actual data behind the virtual namespace

This unified namespace is managed centrally on the server that hosts it. The links in a namespace provide the mapping between the virtual namespace and the physical backend servers, serving as a layer of abstraction that eases administration.

Solaris is in the unique position of offering both NFS and SMB services natively, which means it can offer a true heterogeneous unified namespace for these network file protocols. With the addition of reparse points and referrals, Solaris will support global namespaces that can be shared over both NFS and SMB simultaneously or separately.

Reparse points are a feature of NTFS, the main file system used with Microsoft Windows, and provide the underpinning for symbolic links, junctions, mount points and, of particular interest here, the Microsoft Distributed File System (MS-DFS). MS-DFS is a referral mechanism that is conceptually similar to NFSv4.x referrals.

The addition of reparse points will provide the enabling infrastructure to support both SMB referrals (MS-DFS) and NFS referrals, which may be used to provide powerful and flexible features, such as namespace aggregation. In essence, referrals extend the concept of a symbolic link to go beyond the local system and refer clients to locations within file system namespaces on other systems.
The result would be the ability for customers to create a unified namespace from a collection of separate and/or disparate file systems on a collection of independent machines.

2. Scope

This case does not deal with the specifics of reparse points or referrals; the intent is to introduce the subject matter and present the roadmap.

The infrastructure being proposed may introduce changes to existing file system objects and VFS interfaces. Significant efforts will be made to prevent incompatibilities and minimize the potential for any changes proposed in this or the follow-on cases to affect existing behavior.

Automounter extensions were considered but rejected because of the desire to create centrally administered namespaces, served by a group of file servers to near-zero-administration clients. It is expected to be easier to keep the namespaces uniform if only a small number of servers need to participate. Also, for both NFS and SMB referrals it is the client that selects the target rather than the server. The server only provides target information, which may include several possible targets, and it is up to clients to select a specific target to access data at the alternate location.

The intent is to avoid introducing changes that will affect the Single UNIX Specification or the IEEE and ISO POSIX standards.

3. Reparse Points and Referrals

An overview of the identified projects follows. The specifics of each sub-project will be detailed in the follow-on projects.

Reparse points are file system objects that allow applications and services (consumers) to store tagged identifiers. Identifiers are typed such that consumers can recognize their own identifiers and take consumer-specific action when such an object is encountered in the file system. Referrals represent a specific use of reparse points to redirect a service, such as NFS or SMB, to another location.
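
The typed-identifier model described above can be illustrated with a minimal sketch. This is not the proposed interface: the class names, the tag strings and the handler signature below are all hypothetical, and the actual representation will be defined by the follow-on reparse points project. The sketch only shows the idea that each consumer acts on identifiers carrying its own tag and ignores the rest.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ReparsePoint:
    """Hypothetical in-memory view of a reparse point: a type tag plus opaque data."""
    tag: str
    data: str


class Consumer:
    """A service that acts only on reparse points carrying its own tag."""

    def __init__(self, tag: str, handler: Callable[[ReparsePoint], str]) -> None:
        self.tag = tag
        self.handler = handler

    def encounter(self, rp: ReparsePoint) -> Optional[str]:
        # Identifiers are typed, so a consumer skips anything it does not own.
        if rp.tag != self.tag:
            return None
        return self.handler(rp)


if __name__ == "__main__":
    # Illustrative consumers; the tags "NFS" and "SMB" are assumptions.
    nfs = Consumer("NFS", lambda rp: "NFS acts on " + rp.data)
    smb = Consumer("SMB", lambda rp: "SMB acts on " + rp.data)
    rp = ReparsePoint(tag="NFS", data="opaque-identifier")
    print(nfs.encounter(rp))
    print(smb.encounter(rp))
```

In this sketch the file system never inspects the tag or the data; only the consumers do, which matches the expectation that the stored identifier is meaningful only to the service that created it.
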
Referrals are similar to symlinks in that they redirect path processing to resume at a target location but they extend the model to include targets on different, disparate file systems, which may reside on other machines running different operating systems. Referrals provide a means to create aggregated namespaces from a collection of separate and/or disparate file systems on a collection of independent machines.

File systems should treat reparse points and referrals as opaque objects and are not expected to interpret them.

The follow-on projects will define the specific implementation details and syntax but, for the purpose of illustration only, the NFS service might create a reparse point represented by the tag NFS and a UUID. When the NFS service encounters this reparse point, it can take the appropriate action indicated by the UUID. A reparse point of the form

    NFS:550e8400-e29b-41d4-a716-446655440000

may resolve to path@host. The NFS server would refer an NFS client to connect to 'path' on 'host'. Other services, such as SMB, would not attempt to act on this reparse point.

The following RFEs have been raised:

    6232743 Server should support NFS4ERR_MOVED error and fs_locations attribute
    6711751 SMB/CIFS Distributed File System (DFS)

The follow-on projects are:

1. Reparse Points

    The reparse points project will define and implement the infrastructure to support these objects, which will include a library and service to parse reparse tags.

2. NFSv4 Referrals

    This project will provide support for NFSv4 referrals as defined in RFC 3530.

3. Standalone MS-DFS

    This project will provide support for SMB referrals in a standalone (single root) configuration as defined in [MS-DFSC]: Distributed File System (DFS): Referral Protocol Specification and [MS-DFSNM]: Distributed File System (DFS): Namespace Management Protocol Specification.

    A DFS root is the top of a DFS topology and the start of a shared folders hierarchy.
    A DFS root can be defined at the server level or at the domain level. A standalone DFS root is hosted and stored on a single machine, as opposed to being stored in Active Directory.

4. Domain-based MS-DFS

    This project will extend the functionality from the standalone MS-DFS project to add support for domain-based DFS. Domain-based DFS is hosted on Active Directory member servers or domain controllers, with the topology stored in Active Directory.

5. NFS Support for FedFS

    This project will implement the Federated FS work being standardized in the IETF NFSv4 working group. A referral will be resolved by consulting a centrally-managed Name Services Database (NSDB) to support the construction of uniform enterprise namespaces.

4. Interfaces and Dependencies

NFS and SMB referrals will depend on reparse points. Interfaces will be specified by the follow-on projects on a per-project basis.

Configuration will be needed to define referrals with a goal to keep administration simple. Where possible, Windows MS-DFS tools will be supported. Configuration changes never require a reboot.

5. Security Impact

Referrals depend on the security in place at the referral target. No security impact is envisaged with the addition of reparse points or referrals.

6. NFS Referrals in other UNIX Operating Systems

Referrals have been implemented in other major UNIX distributions such as AIX, HP-UX and Linux but there is no unified approach or implementation; see [2,3,4,5,6].

AIX, HP-UX and Linux specify referrals as an NFS export option. The option format is basically the same in all three operating systems (refer=path@host) but the presentation is somewhat different in each case:

- In AIX a special object is used to represent a referral.
- In HP-UX a referral is a file system partition or logical volume.
- In Linux a referral is presented as a mount point.

These are all mechanisms to trigger a change in namespace while resolving a path.
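
As an illustration of the shared option format, the following sketch extracts a refer=path@host target from an export option string. It makes several assumptions that are not taken from any of the operating systems named above: options are comma separated, only a single target is given, and the function and type names are hypothetical. The exact syntax of each implementation is described in the cited references.

```python
from typing import NamedTuple, Optional


class ReferTarget(NamedTuple):
    """Hypothetical holder for the two parts of a refer=path@host option."""
    path: str
    host: str


def parse_refer_option(options: str) -> Optional[ReferTarget]:
    """Return the refer target from a comma-separated option string, if any."""
    for opt in options.split(","):
        name, sep, value = opt.strip().partition("=")
        if name != "refer" or not sep:
            continue
        # Split on the last '@' so that a path containing '@' is tolerated.
        path, at, host = value.rpartition("@")
        if not at or not path or not host:
            raise ValueError("malformed refer option: " + repr(opt))
        return ReferTarget(path=path, host=host)
    # No refer option present: the export is an ordinary one.
    return None


if __name__ == "__main__":
    # The path and host names below are made up for the example.
    print(parse_refer_option("rw,refer=/export/data@server2"))
    print(parse_refer_option("rw"))
```

A server holding such a target would hand the path and host to the client as referral information; as noted in the Scope section, choosing and contacting the target remains the client's job.
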
This case is somewhat aligned with the AIX approach but does not require a new object type to be defined, which has the advantage of not impacting existing applications. An NFS "refer" option will be supported to provide option format compatibility.

Note that the Solaris requirements include support for both NFS and SMB referrals whereas these other operating systems only support NFS referrals, and they do not provide native SMB support. For the Solaris operating system, this proposal provides a generic solution to support multiple, disparate referral mechanisms without placing restrictions on the format required by each mechanism.

To provide compatibility with other UNIX operating systems, sharemgr(1M) will be enhanced to support a refer option for NFS exports. Details of this option will be presented as part of the NFSv4 referrals project.

7. References

[1] RFC 3530 NFSv4 Protocol
    http://www.ietf.org/rfc/rfc3530.txt
[2] http://www.citi.umich.edu/projects/nfsv4/linux/using-referrals.html
[3] http://nfsv4.bullopensource.org/doc/migration-and-replication-0.2.pdf
[4] http://docs.hp.com/en/5900-0306/ch01s11.html?jumpid=reg_R1002_USEN
[5] http://docs.hp.com/en/13578/nfsv4_whitepaper.pdf
[6] http://publib.boulder.ibm.com/infocenter/systems/index.jsp?topic=/com.ibm.aix.commadmn/doc/commadmndita/nfs_referrals.htm
[7] Windows Server Protocols (WSPP)
    http://msdn.microsoft.com/en-us/library/cc197979(PROT.10).aspx
[8] [MS-DFSC]: Distributed File System (DFS): Referral Protocol Specification
[9] [MS-DFSNM]: Distributed File System (DFS): Namespace Management Protocol Specification
[10] Overview of DFS in Windows 2000
    http://support.microsoft.com/kb/812487