1.0 CONTEXT

    This fast-track was spun off of the CIFS Service (PSARC 2006/715) case
    along with:

        PSARC 2007/218 caller_context_t in all VOPs
        PSARC 2007/227 VFS Feature Registration and ACL on Create
        PSARC 2007/244 ZFS case-insensitive support
        PSARC 2007/268 Support for CIFS share reservations

    Although each of these changes is part of the bigger picture, they
    have been broken down into smaller pieces so each gets the attention
    it deserves.

    Several of these CIFS-related fast-tracks will describe changes to
    the signatures of vnode operations (VOPs) or the data structures they
    use. In some cases, these fast-tracks will describe multiple changes
    to the same VOPs. The project team intends to put all of these
    changes into ON in a single putback.

    NOTE: The VFSDEF_VERSION number in sys/vfs.h will be bumped from 3 to
    4 in order to prevent unbundled file system kernel modules with the
    old signatures from loading. Once the unbundled file system modules
    are updated with the new signatures and recompiled, they will also
    pick up the new VFSDEF_VERSION number and be allowed to load.

    The project team requests MINOR binding.

2.0 OVERVIEW

    The Solaris CIFS project requires file system support for a number of
    new "system" attributes. For file systems that fully support CIFS,
    the system attributes will be present on all regular files,
    directories, and opaque extended attribute files. The file system
    will need to update the attribute(s) as a result of various file
    system operations such as data written to a file. The CIFS server
    and application programs will also need to be able to query/update
    the attributes when needed.

    Additionally, the CIFS server will be using the Solaris extended
    attribute model for storing a number of opaque file attributes. ZFS
    will be the primary underlying file system for the CIFS server.

    In order to accommodate the existing extended attribute aware
    utilities in Solaris, the new attributes will be exposed as "views"
    via the existing Solaris extended attribute model. The new attributes
    will be grouped in a number of SUNWattr* files that will have
    multiple attributes in a single file composed as an XDR packed
    nvlist. Sections 4.1 and 4.2 will describe the interfaces for
    manipulating system attributes and how they are exposed as Solaris
    extended attributes.

    It's important to note that there are three distinct types of file
    attributes being discussed:

    - Standard file attributes are the file attributes that are returned
      with the stat(2) system call.

    - Extended attributes as described by PSARC 1999/209 Extended File
      Attributes.

    - Extensible system attributes as described by this case.

    There are two specific, intentional areas where the interfaces that
    deal with extensible system attributes overlap with the other two
    types:

    * Standard file attributes and the extensible system attributes are
      retrieved and set using VOP_GETATTR() and VOP_SETATTR()
      respectively. Section 4.3 of the spec describes these interfaces
      and their extensions.

    * Extensible system attributes can be viewed through two special
      extended attribute files: SUNWattr_ro and SUNWattr_rw. Sections
      4.1 and 4.1.1 describe the views and the attributes available
      through each.

3.0 NEW SYSTEM ATTRIBUTES REQUIRED

3.1 DOS Attributes

    The CIFS server requires the file system to support the old "DOS"
    attributes. These attributes can be set/cleared by the owner of a
    file or a user/group that has been granted the permission via the
    "write_attributes" ACE permission.

    CREATETIME  The timestamp when a file is created. The owner of the
                file or any user with "write_attributes" permission has
                the ability to modify this value to any time.

    ARCHIVE     Attribute used to indicate if a file has been modified
                since it was last backed up.
                Whenever the modification time (mtime) of a file is
                changed the "archive" attribute will be set.

    READONLY    Attribute to mark a file as readonly. Once a file is
                marked as readonly the content data of the file cannot
                be modified. Other metadata for the file can still be
                modified. This attribute can be set on directories but
                it has no semantic meaning there. All attempts to modify
                the content of the file will return EPERM.

    HIDDEN      Attribute to mark a file as hidden. Solaris allows the
                attribute to be set but it only has meaning in the
                context of a CIFS server.

    SYSTEM      Solaris has no special semantics for this attribute, but
                it can be set/cleared with appropriate privilege.

3.2 BSD/MacOS X Attributes

    In order to ease the porting of ZFS to BSD and MacOS X the following
    "system" attributes will be supported in ZFS. Setting these
    attributes requires PRIV_FILE_FLAG_SET. Clearing these attributes
    requires ALL privileges.

    NOUNLINK    Attribute that prevents a file from being deleted. On a
                directory, the attribute will also prevent any changes
                to the contents of the directory. That is, no files
                within the directory can be removed or renamed. The
                errno EPERM will be returned when attempting to unlink
                or rename files and directories that are marked as
                NOUNLINK.

    IMMUTABLE   Attribute used to prevent the content of a file from
                being modified or deleted. Also prevents all metadata
                changes, except for access time updates. When placed on
                a directory the attribute will prevent the deletion and
                creation of files and directories. Attempts to modify
                the content of a file or directory marked as IMMUTABLE
                will fail with EPERM being returned. Attempts to modify
                any attributes (with the exception of access time and,
                with the proper privileges, the IMMUTABLE attribute) of
                a file marked as IMMUTABLE will fail with EPERM.

    APPENDONLY  Used to allow a file to be modified only at offset EOF.
                Attempts to modify a file at a location other than EOF
                will fail with EPERM.
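
    The content-modification rules in sections 3.1 and 3.2 can be read
    as a small decision procedure. The following is a minimal sketch for
    regular files only, not the ZFS implementation: the file_flags_t
    type and the check_write() and check_unlink() names are hypothetical,
    and returning EPERM when deleting an IMMUTABLE file is an assumption
    (the text above states only that deletion is prevented).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-file flag set; names mirror the attributes above. */
typedef struct file_flags {
	bool readonly;		/* READONLY (3.1) */
	bool immutable;		/* IMMUTABLE (3.2) */
	bool appendonly;	/* APPENDONLY (3.2) */
	bool nounlink;		/* NOUNLINK (3.2) */
} file_flags_t;

/*
 * Decide whether a write starting at 'offset' may proceed on a regular
 * file whose current size is 'size'. Returns 0 if allowed, else EPERM.
 */
int
check_write(const file_flags_t *ff, uint64_t offset, uint64_t size)
{
	assert(ff != NULL);
	/* READONLY and IMMUTABLE both forbid any content change. */
	if (ff->readonly || ff->immutable)
		return (EPERM);
	/* APPENDONLY permits modification only at offset EOF. */
	if (ff->appendonly && offset != size)
		return (EPERM);
	return (0);
}

/*
 * Decide whether the file may be unlinked. Returns 0 if allowed, else
 * EPERM. EPERM for IMMUTABLE is an assumption of this sketch.
 */
int
check_unlink(const file_flags_t *ff)
{
	assert(ff != NULL);
	if (ff->nounlink || ff->immutable)
		return (EPERM);
	return (0);
}
```

    Note that READONLY does not appear in check_unlink(): section 3.1
    restricts only content changes for that attribute.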

3.3 Third-Party Requested Attributes

    The following attributes have been requested by a third-party vendor
    porting ZFS to a different platform. Solaris will not have any
    semantics associated with them at this time. Callers will be
    permitted to read the attributes. Modifications to the attributes
    are described below.

    NODUMP      Solaris has no special semantics for this attribute.
                Modification of NODUMP requires PRIV_FILE_FLAG_SET to
                set it and ALL privileges to clear it.

    SETTABLE    Solaris has no special semantics for this attribute.
                Attempts to set this will fail with EPERM.

    OPAQUE      Solaris has no special semantics for this attribute.
                Attempts to set this will fail with EPERM.

3.4 Anti-Virus attributes

    AV_QUARANTINED  Set by anti-virus software to mark a file as
                    quarantined. See PSARC/2007/118 for more details on
                    the Virus Scan case.

    AV_MODIFIED     Anti-virus attribute which ZFS will set whenever a
                    file's content or size changes or when the file is
                    renamed. See PSARC/2007/118 for more detail on the
                    Virus Scan case.

4.0 INTERFACES

    At the application layer, the standard attributes are retrieved and
    modified by a variety of system calls (e.g., stat(2), chmod(2),
    chgrp(2), utimes(2)). The system call interface uses a vattr_t
    structure in conjunction with the VOP_GETATTR() and VOP_SETATTR()
    file system interfaces to retrieve and modify attributes at the
    kernel level. These interfaces are efficient but are limited to a
    fixed set of attributes.

    The CIFS server project requires a new set of system level
    attributes and file system support for their semantics. To make
    matters more interesting, these new attributes are not required for
    all file systems. The challenge is to maintain support for the
    existing interfaces and provide support for these new, optional
    attributes.

    This case proposes a set of solutions that:

    - Introduce new libc interfaces to allow applications to manipulate
      the optional attributes.
      The new libc interfaces will utilize nvlist/nvpairs to request the
      attributes an application is interested in. The attribute data
      will be returned in an nvlist. The nvlists are used so that we
      will have the ability to easily extend the list of attributes an
      application can manipulate. The new libc interfaces will be built
      on top of the extended attribute mechanism as described below.

    - Enhance the file system extended attribute interface to support
      system attributes. Support will be added for readonly system
      attributes as well as read-write system attributes.

    - Extend the standard Solaris vattr_t to allow additional, optional
      attributes to be set/retrieved as part of the standard
      VOP_GETATTR() and VOP_SETATTR() interface.

4.1 Extensions to Extended Attribute Interface for System Attributes

    One of the challenges of integrating optional system attributes into
    Solaris is the issue of support from the file management utilities
    such as mv, cp, tar, cpio, etc. The solution is to augment the
    existing extended attribute interface to handle system attributes.
    The file management utilities already support extended attributes so
    this solution enables these utilities to support system attributes
    without large modifications. Using this interface also centralizes
    attribute management and simplifies future changes to the attribute
    space as discussed in Section 4.4. For more on utilities, see
    section 4.5 "Changes to Utilities".

    Any regular file, directory, or opaque extended attribute file may
    have system attributes. Recursive system attributes (sysattrs on
    sysattrs) are not allowed.

    For every filesystem object that supports system attributes, the
    reserved system attribute names will always appear to exist.
    However, these names will be virtual and should not exist in an
    on-disk directory structure. No on-disk extended attribute directory
    will be created until the first creation of a regular extended
    attribute on that object.
    For filesystem objects that have other non-system attributes, the
    set of system attribute names will be concatenated with the set of
    non-system attribute names so that they appear to comprise a single
    coherent directory. Filesystems may support any combination of
    extensible system attributes and regular extended attributes.

    The following files will be added to the top level directory in the
    extended attribute namespace of all regular files and directories:

        -r--r--r--   1 root     root          88 May 16 16:17 SUNWattr_ro
        -rw-r--r--   1 root     root         484 May 16 16:17 SUNWattr_rw

    Instead of exposing each individual system attribute as a named
    object, we are exposing named sets of system attributes as views
    into the system attribute space. This is being done to minimize
    clutter in the name space.

    The XATTR_VIEW_READONLY view contains only the readonly system
    attributes. The XATTR_VIEW_READWRITE view contains all of the
    read/write system attributes. Section 4.1.1 lists the attributes and
    data types for each view.

    The file SUNWattr_ro corresponds to the XATTR_VIEW_READONLY view.
    The file SUNWattr_rw corresponds to the XATTR_VIEW_READWRITE view.
    The size of each file is the size of the nvlist for the attributes
    associated with that file/view. The supported views, attributes, and
    data types for each view are defined in the header file
    /usr/include/sys/attr.h.

    Any attempt to create a file or directory named SUNWattr_ro or
    SUNWattr_rw in the top level directory in the extended attribute
    namespace of a file will fail with EINVAL. Any attempt to remove
    either the SUNWattr_ro or the SUNWattr_rw file will fail with
    EACCES. Attempts to rename SUNWattr_ro or SUNWattr_rw will fail with
    EINVAL. Attempts to open the SUNWattr_ro file for write will fail
    with EACCES.

    Any attempt to read a different number of bytes than the size of the
    SUNWattr_ro or the SUNWattr_rw file will return the lesser of the
    number of bytes requested and the number of bytes in the file.
    If the number of bytes returned is less than the number of bytes in
    the file then the data is unlikely to be a valid nvlist. The
    expected usage is for an application to call stat() or fstat() to
    determine the size of the buffer that must be provided to read().
    The new libc interfaces described in 4.2.1 are implemented this way.

    All writes to the SUNWattr_rw file that do not contain a valid
    nvlist will fail with EINVAL. A valid nvlist must contain only valid
    nvpairs for one or more attributes associated with that file/view.
    Writes to the SUNWattr_rw file will have no effect on any system
    attributes not contained in the nvlist.

    Note that since the Solaris NFSv3/v4 clients and servers are already
    aware of the Solaris extended attribute interface, file management
    utilities (mv, cp, cpio, tar, etc) will work over NFS in a Solaris
    environment as described above. This does not imply protocol support
    for individual attributes. This is simply stating that the
    SUNWattr_rw and SUNWattr_ro files can both be read and SUNWattr_rw
    can be written over NFS on Solaris. (See section 4.3.1 for more on
    NFSv4 protocol support.)

4.1.1 Supported views and the attributes and data types for each view:

    View                    Attribute           Data type

    XATTR_VIEW_READONLY     A_FSID              uint64_t
                            A_MDEV              uint16_t

    XATTR_VIEW_READWRITE    A_READONLY          boolean_value
                            A_HIDDEN            boolean_value
                            A_SYSTEM            boolean_value
                            A_ARCHIVE           boolean_value
                            A_CRTIME            uint64_array[2]
                            A_NOUNLINK          boolean_value
                            A_IMMUTABLE         boolean_value
                            A_APPENDONLY        boolean_value
                            A_NODUMP            boolean_value
                            A_SETTABLE          boolean_value
                            A_OPAQUE            boolean_value
                            A_AV_QUARANTINED    boolean_value
                            A_AV_MODIFIED       boolean_value
                            A_OWNERSID          nvlist_t ***
                            A_GROUPSID          nvlist_t ***

    *** The nvlist_t values are composed of uint32_t types and strings

    NOTES:

    A_FSID and A_MDEV represent the file system ID and the mounted
    device, respectively. Those values are determined at mount time and
    cannot be set by an application.

    A_OWNERSID and A_GROUPSID represent the CIFS owner and group,
    respectively, and will only be returned if they exist.

4.2 User-Level API for Optional System Attributes

    The following sections describe the new interfaces to support system
    attributes and discuss the utility changes necessary.

4.2.1 New Interfaces for System Attributes

    The following interfaces will be added to libc. A manual page that
    describes these interfaces has been added to the case directory.
    Applications that wish to retrieve or modify system attributes
    should include the new header file /usr/include/attr.h which defines
    the interfaces below.

        int fgetattr(int fildes, xattr_view_t view, nvlist_t **response);

        int fsetattr(int fildes, xattr_view_t view, nvlist_t *request);

        int getattrat(int fildes, xattr_view_t view, const char *filename,
            nvlist_t **response);

        int setattrat(int fildes, xattr_view_t view, const char *filename,
            nvlist_t *request);

    Where:

    'fildes'    must be an open file descriptor associated with the file
                object from which system attributes will be obtained or
                updated.

    'view'      must be one of the supported "views" into the system
                attributes associated with each file. The current list
                of supported views is XATTR_VIEW_READONLY and
                XATTR_VIEW_READWRITE. Additional views may be added in
                the future.

    'filename'  must be a file in the extended attribute directory
                associated with fildes. getattrat() will obtain system
                attributes from filename and setattrat() will update the
                system attributes for filename.

    'response'  must be the address of a pointer to an nvlist which will
                contain one nvpair for each of the system attributes
                associated with view.

    'request'   must be a pointer to an nvlist containing one or more
                nvpairs of system attributes associated with view.

    fgetattr() obtains an nvlist of system attribute information about
    the file associated with the provided open file descriptor.
    getattrat() obtains an nvlist of system attribute information about
    the extended attribute file with the provided filename in the
    extended attribute directory associated with the provided open file
    descriptor. Upon successful completion, the nvlist will contain one
    nvpair for each of the system attributes associated with the
    provided view. The current list of supported views is
    XATTR_VIEW_READONLY and XATTR_VIEW_READWRITE. Not all filesystems
    support all views and all attributes. The nvlist will not contain an
    nvpair for any view or attribute not supported by the underlying
    filesystem.

    fsetattr() uses the provided nvlist to update one or more of the
    system attributes of the file associated with the provided open file
    descriptor. setattrat() uses the provided nvlist to update one or
    more of the system attributes of the extended attribute file with
    the provided filename in the extended attribute directory associated
    with the provided open file descriptor. If completion is not
    successful then no system attribute information is updated.

    For more info on these APIs see the man page in the case directory.

4.2.2 Changes for pathconf(2)

    Retrieving the pathconf variable _PC_XATTR_EXISTS on a filesystem
    object currently returns 1 if the object has any extended
    attributes. For filesystems that support system attributes, every
    filesystem object will implicitly have extended attributes. In order
    for this pathconf to remain useful, _PC_XATTR_EXISTS will now return
    1 if and only if the object has real (non-system) extended
    attributes.

    Two new pathconf variables will be added:

    _PC_SATTR_ENABLED is similar to _PC_XATTR_ENABLED in that its value
    will be 1 if the object supports system attributes. Unlike
    _PC_XATTR_ENABLED, this is not necessarily enforced at VFS
    granularity. Filesystems may choose not to support system attributes
    for certain object types (e.g. .zfs).

    _PC_SATTR_EXISTS has the same semantics as _PC_SATTR_ENABLED and is
    being provided only for similarity to the extended attribute
    pathconf calls.

4.3 Extensible vattr_t for VOP_GETATTR()/VOP_SETATTR()

    The CIFS server requires several new file attributes to be added to
    ZFS to support Windows (CIFS) clients. These attributes are required
    on file systems that fully support CIFS, but are not required in
    general. They are considered to be optional attributes and aren't
    required to be supported on every file system.

    In order to support new, optional attributes, Solaris needs a fast,
    flexible attribute interface at the VOP level that can set and
    retrieve both standard and optional attributes. This interface must
    take into account the fact that the underlying file system may
    choose not to support those attributes. In addition, the interfaces
    must allow new, optional attributes to be added with minimal
    disruption of the existing interfaces. Ideally, attributes can be
    added in a patch or update release without affecting unbundled file
    systems.

    The following proposes an extensible structure that can be used with
    the existing VOP_SETATTR/VOP_GETATTR interface to set/retrieve both
    required and optional system attributes. This structure can also be
    used by VOP_MKDIR() and VOP_CREATE() in place of their existing
    vattr_t pointer to set optional attributes at create time.

    This new structure, xvattr_t, contains the following:

    - the existing vattr_t structure (embedded) as the first member

    - two counted arrays of 32 bit words each of which represents an
      extensible bitmap of optional attributes. The first bitmap
      represents the requested attributes. The second represents the
      returned attributes. That is, the attributes that the file system
      was able to process.

    - a structure of values for the optional file attributes

    - additional fields to support extensibility

    In vnode.h, we have the existing bit mask values which are used in
    the va_mask field of the existing vattr_t structure:

        /*
         * Attributes of interest to the caller of setattr or getattr.
         */
        #define AT_TYPE         0x0001
        #define AT_MODE         0x0002
        #define AT_UID          0x0004
        ...
        #define AT_SEQ          0x8000

        #define AT_ALL          (AT_TYPE|AT_MODE|AT_UID| ... |AT_SEQ)

    We now add:

        #define AT_XVATTR       0x10000 /* Optional attributes present */

    This bit indicates that the structure contains optional attributes
    set in the bitmap array. Note that we do *not* add AT_XVATTR to the
    AT_ALL #define. The callers who desire optional attributes will be
    required to set/check AT_XVATTR explicitly. This allows callers to
    use existing code without modification.

    The definition of xvattr_t is:

        typedef struct xvattr {
                vattr_t     xva_vattr;        /* Embedded vattr structure */
                uint32_t    xva_magic;        /* Magic Number */
                uint32_t    xva_mapsize;      /* Size of attr bitmaps */
                uint32_t    *xva_rtnattrmapp; /* Ptr to xva_rtnattrmap[] */
                uint32_t    xva_reqattrmap[XVA_MAPSIZE]; /* Requested attrs */
                uint32_t    xva_rtnattrmap[XVA_MAPSIZE]; /* Returned attrs */
                xoptattr_t  xva_xoptattrs;    /* Optional attributes */
        } xvattr_t;

    The fields of the xvattr structure are as follows:

    xva_vattr - The first element of an xvattr is an embedded legacy
        vattr structure which includes the common attributes. If
        AT_XVATTR is set in the va_mask (member of vattr_t) then the
        entire structure is treated as an xvattr_t. If AT_XVATTR is not
        set, then only the xva_vattr structure can be used.

    xva_magic - 0x78766174 (hex for "xvat"). Magic number for
        verification set by xva_init().

    xva_mapsize - Size of requested and returned attribute bitmaps. This
        is initialized to XVA_MAPSIZE by xva_init().

    xva_rtnattrmapp - Pointer to xva_rtnattrmap[]. In an update/patch,
        the size of xva_reqattrmap[] could change which means the
        location of xva_rtnattrmap[] could change.
        This will allow unbundled file systems to reliably locate the
        xva_rtnattrmap[] if the size of the attribute bitmaps changes.
        This value is initialized by the xva_init() routine.

    xva_reqattrmap[] - Array of requested attributes. Each attribute is
        represented by a specific bit in a specific element of the
        attribute map array. Callers set the bits corresponding to the
        attributes that the caller wants to get/set.

    xva_rtnattrmap[] - Array of attributes that the file system was able
        to process. Not all file systems support all optional
        attributes. This map informs the caller which attributes the
        underlying file system was able to set/get. (Same structure as
        the requested attributes array in terms of each attribute
        corresponding to a specific bit in a specific array element.)

    xva_xoptattrs - Structure containing values of optional attributes.
        For VOP_SETATTR(), these values are only valid to the file
        system if AT_XVATTR is set in the va_mask and the corresponding
        bits in xva_reqattrmap are set. Upon return from VOP_GETATTR(),
        these values are only valid if AT_XVATTR is set in the va_mask
        and the corresponding bits in xva_rtnattrmap are set (indicating
        that the underlying file system supports those attributes).

    The xoptattr_t structure consists of all optional attributes
    (described above):

        typedef struct xoptattr {
                timestruc_t     xoa_createtime;
                uint8_t         xoa_archive;
                uint8_t         xoa_system;
                uint8_t         xoa_readonly;
                uint8_t         xoa_hidden;
                uint8_t         xoa_nounlink;
                uint8_t         xoa_immutable;
                uint8_t         xoa_appendonly;
                uint8_t         xoa_nodump;
                uint8_t         xoa_settable;
                uint8_t         xoa_opaque;
                uint8_t         xoa_av_quarantined;
                uint8_t         xoa_av_modified;
        } xoptattr_t;

    The attribute map arrays, xva_reqattrmap[] and xva_rtnattrmap[], are
    used as follows:

    Each element of the bitmap array is represented as a 32 bit word.
    The member xva_mapsize determines the number of elements in the
    bitmap arrays. (Both xva_reqattrmap[] and xva_rtnattrmap[] have the
    same length.)

        element 0: bit 0 represents the first attribute, bit n is the
                   (n+1)th attribute
        element 1: bit 0 represents the 33rd attribute (assuming 32 bit
                   masks), bit m is the (33+m)th attribute
        ...and so on.

    Each attribute is represented by a specific bit in a specific array
    element of the attribute bitmaps. To this end, the position of each
    attribute in the bitmap is represented by a 64 bit word: the most
    significant 32 bits represent the index into the bitmaps, and the
    least significant 32 bits contain the bit corresponding to this
    attribute. The attributes are represented numerically as follows:

                                INDEX      BIT
        XAT_CREATETIME          0x00000000 00000001
        XAT_ARCHIVE             0x00000000 00000002
        XAT_SYSTEM              0x00000000 00000004
        XAT_READONLY            0x00000000 00000008
        XAT_HIDDEN              0x00000000 00000010
        XAT_NOUNLINK            0x00000000 00000020
        XAT_IMMUTABLE           0x00000000 00000040
        XAT_APPENDONLY          0x00000000 00000080
        XAT_NODUMP              0x00000000 00000100
        XAT_SETTABLE            0x00000000 00000200
        XAT_OPAQUE              0x00000000 00000400
        XAT_AV_QUARANTINED      0x00000000 00000800
        XAT_AV_MODIFIED         0x00000000 00001000

    For example, the representation in the bitmaps for the "hidden"
    attribute is the 0th element, bit 0x10. In the header file, these
    are represented by specific #defines which detail the individual
    bits and the manipulation necessary to represent the index. (See
    below)

    For the #defines representing the individual bits, convention is to
    use XAT{n}_{attrname} where "n" is the element in the bitmap
    (starting at 0). This convention is for the convenience of the
    maintainer to keep track of which element each attribute belongs to.
    The bitmap array index for the group is also defined so that the
    full representation (see below) can be built. Callers are strongly
    discouraged from using the XAT{n}_* #defines but providers (e.g.,
    file systems) may choose to use these for performance reasons.
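
    The split of the 64 bit form into a bitmap index and a bit can be
    checked with a short sketch. The XVA_SHFT, XAT0_INDEX, and
    XAT0_HIDDEN values are the ones given in this section; the XAT1_*
    and XAT_EXAMPLE names and the xat_index()/xat_bit() helpers are
    hypothetical and exist only to show what an attribute in element 1
    would look like.

```c
#include <assert.h>
#include <stdint.h>

#define	XVA_SHFT	32		/* Used to shift index */
#define	XAT0_INDEX	0LL		/* Index into bitmap for XAT0 attrs */
#define	XAT0_HIDDEN	0x00000010	/* Hidden */
#define	XAT_HIDDEN	((XAT0_INDEX << XVA_SHFT) | XAT0_HIDDEN)

/* Hypothetical attribute living in element 1, for illustration only. */
#define	XAT1_INDEX	1LL
#define	XAT1_EXAMPLE	0x00000001
#define	XAT_EXAMPLE	((XAT1_INDEX << XVA_SHFT) | XAT1_EXAMPLE)

/* Hypothetical helper: upper 32 bits select the bitmap element. */
uint32_t
xat_index(uint64_t attr)
{
	return ((uint32_t)(attr >> XVA_SHFT));
}

/* Hypothetical helper: lower 32 bits are the bit within that element. */
uint32_t
xat_bit(uint64_t attr)
{
	return ((uint32_t)(attr & 0xffffffffULL));
}
```

    With these definitions XAT_HIDDEN decodes to element 0, bit 0x10,
    matching the example above, while the made-up XAT_EXAMPLE decodes to
    element 1, bit 0x1.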

    #define XAT0_CREATETIME     0x00000001 /* Create time of file */
    #define XAT0_ARCHIVE        0x00000002 /* Archive */
    #define XAT0_SYSTEM         0x00000004 /* System */
    #define XAT0_READONLY       0x00000008 /* Readonly */
    #define XAT0_HIDDEN         0x00000010 /* Hidden */
    #define XAT0_NOUNLINK       0x00000020 /* Nounlink */
    #define XAT0_IMMUTABLE      0x00000040 /* immutable */
    #define XAT0_APPENDONLY     0x00000080 /* appendonly */
    #define XAT0_NODUMP         0x00000100 /* nodump */
    #define XAT0_SETTABLE       0x00000200 /* settable */
    #define XAT0_OPAQUE         0x00000400 /* opaque */
    #define XAT0_AV_QUARANTINED 0x00000800 /* anti-virus quarantine */
    #define XAT0_AV_MODIFIED    0x00001000 /* anti-virus modified */

    The following defines present a "flat namespace" so that consumers
    don't need to keep track of which attribute belongs to which bitmap
    element.

    #define XAT0_INDEX  0LL /* Index into bitmap for XAT0 attrs */
    #define XVA_SHFT    32  /* Used to shift index */

    #define XAT_CREATETIME     ((XAT0_INDEX << XVA_SHFT) | XAT0_CREATETIME)
    #define XAT_ARCHIVE        ((XAT0_INDEX << XVA_SHFT) | XAT0_ARCHIVE)
    #define XAT_SYSTEM         ((XAT0_INDEX << XVA_SHFT) | XAT0_SYSTEM)
    #define XAT_READONLY       ((XAT0_INDEX << XVA_SHFT) | XAT0_READONLY)
    #define XAT_HIDDEN         ((XAT0_INDEX << XVA_SHFT) | XAT0_HIDDEN)
    #define XAT_NOUNLINK       ((XAT0_INDEX << XVA_SHFT) | XAT0_NOUNLINK)
    #define XAT_IMMUTABLE      ((XAT0_INDEX << XVA_SHFT) | XAT0_IMMUTABLE)
    #define XAT_APPENDONLY     ((XAT0_INDEX << XVA_SHFT) | XAT0_APPENDONLY)
    #define XAT_NODUMP         ((XAT0_INDEX << XVA_SHFT) | XAT0_NODUMP)
    #define XAT_SETTABLE       ((XAT0_INDEX << XVA_SHFT) | XAT0_SETTABLE)
    #define XAT_OPAQUE         ((XAT0_INDEX << XVA_SHFT) | XAT0_OPAQUE)
    #define XAT_AV_QUARANTINED ((XAT0_INDEX << XVA_SHFT) | XAT0_AV_QUARANTINED)
    #define XAT_AV_MODIFIED    ((XAT0_INDEX << XVA_SHFT) | XAT0_AV_MODIFIED)

    The following routines and macros are provided for the convenience
    of both callers and providers (e.g., file systems).

    void xva_init(xvattr_t *xvap);
        * Initializes the xvattr structure pointed to by xvap.
          It zeros the structure, sets the size of the bitmaps,
          initializes the "magic" number, and sets the xva_rtnattrmapp
          pointer.

    xoptattr_t *xva_getxoptattr(xvattr_t *xvap);
        * If the structure pointed to by xvap is an xvattr_t (AT_XVATTR
          is set in the xva_vattr's va_mask, XVA_MAGIC is set in
          xva_magic), then return a pointer to the xva_xoptattrs
          structure. Otherwise, return NULL.

    XVA_SET_REQ(xvap, attr)
        * Sets an attribute bit in the proper element in the bitmap of
          requested attributes (xva_reqattrmap[]).

    XVA_SET_RTN(xvap, attr)
        * Sets an attribute bit in the proper element in the bitmap of
          returned attributes (xva_rtnattrmap[]).

    XVA_ISSET_REQ(xvap, attr)
        * Checks the requested attribute bitmap (xva_reqattrmap[]) to
          see if the corresponding attribute bit is set. If so, returns
          non-zero.

    XVA_ISSET_RTN(xvap, attr)
        * Checks the returned attribute bitmap (xva_rtnattrmap[]) to see
          if the corresponding attribute bit is set. If so, returns
          non-zero.

    Consumers and file systems that use these accessor routines are
    protected against changes in the size of the attribute bitmap
    arrays. This allows new attributes to be added in a patch/update
    without affecting unbundled file systems.

    The following describes how a caller can retrieve and set optional
    attributes using the xvattr_t structure:

    - The caller uses xva_init() to initialize the xvattr_t structure.
      Among other things, this zeros the structure and sets the size of
      the attribute bitmap arrays.

    - The caller uses the XVA_SET_REQ() macro to set individual
      attribute bits in the xva_reqattrmap[] array.

    - For retrieving attributes, the xva_xoptattrs field has already
      been cleared in xva_init() so the caller simply passes a pointer
      to the xvattr_t structure to VOP_GETATTR().

    - For setting attributes, the caller sets the corresponding values
      in the xva_xoptattrs field then passes a pointer to the xvattr_t
      structure to VOP_SETATTR().

    - On a successful return from VOP_GETATTR/VOP_SETATTR:

      * The caller checks the va_mask of xva_vattr to see if AT_XVATTR
        is set. This bit will be cleared in fop_getattr()/fop_setattr()
        if the underlying file system does not support optional
        attributes. If the AT_XVATTR bit is not set, then the underlying
        file system was unable to process the optional attributes.

      * If AT_XVATTR was set, then the caller uses XVA_ISSET_RTN() on
        each attribute to see if the file system processed the
        attribute. For any attribute bit that was not set, the attribute
        value was not processed. In the case of VOP_GETATTR(), it means
        that the corresponding value in the xva_xoptattrs field is
        invalid. In the case of VOP_SETATTR(), it means that the value
        was not set.

      * For retrieving attributes, the value of the corresponding
        attribute is found in the xva_xoptattrs structure.

    Existing callers and file systems that do not participate in the new
    attributes do not need to change. They simply manipulate and access
    the vattr_t passed to VOP_GETATTR()/VOP_SETATTR() as before.

    A new VFS Feature (PSARC 2007/227) is being added to indicate that a
    file system supports xvattr_t and the AT_XVATTR bit: VFSFT_XVATTR.
    The fop_getattr() and fop_setattr() routines will query for this
    feature. If this feature is not present, then the fop routines will
    clear the AT_XVATTR bit from the va_mask field of the (x)vattr_t.
    Kernel modules can also query for this VFS Feature to determine if a
    file system supports optional attributes (xvattr_t/AT_XVATTR).

4.3.1 NFSv4 Support for Optional Attributes (Potential Future Work)

    The NFSv4 protocol has an attribute model that supports extensions.
    In fact, four of the new attributes (ARCHIVE, HIDDEN, SYSTEM, and
    CREATETIME) are part of the "recommended" list of attributes for a
    client and server implementation. The Solaris NFSv4 client and
    server could be modified to support the new, optional attributes as
    specified by the protocol. This work is not currently funded.
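
    The caller workflow from section 4.3 can be modeled in user space.
    The following is a sketch under stated assumptions, not the kernel
    code: vattr_t and xoptattr_t are cut down to the fields used, the
    XVA_MAPSIZE value and the bodies of the XVA_* macros are assumed
    (this case specifies only their behavior), and mock_getattr() is a
    hypothetical stand-in for a file system that supports ARCHIVE but
    not HIDDEN.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define	AT_XVATTR	0x10000		/* Optional attributes present */
#define	XVA_MAGIC	0x78766174	/* hex for "xvat" */
#define	XVA_MAPSIZE	3		/* assumed size for this sketch */
#define	XVA_SHFT	32
#define	XAT0_INDEX	0LL
#define	XAT_ARCHIVE	((XAT0_INDEX << XVA_SHFT) | 0x00000002)
#define	XAT_HIDDEN	((XAT0_INDEX << XVA_SHFT) | 0x00000010)

/* Cut-down stand-ins for the real vattr_t and xoptattr_t. */
typedef struct vattr { uint32_t va_mask; } vattr_t;
typedef struct xoptattr {
	uint8_t xoa_archive;
	uint8_t xoa_hidden;
} xoptattr_t;

typedef struct xvattr {
	vattr_t		xva_vattr;
	uint32_t	xva_magic;
	uint32_t	xva_mapsize;
	uint32_t	*xva_rtnattrmapp;
	uint32_t	xva_reqattrmap[XVA_MAPSIZE];
	uint32_t	xva_rtnattrmap[XVA_MAPSIZE];
	xoptattr_t	xva_xoptattrs;
} xvattr_t;

/* Assumed macro bodies: element from the high word, bit from the low. */
#define	XVA_INDEX(attr)		((uint32_t)((uint64_t)(attr) >> XVA_SHFT))
#define	XVA_ATTRBIT(attr)	((uint32_t)((uint64_t)(attr) & 0xffffffffULL))
#define	XVA_SET_REQ(xvap, attr) \
	((xvap)->xva_reqattrmap[XVA_INDEX(attr)] |= XVA_ATTRBIT(attr))
#define	XVA_SET_RTN(xvap, attr) \
	((xvap)->xva_rtnattrmapp[XVA_INDEX(attr)] |= XVA_ATTRBIT(attr))
#define	XVA_ISSET_REQ(xvap, attr) \
	((xvap)->xva_reqattrmap[XVA_INDEX(attr)] & XVA_ATTRBIT(attr))
#define	XVA_ISSET_RTN(xvap, attr) \
	((xvap)->xva_rtnattrmapp[XVA_INDEX(attr)] & XVA_ATTRBIT(attr))

/* Zero the structure, then set map size, magic, and the rtn pointer. */
void
xva_init(xvattr_t *xvap)
{
	memset(xvap, 0, sizeof (*xvap));
	xvap->xva_mapsize = XVA_MAPSIZE;
	xvap->xva_magic = XVA_MAGIC;
	xvap->xva_rtnattrmapp = &xvap->xva_rtnattrmap[0];
}

/* Hypothetical provider: fills in ARCHIVE only, ignores HIDDEN. */
int
mock_getattr(xvattr_t *xvap)
{
	assert(xvap->xva_magic == XVA_MAGIC);
	if (!(xvap->xva_vattr.va_mask & AT_XVATTR))
		return (0);	/* plain vattr_t request, nothing optional */
	if (XVA_ISSET_REQ(xvap, XAT_ARCHIVE)) {
		xvap->xva_xoptattrs.xoa_archive = 1;
		XVA_SET_RTN(xvap, XAT_ARCHIVE);
	}
	return (0);
}

/* Caller side: initialize, request two attributes, call the provider. */
int
query_archive_hidden(xvattr_t *xvap)
{
	xva_init(xvap);
	xvap->xva_vattr.va_mask |= AT_XVATTR;	/* set explicitly by caller */
	XVA_SET_REQ(xvap, XAT_ARCHIVE);
	XVA_SET_REQ(xvap, XAT_HIDDEN);
	return (mock_getattr(xvap));
}
```

    After query_archive_hidden() returns, XVA_ISSET_RTN() is non-zero
    for XAT_ARCHIVE and zero for XAT_HIDDEN, so a caller would trust
    xoa_archive and treat xoa_hidden as invalid. Accessing the returned
    map through xva_rtnattrmapp rather than xva_rtnattrmap[] directly is
    what keeps the accessors correct if the map size changes.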

4.4 Managing Change in the Attribute Space

    The issue of adding new attributes and the semantics of those
    attributes needs to be discussed in an opensolaris community
    dedicated to file system framework issues. There is a proposal in
    the opensolaris governing board to reorganize (among other things)
    the file system area. The project team will work with the governing
    board to determine the appropriate forum to discuss general file
    system issues.

    In the meantime, we ask that the ARC work with project teams to
    ensure new proposals for attributes, their names, and their
    semantics are reasonable. Members of this project team will be
    monitoring both the opensolaris and PSARC aliases for related
    projects. Until an opensolaris community is up and running, project
    teams and PSARC are welcome to forward attribute-related questions
    to fs-framework@sun.com .

4.5 Changes to Utilities

    The project team is aware of an impact to the following utilities:
    ls, tar, cpio, pax, cp, mv, pack, unpack, pcat, etc. These utilities
    will need to handle a new option flag (similar to "-@") to specify
    archive/restore of sysattrs. Since _PC_XATTR_EXISTS will return 0
    for file system objects that only have system attributes (and no
    regular extended attributes), these utilities will need to query
    _PC_SATTR_EXISTS to determine the correct handling of system
    attributes. The utilities team will be submitting fast-tracks to
    address these changes in detail.

5.0 EXPORTED INTERFACE TABLE

                        |Proposed       |Specified      |
                        |Stability      |in what        |
Interface Name          |Classification |Document?      | Comments
===============================================================================
                        |Consolidation  |This           |
                        |Private        |Document       |
                        |               |               |
SUNWattr_ro             |               |               | Read-only view of
                        |               |               | extended attribute
                        |               |               | namespace
                        |               |               |
SUNWattr_rw             |               |               | Read-write view of
                        |               |               | extended attribute
                        |               |               | namespace
                        |               |               |
A_FSID, A_MDEV          |               |               | Contents of read-only
                        |               |               | view of extended
                        |               |               | attribute namespace
                        |               |               |
A_READONLY, A_HIDDEN,   |               |               | Contents of
A_SYSTEM, A_ARCHIVE,    |               |               | read-write view of
A_CRTIME, A_NOUNLINK,   |               |               | extended attribute
A_IMMUTABLE, A_NODUMP,  |               |               | namespace
A_APPENDONLY, A_OPAQUE, |               |               |
A_SETTABLE,             |               |               |
A_AV_QUARANTINED,       |               |               |
A_AV_MODIFIED,          |               |               |
A_OWNERSID, A_GROUPSID  |               |               |
                        |               |               |
_PC_SATTR_ENABLED       |               |               | New pathconf(2)
_PC_SATTR_EXISTS        |               |               | variables
                        |               |               |
struct xvattr           |               |               | Extensible version
xvattr_t                |               |               | of vattr_t
                        |               |               |
AT_XVATTR               |               |               | New flag for the
                        |               |               | va_mask of vattr_t
                        |               |               |
struct xoptattr         |               |               | New structure to hold
xoptattr_t              |               |               | values of optional
                        |               |               | system attributes
                        |               |               |
XAT_CREATETIME          |               |               | Values for new
XAT_ARCHIVE             |               |               | system attributes
XAT_SYSTEM              |               |               | to be used with
XAT_READONLY            |               |               | XVA_SET_*() and
XAT_HIDDEN              |               |               | XVA_ISSET_*() macros
XAT_NOUNLINK            |               |               |
XAT_IMMUTABLE           |               |               |
XAT_APPENDONLY          |               |               |
XAT_NODUMP              |               |               |
XAT_SETTABLE            |               |               |
XAT_OPAQUE              |               |               |
XAT_AV_QUARANTINED      |               |               |
XAT_AV_MODIFIED         |               |               |
                        |               |               |
xva_init()              |               |               | Initialization for
                        |               |               | xvattr_t structure
                        |               |               |
xva_getxoptattr()       |               |               | Retrieves pointer to
                        |               |               | xoptattr_t, if it is
                        |               |               | available
                        |               |               |
XVA_SET_REQ()           |               |               | Sets the appropriate
XVA_SET_RTN()           |               |               | attribute bit in the
                        |               |               | request or return
                        |               |               | bitmap.
                        |               |               |
XVA_ISSET_REQ()         |               |               | Checks the appropriate
XVA_ISSET_RTN()         |               |               | attribute bit in the
                        |               |               | request or return
                        |               |               | bitmap.
                        |               |               |
VFSFT_XVATTR            |               |               | VFS feature to be
                        |               |               | registered by an FS
                        |               |               | with xvattr_t support
                        |               |               |
fgetattr()              |               | This document | New interfaces in
fsetattr()              |               | and the man   | libc for retrieving
getattrat()             |               | page for      | and modifying system
setattrat()             |               | fgetattr(3C)  | attributes in the
                        |               |               | extended attribute
                        |               |               |
/usr/include/attr.h     | Committed     | This document | Defines libc APIs
                        |               |               | described in section
                        |               |               | 4.2.1
                        |               |               |
/usr/include/sys/attr.h |               |               | Defines views and
                        |               |               | attribute names for
                        |               |               | nvlists