draft-ietf-nfsv4-minorversion1-25.txt | draft-ietf-nfsv4-minorversion1-26.txt | |||
---|---|---|---|---|
NFSv4 S. Shepler | NFSv4 S. Shepler | |||
Internet-Draft M. Eisler | Internet-Draft M. Eisler | |||
Intended status: Standards Track D. Noveck | Intended status: Standards Track D. Noveck | |||
Expires: February 20, 2009 Editors | Expires: March 7, 2009 Editors | |||
August 19, 2008 | September 03, 2008 | |||
NFS Version 4 Minor Version 1 | NFS Version 4 Minor Version 1 | |||
draft-ietf-nfsv4-minorversion1-25.txt | draft-ietf-nfsv4-minorversion1-26.txt | |||
Status of this Memo | Status of this Memo | |||
By submitting this Internet-Draft, each author represents that any | By submitting this Internet-Draft, each author represents that any | |||
applicable patent or other IPR claims of which he or she is aware | applicable patent or other IPR claims of which he or she is aware | |||
have been or will be disclosed, and any of which he or she becomes | have been or will be disclosed, and any of which he or she becomes | |||
aware will be disclosed, in accordance with Section 6 of BCP 79. | aware will be disclosed, in accordance with Section 6 of BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
skipping to change at page 1, line 35 | skipping to change at page 1, line 35 | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
This Internet-Draft will expire on February 20, 2009. | This Internet-Draft will expire on March 7, 2009. | |||
Abstract | Abstract | |||
This Internet-Draft describes NFS version 4 minor version one, | This Internet-Draft describes NFS version 4 minor version one, | |||
including features retained from the base protocol and protocol | including features retained from the base protocol and protocol | |||
extensions made subsequently. Major extensions introduced in NFS | extensions made subsequently. Major extensions introduced in NFS | |||
version 4 minor version one include: Sessions, Directory Delegations, | version 4 minor version one include: Sessions, Directory Delegations, | |||
and parallel NFS (pNFS). | and parallel NFS (pNFS). | |||
Requirements Language | Requirements Language | |||
skipping to change at page 2, line 46 | skipping to change at page 2, line 46 | |||
2.8.2. Auditing . . . . . . . . . . . . . . . . . . . . . . 38 | 2.8.2. Auditing . . . . . . . . . . . . . . . . . . . . . . 38 | |||
2.8.3. Intrusion Detection . . . . . . . . . . . . . . . . 38 | 2.8.3. Intrusion Detection . . . . . . . . . . . . . . . . 38 | |||
2.9. Transport Layers . . . . . . . . . . . . . . . . . . . . 39 | 2.9. Transport Layers . . . . . . . . . . . . . . . . . . . . 39 | |||
2.9.1. REQUIRED and RECOMMENDED Properties of Transports . 39 | 2.9.1. REQUIRED and RECOMMENDED Properties of Transports . 39 | |||
2.9.2. Client and Server Transport Behavior . . . . . . . . 39 | 2.9.2. Client and Server Transport Behavior . . . . . . . . 39 | |||
2.9.3. Ports . . . . . . . . . . . . . . . . . . . . . . . 41 | 2.9.3. Ports . . . . . . . . . . . . . . . . . . . . . . . 41 | |||
2.10. Session . . . . . . . . . . . . . . . . . . . . . . . . 41 | 2.10. Session . . . . . . . . . . . . . . . . . . . . . . . . 41 | |||
2.10.1. Motivation and Overview . . . . . . . . . . . . . . 41 | 2.10.1. Motivation and Overview . . . . . . . . . . . . . . 41 | |||
2.10.2. NFSv4 Integration . . . . . . . . . . . . . . . . . 42 | 2.10.2. NFSv4 Integration . . . . . . . . . . . . . . . . . 42 | |||
2.10.3. Channels . . . . . . . . . . . . . . . . . . . . . . 44 | 2.10.3. Channels . . . . . . . . . . . . . . . . . . . . . . 44 | |||
2.10.4. Trunking . . . . . . . . . . . . . . . . . . . . . . 45 | 2.10.4. Server Scope . . . . . . . . . . . . . . . . . . . . 45 | |||
2.10.5. Exactly Once Semantics . . . . . . . . . . . . . . . 48 | 2.10.5. Trunking . . . . . . . . . . . . . . . . . . . . . . 48 | |||
2.10.6. RDMA Considerations . . . . . . . . . . . . . . . . 61 | 2.10.6. Exactly Once Semantics . . . . . . . . . . . . . . . 51 | |||
2.10.7. Sessions Security . . . . . . . . . . . . . . . . . 64 | 2.10.7. RDMA Considerations . . . . . . . . . . . . . . . . 64 | |||
2.10.8. The SSV GSS Mechanism . . . . . . . . . . . . . . . 69 | 2.10.8. Sessions Security . . . . . . . . . . . . . . . . . 67 | |||
2.10.9. Session Mechanics - Steady State . . . . . . . . . . 73 | 2.10.9. The SSV GSS Mechanism . . . . . . . . . . . . . . . 72 | |||
2.10.10. Session Inactivity Timer . . . . . . . . . . . . . . 75 | 2.10.10. Session Mechanics - Steady State . . . . . . . . . . 76 | |||
2.10.11. Session Mechanics - Recovery . . . . . . . . . . . . 75 | 2.10.11. Session Inactivity Timer . . . . . . . . . . . . . . 78 | |||
2.10.12. Parallel NFS and Sessions . . . . . . . . . . . . . 79 | 2.10.12. Session Mechanics - Recovery . . . . . . . . . . . . 78 | |||
3. Protocol Constants and Data Types . . . . . . . . . . . . . . 79 | 2.10.13. Parallel NFS and Sessions . . . . . . . . . . . . . 83 | |||
3.1. Basic Constants . . . . . . . . . . . . . . . . . . . . 79 | 3. Protocol Constants and Data Types . . . . . . . . . . . . . . 84 | |||
3.2. Basic Data Types . . . . . . . . . . . . . . . . . . . . 80 | 3.1. Basic Constants . . . . . . . . . . . . . . . . . . . . 84 | |||
3.3. Structured Data Types . . . . . . . . . . . . . . . . . 82 | 3.2. Basic Data Types . . . . . . . . . . . . . . . . . . . . 85 | |||
4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . . 90 | 3.3. Structured Data Types . . . . . . . . . . . . . . . . . 86 | |||
4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 90 | 4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . . 95 | |||
4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . 91 | 4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 95 | |||
4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . 91 | 4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . 95 | |||
4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 91 | 4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . 96 | |||
4.2.1. General Properties of a Filehandle . . . . . . . . . 92 | 4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 96 | |||
4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . 93 | 4.2.1. General Properties of a Filehandle . . . . . . . . . 97 | |||
4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . 93 | 4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . 97 | |||
4.3. One Method of Constructing a Volatile Filehandle . . . . 94 | 4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . 98 | |||
4.4. Client Recovery from Filehandle Expiration . . . . . . . 95 | 4.3. One Method of Constructing a Volatile Filehandle . . . . 99 | |||
5. File Attributes . . . . . . . . . . . . . . . . . . . . . . . 96 | 4.4. Client Recovery from Filehandle Expiration . . . . . . . 99 | |||
5.1. REQUIRED Attributes . . . . . . . . . . . . . . . . . . 97 | 5. File Attributes . . . . . . . . . . . . . . . . . . . . . . . 100 | |||
5.2. RECOMMENDED Attributes . . . . . . . . . . . . . . . . . 97 | 5.1. REQUIRED Attributes . . . . . . . . . . . . . . . . . . 102 | |||
5.3. Named Attributes . . . . . . . . . . . . . . . . . . . . 98 | 5.2. RECOMMENDED Attributes . . . . . . . . . . . . . . . . . 102 | |||
5.4. Classification of Attributes . . . . . . . . . . . . . . 99 | 5.3. Named Attributes . . . . . . . . . . . . . . . . . . . . 102 | |||
5.5. Set-Only and Get-Only Attributes . . . . . . . . . . . . 100 | 5.4. Classification of Attributes . . . . . . . . . . . . . . 104 | |||
5.6. REQUIRED Attributes - List and Definition References . . 100 | 5.5. Set-Only and Get-Only Attributes . . . . . . . . . . . . 105 | |||
 | 5.6. REQUIRED Attributes - List and Definition References . . 105 | |||
5.7. RECOMMENDED Attributes - List and Definition | 5.7. RECOMMENDED Attributes - List and Definition | |||
References . . . . . . . . . . . . . . . . . . . . . . . 101 | References . . . . . . . . . . . . . . . . . . . . . . . 106 | |||
5.8. Attribute Definitions . . . . . . . . . . . . . . . . . 103 | 5.8. Attribute Definitions . . . . . . . . . . . . . . . . . 108 | |||
5.8.1. Definitions of REQUIRED Attributes . . . . . . . . . 103 | 5.8.1. Definitions of REQUIRED Attributes . . . . . . . . . 108 | |||
5.8.2. Definitions of Uncategorized RECOMMENDED | 5.8.2. Definitions of Uncategorized RECOMMENDED | |||
Attributes . . . . . . . . . . . . . . . . . . . . . 105 | Attributes . . . . . . . . . . . . . . . . . . . . . 110 | |||
5.9. Interpreting owner and owner_group . . . . . . . . . . . 112 | 5.9. Interpreting owner and owner_group . . . . . . . . . . . 116 | |||
5.10. Character Case Attributes . . . . . . . . . . . . . . . 114 | 5.10. Character Case Attributes . . . . . . . . . . . . . . . 118 | |||
5.11. Directory Notification Attributes . . . . . . . . . . . 114 | 5.11. Directory Notification Attributes . . . . . . . . . . . 119 | |||
5.12. pNFS Attribute Definitions . . . . . . . . . . . . . . . 114 | 5.12. pNFS Attribute Definitions . . . . . . . . . . . . . . . 119 | |||
5.13. Retention Attributes . . . . . . . . . . . . . . . . . . 116 | 5.13. Retention Attributes . . . . . . . . . . . . . . . . . . 121 | |||
6. Access Control Attributes . . . . . . . . . . . . . . . . . . 119 | 6. Access Control Attributes . . . . . . . . . . . . . . . . . . 124 | |||
6.1. Goals . . . . . . . . . . . . . . . . . . . . . . . . . 119 | 6.1. Goals . . . . . . . . . . . . . . . . . . . . . . . . . 124 | |||
6.2. File Attributes Discussion . . . . . . . . . . . . . . . 120 | 6.2. File Attributes Discussion . . . . . . . . . . . . . . . 125 | |||
6.2.1. Attribute 12: acl . . . . . . . . . . . . . . . . . 120 | 6.2.1. Attribute 12: acl . . . . . . . . . . . . . . . . . 125 | |||
6.2.2. Attribute 58: dacl . . . . . . . . . . . . . . . . . 135 | 6.2.2. Attribute 58: dacl . . . . . . . . . . . . . . . . . 140 | |||
6.2.3. Attribute 59: sacl . . . . . . . . . . . . . . . . . 135 | 6.2.3. Attribute 59: sacl . . . . . . . . . . . . . . . . . 140 | |||
6.2.4. Attribute 33: mode . . . . . . . . . . . . . . . . . 135 | 6.2.4. Attribute 33: mode . . . . . . . . . . . . . . . . . 140 | |||
6.2.5. Attribute 74: mode_set_masked . . . . . . . . . . . 136 | 6.2.5. Attribute 74: mode_set_masked . . . . . . . . . . . 141 | |||
6.3. Common Methods . . . . . . . . . . . . . . . . . . . . . 137 | 6.3. Common Methods . . . . . . . . . . . . . . . . . . . . . 142 | |||
6.3.1. Interpreting an ACL . . . . . . . . . . . . . . . . 137 | 6.3.1. Interpreting an ACL . . . . . . . . . . . . . . . . 142 | |||
6.3.2. Computing a Mode Attribute from an ACL . . . . . . . 138 | 6.3.2. Computing a Mode Attribute from an ACL . . . . . . . 143 | |||
6.4. Requirements . . . . . . . . . . . . . . . . . . . . . . 139 | 6.4. Requirements . . . . . . . . . . . . . . . . . . . . . . 144 | |||
6.4.1. Setting the mode and/or ACL Attributes . . . . . . . 139 | 6.4.1. Setting the mode and/or ACL Attributes . . . . . . . 144 | |||
6.4.2. Retrieving the mode and/or ACL Attributes . . . . . 141 | 6.4.2. Retrieving the mode and/or ACL Attributes . . . . . 146 | |||
6.4.3. Creating New Objects . . . . . . . . . . . . . . . . 141 | 6.4.3. Creating New Objects . . . . . . . . . . . . . . . . 146 | |||
7. Single-server Namespace . . . . . . . . . . . . . . . . . . . 145 | 7. Single-server Namespace . . . . . . . . . . . . . . . . . . . 150 | |||
7.1. Server Exports . . . . . . . . . . . . . . . . . . . . . 145 | 7.1. Server Exports . . . . . . . . . . . . . . . . . . . . . 150 | |||
7.2. Browsing Exports . . . . . . . . . . . . . . . . . . . . 146 | 7.2. Browsing Exports . . . . . . . . . . . . . . . . . . . . 151 | |||
7.3. Server Pseudo File System . . . . . . . . . . . . . . . 146 | 7.3. Server Pseudo File System . . . . . . . . . . . . . . . 151 | |||
7.4. Multiple Roots . . . . . . . . . . . . . . . . . . . . . 147 | 7.4. Multiple Roots . . . . . . . . . . . . . . . . . . . . . 152 | |||
7.5. Filehandle Volatility . . . . . . . . . . . . . . . . . 147 | 7.5. Filehandle Volatility . . . . . . . . . . . . . . . . . 152 | |||
7.6. Exported Root . . . . . . . . . . . . . . . . . . . . . 147 | 7.6. Exported Root . . . . . . . . . . . . . . . . . . . . . 152 | |||
7.7. Mount Point Crossing . . . . . . . . . . . . . . . . . . 148 | 7.7. Mount Point Crossing . . . . . . . . . . . . . . . . . . 153 | |||
7.8. Security Policy and Namespace Presentation . . . . . . . 148 | 7.8. Security Policy and Namespace Presentation . . . . . . . 153 | |||
8. State Management . . . . . . . . . . . . . . . . . . . . . . 149 | 8. State Management . . . . . . . . . . . . . . . . . . . . . . 154 | |||
8.1. Client and Session ID . . . . . . . . . . . . . . . . . 150 | 8.1. Client and Session ID . . . . . . . . . . . . . . . . . 155 | |||
8.2. Stateid Definition . . . . . . . . . . . . . . . . . . . 150 | 8.2. Stateid Definition . . . . . . . . . . . . . . . . . . . 155 | |||
8.2.1. Stateid Types . . . . . . . . . . . . . . . . . . . 151 | 8.2.1. Stateid Types . . . . . . . . . . . . . . . . . . . 156 | |||
8.2.2. Stateid Structure . . . . . . . . . . . . . . . . . 152 | 8.2.2. Stateid Structure . . . . . . . . . . . . . . . . . 157 | |||
8.2.3. Special Stateids . . . . . . . . . . . . . . . . . . 154 | 8.2.3. Special Stateids . . . . . . . . . . . . . . . . . . 159 | |||
8.2.4. Stateid Lifetime and Validation . . . . . . . . . . 155 | 8.2.4. Stateid Lifetime and Validation . . . . . . . . . . 160 | |||
8.2.5. Stateid Use for I/O Operations . . . . . . . . . . . 158 | 8.2.5. Stateid Use for I/O Operations . . . . . . . . . . . 163 | |||
8.2.6. Stateid Use for SETATTR Operations . . . . . . . . . 159 | 8.2.6. Stateid Use for SETATTR Operations . . . . . . . . . 164 | |||
8.3. Lease Renewal . . . . . . . . . . . . . . . . . . . . . 159 | 8.3. Lease Renewal . . . . . . . . . . . . . . . . . . . . . 164 | |||
8.4. Crash Recovery . . . . . . . . . . . . . . . . . . . . . 161 | 8.4. Crash Recovery . . . . . . . . . . . . . . . . . . . . . 166 | |||
8.4.1. Client Failure and Recovery . . . . . . . . . . . . 162 | 8.4.1. Client Failure and Recovery . . . . . . . . . . . . 167 | |||
8.4.2. Server Failure and Recovery . . . . . . . . . . . . 163 | 8.4.2. Server Failure and Recovery . . . . . . . . . . . . 168 | |||
8.4.3. Network Partitions and Recovery . . . . . . . . . . 166 | 8.4.3. Network Partitions and Recovery . . . . . . . . . . 172 | |||
8.5. Server Revocation of Locks . . . . . . . . . . . . . . . 171 | 8.5. Server Revocation of Locks . . . . . . . . . . . . . . . 176 | |||
8.6. Short and Long Leases . . . . . . . . . . . . . . . . . 172 | 8.6. Short and Long Leases . . . . . . . . . . . . . . . . . 177 | |||
8.7. Clocks, Propagation Delay, and Calculating Lease | 8.7. Clocks, Propagation Delay, and Calculating Lease | |||
Expiration . . . . . . . . . . . . . . . . . . . . . . . 172 | Expiration . . . . . . . . . . . . . . . . . . . . . . . 178 | |||
8.8. Obsolete Locking Infrastructure From NFSv4.0 . . . . . . 173 | 8.8. Obsolete Locking Infrastructure From NFSv4.0 . . . . . . 179 | |||
9. File Locking and Share Reservations . . . . . . . . . . . . . 174 | 9. File Locking and Share Reservations . . . . . . . . . . . . . 180 | |||
9.1. Opens and Byte-Range Locks . . . . . . . . . . . . . . . 174 | 9.1. Opens and Byte-Range Locks . . . . . . . . . . . . . . . 180 | |||
9.1.1. State-owner Definition . . . . . . . . . . . . . . . 174 | 9.1.1. State-owner Definition . . . . . . . . . . . . . . . 180 | |||
9.1.2. Use of the Stateid and Locking . . . . . . . . . . . 175 | 9.1.2. Use of the Stateid and Locking . . . . . . . . . . . 180 | |||
9.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 178 | 9.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 183 | |||
9.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 178 | 9.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 184 | |||
9.4. Stateid Seqid Values and Byte-Range Locks . . . . . . . 179 | 9.4. Stateid Seqid Values and Byte-Range Locks . . . . . . . 184 | |||
9.5. Issues with Multiple Open-Owners . . . . . . . . . . . . 179 | 9.5. Issues with Multiple Open-Owners . . . . . . . . . . . . 185 | |||
9.6. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 180 | 9.6. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 185 | |||
9.7. Share Reservations . . . . . . . . . . . . . . . . . . . 181 | 9.7. Share Reservations . . . . . . . . . . . . . . . . . . . 186 | |||
9.8. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 182 | 9.8. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 187 | |||
9.9. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 182 | 9.9. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 188 | |||
9.10. Parallel OPENs . . . . . . . . . . . . . . . . . . . . . 183 | 9.10. Parallel OPENs . . . . . . . . . . . . . . . . . . . . . 189 | |||
9.11. Reclaim of Open and Byte-Range Locks . . . . . . . . . . 184 | 9.11. Reclaim of Open and Byte-Range Locks . . . . . . . . . . 189 | |||
10. Client-Side Caching . . . . . . . . . . . . . . . . . . . . . 184 | 10. Client-Side Caching . . . . . . . . . . . . . . . . . . . . . 190 | |||
10.1. Performance Challenges for Client-Side Caching . . . . . 185 | 10.1. Performance Challenges for Client-Side Caching . . . . . 190 | |||
10.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 186 | 10.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 191 | |||
10.2.1. Delegation Recovery . . . . . . . . . . . . . . . . 188 | 10.2.1. Delegation Recovery . . . . . . . . . . . . . . . . 193 | |||
10.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 190 | 10.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 196 | |||
10.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . 190 | 10.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . 196 | |||
10.3.2. Data Caching and File Locking . . . . . . . . . . . 191 | 10.3.2. Data Caching and File Locking . . . . . . . . . . . 197 | |||
10.3.3. Data Caching and Mandatory File Locking . . . . . . 193 | 10.3.3. Data Caching and Mandatory File Locking . . . . . . 199 | |||
10.3.4. Data Caching and File Identity . . . . . . . . . . . 193 | 10.3.4. Data Caching and File Identity . . . . . . . . . . . 199 | |||
10.4. Open Delegation . . . . . . . . . . . . . . . . . . . . 195 | 10.4. Open Delegation . . . . . . . . . . . . . . . . . . . . 200 | |||
10.4.1. Open Delegation and Data Caching . . . . . . . . . . 197 | 10.4.1. Open Delegation and Data Caching . . . . . . . . . . 203 | |||
10.4.2. Open Delegation and File Locks . . . . . . . . . . . 198 | 10.4.2. Open Delegation and File Locks . . . . . . . . . . . 204 | |||
10.4.3. Handling of CB_GETATTR . . . . . . . . . . . . . . . 199 | 10.4.3. Handling of CB_GETATTR . . . . . . . . . . . . . . . 204 | |||
10.4.4. Recall of Open Delegation . . . . . . . . . . . . . 202 | 10.4.4. Recall of Open Delegation . . . . . . . . . . . . . 207 | |||
10.4.5. Clients that Fail to Honor Delegation Recalls . . . 204 | 10.4.5. Clients that Fail to Honor Delegation Recalls . . . 209 | |||
10.4.6. Delegation Revocation . . . . . . . . . . . . . . . 204 | 10.4.6. Delegation Revocation . . . . . . . . . . . . . . . 210 | |||
10.4.7. Delegations via WANT_DELEGATION . . . . . . . . . . 205 | 10.4.7. Delegations via WANT_DELEGATION . . . . . . . . . . 210 | |||
10.5. Data Caching and Revocation . . . . . . . . . . . . . . 206 | 10.5. Data Caching and Revocation . . . . . . . . . . . . . . 211 | |||
10.5.1. Revocation Recovery for Write Open Delegation . . . 206 | 10.5.1. Revocation Recovery for Write Open Delegation . . . 212 | |||
10.6. Attribute Caching . . . . . . . . . . . . . . . . . . . 207 | 10.6. Attribute Caching . . . . . . . . . . . . . . . . . . . 212 | |||
10.7. Data and Metadata Caching and Memory Mapped Files . . . 209 | 10.7. Data and Metadata Caching and Memory Mapped Files . . . 214 | |||
10.8. Name and Directory Caching without Directory | 10.8. Name and Directory Caching without Directory | |||
Delegations . . . . . . . . . . . . . . . . . . . . . . 211 | Delegations . . . . . . . . . . . . . . . . . . . . . . 217 | |||
10.8.1. Name Caching . . . . . . . . . . . . . . . . . . . . 211 | 10.8.1. Name Caching . . . . . . . . . . . . . . . . . . . . 217 | |||
10.8.2. Directory Caching . . . . . . . . . . . . . . . . . 213 | 10.8.2. Directory Caching . . . . . . . . . . . . . . . . . 218 | |||
10.9. Directory Delegations . . . . . . . . . . . . . . . . . 214 | 10.9. Directory Delegations . . . . . . . . . . . . . . . . . 219 | |||
10.9.1. Introduction to Directory Delegations . . . . . . . 214 | 10.9.1. Introduction to Directory Delegations . . . . . . . 219 | |||
10.9.2. Directory Delegation Design . . . . . . . . . . . . 215 | 10.9.2. Directory Delegation Design . . . . . . . . . . . . 220 | |||
10.9.3. Attributes in Support of Directory Notifications . . 216 | 10.9.3. Attributes in Support of Directory Notifications . . 221 | |||
10.9.4. Directory Delegation Recall . . . . . . . . . . . . 216 | 10.9.4. Directory Delegation Recall . . . . . . . . . . . . 221 | |||
10.9.5. Directory Delegation Recovery . . . . . . . . . . . 217 | 10.9.5. Directory Delegation Recovery . . . . . . . . . . . 222 | |||
11. Multi-Server Namespace . . . . . . . . . . . . . . . . . . . 217 | 11. Multi-Server Namespace . . . . . . . . . . . . . . . . . . . 222 | |||
11.1. Location Attributes . . . . . . . . . . . . . . . . . . 217 | 11.1. Location Attributes . . . . . . . . . . . . . . . . . . 223 | |||
11.2. File System Presence or Absence . . . . . . . . . . . . 218 | 11.2. File System Presence or Absence . . . . . . . . . . . . 223 | |||
11.3. Getting Attributes for an Absent File System . . . . . . 219 | 11.3. Getting Attributes for an Absent File System . . . . . . 224 | |||
11.3.1. GETATTR Within an Absent File System . . . . . . . . 219 | 11.3.1. GETATTR Within an Absent File System . . . . . . . . 225 | |||
11.3.2. READDIR and Absent File Systems . . . . . . . . . . 220 | 11.3.2. READDIR and Absent File Systems . . . . . . . . . . 226 | |||
11.4. Uses of Location Information . . . . . . . . . . . . . . 221 | 11.4. Uses of Location Information . . . . . . . . . . . . . . 226 | |||
11.4.1. File System Replication . . . . . . . . . . . . . . 222 | 11.4.1. File System Replication . . . . . . . . . . . . . . 227 | |||
11.4.2. File System Migration . . . . . . . . . . . . . . . 222 | 11.4.2. File System Migration . . . . . . . . . . . . . . . 228 | |||
11.4.3. Referrals . . . . . . . . . . . . . . . . . . . . . 224 | 11.4.3. Referrals . . . . . . . . . . . . . . . . . . . . . 229 | |||
11.5. Location Entries and Server Identity . . . . . . . . . . 225 | 11.5. Location Entries and Server Identity . . . . . . . . . . 231 | |||
11.6. Additional Client-side Considerations . . . . . . . . . 226 | 11.6. Additional Client-side Considerations . . . . . . . . . 231 | |||
11.7. Effecting File System Transitions . . . . . . . . . . . 226 | 11.7. Effecting File System Transitions . . . . . . . . . . . 232 | |||
11.7.1. File System Transitions and Simultaneous Access . . 228 | 11.7.1. File System Transitions and Simultaneous Access . . 233 | |||
11.7.2. Simultaneous Use and Transparent Transitions . . . . 228 | 11.7.2. Simultaneous Use and Transparent Transitions . . . . 234 | |||
11.7.3. Filehandles and File System Transitions . . . . . . 231 | 11.7.3. Filehandles and File System Transitions . . . . . . 237 | |||
11.7.4. Fileids and File System Transitions . . . . . . . . 231 | 11.7.4. Fileids and File System Transitions . . . . . . . . 237 | |||
11.7.5. Fsids and File System Transitions . . . . . . . . . 233 | 11.7.5. Fsids and File System Transitions . . . . . . . . . 238 | |||
11.7.6. The Change Attribute and File System Transitions . . 233 | 11.7.6. The Change Attribute and File System Transitions . . 239 | |||
11.7.7. Lock State and File System Transitions . . . . . . . 234 | 11.7.7. Lock State and File System Transitions . . . . . . . 239 | |||
11.7.8. Write Verifiers and File System Transitions . . . . 238 | 11.7.8. Write Verifiers and File System Transitions . . . . 244 | |||
11.7.9. Readdir Cookies and Verifiers and File System | 11.7.9. Readdir Cookies and Verifiers and File System | |||
Transitions . . . . . . . . . . . . . . . . . . . . 238 | Transitions . . . . . . . . . . . . . . . . . . . . 244 | |||
11.7.10. File System Data and File System Transitions . . . . 238 | 11.7.10. File System Data and File System Transitions . . . . 244 | |||
11.8. Effecting File System Referrals . . . . . . . . . . . . 240 | 11.8. Effecting File System Referrals . . . . . . . . . . . . 246 | |||
11.8.1. Referral Example (LOOKUP) . . . . . . . . . . . . . 240 | 11.8.1. Referral Example (LOOKUP) . . . . . . . . . . . . . 246 | |||
11.8.2. Referral Example (READDIR) . . . . . . . . . . . . . 244 | 11.8.2. Referral Example (READDIR) . . . . . . . . . . . . . 250 | |||
11.9. The Attribute fs_locations . . . . . . . . . . . . . . . 246 | 11.9. The Attribute fs_locations . . . . . . . . . . . . . . . 252 | |||
11.10. The Attribute fs_locations_info . . . . . . . . . . . . 249 | 11.10. The Attribute fs_locations_info . . . . . . . . . . . . 255 | |||
11.10.1. The fs_locations_server4 Structure . . . . . . . . . 253 | 11.10.1. The fs_locations_server4 Structure . . . . . . . . . 259 | |||
11.10.2. The fs_locations_info4 Structure . . . . . . . . . . 258 | 11.10.2. The fs_locations_info4 Structure . . . . . . . . . . 264 | |||
11.10.3. The fs_locations_item4 Structure . . . . . . . . . . 259 | 11.10.3. The fs_locations_item4 Structure . . . . . . . . . . 265 | |||
11.11. The Attribute fs_status . . . . . . . . . . . . . . . . 261 | 11.11. The Attribute fs_status . . . . . . . . . . . . . . . . 267 | |||
12. Parallel NFS (pNFS) . . . . . . . . . . . . . . . . . . . . . 265 | 12. Parallel NFS (pNFS) . . . . . . . . . . . . . . . . . . . . . 271 | |||
12.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 265 | 12.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 271 | |||
12.2. pNFS Definitions . . . . . . . . . . . . . . . . . . . . 266 | 12.2. pNFS Definitions . . . . . . . . . . . . . . . . . . . . 272 | |||
12.2.1. Metadata . . . . . . . . . . . . . . . . . . . . . . 267 | 12.2.1. Metadata . . . . . . . . . . . . . . . . . . . . . . 273 | |||
12.2.2. Metadata Server . . . . . . . . . . . . . . . . . . 267 | 12.2.2. Metadata Server . . . . . . . . . . . . . . . . . . 273 | |||
12.2.3. pNFS Client . . . . . . . . . . . . . . . . . . . . 267 | 12.2.3. pNFS Client . . . . . . . . . . . . . . . . . . . . 273 | |||
12.2.4. Storage Device . . . . . . . . . . . . . . . . . . . 267 | 12.2.4. Storage Device . . . . . . . . . . . . . . . . . . . 273 | |||
12.2.5. Storage Protocol . . . . . . . . . . . . . . . . . . 268 | 12.2.5. Storage Protocol . . . . . . . . . . . . . . . . . . 274 | |||
12.2.6. Control Protocol . . . . . . . . . . . . . . . . . . 268 | 12.2.6. Control Protocol . . . . . . . . . . . . . . . . . . 274 | |||
12.2.7. Layout Types . . . . . . . . . . . . . . . . . . . . 268 | 12.2.7. Layout Types . . . . . . . . . . . . . . . . . . . . 274 | |||
12.2.8. Layout . . . . . . . . . . . . . . . . . . . . . . . 269 | 12.2.8. Layout . . . . . . . . . . . . . . . . . . . . . . . 275 | |||
12.2.9. Layout Iomode . . . . . . . . . . . . . . . . . . . 269 | 12.2.9. Layout Iomode . . . . . . . . . . . . . . . . . . . 275 | |||
12.2.10. Device IDs . . . . . . . . . . . . . . . . . . . . . 270 | 12.2.10. Device IDs . . . . . . . . . . . . . . . . . . . . . 276 | |||
12.3. pNFS Operations . . . . . . . . . . . . . . . . . . . . 271 | 12.3. pNFS Operations . . . . . . . . . . . . . . . . . . . . 277 | |||
12.4. pNFS Attributes . . . . . . . . . . . . . . . . . . . . 272 | 12.4. pNFS Attributes . . . . . . . . . . . . . . . . . . . . 278 | |||
12.5. Layout Semantics . . . . . . . . . . . . . . . . . . . . 272 | 12.5. Layout Semantics . . . . . . . . . . . . . . . . . . . . 278 | |||
12.5.1. Guarantees Provided by Layouts . . . . . . . . . . . 272 | 12.5.1. Guarantees Provided by Layouts . . . . . . . . . . . 278 | |||
12.5.2. Getting a Layout . . . . . . . . . . . . . . . . . . 273 | 12.5.2. Getting a Layout . . . . . . . . . . . . . . . . . . 279 | |||
12.5.3. Layout Stateid . . . . . . . . . . . . . . . . . . . 274 | 12.5.3. Layout Stateid . . . . . . . . . . . . . . . . . . . 280 | |||
12.5.4. Committing a Layout . . . . . . . . . . . . . . . . 276 | 12.5.4. Committing a Layout . . . . . . . . . . . . . . . . 282 | |||
12.5.5. Recalling a Layout . . . . . . . . . . . . . . . . . 279 | 12.5.5. Recalling a Layout . . . . . . . . . . . . . . . . . 285 | |||
12.5.6. Revoking Layouts . . . . . . . . . . . . . . . . . . 287 | 12.5.6. Revoking Layouts . . . . . . . . . . . . . . . . . . 293 | |||
12.5.7. Metadata Server Write Propagation . . . . . . . . . 287 | 12.5.7. Metadata Server Write Propagation . . . . . . . . . 293 | |||
12.6. pNFS Mechanics . . . . . . . . . . . . . . . . . . . . . 287 | 12.6. pNFS Mechanics . . . . . . . . . . . . . . . . . . . . . 293 | |||
12.7. Recovery . . . . . . . . . . . . . . . . . . . . . . . . 289 | 12.7. Recovery . . . . . . . . . . . . . . . . . . . . . . . . 295 | |||
12.7.1. Recovery from Client Restart . . . . . . . . . . . . 289 | 12.7.1. Recovery from Client Restart . . . . . . . . . . . . 295 | |||
12.7.2. Dealing with Lease Expiration on the Client . . . . 290 | 12.7.2. Dealing with Lease Expiration on the Client . . . . 296 | |||
12.7.3. Dealing with Loss of Layout State on the Metadata | 12.7.3. Dealing with Loss of Layout State on the Metadata | |||
Server . . . . . . . . . . . . . . . . . . . . . . . 291 | Server . . . . . . . . . . . . . . . . . . . . . . . 297 | |||
12.7.4. Recovery from Metadata Server Restart . . . . . . . 291 | 12.7.4. Recovery from Metadata Server Restart . . . . . . . 297 | |||
12.7.5. Operations During Metadata Server Grace Period . . . 293 | 12.7.5. Operations During Metadata Server Grace Period . . . 299 | |||
12.7.6. Storage Device Recovery . . . . . . . . . . . . . . 294 | 12.7.6. Storage Device Recovery . . . . . . . . . . . . . . 300 | |||
12.8. Metadata and Storage Device Roles . . . . . . . . . . . 294 | 12.8. Metadata and Storage Device Roles . . . . . . . . . . . 300 | |||
12.9. Security Considerations for pNFS . . . . . . . . . . . . 294 | 12.9. Security Considerations for pNFS . . . . . . . . . . . . 300 | |||
13. PNFS: NFSv4.1 File Layout Type . . . . . . . . . . . . . . . 295 | 13. PNFS: NFSv4.1 File Layout Type . . . . . . . . . . . . . . . 301 | |||
13.1. Client ID and Session Considerations . . . . . . . . . . 296 | 13.1. Client ID and Session Considerations . . . . . . . . . . 302 | |||
13.1.1. Sessions Considerations for Data Servers . . . . . . 298 | 13.1.1. Sessions Considerations for Data Servers . . . . . . 304 | |||
13.2. File Layout Definitions . . . . . . . . . . . . . . . . 298 | 13.2. File Layout Definitions . . . . . . . . . . . . . . . . 304 | |||
13.3. File Layout Data Types . . . . . . . . . . . . . . . . . 299 | 13.3. File Layout Data Types . . . . . . . . . . . . . . . . . 305 | |||
13.4. Interpreting the File Layout . . . . . . . . . . . . . . 303 | 13.4. Interpreting the File Layout . . . . . . . . . . . . . . 309 | |||
13.4.1. Determining the Stripe Unit Number . . . . . . . . . 303 | 13.4.1. Determining the Stripe Unit Number . . . . . . . . . 309 | |||
13.4.2. Interpreting the File Layout Using Sparse Packing . 303 | 13.4.2. Interpreting the File Layout Using Sparse Packing . 309 | |||
13.4.3. Interpreting the File Layout Using Dense Packing . . 306 | 13.4.3. Interpreting the File Layout Using Dense Packing . . 312 | |||
13.4.4. Sparse and Dense Stripe Unit Packing . . . . . . . . 308 | 13.4.4. Sparse and Dense Stripe Unit Packing . . . . . . . . 314 | |||
13.5. Data Server Multipathing . . . . . . . . . . . . . . . . 310 | 13.5. Data Server Multipathing . . . . . . . . . . . . . . . . 316 | |||
13.6. Operations Sent to NFSv4.1 Data Servers . . . . . . . . 311 | 13.6. Operations Sent to NFSv4.1 Data Servers . . . . . . . . 317 | |||
13.7. COMMIT Through Metadata Server . . . . . . . . . . . . . 313 | 13.7. COMMIT Through Metadata Server . . . . . . . . . . . . . 319 | |||
13.8. The Layout Iomode . . . . . . . . . . . . . . . . . . . 315 | 13.8. The Layout Iomode . . . . . . . . . . . . . . . . . . . 321 | |||
13.9. Metadata and Data Server State Coordination . . . . . . 315 | 13.9. Metadata and Data Server State Coordination . . . . . . 321 | |||
13.9.1. Global Stateid Requirements . . . . . . . . . . . . 315 | 13.9.1. Global Stateid Requirements . . . . . . . . . . . . 321 | |||
13.9.2. Data Server State Propagation . . . . . . . . . . . 316 | 13.9.2. Data Server State Propagation . . . . . . . . . . . 322 | |||
13.10. Data Server Component File Size . . . . . . . . . . . . 318 | 13.10. Data Server Component File Size . . . . . . . . . . . . 324 | |||
13.11. Layout Revocation and Fencing . . . . . . . . . . . . . 319 | 13.11. Layout Revocation and Fencing . . . . . . . . . . . . . 325 | |||
13.12. Security Considerations for the File Layout Type . . . . 319 | 13.12. Security Considerations for the File Layout Type . . . . 325 | |||
14. Internationalization . . . . . . . . . . . . . . . . . . . . 320 | 14. Internationalization . . . . . . . . . . . . . . . . . . . . 326 | |||
14.1. Stringprep profile for the utf8str_cs type . . . . . . . 321 | 14.1. Stringprep profile for the utf8str_cs type . . . . . . . 327 | |||
14.2. Stringprep profile for the utf8str_cis type . . . . . . 323 | 14.2. Stringprep profile for the utf8str_cis type . . . . . . 329 | |||
14.3. Stringprep profile for the utf8str_mixed type . . . . . 324 | 14.3. Stringprep profile for the utf8str_mixed type . . . . . 330 | |||
14.4. UTF-8 Capabilities . . . . . . . . . . . . . . . . . . . 326 | 14.4. UTF-8 Capabilities . . . . . . . . . . . . . . . . . . . 332 | |||
14.5. UTF-8 Related Errors . . . . . . . . . . . . . . . . . . 326 | 14.5. UTF-8 Related Errors . . . . . . . . . . . . . . . . . . 332 | |||
15. Error Values . . . . . . . . . . . . . . . . . . . . . . . . 327 | 15. Error Values . . . . . . . . . . . . . . . . . . . . . . . . 333 | |||
15.1. Error Definitions . . . . . . . . . . . . . . . . . . . 327 | 15.1. Error Definitions . . . . . . . . . . . . . . . . . . . 333 | |||
15.1.1. General Errors . . . . . . . . . . . . . . . . . . . 329 | 15.1.1. General Errors . . . . . . . . . . . . . . . . . . . 335 | |||
15.1.2. Filehandle Errors . . . . . . . . . . . . . . . . . 331 | 15.1.2. Filehandle Errors . . . . . . . . . . . . . . . . . 337 | |||
15.1.3. Compound Structure Errors . . . . . . . . . . . . . 332 | 15.1.3. Compound Structure Errors . . . . . . . . . . . . . 338 | |||
15.1.4. File System Errors . . . . . . . . . . . . . . . . . 334 | 15.1.4. File System Errors . . . . . . . . . . . . . . . . . 340 | |||
15.1.5. State Management Errors . . . . . . . . . . . . . . 336 | 15.1.5. State Management Errors . . . . . . . . . . . . . . 342 | |||
15.1.6. Security Errors . . . . . . . . . . . . . . . . . . 337 | 15.1.6. Security Errors . . . . . . . . . . . . . . . . . . 343 | |||
15.1.7. Name Errors . . . . . . . . . . . . . . . . . . . . 337 | 15.1.7. Name Errors . . . . . . . . . . . . . . . . . . . . 343 | |||
15.1.8. Locking Errors . . . . . . . . . . . . . . . . . . . 338 | 15.1.8. Locking Errors . . . . . . . . . . . . . . . . . . . 344 | |||
15.1.9. Reclaim Errors . . . . . . . . . . . . . . . . . . . 339 | 15.1.9. Reclaim Errors . . . . . . . . . . . . . . . . . . . 345 | |||
15.1.10. pNFS Errors . . . . . . . . . . . . . . . . . . . . 340 | 15.1.10. pNFS Errors . . . . . . . . . . . . . . . . . . . . 346 | |||
15.1.11. Session Use Errors . . . . . . . . . . . . . . . . . 341 | 15.1.11. Session Use Errors . . . . . . . . . . . . . . . . . 347 | |||
15.1.12. Session Management Errors . . . . . . . . . . . . . 343 | 15.1.12. Session Management Errors . . . . . . . . . . . . . 349 | |||
15.1.13. Client Management Errors . . . . . . . . . . . . . . 343 | 15.1.13. Client Management Errors . . . . . . . . . . . . . . 349 | |||
15.1.14. Delegation Errors . . . . . . . . . . . . . . . . . 344 | 15.1.14. Delegation Errors . . . . . . . . . . . . . . . . . 350 | |||
15.1.15. Attribute Handling Errors . . . . . . . . . . . . . 344 | 15.1.15. Attribute Handling Errors . . . . . . . . . . . . . 350 | |||
15.1.16. Obsoleted Errors . . . . . . . . . . . . . . . . . . 345 | 15.1.16. Obsoleted Errors . . . . . . . . . . . . . . . . . . 351 | |||
15.2. Operations and their valid errors . . . . . . . . . . . 346 | 15.2. Operations and their valid errors . . . . . . . . . . . 352 | |||
15.3. Callback operations and their valid errors . . . . . . . 362 | 15.3. Callback operations and their valid errors . . . . . . . 368 | |||
15.4. Errors and the operations that use them . . . . . . . . 364 | 15.4. Errors and the operations that use them . . . . . . . . 370 | |||
16. NFSv4.1 Procedures . . . . . . . . . . . . . . . . . . . . . 378 | 16. NFSv4.1 Procedures . . . . . . . . . . . . . . . . . . . . . 384 | |||
16.1. Procedure 0: NULL - No Operation . . . . . . . . . . . . 378 | 16.1. Procedure 0: NULL - No Operation . . . . . . . . . . . . 384 | |||
16.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 379 | 16.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 385 | |||
17. Operations: REQUIRED, RECOMMENDED, or OPTIONAL . . . . . . . 390 | 17. Operations: REQUIRED, RECOMMENDED, or OPTIONAL . . . . . . . 396 | |||
18. NFSv4.1 Operations . . . . . . . . . . . . . . . . . . . . . 393 | 18. NFSv4.1 Operations . . . . . . . . . . . . . . . . . . . . . 399 | |||
18.1. Operation 3: ACCESS - Check Access Rights . . . . . . . 393 | 18.1. Operation 3: ACCESS - Check Access Rights . . . . . . . 399 | |||
18.2. Operation 4: CLOSE - Close File . . . . . . . . . . . . 399 | 18.2. Operation 4: CLOSE - Close File . . . . . . . . . . . . 405 | |||
18.3. Operation 5: COMMIT - Commit Cached Data . . . . . . . . 400 | 18.3. Operation 5: COMMIT - Commit Cached Data . . . . . . . . 406 | |||
18.4. Operation 6: CREATE - Create a Non-Regular File Object . 403 | 18.4. Operation 6: CREATE - Create a Non-Regular File Object . 409 | |||
18.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting | 18.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting | |||
Recovery . . . . . . . . . . . . . . . . . . . . . . . . 406 | Recovery . . . . . . . . . . . . . . . . . . . . . . . . 412 | |||
18.6. Operation 8: DELEGRETURN - Return Delegation . . . . . . 407 | 18.6. Operation 8: DELEGRETURN - Return Delegation . . . . . . 413 | |||
18.7. Operation 9: GETATTR - Get Attributes . . . . . . . . . 407 | 18.7. Operation 9: GETATTR - Get Attributes . . . . . . . . . 413 | |||
18.8. Operation 10: GETFH - Get Current Filehandle . . . . . . 409 | 18.8. Operation 10: GETFH - Get Current Filehandle . . . . . . 415 | |||
18.9. Operation 11: LINK - Create Link to a File . . . . . . . 410 | 18.9. Operation 11: LINK - Create Link to a File . . . . . . . 416 | |||
18.10. Operation 12: LOCK - Create Lock . . . . . . . . . . . . 413 | 18.10. Operation 12: LOCK - Create Lock . . . . . . . . . . . . 419 | |||
18.11. Operation 13: LOCKT - Test For Lock . . . . . . . . . . 417 | 18.11. Operation 13: LOCKT - Test For Lock . . . . . . . . . . 423 | |||
18.12. Operation 14: LOCKU - Unlock File . . . . . . . . . . . 418 | 18.12. Operation 14: LOCKU - Unlock File . . . . . . . . . . . 424 | |||
18.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . . . 420 | 18.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . . . 426 | |||
18.14. Operation 16: LOOKUPP - Lookup Parent Directory . . . . 421 | 18.14. Operation 16: LOOKUPP - Lookup Parent Directory . . . . 427 | |||
18.15. Operation 17: NVERIFY - Verify Difference in | 18.15. Operation 17: NVERIFY - Verify Difference in | |||
Attributes . . . . . . . . . . . . . . . . . . . . . . . 423 | Attributes . . . . . . . . . . . . . . . . . . . . . . . 429 | |||
18.16. Operation 18: OPEN - Open a Regular File . . . . . . . . 424 | 18.16. Operation 18: OPEN - Open a Regular File . . . . . . . . 430 | |||
18.17. Operation 19: OPENATTR - Open Named Attribute | 18.17. Operation 19: OPENATTR - Open Named Attribute | |||
Directory . . . . . . . . . . . . . . . . . . . . . . . 443 | Directory . . . . . . . . . . . . . . . . . . . . . . . 449 | |||
18.18. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access . 444 | 18.18. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access . 450 | |||
18.19. Operation 22: PUTFH - Set Current Filehandle . . . . . . 446 | 18.19. Operation 22: PUTFH - Set Current Filehandle . . . . . . 452 | |||
18.20. Operation 23: PUTPUBFH - Set Public Filehandle . . . . . 446 | 18.20. Operation 23: PUTPUBFH - Set Public Filehandle . . . . . 452 | |||
18.21. Operation 24: PUTROOTFH - Set Root Filehandle . . . . . 448 | 18.21. Operation 24: PUTROOTFH - Set Root Filehandle . . . . . 454 | |||
18.22. Operation 25: READ - Read from File . . . . . . . . . . 449 | 18.22. Operation 25: READ - Read from File . . . . . . . . . . 455 | |||
18.23. Operation 26: READDIR - Read Directory . . . . . . . . . 451 | 18.23. Operation 26: READDIR - Read Directory . . . . . . . . . 457 | |||
18.24. Operation 27: READLINK - Read Symbolic Link . . . . . . 455 | 18.24. Operation 27: READLINK - Read Symbolic Link . . . . . . 461 | |||
18.25. Operation 28: REMOVE - Remove File System Object . . . . 456 | 18.25. Operation 28: REMOVE - Remove File System Object . . . . 462 | |||
18.26. Operation 29: RENAME - Rename Directory Entry . . . . . 458 | 18.26. Operation 29: RENAME - Rename Directory Entry . . . . . 464 | |||
18.27. Operation 31: RESTOREFH - Restore Saved Filehandle . . . 462 | 18.27. Operation 31: RESTOREFH - Restore Saved Filehandle . . . 468 | |||
18.28. Operation 32: SAVEFH - Save Current Filehandle . . . . . 463 | 18.28. Operation 32: SAVEFH - Save Current Filehandle . . . . . 469 | |||
18.29. Operation 33: SECINFO - Obtain Available Security . . . 464 | 18.29. Operation 33: SECINFO - Obtain Available Security . . . 470 | |||
18.30. Operation 34: SETATTR - Set Attributes . . . . . . . . . 468 | 18.30. Operation 34: SETATTR - Set Attributes . . . . . . . . . 474 | |||
18.31. Operation 37: VERIFY - Verify Same Attributes . . . . . 471 | 18.31. Operation 37: VERIFY - Verify Same Attributes . . . . . 477 | |||
18.32. Operation 38: WRITE - Write to File . . . . . . . . . . 472 | 18.32. Operation 38: WRITE - Write to File . . . . . . . . . . 478 | |||
18.33. Operation 40: BACKCHANNEL_CTL - Backchannel control . . 476 | 18.33. Operation 40: BACKCHANNEL_CTL - Backchannel control . . 482 | |||
18.34. Operation 41: BIND_CONN_TO_SESSION . . . . . . . . . . . 478 | 18.34. Operation 41: BIND_CONN_TO_SESSION . . . . . . . . . . . 484 | |||
18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID . . . 481 | 18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID . . . 487 | |||
18.36. Operation 43: CREATE_SESSION - Create New Session and | 18.36. Operation 43: CREATE_SESSION - Create New Session and | |||
Confirm Client ID . . . . . . . . . . . . . . . . . . . 498 | Confirm Client ID . . . . . . . . . . . . . . . . . . . 504 | |||
18.37. Operation 44: DESTROY_SESSION - Destroy existing | 18.37. Operation 44: DESTROY_SESSION - Destroy existing | |||
session . . . . . . . . . . . . . . . . . . . . . . . . 508 | session . . . . . . . . . . . . . . . . . . . . . . . . 514 | |||
18.38. Operation 45: FREE_STATEID - Free stateid with no | 18.38. Operation 45: FREE_STATEID - Free stateid with no | |||
locks . . . . . . . . . . . . . . . . . . . . . . . . . 509 | locks . . . . . . . . . . . . . . . . . . . . . . . . . 515 | |||
18.39. Operation 46: GET_DIR_DELEGATION - Get a directory | 18.39. Operation 46: GET_DIR_DELEGATION - Get a directory | |||
delegation . . . . . . . . . . . . . . . . . . . . . . . 510 | delegation . . . . . . . . . . . . . . . . . . . . . . . 516 | |||
18.40. Operation 47: GETDEVICEINFO - Get Device Information . . 514 | 18.40. Operation 47: GETDEVICEINFO - Get Device Information . . 520 | |||
18.41. Operation 48: GETDEVICELIST - Get All Device Mappings | 18.41. Operation 48: GETDEVICELIST - Get All Device Mappings | |||
for a File System . . . . . . . . . . . . . . . . . . . 516 | for a File System . . . . . . . . . . . . . . . . . . . 522 | |||
18.42. Operation 49: LAYOUTCOMMIT - Commit writes made using | 18.42. Operation 49: LAYOUTCOMMIT - Commit writes made using | |||
a layout . . . . . . . . . . . . . . . . . . . . . . . . 518 | a layout . . . . . . . . . . . . . . . . . . . . . . . . 524 | |||
18.43. Operation 50: LAYOUTGET - Get Layout Information . . . . 521 | 18.43. Operation 50: LAYOUTGET - Get Layout Information . . . . 527 | |||
18.44. Operation 51: LAYOUTRETURN - Release Layout | 18.44. Operation 51: LAYOUTRETURN - Release Layout | |||
Information . . . . . . . . . . . . . . . . . . . . . . 531 | Information . . . . . . . . . . . . . . . . . . . . . . 537 | |||
18.45. Operation 52: SECINFO_NO_NAME - Get Security on | 18.45. Operation 52: SECINFO_NO_NAME - Get Security on | |||
Unnamed Object . . . . . . . . . . . . . . . . . . . . . 535 | Unnamed Object . . . . . . . . . . . . . . . . . . . . . 541 | |||
18.46. Operation 53: SEQUENCE - Supply per-procedure | 18.46. Operation 53: SEQUENCE - Supply per-procedure | |||
sequencing and control . . . . . . . . . . . . . . . . . 537 | sequencing and control . . . . . . . . . . . . . . . . . 543 | |||
18.47. Operation 54: SET_SSV - Update SSV for a Client ID . . . 542 | 18.47. Operation 54: SET_SSV - Update SSV for a Client ID . . . 548 | |||
18.48. Operation 55: TEST_STATEID - Test stateids for | 18.48. Operation 55: TEST_STATEID - Test stateids for | |||
validity . . . . . . . . . . . . . . . . . . . . . . . . 544 | validity . . . . . . . . . . . . . . . . . . . . . . . . 550 | |||
18.49. Operation 56: WANT_DELEGATION - Request Delegation . . . 546 | 18.49. Operation 56: WANT_DELEGATION - Request Delegation . . . 552 | |||
18.50. Operation 57: DESTROY_CLIENTID - Destroy existing | 18.50. Operation 57: DESTROY_CLIENTID - Destroy existing | |||
client ID . . . . . . . . . . . . . . . . . . . . . . . 550 | client ID . . . . . . . . . . . . . . . . . . . . . . . 556 | |||
18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims | 18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims | |||
Finished . . . . . . . . . . . . . . . . . . . . . . . . 550 | Finished . . . . . . . . . . . . . . . . . . . . . . . . 556 | |||
18.52. Operation 10044: ILLEGAL - Illegal operation . . . . . . 553 | 18.52. Operation 10044: ILLEGAL - Illegal operation . . . . . . 559 | |||
19. NFSv4.1 Callback Procedures . . . . . . . . . . . . . . . . . 553 | 19. NFSv4.1 Callback Procedures . . . . . . . . . . . . . . . . . 559 | |||
19.1. Procedure 0: CB_NULL - No Operation . . . . . . . . . . 554 | 19.1. Procedure 0: CB_NULL - No Operation . . . . . . . . . . 560 | |||
19.2. Procedure 1: CB_COMPOUND - Compound Operations . . . . . 554 | 19.2. Procedure 1: CB_COMPOUND - Compound Operations . . . . . 560 | |||
20. NFSv4.1 Callback Operations . . . . . . . . . . . . . . . . . 558 | 20. NFSv4.1 Callback Operations . . . . . . . . . . . . . . . . . 564 | |||
20.1. Operation 3: CB_GETATTR - Get Attributes . . . . . . . . 558 | 20.1. Operation 3: CB_GETATTR - Get Attributes . . . . . . . . 564 | |||
20.2. Operation 4: CB_RECALL - Recall a Delegation . . . . . . 559 | 20.2. Operation 4: CB_RECALL - Recall a Delegation . . . . . . 565 | |||
20.3. Operation 5: CB_LAYOUTRECALL - Recall Layout from | 20.3. Operation 5: CB_LAYOUTRECALL - Recall Layout from | |||
Client . . . . . . . . . . . . . . . . . . . . . . . . . 560 | Client . . . . . . . . . . . . . . . . . . . . . . . . . 566 | |||
20.4. Operation 6: CB_NOTIFY - Notify directory changes . . . 564 | 20.4. Operation 6: CB_NOTIFY - Notify directory changes . . . 570 | |||
20.5. Operation 7: CB_PUSH_DELEG - Offer Delegation to | 20.5. Operation 7: CB_PUSH_DELEG - Offer Delegation to | |||
Client . . . . . . . . . . . . . . . . . . . . . . . . . 568 | Client . . . . . . . . . . . . . . . . . . . . . . . . . 574 | |||
20.6. Operation 8: CB_RECALL_ANY - Keep any N recallable | 20.6. Operation 8: CB_RECALL_ANY - Keep any N recallable | |||
objects . . . . . . . . . . . . . . . . . . . . . . . . 569 | objects . . . . . . . . . . . . . . . . . . . . . . . . 575 | |||
20.7. Operation 9: CB_RECALLABLE_OBJ_AVAIL - Signal | 20.7. Operation 9: CB_RECALLABLE_OBJ_AVAIL - Signal | |||
Resources for Recallable Objects . . . . . . . . . . . . 572 | Resources for Recallable Objects . . . . . . . . . . . . 578 | |||
20.8. Operation 10: CB_RECALL_SLOT - change flow control | 20.8. Operation 10: CB_RECALL_SLOT - change flow control | |||
limits . . . . . . . . . . . . . . . . . . . . . . . . . 573 | limits . . . . . . . . . . . . . . . . . . . . . . . . . 579 | |||
20.9. Operation 11: CB_SEQUENCE - Supply backchannel | 20.9. Operation 11: CB_SEQUENCE - Supply backchannel | |||
sequencing and control . . . . . . . . . . . . . . . . . 574 | sequencing and control . . . . . . . . . . . . . . . . . 580 | |||
20.10. Operation 12: CB_WANTS_CANCELLED - Cancel Pending | 20.10. Operation 12: CB_WANTS_CANCELLED - Cancel Pending | |||
Delegation Wants . . . . . . . . . . . . . . . . . . . . 576 | Delegation Wants . . . . . . . . . . . . . . . . . . . . 582 | |||
20.11. Operation 13: CB_NOTIFY_LOCK - Notify of possible | 20.11. Operation 13: CB_NOTIFY_LOCK - Notify of possible | |||
lock availability . . . . . . . . . . . . . . . . . . . 577 | lock availability . . . . . . . . . . . . . . . . . . . 583 | |||
20.12. Operation 14: CB_NOTIFY_DEVICEID - Notify device ID | 20.12. Operation 14: CB_NOTIFY_DEVICEID - Notify device ID | |||
changes . . . . . . . . . . . . . . . . . . . . . . . . 579 | changes . . . . . . . . . . . . . . . . . . . . . . . . 585 | |||
20.13. Operation 10044: CB_ILLEGAL - Illegal Callback | 20.13. Operation 10044: CB_ILLEGAL - Illegal Callback | |||
Operation . . . . . . . . . . . . . . . . . . . . . . . 581 | Operation . . . . . . . . . . . . . . . . . . . . . . . 587 | |||
21. Security Considerations . . . . . . . . . . . . . . . . . . . 581 | 21. Security Considerations . . . . . . . . . . . . . . . . . . . 587 | |||
22. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 583 | 22. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 589 | |||
22.1. Named Attribute Definitions . . . . . . . . . . . . . . 583 | 22.1. Named Attribute Definitions . . . . . . . . . . . . . . 589 | |||
22.1.1. Initial Registry . . . . . . . . . . . . . . . . . . 584 | 22.1.1. Initial Registry . . . . . . . . . . . . . . . . . . 590 | |||
22.1.2. Updating Registrations . . . . . . . . . . . . . . . 584 | 22.1.2. Updating Registrations . . . . . . . . . . . . . . . 590 | |||
22.2. Device ID Notifications . . . . . . . . . . . . . . . . 584 | 22.2. Device ID Notifications . . . . . . . . . . . . . . . . 590 | |||
22.2.1. Initial Registry . . . . . . . . . . . . . . . . . . 585 | 22.2.1. Initial Registry . . . . . . . . . . . . . . . . . . 591 | |||
22.2.2. Updating Registrations . . . . . . . . . . . . . . . 585 | 22.2.2. Updating Registrations . . . . . . . . . . . . . . . 591 | |||
22.3. Object Recall Types . . . . . . . . . . . . . . . . . . 585 | 22.3. Object Recall Types . . . . . . . . . . . . . . . . . . 591 | |||
22.3.1. Initial Registry . . . . . . . . . . . . . . . . . . 587 | 22.3.1. Initial Registry . . . . . . . . . . . . . . . . . . 593 | |||
22.3.2. Updating Registrations . . . . . . . . . . . . . . . 587 | 22.3.2. Updating Registrations . . . . . . . . . . . . . . . 593 | |||
22.4. Layout Types . . . . . . . . . . . . . . . . . . . . . . 587 | 22.4. Layout Types . . . . . . . . . . . . . . . . . . . . . . 593 | |||
22.4.1. Initial Registry . . . . . . . . . . . . . . . . . . 588 | 22.4.1. Initial Registry . . . . . . . . . . . . . . . . . . 594 | |||
22.4.2. Updating Registrations . . . . . . . . . . . . . . . 588 | 22.4.2. Updating Registrations . . . . . . . . . . . . . . . 594 | |||
22.4.3. Guidelines for Writing Layout Type Specifications . 588 | 22.4.3. Guidelines for Writing Layout Type Specifications . 594 | |||
22.5. Path Variable Definitions . . . . . . . . . . . . . . . 590 | 22.5. Path Variable Definitions . . . . . . . . . . . . . . . 596 | |||
22.5.1. Path Variables Registry . . . . . . . . . . . . . . 590 | 22.5.1. Path Variables Registry . . . . . . . . . . . . . . 596 | |||
22.5.2. Values for the ${ietf.org:CPU_ARCH} Variable . . . . 592 | 22.5.2. Values for the ${ietf.org:CPU_ARCH} Variable . . . . 598 | |||
22.5.3. Values for the ${ietf.org:OS_TYPE} Variable . . . . 592 | 22.5.3. Values for the ${ietf.org:OS_TYPE} Variable . . . . 598 | |||
23. References . . . . . . . . . . . . . . . . . . . . . . . . . 593 | 23. References . . . . . . . . . . . . . . . . . . . . . . . . . 599 | |||
23.1. Normative References . . . . . . . . . . . . . . . . . . 593 | 23.1. Normative References . . . . . . . . . . . . . . . . . . 599 | |||
23.2. Informative References . . . . . . . . . . . . . . . . . 595 | 23.2. Informative References . . . . . . . . . . . . . . . . . 601 | |||
Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 596 | Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 602 | |||
Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 598 | Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 604 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 599 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 605 | |||
Intellectual Property and Copyright Statements . . . . . . . . . 600 | Intellectual Property and Copyright Statements . . . . . . . . . 606 | |||
1. Introduction | 1. Introduction | |||
1.1. The NFS Version 4 Minor Version 1 Protocol | 1.1. The NFS Version 4 Minor Version 1 Protocol | |||
The NFS version 4 minor version 1 (NFSv4.1) protocol is the second | The NFS version 4 minor version 1 (NFSv4.1) protocol is the second | |||
minor version of the NFS version 4 (NFSv4) protocol. The first minor | minor version of the NFS version 4 (NFSv4) protocol. The first minor | |||
version, NFSv4.0, is described in [20]. It generally follows the | version, NFSv4.0, is described in [20]. It generally follows the | |||
guidelines for the minor versioning model listed in Section 10 of RFC | guidelines for the minor versioning model listed in Section 10 of RFC | |||
3530. However, it diverges from guidelines 11 ("a client and server | 3530. However, it diverges from guidelines 11 ("a client and server | |||
skipping to change at page 26, line 4 | skipping to change at page 26, line 4 | |||
(e.g. restarts) of the same client cause the client to present the | (e.g. restarts) of the same client cause the client to present the | |||
same string. The implementor is cautioned against an approach that | same string. The implementor is cautioned against an approach that | |||
requires the string to be recorded in a local file because this | requires the string to be recorded in a local file because this | |||
precludes the use of the implementation in an environment where | precludes the use of the implementation in an environment where | |||
there is no local disk and all file access is from an NFSv4.1 | there is no local disk and all file access is from an NFSv4.1 | |||
server. | server. | |||
o The string should be the same for each server network address that | o The string should be the same for each server network address that | |||
the client accesses. This way, if a server has multiple | the client accesses. This way, if a server has multiple | |||
interfaces, the client can trunk traffic over multiple network | interfaces, the client can trunk traffic over multiple network | |||
paths as described in Section 2.10.4. (Note: the precise opposite | paths as described in Section 2.10.5. (Note: the precise opposite | |||
was advised in the NFSv4.0 specification [20].) | was advised in the NFSv4.0 specification [20].) | |||
o The algorithm for generating the string should not assume that the | o The algorithm for generating the string should not assume that the | |||
client's network address will not change, unless the client | client's network address will not change, unless the client | |||
implementation knows it is using statically assigned network | implementation knows it is using statically assigned network | |||
addresses. This includes changes between client incarnations and | addresses. This includes changes between client incarnations and | |||
even changes while the client is still running in its current | even changes while the client is still running in its current | |||
incarnation. Thus with dynamic address assignment, if the client | incarnation. Thus with dynamic address assignment, if the client | |||
includes just the client's network address in the co_ownerid | includes just the client's network address in the co_ownerid | |||
string, there is a real risk that after the client gives up the | string, there is a real risk that after the client gives up the | |||
skipping to change at page 27, line 9 | skipping to change at page 27, line 9 | |||
The client ID is assigned by the server (the eir_clientid result from | The client ID is assigned by the server (the eir_clientid result from | |||
EXCHANGE_ID) and should be chosen so that it will not conflict with a | EXCHANGE_ID) and should be chosen so that it will not conflict with a | |||
client ID previously assigned by the server. This applies across | client ID previously assigned by the server. This applies across | |||
server restarts. | server restarts. | |||
In the event of a server restart, a client may find out that its | In the event of a server restart, a client may find out that its | |||
current client ID is no longer valid when it receives an | current client ID is no longer valid when it receives an | |||
NFS4ERR_STALE_CLIENTID error. The precise circumstances depend on | NFS4ERR_STALE_CLIENTID error. The precise circumstances depend on | |||
the characteristics of the sessions involved, specifically whether | the characteristics of the sessions involved, specifically whether | |||
the session is persistent (see Section 2.10.5.5), but in each case | the session is persistent (see Section 2.10.6.5), but in each case | |||
the client will receive this error when it attempts to establish a | the client will receive this error when it attempts to establish a | |||
new session with the existing client ID. The error indicates that a | new session with the existing client ID. The error indicates that a | |||
new client ID must be obtained via EXCHANGE_ID and the new session | new client ID must be obtained via EXCHANGE_ID and the new session | |||
established with that client ID. | established with that client ID. | |||
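The recovery path described above can be illustrated with a minimal sketch. It is not taken from the draft: FakeServer, reestablish_session, and the Status enum are hypothetical names invented for this example, and the in-memory server only imitates the two behaviors the text states (a restart invalidates existing client IDs, and newly assigned client IDs do not conflict with earlier ones).

```python
from enum import Enum


class Status(Enum):
    # Symbolic names only; the numeric protocol values are omitted on purpose.
    OK = "NFS4_OK"
    STALE_CLIENTID = "NFS4ERR_STALE_CLIENTID"


class FakeServer:
    """Hypothetical in-memory stand-in for a server; not a real NFS API."""

    def __init__(self):
        self._next_id = 1
        self._valid_ids = set()

    def exchange_id(self):
        # Assign a client ID that has never been handed out before.
        client_id = self._next_id
        self._next_id += 1
        self._valid_ids.add(client_id)
        return client_id

    def create_session(self, client_id):
        # An unknown (stale) client ID is rejected; otherwise a session is made.
        if client_id not in self._valid_ids:
            return Status.STALE_CLIENTID, None
        return Status.OK, f"session-for-{client_id}"

    def restart(self):
        # A restart forgets all client IDs, but the counter keeps running so
        # that new IDs do not conflict with previously assigned ones.
        self._valid_ids.clear()


def reestablish_session(server, client_id):
    """Try the existing client ID first; on STALE_CLIENTID obtain a new one."""
    status, session = server.create_session(client_id)
    if status is Status.STALE_CLIENTID:
        client_id = server.exchange_id()
        status, session = server.create_session(client_id)
    if status is not Status.OK:
        raise RuntimeError(f"could not establish session: {status.value}")
    return client_id, session
```

In this sketch the client keeps its client ID when the server still recognizes it, and falls back to EXCHANGE_ID followed by a new session only after the stale-client-ID error is seen.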
When a session is not persistent, the client will find out that it | When a session is not persistent, the client will find out that it | |||
needs to create a new session as a result of getting an | needs to create a new session as a result of getting an | |||
NFS4ERR_BADSESSION, since the session in question was lost as part of | NFS4ERR_BADSESSION, since the session in question was lost as part of | |||
a server restart. When the existing client ID is presented to a | a server restart. When the existing client ID is presented to a | |||
skipping to change at page 28, line 39 | skipping to change at page 28, line 39 | |||
the client ID in order to conserve resources. If the client contacts | the client ID in order to conserve resources. If the client contacts | |||
the server after this release, the server must ensure the client | the server after this release, the server must ensure the client | |||
receives the appropriate error so that it will use the EXCHANGE_ID/ | receives the appropriate error so that it will use the EXCHANGE_ID/ | |||
CREATE_SESSION sequence to establish a new client ID. The server | CREATE_SESSION sequence to establish a new client ID. The server | |||
ought to be very hesitant to release a client ID since the resulting | ought to be very hesitant to release a client ID since the resulting | |||
work on the client to recover from such an event will be the same | work on the client to recover from such an event will be the same | |||
burden as if the server had failed and restarted. Typically a server | burden as if the server had failed and restarted. Typically a server | |||
would not release a client ID unless there had been no activity from | would not release a client ID unless there had been no activity from | |||
that client for many minutes. As long as there are sessions, opens, | that client for many minutes. As long as there are sessions, opens, | |||
locks, delegations, layouts, or wants, the server MUST NOT release | locks, delegations, layouts, or wants, the server MUST NOT release | |||
the client ID. See Section 2.10.11.1.4 for discussion on releasing | the client ID. See Section 2.10.12.1.4 for discussion on releasing | |||
inactive sessions. | inactive sessions. | |||
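The release rule above can be expressed as a small predicate. This is a sketch under stated assumptions, not the draft's method: ClientState and may_release_client_id are hypothetical names, and the 600-second idle threshold is an arbitrary stand-in for "many minutes", which the text does not quantify.

```python
from dataclasses import dataclass


@dataclass
class ClientState:
    """Hypothetical per-client bookkeeping kept by a server."""
    sessions: int = 0
    opens: int = 0
    locks: int = 0
    delegations: int = 0
    layouts: int = 0
    wants: int = 0
    idle_seconds: float = 0.0


def may_release_client_id(state, min_idle_seconds=600.0):
    """Return True only if releasing the client ID is permitted.

    The MUST NOT part: any remaining session, open, lock, delegation,
    layout, or want blocks the release. The idle threshold is an assumed
    policy value, not something the text specifies.
    """
    holds_state = any((
        state.sessions,
        state.opens,
        state.locks,
        state.delegations,
        state.layouts,
        state.wants,
    ))
    if holds_state:
        return False
    return state.idle_seconds >= min_idle_seconds
```

The state check comes first because it reflects a hard requirement, while the idle time is only a policy choice a server might tune.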
2.4.3. Resolving Client Owner Conflicts | 2.4.3. Resolving Client Owner Conflicts | |||
When the server gets an EXCHANGE_ID for a client owner that currently | When the server gets an EXCHANGE_ID for a client owner that currently | |||
has no state, or that has state, but the lease has expired, the | has no state, or that has state, but the lease has expired, the | |||
server MUST allow the EXCHANGE_ID, and confirm the new client ID if | server MUST allow the EXCHANGE_ID, and confirm the new client ID if | |||
followed by the appropriate CREATE_SESSION. | followed by the appropriate CREATE_SESSION. | |||
When the server gets an EXCHANGE_ID for a new incarnation of a client | When the server gets an EXCHANGE_ID for a new incarnation of a client | |||
skipping to change at page 29, line 15 | skipping to change at page 29, line 15 | |||
o The principal that created the client ID for the client owner is | o The principal that created the client ID for the client owner is | |||
the same as the principal that is issuing the EXCHANGE_ID. Note | the same as the principal that is issuing the EXCHANGE_ID. Note | |||
that if the client ID was created with SP4_MACH_CRED state | that if the client ID was created with SP4_MACH_CRED state | |||
protection (Section 18.35), the principal MUST be based on | protection (Section 18.35), the principal MUST be based on | |||
RPCSEC_GSS authentication, the RPCSEC_GSS service used MUST be | RPCSEC_GSS authentication, the RPCSEC_GSS service used MUST be | |||
integrity or privacy, and the same GSS mechanism and principal | integrity or privacy, and the same GSS mechanism and principal | |||
must be used as that used when the client ID was created. | must be used as that used when the client ID was created. | |||
o The client ID was established with SP4_SSV protection | o The client ID was established with SP4_SSV protection | |||
(Section 18.35, Section 2.10.7.3) and the client sends the | (Section 18.35, Section 2.10.8.3) and the client sends the | |||
EXCHANGE_ID with the security flavor set to RPCSEC_GSS using the | EXCHANGE_ID with the security flavor set to RPCSEC_GSS using the | |||
GSS SSV mechanism (Section 2.10.8). | GSS SSV mechanism (Section 2.10.9). | |||
o The client ID was established with SP4_SSV protection, and under | o The client ID was established with SP4_SSV protection, and under | |||
the conditions described herein, the EXCHANGE_ID was sent with | the conditions described herein, the EXCHANGE_ID was sent with | |||
SP4_MACH_CRED state protection. Because the SSV might not persist | SP4_MACH_CRED state protection. Because the SSV might not persist | |||
across client and server restart, and because the first time a | across client and server restart, and because the first time a | |||
client sends EXCHANGE_ID to a server it does not have an SSV, the | client sends EXCHANGE_ID to a server it does not have an SSV, the | |||
client MAY send the subsequent EXCHANGE_ID without an SSV | client MAY send the subsequent EXCHANGE_ID without an SSV | |||
RPCSEC_GSS handle. Instead, as with SP4_MACH_CRED protection, the | RPCSEC_GSS handle. Instead, as with SP4_MACH_CRED protection, the | |||
principal MUST be based on RPCSEC_GSS authentication, the | principal MUST be based on RPCSEC_GSS authentication, the | |||
RPCSEC_GSS service used MUST be integrity or privacy, and the same | RPCSEC_GSS service used MUST be integrity or privacy, and the same | |||
skipping to change at page 29, line 41 | skipping to change at page 29, line 41 | |||
If none of the above situations apply, the server MUST return | If none of the above situations apply, the server MUST return | |||
NFS4ERR_CLID_INUSE. | NFS4ERR_CLID_INUSE. | |||
If the server accepts the principal and co_ownerid as matching that | If the server accepts the principal and co_ownerid as matching that | |||
which created the client ID, and the co_verifier in the EXCHANGE_ID | which created the client ID, and the co_verifier in the EXCHANGE_ID | |||
differs from the co_verifier used when the client ID was created, | differs from the co_verifier used when the client ID was created, | |||
then after the server receives a CREATE_SESSION that confirms the | then after the server receives a CREATE_SESSION that confirms the | |||
client ID, the server deletes state. If the co_verifier values are | client ID, the server deletes state. If the co_verifier values are | |||
the same, (e.g. the client is either updating properties of the | the same, (e.g. the client is either updating properties of the | |||
client ID (Section 18.35), or the client is attempting trunking | client ID (Section 18.35), or the client is attempting trunking | |||
(Section 2.10.4) the server MUST NOT delete state. | (Section 2.10.5) the server MUST NOT delete state. | |||
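The decision flow of this subsection can be sketched as follows. This is a minimal illustration, not a server implementation: the record type, the function name, and the returned labels other than NFS4ERR_CLID_INUSE are hypothetical, and the boolean `principal_accepted` stands in for "one of the situations listed above applies", since parts of that list are elided in this diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientRecord:
    """Hypothetical server-side record for an existing client owner."""
    co_ownerid: bytes
    co_verifier: bytes
    has_state: bool
    lease_expired: bool


def resolve_exchange_id(record: Optional[ClientRecord],
                        co_verifier: bytes,
                        principal_accepted: bool) -> str:
    """Return a label describing how the server treats the EXCHANGE_ID.

    The labels are illustrative only; NFS4ERR_CLID_INUSE is the one
    value named in the text.
    """
    # No state, or state whose lease has expired: the request is allowed.
    if record is None or not record.has_state or record.lease_expired:
        return "ALLOW"
    # None of the listed situations applies.
    if not principal_accepted:
        return "NFS4ERR_CLID_INUSE"
    # New incarnation: state is deleted once CREATE_SESSION confirms.
    if co_verifier != record.co_verifier:
        return "DELETE_STATE_AFTER_CONFIRM"
    # Same verifier (property update or trunking): state is preserved.
    return "KEEP_STATE"
```

Note that in the differing-verifier case the deletion is deferred until the confirming CREATE_SESSION arrives; the sketch only reports the outcome and does not model that timing.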
2.5. Server Owners | 2.5. Server Owners | |||
The Server Owner is similar to a Client Owner (Section 2.4), but | The Server Owner is similar to a Client Owner (Section 2.4), but | |||
unlike the Client Owner, there is no shorthand server ID. The Server | unlike the Client Owner, there is no shorthand server ID. The Server | |||
Owner is defined in the following data type: | Owner is defined in the following data type: | |||
struct server_owner4 { | struct server_owner4 { | |||
uint64_t so_minor_id; | uint64_t so_minor_id; | |||
opaque so_major_id<NFS4_OPAQUE_LIMIT>; | opaque so_major_id<NFS4_OPAQUE_LIMIT>; | |||
}; | }; | |||
The Server Owner is returned from EXCHANGE_ID. When the so_major_id | The Server Owner is returned from EXCHANGE_ID. When the so_major_id | |||
fields are the same in two EXCHANGE_ID results, the connections each | fields are the same in two EXCHANGE_ID results, the connections each | |||
EXCHANGE_ID were sent over can be assumed to address the same Server | EXCHANGE_ID were sent over can be assumed to address the same Server | |||
(as defined in Section 1.5). If the so_minor_id fields are also the | (as defined in Section 1.5). If the so_minor_id fields are also the | |||
same, then not only do both connections connect to the same server, | same, then not only do both connections connect to the same server, | |||
but the session can be shared across both connections. The reader is | but the session can be shared across both connections. The reader is | |||
cautioned that multiple servers may deliberately or accidentally | cautioned that multiple servers may deliberately or accidentally | |||
claim to have the same so_major_id or so_major_id/so_minor_id; the | claim to have the same so_major_id or so_major_id/so_minor_id; the | |||
reader should examine Section 2.10.4 and Section 18.35 in order to | reader should examine Section 2.10.5 and Section 18.35 in order to | |||
avoid acting on falsely matching Server Owner values. | avoid acting on falsely matching Server Owner values. | |||
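As an illustration of the comparison described above, a client might classify two EXCHANGE_ID results as in the sketch below. The class mirrors the fields of server_owner4; the function name and the returned labels are hypothetical. The result is only the server's claim, which the client still has to validate as cautioned above before acting on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerOwner:
    """Mirror of the server_owner4 fields returned by EXCHANGE_ID."""
    so_minor_id: int
    so_major_id: bytes


def compare_server_owners(a: ServerOwner, b: ServerOwner) -> str:
    """Classify what two server owner values claim (labels are illustrative)."""
    if a.so_major_id != b.so_major_id:
        # Different major IDs: no claim that the servers are the same.
        return "different-server"
    if a.so_minor_id != b.so_minor_id:
        # Same major ID only: same server, sessions not shareable.
        return "same-server"
    # Both fields match: a session can be shared across both connections.
    return "same-server-session-shareable"
```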
The considerations for generating a so_major_id are similar to that | The considerations for generating a so_major_id are similar to that | |||
for generating a co_ownerid string (see Section 2.4). The | for generating a co_ownerid string (see Section 2.4). The | |||
consequences of two servers generating conflicting so_major_id values | consequences of two servers generating conflicting so_major_id values | |||
are less dire than they are for co_ownerid conflicts because the | are less dire than they are for co_ownerid conflicts because the | |||
client can use RPCSEC_GSS to compare the authenticity of each server | client can use RPCSEC_GSS to compare the authenticity of each server | |||
(see Section 2.10.4). | (see Section 2.10.5). | |||
2.6. Security Service Negotiation | 2.6. Security Service Negotiation | |||
With the NFSv4.1 server potentially offering multiple security | With the NFSv4.1 server potentially offering multiple security | |||
mechanisms, the client needs a method to determine or negotiate which | mechanisms, the client needs a method to determine or negotiate which | |||
mechanism is to be used for its communication with the server. The | mechanism is to be used for its communication with the server. The | |||
NFS server may have multiple points within its file system namespace | NFS server may have multiple points within its file system namespace | |||
that are available for use by NFS clients. These points can be | that are available for use by NFS clients. These points can be | |||
considered security policy boundaries, and in some NFS | considered security policy boundaries, and in some NFS | |||
implementations are tied to NFS export points. In turn the NFS | implementations are tied to NFS export points. In turn the NFS | |||
skipping to change at page 40, line 22 | skipping to change at page 40, line 22 | |||
In order to reduce congestion, if a connection-oriented transport is | In order to reduce congestion, if a connection-oriented transport is | |||
used, and the request is not the NULL procedure, | used, and the request is not the NULL procedure, | |||
o A requester MUST NOT retry a request unless the connection the | o A requester MUST NOT retry a request unless the connection the | |||
request was sent over was lost before the reply was received. | request was sent over was lost before the reply was received. | |||
o A replier MUST NOT silently drop a request, even if the request is | o A replier MUST NOT silently drop a request, even if the request is | |||
a retry. (The silent drop behavior of RPCSEC_GSS [4] does not | a retry. (The silent drop behavior of RPCSEC_GSS [4] does not | |||
apply because this behavior happens at the RPCSEC_GSS layer, a | apply because this behavior happens at the RPCSEC_GSS layer, a | |||
lower layer in the request processing). Instead, the replier | lower layer in the request processing). Instead, the replier | |||
SHOULD return an appropriate error (see Section 2.10.5.1) or it | SHOULD return an appropriate error (see Section 2.10.6.1) or it | |||
MAY disconnect the connection. | MAY disconnect the connection. | |||
When sending a reply, the replier MUST send the reply to the same | When sending a reply, the replier MUST send the reply to the same | |||
full network address (e.g. if using an IP-based transport, the source | full network address (e.g. if using an IP-based transport, the source | |||
port of the requester is part of the full network address) that the | port of the requester is part of the full network address) that the | |||
requester sent the request from. If using a connection-oriented | requester sent the request from. If using a connection-oriented | |||
transport, replies MUST be sent on the same connection the request | transport, replies MUST be sent on the same connection the request | |||
was received from. | was received from. | |||
If a connection is dropped after the replier receives the request but | If a connection is dropped after the replier receives the request but | |||
skipping to change at page 41, line 15 | skipping to change at page 41, line 15 | |||
o RDMA credits present a new issue to the reply cache in NFSv4.1. | o RDMA credits present a new issue to the reply cache in NFSv4.1. | |||
The reply cache may be used when a connection within a session is | The reply cache may be used when a connection within a session is | |||
lost, such as after the client reconnects. Credit information is | lost, such as after the client reconnects. Credit information is | |||
a dynamic property of the RDMA connection, and stale values must | a dynamic property of the RDMA connection, and stale values must | |||
not be replayed from the cache. This implies that the reply cache | not be replayed from the cache. This implies that the reply cache | |||
contents must not be blindly used when replies are sent from it, | contents must not be blindly used when replies are sent from it, | |||
and credit information appropriate to the channel must be | and credit information appropriate to the channel must be | |||
refreshed by the RPC layer. | refreshed by the RPC layer. | |||
In addition, as described in Section 2.10.5.2, while a session is | In addition, as described in Section 2.10.6.2, while a session is | |||
active, the NFSv4.1 requester MUST NOT stop waiting for a reply. | active, the NFSv4.1 requester MUST NOT stop waiting for a reply. | |||
2.9.3. Ports | 2.9.3. Ports | |||
Historically, NFSv3 servers have listened over TCP port 2049. The | Historically, NFSv3 servers have listened over TCP port 2049. The | |||
registered port 2049 [24] for the NFS protocol should be the default | registered port 2049 [24] for the NFS protocol should be the default | |||
configuration. NFSv4.1 clients SHOULD NOT use the RPC binding | configuration. NFSv4.1 clients SHOULD NOT use the RPC binding | |||
protocols as described in [25]. | protocols as described in [25]. | |||
2.10. Session | 2.10. Session | |||
skipping to change at page 41, line 51 | skipping to change at page 41, line 51 | |||
o Requiring machine credentials for fully secure operation. | o Requiring machine credentials for fully secure operation. | |||
Through the introduction of a session, NFSv4.1 addresses the above | Through the introduction of a session, NFSv4.1 addresses the above | |||
shortfalls with practical solutions: | shortfalls with practical solutions: | |||
o EOS is enabled by a reply cache with a bounded size, making it | o EOS is enabled by a reply cache with a bounded size, making it | |||
feasible to keep the cache in persistent storage and enable EOS | feasible to keep the cache in persistent storage and enable EOS | |||
through server failure and recovery. One reason that previous | through server failure and recovery. One reason that previous | |||
revisions of NFS did not support EOS was because some EOS | revisions of NFS did not support EOS was because some EOS | |||
approaches often limited parallelism. As will be explained in | approaches often limited parallelism. As will be explained in | |||
Section 2.10.5, NFSv4.1 supports both EOS and unlimited | Section 2.10.6, NFSv4.1 supports both EOS and unlimited | |||
parallelism. | parallelism. | |||
o The NFSv4.1 client (defined in Section 1.5, Paragraph 2) creates | o The NFSv4.1 client (defined in Section 1.5, Paragraph 2) creates | |||
transport connections and provides them to the server to use for | transport connections and provides them to the server to use for | |||
sending callback requests, thus solving the firewall issue | sending callback requests, thus solving the firewall issue | |||
(Section 18.34). Races between responses from client requests, | (Section 18.34). Races between responses from client requests, | |||
and callbacks caused by the requests are detected via the | and callbacks caused by the requests are detected via the | |||
session's sequencing properties which are a consequence of EOS | session's sequencing properties which are a consequence of EOS | |||
(Section 2.10.5.3). | (Section 2.10.6.3). | |||
o The NFSv4.1 client can add an arbitrary number of connections to | o The NFSv4.1 client can add an arbitrary number of connections to | |||
the session, and thus provide trunking (Section 2.10.4). | the session, and thus provide trunking (Section 2.10.5). | |||
o The NFSv4.1 client and server produce a session key independent | o The NFSv4.1 client and server produce a session key independent | |||
of client and server machine credentials which can be used to | of client and server machine credentials which can be used to | |||
compute a digest for protecting critical session management | compute a digest for protecting critical session management | |||
operations (Section 2.10.7.3). | operations (Section 2.10.8.3). | |||
o The NFSv4.1 client can also create secure RPCSEC_GSS contexts for | o The NFSv4.1 client can also create secure RPCSEC_GSS contexts for | |||
use by the session's backchannel that do not require the server to | use by the session's backchannel that do not require the server to | |||
authenticate to a client machine principal (Section 2.10.7.2). | authenticate to a client machine principal (Section 2.10.8.2). | |||
A session is a dynamically created, long-lived server object created | A session is a dynamically created, long-lived server object created | |||
by a client, used over time from one or more transport connections. | by a client, used over time from one or more transport connections. | |||
Its function is to maintain the server's state relative to the | Its function is to maintain the server's state relative to the | |||
connection(s) belonging to a client instance. This state is entirely | connection(s) belonging to a client instance. This state is entirely | |||
independent of the connection itself, and indeed the state exists | independent of the connection itself, and indeed the state exists | |||
whether the connection exists or not. A client may have one or more | whether the connection exists or not. A client may have one or more | |||
sessions associated with it so that client-associated state may be | sessions associated with it so that client-associated state may be | |||
accessed using any of the sessions associated with that client's | accessed using any of the sessions associated with that client's | |||
client ID, when connections are associated with those sessions. When | client ID, when connections are associated with those sessions. When | |||
skipping to change at page 43, line 20 | skipping to change at page 43, line 20 | |||
established session, with the exception of some session | established session, with the exception of some session | |||
administration operations, such as DESTROY_SESSION (Section 18.37). | administration operations, such as DESTROY_SESSION (Section 18.37). | |||
2.10.2.1. SEQUENCE and CB_SEQUENCE | 2.10.2.1. SEQUENCE and CB_SEQUENCE | |||
In NFSv4.1, when the SEQUENCE operation is present, it MUST be the | In NFSv4.1, when the SEQUENCE operation is present, it MUST be the | |||
first operation in the COMPOUND procedure. The primary purpose of | first operation in the COMPOUND procedure. The primary purpose of | |||
SEQUENCE is to carry the session identifier. The session identifier | SEQUENCE is to carry the session identifier. The session identifier | |||
associates all other operations in the COMPOUND procedure with a | associates all other operations in the COMPOUND procedure with a | |||
particular session. SEQUENCE also contains required information for | particular session. SEQUENCE also contains required information for | |||
maintaining EOS (see Section 2.10.5). Session-enabled NFSv4.1 | maintaining EOS (see Section 2.10.6). Session-enabled NFSv4.1 | |||
COMPOUND requests thus have the form: | COMPOUND requests thus have the form: | |||
+-----+--------------+-----------+------------+-----------+---- | +-----+--------------+-----------+------------+-----------+---- | |||
| tag | minorversion | numops |SEQUENCE op | op + args | ... | | tag | minorversion | numops |SEQUENCE op | op + args | ... | |||
| | (== 1) | (limited) | + args | | | | | (== 1) | (limited) | + args | | | |||
+-----+--------------+-----------+------------+-----------+---- | +-----+--------------+-----------+------------+-----------+---- | |||
and the replys have the form: | and the replies have the form: | |||
+------------+-----+--------+-------------------------------+--// | +------------+-----+--------+-------------------------------+--// | |||
|last status | tag | numres |status + SEQUENCE op + results | // | |last status | tag | numres |status + SEQUENCE op + results | // | |||
+------------+-----+--------+-------------------------------+--// | +------------+-----+--------+-------------------------------+--// | |||
//-----------------------+---- | //-----------------------+---- | |||
// status + op + results | ... | // status + op + results | ... | |||
//-----------------------+---- | //-----------------------+---- | |||
A CB_COMPOUND procedure request and reply has a similar form to | A CB_COMPOUND procedure request and reply has a similar form to | |||
COMPOUND, but instead of a SEQUENCE operation, there is a CB_SEQUENCE | COMPOUND, but instead of a SEQUENCE operation, there is a CB_SEQUENCE | |||
operation. CB_COMPOUND also has an additional field called | operation. CB_COMPOUND also has an additional field called | |||
"callback_ident", which is superfluous in NFSv4.1 and MUST be ignored | "callback_ident", which is superfluous in NFSv4.1 and MUST be ignored | |||
by the client. CB_SEQUENCE has the same information as SEQUENCE, and | by the client. CB_SEQUENCE has the same information as SEQUENCE, and | |||
also includes other information needed to resolve callback races | also includes other information needed to resolve callback races | |||
(Section 2.10.5.3). | (Section 2.10.6.3). | |||
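The placement rule for SEQUENCE can be checked mechanically. The sketch below is a hypothetical helper, assuming a COMPOUND is represented simply as a list of operation names; it does not model the arguments or the error a server would return. Passing "CB_SEQUENCE" as the second argument applies the same check to a CB_COMPOUND, on the assumption that the "similar form" described above carries the same placement rule.

```python
from typing import Sequence


def sequence_position_ok(ops: Sequence[str], seq_op: str = "SEQUENCE") -> bool:
    """Return True if seq_op is absent, or appears exactly once as the first op."""
    if seq_op not in ops:
        # The rule only constrains requests that contain the operation.
        return True
    # Present: it must be first, and must not be repeated later.
    return ops[0] == seq_op and list(ops).count(seq_op) == 1
```

For example, `["SEQUENCE", "PUTFH", "READ"]` passes, while `["PUTFH", "SEQUENCE", "READ"]` does not.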
2.10.2.2. Client ID and Session Association | 2.10.2.2. Client ID and Session Association | |||
Each client ID (Section 2.4) can have zero or more active sessions. | Each client ID (Section 2.4) can have zero or more active sessions. | |||
A client ID and associated session are required to perform file | A client ID and associated session are required to perform file | |||
access in NFSv4.1. Each time a session is used (whether by a client | access in NFSv4.1. Each time a session is used (whether by a client | |||
sending a request to the server, or the client replying to a callback | sending a request to the server, or the client replying to a callback | |||
request from the server), the state leased to its associated client | request from the server), the state leased to its associated client | |||
ID is automatically renewed. | ID is automatically renewed. | |||
State such as share reservations, locks, delegations, and layouts | State such as share reservations, locks, delegations, and layouts | |||
(Section 1.6.4) is tied to the client ID. Client state is not tied | (Section 1.6.4) is tied to the client ID. Client state is not tied | |||
to any individual session. Successive state changing operations from | to any individual session. Successive state changing operations from | |||
a given state owner MAY go over different sessions, provided the | a given state owner MAY go over different sessions, provided the | |||
session is associated with the same client ID. A callback MAY arrive | session is associated with the same client ID. A callback MAY arrive | |||
over a different session than from the session that originally | over a different session than from the session that originally | |||
acquired the state pertaining to the callback. For example, if | acquired the state pertaining to the callback. For example, if | |||
session A is used to acquire a delegation, a request to recall the | session A is used to acquire a delegation, a request to recall the | |||
delegation MAY arrive over session B if both sessions are associated | delegation MAY arrive over session B if both sessions are associated | |||
with the same client ID. Section 2.10.7.1 and Section 2.10.7.2 | with the same client ID. Section 2.10.8.1 and Section 2.10.8.2 | |||
discuss the security considerations around callbacks. | discuss the security considerations around callbacks. | |||
2.10.3. Channels | 2.10.3. Channels | |||
A channel is not a connection. A channel represents the direction | A channel is not a connection. A channel represents the direction | |||
ONC RPC requests are sent. | ONC RPC requests are sent. | |||
Each session has one or two channels: the fore channel and the | Each session has one or two channels: the fore channel and the | |||
backchannel. Because there are at most two channels per session, and | backchannel. Because there are at most two channels per session, and | |||
because each channel has a distinct purpose, channels are not | because each channel has a distinct purpose, channels are not | |||
skipping to change at page 44, line 39 | skipping to change at page 44, line 39 | |||
server, and carries COMPOUND requests and responses. A session | server, and carries COMPOUND requests and responses. A session | |||
always has a fore channel. | always has a fore channel. | |||
The backchannel is used for callback requests from server to client, and | The backchannel is used for callback requests from server to client, and | |||
carries CB_COMPOUND requests and responses. Whether there is a | carries CB_COMPOUND requests and responses. Whether there is a | |||
backchannel or not is a decision by the client, however many features | backchannel or not is a decision by the client, however many features | |||
of NFSv4.1 require a backchannel. NFSv4.1 servers MUST support | of NFSv4.1 require a backchannel. NFSv4.1 servers MUST support | |||
backchannels. | backchannels. | |||
Each session has resources for each channel, including separate reply | Each session has resources for each channel, including separate reply | |||
caches (see Section 2.10.5.1). Note that even the backchannel | caches (see Section 2.10.6.1). Note that even the backchannel | |||
requires a reply cache because some callback operations are | requires a reply cache because some callback operations are | |||
nonidempotent. | nonidempotent. | |||
2.10.3.1. Association of Connections, Channels, and Sessions | 2.10.3.1. Association of Connections, Channels, and Sessions | |||
Each channel is associated with zero or more transport connections | Each channel is associated with zero or more transport connections | |||
(whether of the same transport protocol or different transport | (whether of the same transport protocol or different transport | |||
protocols). A connection can be associated with one channel or both | protocols). A connection can be associated with one channel or both | |||
channels of a session; the client and server negotiate whether a | channels of a session; the client and server negotiate whether a | |||
connection will carry traffic for one channel or both channels via | connection will carry traffic for one channel or both channels via | |||
skipping to change at page 45, line 22 | skipping to change at page 45, line 22 | |||
A connection's association with a session is not exclusive. A | A connection's association with a session is not exclusive. A | |||
connection associated with the channel(s) of one session may be | connection associated with the channel(s) of one session may be | |||
simultaneously associated with the channel(s) of other sessions | simultaneously associated with the channel(s) of other sessions | |||
including sessions associated with other client IDs. | including sessions associated with other client IDs. | |||
It is permissible for connections of multiple transport types to be | It is permissible for connections of multiple transport types to be | |||
associated with the same channel. For example both a TCP and RDMA | associated with the same channel. For example both a TCP and RDMA | |||
connection can be associated with the fore channel. In the event an | connection can be associated with the fore channel. In the event an | |||
RDMA and non-RDMA connection are associated with the same channel, | RDMA and non-RDMA connection are associated with the same channel, | |||
the maximum number of slots SHOULD be at least one more than the | the maximum number of slots SHOULD be at least one more than the | |||
total number of RDMA credits (Section 2.10.5.1). This way if all RDMA | total number of RDMA credits (Section 2.10.6.1). This way if all RDMA | |||
credits are used, the non-RDMA connection can have at least one | credits are used, the non-RDMA connection can have at least one | |||
outstanding request. If a server supports multiple transport types, | outstanding request. If a server supports multiple transport types, | |||
it MUST allow a client to associate connections from each transport | it MUST allow a client to associate connections from each transport | |||
to a channel. | to a channel. | |||
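The slot recommendation for a channel shared by RDMA and non-RDMA connections is simple arithmetic, sketched below. The function names are hypothetical, and the sketch assumes the total RDMA credit count is known to whoever sizes the slot table.

```python
def min_slots_for_mixed_transports(total_rdma_credits: int) -> int:
    """Smallest maximum slot count that satisfies the SHOULD above."""
    if total_rdma_credits < 0:
        raise ValueError("credit count cannot be negative")
    # One slot beyond the RDMA credits stays free for the non-RDMA connection.
    return total_rdma_credits + 1


def leaves_room_for_non_rdma(max_slots: int, total_rdma_credits: int) -> bool:
    """True if a non-RDMA request can be outstanding with all credits in use."""
    return max_slots >= min_slots_for_mixed_transports(total_rdma_credits)
```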
It is permissible for a connection of one type of transport to be | It is permissible for a connection of one type of transport to be | |||
associated with the fore channel, and a connection of a different | associated with the fore channel, and a connection of a different | |||
type to be associated with the backchannel. | type to be associated with the backchannel. | |||
2.10.4. Trunking | 2.10.4. Server Scope | |||
Servers each specify a server scope value in the form of an opaque | ||||
string eir_server_scope returned as part of the results of an | ||||
EXCHANGE_ID operation. The purpose of the server scope is to allow | ||||
groups of servers to indicate to clients that a set of servers | ||||
sharing the same server scope value have arranged to use compatible | ||||
values of otherwise opaque identifiers so that the identifiers | ||||
generated by one server of that set may be presented to another of | ||||
that same scope. | ||||
The use of such compatible values does not imply that a value | ||||
generated by one server will always be accepted by another. In most | ||||
cases, it will not. However, a server will not accept a value | ||||
generated by another inadvertently. When it does accept it, it will | ||||
be because it is recognized as valid and carrying the same meaning as | ||||
on another server of the same scope. | ||||
When servers are of the same server scope, this compatibility of | ||||
values applies to the following identifiers: | ||||
o Filehandle values. A filehandle value accepted by two servers of | ||||
the same server scope denotes the same object. A write done to | ||||
one server is reflected immediately in a read done to the other | ||||
and locks obtained on one server conflict with those requested on | ||||
the other. | ||||
o Session ID values. A session ID value accepted by two servers of | ||||
the same server scope denotes the same session. | ||||
o Client ID values. A client ID value accepted as valid by two | ||||
servers of the same server scope is associated with two clients | ||||
with the same client owner and verifier. | ||||
o State ID values when the corresponding client ID is recognized as | ||||
valid. If the same stateid value is accepted as valid on two | ||||
servers of the same scope and the client IDs on the two servers | ||||
represent the same client owner and verifier, then the two state | ||||
ID values designate the same set of locks and are for the same | ||||
file. | ||||
o Server owner values. When the server scope values are the same, | ||||
server owner values may be validly compared. In cases where the | ||||
server scopes are different, server owner values are treated as | ||||
different even if they contain all identical bytes. | ||||
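The last item in the list above implies that server scope gates any server owner comparison. A minimal sketch, assuming a server owner is represented as a (so_major_id, so_minor_id) tuple and the scope as the opaque eir_server_scope bytes; the function name is hypothetical:

```python
from typing import Tuple

ServerOwnerValue = Tuple[bytes, int]  # (so_major_id, so_minor_id)


def server_owners_match(scope_a: bytes, owner_a: ServerOwnerValue,
                        scope_b: bytes, owner_b: ServerOwnerValue) -> bool:
    """Compare server owners only when both servers report the same scope."""
    if scope_a != scope_b:
        # Different scopes: owners are treated as different,
        # even if they contain identical bytes.
        return False
    return owner_a == owner_b
```

A True result reflects only what the servers claim; it does not by itself show that the common scope is legitimate.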
The co-ordination among servers required to provide such | ||||
compatibility can be quite minimal, and limited to a simple partition | ||||
of the ID space. The recognition of common values requires | ||||
additional implementation, but this can be tailored to the specific | ||||
situations in which that recognition is desired. | ||||
Clients will have occasion to compare the server scope values of | ||||
multiple servers under a number of circumstances, each of which will | ||||
be discussed under the appropriate functional section. | ||||
o When server owner values received in response to EXCHANGE_ID | ||||
operations issued to multiple network addresses are compared for | ||||
the purpose of determining the validity of various forms of | ||||
trunking, as described in Section 2.10.5. | ||||
o When network or server reconfiguration causes the same network | ||||
address to possibly be directed to different servers, with the | ||||
necessity for the client to determine when lock reclaim should be | ||||
attempted, as described in Section 8.4.2.1. | ||||
o When file system migration causes the transfer of responsibility | ||||
for a file system between servers and the client needs to | ||||
determine whether state has been transferred with the file system | ||||
and whether a client may reclaim it on a similar basis as in the | ||||
case of server reboot. | ||||
When two EXCHANGE_ID replies, each from a different server network | ||||
address, have the same server scope, there are a number of ways a | ||||
client can validate that the common server scope is due to benign | ||||
intent. | ||||
o If both EXCHANGE_ID requests were sent with RPCSEC_GSS | ||||
authentication and the server principal is the same for both | ||||
targets, the equality of server scope is validated. | ||||
o A second option for verification is to use SP4_SSV protection in a | ||||
fashion similar to the verification of client ID trunking. When | ||||
the client sends EXCHANGE_ID it specifies SP4_SSV protection. The | ||||
first EXCHANGE_ID the client sends always has to be confirmed by a | ||||
CREATE_SESSION call. The client then sends SET_SSV. Later the | ||||
client sends EXCHANGE_ID to a second destination network address | ||||
different from the one the first EXCHANGE_ID was sent to. The | ||||
client checks that each EXCHANGE_ID reply has the same | ||||
eir_server_scope. If so, the client verifies the claim by issuing | ||||
a CREATE_SESSION to the second destination address, protected with | ||||
RPCSEC_GSS integrity using an RPCSEC_GSS handle returned by the | ||||
second EXCHANGE_ID. If the server accepts the CREATE_SESSION | ||||
request, and if the client verifies the RPCSEC_GSS verifier and | ||||
integrity codes, then the client has proof the second server knows | ||||
the SSV, and thus the two servers are co-operating for the | ||||
purposes of maintaining compatible ID spaces as indicated by a | ||||
common server scope. | ||||
o If neither of the two methods provides verification, the client | ||||
may accept the appearance of the second server in fs_locations or | ||||
fs_locations_info attribute for a relevant file system. For | ||||
example, if there is a migration event for a particular | ||||
file system or there are locks to be reclaimed on a particular | ||||
file system, the attributes for that particular file system may be | ||||
used. The client sends the GETATTR request to the first server | ||||
for the fs_locations or fs_locations_info attribute with | ||||
RPCSEC_GSS authentication. It may need to do this in advance of | ||||
the need to verify the common server scope. If the client | ||||
successfully authenticates the reply to GETATTR, and the GETATTR | ||||
request and reply containing the fs_locations or fs_locations_info | ||||
attribute refers to the second server, then the equality of server | ||||
scope is supported. A client may choose to limit the use of this | ||||
form of support to information relevant to the specific file | ||||
system involved (e.g. a file system being migrated). | ||||
2.10.5. Trunking | ||||
Trunking is the use of multiple connections between a client and | Trunking is the use of multiple connections between a client and | |||
server in order to increase the speed of data transfer. NFSv4.1 | server in order to increase the speed of data transfer. NFSv4.1 | |||
supports two types of trunking: session trunking and client ID | supports two types of trunking: session trunking and client ID | |||
trunking. NFSv4.1 repliers and requesters MUST support session | trunking. | |||
trunking. NFSv4.1 servers MAY support client ID trunking. NFSv4.1 | ||||
clients MUST support client ID trunking. | NFSv4.1 servers MUST support both forms of trunking within the | |||
context of a single server network address and MUST support both | ||||
forms within the context of the set of network addresses used to | ||||
access a single server. NFSv4.1 servers in a clustered configuration | ||||
MAY allow network addresses for different servers to use client ID | ||||
trunking. | ||||
Clients may use either form of trunking as long as they do not, when | ||||
trunking between different server network addresses, violate the | ||||
servers' mandates as to the kinds of trunking to be allowed (see | ||||
below). With regard to callback channels, the client MUST allow the | ||||
server to choose among all callback channels valid for a given client | ||||
ID and MUST support trunking when the connections supporting the | ||||
backchannel allow session or client ID trunking to be used for | ||||
callbacks. | ||||
Session trunking is essentially the association of multiple | Session trunking is essentially the association of multiple | |||
connections, each with potentially different target and/or source | connections, each with potentially different target and/or source | |||
network addresses, to the same session. | network addresses, to the same session. When the target network | |||
addresses (server addresses) of the two connections are the same, the | ||||
server MUST support such session trunking. When the target network | ||||
addresses are different, the server MAY indicate such support using | ||||
the data returned by the EXCHANGE_ID operation (see below). | ||||
Client ID trunking is the association of multiple sessions to the | Client ID trunking is the association of multiple sessions to the | |||
same client ID, major server owner ID (Section 2.5), and server scope | same client ID. Servers MUST support client ID trunking for two | |||
(Section 11.7.7). When two servers return the same major server | target network addresses whenever they allow session trunking for | |||
owner and server scope it means the two servers are cooperating on | those same two network addresses. In addition, a server MAY, by | |||
presenting the same major server owner ID (Section 2.5), and server | ||||
scope (Section 11.7.7) allow an additional case of client ID | ||||
trunking. When two servers return the same major server owner and | ||||
server scope, it means that the two servers are cooperating on | ||||
locking state management which is a prerequisite for client ID | locking state management which is a prerequisite for client ID | |||
trunking. | trunking. | |||
Understanding and distinguishing session and client ID trunking | Understanding and distinguishing when the client is allowed to use | |||
requires understanding how the results of the EXCHANGE_ID | session and client ID trunking requires understanding how the results | |||
(Section 18.35) operation identify a server. Suppose a client sends | of the EXCHANGE_ID (Section 18.35) operation identify a server. | |||
EXCHANGE_ID over two different connections each with a possibly | Suppose a client sends EXCHANGE_ID over two different connections | |||
different target network address but each EXCHANGE_ID with the same | each with a possibly different target network address but each | |||
value in the eia_clientowner field. If the same NFSv4.1 server is | EXCHANGE_ID operation has the same value in the eia_clientowner | |||
listening over each connection, then each EXCHANGE_ID result MUST | field. If the same NFSv4.1 server is listening over each connection, | |||
return the same values of eir_clientid, eir_server_owner.so_major_id | then each EXCHANGE_ID result MUST return the same values of | |||
and eir_server_scope. The client can then treat each connection as | eir_clientid, eir_server_owner.so_major_id and eir_server_scope. The | |||
referring to the same server (subject to verification, see | client can then treat each connection as referring to the same server | |||
Paragraph 5 later in this section), and it can use each connection to | (subject to verification, see Paragraph 8 later in this section), and | |||
trunk requests and replies. The question is whether session trunking | it can use each connection to trunk requests and replies. The | |||
and/or client ID trunking applies. | client's choice is whether session trunking or client ID trunking | |||
applies. | ||||
Session Trunking If the eia_clientowner argument is the same in two | Session Trunking If the eia_clientowner argument is the same in two | |||
different EXCHANGE_ID requests, and the eir_clientid, | different EXCHANGE_ID requests, and the eir_clientid, | |||
eir_server_owner.so_major_id, eir_server_owner.so_minor_id, and | eir_server_owner.so_major_id, eir_server_owner.so_minor_id, and | |||
eir_server_scope results match in both EXCHANGE_ID results, then | eir_server_scope results match in both EXCHANGE_ID results, then | |||
the client is permitted to perform session trunking. If the | the client is permitted to perform session trunking. If the | |||
client has no session mapping to the tuple of eir_clientid, | client has no session mapping to the tuple of eir_clientid, | |||
eir_server_owner.so_major_id, eir_server_scope, | eir_server_owner.so_major_id, eir_server_scope, | |||
eir_server_owner.so_minor_id, then it creates the session via a | eir_server_owner.so_minor_id, then it creates the session via a | |||
CREATE_SESSION operation over one of the connections, which | CREATE_SESSION operation over one of the connections, which | |||
associates the connection to the session. If there is a session | associates the connection to the session. If there is a session | |||
for the tuple, the client can send BIND_CONN_TO_SESSION to | for the tuple, the client can send BIND_CONN_TO_SESSION to | |||
associate the connection to the session. (Of course, if the | associate the connection to the session. | |||
client does not want to use session trunking, it can invoke | ||||
CREATE_SESSION on the connection. This will result in client ID | Of course, if the client does not desire to use session trunking, | |||
trunking as described below.) | it is not required to do so. It can invoke CREATE_SESSION on the | |||
connection. This will result in client ID trunking as described | ||||
below. It can also decide to drop the connection if it does not | ||||
choose to use trunking. | ||||
Client ID Trunking If the eia_clientowner argument is the same in | Client ID Trunking If the eia_clientowner argument is the same in | |||
two different EXCHANGE_ID requests, and the eir_clientid, | two different EXCHANGE_ID requests, and the eir_clientid, | |||
eir_server_owner.so_major_id, and eir_server_scope results match | eir_server_owner.so_major_id, and eir_server_scope results match | |||
in both EXCHANGE_ID results, but the eir_server_owner.so_minor_id | in both EXCHANGE_ID results, but the eir_server_owner.so_minor_id | |||
results do not match then the client is permitted to perform | results do not match then the client is permitted to perform | |||
client ID trunking. The client can associate each connection with | client ID trunking. The client can associate each connection with | |||
different sessions, where each session is associated with the same | different sessions, where each session is associated with the same | |||
server. | server. | |||
Of course, even if the eir_server_owner.so_minor_id fields do | Of course, even if the eir_server_owner.so_minor_id fields do | |||
match, the client is free to employ client ID trunking instead of | match, the client is free to employ client ID trunking instead of | |||
session trunking. | session trunking. | |||
The client completes the act of client ID trunking by invoking | The client completes the act of client ID trunking by invoking | |||
CREATE_SESSION on each connection, using the same client ID that | CREATE_SESSION on each connection, using the same client ID that | |||
was returned in eir_clientid. These invocations create two | was returned in eir_clientid. These invocations create two | |||
sessions and also associate each connection with each session. | sessions and also associate each connection with its respective | |||
session. The client is free to choose not to use client ID | ||||
trunking by simply dropping the connection at this point. | ||||
When doing client ID trunking, locking state is shared across | When doing client ID trunking, locking state is shared across | |||
sessions associated with the same client ID. This requires the | sessions associated with that same client ID. This requires the | |||
server to coordinate state across sessions. | server to coordinate state across sessions. | |||
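
The matching rules for the two forms of trunking can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ExchangeIdResult type, the Trunking enumeration, and the permitted_trunking function are hypothetical names introduced here, and the sketch assumes both EXCHANGE_ID requests carried the same eia_clientowner value. It reports what the results permit; it does not perform the verification discussed later in this section.

```python
from dataclasses import dataclass
from enum import Enum


class Trunking(Enum):
    """Strongest trunking form permitted by a pair of EXCHANGE_ID results."""
    NONE = "none"
    CLIENT_ID = "client_id"
    SESSION = "session"


@dataclass(frozen=True)
class ExchangeIdResult:
    """Hypothetical holder for the EXCHANGE_ID result fields used here."""
    eir_clientid: int
    so_major_id: bytes       # eir_server_owner.so_major_id
    so_minor_id: int         # eir_server_owner.so_minor_id
    eir_server_scope: bytes


def permitted_trunking(a, b):
    """Compare two results obtained with the same eia_clientowner (assumed)."""
    same_server = (
        a.eir_clientid == b.eir_clientid
        and a.so_major_id == b.so_major_id
        and a.eir_server_scope == b.eir_server_scope
    )
    if not same_server:
        return Trunking.NONE
    if a.so_minor_id == b.so_minor_id:
        # All four values match: session trunking is permitted.
        return Trunking.SESSION
    # Only so_minor_id differs: client ID trunking is permitted.
    return Trunking.CLIENT_ID
```

A result of Trunking.SESSION does not oblige the client to use session trunking; as noted above, it remains free to use client ID trunking instead, or to drop the connection.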
The client should be prepared for the possibility that | ||||
eir_server_owner values may be different on subsequent EXCHANGE_ID | ||||
requests made to the same network address, as a result of various | ||||
sorts of reconfiguration events. When this happens and the changes | ||||
result in the invalidation of previously valid forms of trunking, the | ||||
client should cease to use those forms, either by dropping | ||||
connections or by adding sessions. For a discussion of lock reclaim | ||||
as it relates to such reconfiguration events, see Section 8.4.2.1. | ||||
When two servers over two connections claim matching or partially | When two servers over two connections claim matching or partially | |||
matching eir_server_owner, eir_server_scope, and eir_clientid values, | matching eir_server_owner, eir_server_scope, and eir_clientid values, | |||
the client does not have to trust the servers' claims. The client | the client does not have to trust the servers' claims. The client | |||
may verify these claims before trunking traffic in the following | may verify these claims before trunking traffic in the following | |||
ways: | ways: | |||
o For session trunking, clients SHOULD reliably verify if | o For session trunking, clients SHOULD reliably verify if | |||
connections between different network paths are in fact associated | connections between different network paths are in fact associated | |||
with the same NFSv4.1 server and usable on the same session, and | with the same NFSv4.1 server and usable on the same session, and | |||
servers MUST allow clients to perform reliable verification. When | servers MUST allow clients to perform reliable verification. When | |||
skipping to change at page 47, line 35 | skipping to change at page 50, line 41 | |||
SP4_SSV, reliable verification depends on a shared secret (the | SP4_SSV, reliable verification depends on a shared secret (the | |||
SSV) that is established via the SET_SSV (Section 18.47) | SSV) that is established via the SET_SSV (Section 18.47) | |||
operation. | operation. | |||
When a new connection is associated with the session (via the | When a new connection is associated with the session (via the | |||
BIND_CONN_TO_SESSION operation, see Section 18.34), if the client | BIND_CONN_TO_SESSION operation, see Section 18.34), if the client | |||
specified SP4_SSV state protection for the BIND_CONN_TO_SESSION | specified SP4_SSV state protection for the BIND_CONN_TO_SESSION | |||
operation, the client MUST send the BIND_CONN_TO_SESSION with | operation, the client MUST send the BIND_CONN_TO_SESSION with | |||
RPCSEC_GSS protection, using integrity or privacy, and an | RPCSEC_GSS protection, using integrity or privacy, and an | |||
RPCSEC_GSS handle created with the GSS SSV mechanism | RPCSEC_GSS handle created with the GSS SSV mechanism | |||
(Section 2.10.8). | (Section 2.10.9). | |||
If the client mistakenly tries to associate a connection to a | If the client mistakenly tries to associate a connection to a | |||
session of a wrong server, the server will either reject the | session of a wrong server, the server will either reject the | |||
attempt because it is not aware of the session identifier of the | attempt because it is not aware of the session identifier of the | |||
BIND_CONN_TO_SESSION arguments, or it will reject the attempt | BIND_CONN_TO_SESSION arguments, or it will reject the attempt | |||
because the RPCSEC_GSS authentication fails. Even if the server | because the RPCSEC_GSS authentication fails. Even if the server | |||
mistakenly or maliciously accepts the connection association | mistakenly or maliciously accepts the connection association | |||
attempt, the RPCSEC_GSS verifier it computes in the response will | attempt, the RPCSEC_GSS verifier it computes in the response will | |||
not be verified by the client, so the client will know it cannot | not be verified by the client, so the client will know it cannot | |||
use the connection for trunking the specified session. | use the connection for trunking the specified session. | |||
skipping to change at page 48, line 22 | skipping to change at page 51, line 27 | |||
authentication, the client notes the principal name of the GSS | authentication, the client notes the principal name of the GSS | |||
target. If the EXCHANGE_ID results indicate client ID trunking is | target. If the EXCHANGE_ID results indicate client ID trunking is | |||
possible, and the GSS targets' principal names are the same, the | possible, and the GSS targets' principal names are the same, the | |||
servers are the same and client ID trunking is allowed. | servers are the same and client ID trunking is allowed. | |||
The second option for verification is to use SP4_SSV protection. | The second option for verification is to use SP4_SSV protection. | |||
When the client sends EXCHANGE_ID it specifies SP4_SSV protection. | When the client sends EXCHANGE_ID it specifies SP4_SSV protection. | |||
The first EXCHANGE_ID the client sends always has to be confirmed | The first EXCHANGE_ID the client sends always has to be confirmed | |||
by a CREATE_SESSION call. The client then sends SET_SSV. Later | by a CREATE_SESSION call. The client then sends SET_SSV. Later | |||
the client sends EXCHANGE_ID to a second destination network | the client sends EXCHANGE_ID to a second destination network | |||
address than the first EXCHANGE_ID was sent with. The client | address different from the one the first EXCHANGE_ID was sent to. | |||
checks that each EXCHANGE_ID reply has the same eir_clientid, | The client checks that each EXCHANGE_ID reply has the same | |||
eir_server_owner.so_major_id, and eir_server_scope. If so, the | eir_clientid, eir_server_owner.so_major_id, and eir_server_scope. | |||
client verifies the claim by issuing a CREATE_SESSION to the | If so, the client verifies the claim by issuing a CREATE_SESSION | |||
second destination address, protected with RPCSEC_GSS integrity | to the second destination address, protected with RPCSEC_GSS | |||
using an RPCSEC_GSS handle returned by the second EXCHANGE_ID. If | integrity using an RPCSEC_GSS handle returned by the second | |||
the server accepts the CREATE_SESSION request, and if the client | EXCHANGE_ID. If the server accepts the CREATE_SESSION request, | |||
verifies the RPCSEC_GSS verifier and integrity codes, then the | and if the client verifies the RPCSEC_GSS verifier and integrity | |||
client has proof the second server knows the SSV, and thus the two | codes, then the client has proof the second server knows the SSV, | |||
servers are the same for the purposes of client ID trunking. | and thus the two servers are co-operating for the purposes of | |||
specifying server scope and client ID trunking. | ||||
2.10.5. Exactly Once Semantics | 2.10.6. Exactly Once Semantics | |||
Via the session, NFSv4.1 offers Exactly Once Semantics (EOS) for | Via the session, NFSv4.1 offers Exactly Once Semantics (EOS) for | |||
requests sent over a channel. EOS is supported on both the fore and | requests sent over a channel. EOS is supported on both the fore and | |||
back channels. | back channels. | |||
Each COMPOUND or CB_COMPOUND request that is sent with a leading | Each COMPOUND or CB_COMPOUND request that is sent with a leading | |||
SEQUENCE or CB_SEQUENCE operation MUST be executed by the receiver | SEQUENCE or CB_SEQUENCE operation MUST be executed by the receiver | |||
exactly once. This requirement holds regardless of whether the | exactly once. This requirement holds regardless of whether the | |||
request is sent with reply caching specified (see | request is sent with reply caching specified (see | |||
Section 2.10.5.1.3). The requirement holds even if the requester is | Section 2.10.6.1.3). The requirement holds even if the requester is | |||
issuing the request over a session created between a pNFS data client | issuing the request over a session created between a pNFS data client | |||
and pNFS data server. To understand the rationale for this | and pNFS data server. To understand the rationale for this | |||
requirement, divide the requests into three classifications: | requirement, divide the requests into three classifications: | |||
o Nonidempotent requests. | o Nonidempotent requests. | |||
o Idempotent modifying requests. | o Idempotent modifying requests. | |||
o Idempotent non-modifying requests. | o Idempotent non-modifying requests. | |||
skipping to change at page 49, line 46 | skipping to change at page 52, line 49 | |||
execution of such a request will not cause data corruption, or | execution of such a request will not cause data corruption, or | |||
produce an incorrect result. Nonetheless, to keep the implementation | produce an incorrect result. Nonetheless, to keep the implementation | |||
simple, the replier MUST enforce EOS for all requests whether | simple, the replier MUST enforce EOS for all requests whether | |||
idempotent and non-modifying or not. | idempotent and non-modifying or not. | |||
Note that true and complete EOS is not possible unless the server | Note that true and complete EOS is not possible unless the server | |||
persists the reply cache in stable storage, unless the server is | persists the reply cache in stable storage, unless the server is | |||
somehow implemented to never require a restart (indeed if such a | somehow implemented to never require a restart (indeed if such a | |||
server exists, the distinction between a reply cache kept in stable | server exists, the distinction between a reply cache kept in stable | |||
storage versus one that is not is one without meaning). See | storage versus one that is not is one without meaning). See | |||
Section 2.10.5.5 for a discussion of persistence in the reply cache. | Section 2.10.6.5 for a discussion of persistence in the reply cache. | |||
Regardless, even if the server does not persist the reply cache, EOS | Regardless, even if the server does not persist the reply cache, EOS | |||
improves robustness and correctness over previous versions of NFS | improves robustness and correctness over previous versions of NFS | |||
because the legacy duplicate request/reply caches were based on the | because the legacy duplicate request/reply caches were based on the | |||
ONC RPC transaction identifier (XID). Section 2.10.5.1 explains the | ONC RPC transaction identifier (XID). Section 2.10.6.1 explains the | |||
shortcomings of the XID as a basis for a reply cache and describes | shortcomings of the XID as a basis for a reply cache and describes | |||
how NFSv4.1 sessions improve upon the XID. | how NFSv4.1 sessions improve upon the XID. | |||
2.10.5.1. Slot Identifiers and Reply Cache | 2.10.6.1. Slot Identifiers and Reply Cache | |||
The RPC layer provides a transaction ID (XID), which, while required | The RPC layer provides a transaction ID (XID), which, while required | |||
to be unique, is not convenient for tracking requests for two | to be unique, is not convenient for tracking requests for two | |||
reasons. First, the XID is only meaningful to the requester; it | reasons. First, the XID is only meaningful to the requester; it | |||
cannot be interpreted by the replier except to test for equality with | cannot be interpreted by the replier except to test for equality with | |||
previously sent requests. When consulting an RPC-based duplicate | previously sent requests. When consulting an RPC-based duplicate | |||
request cache, the opaqueness of the XID requires a computationally | request cache, the opaqueness of the XID requires a computationally | |||
expensive lookup (often via a hash that includes XID and source | expensive lookup (often via a hash that includes XID and source | |||
address). NFSv4.1 requests use a non-opaque slot ID which is an | address). NFSv4.1 requests use a non-opaque slot ID which is an | |||
index into a slot table, which is far more efficient. Second, | index into a slot table, which is far more efficient. Second, | |||
skipping to change at page 51, line 20 | skipping to change at page 54, line 26 | |||
request is: | request is: | |||
o A new request, in which the sequence ID is one greater than that | o A new request, in which the sequence ID is one greater than that | |||
previously seen in the slot (accounting for sequence wraparound). | previously seen in the slot (accounting for sequence wraparound). | |||
The replier proceeds to execute the new request, and the replier | The replier proceeds to execute the new request, and the replier | |||
MUST increase the slot's sequence ID by one. | MUST increase the slot's sequence ID by one. | |||
o A retransmitted request, in which the sequence ID is equal to that | o A retransmitted request, in which the sequence ID is equal to that | |||
currently recorded in the slot. If the original request has | currently recorded in the slot. If the original request has | |||
executed to completion, the replier returns the cached reply. See | executed to completion, the replier returns the cached reply. See | |||
Section 2.10.5.2 for direction on how the replier deals with | Section 2.10.6.2 for direction on how the replier deals with | |||
retries of requests that are still in progress. | retries of requests that are still in progress. | |||
o A misordered retry, in which the sequence ID is less than | o A misordered retry, in which the sequence ID is less than | |||
(accounting for sequence wraparound) that previously seen in the | (accounting for sequence wraparound) that previously seen in the | |||
slot. The replier MUST return NFS4ERR_SEQ_MISORDERED (as the | slot. The replier MUST return NFS4ERR_SEQ_MISORDERED (as the | |||
result from SEQUENCE or CB_SEQUENCE). | result from SEQUENCE or CB_SEQUENCE). | |||
o A misordered new request, in which the sequence ID is two or more | o A misordered new request, in which the sequence ID is two or more | |||
greater than (accounting for sequence wraparound) that previously | greater than (accounting for sequence wraparound) that previously | |||
seen in the slot. Note that because the sequence ID must | seen in the slot. Note that because the sequence ID must | |||
skipping to change at page 54, line 23 | skipping to change at page 57, line 27 | |||
because the request may have been sent from the requester before | because the request may have been sent from the requester before | |||
the update was received. Therefore, in the downward adjustment | the update was received. Therefore, in the downward adjustment | |||
case, the replier may have to retain a number of reply cache | case, the replier may have to retain a number of reply cache | |||
entries at least as large as the old value of maximum requests | entries at least as large as the old value of maximum requests | |||
outstanding, until it can infer that the requester has seen a | outstanding, until it can infer that the requester has seen a | |||
reply containing the new granted highest_slotid. The replier can | reply containing the new granted highest_slotid. The replier can | |||
infer that the requester has seen such a reply when it receives a new | infer that the requester has seen such a reply when it receives a new | |||
request with the same slot ID as the request replied to and the | request with the same slot ID as the request replied to and the | |||
next higher sequence ID. | next higher sequence ID. | |||
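
The four cases the replier distinguishes when comparing a request's sequence ID with the value recorded in the slot can be sketched as follows. This is a minimal illustration, not a definitive implementation. It assumes 32-bit unsigned sequence IDs, and it assumes that "accounting for sequence wraparound" can be modelled by treating the forward half of the number space as "ahead" and the other half as "behind"; that split, and the names SlotCheck and classify, are choices made for this sketch.

```python
from enum import Enum

SEQ_MOD = 2 ** 32  # assumption: sequence IDs are 32-bit unsigned values


class SlotCheck(Enum):
    NEW_REQUEST = "new request"
    RETRANSMISSION = "retransmitted request"
    MISORDERED_RETRY = "misordered retry"
    MISORDERED_NEW = "misordered new request"


def classify(slot_seqid, request_seqid):
    """Classify a request against the sequence ID recorded in its slot."""
    # Distance from the slot's value to the request's value, modulo 2**32.
    delta = (request_seqid - slot_seqid) % SEQ_MOD
    if delta == 0:
        return SlotCheck.RETRANSMISSION
    if delta == 1:
        return SlotCheck.NEW_REQUEST
    if delta < SEQ_MOD // 2:
        # Two or more ahead of the slot (assumed half-range rule).
        return SlotCheck.MISORDERED_NEW
    # Behind the slot once wraparound is taken into account.
    return SlotCheck.MISORDERED_RETRY
```

For example, with a slot value of 2**32 - 1, a request carrying sequence ID 0 is classified as a new request, because the difference modulo 2**32 is one.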
2.10.5.1.1. Caching of SEQUENCE and CB_SEQUENCE Replies | 2.10.6.1.1. Caching of SEQUENCE and CB_SEQUENCE Replies | |||
When a SEQUENCE or CB_SEQUENCE operation is successfully executed, | When a SEQUENCE or CB_SEQUENCE operation is successfully executed, | |||
its reply MUST always be cached. Specifically, session ID, sequence | its reply MUST always be cached. Specifically, session ID, sequence | |||
ID, and slot ID MUST be cached in the reply cache. The reply from | ID, and slot ID MUST be cached in the reply cache. The reply from | |||
SEQUENCE also includes the highest slot ID, target highest slot ID, | SEQUENCE also includes the highest slot ID, target highest slot ID, | |||
and status flags. Instead of caching these values, the server MAY | and status flags. Instead of caching these values, the server MAY | |||
re-compute the values from the current state of the fore channel, | re-compute the values from the current state of the fore channel, | |||
session and/or client ID as appropriate. Similarly, the reply from | session and/or client ID as appropriate. Similarly, the reply from | |||
CB_SEQUENCE includes a highest slot ID and target highest slot ID. | CB_SEQUENCE includes a highest slot ID and target highest slot ID. | |||
The client MAY re-compute the values from the current state of the | The client MAY re-compute the values from the current state of the | |||
skipping to change at page 55, line 5 | skipping to change at page 58, line 8 | |||
response to the retry, or is a delayed response to the original | response to the retry, or is a delayed response to the original | |||
request. Therefore, it may be the case that highest slot ID, target | request. Therefore, it may be the case that highest slot ID, target | |||
slot ID, or status bits may reflect the state of affairs when the | slot ID, or status bits may reflect the state of affairs when the | |||
request was first executed. Although acting based on such delayed | request was first executed. Although acting based on such delayed | |||
information is valid, it may cause the receiver to do unneeded work. | information is valid, it may cause the receiver to do unneeded work. | |||
Requesters MAY choose to send additional requests to get the current | Requesters MAY choose to send additional requests to get the current | |||
state of affairs or use the state of affairs reported by subsequent | state of affairs or use the state of affairs reported by subsequent | |||
requests, in preference to acting immediately on data which may be | requests, in preference to acting immediately on data which may be | |||
out of date. | out of date. | |||
2.10.5.1.2. Errors from SEQUENCE and CB_SEQUENCE | 2.10.6.1.2. Errors from SEQUENCE and CB_SEQUENCE | |||
Any time SEQUENCE or CB_SEQUENCE returns an error, the sequence ID of | Any time SEQUENCE or CB_SEQUENCE returns an error, the sequence ID of | |||
the slot MUST NOT change. The replier MUST NOT modify the reply | the slot MUST NOT change. The replier MUST NOT modify the reply | |||
cache entry for the slot whenever an error is returned from SEQUENCE | cache entry for the slot whenever an error is returned from SEQUENCE | |||
or CB_SEQUENCE. | or CB_SEQUENCE. | |||
2.10.5.1.3. Optional Reply Caching | 2.10.6.1.3. Optional Reply Caching | |||
On a per-request basis the requester can choose to direct the replier | On a per-request basis the requester can choose to direct the replier | |||
to cache the reply to all operations after the first operation | to cache the reply to all operations after the first operation | |||
(SEQUENCE or CB_SEQUENCE) via the sa_cachethis or csa_cachethis | (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or csa_cachethis | |||
fields of the arguments to SEQUENCE or CB_SEQUENCE. The reason it | fields of the arguments to SEQUENCE or CB_SEQUENCE. The reason it | |||
would not direct the replier to cache the entire reply is that the | would not direct the replier to cache the entire reply is that the | |||
request is composed of all idempotent operations [23]. Caching the | request is composed of all idempotent operations [23]. Caching the | |||
reply may offer little benefit. If the reply is too large (see | reply may offer little benefit. If the reply is too large (see | |||
Section 2.10.5.4), it may not be cacheable anyway. Even if the reply | Section 2.10.6.4), it may not be cacheable anyway. Even if the reply | |||
to an idempotent request is small enough to cache, unnecessarily caching | to an idempotent request is small enough to cache, unnecessarily caching | |||
the reply slows down the server and increases RPC latency. | the reply slows down the server and increases RPC latency. | |||
Whether the requester requests the reply to be cached or not has no | Whether the requester requests the reply to be cached or not has no | |||
effect on the slot processing. If the results of SEQUENCE or | effect on the slot processing. If the results of SEQUENCE or | |||
CB_SEQUENCE are NFS4_OK, then the slot's sequence ID MUST be | CB_SEQUENCE are NFS4_OK, then the slot's sequence ID MUST be | |||
incremented by one. If a requester does not direct the replier to | incremented by one. If a requester does not direct the replier to | |||
cache the reply, the replier MUST do one of the following: | cache the reply, the replier MUST do one of the following: | |||
o The replier can cache the entire original reply. Even though | o The replier can cache the entire original reply. Even though | |||
sa_cachethis or csa_cachethis are FALSE, the replier is always | sa_cachethis or csa_cachethis are FALSE, the replier is always | |||
free to cache. It may choose this approach in order to simplify | free to cache. It may choose this approach in order to simplify | |||
implementation. | implementation. | |||
o The replier enters into its reply cache a reply consisting of the | o The replier enters into its reply cache a reply consisting of the | |||
original results to the SEQUENCE or CB_SEQUENCE operation, and | original results to the SEQUENCE or CB_SEQUENCE operation, and | |||
with the next operation in COMPOUND or CB_COMPOUND having the | with the next operation in COMPOUND or CB_COMPOUND having the | |||
error NFS4ERR_RETRY_UNCACHED_REP. Thus if the requester later | error NFS4ERR_RETRY_UNCACHED_REP. Thus if the requester later | |||
retries the request, it will get NFS4ERR_RETRY_UNCACHED_REP. | retries the request, it will get NFS4ERR_RETRY_UNCACHED_REP. | |||
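The two options above can be illustrated with a minimal sketch. It is not taken from the specification: the Slot class, the complete_request function, and the cache_anyway flag are hypothetical names, status codes are modeled as plain strings, and the slot's initial sequence ID of zero is an assumption made only for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

NFS4_OK = "NFS4_OK"
NFS4ERR_RETRY_UNCACHED_REP = "NFS4ERR_RETRY_UNCACHED_REP"

OpResult = Tuple[str, str]  # (operation name, status), hypothetical shape


@dataclass
class Slot:
    """Hypothetical per-slot state kept by the replier."""
    seqid: int = 0
    cached_reply: Optional[List[OpResult]] = None


def complete_request(slot: Slot, results: List[OpResult],
                     cachethis: bool, cache_anyway: bool = False) -> List[OpResult]:
    """Record the outcome of a request whose SEQUENCE (first op) succeeded."""
    if not results or results[0][1] != NFS4_OK:
        raise ValueError("sketch only covers a successful SEQUENCE/CB_SEQUENCE")
    # The slot's sequence ID advances whether or not caching was requested.
    slot.seqid += 1
    if cachethis or cache_anyway:
        # Option 1: cache the entire original reply.
        slot.cached_reply = list(results)
    else:
        # Option 2: cache the SEQUENCE result plus an error on the next op.
        stub = [results[0]]
        if len(results) > 1:
            stub.append((results[1][0], NFS4ERR_RETRY_UNCACHED_REP))
        slot.cached_reply = stub
    return results


slot = Slot()
complete_request(slot, [("SEQUENCE", NFS4_OK), ("PUTFH", NFS4_OK),
                        ("READ", NFS4_OK)], cachethis=False)
print(slot.seqid, slot.cached_reply)
```

In this sketch a later retry on the same slot would be answered from cached_reply, so the requester sees NFS4ERR_RETRY_UNCACHED_REP on the operation following SEQUENCE.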
2.10.5.2. Retry and Replay of Reply | 2.10.6.2. Retry and Replay of Reply | |||
A requester MUST NOT retry a request, unless the connection it used | A requester MUST NOT retry a request, unless the connection it used | |||
to send the request disconnects. The requester can then reconnect | to send the request disconnects. The requester can then reconnect | |||
and re-send the request, or it can re-send the request over a | and re-send the request, or it can re-send the request over a | |||
different connection that is associated with the same session. | different connection that is associated with the same session. | |||
If the requester is a server wanting to re-send a callback operation | If the requester is a server wanting to re-send a callback operation | |||
over the backchannel of a session, the requester of course cannot | over the backchannel of a session, the requester of course cannot | |||
reconnect because only the client can associate connections with the | reconnect because only the client can associate connections with the | |||
backchannel. The server can re-send the request over another | backchannel. The server can re-send the request over another | |||
skipping to change at page 56, line 46 | skipping to change at page 60, line 5 | |||
A retry might be sent while the original request is still in progress | A retry might be sent while the original request is still in progress | |||
on the replier. The replier SHOULD deal with the issue by returning | on the replier. The replier SHOULD deal with the issue by returning | |||
NFS4ERR_DELAY as the reply to the SEQUENCE or CB_SEQUENCE operation, but | NFS4ERR_DELAY as the reply to the SEQUENCE or CB_SEQUENCE operation, but | |||
implementations MAY return NFS4ERR_MISORDERED. Since errors from | implementations MAY return NFS4ERR_MISORDERED. Since errors from | |||
SEQUENCE and CB_SEQUENCE are never recorded in the reply cache, this | SEQUENCE and CB_SEQUENCE are never recorded in the reply cache, this | |||
approach allows the results of the execution of the original request | approach allows the results of the execution of the original request | |||
to be properly recorded in the reply cache (assuming the requester | to be properly recorded in the reply cache (assuming the requester | |||
specified the reply to be cached). | specified the reply to be cached). | |||
2.10.5.3. Resolving Server Callback Races | 2.10.6.3. Resolving Server Callback Races | |||
It is possible for server callbacks to arrive at the client before | It is possible for server callbacks to arrive at the client before | |||
the reply from related fore channel operations. For example, a | the reply from related fore channel operations. For example, a | |||
client may have been granted a delegation to a file it has opened, | client may have been granted a delegation to a file it has opened, | |||
but the reply to the OPEN (informing the client of the granting of | but the reply to the OPEN (informing the client of the granting of | |||
the delegation) may be delayed in the network. If a conflicting | the delegation) may be delayed in the network. If a conflicting | |||
operation arrives at the server, it will recall the delegation using | operation arrives at the server, it will recall the delegation using | |||
the backchannel, which may be on a different transport connection, | the backchannel, which may be on a different transport connection, | |||
perhaps even a different network, or even a different session | perhaps even a different network, or even a different session | |||
associated with the same client ID | associated with the same client ID | |||
skipping to change at page 58, line 8 | skipping to change at page 61, line 13 | |||
to arrive before responding to the CB_COMPOUND that won the race, | to arrive before responding to the CB_COMPOUND that won the race, | |||
because it is possible that it will be delayed indefinitely. The | because it is possible that it will be delayed indefinitely. The | |||
client should assume the likely case that the reply will arrive | client should assume the likely case that the reply will arrive | |||
within the average round trip time for COMPOUND requests to the | within the average round trip time for COMPOUND requests to the | |||
server, and wait that period of time. If that period of time expires | server, and wait that period of time. If that period of time expires | |||
it can respond to the CB_COMPOUND with NFS4ERR_DELAY. | it can respond to the CB_COMPOUND with NFS4ERR_DELAY. | |||
There are other scenarios under which callbacks may race replies. | There are other scenarios under which callbacks may race replies. | |||
Among them are pNFS layout recalls as described in Section 12.5.5.2. | Among them are pNFS layout recalls as described in Section 12.5.5.2. | |||
2.10.5.4. COMPOUND and CB_COMPOUND Construction Issues | 2.10.6.4. COMPOUND and CB_COMPOUND Construction Issues | |||
Very large requests and replies may pose both buffer management | Very large requests and replies may pose both buffer management | |||
issues (especially with RDMA) and reply cache issues. When the | issues (especially with RDMA) and reply cache issues. When the | |||
session is created (Section 18.36), for each channel (fore and | session is created (Section 18.36), for each channel (fore and | |||
back), the client and server negotiate the maximum sized request they | back), the client and server negotiate the maximum sized request they | |||
will send or process (ca_maxrequestsize), the maximum sized reply | will send or process (ca_maxrequestsize), the maximum sized reply | |||
they will return or process (ca_maxresponsesize), and the maximum | they will return or process (ca_maxresponsesize), and the maximum | |||
sized reply they will store in the reply cache | sized reply they will store in the reply cache | |||
(ca_maxresponsesize_cached). | (ca_maxresponsesize_cached). | |||
skipping to change at page 58, line 40 | skipping to change at page 61, line 45 | |||
If a reply exceeds ca_maxresponsesize, the reply will have the status | If a reply exceeds ca_maxresponsesize, the reply will have the status | |||
NFS4ERR_REP_TOO_BIG. A replier MAY return NFS4ERR_REP_TOO_BIG as the | NFS4ERR_REP_TOO_BIG. A replier MAY return NFS4ERR_REP_TOO_BIG as the | |||
status for the first operation (SEQUENCE or CB_SEQUENCE) in the request, | status for the first operation (SEQUENCE or CB_SEQUENCE) in the request, | |||
or it MAY opt to return it on a subsequent operation (in the same | or it MAY opt to return it on a subsequent operation (in the same | |||
COMPOUND or CB_COMPOUND reply). A replier MAY return | COMPOUND or CB_COMPOUND reply). A replier MAY return | |||
NFS4ERR_REP_TOO_BIG in the reply to SEQUENCE or CB_SEQUENCE, even if | NFS4ERR_REP_TOO_BIG in the reply to SEQUENCE or CB_SEQUENCE, even if | |||
the response would still exceed ca_maxresponsesize. | the response would still exceed ca_maxresponsesize. | |||
If sa_cachethis or csa_cachethis are TRUE, then the replier MUST | If sa_cachethis or csa_cachethis are TRUE, then the replier MUST | |||
cache a reply except if an error is returned by the SEQUENCE or | cache a reply except if an error is returned by the SEQUENCE or | |||
CB_SEQUENCE operation (see Section 2.10.5.1.2). If the reply exceeds | CB_SEQUENCE operation (see Section 2.10.6.1.2). If the reply exceeds | |||
ca_maxresponsesize_cached (and sa_cachethis or csa_cachethis are | ca_maxresponsesize_cached (and sa_cachethis or csa_cachethis are | |||
TRUE), then the server MUST return NFS4ERR_REP_TOO_BIG_TO_CACHE. Even | TRUE), then the server MUST return NFS4ERR_REP_TOO_BIG_TO_CACHE. Even | |||
if NFS4ERR_REP_TOO_BIG_TO_CACHE (or any other error for that matter) | if NFS4ERR_REP_TOO_BIG_TO_CACHE (or any other error for that matter) | |||
is returned on an operation other than the first operation (SEQUENCE or | is returned on an operation other than the first operation (SEQUENCE or | |||
CB_SEQUENCE), then the reply MUST be cached if sa_cachethis or | CB_SEQUENCE), then the reply MUST be cached if sa_cachethis or | |||
csa_cachethis are TRUE. For example, if a COMPOUND has eleven | csa_cachethis are TRUE. For example, if a COMPOUND has eleven | |||
operations, including SEQUENCE, the fifth operation is a RENAME, and | operations, including SEQUENCE, the fifth operation is a RENAME, and | |||
the tenth operation is a READ for one million bytes, the server may | the tenth operation is a READ for one million bytes, the server may | |||
return NFS4ERR_REP_TOO_BIG_TO_CACHE on the tenth operation. Since | return NFS4ERR_REP_TOO_BIG_TO_CACHE on the tenth operation. Since | |||
the server executed several operations, especially the non-idempotent | the server executed several operations, especially the non-idempotent | |||
skipping to change at page 59, line 47 | skipping to change at page 63, line 5 | |||
too large on the next operation, especially if the operation is | too large on the next operation, especially if the operation is | |||
OPEN. | OPEN. | |||
o A server MAY return NFS4ERR_UNSAFE_COMPOUND to a non-idempotent | o A server MAY return NFS4ERR_UNSAFE_COMPOUND to a non-idempotent | |||
current filehandle changing operation, if it looks at the next | current filehandle changing operation, if it looks at the next | |||
operation (in the same COMPOUND procedure) and finds it is not | operation (in the same COMPOUND procedure) and finds it is not | |||
GETFH. The server SHOULD do this if it is unable to determine in | GETFH. The server SHOULD do this if it is unable to determine in | |||
advance whether the total response size would exceed | advance whether the total response size would exceed | |||
ca_maxresponsesize_cached or ca_maxresponsesize. | ca_maxresponsesize_cached or ca_maxresponsesize. | |||
2.10.5.5. Persistence | 2.10.6.5. Persistence | |||
Since the reply cache is bounded, it is practical for the reply cache | Since the reply cache is bounded, it is practical for the reply cache | |||
to persist across server restarts. The replier MUST persist the | to persist across server restarts. The replier MUST persist the | |||
following information if it agreed to persist the session (when the | following information if it agreed to persist the session (when the | |||
session was created; see Section 18.36): | session was created; see Section 18.36): | |||
o The session ID. | o The session ID. | |||
o The slot table including the sequence ID and cached reply for each | o The slot table including the sequence ID and cached reply for each | |||
slot. | slot. | |||
skipping to change at page 61, line 24 | skipping to change at page 64, line 29 | |||
failure before the transaction is committed, then the server rolls | failure before the transaction is committed, then the server rolls | |||
back the transaction. If the server itself fails, then when it restarts, | back the transaction. If the server itself fails, then when it restarts, | |||
its recovery logic could roll back the transaction before starting | its recovery logic could roll back the transaction before starting | |||
the NFSv4.1 server. | the NFSv4.1 server. | |||
While the description of the implementation for atomic execution of | While the description of the implementation for atomic execution of | |||
the request and caching of the reply is beyond the scope of this | the request and caching of the reply is beyond the scope of this | |||
document, an example implementation for NFSv2 [27] is described in | document, an example implementation for NFSv2 [27] is described in | |||
[28]. | [28]. | |||
2.10.6. RDMA Considerations | 2.10.7. RDMA Considerations | |||
A complete discussion of the operation of RPC-based protocols over | A complete discussion of the operation of RPC-based protocols over | |||
RDMA transports is in [8]. A discussion of the operation of NFSv4, | RDMA transports is in [8]. A discussion of the operation of NFSv4, | |||
including NFSv4.1, over RDMA is in [9]. Where RDMA is considered, | including NFSv4.1, over RDMA is in [9]. Where RDMA is considered, | |||
this specification assumes the use of such a layering; it addresses | this specification assumes the use of such a layering; it addresses | |||
only the upper layer issues relevant to making best use of RPC/RDMA. | only the upper layer issues relevant to making best use of RPC/RDMA. | |||
2.10.6.1. RDMA Connection Resources | 2.10.7.1. RDMA Connection Resources | |||
RDMA requires its consumers to register memory and post buffers of a | RDMA requires its consumers to register memory and post buffers of a | |||
specific size and number for receive operations. | specific size and number for receive operations. | |||
Registration of memory can be a relatively high-overhead operation, | Registration of memory can be a relatively high-overhead operation, | |||
since it requires pinning of buffers, assignment of attributes (e.g. | since it requires pinning of buffers, assignment of attributes (e.g. | |||
readable/writable), and initialization of hardware translation. | readable/writable), and initialization of hardware translation. | |||
Preregistration is desirable to reduce overhead. These registrations | Preregistration is desirable to reduce overhead. These registrations | |||
are specific to hardware interfaces and even to RDMA connection | are specific to hardware interfaces and even to RDMA connection | |||
endpoints, therefore negotiation of their limits is desirable to | endpoints, therefore negotiation of their limits is desirable to | |||
skipping to change at page 62, line 13 | skipping to change at page 65, line 18 | |||
NFSv4.1 manages slots as resources on a per session basis (see | NFSv4.1 manages slots as resources on a per session basis (see | |||
Section 2.10), while RDMA connections manage credits on a per | Section 2.10), while RDMA connections manage credits on a per | |||
connection basis. This means that in order for a peer to send data | connection basis. This means that in order for a peer to send data | |||
over RDMA to a remote buffer, it has to have both an NFSv4.1 slot, | over RDMA to a remote buffer, it has to have both an NFSv4.1 slot, | |||
and an RDMA credit. If multiple RDMA connections are associated with | and an RDMA credit. If multiple RDMA connections are associated with | |||
a session, then if the total number of credits across all RDMA | a session, then if the total number of credits across all RDMA | |||
connections associated with the session is X, and the number of slots in | connections associated with the session is X, and the number of slots in | |||
the session is Y, then the maximum number of outstanding requests is | the session is Y, then the maximum number of outstanding requests is | |||
the lesser of X and Y. | the lesser of X and Y. | |||
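As a small worked illustration of the rule above, the following sketch computes the bound from per-connection credit counts and the session's slot count. The function name and its parameters are hypothetical and chosen only for this example.

```python
from typing import Iterable


def max_outstanding_requests(credits_per_connection: Iterable[int],
                             session_slots: int) -> int:
    """Return the lesser of X (total RDMA credits) and Y (session slots)."""
    total_credits = sum(credits_per_connection)  # X, summed over connections
    return min(total_credits, session_slots)     # bounded by Y


# Two RDMA connections with 8 credits each, and a 32-slot session:
print(max_outstanding_requests([8, 8], 32))
```

Here the credits are the limiting resource; with more credits than slots, the slot count would bound the result instead.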
2.10.6.2. Flow Control | 2.10.7.2. Flow Control | |||
Previous versions of NFS do not provide flow control; instead they | Previous versions of NFS do not provide flow control; instead they | |||
rely on the windowing provided by transports like TCP to throttle | rely on the windowing provided by transports like TCP to throttle | |||
requests. This does not work with RDMA, which provides no operation | requests. This does not work with RDMA, which provides no operation | |||
flow control and will terminate a connection in error when limits are | flow control and will terminate a connection in error when limits are | |||
exceeded. Limits such as maximum number of requests outstanding are | exceeded. Limits such as maximum number of requests outstanding are | |||
therefore negotiated when a session is created (see the | therefore negotiated when a session is created (see the | |||
ca_maxrequests field in Section 18.36). These limits then provide | ca_maxrequests field in Section 18.36). These limits then provide | |||
the maxima which each connection associated with the session's | the maxima which each connection associated with the session's | |||
channel(s) must remain within. RDMA connections are managed within | channel(s) must remain within. RDMA connections are managed within | |||
skipping to change at page 62, line 42 | skipping to change at page 65, line 47 | |||
associated with the replier's channel does exceed the channel's | associated with the replier's channel does exceed the channel's | |||
maximum number of outstanding requests. | maximum number of outstanding requests. | |||
The limits may also be modified dynamically at the replier's choosing | The limits may also be modified dynamically at the replier's choosing | |||
by manipulating certain parameters present in each NFSv4.1 reply. In | by manipulating certain parameters present in each NFSv4.1 reply. In | |||
addition, the CB_RECALL_SLOT callback operation (see Section 20.8) | addition, the CB_RECALL_SLOT callback operation (see Section 20.8) | |||
can be sent by a server to a client to return RDMA credits to the | can be sent by a server to a client to return RDMA credits to the | |||
server, thereby lowering the maximum number of requests a client can | server, thereby lowering the maximum number of requests a client can | |||
have outstanding to the server. | have outstanding to the server. | |||
2.10.6.3. Padding | 2.10.7.3. Padding | |||
Header padding is requested by each peer at session initiation (see | Header padding is requested by each peer at session initiation (see | |||
the ca_headerpadsize argument to CREATE_SESSION in Section 18.36), | the ca_headerpadsize argument to CREATE_SESSION in Section 18.36), | |||
and subsequently used by the RPC RDMA layer, as described in [8]. | and subsequently used by the RPC RDMA layer, as described in [8]. | |||
Zero padding is permitted. | Zero padding is permitted. | |||
Padding leverages the useful property that RDMA preserves alignment of | Padding leverages the useful property that RDMA preserves alignment of | |||
data, even when the data is placed into anonymous (untagged) buffers. | data, even when the data is placed into anonymous (untagged) buffers. | |||
If requested, client inline writes will insert appropriate pad bytes | If requested, client inline writes will insert appropriate pad bytes | |||
within the request header to align the data payload on the specified | within the request header to align the data payload on the specified | |||
boundary. The client is encouraged to add sufficient padding (up to | boundary. The client is encouraged to add sufficient padding (up to | |||
the negotiated size) so that the "data" field of the NFSv4.1 WRITE | the negotiated size) so that the "data" field of the NFSv4.1 WRITE | |||
operation is aligned. Most servers can make good use of such | operation is aligned. Most servers can make good use of such | |||
padding, which allows them to chain receive buffers in such a way | padding, which allows them to chain receive buffers in such a way | |||
skipping to change at page 63, line 47 | skipping to change at page 67, line 5 | |||
In the above case, the server may recycle unused buffers to the next | In the above case, the server may recycle unused buffers to the next | |||
posted receive if unused by the actual received request, or may pass | posted receive if unused by the actual received request, or may pass | |||
the now-complete buffers by reference for normal write processing. | the now-complete buffers by reference for normal write processing. | |||
For a server which can make use of it, this removes any need for data | For a server which can make use of it, this removes any need for data | |||
copies of incoming data, without resorting to complicated end-to-end | copies of incoming data, without resorting to complicated end-to-end | |||
buffer advertisement and management. This includes most kernel-based | buffer advertisement and management. This includes most kernel-based | |||
and integrated server designs, among many others. The client may | and integrated server designs, among many others. The client may | |||
perform similar optimizations, if desired. | perform similar optimizations, if desired. | |||
2.10.6.4. Dual RDMA and Non-RDMA Transports | 2.10.7.4. Dual RDMA and Non-RDMA Transports | |||
Some RDMA transports (for example, [10]) permit a "streaming" (non- | Some RDMA transports (for example, [10]) permit a "streaming" (non- | |||
RDMA) phase, where ordinary traffic might flow before "stepping up" | RDMA) phase, where ordinary traffic might flow before "stepping up" | |||
to RDMA mode, commencing RDMA traffic. Some RDMA transports start | to RDMA mode, commencing RDMA traffic. Some RDMA transports start | |||
connections always in RDMA mode. NFSv4.1 allows, but does not | connections always in RDMA mode. NFSv4.1 allows, but does not | |||
assume, a streaming phase before RDMA mode. When a connection is | assume, a streaming phase before RDMA mode. When a connection is | |||
associated with a session, the client and server negotiate whether | associated with a session, the client and server negotiate whether | |||
the connection is used in RDMA or non-RDMA mode (see Section 18.36 | the connection is used in RDMA or non-RDMA mode (see Section 18.36 | |||
and Section 18.34). | and Section 18.34). | |||
2.10.7. Sessions Security | 2.10.8. Sessions Security | |||
2.10.7.1. Session Callback Security | 2.10.8.1. Session Callback Security | |||
Via session / connection association, NFSv4.1 improves security over | Via session / connection association, NFSv4.1 improves security over | |||
that provided by NFSv4.0 for the backchannel. The connection is | that provided by NFSv4.0 for the backchannel. The connection is | |||
client-initiated (see Section 18.34), and subject to the same | client-initiated (see Section 18.34), and subject to the same | |||
firewall and routing checks as the fore channel. The connection | firewall and routing checks as the fore channel. The connection | |||
cannot be hijacked by an attacker who connects to the client port | cannot be hijacked by an attacker who connects to the client port | |||
prior to the intended server as is possible with NFSv4.0. At the | prior to the intended server as is possible with NFSv4.0. At the | |||
client's option (see Section 18.35), connection association is fully | client's option (see Section 18.35), connection association is fully | |||
authenticated before being activated (see Section 18.34). Traffic | authenticated before being activated (see Section 18.34). Traffic | |||
from the server over the backchannel is authenticated exactly as the | from the server over the backchannel is authenticated exactly as the | |||
client specifies (see Section 2.10.7.2). | client specifies (see Section 2.10.8.2). | |||
2.10.7.2. Backchannel RPC Security | 2.10.8.2. Backchannel RPC Security | |||
When the NFSv4.1 client establishes the backchannel, it informs the | When the NFSv4.1 client establishes the backchannel, it informs the | |||
server of the security flavors and principals to use when sending | server of the security flavors and principals to use when sending | |||
requests. If the security flavor is RPCSEC_GSS, the client expresses | requests. If the security flavor is RPCSEC_GSS, the client expresses | |||
the principal in the form of an established RPCSEC_GSS context. The | the principal in the form of an established RPCSEC_GSS context. The | |||
server is free to use any of the flavor/principal combinations the | server is free to use any of the flavor/principal combinations the | |||
client offers, but it MUST NOT use unoffered combinations. This way, | client offers, but it MUST NOT use unoffered combinations. This way, | |||
the client need not provide a target GSS principal for the | the client need not provide a target GSS principal for the | |||
backchannel as it did with NFSv4.0, nor does the server have to implement | backchannel as it did with NFSv4.0, nor does the server have to implement | |||
an RPCSEC_GSS initiator as it did with NFSv4.0 [20]. | an RPCSEC_GSS initiator as it did with NFSv4.0 [20]. | |||
The CREATE_SESSION (Section 18.36) and BACKCHANNEL_CTL | The CREATE_SESSION (Section 18.36) and BACKCHANNEL_CTL | |||
(Section 18.33) operations allow the client to specify flavor/ | (Section 18.33) operations allow the client to specify flavor/ | |||
principal combinations. | principal combinations. | |||
Also note that the SP4_SSV state protection mode (see Section 18.35 | Also note that the SP4_SSV state protection mode (see Section 18.35 | |||
and Section 2.10.7.3) has the side benefit of providing SSV-derived | and Section 2.10.8.3) has the side benefit of providing SSV-derived | |||
RPCSEC_GSS contexts (Section 2.10.8). | RPCSEC_GSS contexts (Section 2.10.9). | |||
2.10.7.3. Protection from Unauthorized State Changes | 2.10.8.3. Protection from Unauthorized State Changes | |||
As described to this point in the specification, the state model of | As described to this point in the specification, the state model of | |||
NFSv4.1 is vulnerable to an attacker that sends a SEQUENCE operation | NFSv4.1 is vulnerable to an attacker that sends a SEQUENCE operation | |||
with a forged session ID and with a slot ID that it expects the | with a forged session ID and with a slot ID that it expects the | |||
legitimate client to use next. When the legitimate client uses the | legitimate client to use next. When the legitimate client uses the | |||
slot ID with the same sequence number, the server returns the | slot ID with the same sequence number, the server returns the | |||
attacker's result from the reply cache which disrupts the legitimate | attacker's result from the reply cache which disrupts the legitimate | |||
client and thus denies service to it. Similarly an attacker could | client and thus denies service to it. Similarly an attacker could | |||
send a CREATE_SESSION with a forged client ID to create a new session | send a CREATE_SESSION with a forged client ID to create a new session | |||
associated with the client ID. The attacker could send requests | associated with the client ID. The attacker could send requests | |||
skipping to change at page 66, line 37 | skipping to change at page 69, line 44 | |||
3. The physical client has multiple users, but the client | 3. The physical client has multiple users, but the client | |||
implementation has a unique client ID for each user. This is | implementation has a unique client ID for each user. This is | |||
effectively the same as the second scenario, but a disadvantage | effectively the same as the second scenario, but a disadvantage | |||
is that each user must be allocated at least one session each, so | is that each user must be allocated at least one session each, so | |||
the approach suffers from lack of economy. | the approach suffers from lack of economy. | |||
The SP4_SSV protection option uses a Secret State Verifier (SSV) | The SP4_SSV protection option uses a Secret State Verifier (SSV) | |||
which is shared between a client and server. The SSV serves as the | which is shared between a client and server. The SSV serves as the | |||
secret key for an internal (that is, internal to NFSv4.1) GSS | secret key for an internal (that is, internal to NFSv4.1) GSS | |||
mechanism that uses the secret key for Message Integrity Code (MIC) | mechanism that uses the secret key for Message Integrity Code (MIC) | |||
and Wrap tokens (Section 2.10.8). The SP4_SSV protection option is | and Wrap tokens (Section 2.10.9). The SP4_SSV protection option is | |||
intended for the client that has multiple users, and the system | intended for the client that has multiple users, and the system | |||
administrator does not wish to configure a permanent machine | administrator does not wish to configure a permanent machine | |||
credential for each client. The SSV is established on the server via | credential for each client. The SSV is established on the server via | |||
SET_SSV (see Section 18.47). To prevent eavesdropping, a client | SET_SSV (see Section 18.47). To prevent eavesdropping, a client | |||
SHOULD send SET_SSV via RPCSEC_GSS with the privacy service. Several | SHOULD send SET_SSV via RPCSEC_GSS with the privacy service. Several | |||
aspects of the SSV make it intractable for an attacker to guess the | aspects of the SSV make it intractable for an attacker to guess the | |||
SSV, and thus associate rogue connections with a session, and rogue | SSV, and thus associate rogue connections with a session, and rogue | |||
sessions with a client ID: | sessions with a client ID: | |||
o The arguments to and results of SET_SSV include digests of the old | o The arguments to and results of SET_SSV include digests of the old | |||
and new SSV, respectively. | and new SSV, respectively. | |||
o Because the initial value of the SSV is zero, and therefore known, the | o Because the initial value of the SSV is zero, and therefore known, the | |||
client that opts for SP4_SSV protection and opts to apply SP4_SSV | client that opts for SP4_SSV protection and opts to apply SP4_SSV | |||
protection to BIND_CONN_TO_SESSION and CREATE_SESSION MUST send at | protection to BIND_CONN_TO_SESSION and CREATE_SESSION MUST send at | |||
least one SET_SSV operation before the first BIND_CONN_TO_SESSION | least one SET_SSV operation before the first BIND_CONN_TO_SESSION | |||
operation or before the second CREATE_SESSION operation on a | operation or before the second CREATE_SESSION operation on a | |||
client ID. If it does not, the SSV mechanism will not generate | client ID. If it does not, the SSV mechanism will not generate | |||
tokens (Section 2.10.8). A client SHOULD send SET_SSV as soon as | tokens (Section 2.10.9). A client SHOULD send SET_SSV as soon as | |||
a session is created. | a session is created. | |||
o A SET_SSV does not replace the SSV with the argument to SET_SSV. | o A SET_SSV does not replace the SSV with the argument to SET_SSV. | |||
Instead, the current SSV on the server is logically exclusive ORed | Instead, the current SSV on the server is logically exclusive ORed | |||
(XORed) with the argument to SET_SSV. Each time a new principal | (XORed) with the argument to SET_SSV. Each time a new principal | |||
uses a client ID for the first time, the client SHOULD send a | uses a client ID for the first time, the client SHOULD send a | |||
SET_SSV with that principal's RPCSEC_GSS credentials, with | SET_SSV with that principal's RPCSEC_GSS credentials, with | |||
RPCSEC_GSS service set to RPC_GSS_SVC_PRIVACY. | RPCSEC_GSS service set to RPC_GSS_SVC_PRIVACY. | |||
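The XOR update described in the last item can be sketched as follows. This is only an illustration: the function name apply_set_ssv is hypothetical, the 4-byte length is chosen for readability rather than taken from the specification, and the digests carried in the SET_SSV arguments and results are not modeled.

```python
def apply_set_ssv(current_ssv: bytes, argument: bytes) -> bytes:
    """Return the new SSV: the current SSV XORed with the SET_SSV argument."""
    if len(current_ssv) != len(argument):
        raise ValueError("SSV and SET_SSV argument must have the same length")
    return bytes(a ^ b for a, b in zip(current_ssv, argument))


ssv = bytes(4)                                      # initial SSV is all zero
ssv = apply_set_ssv(ssv, bytes.fromhex("01020304")) # first principal
ssv = apply_set_ssv(ssv, bytes.fromhex("ff00ff00")) # second principal
print(ssv.hex())
```

Because each SET_SSV is folded into the current value instead of replacing it, an observer who learns a single argument does not thereby learn the resulting SSV once another principal has contributed.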
Here are the types of attacks that can be attempted by an attacker | Here are the types of attacks that can be attempted by an attacker | |||
skipping to change at page 69, line 27 | skipping to change at page 72, line 33 | |||
is to prevent connection hijacking, the use of IPsec is RECOMMENDED. | is to prevent connection hijacking, the use of IPsec is RECOMMENDED. | |||
If a connection hijack occurs, the hijacker could in theory change | If a connection hijack occurs, the hijacker could in theory change | |||
locking state and negatively impact the service to legitimate | locking state and negatively impact the service to legitimate | |||
clients. However if the server is configured to require the use of | clients. However if the server is configured to require the use of | |||
RPCSEC_GSS with integrity or privacy on the affected file objects, | RPCSEC_GSS with integrity or privacy on the affected file objects, | |||
and if the EXCHGID4_FLAG_BIND_PRINC_STATEID capability (Section 18.35) | and if the EXCHGID4_FLAG_BIND_PRINC_STATEID capability (Section 18.35) | |||
is in force, this will thwart unauthorized attempts to change locking | is in force, this will thwart unauthorized attempts to change locking | |||
state. | state. | |||
2.10.8. The SSV GSS Mechanism | 2.10.9. The SSV GSS Mechanism | |||
The SSV provides the secret key for a mechanism that NFSv4.1 uses for | The SSV provides the secret key for a mechanism that NFSv4.1 uses for | |||
state protection. Contexts for this mechanism are not established | state protection. Contexts for this mechanism are not established | |||
via the RPCSEC_GSS protocol. Instead, the contexts are automatically | via the RPCSEC_GSS protocol. Instead, the contexts are automatically | |||
created when EXCHANGE_ID specifies SP4_SSV protection. The only | created when EXCHANGE_ID specifies SP4_SSV protection. The only | |||
tokens defined are the PerMsgToken (emitted by GSS_GetMIC) and the | tokens defined are the PerMsgToken (emitted by GSS_GetMIC) and the | |||
SealedMessage token (emitted by GSS_Wrap). | SealedMessage token (emitted by GSS_Wrap). | |||
The mechanism OID for the SSV mechanism is: | The mechanism OID for the SSV mechanism is: | |||
iso.org.dod.internet.private.enterprise.Michael Eisler.nfs.ssv_mech | iso.org.dod.internet.private.enterprise.Michael Eisler.nfs.ssv_mech | |||
skipping to change at page 73, line 36 | skipping to change at page 76, line 43 | |||
The client MUST establish an SSV via SET_SSV before the SSV GSS | The client MUST establish an SSV via SET_SSV before the SSV GSS | |||
context can be used to emit tokens from GSS_Wrap() and GSS_GetMIC(). | context can be used to emit tokens from GSS_Wrap() and GSS_GetMIC(). | |||
If SET_SSV has not been successfully called, attempts to emit tokens | If SET_SSV has not been successfully called, attempts to emit tokens | |||
MUST fail. | MUST fail. | |||
The SSV mechanism does not support replay detection and sequencing in | The SSV mechanism does not support replay detection and sequencing in | |||
its tokens because RPCSEC_GSS does not use those features (See | its tokens because RPCSEC_GSS does not use those features (See | |||
Section 5.2.2 "Context Creation Requests" in [4]). | Section 5.2.2 "Context Creation Requests" in [4]). | |||
2.10.9. Session Mechanics - Steady State | 2.10.10. Session Mechanics - Steady State | |||
2.10.9.1. Obligations of the Server | 2.10.10.1. Obligations of the Server | |||
The server has the primary obligation to monitor the state of | The server has the primary obligation to monitor the state of | |||
backchannel resources that the client has created for the server | backchannel resources that the client has created for the server | |||
(RPCSEC_GSS contexts and backchannel connections). If these | (RPCSEC_GSS contexts and backchannel connections). If these | |||
resources vanish, the server takes action as specified in | resources vanish, the server takes action as specified in | |||
Section 2.10.11.2. | Section 2.10.12.2. | |||
2.10.9.2. Obligations of the Client | 2.10.10.2. Obligations of the Client | |||
The client SHOULD honor the following obligations in order to utilize | The client SHOULD honor the following obligations in order to utilize | |||
the session: | the session: | |||
o Keep a necessary session from going idle on the server. A client | o Keep a necessary session from going idle on the server. A client | |||
that requires a session, but nonetheless is not sending operations | that requires a session, but nonetheless is not sending operations | |||
risks having the session be destroyed by the server. This is | risks having the session be destroyed by the server. This is | |||
because sessions consume resources, and resource limitations may | because sessions consume resources, and resource limitations may | |||
force the server to cull an inactive session. A server MAY | force the server to cull an inactive session. A server MAY | |||
consider a session to be inactive if the client has not used the | consider a session to be inactive if the client has not used the | |||
session before the session inactivity timer (Section 2.10.10) has | session before the session inactivity timer (Section 2.10.11) has | |||
expired. | expired. | |||
o Destroy the session when not needed. If a client has multiple | o Destroy the session when not needed. If a client has multiple | |||
sessions, one of which has no requests waiting for replies, and | sessions, one of which has no requests waiting for replies, and | |||
has been idle for some period of time, it SHOULD destroy the | has been idle for some period of time, it SHOULD destroy the | |||
session. | session. | |||
o Maintain GSS contexts for the backchannel. If the client requires | o Maintain GSS contexts for the backchannel. If the client requires | |||
the server to use the RPCSEC_GSS security flavor for callbacks, | the server to use the RPCSEC_GSS security flavor for callbacks, | |||
then it needs to be sure the contexts handed to the server via | then it needs to be sure the contexts handed to the server via | |||
skipping to change at page 74, line 35 | skipping to change at page 77, line 40 | |||
backchannel in order to gracefully recall recallable state, or | backchannel in order to gracefully recall recallable state, or | |||
notify the client of certain events. Note that if the connection | notify the client of certain events. Note that if the connection | |||
is not being used for the fore channel, there is no way for the | is not being used for the fore channel, there is no way for the | |||
client to tell if the connection is still alive (e.g., the server | client to tell if the connection is still alive (e.g., the server | |||
restarted without sending a disconnect). The onus is on the | restarted without sending a disconnect). The onus is on the | |||
server, not the client, to determine if the backchannel's | server, not the client, to determine if the backchannel's | |||
connection is alive, and to indicate in the response to a SEQUENCE | connection is alive, and to indicate in the response to a SEQUENCE | |||
operation when the last connection associated with a session's | operation when the last connection associated with a session's | |||
backchannel has disconnected. | backchannel has disconnected. | |||
2.10.9.3. Steps the Client Takes To Establish a Session | 2.10.10.3. Steps the Client Takes To Establish a Session | |||
If the client does not have a client ID, the client sends EXCHANGE_ID | If the client does not have a client ID, the client sends EXCHANGE_ID | |||
to establish a client ID. If it opts for SP4_MACH_CRED or SP4_SSV | to establish a client ID. If it opts for SP4_MACH_CRED or SP4_SSV | |||
protection, in the spo_must_enforce list of operations, it SHOULD at | protection, in the spo_must_enforce list of operations, it SHOULD at | |||
minimum specify: CREATE_SESSION, DESTROY_SESSION, | minimum specify: CREATE_SESSION, DESTROY_SESSION, | |||
BIND_CONN_TO_SESSION, BACKCHANNEL_CTL, and DESTROY_CLIENTID. If it opts | BIND_CONN_TO_SESSION, BACKCHANNEL_CTL, and DESTROY_CLIENTID. If it opts | |||
for SP4_SSV protection, the client needs to ask for SSV-based | for SP4_SSV protection, the client needs to ask for SSV-based | |||
RPCSEC_GSS handles. | RPCSEC_GSS handles. | |||
The client uses the client ID to send a CREATE_SESSION on a | The client uses the client ID to send a CREATE_SESSION on a | |||
skipping to change at page 75, line 28 | skipping to change at page 78, line 33 | |||
If the client wants to use additional connections for the | If the client wants to use additional connections for the | |||
backchannel, then it must call BIND_CONN_TO_SESSION on each | backchannel, then it must call BIND_CONN_TO_SESSION on each | |||
connection it wants to use with the session. If the client wants to | connection it wants to use with the session. If the client wants to | |||
use additional connections for the fore channel, then it must call | use additional connections for the fore channel, then it must call | |||
BIND_CONN_TO_SESSION if it specified SP4_SSV or SP4_MACH_CRED state | BIND_CONN_TO_SESSION if it specified SP4_SSV or SP4_MACH_CRED state | |||
protection when the client ID was created. | protection when the client ID was created. | |||
At this point the session has reached steady state. | At this point the session has reached steady state. | |||
2.10.10. Session Inactivity Timer | 2.10.11. Session Inactivity Timer | |||
The server MAY maintain a session inactivity timer for each session. | The server MAY maintain a session inactivity timer for each session. | |||
If the session inactivity timer expires, then the server MAY destroy | If the session inactivity timer expires, then the server MAY destroy | |||
the session. To avoid losing a session due to inactivity, the client | the session. To avoid losing a session due to inactivity, the client | |||
MUST renew the session inactivity timer. The length of the session | MUST renew the session inactivity timer. The length of the session | |||
inactivity timer MUST NOT be less than the lease_time attribute | inactivity timer MUST NOT be less than the lease_time attribute | |||
(Section 5.8.1.11). As with lease renewal (Section 8.3), when the | (Section 5.8.1.11). As with lease renewal (Section 8.3), when the | |||
server receives a SEQUENCE operation, it resets the session | server receives a SEQUENCE operation, it resets the session | |||
inactivity timer, and MUST NOT allow the timer to expire while the | inactivity timer, and MUST NOT allow the timer to expire while the | |||
rest of the operations in the COMPOUND procedure's request are still | rest of the operations in the COMPOUND procedure's request are still | |||
executing. Once the last operation has finished, the server MUST set | executing. Once the last operation has finished, the server MUST set | |||
the session inactivity timer to expire no sooner than the sum of the | the session inactivity timer to expire no sooner than the sum of the | |||
current time and the value of the lease_time attribute. | current time and the value of the lease_time attribute. | |||
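One way a server might track the timer rules above is sketched below. The class and method names are hypothetical, the clock is injected so the behavior can be checked deterministically, and the sketch assumes the timer length equals lease_time (the minimum the text allows).

```python
import time


class SessionInactivityTimer:
    """Hypothetical per-session timer following the rules in the text."""

    def __init__(self, lease_time, clock=time.monotonic):
        self.lease_time = lease_time
        self.clock = clock
        self.in_progress = 0  # COMPOUNDs still executing on this session
        self.expires_at = clock() + lease_time

    def on_sequence_received(self):
        # A SEQUENCE arrived: the timer must not expire while the rest of
        # the COMPOUND is still executing.
        self.in_progress += 1

    def on_compound_finished(self):
        # Last operation done: expire no sooner than now + lease_time.
        self.in_progress -= 1
        self.expires_at = self.clock() + self.lease_time

    def expired(self):
        return self.in_progress == 0 and self.clock() >= self.expires_at
```

Holding off expiry while any COMPOUND is in progress keeps a long-running request from losing its session mid-execution.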
2.10.11. Session Mechanics - Recovery | 2.10.12. Session Mechanics - Recovery | |||
2.10.11.1. Events Requiring Client Action | 2.10.12.1. Events Requiring Client Action | |||
The following events require client action to recover. | The following events require client action to recover. | |||
2.10.11.1.1. RPCSEC_GSS Context Loss by Callback Path | 2.10.12.1.1. RPCSEC_GSS Context Loss by Callback Path | |||
If all RPCSEC_GSS contexts granted by the client to the server for | If all RPCSEC_GSS contexts granted by the client to the server for | |||
callback use have expired, the client MUST establish a new context | callback use have expired, the client MUST establish a new context | |||
via BACKCHANNEL_CTL. The sr_status_flags field of the SEQUENCE | via BACKCHANNEL_CTL. The sr_status_flags field of the SEQUENCE | |||
results indicates when callback contexts are nearly expired, or fully | results indicates when callback contexts are nearly expired, or fully | |||
expired (see Section 18.46.3). | expired (see Section 18.46.3). | |||
2.10.11.1.2. Connection Loss | 2.10.12.1.2. Connection Loss | |||
If the client loses the last connection of the session, and if it wants | If the client loses the last connection of the session, and if it wants | |||
to retain the session, then it must create a new connection, and if, | to retain the session, then it must create a new connection, and if, | |||
when the client ID was created, BIND_CONN_TO_SESSION was specified in | when the client ID was created, BIND_CONN_TO_SESSION was specified in | |||
the spo_must_enforce list, the client MUST use BIND_CONN_TO_SESSION | the spo_must_enforce list, the client MUST use BIND_CONN_TO_SESSION | |||
to associate the connection with the session. | to associate the connection with the session. | |||
If there was a request outstanding at the time of the connection | If there was a request outstanding at the time of the connection | |||
loss, then if the client wants to continue to use the session, it MUST | loss, then if the client wants to continue to use the session, it MUST | |||
retry the request, as described in Section 2.10.5.2. Note that it is | retry the request, as described in Section 2.10.6.2. Note that it is | |||
not necessary to retry requests over a connection with the same | not necessary to retry requests over a connection with the same | |||
source network address or the same destination network address as the | source network address or the same destination network address as the | |||
lost connection. As long as the session ID, slot ID, and sequence ID | lost connection. As long as the session ID, slot ID, and sequence ID | |||
in the retry match that of the original request, the server will | in the retry match that of the original request, the server will | |||
recognize the request as a retry if it executed the request prior to | recognize the request as a retry if it executed the request prior to | |||
disconnect. | disconnect. | |||
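The retry recognition described above depends only on the session ID, slot ID, and sequence ID, not on which connection carried the request. A simplified sketch follows; the ReplyCache class is hypothetical and omits the other sequence ID checks a real replier performs, such as rejecting misordered values.

```python
class ReplyCache:
    """Hypothetical replier-side cache: one remembered reply per slot."""

    def __init__(self):
        self._slots = {}  # (session_id, slot_id) -> (sequence_id, reply)

    def handle(self, session_id, slot_id, sequence_id, execute):
        key = (session_id, slot_id)
        cached = self._slots.get(key)
        if cached is not None and cached[0] == sequence_id:
            # Same session ID, slot ID, and sequence ID: treat as a retry
            # and return the saved reply without executing again.
            return cached[1]
        reply = execute()
        self._slots[key] = (sequence_id, reply)
        return reply
```

Nothing in the lookup key refers to a network address, which is why a retry sent over a different connection is still recognized.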
If the connection that was lost was the last one associated with the | If the connection that was lost was the last one associated with the | |||
backchannel, and the client wants to retain the backchannel and/or | backchannel, and the client wants to retain the backchannel and/or | |||
not put recallable state subject to revocation, the client must | not put recallable state subject to revocation, the client must | |||
reconnect, and if it does, it MUST associate the connection to the | reconnect, and if it does, it MUST associate the connection to the | |||
session and backchannel via BIND_CONN_TO_SESSION. The server SHOULD | session and backchannel via BIND_CONN_TO_SESSION. The server SHOULD | |||
indicate when it has no callback connection via the sr_status_flags | indicate when it has no callback connection via the sr_status_flags | |||
result from SEQUENCE. | result from SEQUENCE. | |||
2.10.11.1.3. Backchannel GSS Context Loss | 2.10.12.1.3. Backchannel GSS Context Loss | |||
Via the sr_status_flags result of the SEQUENCE operation or other | Via the sr_status_flags result of the SEQUENCE operation or other | |||
means, the client will learn if some or all of the RPCSEC_GSS | means, the client will learn if some or all of the RPCSEC_GSS | |||
contexts it assigned to the backchannel have been lost. If the | contexts it assigned to the backchannel have been lost. If the | |||
client wants to retain the backchannel and/or not put recallable | client wants to retain the backchannel and/or not put recallable | |||
state subject to revocation, the client must use BACKCHANNEL_CTL | state subject to revocation, the client must use BACKCHANNEL_CTL | |||
to assign new contexts. | to assign new contexts. | |||
2.10.11.1.4. Loss of Session | 2.10.12.1.4. Loss of Session | |||
The replier might lose a record of the session. Causes include: | The replier might lose a record of the session. Causes include: | |||
o Replier failure and restart | o Replier failure and restart | |||
o A catastrophe that causes the reply cache to be corrupted or lost | o A catastrophe that causes the reply cache to be corrupted or lost | |||
on the media it was stored on. This applies even if the replier | on the media it was stored on. This applies even if the replier | |||
indicated in the CREATE_SESSION results that it would persist the | indicated in the CREATE_SESSION results that it would persist the | |||
cache. | cache. | |||
o The server purges the session of a client that has been inactive | o The server purges the session of a client that has been inactive | |||
for a very extended period of time. | for a very extended period of time. | |||
o As a result of configuration changes among a set of clustered | ||||
servers, a network address previously connected to one server | ||||
becomes connected to a different server which has no knowledge of | ||||
the session in question. Such a configuration change will | ||||
generally only happen when the original server ceases to function | ||||
for a time. | ||||
Loss of reply cache is equivalent to loss of session. The replier | Loss of reply cache is equivalent to loss of session. The replier | |||
indicates loss of session to the requester by returning | indicates loss of session to the requester by returning | |||
NFS4ERR_BADSESSION on the next operation that uses the session ID | NFS4ERR_BADSESSION on the next operation that uses the session ID | |||
that refers to the lost session. | that refers to the lost session. | |||
After an event like a server restart, the client may have lost its | After an event like a server restart, the client may have lost its | |||
connections. The client assumes for the moment that the session has | connections. The client assumes for the moment that the session has | |||
not been lost. It reconnects, and if it specified connection | not been lost. It reconnects, and if it specified connection | |||
association enforcement when the session was created, it invokes | association enforcement when the session was created, it invokes | |||
BIND_CONN_TO_SESSION using the session ID. Otherwise, it invokes | BIND_CONN_TO_SESSION using the session ID. Otherwise, it invokes | |||
SEQUENCE. If BIND_CONN_TO_SESSION or SEQUENCE returns | SEQUENCE. If BIND_CONN_TO_SESSION or SEQUENCE returns | |||
NFS4ERR_BADSESSION, the client knows the session was lost. If the | NFS4ERR_BADSESSION, the client knows the session is not available to | |||
connection survives session loss, then the next SEQUENCE operation | it when communicating with that network address. If the connection | |||
the client sends over the connection will get back | survives session loss, then the next SEQUENCE operation the client | |||
NFS4ERR_BADSESSION. The client again knows the session was lost. | sends over the connection will get back NFS4ERR_BADSESSION. The | |||
client again knows the session was lost. | ||||
Here is one suggested algorithm for the client when it gets | ||||
NFS4ERR_BADSESSION. It is not obligatory in that, if a client does | ||||
not want to take advantage of such features as trunking, it may omit | ||||
parts of it. However, it is a useful example which draws attention | ||||
to various possible recovery issues: | ||||
1. If the client has other connections to other server network | ||||
addresses associated with the same session, attempt a COMPOUND | ||||
with a single operation, SEQUENCE, on each of the other | ||||
connections. | ||||
2. If the attempts succeed, the session is still alive, and this is | ||||
a strong indicator the server's network address has moved. The | ||||
client might send an EXCHANGE_ID on the connection that returned | ||||
NFS4ERR_BADSESSION to see if there are opportunities for client | ||||
ID trunking (i.e. the same client ID and so_major are returned). | ||||
The client might use DNS to see if the moved network address was | ||||
replaced with another, so that the performance and availability | ||||
benefits of session trunking can continue. | ||||
3. If the SEQUENCE requests fail with NFS4ERR_BADSESSION then the | ||||
session no longer exists on any of the server network addresses | ||||
the client has connections associated with that session ID. It | ||||
is possible the session is still alive and available on other | ||||
network addresses. The client sends an EXCHANGE_ID on all the | ||||
connections to see if the server owner is still listening on | ||||
those network addresses. If the same server owner is returned, | ||||
but a new client ID is returned, this is a strong indicator of a | ||||
server restart. If both the same server owner and same client ID | ||||
are returned, then this is a strong indication that the server | ||||
did delete the session, and the client will need to send a | ||||
CREATE_SESSION if it has no other sessions for that client ID. | ||||
If a different server owner is returned, the client can use DNS | ||||
to find other network addresses. If it does not, or if DNS does | ||||
not find any other addresses for the server, then the client will | ||||
be unable to provide NFSv4.1 service, and fatal errors should be | ||||
returned to processes that were using the server. If the client | ||||
is using a "mount" paradigm, unmounting the server is advised. | ||||
4. If the client knows of no other connections associated with the | ||||
session ID, and server network addresses that are, or have been | ||||
associated with the session ID, then the client can use DNS to | ||||
find other network addresses. If it does not, or if DNS does not | ||||
find any other addresses for the server, then the client will be | ||||
unable to provide NFSv4.1 service, and fatal errors should be | ||||
returned to processes that were using the server. If the client | ||||
is using a "mount" paradigm, unmounting the server is advised. | ||||
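The EXCHANGE_ID comparison in step 3 of the suggested algorithm can be sketched as a small decision function. The names below are hypothetical, and the sketch covers only the three outcomes the text lists; it does not perform any network or DNS activity itself.

```python
from enum import Enum


class Diagnosis(Enum):
    SERVER_RESTART = "same server owner, new client ID"
    SESSION_DELETED = "same server owner, same client ID"
    DIFFERENT_SERVER = "different server owner"


def diagnose_exchange_id(old_owner, old_client_id, new_owner, new_client_id):
    """Classify an EXCHANGE_ID result seen after NFS4ERR_BADSESSION."""
    if new_owner != old_owner:
        # The client may use DNS to look for other network addresses.
        return Diagnosis.DIFFERENT_SERVER
    if new_client_id != old_client_id:
        # Strong indicator of a server restart.
        return Diagnosis.SERVER_RESTART
    # The server likely deleted the session; CREATE_SESSION is needed
    # if the client has no other sessions for that client ID.
    return Diagnosis.SESSION_DELETED
```

A client that does not use trunking could skip steps 1 and 2 of the algorithm and apply only this comparison.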
If there is a reconfiguration event which results in the same network | ||||
address being assigned to servers where the server_scope value is different, | ||||
it cannot be guaranteed that a session ID generated by the first will | ||||
be recognized as invalid by the second. Therefore, in managing server | ||||
reconfigurations among servers with different server scope values, it | ||||
is necessary to make sure that all clients have disconnected from the | ||||
first server before effecting the reconfiguration. Nonetheless, | ||||
clients should not assume that this requirement will always be | ||||
adhered to in effecting server reconfigurations to deal with | ||||
unexpected events. Even where a session ID is inappropriately | ||||
recognized as valid, it is likely that either the connection will not | ||||
be recognized as valid, or that a sequence value for a slot will not | ||||
be correct. Therefore, when a client receives results indicating | ||||
such unexpected errors, the use of EXCHANGE_ID to determine the | ||||
current server configuration and present the client to the server is | ||||
recommended. | ||||
A variation on the above is that after a server's network address | ||||
moves, there is no NFSv4.1 server listening. E.g. no listener on | ||||
port 2049, the NFSv4 server returns NFS4ERR_MINOR_VERS_MISMATCH, the | ||||
NFS server returns a PROG_MISMATCH error, the RPC listener on | ||||
2049 returns PROG_MISMATCH, or attempts to re-connect to the network | ||||
address time out. These should be treated as equivalent to SEQUENCE | ||||
returning NFS4ERR_BADSESSION for these purposes. | ||||
When the client detects session loss, it must call CREATE_SESSION to | When the client detects session loss, it must call CREATE_SESSION to | |||
recover. Any non-idempotent operations that were in progress may | recover. Any non-idempotent operations that were in progress may | |||
have been performed on the server at the time of session loss. The | have been performed on the server at the time of session loss. The | |||
client has no general way to recover from this. | client has no general way to recover from this. | |||
Note that loss of session does not imply loss of lock, open, | Note that loss of session does not imply loss of lock, open, | |||
delegation, or layout state because locks, opens, delegations, and | delegation, or layout state because locks, opens, delegations, and | |||
layouts are tied to the client ID and depend on the client ID, not | layouts are tied to the client ID and depend on the client ID, not | |||
the session. Nor does loss of lock, open, delegation, or layout | the session. Nor does loss of lock, open, delegation, or layout | |||
skipping to change at page 78, line 5 | skipping to change at page 82, line 38 | |||
client ID; loss of client ID however does imply loss of session, | client ID; loss of client ID however does imply loss of session, | |||
lock, open, delegation, and layout state. See Section 8.4.2. A | lock, open, delegation, and layout state. See Section 8.4.2. A | |||
session can survive a server restart, but lock recovery may still be | session can survive a server restart, but lock recovery may still be | |||
needed. | needed. | |||
It is possible CREATE_SESSION will fail with NFS4ERR_STALE_CLIENTID | It is possible CREATE_SESSION will fail with NFS4ERR_STALE_CLIENTID | |||
(for example the server restarts and does not preserve client ID | (for example the server restarts and does not preserve client ID | |||
state). If so, the client needs to call EXCHANGE_ID, followed by | state). If so, the client needs to call EXCHANGE_ID, followed by | |||
CREATE_SESSION. | CREATE_SESSION. | |||
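The fallback just described can be sketched as below. This is an illustration only: StaleClientId, recover_session, and the two callables are hypothetical stand-ins for whatever a client uses to send EXCHANGE_ID and CREATE_SESSION.

```python
class StaleClientId(Exception):
    """Hypothetical stand-in for an NFS4ERR_STALE_CLIENTID result."""


def recover_session(client_id, exchange_id, create_session):
    """Return (client_id, session) after a detected session loss.

    exchange_id() returns a client ID; create_session(client_id) returns
    a session or raises StaleClientId. Both are assumed callables.
    """
    try:
        return client_id, create_session(client_id)
    except StaleClientId:
        # The server no longer knows the client ID: obtain a new one with
        # EXCHANGE_ID, then retry CREATE_SESSION.
        new_client_id = exchange_id()
        return new_client_id, create_session(new_client_id)
```

When the second branch is taken, state tied to the old client ID is gone as well, so lock recovery may follow.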
2.10.11.2. Events Requiring Server Action | 2.10.12.2. Events Requiring Server Action | |||
The following events require server action to recover. | The following events require server action to recover. | |||
2.10.11.2.1. Client Crash and Restart | 2.10.12.2.1. Client Crash and Restart | |||
As described in Section 18.35, a restarted client sends EXCHANGE_ID | As described in Section 18.35, a restarted client sends EXCHANGE_ID | |||
in such a way it causes the server to delete any sessions it had. | in such a way it causes the server to delete any sessions it had. | |||
2.10.11.2.2. Client Crash with No Restart | 2.10.12.2.2. Client Crash with No Restart | |||
If a client crashes and never comes back, it will never send | If a client crashes and never comes back, it will never send | |||
EXCHANGE_ID with its old client owner. Thus the server has session | EXCHANGE_ID with its old client owner. Thus the server has session | |||
state that will never be used again. After an extended period of | state that will never be used again. After an extended period of | |||
time and if the server has resource constraints, it MAY destroy the | time and if the server has resource constraints, it MAY destroy the | |||
old session as well as locking state. | old session as well as locking state. | |||
2.10.11.2.3. Extended Network Partition | 2.10.12.2.3. Extended Network Partition | |||
To the server, the extended network partition may be no different | To the server, the extended network partition may be no different | |||
from a client crash with no restart (see Section 2.10.11.2.2). | from a client crash with no restart (see Section 2.10.12.2.2). | |||
Unless the server can discern that there is a network partition, it | Unless the server can discern that there is a network partition, it | |||
is free to treat the situation as if the client has crashed | is free to treat the situation as if the client has crashed | |||
permanently. | permanently. | |||
2.10.11.2.4. Backchannel Connection Loss | 2.10.12.2.4. Backchannel Connection Loss | |||
If there were callback requests outstanding at the time of a | If there were callback requests outstanding at the time of a | |||
connection loss, then the server MUST retry the request, as described | connection loss, then the server MUST retry the request, as described | |||
in Section 2.10.5.2. Note that it is not necessary to retry requests | in Section 2.10.6.2. Note that it is not necessary to retry requests | |||
over a connection with the same source network address or the same | over a connection with the same source network address or the same | |||
destination network address as the lost connection. As long as the | destination network address as the lost connection. As long as the | |||
session ID, slot ID, and sequence ID in the retry match that of the | session ID, slot ID, and sequence ID in the retry match that of the | |||
original request, the callback target will recognize the request as a | original request, the callback target will recognize the request as a | |||
retry even if it did see the request prior to disconnect. | retry even if it did see the request prior to disconnect. | |||
If the connection lost is the last one associated with the | If the connection lost is the last one associated with the | |||
backchannel, then the server MUST indicate that in the | backchannel, then the server MUST indicate that in the | |||
sr_status_flags field of every SEQUENCE reply until the backchannel | sr_status_flags field of every SEQUENCE reply until the backchannel | |||
is reestablished. There are two situations each of which use | is reestablished. There are two situations each of which use | |||
different status flags: no connectivity for the session's | different status flags: no connectivity for the session's | |||
backchannel, and no connectivity for any session backchannel of the | backchannel, and no connectivity for any session backchannel of the | |||
client. See Section 18.46 for a description of the appropriate flags | client. See Section 18.46 for a description of the appropriate flags | |||
in sr_status_flags. | in sr_status_flags. | |||
2.10.11.2.5. GSS Context Loss | 2.10.12.2.5. GSS Context Loss | |||
The server SHOULD monitor when the number of RPCSEC_GSS contexts | The server SHOULD monitor when the number of RPCSEC_GSS contexts | |||
assigned to the backchannel reaches one, and when that one context is | assigned to the backchannel reaches one, and when that one context is | |||
near expiry (i.e. between one and two periods of lease time), | near expiry (i.e. between one and two periods of lease time), | |||
indicate so in the sr_status_flags field of all SEQUENCE replies. | indicate so in the sr_status_flags field of all SEQUENCE replies. | |||
The server MUST indicate when all of the backchannel's assigned | The server MUST indicate when all of the backchannel's assigned | |||
RPCSEC_GSS contexts have expired in the sr_status_flags field of all | RPCSEC_GSS contexts have expired in the sr_status_flags field of all | |||
SEQUENCE replies. | SEQUENCE replies. | |||
2.10.12. Parallel NFS and Sessions | 2.10.13. Parallel NFS and Sessions | |||
A client and server can potentially be a non-pNFS implementation, a | A client and server can potentially be a non-pNFS implementation, a | |||
metadata server implementation, a data server implementation, or two | metadata server implementation, a data server implementation, or two | |||
or three types of implementations. The EXCHGID4_FLAG_USE_NON_PNFS, | or three types of implementations. The EXCHGID4_FLAG_USE_NON_PNFS, | |||
EXCHGID4_FLAG_USE_PNFS_MDS, and EXCHGID4_FLAG_USE_PNFS_DS flags (not | EXCHGID4_FLAG_USE_PNFS_MDS, and EXCHGID4_FLAG_USE_PNFS_DS flags (not | |||
mutually exclusive) are passed in the EXCHANGE_ID arguments and | mutually exclusive) are passed in the EXCHANGE_ID arguments and | |||
results to allow the client to indicate how it wants to use sessions | results to allow the client to indicate how it wants to use sessions | |||
created under the client ID, and to allow the server to indicate how | created under the client ID, and to allow the server to indicate how | |||
it will allow the sessions to be used. See Section 13.1 for pNFS | it will allow the sessions to be used. See Section 13.1 for pNFS | |||
sessions considerations. | sessions considerations. | |||
skipping to change at page 84, line 48 | skipping to change at page 89, line 33 | |||
3.3.9. netaddr4 | 3.3.9. netaddr4 | |||
struct netaddr4 { | struct netaddr4 { | |||
/* see struct rpcb in RFC 1833 */ | /* see struct rpcb in RFC 1833 */ | |||
string na_r_netid<>; /* network id */ | string na_r_netid<>; /* network id */ | |||
string na_r_addr<>; /* universal address */ | string na_r_addr<>; /* universal address */ | |||
}; | }; | |||
The netaddr4 data type is used to identify network transport | The netaddr4 data type is used to identify network transport | |||
endpoints. The r_netid and r_addr fields respectively contain a | endpoints. The r_netid and r_addr fields respectively contain a | |||
netid and uaddr. The netid and uaddr concepts are defined in in | netid and uaddr. The netid and uaddr concepts are defined in [13]. | |||
[13]. The netid and uaddr formats for TCP over IPv4 and TCP over | The netid and uaddr formats for TCP over IPv4 and TCP over IPv6 are | |||
IPv6 are defined in [13], specifically Tables 2 and 3 and Sections | defined in [13], specifically Tables 2 and 3 and Sections 3.2.3.3 and | |||
3.2.3.3 and 3.2.3.4. | 3.2.3.4. | |||
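As an illustration of the uaddr field, the sketch below splits a TCP over IPv4 universal address into host and port. It assumes the dotted form h1.h2.h3.h4.p1.p2, where p1 and p2 are the high and low octets of the port; the function name is hypothetical and [13] remains the authority on the format.

```python
def parse_tcp_ipv4_uaddr(uaddr: str):
    """Split an assumed 'h1.h2.h3.h4.p1.p2' uaddr into (host, port)."""
    parts = uaddr.split(".")
    if len(parts) != 6:
        raise ValueError("expected six dot-separated decimal fields")
    octets = [int(p) for p in parts]
    if any(not 0 <= o <= 255 for o in octets):
        raise ValueError("each field must be in the range 0..255")
    host = ".".join(str(o) for o in octets[:4])
    # The port is carried as two octets, high-order first.
    port = octets[4] * 256 + octets[5]
    return host, port
```

For example, the uaddr "192.0.2.1.8.1" corresponds to host 192.0.2.1 and port 2049, since 8 * 256 + 1 = 2049.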
3.3.10. state_owner4 | 3.3.10. state_owner4 | |||
struct state_owner4 { | struct state_owner4 { | |||
clientid4 clientid; | clientid4 clientid; | |||
opaque owner<NFS4_OPAQUE_LIMIT>; | opaque owner<NFS4_OPAQUE_LIMIT>; | |||
}; | }; | |||
typedef state_owner4 open_owner4; | typedef state_owner4 open_owner4; | |||
typedef state_owner4 lock_owner4; | typedef state_owner4 lock_owner4; | |||
skipping to change at page 160, line 31 | skipping to change at page 165, line 31 | |||
careful, transport retransmission delays can result in the client | careful, transport retransmission delays can result in the client | |||
failing to detect a server restart before the grace period ends. | failing to detect a server restart before the grace period ends. | |||
The scenario is that the client is using a transport with | The scenario is that the client is using a transport with | |||
exponential back off, such that the maximum retransmission timeout | exponential back off, such that the maximum retransmission timeout | |||
exceeds both the grace period and the lease_time attribute. A | exceeds both the grace period and the lease_time attribute. A | |||
network partition causes the client's connection's retransmission | network partition causes the client's connection's retransmission | |||
interval to back off, and even after the partition heals, the next | interval to back off, and even after the partition heals, the next | |||
transport-level retransmission is sent after the server has | transport-level retransmission is sent after the server has | |||
restarted and its grace period ends. | restarted and its grace period ends. | |||
The client MUST either recover from the ensuing NFS4ERR_NOGRACE | The client MUST either recover from the ensuing NFS4ERR_NO_GRACE | |||
errors, or it MUST ensure that despite transport level | errors, or it MUST ensure that despite transport level | |||
retransmission intervals that exceed the lease_time, nonetheless a | retransmission intervals that exceed the lease_time, nonetheless a | |||
SEQUENCE operation is sent that renews the lease before | SEQUENCE operation is sent that renews the lease before | |||
expiration. The client can achieve this by associating a new | expiration. The client can achieve this by associating a new | |||
connection with the session, and sending a SEQUENCE operation on | connection with the session, and sending a SEQUENCE operation on | |||
it. However, if the attempt to establish a new connection is | it. However, if the attempt to establish a new connection is | |||
delayed for some reason (e.g. exponential backoff of the | delayed for some reason (e.g. exponential backoff of the | |||
connection establishment packets), the client will have to abort | connection establishment packets), the client will have to abort | |||
the connection establishment attempt before the lease expires, and | the connection establishment attempt before the lease expires, and | |||
attempt to re-connect. | attempt to re-connect. | |||
skipping to change at page 162, line 12 | skipping to change at page 167, line 12 | |||
within the client or network buffers must wait until the client has | within the client or network buffers must wait until the client has | |||
successfully recovered the locks protecting the READ and WRITE | successfully recovered the locks protecting the READ and WRITE | |||
operations. Any that reach the server before the server can safely | operations. Any that reach the server before the server can safely | |||
determine that the client has recovered enough locking state to be | determine that the client has recovered enough locking state to be | |||
sure that such operations can be safely processed must be rejected. | sure that such operations can be safely processed must be rejected. | |||
This will happen because either: | This will happen because either: | |||
o The state presented is no longer valid since it is associated with | o The state presented is no longer valid since it is associated with | |||
a now invalid client ID. In this case the client will receive | a now invalid client ID. In this case the client will receive | |||
either an NFS4ERR_BADSESSION or NFS4ERR_DEADSESSION error, and any | either an NFS4ERR_BADSESSION or NFS4ERR_DEADSESSION error, and any | |||
attempt to attach a new session to the existing client ID will | attempt to attach a new session to that invalid client ID will | |||
result in an NFS4ERR_STALE_CLIENTID error. | result in an NFS4ERR_STALE_CLIENTID error. | |||
o Subsequent recovery of locks may make execution of the operation | o Subsequent recovery of locks may make execution of the operation | |||
inappropriate (NFS4ERR_GRACE). | inappropriate (NFS4ERR_GRACE). | |||
8.4.1. Client Failure and Recovery | 8.4.1. Client Failure and Recovery | |||
In the event that a client fails, the server may release the client's | In the event that a client fails, the server may release the client's | |||
locks when the associated lease has expired. Conflicting locks from | locks when the associated lease has expired. Conflicting locks from | |||
another client may only be granted after this lease expiration. As | another client may only be granted after this lease expiration. As | |||
skipping to change at page 166, line 39 | skipping to change at page 171, line 39 | |||
A server may, upon restart, establish a new value for the lease | A server may, upon restart, establish a new value for the lease | |||
period. Therefore, clients should, once a new client ID is | period. Therefore, clients should, once a new client ID is | |||
established, refetch the lease_time attribute and use it as the basis | established, refetch the lease_time attribute and use it as the basis | |||
for lease renewal for the lease associated with that server. | for lease renewal for the lease associated with that server. | |||
However, the server must establish, for this restart event, a grace | However, the server must establish, for this restart event, a grace | |||
period at least as long as the lease period for the previous server | period at least as long as the lease period for the previous server | |||
instantiation. This allows the client state obtained during the | instantiation. This allows the client state obtained during the | |||
previous server instance to be reliably re-established. | previous server instance to be reliably re-established. | |||
The possibility exists that, because of server configuration events, | ||||
the client will be communicating with a server different than the one | ||||
on which the locks were obtained, as shown by the combination of | ||||
eir_server_scope and eir_server_owner. This raises the question of | ||||
whether and when the client should attempt to reclaim locks previously | ||||
obtained on what is being reported as a different server. The rules | ||||
to resolve this question are as follows: | ||||
o If the server scope is different the client should not attempt to | ||||
reclaim locks. In this situation no lock reclaim is possible. | ||||
Any attempt to re-obtain the locks with non-reclaim operations is | ||||
problematic since there is no guarantee that the existing | ||||
filehandles will be recognized by the new server, or that if | ||||
recognized, they denote the same objects. It is best to treat the | ||||
locks as having been revoked by the reconfiguration event. | ||||
o If the server scope is the same, the client should attempt to | ||||
reclaim locks, even if the eir_server_owner value is different. | ||||
In this situation, it is the responsibility of the server to | ||||
return NFS4ERR_NO_GRACE if it cannot provide correct support for | ||||
lock reclaim operations, including the prevention of edge | ||||
conditions. | ||||
The eir_server_owner field is not used in making this determination. | ||||
Its function is to specify trunking possibilities for the client (see | ||||
Section 2.10.5) and not to control lock reclaim. | ||||
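The two rules above reduce to a comparison of server scope values alone. The following is a minimal client-side sketch of that decision; the ServerIdentity type and its field names are hypothetical stand-ins for the values a client would retain from the EXCHANGE_ID results before and after the reconfiguration event.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerIdentity:
    # Hypothetical record of values taken from EXCHANGE_ID results.
    server_scope: bytes   # from eir_server_scope
    major_id: bytes       # from eir_server_owner (so_major_id)


def should_attempt_reclaim(previous, current):
    """Return True if the client should try to reclaim its locks.

    Only the server scope is compared. The server owner is deliberately
    ignored, since it describes trunking possibilities rather than
    controlling lock reclaim.
    """
    return previous.server_scope == current.server_scope


if __name__ == "__main__":
    old = ServerIdentity(b"scope-A", b"owner-1")
    moved = ServerIdentity(b"scope-A", b"owner-2")
    other = ServerIdentity(b"scope-B", b"owner-1")
    print(should_attempt_reclaim(old, moved))
    print(should_attempt_reclaim(old, other))
```

When the function returns False, the client would treat its locks as revoked rather than trying to re-obtain them with non-reclaim operations. When it returns True, the client still has to be prepared for the server to answer a reclaim with NFS4ERR_NO_GRACE.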
8.4.3. Network Partitions and Recovery | 8.4.3. Network Partitions and Recovery | |||
If the duration of a network partition is greater than the lease | If the duration of a network partition is greater than the lease | |||
period provided by the server, the server will not have received a | period provided by the server, the server will not have received a | |||
lease renewal from the client. If this occurs, the server may free | lease renewal from the client. If this occurs, the server may free | |||
all locks held for the client, or it may allow the lock state to | all locks held for the client, or it may allow the lock state to | |||
remain for a considerable period, subject to the constraint that if a | remain for a considerable period, subject to the constraint that if a | |||
request for a conflicting lock is made, locks associated with an | request for a conflicting lock is made, locks associated with an | |||
expired lease do not prevent such a conflicting lock from being | expired lease do not prevent such a conflicting lock from being | |||
granted but MUST be revoked as necessary so as not to interfere with | granted but MUST be revoked as necessary so as not to interfere with | |||
skipping to change at page 167, line 38 | skipping to change at page 173, line 17 | |||
In addition, all I/O submitted by the client with the now invalid | In addition, all I/O submitted by the client with the now invalid | |||
stateids will fail with the server returning the error | stateids will fail with the server returning the error | |||
NFS4ERR_EXPIRED. Once the client learns of the loss of locking | NFS4ERR_EXPIRED. Once the client learns of the loss of locking | |||
state, it will suitably notify the applications that held the | state, it will suitably notify the applications that held the | |||
invalidated locks. The client should then take action to free | invalidated locks. The client should then take action to free | |||
invalidated stateids, either by establishing a new client ID using a | invalidated stateids, either by establishing a new client ID using a | |||
new verifier or by doing a FREE_STATEID operation to release each of | new verifier or by doing a FREE_STATEID operation to release each of | |||
the invalidated stateids. | the invalidated stateids. | |||
When the server adopts a finer-grained approach to revocation of | When the server adopts a finer-grained approach to revocation of | |||
locks when lease have expired, only a subset of stateids will | locks when leases have expired, only a subset of stateids will | |||
normally become invalid during a network partition. When the client | normally become invalid during a network partition. When the client | |||
can communicate with the server after such a network partition heals, | can communicate with the server after such a network partition heals, | |||
the status returned by the SEQUENCE operation will indicate a partial | the status returned by the SEQUENCE operation will indicate a partial | |||
loss of locking state (SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED). In | loss of locking state (SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED). In | |||
addition, operations, including I/O submitted by the client, with the | addition, operations, including I/O submitted by the client, with the | |||
now invalid stateids will fail with the server returning the error | now invalid stateids will fail with the server returning the error | |||
NFS4ERR_EXPIRED. Once the client learns of the loss of locking | NFS4ERR_EXPIRED. Once the client learns of the loss of locking | |||
state, it will use the TEST_STATEID operation on all of its stateids | state, it will use the TEST_STATEID operation on all of its stateids | |||
to determine which locks have been lost and then suitably notify the | to determine which locks have been lost and then suitably notify the | |||
applications that held the invalidated locks. The client can then | applications that held the invalidated locks. The client can then | |||
skipping to change at page 225, line 32 | skipping to change at page 231, line 14 | |||
11.5. Location Entries and Server Identity | 11.5. Location Entries and Server Identity | |||
As mentioned above, a single location entry may have a server address | As mentioned above, a single location entry may have a server address | |||
target in the form of a DNS name which may represent multiple IP | target in the form of a DNS name which may represent multiple IP | |||
addresses, while multiple location entries may have their own server | addresses, while multiple location entries may have their own server | |||
address targets that reference the same server. Whether two IP | address targets that reference the same server. Whether two IP | |||
addresses designate the same server is indicated by the existence of | addresses designate the same server is indicated by the existence of | |||
a common so_major_id field within the eir_server_owner field returned | a common so_major_id field within the eir_server_owner field returned | |||
by EXCHANGE_ID (see Section 18.35.3), subject to further | by EXCHANGE_ID (see Section 18.35.3), subject to further | |||
verification, for details of which see Section 2.10.4. | verification, for details of which see Section 2.10.5. | |||
When multiple addresses for the same server exist, the client may | When multiple addresses for the same server exist, the client may | |||
assume that for each file system in the namespace of a given server | assume that for each file system in the namespace of a given server | |||
network address, there exist file systems at corresponding namespace | network address, there exist file systems at corresponding namespace | |||
locations for each of the other server network addresses. It may do | locations for each of the other server network addresses. It may do | |||
this even in the absence of explicit listing in fs_locations and | this even in the absence of explicit listing in fs_locations and | |||
fs_locations_info. Such corresponding file system locations can be | fs_locations_info. Such corresponding file system locations can be | |||
used as alternate locations, just as those explicitly specified via | used as alternate locations, just as those explicitly specified via | |||
the fs_locations and fs_locations_info attributes. Where these | the fs_locations and fs_locations_info attributes. Where these | |||
specific addresses are explicitly designated in the fs_locations_info | specific addresses are explicitly designated in the fs_locations_info | |||
skipping to change at page 229, line 43 | skipping to change at page 235, line 26 | |||
When the conditions in Section 11.7.2 hold, in either of the | When the conditions in Section 11.7.2 hold, in either of the | |||
following two cases, the client may use the two file system instances | following two cases, the client may use the two file system instances | |||
simultaneously. | simultaneously. | |||
o The fs_locations_info attribute does not contain separate | o The fs_locations_info attribute does not contain separate | |||
per-network-address entries for file system instances at the distinct | per-network-address entries for file system instances at the distinct | |||
network addresses. This includes the case in which the | network addresses. This includes the case in which the | |||
fs_locations_info attribute is unavailable. In this case, the | fs_locations_info attribute is unavailable. In this case, the | |||
fact that the two server addresses connect to the same server (as | fact that the two server addresses connect to the same server (as | |||
indicated by the two addresses sharing the same so_major_id | indicated by the two addresses sharing the same so_major_id | |||
value and subsequently confirmed as described in Section 2.10.4) | value and subsequently confirmed as described in Section 2.10.5) | |||
justifies simultaneous use and there is no fs_locations_info | justifies simultaneous use and there is no fs_locations_info | |||
attribute information contradicting that. | attribute information contradicting that. | |||
o The fs_locations_info attribute indicates that two file system | o The fs_locations_info attribute indicates that two file system | |||
instances belong to the same _simultaneous-use_ class. | instances belong to the same _simultaneous-use_ class. | |||
In this case, the client may use both file system instances | In this case, the client may use both file system instances | |||
simultaneously, as representations of the same file system, whether | simultaneously, as representations of the same file system, whether | |||
that happens because the two network addresses connect to the same | that happens because the two network addresses connect to the same | |||
physical server or because different servers connect to clustered | physical server or because different servers connect to clustered | |||
skipping to change at page 234, line 26 | skipping to change at page 240, line 10 | |||
which they have not. Cooperation by two servers in state management | which they have not. Cooperation by two servers in state management | |||
requires coordination of client IDs. Before the client attempts to | requires coordination of client IDs. Before the client attempts to | |||
use a client ID associated with one server in a request to the server | use a client ID associated with one server in a request to the server | |||
of the other file system, it must eliminate the possibility that two | of the other file system, it must eliminate the possibility that two | |||
non-cooperating servers have assigned the same client ID by accident. | non-cooperating servers have assigned the same client ID by accident. | |||
The client needs to compare the eir_server_scope values returned by | The client needs to compare the eir_server_scope values returned by | |||
each server. If the scope values do not match, then the servers have | each server. If the scope values do not match, then the servers have | |||
not cooperated in state management. If the scope values match, then | not cooperated in state management. If the scope values match, then | |||
this indicates the servers have cooperated in assigning client IDs to | this indicates the servers have cooperated in assigning client IDs to | |||
the point that they will reject client IDs that refer to state they | the point that they will reject client IDs that refer to state they | |||
do not know about. | do not know about. See Section 2.10.4 for more information about the | |||
use of server scope. | ||||
In the case of migration, the servers involved in the migration of a | In the case of migration, the servers involved in the migration of a | |||
file system SHOULD transfer all server state from the original to the | file system SHOULD transfer all server state from the original to the | |||
new server. When this is done, it must be done in a way that is | new server. When this is done, it must be done in a way that is | |||
transparent to the client. With replication, such a degree of common | transparent to the client. With replication, such a degree of common | |||
state is typically not the case. Clients, however, should use the | state is typically not the case. Clients, however, should use the | |||
information provided by the eir_server_scope returned by EXCHANGE_ID | information provided by the eir_server_scope returned by EXCHANGE_ID | |||
to determine whether such sharing may be in effect, rather than | (as modified by the validation procedures described in | |||
making assumptions based on the reason for the transition. | Section 2.10.4) to determine whether such sharing may be in effect, | |||
rather than making assumptions based on the reason for the | ||||
transition. | ||||
This state transfer will reduce disruption to the client when a file | This state transfer will reduce disruption to the client when a file | |||
system transition occurs. If the servers are successful in | system transition occurs. If the servers are successful in | |||
transferring all state, the client can attempt to establish sessions | transferring all state, the client can attempt to establish sessions | |||
associated with the client ID used for the source file system | associated with the client ID used for the source file system | |||
instance. If the server accepts that as a valid client ID, then the | instance. If the server accepts that as a valid client ID, then the | |||
client may use the existing stateids associated with that client ID | client may use the existing stateids associated with that client ID | |||
for the old file system instance in connection with that same client | for the old file system instance in connection with that same client | |||
ID in connection with the transitioned file system instance. | ID in connection with the transitioned file system instance. If the | |||
client in question already had a client ID on the target system, it | ||||
may interrogate the state ID values from the source system under that | ||||
new client ID, with the assurance that if they are accepted as valid, | ||||
then they represent validly transferred lock state for the source | ||||
file system, transferred to the target server. | ||||
When the two servers belong to the same server scope, it does not | When the two servers belong to the same server scope, it does not | |||
mean that when dealing with the transition, the client will not have | mean that when dealing with the transition, the client will not have | |||
to reclaim state. However it does mean that the client may proceed | to reclaim state. However it does mean that the client may proceed | |||
using its current client ID when establishing communication with the | using its current client ID when establishing communication with the | |||
new server and the new server will either recognize the client ID as | new server and the new server will either recognize the client ID as | |||
valid, or reject it, in which case locks must be reclaimed by the | valid, or reject it, in which case locks must be reclaimed by the | |||
client. | client. | |||
File systems co-operating in state management may actually share | File systems co-operating in state management may actually share | |||
skipping to change at page 235, line 18 | skipping to change at page 241, line 10 | |||
reject as stale) each other's stateids and client IDs. Servers which | reject as stale) each other's stateids and client IDs. Servers which | |||
do share state may not do so under all conditions or at all times. | do share state may not do so under all conditions or at all times. | |||
The requirement for the server is that if it cannot be sure in | The requirement for the server is that if it cannot be sure in | |||
accepting a client ID that it reflects the locks the client was | accepting a client ID that it reflects the locks the client was | |||
given, it must treat all associated state as stale and report it as | given, it must treat all associated state as stale and report it as | |||
such to the client. | such to the client. | |||
When the two file system instances are on servers that do not share a | When the two file system instances are on servers that do not share a | |||
server scope value, the client must establish a new client ID on the | server scope value, the client must establish a new client ID on the | |||
destination, if it does not have one already, and reclaim locks if | destination, if it does not have one already, and reclaim locks if | |||
possible. In this case, old stateids and client IDs should not be | allowed by the server. In this case, old stateids and client IDs | |||
presented to the new server since there is no assurance that they | should not be presented to the new server since there is no assurance | |||
will not conflict with IDs valid on that server. | that they will not conflict with IDs valid on that server. Note that | |||
in this case lock reclaim may be attempted even when the servers | ||||
involved in the transfer have different server scope values (see | ||||
Section 8.4.2.1 for the contrary case of reclaim after server reboot). | ||||
Servers with different server scope values may co-operate to allow | ||||
reclaim for locks associated with the transfer of a filesystem even | ||||
if they do not co-operate sufficiently to share a server scope. | ||||
In either case, when actual locks are not known to be maintained, the | In either case, when actual locks are not known to be maintained, the | |||
destination server may establish a grace period specific to the given | destination server may establish a grace period specific to the given | |||
file system, with non-reclaim locks being rejected for that file | file system, with non-reclaim locks being rejected for that file | |||
system, even though normal locks are being granted for other file | system, even though normal locks are being granted for other file | |||
systems. Clients should not infer the absence of a grace period for | systems. Clients should not infer the absence of a grace period for | |||
file systems being transitioned to a server from responses to | file systems being transitioned to a server from responses to | |||
requests for other file systems. | requests for other file systems. | |||
In the case of lock reclamation for a given file system after a file | In the case of lock reclamation for a given file system after a file | |||
skipping to change at page 282, line 41 | skipping to change at page 288, line 41 | |||
layout stateid. If the "seqid" is not one higher than what the | layout stateid. If the "seqid" is not one higher than what the | |||
client currently has recorded, and the client has at least one | client currently has recorded, and the client has at least one | |||
LAYOUTGET and/or LAYOUTRETURN operation outstanding, the client knows | LAYOUTGET and/or LAYOUTRETURN operation outstanding, the client knows | |||
the server sent the CB_LAYOUTRECALL after sending a response to an | the server sent the CB_LAYOUTRECALL after sending a response to an | |||
outstanding LAYOUTGET or LAYOUTRETURN. The client MUST wait before | outstanding LAYOUTGET or LAYOUTRETURN. The client MUST wait before | |||
processing such a CB_LAYOUTRECALL until it processes all replies for | processing such a CB_LAYOUTRECALL until it processes all replies for | |||
outstanding LAYOUTGET and LAYOUTRETURN operations for the | outstanding LAYOUTGET and LAYOUTRETURN operations for the | |||
corresponding file with seqid less than the seqid given by | corresponding file with seqid less than the seqid given by | |||
CB_LAYOUTRECALL (lor_stateid, see Section 20.3.) | CB_LAYOUTRECALL (lor_stateid, see Section 20.3.) | |||
In addition to the seqid-based mechanism, Section 2.10.5.3 describes | In addition to the seqid-based mechanism, Section 2.10.6.3 describes | |||
the sessions mechanism for allowing the client to detect callback | the sessions mechanism for allowing the client to detect callback | |||
race conditions and delay processing such a CB_LAYOUTRECALL. The | race conditions and delay processing such a CB_LAYOUTRECALL. The | |||
server MAY reference conflicting operations in the CB_SEQUENCE that | server MAY reference conflicting operations in the CB_SEQUENCE that | |||
precedes the CB_LAYOUTRECALL. Because the server has already sent | precedes the CB_LAYOUTRECALL. Because the server has already sent | |||
replies for these operations before issuing the callback, the replies | replies for these operations before issuing the callback, the replies | |||
may race with the CB_LAYOUTRECALL. The client MUST wait for all the | may race with the CB_LAYOUTRECALL. The client MUST wait for all the | |||
referenced calls to complete and update its view of the layout state | referenced calls to complete and update its view of the layout state | |||
before processing the CB_LAYOUTRECALL. | before processing the CB_LAYOUTRECALL. | |||
12.5.5.2.1.1. Get/Return Sequencing | 12.5.5.2.1.1. Get/Return Sequencing | |||
skipping to change at page 285, line 24 | skipping to change at page 291, line 24 | |||
12.5.5.2.1.4. Wraparound and Validation of Seqid | 12.5.5.2.1.4. Wraparound and Validation of Seqid | |||
The rules for layout stateid processing differ from other stateids in | The rules for layout stateid processing differ from other stateids in | |||
the protocol because the "seqid" value cannot be zero and the | the protocol because the "seqid" value cannot be zero and the | |||
stateid's "seqid" value changes in a CB_LAYOUTRECALL operation. The | stateid's "seqid" value changes in a CB_LAYOUTRECALL operation. The | |||
non-zero requirement combined with the inherent parallelism of layout | non-zero requirement combined with the inherent parallelism of layout | |||
operations means that a set of LAYOUTGET and LAYOUTRETURN operations | operations means that a set of LAYOUTGET and LAYOUTRETURN operations | |||
may contain the same value for "seqid". The server uses a slightly | may contain the same value for "seqid". The server uses a slightly | |||
modified version of the modulo arithmetic as described in | modified version of the modulo arithmetic as described in | |||
Section 2.10.5.1 when incrementing the layout stateid's "seqid". The | Section 2.10.6.1 when incrementing the layout stateid's "seqid". The | |||
modification to that modulo arithmetic description is to not use | modification to that modulo arithmetic description is to not use | |||
zero. The modulo arithmetic is also used for the comparisons of | zero. The modulo arithmetic is also used for the comparisons of | |||
"seqid" values in the processing of CB_LAYOUTRECALL events as | "seqid" values in the processing of CB_LAYOUTRECALL events as | |||
described above in Section 12.5.5.2.1.3. | described above in Section 12.5.5.2.1.3. | |||
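The "do not use zero" modification can be illustrated with a small sketch of the increment step. It assumes the layout stateid "seqid" is an unsigned 32-bit value; the function name is hypothetical, and the sketch covers only incrementing, not the range validation or comparison rules.

```python
SEQID_MAX = 0xFFFFFFFF  # assumption: "seqid" is an unsigned 32-bit integer


def next_layout_seqid(seqid):
    """Return the successor of a layout stateid seqid, skipping zero.

    Ordinary modulo-2**32 arithmetic would wrap SEQID_MAX to 0; because a
    layout stateid seqid cannot be zero, the wrap goes to 1 instead.
    """
    if not 1 <= seqid <= SEQID_MAX:
        raise ValueError("layout seqid must be in the range 1..2**32-1")
    return 1 if seqid == SEQID_MAX else seqid + 1


if __name__ == "__main__":
    print(next_layout_seqid(1))
    print(next_layout_seqid(SEQID_MAX))
```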
Just as the server validates the "seqid" in the event of | Just as the server validates the "seqid" in the event of | |||
CB_LAYOUTRECALL usage, as described in Section 12.5.5.2.1.3, the | CB_LAYOUTRECALL usage, as described in Section 12.5.5.2.1.3, the | |||
server also validates the "seqid" value to ensure that it is within | server also validates the "seqid" value to ensure that it is within | |||
an appropriate range. This range represents the degree of | an appropriate range. This range represents the degree of | |||
parallelism the server supports for layout stateids. If the client | parallelism the server supports for layout stateids. If the client | |||
skipping to change at page 290, line 27 | skipping to change at page 296, line 27 | |||
the lease expiration. First, for all modified but uncommitted data, | the lease expiration. First, for all modified but uncommitted data, | |||
write it to the metadata server using the FILE_SYNC4 flag for the | write it to the metadata server using the FILE_SYNC4 flag for the | |||
WRITEs or WRITE and COMMIT. Second, the client reestablishes a | WRITEs or WRITE and COMMIT. Second, the client reestablishes a | |||
client ID and session with the server and obtains new layouts and | client ID and session with the server and obtains new layouts and | |||
device ID to device address mappings for the modified data ranges and | device ID to device address mappings for the modified data ranges and | |||
then writes the data to the storage devices with the newly obtained | then writes the data to the storage devices with the newly obtained | |||
layouts. | layouts. | |||
If sr_status_flags from the metadata server has | If sr_status_flags from the metadata server has | |||
SEQ4_STATUS_RESTART_RECLAIM_NEEDED set (or SEQUENCE returns | SEQ4_STATUS_RESTART_RECLAIM_NEEDED set (or SEQUENCE returns | |||
NFS4ERR_STALE_CLIENTID, or SEQUENCE returns NFS4ERR_BAD_SESSION and | NFS4ERR_BAD_SESSION and CREATE_SESSION returns | |||
CREATE_SESSION returns NFS4ERR_STALE_CLIENTID) then the metadata | NFS4ERR_STALE_CLIENTID) then the metadata server has restarted, and | |||
server has restarted, and the client SHOULD recover using the methods | the client SHOULD recover using the methods described in | |||
described in Section 12.7.4. | Section 12.7.4. | |||
If sr_status_flags from the metadata server has | If sr_status_flags from the metadata server has | |||
SEQ4_STATUS_LEASE_MOVED set, then the client recovers by following | SEQ4_STATUS_LEASE_MOVED set, then the client recovers by following | |||
the procedure described in Section 11.7.7.1. After that, the client | the procedure described in Section 11.7.7.1. After that, the client | |||
may get an indication that the layout state was not moved with the | may get an indication that the layout state was not moved with the | |||
file system. The client recovers as in the other applicable | file system. The client recovers as in the other applicable | |||
situations discussed in Paragraph 1 or Paragraph 2 of this section. | situations discussed in Paragraph 1 or Paragraph 2 of this section. | |||
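The two sr_status_flags cases above can be sketched as a small dispatch routine. This is only an illustration, not part of either draft: the function name choose_recovery and the action labels are hypothetical, and the flag values are assumed to match the protocol's XDR description, so check them against your copy before relying on them.

```python
# Flag values assumed from the NFSv4.1 XDR description; verify before use.
SEQ4_STATUS_LEASE_MOVED = 0x00000080
SEQ4_STATUS_RESTART_RECLAIM_NEEDED = 0x00000100


def choose_recovery(sr_status_flags):
    """Return hypothetical recovery action labels for the flags that are set.

    Only the two conditions discussed in the text are modeled; an empty
    list means neither of them was reported.
    """
    actions = []
    if sr_status_flags & SEQ4_STATUS_RESTART_RECLAIM_NEEDED:
        # Metadata server restarted: recover as in Section 12.7.4.
        actions.append("reclaim_after_restart")
    if sr_status_flags & SEQ4_STATUS_LEASE_MOVED:
        # Lease moved: follow the procedure in Section 11.7.7.1.
        actions.append("handle_lease_moved")
    return actions
```

A client would call choose_recovery() on the flags returned by each SEQUENCE reply; the error-return paths (NFS4ERR_BAD_SESSION followed by NFS4ERR_STALE_CLIENTID) are not modeled here.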
If sr_status_flags reports no loss of state, then the lease for the | If sr_status_flags reports no loss of state, then the lease for the | |||
layouts the client has is valid and renewed, and the client can once | layouts the client has is valid and renewed, and the client can once | |||
skipping to change at page 298, line 22 | skipping to change at page 304, line 22 | |||
Another scenario is for the metadata server and the storage device to | Another scenario is for the metadata server and the storage device to | |||
be distinct from one client's point of view, and the roles reversed | be distinct from one client's point of view, and the roles reversed | |||
from another client's point of view. For example, in the cluster | from another client's point of view. For example, in the cluster | |||
file system model, a metadata server to one client may be a data | file system model, a metadata server to one client may be a data | |||
server to another client. If NFSv4.1 is being used as the storage | server to another client. If NFSv4.1 is being used as the storage | |||
protocol, then pNFS servers need to encode the values of filehandles | protocol, then pNFS servers need to encode the values of filehandles | |||
according to their specific roles. | according to their specific roles. | |||
13.1.1. Sessions Considerations for Data Servers | 13.1.1. Sessions Considerations for Data Servers | |||
Section 2.10.9.2 states that a client has to keep its lease renewed | Section 2.10.10.2 states that a client has to keep its lease renewed | |||
in order to prevent a session from being deleted by the server. If | in order to prevent a session from being deleted by the server. If | |||
the reply to EXCHANGE_ID has just the EXCHGID4_FLAG_USE_PNFS_DS role | the reply to EXCHANGE_ID has just the EXCHGID4_FLAG_USE_PNFS_DS role | |||
set, then as noted in Section 13.6 the client will not be able to | set, then as noted in Section 13.6 the client will not be able to | |||
determine the data server's lease_time attribute, because GETATTR | determine the data server's lease_time attribute, because GETATTR | |||
will not be permitted. Instead, the rule is that any time a client | will not be permitted. Instead, the rule is that any time a client | |||
receives a layout referring it to a data server that returns just the | receives a layout referring it to a data server that returns just the | |||
EXCHGID4_FLAG_USE_PNFS_DS role, the client MAY assume that the | EXCHGID4_FLAG_USE_PNFS_DS role, the client MAY assume that the | |||
lease_time attribute from the metadata server that returned the | lease_time attribute from the metadata server that returned the | |||
layout applies to the data server. Thus the data server MUST be | layout applies to the data server. Thus the data server MUST be | |||
aware of the values of all lease_time attributes of all metadata | aware of the values of all lease_time attributes of all metadata | |||
skipping to change at page 310, line 30 | skipping to change at page 316, line 30 | |||
data server 2. Unless data server 2 has two filehandles (each | data server 2. Unless data server 2 has two filehandles (each | |||
referring to a different data file), then, for example, a write to | referring to a different data file), then, for example, a write to | |||
logical stripe unit 1 overwrites the write to logical stripe unit 2, | logical stripe unit 1 overwrites the write to logical stripe unit 2, | |||
because both logical stripe units are located in the same stripe unit | because both logical stripe units are located in the same stripe unit | |||
(0) of data server 2. | (0) of data server 2. | |||
13.5. Data Server Multipathing | 13.5. Data Server Multipathing | |||
The NFSv4.1 file layout supports multipathing to multiple data server | The NFSv4.1 file layout supports multipathing to multiple data server | |||
addresses. Data server-level multipathing is used for bandwidth | addresses. Data server-level multipathing is used for bandwidth | |||
scaling via trunking (Section 2.10.4) and for higher availability of | scaling via trunking (Section 2.10.5) and for higher availability of | |||
use in the case of a data server failure. Multipathing allows the | use in the case of a data server failure. Multipathing allows the | |||
client to switch to another data server address which may be that of | client to switch to another data server address which may be that of | |||
another data server that is exporting the same data stripe unit, | another data server that is exporting the same data stripe unit, | |||
without having to contact the metadata server for a new layout. | without having to contact the metadata server for a new layout. | |||
To support data server multipathing, each element of the | To support data server multipathing, each element of the | |||
nflda_multipath_ds_list contains an array of one or more data server | nflda_multipath_ds_list contains an array of one or more data server | |||
network addresses. This array (data type multipath_list4) represents | network addresses. This array (data type multipath_list4) represents | |||
a list of data servers (each identified by a network address), with | a list of data servers (each identified by a network address), with | |||
it being possible that some data servers will appear in the list | it being possible that some data servers will appear in the list | |||
skipping to change at page 311, line 18 | skipping to change at page 317, line 18 | |||
the device ID to device address mappings to the available data | the device ID to device address mappings to the available data | |||
servers. If the device ID itself must be replaced, the MDS SHOULD | servers. If the device ID itself must be replaced, the MDS SHOULD | |||
recall all layouts with the device ID, and thus force the client to | recall all layouts with the device ID, and thus force the client to | |||
get new layouts and device ID mappings via LAYOUTGET and | get new layouts and device ID mappings via LAYOUTGET and | |||
GETDEVICEINFO. | GETDEVICEINFO. | |||
Generally if two network addresses appear in an element of | Generally if two network addresses appear in an element of | |||
nflda_multipath_ds_list they will designate the same data server and | nflda_multipath_ds_list they will designate the same data server and | |||
the two data server addresses will support the implementation of client | the two data server addresses will support the implementation of client | |||
ID or session trunking (the latter is RECOMMENDED) as defined in | ID or session trunking (the latter is RECOMMENDED) as defined in | |||
Section 2.10.4, and the two data server addresses will share the same | Section 2.10.5, and the two data server addresses will share the same | |||
server owner, or major ID of the server owner. It is not always | server owner, or major ID of the server owner. It is not always | |||
necessary for the two data server addresses to designate the same | necessary for the two data server addresses to designate the same | |||
server with trunking being used. For example the data could be read- | server with trunking being used. For example the data could be read- | |||
only, and the data consist of exact replicas. | only, and the data consist of exact replicas. | |||
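As a rough illustration of the multipathing behavior described in Section 13.5, the sketch below picks another address from one element of nflda_multipath_ds_list after a failure, without asking the metadata server for a new layout. The function name next_address, the plain-string address representation, and the example addresses are assumptions made for this sketch only.

```python
def next_address(multipath_list, failed):
    """Return the first address in multipath_list not known to have failed.

    multipath_list models one multipath_list4 element as a list of
    address strings; failed is a set of addresses the client has given
    up on. Returns None when every address has failed.
    """
    for addr in multipath_list:
        if addr not in failed:
            return addr
    return None


# Hypothetical addresses from the documentation range.
paths = ["192.0.2.1:2049", "192.0.2.2:2049"]
chosen = next_address(paths, failed={"192.0.2.1:2049"})
```

A result of None only signals that the list is exhausted; what the client does next is outside the scope of this sketch.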
13.6. Operations Sent to NFSv4.1 Data Servers | 13.6. Operations Sent to NFSv4.1 Data Servers | |||
Clients accessing data on an NFSv4.1 data server MUST send only the | Clients accessing data on an NFSv4.1 data server MUST send only the | |||
NULL procedure and COMPOUND procedures whose operations are taken | NULL procedure and COMPOUND procedures whose operations are taken | |||
only from two restricted subsets of the operations defined as valid | only from two restricted subsets of the operations defined as valid | |||
skipping to change at page 336, line 35 | skipping to change at page 342, line 35 | |||
due to administrative interaction, possibly while the lease is valid. | due to administrative interaction, possibly while the lease is valid. | |||
15.1.5.2. NFS4ERR_BAD_STATEID (Error Code 10026) | 15.1.5.2. NFS4ERR_BAD_STATEID (Error Code 10026) | |||
A stateid does not properly designate any valid state. See | A stateid does not properly designate any valid state. See | |||
Section 8.2.4 and Section 8.2.3 for a discussion of how stateids are | Section 8.2.4 and Section 8.2.3 for a discussion of how stateids are | |||
validated. | validated. | |||
15.1.5.3. NFS4ERR_DELEG_REVOKED (Error Code 10087) | 15.1.5.3. NFS4ERR_DELEG_REVOKED (Error Code 10087) | |||
A stateid designates recallable locking state of any type that has | A stateid designates recallable locking state of any type (delegation | |||
been revoked due to the failure of the client to return the lock, | or layout) that has been revoked due to the failure of the client to | |||
when it was recalled. | return the lock, when it was recalled. | |||
15.1.5.4. NFS4ERR_EXPIRED (Error Code 10011) | 15.1.5.4. NFS4ERR_EXPIRED (Error Code 10011) | |||
A stateid designates locking state of any type that has been revoked | A stateid designates locking state of any type that has been revoked | |||
due to expiration of the client's lease, either immediately upon | due to expiration of the client's lease, either immediately upon | |||
lease expiration, or following a later request for a conflicting | lease expiration, or following a later request for a conflicting | |||
lock. | lock. | |||
15.1.5.5. NFS4ERR_OLD_STATEID (Error Code 10024) | 15.1.5.5. NFS4ERR_OLD_STATEID (Error Code 10024) | |||
skipping to change at page 342, line 7 | skipping to change at page 348, line 7 | |||
server. | server. | |||
15.1.11. Session Use Errors | 15.1.11. Session Use Errors | |||
This section deals with errors encountered in using sessions, that | This section deals with errors encountered in using sessions, that | |||
is, in issuing requests over them using the Sequence (i.e. either | is, in issuing requests over them using the Sequence (i.e. either | |||
SEQUENCE or CB_SEQUENCE) operations. | SEQUENCE or CB_SEQUENCE) operations. | |||
15.1.11.1. NFS4ERR_BADSESSION (Error Code 10052) | 15.1.11.1. NFS4ERR_BADSESSION (Error Code 10052) | |||
A session ID was specified which does not exist. | A session ID that was specified is not known to the server to which | |||
the operation is addressed. | ||||
15.1.11.2. NFS4ERR_BADSLOT (Error Code 10053) | 15.1.11.2. NFS4ERR_BADSLOT (Error Code 10053) | |||
The requester sent a Sequence operation that attempted to use a slot | The requester sent a Sequence operation that attempted to use a slot | |||
the replier does not have in its slot table. It is possible the slot | the replier does not have in its slot table. It is possible the slot | |||
may have been retired. | may have been retired. | |||
15.1.11.3. NFS4ERR_BAD_HIGH_SLOT (Error Code 10077) | 15.1.11.3. NFS4ERR_BAD_HIGH_SLOT (Error Code 10077) | |||
The highest_slot argument in a Sequence operation exceeds the | The highest_slot argument in a Sequence operation exceeds the | |||
skipping to change at page 351, line 8 | skipping to change at page 357, line 8 | |||
| | NFS4ERR_TOO_MANY_OPS, | | | | NFS4ERR_TOO_MANY_OPS, | | |||
| | NFS4ERR_UNKNOWN_LAYOUTTYPE | | | | NFS4ERR_UNKNOWN_LAYOUTTYPE | | |||
| GETFH | NFS4ERR_FHEXPIRED, NFS4ERR_MOVED, | | | GETFH | NFS4ERR_FHEXPIRED, NFS4ERR_MOVED, | | |||
| | NFS4ERR_NOFILEHANDLE, | | | | NFS4ERR_NOFILEHANDLE, | | |||
| | NFS4ERR_OP_NOT_IN_SESSION, NFS4ERR_STALE | | | | NFS4ERR_OP_NOT_IN_SESSION, NFS4ERR_STALE | | |||
| ILLEGAL | NFS4ERR_BADXDR, NFS4ERR_OP_ILLEGAL | | | ILLEGAL | NFS4ERR_BADXDR, NFS4ERR_OP_ILLEGAL | | |||
| LAYOUTCOMMIT | NFS4ERR_ACCESS, NFS4ERR_ADMIN_REVOKED, | | | LAYOUTCOMMIT | NFS4ERR_ACCESS, NFS4ERR_ADMIN_REVOKED, | | |||
| | NFS4ERR_ATTRNOTSUPP, NFS4ERR_BADIOMODE, | | | | NFS4ERR_ATTRNOTSUPP, NFS4ERR_BADIOMODE, | | |||
| | NFS4ERR_BADLAYOUT, NFS4ERR_BADXDR, | | | | NFS4ERR_BADLAYOUT, NFS4ERR_BADXDR, | | |||
| | NFS4ERR_DEADSESSION, NFS4ERR_DELAY, | | | | NFS4ERR_DEADSESSION, NFS4ERR_DELAY, | | |||
| | NFS4ERR_EXPIRED, NFS4ERR_FBIG, | | | | NFS4ERR_DELEG_REVOKED, NFS4ERR_EXPIRED, | | |||
| | NFS4ERR_FHEXPIRED, NFS4ERR_GRACE, | | | | NFS4ERR_FBIG, NFS4ERR_FHEXPIRED, | | |||
| | NFS4ERR_INVAL, NFS4ERR_IO, NFS4ERR_ISDIR, | | | | NFS4ERR_GRACE, NFS4ERR_INVAL, NFS4ERR_IO, | | |||
| | NFS4ERR_MOVED, NFS4ERR_NOFILEHANDLE, | | | | NFS4ERR_ISDIR, NFS4ERR_MOVED, | | |||
| | NFS4ERR_NOTSUPP, NFS4ERR_NO_GRACE, | | | | NFS4ERR_NOFILEHANDLE, NFS4ERR_NOTSUPP, | | |||
| | NFS4ERR_NO_GRACE, | | ||||
| | NFS4ERR_OP_NOT_IN_SESSION, | | | | NFS4ERR_OP_NOT_IN_SESSION, | | |||
| | NFS4ERR_RECLAIM_BAD, | | | | NFS4ERR_RECLAIM_BAD, | | |||
| | NFS4ERR_RECLAIM_CONFLICT, | | | | NFS4ERR_RECLAIM_CONFLICT, | | |||
| | NFS4ERR_REP_TOO_BIG, | | | | NFS4ERR_REP_TOO_BIG, | | |||
| | NFS4ERR_REP_TOO_BIG_TO_CACHE, | | | | NFS4ERR_REP_TOO_BIG_TO_CACHE, | | |||
| | NFS4ERR_REQ_TOO_BIG, NFS4ERR_SERVERFAULT, | | | | NFS4ERR_REQ_TOO_BIG, NFS4ERR_SERVERFAULT, | | |||
| | NFS4ERR_STALE, NFS4ERR_SYMLINK, | | | | NFS4ERR_STALE, NFS4ERR_SYMLINK, | | |||
| | NFS4ERR_TOO_MANY_OPS, | | | | NFS4ERR_TOO_MANY_OPS, | | |||
| | NFS4ERR_UNKNOWN_LAYOUTTYPE, | | | | NFS4ERR_UNKNOWN_LAYOUTTYPE, | | |||
| | NFS4ERR_WRONG_CRED | | | | NFS4ERR_WRONG_CRED | | |||
skipping to change at page 368, line 35 | skipping to change at page 374, line 35 | |||
| | OPENATTR, OPEN_DOWNGRADE, | | | | OPENATTR, OPEN_DOWNGRADE, | | |||
| | PUTFH, PUTPUBFH, PUTROOTFH, | | | | PUTFH, PUTPUBFH, PUTROOTFH, | | |||
| | READ, READDIR, READLINK, | | | | READ, READDIR, READLINK, | | |||
| | RECLAIM_COMPLETE, REMOVE, | | | | RECLAIM_COMPLETE, REMOVE, | | |||
| | RENAME, SECINFO, | | | | RENAME, SECINFO, | | |||
| | SECINFO_NO_NAME, SEQUENCE, | | | | SECINFO_NO_NAME, SEQUENCE, | | |||
| | SETATTR, SET_SSV, | | | | SETATTR, SET_SSV, | | |||
| | TEST_STATEID, VERIFY, | | | | TEST_STATEID, VERIFY, | | |||
| | WANT_DELEGATION, WRITE | | | | WANT_DELEGATION, WRITE | | |||
| NFS4ERR_DELEG_ALREADY_WANTED | OPEN, WANT_DELEGATION | | | NFS4ERR_DELEG_ALREADY_WANTED | OPEN, WANT_DELEGATION | | |||
| NFS4ERR_DELEG_REVOKED | DELEGRETURN, LAYOUTGET, | | | NFS4ERR_DELEG_REVOKED | DELEGRETURN, LAYOUTCOMMIT, | | |||
| | LAYOUTRETURN, OPEN, READ, | | | | LAYOUTGET, LAYOUTRETURN, | | |||
| | SETATTR, WRITE | | | | OPEN, READ, SETATTR, WRITE | | |||
| NFS4ERR_DENIED | LOCK, LOCKT | | | NFS4ERR_DENIED | LOCK, LOCKT | | |||
| NFS4ERR_DIRDELEG_UNAVAIL | GET_DIR_DELEGATION | | | NFS4ERR_DIRDELEG_UNAVAIL | GET_DIR_DELEGATION | | |||
| NFS4ERR_DQUOT | CREATE, LAYOUTGET, LINK, | | | NFS4ERR_DQUOT | CREATE, LAYOUTGET, LINK, | | |||
| | OPEN, OPENATTR, RENAME, | | | | OPEN, OPENATTR, RENAME, | | |||
| | SETATTR, WRITE | | | | SETATTR, WRITE | | |||
| NFS4ERR_ENCR_ALG_UNSUPP | EXCHANGE_ID | | | NFS4ERR_ENCR_ALG_UNSUPP | EXCHANGE_ID | | |||
| NFS4ERR_EXIST | CREATE, LINK, OPEN, RENAME | | | NFS4ERR_EXIST | CREATE, LINK, OPEN, RENAME | | |||
| NFS4ERR_EXPIRED | CLOSE, DELEGRETURN, | | | NFS4ERR_EXPIRED | CLOSE, DELEGRETURN, | | |||
| | LAYOUTCOMMIT, LAYOUTRETURN, | | | | LAYOUTCOMMIT, LAYOUTRETURN, | | |||
| | LOCK, LOCKU, OPEN, | | | | LOCK, LOCKU, OPEN, | | |||
skipping to change at page 385, line 39 | skipping to change at page 391, line 39 | |||
The COMPOUND procedure is used to combine individual operations into | The COMPOUND procedure is used to combine individual operations into | |||
a single RPC request. The server interprets each of the operations | a single RPC request. The server interprets each of the operations | |||
in turn. If an operation is executed by the server and the status of | in turn. If an operation is executed by the server and the status of | |||
that operation is NFS4_OK, then the next operation in the COMPOUND | that operation is NFS4_OK, then the next operation in the COMPOUND | |||
procedure is executed. The server continues this process until there | procedure is executed. The server continues this process until there | |||
are no more operations to be executed or one of the operations has a | are no more operations to be executed or one of the operations has a | |||
status value other than NFS4_OK. | status value other than NFS4_OK. | |||
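The evaluation rule in the paragraph above can be sketched as a simple loop. This is a minimal model, not a server implementation: run_compound and the handler callables are hypothetical, and only the "stop at the first status other than NFS4_OK" behavior is shown.

```python
NFS4_OK = 0
NFS4ERR_BAD_STATEID = 10026  # error code as listed in Section 15.1.5.2


def run_compound(operations):
    """Evaluate (name, handler) pairs in order, stopping after the first
    operation whose status is not NFS4_OK.

    Each handler is a zero-argument callable returning a status code.
    The returned list holds one (name, status) pair per executed
    operation, including the failing one.
    """
    results = []
    for name, handler in operations:
        status = handler()
        results.append((name, status))
        if status != NFS4_OK:
            break  # later operations are not executed
    return results


# Hypothetical request: the third operation fails, so the fourth never runs.
reply = run_compound([
    ("SEQUENCE", lambda: NFS4_OK),
    ("PUTFH", lambda: NFS4_OK),
    ("READ", lambda: NFS4ERR_BAD_STATEID),
    ("GETATTR", lambda: NFS4_OK),
])
```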
In the processing of the COMPOUND procedure, the server may find that | In the processing of the COMPOUND procedure, the server may find that | |||
it does not have the available resources to execute any or all of the | it does not have the available resources to execute any or all of the | |||
operations within the COMPOUND sequence. See Section 2.10.5.4 for a | operations within the COMPOUND sequence. See Section 2.10.6.4 for a | |||
more detailed discussion. | more detailed discussion. | |||
The server will generally choose between two methods of decoding the | The server will generally choose between two methods of decoding the | |||
client's request. The first would be the traditional one pass XDR | client's request. The first would be the traditional one pass XDR | |||
decode. If there is an XDR decoding error in this case, the RPC XDR | decode. If there is an XDR decoding error in this case, the RPC XDR | |||
decode error would be returned. The second method would be to make | decode error would be returned. The second method would be to make | |||
an initial pass to decode the basic COMPOUND request and then to XDR | an initial pass to decode the basic COMPOUND request and then to XDR | |||
decode the individual operations; the most interesting is the decode | decode the individual operations; the most interesting is the decode | |||
of attributes. In this case, the server may encounter an XDR decode | of attributes. In this case, the server may encounter an XDR decode | |||
error during the second pass. In this case, the server would return | error during the second pass. In this case, the server would return | |||
skipping to change at page 410, line 5 | skipping to change at page 416, line 5 | |||
default: | default: | |||
void; | void; | |||
}; | }; | |||
18.8.3. DESCRIPTION | 18.8.3. DESCRIPTION | |||
This operation returns the current filehandle value. | This operation returns the current filehandle value. | |||
On success, the current filehandle retains its value. | On success, the current filehandle retains its value. | |||
As described in Section 2.10.5.4, GETFH is REQUIRED or RECOMMENDED to | As described in Section 2.10.6.4, GETFH is REQUIRED or RECOMMENDED to | |||
immediately follow certain operations, and servers are free to reject | immediately follow certain operations, and servers are free to reject | |||
such operations if the client fails to insert GETFH in the request as | such operations if the client fails to insert GETFH in the request as | |||
REQUIRED or RECOMMENDED. Section 18.16.4.1 provides additional | REQUIRED or RECOMMENDED. Section 18.16.4.1 provides additional | |||
justification for why GETFH MUST follow OPEN. | justification for why GETFH MUST follow OPEN. | |||
18.8.4. IMPLEMENTATION | 18.8.4. IMPLEMENTATION | |||
Operations that change the current filehandle like LOOKUP or CREATE | Operations that change the current filehandle like LOOKUP or CREATE | |||
do not automatically return the new filehandle as a result. For | do not automatically return the new filehandle as a result. For | |||
instance, if a client needs to lookup a directory entry and obtain | instance, if a client needs to lookup a directory entry and obtain | |||
skipping to change at page 443, line 10 | skipping to change at page 449, line 10 | |||
to determine which file to close. Therefore the client MUST follow | to determine which file to close. Therefore the client MUST follow | |||
every OPEN operation with a GETFH operation in the same COMPOUND | every OPEN operation with a GETFH operation in the same COMPOUND | |||
procedure. This will supply the client with the filehandle such that | procedure. This will supply the client with the filehandle such that | |||
CLOSE can be used appropriately. | CLOSE can be used appropriately. | |||
Simply waiting for the lease on the file to expire is insufficient | Simply waiting for the lease on the file to expire is insufficient | |||
because the server may maintain the state indefinitely as long as | because the server may maintain the state indefinitely as long as | |||
another client does not attempt to make a conflicting access to the | another client does not attempt to make a conflicting access to the | |||
same file. | same file. | |||
See also Section 2.10.5.4. | See also Section 2.10.6.4. | |||
18.17. Operation 19: OPENATTR - Open Named Attribute Directory | 18.17. Operation 19: OPENATTR - Open Named Attribute Directory | |||
18.17.1. ARGUMENTS | 18.17.1. ARGUMENTS | |||
struct OPENATTR4args { | struct OPENATTR4args { | |||
/* CURRENT_FH: object */ | /* CURRENT_FH: object */ | |||
bool createdir; | bool createdir; | |||
}; | }; | |||
skipping to change at page 479, line 15 | skipping to change at page 485, line 15 | |||
18.34.3. DESCRIPTION | 18.34.3. DESCRIPTION | |||
BIND_CONN_TO_SESSION is used to associate additional connections with | BIND_CONN_TO_SESSION is used to associate additional connections with | |||
a session. It MUST be used on the connection being associated with | a session. It MUST be used on the connection being associated with | |||
the session. It MUST be the only operation in the COMPOUND | the session. It MUST be the only operation in the COMPOUND | |||
procedure. If SP4_NONE (Section 18.35) state protection is used, any | procedure. If SP4_NONE (Section 18.35) state protection is used, any | |||
principal, security flavor, or RPCSEC_GSS context MAY be used to | principal, security flavor, or RPCSEC_GSS context MAY be used to | |||
invoke the operation. If SP4_MACH_CRED is used, RPCSEC_GSS MUST be | invoke the operation. If SP4_MACH_CRED is used, RPCSEC_GSS MUST be | |||
used with the integrity or privacy services, using the principal that | used with the integrity or privacy services, using the principal that | |||
created the client ID. If SP4_SSV is used, RPCSEC_GSS with the SSV | created the client ID. If SP4_SSV is used, RPCSEC_GSS with the SSV | |||
GSS mechanism (Section 2.10.8) and integrity or privacy MUST be used. | GSS mechanism (Section 2.10.9) and integrity or privacy MUST be used. | |||
If, when the client ID was created, the client opted for SP4_NONE | If, when the client ID was created, the client opted for SP4_NONE | |||
state protection, the client is not required to use | state protection, the client is not required to use | |||
BIND_CONN_TO_SESSION to associate the connection with the session, | BIND_CONN_TO_SESSION to associate the connection with the session, | |||
unless the client wishes to associate the connection with the | unless the client wishes to associate the connection with the | |||
backchannel. When SP4_NONE protection is used, simply sending a | backchannel. When SP4_NONE protection is used, simply sending a | |||
COMPOUND request with a SEQUENCE operation is sufficient to associate | COMPOUND request with a SEQUENCE operation is sufficient to associate | |||
the connection with the session specified in SEQUENCE. | the connection with the session specified in SEQUENCE. | |||
The field bctsa_dir indicates whether the client wants to associate | The field bctsa_dir indicates whether the client wants to associate | |||
skipping to change at page 484, line 29 | skipping to change at page 490, line 29 | |||
EXCHANGE_ID sent with the current incarnation and co_ownerid will | EXCHANGE_ID sent with the current incarnation and co_ownerid will | |||
result in an error or an update of the client ID's properties, | result in an error or an update of the client ID's properties, | |||
depending on the arguments to EXCHANGE_ID. | depending on the arguments to EXCHANGE_ID. | |||
A server MUST NOT use the same client ID for two different | A server MUST NOT use the same client ID for two different | |||
incarnations of an eir_clientowner. | incarnations of an eir_clientowner. | |||
In addition to the client ID and sequence ID, the server returns a | In addition to the client ID and sequence ID, the server returns a | |||
server owner (eir_server_owner) and server scope (eir_server_scope). | server owner (eir_server_owner) and server scope (eir_server_scope). | |||
The former field is used for network trunking as described in | The former field is used for network trunking as described in | |||
Section 2.10.4. The latter field is used to allow clients to | Section 2.10.5. The latter field is used to allow clients to | |||
determine when client IDs sent by one server may be recognized by | determine when client IDs sent by one server may be recognized by | |||
another in the event of file system migration (see Section 11.7.7). | another in the event of file system migration (see Section 11.7.7). | |||
The client ID returned by EXCHANGE_ID is only unique relative to the | The client ID returned by EXCHANGE_ID is only unique relative to the | |||
combination of eir_server_owner.so_major_id and eir_server_scope. | combination of eir_server_owner.so_major_id and eir_server_scope. | |||
Thus if two servers return the same client ID, the onus is on the | Thus if two servers return the same client ID, the onus is on the | |||
client to distinguish the client IDs on the basis of | client to distinguish the client IDs on the basis of | |||
eir_server_owner.so_major_id and eir_server_scope. In the event two | eir_server_owner.so_major_id and eir_server_scope. In the event two | |||
different servers claim matching server_owner.so_major_id and | different servers claim matching server_owner.so_major_id and | |||
eir_server_scope, the client can use the verification techniques | eir_server_scope, the client can use the verification techniques | |||
discussed in Section 2.10.4 to determine if the servers are distinct. | discussed in Section 2.10.5 to determine if the servers are distinct. | |||
If they are distinct, then the client will need to note the | If they are distinct, then the client will need to note the | |||
destination network addresses of the connections used with each | destination network addresses of the connections used with each | |||
server, and use the network address as the final discriminator. | server, and use the network address as the final discriminator. | |||
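One way a client might apply the disambiguation rule above is to build a lookup key for its client ID table. The sketch below is an assumption-laden illustration: clientid_key and its parameters are hypothetical names, and servers_distinct stands in for the outcome of the verification techniques referenced in the text, which are not modeled here.

```python
def clientid_key(clientid, so_major_id, server_scope, addr, servers_distinct):
    """Build a table key that tells apart equal client IDs from different servers.

    The client ID is unique only relative to (so_major_id, server_scope).
    When two servers claim the same pair but have been verified to be
    distinct, the destination network address is appended as the final
    discriminator.
    """
    key = (so_major_id, server_scope, clientid)
    if servers_distinct:
        key = key + (addr,)
    return key


# Two servers returning the same client ID under different major IDs.
k1 = clientid_key(7, b"major-a", b"scope-1", "192.0.2.1", False)
k2 = clientid_key(7, b"major-b", b"scope-1", "192.0.2.2", False)
```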
The server, as defined by the unique identity expressed in the | The server, as defined by the unique identity expressed in the | |||
so_major_id of the server owner and the server scope, needs to track | so_major_id of the server owner and the server scope, needs to track | |||
several properties of each client ID it hands out. The properties | several properties of each client ID it hands out. The properties | |||
apply to the client ID and all sessions associated with the client | apply to the client ID and all sessions associated with the client | |||
ID. The properties are derived from the arguments and results of | ID. The properties are derived from the arguments and results of | |||
EXCHANGE_ID. The client ID properties include: | EXCHANGE_ID. The client ID properties include: | |||
skipping to change at page 486, line 13 | skipping to change at page 492, line 13 | |||
this property cannot be updated by subsequent EXCHANGE_ID | this property cannot be updated by subsequent EXCHANGE_ID | |||
requests. | requests. | |||
* The length of the SSV. This property is represented by the | * The length of the SSV. This property is represented by the | |||
spi_ssv_len field in the EXCHANGE_ID results. Once the client | spi_ssv_len field in the EXCHANGE_ID results. Once the client | |||
ID is confirmed, this property cannot be updated by subsequent | ID is confirmed, this property cannot be updated by subsequent | |||
EXCHANGE_ID requests. The length of SSV MUST be equal to the | EXCHANGE_ID requests. The length of SSV MUST be equal to the | |||
length of the key used by the negotiated encryption algorithm. | length of the key used by the negotiated encryption algorithm. | |||
* Number of concurrent versions of the SSV the client and server | * Number of concurrent versions of the SSV the client and server | |||
will support (Section 2.10.8). This property is represented by | will support (Section 2.10.9). This property is represented by | |||
spi_window, in the EXCHANGE_ID results. The property may be | spi_window, in the EXCHANGE_ID results. The property may be | |||
updated by subsequent EXCHANGE_ID requests. | updated by subsequent EXCHANGE_ID requests. | |||
o The client's implementation ID as represented by the | o The client's implementation ID as represented by the | |||
eia_client_impl_id field of the arguments. The property may be | eia_client_impl_id field of the arguments. The property may be | |||
updated by subsequent EXCHANGE_ID requests. | updated by subsequent EXCHANGE_ID requests. | |||
o The server's implementation ID as represented by the | o The server's implementation ID as represented by the | |||
eir_server_impl_id field of the reply. The property may be | eir_server_impl_id field of the reply. The property may be | |||
updated by replies to subsequent EXCHANGE_ID requests. | updated by replies to subsequent EXCHANGE_ID requests. | |||
skipping to change at page 487, line 13 | skipping to change at page 493, line 13 | |||
principal and security flavor it uses when sending the EXCHANGE_ID | principal and security flavor it uses when sending the EXCHANGE_ID | |||
request. The situations described in Sub-Paragraph 6, Sub- | request. The situations described in Sub-Paragraph 6, Sub- | |||
Paragraph 7, Sub-Paragraph 8, or Sub-Paragraph 9, of Paragraph 6 in | Paragraph 7, Sub-Paragraph 8, or Sub-Paragraph 9, of Paragraph 6 in | |||
Section 18.35.4 will apply. Note that if the operation succeeds and | Section 18.35.4 will apply. Note that if the operation succeeds and | |||
returns a client ID that is already confirmed, the server MUST set | returns a client ID that is already confirmed, the server MUST set | |||
the EXCHGID4_FLAG_CONFIRMED_R bit in eir_flags. | the EXCHGID4_FLAG_CONFIRMED_R bit in eir_flags. | |||
If EXCHGID4_FLAG_UPD_CONFIRMED_REC_A is not set in eia_flags, this | If EXCHGID4_FLAG_UPD_CONFIRMED_REC_A is not set in eia_flags, this | |||
means the client is trying to establish a new client ID; it is | means the client is trying to establish a new client ID; it is | |||
attempting to trunk data communication to the server | attempting to trunk data communication to the server | |||
(Section 2.10.4); or it is attempting to update properties of an | (Section 2.10.5); or it is attempting to update properties of an | |||
unconfirmed client ID. The situations described in Sub-Paragraph 1, | unconfirmed client ID. The situations described in Sub-Paragraph 1, | |||
Sub-Paragraph 2, Sub-Paragraph 3, Sub-Paragraph 4, or Sub-Paragraph 5 | Sub-Paragraph 2, Sub-Paragraph 3, Sub-Paragraph 4, or Sub-Paragraph 5 | |||
of Paragraph 6 in Section 18.35.4 will apply. Note that if the | of Paragraph 6 in Section 18.35.4 will apply. Note that if the | |||
operation succeeds and returns a client ID that was previously | operation succeeds and returns a client ID that was previously | |||
confirmed, the server MUST set the EXCHGID4_FLAG_CONFIRMED_R bit in | confirmed, the server MUST set the EXCHGID4_FLAG_CONFIRMED_R bit in | |||
eir_flags. | eir_flags. | |||
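The two flag cases above can be sketched as a small dispatch. This is a minimal illustration, not text from either draft: the numeric flag values are assumed from the NFSv4.1 XDR, and the helper names applicable_subparagraphs and reply_flags are hypothetical.

```python
# Assumed flag values (taken from the NFSv4.1 XDR, not from this excerpt).
EXCHGID4_FLAG_UPD_CONFIRMED_REC_A = 0x40000000
EXCHGID4_FLAG_CONFIRMED_R = 0x80000000


def applicable_subparagraphs(eia_flags):
    """Return the Sub-Paragraph numbers of Paragraph 6 in Section 18.35.4
    that can apply, based on whether the update flag is set."""
    if eia_flags & EXCHGID4_FLAG_UPD_CONFIRMED_REC_A:
        # Client is updating a confirmed client ID.
        return (6, 7, 8, 9)
    # New client ID, trunking, or update of an unconfirmed client ID.
    return (1, 2, 3, 4, 5)


def reply_flags(base_flags, client_id_confirmed):
    """Set EXCHGID4_FLAG_CONFIRMED_R in eir_flags when the returned
    client ID is already confirmed."""
    if client_id_confirmed:
        return base_flags | EXCHGID4_FLAG_CONFIRMED_R
    return base_flags
```

The sketch only selects which cases are candidates; deciding among them depends on the client owner, verifier, and principal checks described in Section 18.35.4.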
When the EXCHGID4_FLAG_SUPP_MOVED_REFER flag bit is set, the client | When the EXCHGID4_FLAG_SUPP_MOVED_REFER flag bit is set, the client | |||
indicates that it is capable of dealing with an NFS4ERR_MOVED error | indicates that it is capable of dealing with an NFS4ERR_MOVED error | |||
as part of a referral sequence. When this bit is not set, it is | as part of a referral sequence. When this bit is not set, it is | |||
skipping to change at page 488, line 29 | skipping to change at page 494, line 29 | |||
Multiple roles can be associated with the same client ID or with | Multiple roles can be associated with the same client ID or with | |||
different client IDs. Thus, if a client sends EXCHANGE_ID from the | different client IDs. Thus, if a client sends EXCHANGE_ID from the | |||
same client owner to the same server owner multiple times, but | same client owner to the same server owner multiple times, but | |||
specifies different pNFS roles each time, the server might return | specifies different pNFS roles each time, the server might return | |||
different client IDs. Given that different pNFS roles might have | different client IDs. Given that different pNFS roles might have | |||
different client IDs, the client may ask for different properties for | different client IDs, the client may ask for different properties for | |||
each role/client ID. | each role/client ID. | |||
The spa_how field of the eia_state_protect field specifies how the | The spa_how field of the eia_state_protect field specifies how the | |||
client wants to protect its client, locking and session state from | client wants to protect its client, locking and session state from | |||
unauthorized changes (Section 2.10.7.3): | unauthorized changes (Section 2.10.8.3): | |||
o SP4_NONE. The client does not request the NFSv4.1 server to | o SP4_NONE. The client does not request the NFSv4.1 server to | |||
enforce state protection. The NFSv4.1 server MUST NOT enforce | enforce state protection. The NFSv4.1 server MUST NOT enforce | |||
state protection for the returned client ID. | state protection for the returned client ID. | |||
o SP4_MACH_CRED. This choice is only valid if the client sent the | o SP4_MACH_CRED. This choice is only valid if the client sent the | |||
request with RPCSEC_GSS as the security flavor, and with a service | request with RPCSEC_GSS as the security flavor, and with a service | |||
of RPC_GSS_SVC_INTEGRITY or RPC_GSS_SVC_PRIVACY. The client wants | of RPC_GSS_SVC_INTEGRITY or RPC_GSS_SVC_PRIVACY. The client wants | |||
to use an RPCSEC_GSS-based machine credential to protect its | to use an RPCSEC_GSS-based machine credential to protect its | |||
state. The server MUST note the principal the EXCHANGE_ID | state. The server MUST note the principal the EXCHANGE_ID | |||
skipping to change at page 491, line 20 | skipping to change at page 497, line 20 | |||
return NFS4ERR_INVAL. The server responds with spi_window, which | return NFS4ERR_INVAL. The server responds with spi_window, which | |||
MUST NOT exceed ssp_window, and MUST be at least one (1). Any | MUST NOT exceed ssp_window, and MUST be at least one (1). Any | |||
requests on the backchannel or fore channel that are using a | requests on the backchannel or fore channel that are using a | |||
version of the SSV that is outside the window will fail with an | version of the SSV that is outside the window will fail with an | |||
ONC RPC authentication error, and the requester will have to retry | ONC RPC authentication error, and the requester will have to retry | |||
them with the same slot ID and sequence ID. | them with the same slot ID and sequence ID. | |||
ssp_num_gss_handles: | ssp_num_gss_handles: | |||
This is the number of RPCSEC_GSS handles the server should create | This is the number of RPCSEC_GSS handles the server should create | |||
that are based on the GSS SSV mechanism (Section 2.10.8). It is | that are based on the GSS SSV mechanism (Section 2.10.9). It is | |||
not the total number of RPCSEC_GSS handles for the client ID. | not the total number of RPCSEC_GSS handles for the client ID. | |||
Indeed, subsequent calls to EXCHANGE_ID will add RPCSEC_GSS | Indeed, subsequent calls to EXCHANGE_ID will add RPCSEC_GSS | |||
handles. The server responds with a list of handles in | handles. The server responds with a list of handles in | |||
spi_handles. If the client asks for at least one handle and the | spi_handles. If the client asks for at least one handle and the | |||
server cannot create it, the server MUST return an error. The | server cannot create it, the server MUST return an error. The | |||
handles in spi_handles are not available for use until the client | handles in spi_handles are not available for use until the client | |||
ID is confirmed, which could be immediately if EXCHANGE_ID returns | ID is confirmed, which could be immediately if EXCHANGE_ID returns | |||
EXCHGID4_FLAG_CONFIRMED_R, or upon successful confirmation from | EXCHGID4_FLAG_CONFIRMED_R, or upon successful confirmation from | |||
CREATE_SESSION. While a client ID can span all the connections | CREATE_SESSION. While a client ID can span all the connections | |||
that are connected to a server sharing the same | that are connected to a server sharing the same | |||
skipping to change at page 502, line 5 | skipping to change at page 508, line 5 | |||
The maximum size of a COMPOUND or CB_COMPOUND request that will | The maximum size of a COMPOUND or CB_COMPOUND request that will | |||
be sent. This size represents the XDR encoded size of the | be sent. This size represents the XDR encoded size of the | |||
request, including the RPC headers (including security flavor | request, including the RPC headers (including security flavor | |||
credentials and verifiers) but excludes any RPC transport | credentials and verifiers) but excludes any RPC transport | |||
framing headers. Imagine a request coming over a non-RDMA | framing headers. Imagine a request coming over a non-RDMA | |||
TCP/IP connection, and that it has a single Record Marking | TCP/IP connection, and that it has a single Record Marking | |||
header preceding it. The maximum allowable count encoded in | header preceding it. The maximum allowable count encoded in | |||
the header will be ca_maxrequestsize. If a requester sends a | the header will be ca_maxrequestsize. If a requester sends a | |||
request that exceeds ca_maxrequestsize, the error | request that exceeds ca_maxrequestsize, the error | |||
NFS4ERR_REQ_TOO_BIG will be returned per the description in | NFS4ERR_REQ_TOO_BIG will be returned per the description in | |||
Section 2.10.5.4. | Section 2.10.6.4. | |||
ca_maxresponsesize: | ca_maxresponsesize: | |||
The maximum size of a COMPOUND or CB_COMPOUND reply that the | The maximum size of a COMPOUND or CB_COMPOUND reply that the | |||
requester will accept from the replier including RPC headers | requester will accept from the replier including RPC headers | |||
(see the ca_maxrequestsize definition). The NFSv4.1 server | (see the ca_maxrequestsize definition). The NFSv4.1 server | |||
MUST NOT increase the value of this parameter in the | MUST NOT increase the value of this parameter in the | |||
CREATE_SESSION results. However, if the client selects a value | CREATE_SESSION results. However, if the client selects a value | |||
for ca_maxresponsesize such that a replier on a channel could | for ca_maxresponsesize such that a replier on a channel could | |||
never send a response, the server SHOULD return | never send a response, the server SHOULD return | |||
NFS4ERR_TOOSMALL in the CREATE_SESSION reply. If a requester | NFS4ERR_TOOSMALL in the CREATE_SESSION reply. If a requester | |||
sends a request for which the size of the reply would exceed | sends a request for which the size of the reply would exceed | |||
this value, the replier will return NFS4ERR_REP_TOO_BIG, per | this value, the replier will return NFS4ERR_REP_TOO_BIG, per | |||
the description in Section 2.10.5.4. | the description in Section 2.10.6.4. | |||
ca_maxresponsesize_cached: | ca_maxresponsesize_cached: | |||
Like ca_maxresponsesize, but the maximum size of a reply that | Like ca_maxresponsesize, but the maximum size of a reply that | |||
will be stored in the reply cache (Section 2.10.5.1). If the | will be stored in the reply cache (Section 2.10.6.1). If the | |||
reply to CREATE_SESSION has ca_maxresponsesize_cached less than | reply to CREATE_SESSION has ca_maxresponsesize_cached less than | |||
ca_maxresponsesize, then this is an indication to the requester | ca_maxresponsesize, then this is an indication to the requester | |||
on the channel that it needs to be selective about which | on the channel that it needs to be selective about which | |||
replies it directs the replier to cache; for example, large | replies it directs the replier to cache; for example, large | |||
replies from nonidempotent operations (e.g., COMPOUND requests | replies from nonidempotent operations (e.g., COMPOUND requests | |||
with a READ operation) should not be cached. The requester | with a READ operation) should not be cached. The requester | |||
decides which replies to cache via an argument to the SEQUENCE | decides which replies to cache via an argument to the SEQUENCE | |||
(the sa_cachethis field, see Section 18.46) or CB_SEQUENCE (the | (the sa_cachethis field, see Section 18.46) or CB_SEQUENCE (the | |||
csa_cachethis field, see Section 20.9) operations. If a | csa_cachethis field, see Section 20.9) operations. If a | |||
requester sends a request for which the size of the reply would | requester sends a request for which the size of the reply would | |||
exceed this value, the replier will return | exceed this value, the replier will return | |||
NFS4ERR_REP_TOO_BIG_TO_CACHE, per the description in | NFS4ERR_REP_TOO_BIG_TO_CACHE, per the description in | |||
Section 2.10.5.4. | Section 2.10.6.4. | |||
ca_maxoperations: | ca_maxoperations: | |||
The maximum number of operations the replier will accept in a | The maximum number of operations the replier will accept in a | |||
COMPOUND or CB_COMPOUND. The server MUST NOT increase | COMPOUND or CB_COMPOUND. The server MUST NOT increase | |||
ca_maxoperations in the reply to CREATE_SESSION. If the | ca_maxoperations in the reply to CREATE_SESSION. If the | |||
requester sends a COMPOUND or CB_COMPOUND with more operations | requester sends a COMPOUND or CB_COMPOUND with more operations | |||
than ca_maxoperations, the replier MUST return | than ca_maxoperations, the replier MUST return | |||
NFS4ERR_TOO_MANY_OPS. | NFS4ERR_TOO_MANY_OPS. | |||
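As an illustration of how a replier might enforce the channel attributes described above, the following sketch maps each limit to the error named in the text. The class and function names are hypothetical, the order of the checks is an assumption, and applying the cached-size limit only when caching was requested is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class ChannelAttrs:
    # Field names follow the channel attributes in the text.
    ca_maxrequestsize: int
    ca_maxresponsesize: int
    ca_maxresponsesize_cached: int
    ca_maxoperations: int


def check_request(attrs, request_size, num_ops):
    """Check an incoming COMPOUND or CB_COMPOUND against the request limits.
    request_size is the XDR-encoded size including RPC headers."""
    if request_size > attrs.ca_maxrequestsize:
        return "NFS4ERR_REQ_TOO_BIG"
    if num_ops > attrs.ca_maxoperations:
        return "NFS4ERR_TOO_MANY_OPS"
    return "NFS4_OK"


def check_reply(attrs, reply_size, cachethis):
    """Check the size of a prospective reply against the reply limits."""
    if reply_size > attrs.ca_maxresponsesize:
        return "NFS4ERR_REP_TOO_BIG"
    # Assumption: the cached-size limit matters only if caching was requested.
    if cachethis and reply_size > attrs.ca_maxresponsesize_cached:
        return "NFS4ERR_REP_TOO_BIG_TO_CACHE"
    return "NFS4_OK"
```

For example, with ca_maxresponsesize of 4096 and ca_maxresponsesize_cached of 1024, a 2000-byte reply is acceptable when uncached but is rejected when the requester asked for it to be cached.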
skipping to change at page 509, line 13 | skipping to change at page 515, line 13 | |||
has no remaining associated sessions, the connection MAY be closed by | has no remaining associated sessions, the connection MAY be closed by | |||
the server. Locks, delegations, layouts, wants, and the lease, which | the server. Locks, delegations, layouts, wants, and the lease, which | |||
are all tied to the client ID, are not affected by DESTROY_SESSION. | are all tied to the client ID, are not affected by DESTROY_SESSION. | |||
DESTROY_SESSION MUST be invoked on a connection that is associated | DESTROY_SESSION MUST be invoked on a connection that is associated | |||
with the session being destroyed. In addition, if SP4_MACH_CRED state | with the session being destroyed. In addition, if SP4_MACH_CRED state | |||
protection was specified when the client ID was created, the | protection was specified when the client ID was created, the | |||
RPCSEC_GSS principal that created the session MUST be the one that | RPCSEC_GSS principal that created the session MUST be the one that | |||
destroys the session, using RPCSEC_GSS privacy or integrity. If | destroys the session, using RPCSEC_GSS privacy or integrity. If | |||
SP4_SSV state protection was specified when the client ID was | SP4_SSV state protection was specified when the client ID was | |||
created, RPCSEC_GSS using the SSV mechanism (Section 2.10.8) MUST be | created, RPCSEC_GSS using the SSV mechanism (Section 2.10.9) MUST be | |||
used, with integrity or privacy. | used, with integrity or privacy. | |||
If the COMPOUND request starts with SEQUENCE, and if the sessionids | If the COMPOUND request starts with SEQUENCE, and if the sessionids | |||
specified in SEQUENCE and DESTROY_SESSION are the same, then | specified in SEQUENCE and DESTROY_SESSION are the same, then | |||
o DESTROY_SESSION MUST be the final operation in the COMPOUND | o DESTROY_SESSION MUST be the final operation in the COMPOUND | |||
request. | request. | |||
o It is advisable to not place DESTROY_SESSION in a COMPOUND request | o It is advisable to not place DESTROY_SESSION in a COMPOUND request | |||
with other state-modifying operations, because the DESTROY_SESSION | with other state-modifying operations, because the DESTROY_SESSION | |||
skipping to change at page 538, line 37 | skipping to change at page 544, line 37 | |||
The sa_slotid argument is the index in the reply cache for the | The sa_slotid argument is the index in the reply cache for the | |||
request. The sa_sequenceid field is the sequence number of the | request. The sa_sequenceid field is the sequence number of the | |||
request for the reply cache entry (slot). The sr_slotid result MUST | request for the reply cache entry (slot). The sr_slotid result MUST | |||
equal sa_slotid. The sr_sequenceid result MUST equal sa_sequenceid. | equal sa_slotid. The sr_sequenceid result MUST equal sa_sequenceid. | |||
The sa_highest_slotid argument is the highest slot ID the client has | The sa_highest_slotid argument is the highest slot ID the client has | |||
a request outstanding for; it could be equal to sa_slotid. The | a request outstanding for; it could be equal to sa_slotid. The | |||
server returns two "highest_slotid" values: sr_highest_slotid, and | server returns two "highest_slotid" values: sr_highest_slotid, and | |||
sr_target_highest_slotid. The former is the highest slot ID the | sr_target_highest_slotid. The former is the highest slot ID the | |||
server will accept in a future SEQUENCE operation, and SHOULD NOT be | server will accept in a future SEQUENCE operation, and SHOULD NOT be | |||
less than the value of sa_highest_slotid (but see Section 2.10.5.1 | less than the value of sa_highest_slotid (but see Section 2.10.6.1 | |||
for an exception). The latter is the highest slot ID the server | for an exception). The latter is the highest slot ID the server | |||
would prefer the client use on a future SEQUENCE operation. | would prefer the client use on a future SEQUENCE operation. | |||
If sa_cachethis is TRUE, then the client is requesting that the | If sa_cachethis is TRUE, then the client is requesting that the | |||
server cache the entire reply in the server's reply cache; therefore | server cache the entire reply in the server's reply cache; therefore | |||
the server MUST cache the reply (see Section 2.10.5.1.3). The server | the server MUST cache the reply (see Section 2.10.6.1.3). The server | |||
MAY cache the reply if sa_cachethis is FALSE. If the server does not | MAY cache the reply if sa_cachethis is FALSE. If the server does not | |||
cache the entire reply, it MUST still record that it executed the | cache the entire reply, it MUST still record that it executed the | |||
request at the specified slot and sequence ID. | request at the specified slot and sequence ID. | |||
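The caching rule above can be sketched as follows. This is a hedged illustration only: the slot table is modeled as a plain dictionary, and record_reply and server_prefers_caching are hypothetical names.

```python
def record_reply(slot_table, sa_slotid, sa_sequenceid, reply,
                 sa_cachethis, server_prefers_caching=False):
    """Record the execution of a request in its slot.

    The sequence ID is always recorded. The full reply is kept when the
    client set sa_cachethis (MUST) or when the server chooses to cache
    anyway (MAY)."""
    keep_reply = sa_cachethis or server_prefers_caching
    slot_table[sa_slotid] = {
        "sequenceid": sa_sequenceid,
        "reply": reply if keep_reply else None,
    }
    # sr_slotid and sr_sequenceid MUST echo the arguments.
    return {"sr_slotid": sa_slotid, "sr_sequenceid": sa_sequenceid}
```

Even when the reply body is dropped, the slot still records that the request ran at that slot and sequence ID, so a later retry can be recognized.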
The response to the SEQUENCE operation contains a word of status | The response to the SEQUENCE operation contains a word of status | |||
flags (sr_status_flags) that can provide to the client information | flags (sr_status_flags) that can provide to the client information | |||
related to the status of the client's lock state and communications | related to the status of the client's lock state and communications | |||
paths. Note that any status bits relating to lock state MAY be reset | paths. Note that any status bits relating to lock state MAY be reset | |||
when lock state is lost due to a server restart (even if the session | when lock state is lost due to a server restart (even if the session | |||
is persistent across restarts; session persistence does not imply | is persistent across restarts; session persistence does not imply | |||
skipping to change at page 543, line 26 | skipping to change at page 549, line 26 | |||
case NFS4_OK: | case NFS4_OK: | |||
SET_SSV4resok ssr_resok4; | SET_SSV4resok ssr_resok4; | |||
default: | default: | |||
void; | void; | |||
}; | }; | |||
18.47.3. DESCRIPTION | 18.47.3. DESCRIPTION | |||
This operation is used to update the SSV for a client ID. Before | This operation is used to update the SSV for a client ID. Before | |||
SET_SSV is called the first time on a client ID, the SSV is zero (0). | SET_SSV is called the first time on a client ID, the SSV is zero (0). | |||
The SSV is the key used for the SSV GSS mechanism (Section 2.10.8). | The SSV is the key used for the SSV GSS mechanism (Section 2.10.9). | |||
SET_SSV MUST be preceded by a SEQUENCE operation in the same | SET_SSV MUST be preceded by a SEQUENCE operation in the same | |||
COMPOUND. It MUST NOT be used if the client did not opt for SP4_SSV | COMPOUND. It MUST NOT be used if the client did not opt for SP4_SSV | |||
state protection when the client ID was created (see Section 18.35); | state protection when the client ID was created (see Section 18.35); | |||
the server returns NFS4ERR_INVAL in that case. | the server returns NFS4ERR_INVAL in that case. | |||
The field ssa_digest is computed as the output of the HMAC (RFC 2104 | The field ssa_digest is computed as the output of the HMAC (RFC 2104 | |||
[11]) using the subkey derived from the SSV4_SUBKEY_MIC_I2T and | [11]) using the subkey derived from the SSV4_SUBKEY_MIC_I2T and | |||
current SSV as the key (see Section 2.10.8 for a description of | current SSV as the key (see Section 2.10.9 for a description of | |||
subkeys), and an XDR encoded value of data type ssa_digest_input4. | subkeys), and an XDR encoded value of data type ssa_digest_input4. | |||
The field sdi_seqargs is equal to the arguments of the SEQUENCE | The field sdi_seqargs is equal to the arguments of the SEQUENCE | |||
operation for the COMPOUND procedure that SET_SSV is within. | operation for the COMPOUND procedure that SET_SSV is within. | |||
The argument ssa_ssv is XORed with the current SSV to produce the new | The argument ssa_ssv is XORed with the current SSV to produce the new | |||
SSV. The argument ssa_ssv SHOULD be generated randomly. | SSV. The argument ssa_ssv SHOULD be generated randomly. | |||
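The digest and XOR steps above can be sketched with the Python standard library. This is a minimal sketch under stated assumptions: the subkey derivation from SSV4_SUBKEY_MIC_I2T is not shown and the subkey is passed in as bytes, the XDR encoding of ssa_digest_input4 is supplied by the caller, and SHA-256 stands in for whatever hash algorithm was negotiated. The function names are hypothetical.

```python
import hashlib
import hmac


def compute_ssa_digest(subkey, xdr_digest_input, hash_name="sha256"):
    """HMAC over the XDR-encoded ssa_digest_input4, keyed by the subkey.
    Assumption: the negotiated hash is named by hash_name."""
    return hmac.new(subkey, xdr_digest_input, hash_name).digest()


def next_ssv(current_ssv, ssa_ssv):
    """XOR ssa_ssv into the current SSV to produce the new SSV."""
    if len(current_ssv) != len(ssa_ssv):
        raise ValueError("ssa_ssv must have the same length as the SSV")
    return bytes(a ^ b for a, b in zip(current_ssv, ssa_ssv))


# Before the first SET_SSV the SSV is zero, so the first new SSV
# equals ssa_ssv itself.
initial_ssv = bytes(4)
first_ssv = next_ssv(initial_ssv, b"\x01\x02\x03\x04")
```

A receiver that recomputes the digest would typically compare it with hmac.compare_digest rather than ==, to avoid leaking timing information.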
In the response, ssr_digest is the output of the HMAC using the | In the response, ssr_digest is the output of the HMAC using the | |||
subkey derived from SSV4_SUBKEY_MIC_T2I and new SSV as the key, and | subkey derived from SSV4_SUBKEY_MIC_T2I and new SSV as the key, and | |||
an XDR encoded value of data type ssr_digest_input4. The field | an XDR encoded value of data type ssr_digest_input4. The field | |||
skipping to change at page 544, line 9 | skipping to change at page 550, line 9 | |||
COMPOUND procedure that SET_SSV is within. | COMPOUND procedure that SET_SSV is within. | |||
As noted in Section 18.35, the client and server can maintain | As noted in Section 18.35, the client and server can maintain | |||
multiple concurrent versions of the SSV. The client and server each | multiple concurrent versions of the SSV. The client and server each | |||
MUST maintain an internal SSV version number, which is set to one (1) | MUST maintain an internal SSV version number, which is set to one (1) | |||
the first time SET_SSV executes on the server and the client receives | the first time SET_SSV executes on the server and the client receives | |||
the first SET_SSV reply. Each subsequent SET_SSV increases the | the first SET_SSV reply. Each subsequent SET_SSV increases the | |||
internal SSV version number by one (1). The value of this version | internal SSV version number by one (1). The value of this version | |||
number corresponds to the smpt_ssv_seq, smt_ssv_seq, sspt_ssv_seq, | number corresponds to the smpt_ssv_seq, smt_ssv_seq, sspt_ssv_seq, | |||
and ssct_ssv_seq fields of the SSV GSS mechanism tokens (see | and ssct_ssv_seq fields of the SSV GSS mechanism tokens (see | |||
Section 2.10.8). | Section 2.10.9). | |||
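The version numbering described above can be sketched as a small bookkeeping class. The class name, the use of 0 to mean "no SET_SSV yet", and the interpretation of the window as "keep the most recent N versions" are all assumptions made for illustration.

```python
class SsvVersions:
    """Track the internal SSV version number and recent SSV values."""

    def __init__(self, window):
        self.window = window      # number of recent versions to retain
        self.version = 0          # assumption: 0 means no SET_SSV so far
        self.by_version = {}      # version number -> SSV bytes

    def record_set_ssv(self, new_ssv):
        """Called once per successful SET_SSV; returns the new version."""
        self.version += 1         # first SET_SSV yields version 1
        self.by_version[self.version] = new_ssv
        # Drop versions that have fallen outside the retained window.
        cutoff = self.version - self.window
        for old in [v for v in self.by_version if v <= cutoff]:
            del self.by_version[old]
        return self.version
```

The value returned by record_set_ssv is the number that would be carried in the ssv_seq fields of the SSV GSS mechanism tokens.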
18.47.4. IMPLEMENTATION | 18.47.4. IMPLEMENTATION | |||
When the server receives ssa_digest, it MUST verify the digest by | When the server receives ssa_digest, it MUST verify the digest by | |||
computing the digest the same way the client did and comparing it | computing the digest the same way the client did and comparing it | |||
with ssa_digest. If the server gets a different result, this is an | with ssa_digest. If the server gets a different result, this is an | |||
error, NFS4ERR_BAD_SESSION_DIGEST. This error might be the result of | error, NFS4ERR_BAD_SESSION_DIGEST. This error might be the result of | |||
another SET_SSV from the same client ID changing the SSV. If so, the | another SET_SSV from the same client ID changing the SSV. If so, the | |||
client recovers by issuing SET_SSV again with a recomputed digest | client recovers by issuing SET_SSV again with a recomputed digest | |||
based on the subkey of the new SSV. If the transport connection is | based on the subkey of the new SSV. If the transport connection is | |||
skipping to change at page 544, line 38 | skipping to change at page 550, line 38 | |||
is created). | is created). | |||
Clients SHOULD send SET_SSV with RPCSEC_GSS privacy. Servers MUST | Clients SHOULD send SET_SSV with RPCSEC_GSS privacy. Servers MUST | |||
support RPCSEC_GSS with privacy for any COMPOUND that has { SEQUENCE, | support RPCSEC_GSS with privacy for any COMPOUND that has { SEQUENCE, | |||
SET_SSV }. | SET_SSV }. | |||
A client SHOULD NOT send SET_SSV with the SSV GSS mechanism's | A client SHOULD NOT send SET_SSV with the SSV GSS mechanism's | |||
credential because the purpose of SET_SSV is to seed the SSV from | credential because the purpose of SET_SSV is to seed the SSV from | |||
non-SSV credentials. Instead, SET_SSV SHOULD be sent with the | non-SSV credentials. Instead, SET_SSV SHOULD be sent with the | |||
credential of a user that is accessing the client ID for the first | credential of a user that is accessing the client ID for the first | |||
time (Section 2.10.7.3). However, if the client does send SET_SSV | time (Section 2.10.8.3). However, if the client does send SET_SSV | |||
with SSV credentials, the digest protecting the arguments uses the | with SSV credentials, the digest protecting the arguments uses the | |||
value of the SSV before ssa_ssv is XORed in, and the digest | value of the SSV before ssa_ssv is XORed in, and the digest | |||
protecting the results uses the value of the SSV after the ssa_ssv is | protecting the results uses the value of the SSV after the ssa_ssv is | |||
XORed in. | XORed in. | |||
18.48. Operation 55: TEST_STATEID - Test stateids for validity | 18.48. Operation 55: TEST_STATEID - Test stateids for validity | |||
Test a series of stateids for validity. | Test a series of stateids for validity. | |||
18.48.1. ARGUMENT | 18.48.1. ARGUMENT | |||
skipping to change at page 557, line 20 | skipping to change at page 563, line 20 | |||
19.2.3. DESCRIPTION | 19.2.3. DESCRIPTION | |||
The CB_COMPOUND procedure is used to combine one or more of the | The CB_COMPOUND procedure is used to combine one or more of the | |||
callback procedures into a single RPC request. The main callback RPC | callback procedures into a single RPC request. The main callback RPC | |||
program has two main procedures: CB_NULL and CB_COMPOUND. All other | program has two main procedures: CB_NULL and CB_COMPOUND. All other | |||
operations use the CB_COMPOUND procedure as a wrapper. | operations use the CB_COMPOUND procedure as a wrapper. | |||
During the processing of the CB_COMPOUND procedure, the client may | During the processing of the CB_COMPOUND procedure, the client may | |||
find that it does not have the available resources to execute any or | find that it does not have the available resources to execute any or | |||
all of the operations within the CB_COMPOUND sequence. Refer to | all of the operations within the CB_COMPOUND sequence. Refer to | |||
Section 2.10.5.4 for details. | Section 2.10.6.4 for details. | |||
The minorversion field of the arguments MUST be the same as the | The minorversion field of the arguments MUST be the same as the | |||
minorversion of the COMPOUND procedure used to create the client ID | minorversion of the COMPOUND procedure used to create the client ID | |||
and session. For NFSv4.1, minorversion MUST be set to 1. | and session. For NFSv4.1, minorversion MUST be set to 1. | |||
Contained within the CB_COMPOUND results is a 'status' field. This | Contained within the CB_COMPOUND results is a 'status' field. This | |||
status must be equivalent to the status of the last operation that | status must be equivalent to the status of the last operation that | |||
was executed within the CB_COMPOUND procedure. Therefore, if an | was executed within the CB_COMPOUND procedure. Therefore, if an | |||
operation incurred an error then the 'status' value will be the same | operation incurred an error then the 'status' value will be the same | |||
error value as is being returned for the operation that failed. | error value as is being returned for the operation that failed. | |||
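The relationship between the overall 'status' and the per-operation results can be sketched as below. This is an illustration under assumptions: operations are modeled as (name, callable) pairs, evaluation is assumed to stop at the first failing operation, and an empty CB_COMPOUND is assumed to yield NFS4_OK. The function name is hypothetical.

```python
def run_cb_compound(operations):
    """Execute callback operations in order and derive the overall status.

    operations: list of (name, handler) pairs, where handler() returns a
    status string such as "NFS4_OK" or an NFS4ERR_* name."""
    status = "NFS4_OK"            # assumption for an empty CB_COMPOUND
    results = []
    for name, handler in operations:
        status = handler()
        results.append((name, status))
        if status != "NFS4_OK":
            break                 # assumption: stop at the first error
    # The overall status equals the status of the last executed operation.
    return status, results
```

With this shape, the overall status always matches the last entry in the results list, which is the property the paragraph above describes.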
skipping to change at page 575, line 36 | skipping to change at page 581, line 36 | |||
contents include the session ID to which this request belongs, the | contents include the session ID to which this request belongs, the | |||
slot ID and sequence ID used by the server to implement session | slot ID and sequence ID used by the server to implement session | |||
request control and exactly once semantics, and exchanged slot ID | request control and exactly once semantics, and exchanged slot ID | |||
maxima which are used to adjust the size of the reply cache. This | maxima which are used to adjust the size of the reply cache. This | |||
operation will appear once as the first operation in each CB_COMPOUND | operation will appear once as the first operation in each CB_COMPOUND | |||
request or a protocol error MUST result. See Section 18.46.3 for a | request or a protocol error MUST result. See Section 18.46.3 for a | |||
description of how slots are processed. | description of how slots are processed. | |||
If csa_cachethis is TRUE, then the server is requesting that the | If csa_cachethis is TRUE, then the server is requesting that the | |||
client cache the reply in the callback reply cache. The client MUST | client cache the reply in the callback reply cache. The client MUST | |||
cache the reply (see Section 2.10.5.1.3). | cache the reply (see Section 2.10.6.1.3). | |||
The csa_referring_call_lists array is the list of COMPOUND requests, | The csa_referring_call_lists array is the list of COMPOUND requests, | |||
identified by session ID, slot ID and sequence ID. These are | identified by session ID, slot ID and sequence ID. These are | |||
requests that the client previously sent to the server. These | requests that the client previously sent to the server. These | |||
previous requests created state that some operation(s) in the same | previous requests created state that some operation(s) in the same | |||
CB_COMPOUND as the csa_referring_call_lists are identifying. A | CB_COMPOUND as the csa_referring_call_lists are identifying. A | |||
session ID is included because leased state is tied to a client ID, | session ID is included because leased state is tied to a client ID, | |||
and a client ID can have multiple sessions. See Section 2.10.5.3. | and a client ID can have multiple sessions. See Section 2.10.6.3. | |||
The value of the csa_sequenceid argument relative to the cached | The value of the csa_sequenceid argument relative to the cached | |||
sequence ID on the slot falls into one of three cases. | sequence ID on the slot falls into one of three cases. | |||
o If the difference between csa_sequenceid and the client's cached | o If the difference between csa_sequenceid and the client's cached | |||
sequence ID at the slot ID is two (2) or more, or if | sequence ID at the slot ID is two (2) or more, or if | |||
csa_sequenceid is less than the cached sequence ID (accounting for | csa_sequenceid is less than the cached sequence ID (accounting for | |||
wraparound of the unsigned sequence ID value), then the client | wraparound of the unsigned sequence ID value), then the client | |||
MUST return NFS4ERR_SEQ_MISORDERED. | MUST return NFS4ERR_SEQ_MISORDERED. | |||
skipping to change at page 576, line 32 | skipping to change at page 582, line 32 | |||
of what it has already executed. The client MAY however detect the | of what it has already executed. The client MAY however detect the | |||
server's illegal reuse and return NFS4ERR_SEQ_FALSE_RETRY. | server's illegal reuse and return NFS4ERR_SEQ_FALSE_RETRY. | |||
If CB_SEQUENCE returns an error, then the state of the slot (sequence | If CB_SEQUENCE returns an error, then the state of the slot (sequence | |||
ID, cached reply) MUST NOT change. | ID, cached reply) MUST NOT change. | |||
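The sequence ID comparison and the rule that a failed CB_SEQUENCE leaves the slot untouched can be sketched as follows. Only the misordered case appears in full in this excerpt; the other two cases are assumed here to be the retry (equal sequence ID) and new-request (one greater) cases. The names classify_sequenceid and process_cb_sequence are hypothetical, and the 32-bit width of the sequence ID is taken from the "unsigned sequence ID" wording.

```python
UINT32_MOD = 2 ** 32


def classify_sequenceid(csa_sequenceid, cached_sequenceid):
    """Compare csa_sequenceid with the slot's cached sequence ID,
    accounting for wraparound of the unsigned 32-bit value."""
    delta = (csa_sequenceid - cached_sequenceid) % UINT32_MOD
    if delta == 0:
        return "RETRY"            # assumed case: same request resent
    if delta == 1:
        return "NEW_REQUEST"      # assumed case: next request on the slot
    # Two or more ahead, or behind the cached value.
    return "NFS4ERR_SEQ_MISORDERED"


def process_cb_sequence(slot, csa_sequenceid):
    """Update the slot only when the request is accepted as new."""
    kind = classify_sequenceid(csa_sequenceid, slot["sequenceid"])
    if kind == "NEW_REQUEST":
        slot["sequenceid"] = csa_sequenceid
        slot["reply"] = None      # to be filled in once the reply is built
    # On error (and on retry) the slot state is left unchanged.
    return kind
```

Using the modular difference means a value "less than" the cached ID shows up as a large delta, so a single comparison covers both halves of the misordered case.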
The client returns two "highest_slotid" values: csr_highest_slotid, | The client returns two "highest_slotid" values: csr_highest_slotid, | |||
and csr_target_highest_slotid. The former is the highest slot ID the | and csr_target_highest_slotid. The former is the highest slot ID the | |||
client will accept in a future CB_SEQUENCE operation, and SHOULD NOT | client will accept in a future CB_SEQUENCE operation, and SHOULD NOT | |||
be less than the value of csa_highest_slotid (but see | be less than the value of csa_highest_slotid (but see | |||
Section 2.10.5.1 for an exception). The latter is the highest slot | Section 2.10.6.1 for an exception). The latter is the highest slot | |||
ID the client would prefer the server use on a future CB_SEQUENCE | ID the client would prefer the server use on a future CB_SEQUENCE | |||
operation. | operation. | |||
20.10. Operation 12: CB_WANTS_CANCELLED - Cancel Pending Delegation | 20.10. Operation 12: CB_WANTS_CANCELLED - Cancel Pending Delegation | |||
Wants | Wants | |||
Retracts promise to signal delegation availability. | Retracts promise to signal delegation availability. | |||
20.10.1. ARGUMENT | 20.10.1. ARGUMENT | |||
skipping to change at page 582, line 49 | skipping to change at page 588, line 49 | |||
protection is any GETATTR for the fs_locations and | protection is any GETATTR for the fs_locations and | |||
fs_locations_info attributes. The attack has two steps. First | fs_locations_info attributes. The attack has two steps. First | |||
the attacker modifies the unprotected results of some operation to | the attacker modifies the unprotected results of some operation to | |||
return NFS4ERR_MOVED. Second, when the client follows up with a | return NFS4ERR_MOVED. Second, when the client follows up with a | |||
GETATTR for the fs_locations or fs_locations_info attributes, the | GETATTR for the fs_locations or fs_locations_info attributes, the | |||
attacker modifies the results to cause the client to migrate its | attacker modifies the results to cause the client to migrate its | |||
traffic to a server controlled by the attacker. | traffic to a server controlled by the attacker. | |||
Relative to previous NFS versions, NFSv4.1 has additional security | Relative to previous NFS versions, NFSv4.1 has additional security | |||
considerations for pNFS (see Section 12.9 and Section 13.12), locking | considerations for pNFS (see Section 12.9 and Section 13.12), locking | |||
and session state (see Section 2.10.7.3). | and session state (see Section 2.10.8.3). | |||
22. IANA Considerations | 22. IANA Considerations | |||
This section uses terms that are defined in [43]. | This section uses terms that are defined in [43]. | |||
22.1. Named Attribute Definitions | 22.1. Named Attribute Definitions | |||
IANA will create a registry called the "NFSv4 Named Attribute | IANA will create a registry called the "NFSv4 Named Attribute | |||
Definitions Registry". | Definitions Registry". | |||
End of changes. 169 change blocks. | ||||
520 lines changed or deleted | 800 lines changed or added | |||
This html diff was produced by rfcdiff 1.33. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ |