| draft-pre-ch5.txt | draft-ietf-nfsv4-minorversion1-20.txt | |||
|---|---|---|---|---|
| NFSv4 S. Shepler | NFSv4 S. Shepler | |||
| Internet-Draft M. Eisler | Internet-Draft M. Eisler | |||
| Intended status: Standards Track D. Noveck | Intended status: Standards Track D. Noveck | |||
| Expires: August 24, 2008 Editors | Expires: August 25, 2008 Editors | |||
| February 21, 2008 | February 22, 2008 | |||
| NFS Version 4 Minor Version 1 | NFS Version 4 Minor Version 1 | |||
| draft-ietf-nfsv4-minorversion1-20.txt | draft-ietf-nfsv4-minorversion1-20.txt | |||
| Status of this Memo | Status of this Memo | |||
| By submitting this Internet-Draft, each author represents that any | By submitting this Internet-Draft, each author represents that any | |||
| applicable patent or other IPR claims of which he or she is aware | applicable patent or other IPR claims of which he or she is aware | |||
| have been or will be disclosed, and any of which he or she becomes | have been or will be disclosed, and any of which he or she becomes | |||
| aware will be disclosed, in accordance with Section 6 of BCP 79. | aware will be disclosed, in accordance with Section 6 of BCP 79. | |||
| skipping to change at page 1, line 35 | skipping to change at page 1, line 35 | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
| This Internet-Draft will expire on August 24, 2008. | This Internet-Draft will expire on August 25, 2008. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (C) The IETF Trust (2008). | Copyright (C) The IETF Trust (2008). | |||
| Abstract | Abstract | |||
| This Internet-Draft describes NFS version 4 minor version one, | This Internet-Draft describes NFS version 4 minor version one, | |||
| including features retained from the base protocol and protocol | including features retained from the base protocol and protocol | |||
| extensions made subsequently. Major extensions introduced in NFS | extensions made subsequently. Major extensions introduced in NFS | |||
| skipping to change at page 3, line 6 | skipping to change at page 3, line 6 | |||
| 2.9. Transport Layers . . . . . . . . . . . . . . . . . . . . 37 | 2.9. Transport Layers . . . . . . . . . . . . . . . . . . . . 37 | |||
| 2.9.1. REQUIRED and RECOMMENDED Properties of Transports . 37 | 2.9.1. REQUIRED and RECOMMENDED Properties of Transports . 37 | |||
| 2.9.2. Client and Server Transport Behavior . . . . . . . . 37 | 2.9.2. Client and Server Transport Behavior . . . . . . . . 37 | |||
| 2.9.3. Ports . . . . . . . . . . . . . . . . . . . . . . . 39 | 2.9.3. Ports . . . . . . . . . . . . . . . . . . . . . . . 39 | |||
| 2.10. Session . . . . . . . . . . . . . . . . . . . . . . . . 39 | 2.10. Session . . . . . . . . . . . . . . . . . . . . . . . . 39 | |||
| 2.10.1. Motivation and Overview . . . . . . . . . . . . . . 39 | 2.10.1. Motivation and Overview . . . . . . . . . . . . . . 39 | |||
| 2.10.2. NFSv4 Integration . . . . . . . . . . . . . . . . . 40 | 2.10.2. NFSv4 Integration . . . . . . . . . . . . . . . . . 40 | |||
| 2.10.3. Channels . . . . . . . . . . . . . . . . . . . . . . 42 | 2.10.3. Channels . . . . . . . . . . . . . . . . . . . . . . 42 | |||
| 2.10.4. Trunking . . . . . . . . . . . . . . . . . . . . . . 43 | 2.10.4. Trunking . . . . . . . . . . . . . . . . . . . . . . 43 | |||
| 2.10.5. Exactly Once Semantics . . . . . . . . . . . . . . . 46 | 2.10.5. Exactly Once Semantics . . . . . . . . . . . . . . . 46 | |||
| 2.10.6. RDMA Considerations . . . . . . . . . . . . . . . . 58 | 2.10.6. RDMA Considerations . . . . . . . . . . . . . . . . 59 | |||
| 2.10.7. Sessions Security . . . . . . . . . . . . . . . . . 61 | 2.10.7. Sessions Security . . . . . . . . . . . . . . . . . 61 | |||
| 2.10.8. The SSV GSS Mechanism . . . . . . . . . . . . . . . 66 | 2.10.8. The SSV GSS Mechanism . . . . . . . . . . . . . . . 67 | |||
| 2.10.9. Session Mechanics - Steady State . . . . . . . . . . 71 | 2.10.9. Session Mechanics - Steady State . . . . . . . . . . 71 | |||
| 2.10.10. Session Mechanics - Recovery . . . . . . . . . . . . 72 | 2.10.10. Session Inactivity Timer . . . . . . . . . . . . . . 73 | |||
| 2.10.11. Parallel NFS and Sessions . . . . . . . . . . . . . 76 | 2.10.11. Session Mechanics - Recovery . . . . . . . . . . . . 73 | |||
| 2.10.12. Parallel NFS and Sessions . . . . . . . . . . . . . 76 | ||||
| 3. Protocol Constants and Data Types . . . . . . . . . . . . . . 76 | 3. Protocol Constants and Data Types . . . . . . . . . . . . . . 76 | |||
| 3.1. Basic Constants . . . . . . . . . . . . . . . . . . . . 76 | 3.1. Basic Constants . . . . . . . . . . . . . . . . . . . . 77 | |||
| 3.2. Basic Data Types . . . . . . . . . . . . . . . . . . . . 77 | 3.2. Basic Data Types . . . . . . . . . . . . . . . . . . . . 77 | |||
| 3.3. Structured Data Types . . . . . . . . . . . . . . . . . 79 | 3.3. Structured Data Types . . . . . . . . . . . . . . . . . 79 | |||
| 4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . . 88 | 4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . . 88 | |||
| 4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 88 | 4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 88 | |||
| 4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . 89 | 4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . 89 | |||
| 4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . 89 | 4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . 89 | |||
| 4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 89 | 4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 89 | |||
| 4.2.1. General Properties of a Filehandle . . . . . . . . . 90 | 4.2.1. General Properties of a Filehandle . . . . . . . . . 90 | |||
| 4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . 91 | 4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . 91 | |||
| 4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . 91 | 4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . 91 | |||
| skipping to change at page 6, line 39 | skipping to change at page 6, line 40 | |||
| 12.2.8. Layout . . . . . . . . . . . . . . . . . . . . . . . 263 | 12.2.8. Layout . . . . . . . . . . . . . . . . . . . . . . . 263 | |||
| 12.2.9. Layout Iomode . . . . . . . . . . . . . . . . . . . 263 | 12.2.9. Layout Iomode . . . . . . . . . . . . . . . . . . . 263 | |||
| 12.2.10. Device IDs . . . . . . . . . . . . . . . . . . . . . 264 | 12.2.10. Device IDs . . . . . . . . . . . . . . . . . . . . . 264 | |||
| 12.3. pNFS Operations . . . . . . . . . . . . . . . . . . . . 265 | 12.3. pNFS Operations . . . . . . . . . . . . . . . . . . . . 265 | |||
| 12.4. pNFS Attributes . . . . . . . . . . . . . . . . . . . . 266 | 12.4. pNFS Attributes . . . . . . . . . . . . . . . . . . . . 266 | |||
| 12.5. Layout Semantics . . . . . . . . . . . . . . . . . . . . 266 | 12.5. Layout Semantics . . . . . . . . . . . . . . . . . . . . 266 | |||
| 12.5.1. Guarantees Provided by Layouts . . . . . . . . . . . 266 | 12.5.1. Guarantees Provided by Layouts . . . . . . . . . . . 266 | |||
| 12.5.2. Getting a Layout . . . . . . . . . . . . . . . . . . 268 | 12.5.2. Getting a Layout . . . . . . . . . . . . . . . . . . 268 | |||
| 12.5.3. Layout Stateid . . . . . . . . . . . . . . . . . . . 269 | 12.5.3. Layout Stateid . . . . . . . . . . . . . . . . . . . 269 | |||
| 12.5.4. Committing a Layout . . . . . . . . . . . . . . . . 270 | 12.5.4. Committing a Layout . . . . . . . . . . . . . . . . 270 | |||
| 12.5.5. Recalling a Layout . . . . . . . . . . . . . . . . . 272 | 12.5.5. Recalling a Layout . . . . . . . . . . . . . . . . . 273 | |||
| 12.5.6. Revoking Layouts . . . . . . . . . . . . . . . . . . 279 | 12.5.6. Revoking Layouts . . . . . . . . . . . . . . . . . . 280 | |||
| 12.5.7. Metadata Server Write Propagation . . . . . . . . . 279 | 12.5.7. Metadata Server Write Propagation . . . . . . . . . 280 | |||
| 12.6. pNFS Mechanics . . . . . . . . . . . . . . . . . . . . . 280 | 12.6. pNFS Mechanics . . . . . . . . . . . . . . . . . . . . . 280 | |||
| 12.7. Recovery . . . . . . . . . . . . . . . . . . . . . . . . 281 | 12.7. Recovery . . . . . . . . . . . . . . . . . . . . . . . . 282 | |||
| 12.7.1. Recovery from Client Restart . . . . . . . . . . . . 282 | 12.7.1. Recovery from Client Restart . . . . . . . . . . . . 282 | |||
| 12.7.2. Dealing with Lease Expiration on the Client . . . . 282 | 12.7.2. Dealing with Lease Expiration on the Client . . . . 282 | |||
| 12.7.3. Dealing with Loss of Layout State on the Metadata | 12.7.3. Dealing with Loss of Layout State on the Metadata | |||
| Server . . . . . . . . . . . . . . . . . . . . . . . 283 | Server . . . . . . . . . . . . . . . . . . . . . . . 283 | |||
| 12.7.4. Recovery from Metadata Server Restart . . . . . . . 284 | 12.7.4. Recovery from Metadata Server Restart . . . . . . . 284 | |||
| 12.7.5. Operations During Metadata Server Grace Period . . . 286 | 12.7.5. Operations During Metadata Server Grace Period . . . 286 | |||
| 12.7.6. Storage Device Recovery . . . . . . . . . . . . . . 286 | 12.7.6. Storage Device Recovery . . . . . . . . . . . . . . 286 | |||
| 12.8. Metadata and Storage Device Roles . . . . . . . . . . . 286 | 12.8. Metadata and Storage Device Roles . . . . . . . . . . . 287 | |||
| 12.9. Security Considerations for pNFS . . . . . . . . . . . . 287 | 12.9. Security Considerations for pNFS . . . . . . . . . . . . 287 | |||
| 13. PNFS: NFSv4.1 File Layout Type . . . . . . . . . . . . . . . 288 | 13. PNFS: NFSv4.1 File Layout Type . . . . . . . . . . . . . . . 288 | |||
| 13.1. Client ID and Session Considerations . . . . . . . . . . 288 | 13.1. Client ID and Session Considerations . . . . . . . . . . 288 | |||
| 13.2. File Layout Definitions . . . . . . . . . . . . . . . . 290 | 13.1.1. Sessions Considerations for Data Servers . . . . . . 291 | |||
| 13.3. File Layout Data Types . . . . . . . . . . . . . . . . . 291 | 13.2. File Layout Definitions . . . . . . . . . . . . . . . . 291 | |||
| 13.4. Interpreting the File Layout . . . . . . . . . . . . . . 295 | 13.3. File Layout Data Types . . . . . . . . . . . . . . . . . 292 | |||
| 13.4.1. Determining the Stripe Unit Number . . . . . . . . . 295 | 13.4. Interpreting the File Layout . . . . . . . . . . . . . . 296 | |||
| 13.4.2. Interpreting the File Layout Using Sparse Packing . 295 | 13.4.1. Determining the Stripe Unit Number . . . . . . . . . 296 | |||
| 13.4.3. Interpreting the File Layout Using Dense Packing . . 298 | 13.4.2. Interpreting the File Layout Using Sparse Packing . 296 | |||
| 13.4.4. Sparse and Dense Stripe Unit Packing . . . . . . . . 300 | 13.4.3. Interpreting the File Layout Using Dense Packing . . 299 | |||
| 13.5. Data Server Multipathing . . . . . . . . . . . . . . . . 302 | 13.4.4. Sparse and Dense Stripe Unit Packing . . . . . . . . 301 | |||
| 13.6. Operations Sent to NFSv4.1 Data Servers . . . . . . . . 303 | 13.5. Data Server Multipathing . . . . . . . . . . . . . . . . 303 | |||
| 13.7. COMMIT Through Metadata Server . . . . . . . . . . . . . 305 | 13.6. Operations Sent to NFSv4.1 Data Servers . . . . . . . . 304 | |||
| 13.8. The Layout Iomode . . . . . . . . . . . . . . . . . . . 307 | 13.7. COMMIT Through Metadata Server . . . . . . . . . . . . . 306 | |||
| 13.9. Metadata and Data Server State Coordination . . . . . . 307 | 13.8. The Layout Iomode . . . . . . . . . . . . . . . . . . . 308 | |||
| 13.9.1. Global Stateid Requirements . . . . . . . . . . . . 307 | 13.9. Metadata and Data Server State Coordination . . . . . . 308 | |||
| 13.9.2. Data Server State Propagation . . . . . . . . . . . 308 | 13.9.1. Global Stateid Requirements . . . . . . . . . . . . 308 | |||
| 13.10. Data Server Component File Size . . . . . . . . . . . . 310 | 13.9.2. Data Server State Propagation . . . . . . . . . . . 309 | |||
| 13.11. Layout Revocation and Fencing . . . . . . . . . . . . . 311 | 13.10. Data Server Component File Size . . . . . . . . . . . . 311 | |||
| 13.12. Security Considerations for the File Layout Type . . . . 311 | 13.11. Layout Revocation and Fencing . . . . . . . . . . . . . 312 | |||
| 14. Internationalization . . . . . . . . . . . . . . . . . . . . 312 | 13.12. Security Considerations for the File Layout Type . . . . 312 | |||
| 14.1. Stringprep profile for the utf8str_cs type . . . . . . . 313 | 14. Internationalization . . . . . . . . . . . . . . . . . . . . 313 | |||
| 14.2. Stringprep profile for the utf8str_cis type . . . . . . 315 | 14.1. Stringprep profile for the utf8str_cs type . . . . . . . 314 | |||
| 14.3. Stringprep profile for the utf8str_mixed type . . . . . 316 | 14.2. Stringprep profile for the utf8str_cis type . . . . . . 316 | |||
| 14.4. UTF-8 Capabilities . . . . . . . . . . . . . . . . . . . 318 | 14.3. Stringprep profile for the utf8str_mixed type . . . . . 317 | |||
| 14.5. UTF-8 Related Errors . . . . . . . . . . . . . . . . . . 318 | 14.4. UTF-8 Capabilities . . . . . . . . . . . . . . . . . . . 319 | |||
| 15. Error Values . . . . . . . . . . . . . . . . . . . . . . . . 319 | 14.5. UTF-8 Related Errors . . . . . . . . . . . . . . . . . . 319 | |||
| 15.1. Error Definitions . . . . . . . . . . . . . . . . . . . 319 | 15. Error Values . . . . . . . . . . . . . . . . . . . . . . . . 320 | |||
| 15.1.1. General Errors . . . . . . . . . . . . . . . . . . . 321 | 15.1. Error Definitions . . . . . . . . . . . . . . . . . . . 320 | |||
| 15.1.2. Filehandle Errors . . . . . . . . . . . . . . . . . 323 | 15.1.1. General Errors . . . . . . . . . . . . . . . . . . . 322 | |||
| 15.1.3. Compound Structure Errors . . . . . . . . . . . . . 324 | 15.1.2. Filehandle Errors . . . . . . . . . . . . . . . . . 324 | |||
| 15.1.4. File System Errors . . . . . . . . . . . . . . . . . 326 | 15.1.3. Compound Structure Errors . . . . . . . . . . . . . 325 | |||
| 15.1.5. State Management Errors . . . . . . . . . . . . . . 328 | 15.1.4. File System Errors . . . . . . . . . . . . . . . . . 327 | |||
| 15.1.6. Security Errors . . . . . . . . . . . . . . . . . . 329 | 15.1.5. State Management Errors . . . . . . . . . . . . . . 329 | |||
| 15.1.7. Name Errors . . . . . . . . . . . . . . . . . . . . 329 | 15.1.6. Security Errors . . . . . . . . . . . . . . . . . . 330 | |||
| 15.1.8. Locking Errors . . . . . . . . . . . . . . . . . . . 330 | 15.1.7. Name Errors . . . . . . . . . . . . . . . . . . . . 330 | |||
| 15.1.9. Reclaim Errors . . . . . . . . . . . . . . . . . . . 331 | 15.1.8. Locking Errors . . . . . . . . . . . . . . . . . . . 331 | |||
| 15.1.10. pNFS Errors . . . . . . . . . . . . . . . . . . . . 332 | 15.1.9. Reclaim Errors . . . . . . . . . . . . . . . . . . . 332 | |||
| 15.1.11. Session Use Errors . . . . . . . . . . . . . . . . . 333 | 15.1.10. pNFS Errors . . . . . . . . . . . . . . . . . . . . 333 | |||
| 15.1.12. Session Management Errors . . . . . . . . . . . . . 334 | 15.1.11. Session Use Errors . . . . . . . . . . . . . . . . . 334 | |||
| 15.1.13. Client Management Errors . . . . . . . . . . . . . . 335 | 15.1.12. Session Management Errors . . . . . . . . . . . . . 335 | |||
| 15.1.14. Delegation Errors . . . . . . . . . . . . . . . . . 336 | 15.1.13. Client Management Errors . . . . . . . . . . . . . . 336 | |||
| 15.1.15. Attribute Handling Errors . . . . . . . . . . . . . 336 | 15.1.14. Delegation Errors . . . . . . . . . . . . . . . . . 337 | |||
| 15.1.16. Obsoleted Errors . . . . . . . . . . . . . . . . . . 337 | 15.1.15. Attribute Handling Errors . . . . . . . . . . . . . 337 | |||
| 15.2. Operations and their valid errors . . . . . . . . . . . 338 | 15.1.16. Obsoleted Errors . . . . . . . . . . . . . . . . . . 338 | |||
| 15.3. Callback operations and their valid errors . . . . . . . 354 | 15.2. Operations and their valid errors . . . . . . . . . . . 339 | |||
| 15.4. Errors and the operations that use them . . . . . . . . 356 | 15.3. Callback operations and their valid errors . . . . . . . 355 | |||
| 16. NFSv4.1 Procedures . . . . . . . . . . . . . . . . . . . . . 370 | 15.4. Errors and the operations that use them . . . . . . . . 357 | |||
| 16.1. Procedure 0: NULL - No Operation . . . . . . . . . . . . 370 | 16. NFSv4.1 Procedures . . . . . . . . . . . . . . . . . . . . . 371 | |||
| 16.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 371 | 16.1. Procedure 0: NULL - No Operation . . . . . . . . . . . . 371 | |||
| 17. Operations: REQUIRED, RECOMMENDED, or OPTIONAL . . . . . . . 381 | 16.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 372 | |||
| 18. NFSv4.1 Operations . . . . . . . . . . . . . . . . . . . . . 384 | 17. Operations: REQUIRED, RECOMMENDED, or OPTIONAL . . . . . . . 382 | |||
| 18.1. Operation 3: ACCESS - Check Access Rights . . . . . . . 384 | 18. NFSv4.1 Operations . . . . . . . . . . . . . . . . . . . . . 385 | |||
| 18.2. Operation 4: CLOSE - Close File . . . . . . . . . . . . 387 | 18.1. Operation 3: ACCESS - Check Access Rights . . . . . . . 385 | |||
| 18.3. Operation 5: COMMIT - Commit Cached Data . . . . . . . . 388 | 18.2. Operation 4: CLOSE - Close File . . . . . . . . . . . . 388 | |||
| 18.4. Operation 6: CREATE - Create a Non-Regular File Object . 391 | 18.3. Operation 5: COMMIT - Commit Cached Data . . . . . . . . 389 | |||
| 18.4. Operation 6: CREATE - Create a Non-Regular File Object . 392 | ||||
| 18.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting | 18.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting | |||
| Recovery . . . . . . . . . . . . . . . . . . . . . . . . 394 | Recovery . . . . . . . . . . . . . . . . . . . . . . . . 395 | |||
| 18.6. Operation 8: DELEGRETURN - Return Delegation . . . . . . 395 | 18.6. Operation 8: DELEGRETURN - Return Delegation . . . . . . 396 | |||
| 18.7. Operation 9: GETATTR - Get Attributes . . . . . . . . . 395 | 18.7. Operation 9: GETATTR - Get Attributes . . . . . . . . . 396 | |||
| 18.8. Operation 10: GETFH - Get Current Filehandle . . . . . . 397 | 18.8. Operation 10: GETFH - Get Current Filehandle . . . . . . 398 | |||
| 18.9. Operation 11: LINK - Create Link to a File . . . . . . . 398 | 18.9. Operation 11: LINK - Create Link to a File . . . . . . . 399 | |||
| 18.10. Operation 12: LOCK - Create Lock . . . . . . . . . . . . 400 | 18.10. Operation 12: LOCK - Create Lock . . . . . . . . . . . . 401 | |||
| 18.11. Operation 13: LOCKT - Test For Lock . . . . . . . . . . 404 | 18.11. Operation 13: LOCKT - Test For Lock . . . . . . . . . . 405 | |||
| 18.12. Operation 14: LOCKU - Unlock File . . . . . . . . . . . 406 | 18.12. Operation 14: LOCKU - Unlock File . . . . . . . . . . . 407 | |||
| 18.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . . . 407 | 18.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . . . 408 | |||
| 18.14. Operation 16: LOOKUPP - Lookup Parent Directory . . . . 409 | 18.14. Operation 16: LOOKUPP - Lookup Parent Directory . . . . 410 | |||
| 18.15. Operation 17: NVERIFY - Verify Difference in | 18.15. Operation 17: NVERIFY - Verify Difference in | |||
| Attributes . . . . . . . . . . . . . . . . . . . . . . . 410 | Attributes . . . . . . . . . . . . . . . . . . . . . . . 411 | |||
| 18.16. Operation 18: OPEN - Open a Regular File . . . . . . . . 411 | 18.16. Operation 18: OPEN - Open a Regular File . . . . . . . . 412 | |||
| 18.17. Operation 19: OPENATTR - Open Named Attribute | 18.17. Operation 19: OPENATTR - Open Named Attribute | |||
| Directory . . . . . . . . . . . . . . . . . . . . . . . 430 | Directory . . . . . . . . . . . . . . . . . . . . . . . 431 | |||
| 18.18. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access . 431 | 18.18. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access . 432 | |||
| 18.19. Operation 22: PUTFH - Set Current Filehandle . . . . . . 432 | 18.19. Operation 22: PUTFH - Set Current Filehandle . . . . . . 433 | |||
| 18.20. Operation 23: PUTPUBFH - Set Public Filehandle . . . . . 433 | 18.20. Operation 23: PUTPUBFH - Set Public Filehandle . . . . . 434 | |||
| 18.21. Operation 24: PUTROOTFH - Set Root Filehandle . . . . . 435 | 18.21. Operation 24: PUTROOTFH - Set Root Filehandle . . . . . 436 | |||
| 18.22. Operation 25: READ - Read from File . . . . . . . . . . 435 | 18.22. Operation 25: READ - Read from File . . . . . . . . . . 436 | |||
| 18.23. Operation 26: READDIR - Read Directory . . . . . . . . . 438 | 18.23. Operation 26: READDIR - Read Directory . . . . . . . . . 439 | |||
| 18.24. Operation 27: READLINK - Read Symbolic Link . . . . . . 441 | 18.24. Operation 27: READLINK - Read Symbolic Link . . . . . . 442 | |||
| 18.25. Operation 28: REMOVE - Remove File System Object . . . . 442 | 18.25. Operation 28: REMOVE - Remove File System Object . . . . 443 | |||
| 18.26. Operation 29: RENAME - Rename Directory Entry . . . . . 445 | 18.26. Operation 29: RENAME - Rename Directory Entry . . . . . 446 | |||
| 18.27. Operation 31: RESTOREFH - Restore Saved Filehandle . . . 448 | 18.27. Operation 31: RESTOREFH - Restore Saved Filehandle . . . 449 | |||
| 18.28. Operation 32: SAVEFH - Save Current Filehandle . . . . . 449 | 18.28. Operation 32: SAVEFH - Save Current Filehandle . . . . . 450 | |||
| 18.29. Operation 33: SECINFO - Obtain Available Security . . . 450 | 18.29. Operation 33: SECINFO - Obtain Available Security . . . 451 | |||
| 18.30. Operation 34: SETATTR - Set Attributes . . . . . . . . . 453 | 18.30. Operation 34: SETATTR - Set Attributes . . . . . . . . . 454 | |||
| 18.31. Operation 37: VERIFY - Verify Same Attributes . . . . . 456 | 18.31. Operation 37: VERIFY - Verify Same Attributes . . . . . 457 | |||
| 18.32. Operation 38: WRITE - Write to File . . . . . . . . . . 457 | 18.32. Operation 38: WRITE - Write to File . . . . . . . . . . 458 | |||
| 18.33. Operation 40: BACKCHANNEL_CTL - Backchannel control . . 462 | 18.33. Operation 40: BACKCHANNEL_CTL - Backchannel control . . 463 | |||
| 18.34. Operation 41: BIND_CONN_TO_SESSION . . . . . . . . . . . 463 | 18.34. Operation 41: BIND_CONN_TO_SESSION . . . . . . . . . . . 464 | |||
| 18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID . . . 466 | 18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID . . . 467 | |||
| 18.36. Operation 43: CREATE_SESSION - Create New Session and | 18.36. Operation 43: CREATE_SESSION - Create New Session and | |||
| Confirm Client ID . . . . . . . . . . . . . . . . . . . 482 | Confirm Client ID . . . . . . . . . . . . . . . . . . . 483 | |||
| 18.37. Operation 44: DESTROY_SESSION - Destroy existing | 18.37. Operation 44: DESTROY_SESSION - Destroy existing | |||
| session . . . . . . . . . . . . . . . . . . . . . . . . 492 | session . . . . . . . . . . . . . . . . . . . . . . . . 493 | |||
| 18.38. Operation 45: FREE_STATEID - Free stateid with no | 18.38. Operation 45: FREE_STATEID - Free stateid with no | |||
| locks . . . . . . . . . . . . . . . . . . . . . . . . . 494 | locks . . . . . . . . . . . . . . . . . . . . . . . . . 495 | |||
| 18.39. Operation 46: GET_DIR_DELEGATION - Get a directory | 18.39. Operation 46: GET_DIR_DELEGATION - Get a directory | |||
| delegation . . . . . . . . . . . . . . . . . . . . . . . 495 | delegation . . . . . . . . . . . . . . . . . . . . . . . 496 | |||
| 18.40. Operation 47: GETDEVICEINFO - Get Device Information . . 499 | 18.40. Operation 47: GETDEVICEINFO - Get Device Information . . 500 | |||
| 18.41. Operation 48: GETDEVICELIST - Get All Device Mappings | 18.41. Operation 48: GETDEVICELIST - Get All Device Mappings | |||
| for a File System . . . . . . . . . . . . . . . . . . . 501 | for a File System . . . . . . . . . . . . . . . . . . . 502 | |||
| 18.42. Operation 49: LAYOUTCOMMIT - Commit writes made using | 18.42. Operation 49: LAYOUTCOMMIT - Commit writes made using | |||
| a layout . . . . . . . . . . . . . . . . . . . . . . . . 503 | a layout . . . . . . . . . . . . . . . . . . . . . . . . 504 | |||
| 18.43. Operation 50: LAYOUTGET - Get Layout Information . . . . 506 | 18.43. Operation 50: LAYOUTGET - Get Layout Information . . . . 507 | |||
| 18.44. Operation 51: LAYOUTRETURN - Release Layout | 18.44. Operation 51: LAYOUTRETURN - Release Layout | |||
| Information . . . . . . . . . . . . . . . . . . . . . . 510 | Information . . . . . . . . . . . . . . . . . . . . . . 511 | |||
| 18.45. Operation 52: SECINFO_NO_NAME - Get Security on | 18.45. Operation 52: SECINFO_NO_NAME - Get Security on | |||
| Unnamed Object . . . . . . . . . . . . . . . . . . . . . 515 | Unnamed Object . . . . . . . . . . . . . . . . . . . . . 516 | |||
| 18.46. Operation 53: SEQUENCE - Supply per-procedure | 18.46. Operation 53: SEQUENCE - Supply per-procedure | |||
| sequencing and control . . . . . . . . . . . . . . . . . 516 | sequencing and control . . . . . . . . . . . . . . . . . 517 | |||
| 18.47. Operation 54: SET_SSV - Update SSV for a Client ID . . . 522 | 18.47. Operation 54: SET_SSV - Update SSV for a Client ID . . . 523 | |||
| 18.48. Operation 55: TEST_STATEID - Test stateids for | 18.48. Operation 55: TEST_STATEID - Test stateids for | |||
| validity . . . . . . . . . . . . . . . . . . . . . . . . 524 | validity . . . . . . . . . . . . . . . . . . . . . . . . 525 | |||
| 18.49. Operation 56: WANT_DELEGATION - Request Delegation . . . 526 | 18.49. Operation 56: WANT_DELEGATION - Request Delegation . . . 527 | |||
| 18.50. Operation 57: DESTROY_CLIENTID - Destroy existing | 18.50. Operation 57: DESTROY_CLIENTID - Destroy existing | |||
| client ID . . . . . . . . . . . . . . . . . . . . . . . 529 | client ID . . . . . . . . . . . . . . . . . . . . . . . 530 | |||
| 18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims | 18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims | |||
| Finished . . . . . . . . . . . . . . . . . . . . . . . . 530 | Finished . . . . . . . . . . . . . . . . . . . . . . . . 531 | |||
| 18.52. Operation 10044: ILLEGAL - Illegal operation . . . . . . 532 | 18.52. Operation 10044: ILLEGAL - Illegal operation . . . . . . 533 | |||
| 19. NFSv44.1 Callback Procedures . . . . . . . . . . . . . . . . 533 | 19. NFSv4.1 Callback Procedures . . . . . . . . . . . . . . . . . 534 | |||
| 19.1. Procedure 0: CB_NULL - No Operation . . . . . . . . . . 533 | 19.1. Procedure 0: CB_NULL - No Operation . . . . . . . . . . 534 | |||
| 19.2. Procedure 1: CB_COMPOUND - Compound Operations . . . . . 533 | 19.2. Procedure 1: CB_COMPOUND - Compound Operations . . . . . 534 | |||
| 20. NFSv4.1 Callback Operations . . . . . . . . . . . . . . . . . 538 | 20. NFSv4.1 Callback Operations . . . . . . . . . . . . . . . . . 539 | |||
| 20.1. Operation 3: CB_GETATTR - Get Attributes . . . . . . . . 538 | 20.1. Operation 3: CB_GETATTR - Get Attributes . . . . . . . . 539 | |||
| 20.2. Operation 4: CB_RECALL - Recall an Open Delegation . . . 539 | 20.2. Operation 4: CB_RECALL - Recall an Open Delegation . . . 540 | |||
| 20.3. Operation 5: CB_LAYOUTRECALL - Recall Layout from | 20.3. Operation 5: CB_LAYOUTRECALL - Recall Layout from | |||
| Client . . . . . . . . . . . . . . . . . . . . . . . . . 540 | Client . . . . . . . . . . . . . . . . . . . . . . . . . 541 | |||
| 20.4. Operation 6: CB_NOTIFY - Notify directory changes . . . 544 | 20.4. Operation 6: CB_NOTIFY - Notify directory changes . . . 545 | |||
| 20.5. Operation 7: CB_PUSH_DELEG - Offer Delegation to | 20.5. Operation 7: CB_PUSH_DELEG - Offer Delegation to | |||
| Client . . . . . . . . . . . . . . . . . . . . . . . . . 548 | Client . . . . . . . . . . . . . . . . . . . . . . . . . 549 | |||
| 20.6. Operation 8: CB_RECALL_ANY - Keep any N delegations . . 549 | 20.6. Operation 8: CB_RECALL_ANY - Keep any N delegations . . 550 | |||
| 20.7. Operation 9: CB_RECALLABLE_OBJ_AVAIL - Signal | 20.7. Operation 9: CB_RECALLABLE_OBJ_AVAIL - Signal | |||
| Resources for Recallable Objects . . . . . . . . . . . . 551 | Resources for Recallable Objects . . . . . . . . . . . . 552 | |||
| 20.8. Operation 10: CB_RECALL_SLOT - change flow control | 20.8. Operation 10: CB_RECALL_SLOT - change flow control | |||
| limits . . . . . . . . . . . . . . . . . . . . . . . . . 552 | limits . . . . . . . . . . . . . . . . . . . . . . . . . 553 | |||
| 20.9. Operation 11: CB_SEQUENCE - Supply backchannel | 20.9. Operation 11: CB_SEQUENCE - Supply backchannel | |||
| sequencing and control . . . . . . . . . . . . . . . . . 553 | sequencing and control . . . . . . . . . . . . . . . . . 554 | |||
| 20.10. Operation 12: CB_WANTS_CANCELLED - Cancel Pending | 20.10. Operation 12: CB_WANTS_CANCELLED - Cancel Pending | |||
| Delegation Wants . . . . . . . . . . . . . . . . . . . . 555 | Delegation Wants . . . . . . . . . . . . . . . . . . . . 556 | |||
| 20.11. Operation 13: CB_NOTIFY_LOCK - Notify of possible | 20.11. Operation 13: CB_NOTIFY_LOCK - Notify of possible | |||
| lock availability . . . . . . . . . . . . . . . . . . . 556 | lock availability . . . . . . . . . . . . . . . . . . . 557 | |||
| 20.12. Operation 14: CB_NOTIFY_DEVICEID - Notify device ID | 20.12. Operation 14: CB_NOTIFY_DEVICEID - Notify device ID | |||
| changes . . . . . . . . . . . . . . . . . . . . . . . . 558 | changes . . . . . . . . . . . . . . . . . . . . . . . . 559 | |||
| 20.13. Operation 10044: CB_ILLEGAL - Illegal Callback | 20.13. Operation 10044: CB_ILLEGAL - Illegal Callback | |||
| Operation . . . . . . . . . . . . . . . . . . . . . . . 560 | Operation . . . . . . . . . . . . . . . . . . . . . . . 561 | |||
| 21. Security Considerations . . . . . . . . . . . . . . . . . . . 560 | 21. Security Considerations . . . . . . . . . . . . . . . . . . . 561 | |||
| 22. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 562 | 22. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 563 | |||
| 22.1. Named Attribute Definitions . . . . . . . . . . . . . . 562 | 22.1. Named Attribute Definitions . . . . . . . . . . . . . . 563 | |||
| 22.2. ONC RPC Network Identifiers (netids) . . . . . . . . . . 562 | 22.2. ONC RPC Network Identifiers (netids) . . . . . . . . . . 563 | |||
| 22.3. Defining New Notifications . . . . . . . . . . . . . . . 563 | 22.3. Defining New Notifications . . . . . . . . . . . . . . . 564 | |||
| 22.4. Defining New Layout Types . . . . . . . . . . . . . . . 563 | 22.4. Defining New Layout Types . . . . . . . . . . . . . . . 564 | |||
| 22.5. Path Variable Definitions . . . . . . . . . . . . . . . 565 | 22.5. Path Variable Definitions . . . . . . . . . . . . . . . 566 | |||
| 22.5.1. Path Variable Values . . . . . . . . . . . . . . . . 565 | 22.5.1. Path Variable Values . . . . . . . . . . . . . . . . 566 | |||
| 22.5.2. Path Variable Names . . . . . . . . . . . . . . . . 565 | 22.5.2. Path Variable Names . . . . . . . . . . . . . . . . 566 | |||
| 23. References . . . . . . . . . . . . . . . . . . . . . . . . . 565 | 23. References . . . . . . . . . . . . . . . . . . . . . . . . . 566 | |||
| 23.1. Normative References . . . . . . . . . . . . . . . . . . 565 | 23.1. Normative References . . . . . . . . . . . . . . . . . . 566 | |||
| 23.2. Informative References . . . . . . . . . . . . . . . . . 567 | 23.2. Informative References . . . . . . . . . . . . . . . . . 568 | |||
| Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 568 | Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 569 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 570 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 571 | |||
| Intellectual Property and Copyright Statements . . . . . . . . . 572 | Intellectual Property and Copyright Statements . . . . . . . . . 573 | |||
| 1. Introduction | 1. Introduction | |||
| 1.1. The NFS Version 4 Minor Version 1 Protocol | 1.1. The NFS Version 4 Minor Version 1 Protocol | |||
| The NFS version 4 minor version 1 (NFSv4.1) protocol is the second | The NFS version 4 minor version 1 (NFSv4.1) protocol is the second | |||
| minor version of the NFS version 4 (NFSv4) protocol. The first minor | minor version of the NFS version 4 (NFSv4) protocol. The first minor | |||
| version, NFSv4.0, is described in [21]. It generally follows the | version, NFSv4.0, is described in [21]. It generally follows the | |||
| guidelines for the minor versioning model listed in Section 10 of RFC | guidelines for the minor versioning model listed in Section 10 of RFC | |||
| 3530. However, it diverges from guidelines 11 ("a client and server | 3530. However, it diverges from guidelines 11 ("a client and server | |||
| skipping to change at page 27, line 45 | skipping to change at page 27, line 45 | |||
| the client ID in order to conserve resources. If the client contacts | the client ID in order to conserve resources. If the client contacts | |||
| the server after this release, the server must ensure the client | the server after this release, the server must ensure the client | |||
| receives the appropriate error so that it will use the EXCHANGE_ID/ | receives the appropriate error so that it will use the EXCHANGE_ID/ | |||
| CREATE_SESSION sequence to establish a new client ID. The server | CREATE_SESSION sequence to establish a new client ID. The server | |||
| ought to be very hesitant to release a client ID since the resulting | ought to be very hesitant to release a client ID since the resulting | |||
| work on the client to recover from such an event will be the same | work on the client to recover from such an event will be the same | |||
| burden as if the server had failed and restarted. Typically a server | burden as if the server had failed and restarted. Typically a server | |||
| would not release a client ID unless there had been no activity from | would not release a client ID unless there had been no activity from | |||
| that client for many minutes. As long as there are sessions, opens, | that client for many minutes. As long as there are sessions, opens, | |||
| locks, delegations, layouts, or wants, the server MUST NOT release | locks, delegations, layouts, or wants, the server MUST NOT release | |||
| the client ID. See Section 2.10.10.1.4 for discussion on releasing | the client ID. See Section 2.10.11.1.4 for discussion on releasing | |||
| inactive sessions. | inactive sessions. | |||
| 2.4.3. Resolving Client Owner Conflicts | 2.4.3. Resolving Client Owner Conflicts | |||
| When the server gets an EXCHANGE_ID for a client owner that currently | When the server gets an EXCHANGE_ID for a client owner that currently | |||
| has no state, or that has state, but the lease has expired, the | has no state, or that has state, but the lease has expired, the | |||
| server MUST allow the EXCHANGE_ID, and confirm the new client ID if | server MUST allow the EXCHANGE_ID, and confirm the new client ID if | |||
| followed by the appropriate CREATE_SESSION. | followed by the appropriate CREATE_SESSION. | |||
| When the server gets an EXCHANGE_ID for a new incarnation of a client | When the server gets an EXCHANGE_ID for a new incarnation of a client | |||
| skipping to change at page 46, line 43 | skipping to change at page 46, line 43 | |||
| 2.10.5. Exactly Once Semantics | 2.10.5. Exactly Once Semantics | |||
| Via the session, NFSv4.1 offers Exactly Once Semantics (EOS) for | Via the session, NFSv4.1 offers Exactly Once Semantics (EOS) for | |||
| requests sent over a channel. EOS is supported on both the fore and | requests sent over a channel. EOS is supported on both the fore and | |||
| back channels. | back channels. | |||
| Each COMPOUND or CB_COMPOUND request that is sent with a leading | Each COMPOUND or CB_COMPOUND request that is sent with a leading | |||
| SEQUENCE or CB_SEQUENCE operation MUST be executed by the receiver | SEQUENCE or CB_SEQUENCE operation MUST be executed by the receiver | |||
| exactly once. This requirement holds regardless of whether the | exactly once. This requirement holds regardless of whether the | |||
| request is sent with reply caching specified (see | request is sent with reply caching specified (see | |||
| Section 2.10.5.1.2). The requirement holds even if the requester is | Section 2.10.5.1.3). The requirement holds even if the requester is | |||
| issuing the request over a session created between a pNFS data client | issuing the request over a session created between a pNFS data client | |||
| and pNFS data server. To understand the rationale for this | and pNFS data server. To understand the rationale for this | |||
| requirement, divide the requests into three classifications: | requirement, divide the requests into three classifications: | |||
| o Nonidempotent requests. | o Nonidempotent requests. | |||
| o Idempotent modifying requests. | o Idempotent modifying requests. | |||
| o Idempotent non-modifying requests. | o Idempotent non-modifying requests. | |||
| skipping to change at page 49, line 40 | skipping to change at page 49, line 40 | |||
| seen in the slot. Note that because the sequence id must | seen in the slot. Note that because the sequence id must | |||
| wraparound to zero (0) once it reaches 0xFFFFFFFF, a misordered | wraparound to zero (0) once it reaches 0xFFFFFFFF, a misordered | |||
| new request and a misordered retry cannot be distinguished. Thus, | new request and a misordered retry cannot be distinguished. Thus, | |||
| the replier MUST return NFS4ERR_SEQ_MISORDERED (as the result from | the replier MUST return NFS4ERR_SEQ_MISORDERED (as the result from | |||
| SEQUENCE or CB_SEQUENCE). | SEQUENCE or CB_SEQUENCE). | |||
| Unlike the XID, the slot id is always within a specific range; this | Unlike the XID, the slot id is always within a specific range; this | |||
| has two implications. The first implication is that for a given | has two implications. The first implication is that for a given | |||
| session, the replier need only cache the results of a limited number | session, the replier need only cache the results of a limited number | |||
| of COMPOUND requests. The second implication derives from the | of COMPOUND requests. The second implication derives from the | |||
| first, which is unlike XID-indexed reply caches (also known as | first, which is that unlike XID-indexed reply caches (also known as | |||
| duplicate request caches - DRCs), the slot id-based reply cache | duplicate request caches - DRCs), the slot id-based reply cache | |||
| cannot be overflowed. Through use of the sequence id to identify | cannot be overflowed. Through use of the sequence id to identify | |||
| retransmitted requests, the replier does not need to actually cache | retransmitted requests, the replier does not need to actually cache | |||
| the request itself, reducing the storage requirements of the reply | the request itself, reducing the storage requirements of the reply | |||
| cache further. These facilities make it practical to maintain all | cache further. These facilities make it practical to maintain all | |||
| the required entries for an effective reply cache. | the required entries for an effective reply cache. | |||
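The slot-indexed cache described above can be sketched as follows. This is a minimal illustration, not the draft's normative algorithm: the class and method names (SlotReplyCache, handle) are hypothetical, status codes are represented as plain strings, and the rule that a new request carries the cached sequence id plus one (modulo 2^32) while a retry carries the same sequence id is taken from the surrounding sections rather than from the paragraph itself. A retry on a slot that has no cached reply is treated as misordered here for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

SEQ_MODULUS = 1 << 32  # the sequence id wraps from 0xFFFFFFFF back to 0


@dataclass
class Slot:
    seqid: int = 0               # sequence id of the last request executed
    reply: Optional[str] = None  # cached reply for that request, if any


class SlotReplyCache:
    """Hypothetical per-session reply cache indexed by slot id."""

    def __init__(self, num_slots: int) -> None:
        # The number of slots bounds the cache, so it cannot overflow.
        self.slots: List[Slot] = [Slot() for _ in range(num_slots)]

    def handle(self, slotid: int, seqid: int,
               execute: Callable[[], str]) -> Tuple[str, Optional[str]]:
        if not 0 <= slotid < len(self.slots):
            return ("NFS4ERR_BADSLOT", None)
        slot = self.slots[slotid]
        if seqid == slot.seqid and slot.reply is not None:
            # Retry: replay the cached reply without executing again.
            return ("NFS4_OK", slot.reply)
        if seqid == (slot.seqid + 1) % SEQ_MODULUS:
            # New request: execute once and remember only the reply.
            slot.reply = execute()
            slot.seqid = seqid
            return ("NFS4_OK", slot.reply)
        # Neither a retry nor the next request in this slot.
        return ("NFS4ERR_SEQ_MISORDERED", None)
```

Only the reply and the sequence id are stored per slot; the request itself is never kept, which is the storage saving the paragraph refers to.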
| The slot id, sequence id, and sessionid therefore take over the | The slot id, sequence id, and sessionid therefore take over the | |||
| traditional role of the XID and source network address in the | traditional role of the XID and source network address in the | |||
| replier's reply cache implementation. This approach is considerably | replier's reply cache implementation. This approach is considerably | |||
| skipping to change at page 52, line 23 | skipping to change at page 52, line 23 | |||
| because the request may have been sent from the requester before | because the request may have been sent from the requester before | |||
| the update was received. Therefore, in the downward adjustment | the update was received. Therefore, in the downward adjustment | |||
| case, the replier may have to retain a number of reply cache | case, the replier may have to retain a number of reply cache | |||
| entries at least as large as the old value of maximum requests | entries at least as large as the old value of maximum requests | |||
| outstanding, until it can infer that the requester has seen a | outstanding, until it can infer that the requester has seen a | |||
| reply containing the new granted highest_slotid. The replier can | reply containing the new granted highest_slotid. The replier can | |||
| infer that the requester has seen such a reply when it receives a new | infer that the requester has seen such a reply when it receives a new | |||
| request with the same slotid as the request replied to and the | request with the same slotid as the request replied to and the | |||
| next higher sequenceid. | next higher sequenceid. | |||
| 2.10.5.1.1. Errors from SEQUENCE and CB_SEQUENCE | 2.10.5.1.1. Caching of SEQUENCE and CB_SEQUENCE Replies | |||
| When a SEQUENCE or CB_SEQUENCE operation is successfully executed, | ||||
| its reply MUST always be cached. Specifically, sessionid, | ||||
| sequenceid, and slotid MUST be cached in the reply cache. The reply | ||||
| from SEQUENCE also includes the highest slotid, target highest | ||||
| slotid, and status flags. The server SHOULD NOT cache these values, | ||||
| and instead SHOULD re-compute the values from the current state of | ||||
| the fore channel, session and/or client ID as appropriate. | ||||
| Similarly, the reply from CB_SEQUENCE includes a highest slotid and | ||||
| target highest slotid. The client SHOULD NOT cache these values, and | ||||
| SHOULD re-compute the values from the current state of the session as | ||||
| appropriate. | ||||
| 2.10.5.1.2. Errors from SEQUENCE and CB_SEQUENCE | ||||
| Any time SEQUENCE or CB_SEQUENCE returns an error, the sequence id of | Any time SEQUENCE or CB_SEQUENCE returns an error, the sequence id of | |||
| the slot MUST NOT change. The replier MUST NOT modify the reply | the slot MUST NOT change. The replier MUST NOT modify the reply | |||
| cache entry for the slot whenever an error is returned from SEQUENCE | cache entry for the slot whenever an error is returned from SEQUENCE | |||
| or CB_SEQUENCE. | or CB_SEQUENCE. | |||
| 2.10.5.1.2. Optional Reply Caching | 2.10.5.1.3. Optional Reply Caching | |||
| On a per-request basis the requester can choose to direct the replier | On a per-request basis the requester can choose to direct the replier | |||
| to cache the reply to all operations after the first operation | to cache the reply to all operations after the first operation | |||
| (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or csa_cachethis | (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or csa_cachethis | |||
| fields of the arguments to SEQUENCE or CB_SEQUENCE. The reason it | fields of the arguments to SEQUENCE or CB_SEQUENCE. The reason it | |||
| would not direct the replier to cache the entire reply is that the | would not direct the replier to cache the entire reply is that the | |||
| request is composed of all idempotent operations [24]. Caching the | request is composed of all idempotent operations [24]. Caching the | |||
| reply may offer little benefit. If the reply is too large (see | reply may offer little benefit. If the reply is too large (see | |||
| Section 2.10.5.4), it may not be cacheable anyway. Even if the reply | Section 2.10.5.4), it may not be cacheable anyway. Even if the reply | |||
| to an idempotent request is small enough to cache, unnecessarily caching | to an idempotent request is small enough to cache, unnecessarily caching | |||
| skipping to change at page 53, line 9 | skipping to change at page 53, line 23 | |||
| incremented by one. If a requester does not direct the replier to | incremented by one. If a requester does not direct the replier to | |||
| cache the reply, the replier MUST do one of the following: | cache the reply, the replier MUST do one of the following: | |||
| o The replier can cache the entire original reply. Even though | o The replier can cache the entire original reply. Even though | |||
| sa_cachethis or csa_cachethis are FALSE, the replier is always | sa_cachethis or csa_cachethis are FALSE, the replier is always | |||
| free to cache. It may choose this approach in order to simplify | free to cache. It may choose this approach in order to simplify | |||
| implementation. | implementation. | |||
| o The replier enters into its reply cache a reply consisting of the | o The replier enters into its reply cache a reply consisting of the | |||
| original results to the SEQUENCE or CB_SEQUENCE operation, and | original results to the SEQUENCE or CB_SEQUENCE operation, and | |||
| with the next operation in COMPOUND or CB)COMPOUND having the | with the next operation in COMPOUND or CB_COMPOUND having the | |||
| error NFS4ERR_RETRY_UNCACHED_REP. Thus if the requester later | error NFS4ERR_RETRY_UNCACHED_REP. Thus if the requester later | |||
| retries the request, it will get NFS4ERR_RETRY_UNCACHED_REP. | retries the request, it will get NFS4ERR_RETRY_UNCACHED_REP. | |||
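The two choices listed above can be sketched as a function that decides what the replier stores in the slot. This is a hedged illustration only: the function name reply_to_cache, the always_cache switch, and the representation of a reply as a list of (operation, status) pairs are assumptions made for the example, not part of the protocol.

```python
from typing import List, Tuple

OpResult = Tuple[str, str]  # (operation name, status), an illustrative shape


def reply_to_cache(reply: List[OpResult], cachethis: bool,
                   always_cache: bool = False) -> List[OpResult]:
    """Return the entry a replier would place in its reply cache.

    reply[0] is assumed to be the SEQUENCE or CB_SEQUENCE result.
    """
    if cachethis or always_cache or len(reply) < 2:
        # First choice: keep the entire original reply.
        return list(reply)
    # Second choice: keep the SEQUENCE result and mark the next
    # operation so that a later retry sees the uncached-reply error.
    next_op_name = reply[1][0]
    return [reply[0], (next_op_name, "NFS4ERR_RETRY_UNCACHED_REP")]
```

For example, a reply of SEQUENCE, PUTFH, GETATTR sent with sa_cachethis FALSE would be stored as the SEQUENCE result followed by PUTFH carrying NFS4ERR_RETRY_UNCACHED_REP.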
| 2.10.5.2. Retry and Replay of Reply | 2.10.5.2. Retry and Replay of Reply | |||
| A requester MUST NOT retry a request, unless the connection it used | A requester MUST NOT retry a request, unless the connection it used | |||
| to send the request disconnects. The requester can then reconnect | to send the request disconnects. The requester can then reconnect | |||
| and re-send the request, or it can re-send the request over a | and re-send the request, or it can re-send the request over a | |||
| different connection that is associated with the same session. | different connection that is associated with the same session. | |||
| skipping to change at page 56, line 11 | skipping to change at page 56, line 24 | |||
| If a reply exceeds ca_maxresponsesize, the reply will have the status | If a reply exceeds ca_maxresponsesize, the reply will have the status | |||
| NFS4ERR_REP_TOO_BIG. A replier MAY return NFS4ERR_REP_TOO_BIG as the | NFS4ERR_REP_TOO_BIG. A replier MAY return NFS4ERR_REP_TOO_BIG as the | |||
| status for the first operation (SEQUENCE or CB_SEQUENCE) in the request, | status for the first operation (SEQUENCE or CB_SEQUENCE) in the request, | |||
| or it MAY choose to return it on a subsequent operation (in the same | or it MAY choose to return it on a subsequent operation (in the same | |||
| COMPOUND or CB_COMPOUND reply). A replier MAY return | COMPOUND or CB_COMPOUND reply). A replier MAY return | |||
| NFS4ERR_REP_TOO_BIG in the reply to SEQUENCE or CB_SEQUENCE, even if | NFS4ERR_REP_TOO_BIG in the reply to SEQUENCE or CB_SEQUENCE, even if | |||
| the response would still exceed ca_maxresponsesize. | the response would still exceed ca_maxresponsesize. | |||
| If sa_cachethis or csa_cachethis are TRUE, then the replier MUST | If sa_cachethis or csa_cachethis are TRUE, then the replier MUST | |||
| cache a reply except if an error is returned by the SEQUENCE or | cache a reply except if an error is returned by the SEQUENCE or | |||
| CB_SEQUENCE operation (see Section 2.10.5.1.1). If the reply exceeds | CB_SEQUENCE operation (see Section 2.10.5.1.2). If the reply exceeds | |||
| ca_maxresponsesize_cached, (and sa_cachethis or csa_cachethis are | ca_maxresponsesize_cached, (and sa_cachethis or csa_cachethis are | |||
| TRUE) then the server MUST return NFS4ERR_REP_TOO_BIG_TO_CACHE. Even | TRUE) then the server MUST return NFS4ERR_REP_TOO_BIG_TO_CACHE. Even | |||
| if NFS4ERR_REP_TOO_BIG_TO_CACHE (or any other error for that matter) | if NFS4ERR_REP_TOO_BIG_TO_CACHE (or any other error for that matter) | |||
| is returned on an operation other than the first operation (SEQUENCE or | is returned on an operation other than the first operation (SEQUENCE or | |||
| CB_SEQUENCE), then the reply MUST be cached if sa_cachethis or | CB_SEQUENCE), then the reply MUST be cached if sa_cachethis or | |||
| csa_cachethis are TRUE. For example, if a COMPOUND has eleven | csa_cachethis are TRUE. For example, if a COMPOUND has eleven | |||
| operations, including SEQUENCE, the fifth operation is a RENAME, and | operations, including SEQUENCE, the fifth operation is a RENAME, and | |||
| the tenth operation is a READ for one million bytes, the server may | the tenth operation is a READ for one million bytes, the server may | |||
| return NFS4ERR_REP_TOO_BIG_TO_CACHE on the tenth operation. Since | return NFS4ERR_REP_TOO_BIG_TO_CACHE on the tenth operation. Since | |||
| the server executed several operations, especially the non-idempotent | the server executed several operations, especially the non-idempotent | |||
| skipping to change at page 71, line 18 | skipping to change at page 71, line 27 | |||
| Section 5.2.2 "Context Creation Requests" in [4]). | Section 5.2.2 "Context Creation Requests" in [4]). | |||
| 2.10.9. Session Mechanics - Steady State | 2.10.9. Session Mechanics - Steady State | |||
| 2.10.9.1. Obligations of the Server | 2.10.9.1. Obligations of the Server | |||
| The server has the primary obligation to monitor the state of | The server has the primary obligation to monitor the state of | |||
| backchannel resources that the client has created for the server | backchannel resources that the client has created for the server | |||
| (RPCSEC_GSS contexts and backchannel connections). If these | (RPCSEC_GSS contexts and backchannel connections). If these | |||
| resources vanish, the server takes action as specified in | resources vanish, the server takes action as specified in | |||
| Section 2.10.10.2. | Section 2.10.11.2. | |||
| 2.10.9.2. Obligations of the Client | 2.10.9.2. Obligations of the Client | |||
| The client SHOULD honor the following obligations in order to utilize | The client SHOULD honor the following obligations in order to utilize | |||
| the session: | the session: | |||
| o Keep a necessary session from going idle on the server. A client | o Keep a necessary session from going idle on the server. A client | |||
| that requires a session, but nonetheless is not sending operations | that requires a session, but nonetheless is not sending operations | |||
| risks having the session be destroyed by the server. This is | risks having the session be destroyed by the server. This is | |||
| because sessions consume resources, and resource limitations may | because sessions consume resources, and resource limitations may | |||
| force the server to cull an inactive session. | force the server to cull an inactive session. A server MAY | |||
| consider a session to be inactive if the client has not used the | ||||
| session before the session inactivity timer (Section 2.10.10) has | ||||
| expired. | ||||
| o Destroy the session when not needed. If a client has multiple | o Destroy the session when not needed. If a client has multiple | |||
| sessions, one of which has no requests waiting for replies, and | sessions, one of which has no requests waiting for replies, and | |||
| has been idle for some period of time, it SHOULD destroy the | has been idle for some period of time, it SHOULD destroy the | |||
| session. | session. | |||
| o Maintain GSS contexts for the backchannel. If the client requires | o Maintain GSS contexts for the backchannel. If the client requires | |||
| the server to use the RPCSEC_GSS security flavor for callbacks, | the server to use the RPCSEC_GSS security flavor for callbacks, | |||
| then it needs to be sure the contexts handed to the server via | then it needs to be sure the contexts handed to the server via | |||
| BACKCHANNEL_CTL are unexpired. | BACKCHANNEL_CTL are unexpired. | |||
| skipping to change at page 72, line 47 | skipping to change at page 73, line 9 | |||
| If the client wants to use additional connections for the | If the client wants to use additional connections for the | |||
| backchannel, then it must call BIND_CONN_TO_SESSION on each | backchannel, then it must call BIND_CONN_TO_SESSION on each | |||
| connection it wants to use with the session. If the client wants to | connection it wants to use with the session. If the client wants to | |||
| use additional connections for the fore channel, then it must call | use additional connections for the fore channel, then it must call | |||
| BIND_CONN_TO_SESSION if it specified SP4_SSV or SP4_MACH_CRED state | BIND_CONN_TO_SESSION if it specified SP4_SSV or SP4_MACH_CRED state | |||
| protection when the client ID was created. | protection when the client ID was created. | |||
| At this point the session has reached steady state. | At this point the session has reached steady state. | |||
| 2.10.10. Session Mechanics - Recovery | 2.10.10. Session Inactivity Timer | |||
| 2.10.10.1. Events Requiring Client Action | The server MAY maintain a session inactivity timer for each session. | |||
| If the session inactivity timer expires, then the server MAY destroy | ||||
| the session. To avoid losing a session due to inactivity, the client | ||||
| MUST renew the session inactivity timer. The length of the session | ||||
| inactivity timer MUST NOT be less than the lease_time attribute | ||||
| (Section 5.7.1.11). As with lease renewal (Section 8.3), when the | ||||
| server receives a SEQUENCE operation, it resets the session | ||||
| inactivity timer, and MUST NOT allow the timer to expire while the | ||||
| rest of the operations in the COMPOUND procedure's request are still | ||||
| executing. Once the last operation has finished, the server MUST set | ||||
| the session inactivity timer to expire no sooner than the sum of the | ||||
| current time and the value of the lease_time attribute. | ||||
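The timer rules in the new Section 2.10.10 can be sketched as below. This is an illustration under stated assumptions, not a definitive implementation: the class SessionInactivityTimer, its method names, and the use of caller-supplied timestamps in seconds are all hypothetical, and a real server would tie this to its own clock and request dispatch.

```python
class SessionInactivityTimer:
    """Hypothetical per-session inactivity timer (times in seconds)."""

    def __init__(self, lease_time: float, timeout: float, now: float) -> None:
        # The timer length must not be less than the lease_time attribute.
        if timeout < lease_time:
            raise ValueError("timeout must be at least lease_time")
        self.lease_time = lease_time
        self.timeout = timeout
        self.expires_at = now + timeout
        self.in_progress = 0  # COMPOUNDs currently executing on the session

    def on_sequence(self, now: float) -> None:
        # Receiving SEQUENCE resets the timer and marks a COMPOUND as running.
        self.in_progress += 1
        self.expires_at = now + self.timeout

    def on_compound_done(self, now: float) -> None:
        # After the last operation, expiry is no sooner than now + lease_time.
        self.in_progress -= 1
        self.expires_at = max(self.expires_at, now + self.lease_time)

    def expired(self, now: float) -> bool:
        # The timer never expires while a COMPOUND is still executing.
        return self.in_progress == 0 and now >= self.expires_at
```

With lease_time 90 and timeout 120, a COMPOUND that starts at time 10 and finishes at time 400 leaves the session safe from expiry until time 490.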
| 2.10.11. Session Mechanics - Recovery | ||||
| 2.10.11.1. Events Requiring Client Action | ||||
| The following events require client action to recover. | The following events require client action to recover. | |||
| 2.10.10.1.1. RPCSEC_GSS Context Loss by Callback Path | 2.10.11.1.1. RPCSEC_GSS Context Loss by Callback Path | |||
| If all RPCSEC_GSS contexts granted by the client to the server for | If all RPCSEC_GSS contexts granted by the client to the server for | |||
| callback use have expired, the client MUST establish a new context | callback use have expired, the client MUST establish a new context | |||
| via BACKCHANNEL_CTL. The sr_status_flags field of the SEQUENCE | via BACKCHANNEL_CTL. The sr_status_flags field of the SEQUENCE | |||
| results indicates when callback contexts are nearly expired, or fully | results indicates when callback contexts are nearly expired, or fully | |||
| expired (see Section 18.46.3). | expired (see Section 18.46.3). | |||
| 2.10.10.1.2. Connection Loss | 2.10.11.1.2. Connection Loss | |||
| If the client loses the last connection of the session, and if it wants | If the client loses the last connection of the session, and if it wants | |||
| to retain the session, then it must create a new connection, and if, | to retain the session, then it must create a new connection, and if, | |||
| when the client ID was created, BIND_CONN_TO_SESSION was specified in | when the client ID was created, BIND_CONN_TO_SESSION was specified in | |||
| the spo_must_enforce list, the client MUST use BIND_CONN_TO_SESSION | the spo_must_enforce list, the client MUST use BIND_CONN_TO_SESSION | |||
| to associate the connection with the session. | to associate the connection with the session. | |||
| If there was a request outstanding at the time of the connection | If there was a request outstanding at the time of the connection | |||
| loss, then if the client wants to continue to use the session it MUST | loss, then if the client wants to continue to use the session it MUST | |||
| retry the request, as described in Section 2.10.5.2. Note that it is | retry the request, as described in Section 2.10.5.2. Note that it is | |||
| skipping to change at page 73, line 39 | skipping to change at page 74, line 16 | |||
| disconnect. | disconnect. | |||
| If the connection that was lost was the last one associated with the | If the connection that was lost was the last one associated with the | |||
| backchannel, and the client wants to retain the backchannel and/or | backchannel, and the client wants to retain the backchannel and/or | |||
| not put recallable state subject to revocation, the client must | not put recallable state subject to revocation, the client must | |||
| reconnect, and if it does, it MUST associate the connection to the | reconnect, and if it does, it MUST associate the connection to the | |||
| session and backchannel via BIND_CONN_TO_SESSION. The server SHOULD | session and backchannel via BIND_CONN_TO_SESSION. The server SHOULD | |||
| indicate when it has no callback connection via the sr_status_flags | indicate when it has no callback connection via the sr_status_flags | |||
| result from SEQUENCE. | result from SEQUENCE. | |||
| 2.10.10.1.3. Backchannel GSS Context Loss | 2.10.11.1.3. Backchannel GSS Context Loss | |||
| Via the sr_status_flags result of the SEQUENCE operation or other | Via the sr_status_flags result of the SEQUENCE operation or other | |||
| means, the client will learn if some or all of the RPCSEC_GSS | means, the client will learn if some or all of the RPCSEC_GSS | |||
| contexts it assigned to the backchannel have been lost. If the | contexts it assigned to the backchannel have been lost. If the | |||
| client wants to retain the backchannel and/or not put recallable | client wants to retain the backchannel and/or not put recallable | |||
| state subject to revocation, the client must use BACKCHANNEL_CTL | state subject to revocation, the client must use BACKCHANNEL_CTL | |||
| to assign new contexts. | to assign new contexts. | |||
| 2.10.10.1.4. Loss of Session | 2.10.11.1.4. Loss of Session | |||
| The replier might lose a record of the session. Causes include: | The replier might lose a record of the session. Causes include: | |||
| o Replier failure and restart | o Replier failure and restart | |||
| o A catastrophe that causes the reply cache to be corrupted or lost | o A catastrophe that causes the reply cache to be corrupted or lost | |||
| on the media it was stored on. This applies even if the replier | on the media it was stored on. This applies even if the replier | |||
| indicated in the CREATE_SESSION results that it would persist the | indicated in the CREATE_SESSION results that it would persist the | |||
| cache. | cache. | |||
| skipping to change at page 75, line 5 | skipping to change at page 75, line 27 | |||
| client ID; loss of client ID however does imply loss of session, | client ID; loss of client ID however does imply loss of session, | |||
| lock, open, delegation, and layout state. See Section 8.4.2. A | lock, open, delegation, and layout state. See Section 8.4.2. A | |||
| session can survive a server restart, but lock recovery may still be | session can survive a server restart, but lock recovery may still be | |||
| needed. | needed. | |||
| It is possible CREATE_SESSION will fail with NFS4ERR_STALE_CLIENTID | It is possible CREATE_SESSION will fail with NFS4ERR_STALE_CLIENTID | |||
| (for example the server restarts and does not preserve client ID | (for example the server restarts and does not preserve client ID | |||
| state). If so, the client needs to call EXCHANGE_ID, followed by | state). If so, the client needs to call EXCHANGE_ID, followed by | |||
| CREATE_SESSION. | CREATE_SESSION. | |||
| 2.10.10.2. Events Requiring Server Action | 2.10.11.2. Events Requiring Server Action | |||
| The following events require server action to recover. | The following events require server action to recover. | |||
| 2.10.10.2.1. Client Crash and Restart | 2.10.11.2.1. Client Crash and Restart | |||
| As described in Section 18.35, a restarted client sends EXCHANGE_ID | As described in Section 18.35, a restarted client sends EXCHANGE_ID | |||
| in such a way it causes the server to delete any sessions it had. | in such a way it causes the server to delete any sessions it had. | |||
| 2.10.10.2.2. Client Crash with No Restart | 2.10.11.2.2. Client Crash with No Restart | |||
| If a client crashes and never comes back, it will never send | If a client crashes and never comes back, it will never send | |||
| EXCHANGE_ID with its old client owner. Thus the server has session | EXCHANGE_ID with its old client owner. Thus the server has session | |||
| state that will never be used again. After an extended period of | state that will never be used again. After an extended period of | |||
| time and if the server has resource constraints, it MAY destroy the | time and if the server has resource constraints, it MAY destroy the | |||
| old session as well as locking state. | old session as well as locking state. | |||
| 2.10.10.2.3. Extended Network Partition | 2.10.11.2.3. Extended Network Partition | |||
| To the server, the extended network partition may be no different | To the server, the extended network partition may be no different | |||
| from a client crash with no restart (see Section 2.10.10.2.2). | from a client crash with no restart (see Section 2.10.11.2.2). | |||
| Unless the server can discern that there is a network partition, it | Unless the server can discern that there is a network partition, it | |||
| is free to treat the situation as if the client has crashed | is free to treat the situation as if the client has crashed | |||
| permanently. | permanently. | |||
| 2.10.10.2.4. Backchannel Connection Loss | 2.10.11.2.4. Backchannel Connection Loss | |||
| If there were callback requests outstanding at the time of a | If there were callback requests outstanding at the time of a | |||
| connection loss, then the server MUST retry the request, as described | connection loss, then the server MUST retry the request, as described | |||
| in Section 2.10.5.2. Note that it is not necessary to retry requests | in Section 2.10.5.2. Note that it is not necessary to retry requests | |||
| over a connection with the same source network address or the same | over a connection with the same source network address or the same | |||
| destination network address as the lost connection. As long as the | destination network address as the lost connection. As long as the | |||
| sessionid, slot id, and sequence id in the retry match that of the | sessionid, slot id, and sequence id in the retry match that of the | |||
| original request, the callback target will recognize the request as a | original request, the callback target will recognize the request as a | |||
| retry even if it did see the request prior to disconnect. | retry even if it did see the request prior to disconnect. | |||
| If the connection lost is the last one associated with the | If the connection lost is the last one associated with the | |||
| backchannel, then the server MUST indicate that in the | backchannel, then the server MUST indicate that in the | |||
| sr_status_flags field of every SEQUENCE reply until the backchannel | sr_status_flags field of every SEQUENCE reply until the backchannel | |||
| is reestablished. There are two situations each of which use | is reestablished. There are two situations each of which use | |||
| different status flags: no connectivity for the session's | different status flags: no connectivity for the session's | |||
| backchannel, and no connectivity for any session backchannel of the | backchannel, and no connectivity for any session backchannel of the | |||
| client. See Section 18.46 for a description of the appropriate flags | client. See Section 18.46 for a description of the appropriate flags | |||
| in sr_status_flags. | in sr_status_flags. | |||
| 2.10.10.2.5. GSS Context Loss | 2.10.11.2.5. GSS Context Loss | |||
| The server SHOULD monitor when the number of RPCSEC_GSS contexts | The server SHOULD monitor when the number of RPCSEC_GSS contexts | |||
| assigned to the backchannel reaches one, and when that one context is | assigned to the backchannel reaches one, and when that one context is | |||
| near expiry (i.e. between one and two periods of lease time), | near expiry (i.e. between one and two periods of lease time), | |||
| indicate so in the sr_status_flags field of all SEQUENCE replies. | indicate so in the sr_status_flags field of all SEQUENCE replies. | |||
| The server MUST indicate when all of the backchannel's assigned | The server MUST indicate when all of the backchannel's assigned | |||
| RPCSEC_GSS contexts have expired in the sr_status_flags field of all | RPCSEC_GSS contexts have expired in the sr_status_flags field of all | |||
| SEQUENCE replies. | SEQUENCE replies. | |||
| 2.10.11. Parallel NFS and Sessions | 2.10.12. Parallel NFS and Sessions | |||
| A client and server can potentially be a non-pNFS implementation, a | A client and server can potentially be a non-pNFS implementation, a | |||
| metadata server implementation, a data server implementation, or two | metadata server implementation, a data server implementation, or two | |||
| or three types of implementations. The EXCHGID4_FLAG_USE_NON_PNFS, | or three types of implementations. The EXCHGID4_FLAG_USE_NON_PNFS, | |||
| EXCHGID4_FLAG_USE_PNFS_MDS, and EXCHGID4_FLAG_USE_PNFS_DS flags (not | EXCHGID4_FLAG_USE_PNFS_MDS, and EXCHGID4_FLAG_USE_PNFS_DS flags (not | |||
| mutually exclusive) are passed in the EXCHANGE_ID arguments and | mutually exclusive) are passed in the EXCHANGE_ID arguments and | |||
| results to allow the client to indicate how it wants to use sessions | results to allow the client to indicate how it wants to use sessions | |||
| created under the client ID, and to allow the server to indicate how | created under the client ID, and to allow the server to indicate how | |||
| it will allow the sessions to be used. See Section 13.1 for pNFS | it will allow the sessions to be used. See Section 13.1 for pNFS | |||
| sessions considerations. | sessions considerations. | |||
| skipping to change at page 94, line 32 | skipping to change at page 94, line 32 | |||
| server supports and construct requests with only those supported | server supports and construct requests with only those supported | |||
| attributes (or a subset thereof). | attributes (or a subset thereof). | |||
| To this end, attributes are divided into three groups: REQUIRED, | To this end, attributes are divided into three groups: REQUIRED, | |||
| RECOMMENDED, and named. Both REQUIRED and RECOMMENDED attributes are | RECOMMENDED, and named. Both REQUIRED and RECOMMENDED attributes are | |||
| supported in the NFSv4.1 protocol by a specific and well-defined | supported in the NFSv4.1 protocol by a specific and well-defined | |||
| encoding and are identified by number. They are requested by setting | encoding and are identified by number. They are requested by setting | |||
| a bit in the bit vector sent in the GETATTR request; the server | a bit in the bit vector sent in the GETATTR request; the server | |||
| response includes a bit vector to list what attributes were returned | response includes a bit vector to list what attributes were returned | |||
| in the response. New REQUIRED or RECOMMENDED attributes may be added | in the response. New REQUIRED or RECOMMENDED attributes may be added | |||
| to the NFS protocol between major revisions by publishing a | to the NFSv4 protocol as part of a new minor version by publishing a | |||
| standards-track RFC which allocates a new attribute number value and | standards-track RFC which allocates a new attribute number value and | |||
| defines the encoding for the attribute. See Section 2.7 for further | defines the encoding for the attribute. See Section 2.7 for further | |||
| discussion. | discussion. | |||
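
As a rough illustration of requesting attributes by number, the sketch below packs attribute numbers into a bit vector and unpacks them again. It assumes the bit vector is a sequence of 32-bit words in which attribute n occupies bit (n mod 32) of word (n div 32); that layout is not stated in the text above. The function names are hypothetical. The attribute numbers used in the test (16, 17, 55) are the ones listed later in this section.

```python
from typing import Iterable, List, Set


def attrs_to_bitmap(attr_numbers: Iterable[int]) -> List[int]:
    """Pack attribute numbers into a list of 32-bit words (assumed layout)."""
    words: List[int] = []
    for n in attr_numbers:
        word, bit = divmod(n, 32)
        while len(words) <= word:
            words.append(0)          # grow the vector as needed
        words[word] |= 1 << bit
    return words


def bitmap_to_attrs(words: List[int]) -> Set[int]:
    """Recover the set of attribute numbers from a bit vector."""
    return {
        index * 32 + bit
        for index, word in enumerate(words)
        for bit in range(32)
        if word & (1 << bit)
    }


def restrict_to_supported(wanted: Iterable[int],
                          supported: Iterable[int]) -> List[int]:
    """Keep only the attributes the server reports as supported."""
    return sorted(set(wanted) & set(supported))
```

A client following the advice above would first fetch the supported set, apply something like `restrict_to_supported`, and only then build the request bit vector.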
| Named attributes are accessed by the new OPENATTR operation, which | Named attributes are accessed by the new OPENATTR operation, which | |||
| accesses a hidden directory of attributes associated with a file | accesses a hidden directory of attributes associated with a file | |||
| system object. OPENATTR takes a filehandle for the object and | system object. OPENATTR takes a filehandle for the object and | |||
| returns the filehandle for the attribute hierarchy. The filehandle | returns the filehandle for the attribute hierarchy. The filehandle | |||
| for the named attributes is a directory object accessible by LOOKUP | for the named attributes is a directory object accessible by LOOKUP | |||
| or READDIR and contains files whose names represent the named | or READDIR and contains files whose names represent the named | |||
| skipping to change at page 95, line 37 | skipping to change at page 95, line 37 | |||
| Note that the hidden directory returned by OPENATTR is a convenience | Note that the hidden directory returned by OPENATTR is a convenience | |||
| for protocol processing. The client should not make any assumptions | for protocol processing. The client should not make any assumptions | |||
| about the server's implementation of named attributes and whether the | about the server's implementation of named attributes and whether the | |||
| underlying file system at the server has a named attribute directory | underlying file system at the server has a named attribute directory | |||
| or not. Therefore, operations such as SETATTR and GETATTR on the | or not. Therefore, operations such as SETATTR and GETATTR on the | |||
| named attribute directory are undefined. | named attribute directory are undefined. | |||
| 5.1. REQUIRED Attributes | 5.1. REQUIRED Attributes | |||
| These MUST be supported by every NFSv4.1 client and server in order | These MUST be supported by every NFSv4.1 client and server in order | |||
| to ensure a minimum level of interoperability. The server must store | to ensure a minimum level of interoperability. The server MUST store | |||
| and return these attributes and the client must be able to function | and return these attributes and the client MUST be able to function | |||
| with an attribute set limited to these attributes. With just the | with an attribute set limited to these attributes. With just the | |||
| REQUIRED attributes some client functionality may be impaired or | REQUIRED attributes some client functionality may be impaired or | |||
| limited in some ways. A client may ask for any of these attributes | limited in some ways. A client may ask for any of these attributes | |||
| to be returned by setting a bit in the GETATTR request and the server | to be returned by setting a bit in the GETATTR request and the server | |||
| must return their value. | must return their value. | |||
| 5.2. RECOMMENDED Attributes | 5.2. RECOMMENDED Attributes | |||
| These attributes are understood well enough to warrant support in the | These attributes are understood well enough to warrant support in the | |||
| NFSv4.1 protocol. However, they may not be supported on all clients | NFSv4.1 protocol. However, they may not be supported on all clients | |||
| and servers. A client may ask for any of these attributes to be | and servers. A client may ask for any of these attributes to be | |||
| returned by setting a bit in the GETATTR request but must handle the | returned by setting a bit in the GETATTR request but must handle the | |||
| case where the server does not return them. A client may ask for the | case where the server does not return them. A client may ask for the | |||
| set of attributes the server supports and should not request | set of attributes the server supports and SHOULD NOT request | |||
| attributes the server does not support. A server should be tolerant | attributes the server does not support. A server should be tolerant | |||
| of requests for unsupported attributes and simply not return them | of requests for unsupported attributes and simply not return them | |||
| rather than considering the request an error. It is expected that | rather than considering the request an error. It is expected that | |||
| servers will support all attributes they comfortably can and only | servers will support all attributes they comfortably can and only | |||
| fail to support attributes which are difficult to support in their | fail to support attributes which are difficult to support in their | |||
| operating environments. A server should provide attributes whenever | operating environments. A server should provide attributes whenever | |||
| they don't have to "tell lies" to the client. For example, a file | they don't have to "tell lies" to the client. For example, a file | |||
| modification time should be either an accurate time or should not be | modification time should be either an accurate time or should not be | |||
| supported by the server. This will not always be comfortable to | supported by the server. This will not always be comfortable to | |||
| clients but the client is better positioned to decide whether and how to | clients but the client is better positioned to decide whether and how to | |||
| skipping to change at page 97, line 5 | skipping to change at page 97, line 5 | |||
| of delegations (in the case of the named attribute directory these | of delegations (in the case of the named attribute directory these | |||
| will be directory delegations). However, since granting of | will be directory delegations). However, since granting of | |||
| delegations or not is within the server's discretion, a server need | delegations or not is within the server's discretion, a server need | |||
| not support delegations on named attributes or the named attribute | not support delegations on named attributes or the named attribute | |||
| directory. | directory. | |||
| It is RECOMMENDED that servers support arbitrary named attributes. A | It is RECOMMENDED that servers support arbitrary named attributes. A | |||
| client should not depend on the ability to store any named attributes | client should not depend on the ability to store any named attributes | |||
| in the server's file system. If a server does support named | in the server's file system. If a server does support named | |||
| attributes, a client which is also able to handle them should be able | attributes, a client which is also able to handle them should be able | |||
| to copy a file's data and meta-data with complete transparency from | to copy a file's data and metadata with complete transparency from | |||
| one location to another; this would imply that names allowed for | one location to another; this would imply that names allowed for | |||
| regular directory entries are valid for named attribute names as | regular directory entries are valid for named attribute names as | |||
| well. | well. | |||
| In NFSv4.1, the structure of named attribute directories is | In NFSv4.1, the structure of named attribute directories is | |||
| restricted in a number of ways, in order to prevent the development | restricted in a number of ways, in order to prevent the development | |||
| of non-interoperable implementations in which some servers support a | of non-interoperable implementations in which some servers support a | |||
| fully general hierarchical directory structure for named attributes | fully general hierarchical directory structure for named attributes | |||
| while others support a limited set, but fully adequate to the | while others support a limited set, but fully adequate to the | |||
| feature's goals. In such an environment, clients or applications | feature's goals. In such an environment, clients or applications | |||
| might come to depend on non-portable extensions. The restrictions | might come to depend on non-portable extensions. The restrictions | |||
| are: | are: | |||
| o CREATE is not allowed in a named attribute directory. Thus, such | o CREATE is not allowed in a named attribute directory. Thus, such | |||
| objects as symbolic links and special files are not allowed to be | objects as symbolic links and special files are not allowed to be | |||
| named attributes. Further, directories may not be created in a | named attributes. Further, directories may not be created in a | |||
| named attribute directory so no hierarchical structure of named | named attribute directory so no hierarchical structure of named | |||
| attributes for a single object is allowed. | attributes for a single object is allowed. | |||
| o OPENATTR many not be done on a named attribute directory or on a | o OPENATTR MUST NOT be done on a named attribute directory or on a | |||
| named attribute. Thus, although these object have attributes, | named attribute. | |||
| they may not may named attributes. | ||||
| o Doing a RENAME of a named attribute to a different named attribute | o Doing a RENAME of a named attribute to a different named attribute | |||
| directory or to an ordinary (i.e. non-named-attribute) directory | directory or to an ordinary (i.e. non-named-attribute) directory | |||
| is not allowed. | is not allowed. | |||
| o Creating hard links between names attribute directories or between | o Creating hard links between named attribute directories or between | |||
| named attribute directories and ordinary directories is not | named attribute directories and ordinary directories is not | |||
| allowed. | allowed. | |||
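
The restrictions listed above can be read as a small set of predicates. The following is only a sketch: the `Kind` enumeration and the function names are hypothetical, and a real server would make these checks against its own object model. Hard links within a single named attribute directory are not addressed by the list, so the sketch leaves them permitted.

```python
from enum import Enum


class Kind(Enum):
    ORDINARY_DIR = "ordinary directory"
    NAMED_ATTR_DIR = "named attribute directory"
    NAMED_ATTR = "named attribute"
    OTHER = "other object"


def create_allowed(parent: Kind) -> bool:
    # CREATE is not allowed in a named attribute directory.
    return parent is not Kind.NAMED_ATTR_DIR


def openattr_allowed(target: Kind) -> bool:
    # OPENATTR is not allowed on a named attribute directory
    # or on a named attribute.
    return target not in (Kind.NAMED_ATTR_DIR, Kind.NAMED_ATTR)


def rename_allowed(obj: Kind, same_directory: bool) -> bool:
    # A named attribute may not be renamed into any other directory.
    return obj is not Kind.NAMED_ATTR or same_directory


def link_allowed(src_dir: Kind, dst_dir: Kind, same_directory: bool) -> bool:
    # No hard links across directories when either side is a
    # named attribute directory.
    if same_directory:
        return True
    return Kind.NAMED_ATTR_DIR not in (src_dir, dst_dir)
```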
| Names of attributes will not be controlled by this document or other | Names of attributes will not be controlled by this document or other | |||
| IETF standards track documents. See Section 22.1 for further | IETF standards track documents. See Section 22.1 for further | |||
| discussion. | discussion. | |||
| 5.4. Classification of Attributes | 5.4. Classification of Attributes | |||
| Each of the REQUIRED and RECOMMENDED attributes can be classified in | Each of the REQUIRED and RECOMMENDED attributes can be classified in | |||
| skipping to change at page 103, line 43 | skipping to change at page 103, line 43 | |||
| True, if the server is able to change the times for a file system object | True, if the server is able to change the times for a file system object | |||
| as specified in a SETATTR operation. | as specified in a SETATTR operation. | |||
| 5.7.2.3. Attribute 16: case_insensitive | 5.7.2.3. Attribute 16: case_insensitive | |||
| True, if filename comparisons on this file system are case | True, if filename comparisons on this file system are case | |||
| insensitive. | insensitive. | |||
| 5.7.2.4. Attribute 17: case_preserving | 5.7.2.4. Attribute 17: case_preserving | |||
| True, if filename case on this file system are preserved. | True, if file name case on this file system is preserved. | |||
| 5.7.2.5. Attribute 60: change_policy | 5.7.2.5. Attribute 60: change_policy | |||
| A value created by the server that the client can use to determine if | A value created by the server that the client can use to determine if | |||
| some server policy related to the current file system has been | some server policy related to the current file system has been | |||
| subject to change. If the value remains the same then the client can | subject to change. If the value remains the same then the client can | |||
| be sure that the values of the attributes related to fs location and | be sure that the values of the attributes related to fs location and | |||
| the fss_type field of the fs_status attribute have not changed. On | the fss_type field of the fs_status attribute have not changed. On | |||
| the other hand, a change in this value does necessarily imply a | the other hand, a change in this value does necessarily imply a | |||
| change in policy. It is up to the client to interrogate the server | change in policy. It is up to the client to interrogate the server | |||
| skipping to change at page 105, line 49 | skipping to change at page 105, line 49 | |||
| lead to the client either wasting bandwidth or not receiving the best | lead to the client either wasting bandwidth or not receiving the best | |||
| performance. | performance. | |||
| 5.7.2.22. Attribute 32: mimetype | 5.7.2.22. Attribute 32: mimetype | |||
| MIME body type/subtype of this object. | MIME body type/subtype of this object. | |||
| 5.7.2.23. Attribute 55: mounted_on_fileid | 5.7.2.23. Attribute 55: mounted_on_fileid | |||
| Like fileid, but if the target filehandle is the root of a file | Like fileid, but if the target filehandle is the root of a file | |||
| system return the fileid of the underlying directory. | system, this attribute represents the fileid of the underlying | |||
| directory. | ||||
| UNIX-based operating environments connect a file system into the | UNIX-based operating environments connect a file system into the | |||
| namespace by connecting (mounting) the file system onto the existing | namespace by connecting (mounting) the file system onto the existing | |||
| file object (the mount point, usually a directory) of an existing | file object (the mount point, usually a directory) of an existing | |||
| file system. When the mount point's parent directory is read via an | file system. When the mount point's parent directory is read via an | |||
| API like readdir(), the return results are directory entries, each | API like readdir(), the return results are directory entries, each | |||
| with a component name and a fileid. The fileid of the mount point's | with a component name and a fileid. The fileid of the mount point's | |||
| directory entry will be different from the fileid that the stat() | directory entry will be different from the fileid that the stat() | |||
| system call returns. The stat() system call is returning the fileid | system call returns. The stat() system call is returning the fileid | |||
| of the root of the mounted file system, whereas readdir() is | of the root of the mounted file system, whereas readdir() is | |||
| skipping to change at page 107, line 7 | skipping to change at page 107, line 7 | |||
| should obey an invariant that has it returning a value that is equal | should obey an invariant that has it returning a value that is equal | |||
| to the file object's entry in the object's parent directory, i.e. | to the file object's entry in the object's parent directory, i.e. | |||
| what readdir() would have returned. Some operating environments | what readdir() would have returned. Some operating environments | |||
| allow a series of two or more file systems to be mounted onto a | allow a series of two or more file systems to be mounted onto a | |||
| single mount point. In this case, for the server to obey the | single mount point. In this case, for the server to obey the | |||
| aforementioned invariant, it will need to find the base mount point, | aforementioned invariant, it will need to find the base mount point, | |||
| and not the intermediate mount points. | and not the intermediate mount points. | |||
| 5.7.2.24. Attribute 34: no_trunc | 5.7.2.24. Attribute 34: no_trunc | |||
| True, if a name longer than name_max is used, an error be returned | If this attribute is TRUE, then if the client uses a file name longer | |||
| and name is not truncated. | than name_max, an error will be returned instead of the name being | |||
| truncated. | ||||
| 5.7.2.25. Attribute 35: numlinks | 5.7.2.25. Attribute 35: numlinks | |||
| Number of hard links to this object. | Number of hard links to this object. | |||
| 5.7.2.26. Attribute 36: owner | 5.7.2.26. Attribute 36: owner | |||
| The string name of the owner of this object. | The string name of the owner of this object. | |||
| 5.7.2.27. Attribute 37: owner_group | 5.7.2.27. Attribute 37: owner_group | |||
| The string name of the group ownership of this object. | The string name of the group ownership of this object. | |||
| 5.7.2.28. Attribute 38: quota_avail_hard | 5.7.2.28. Attribute 38: quota_avail_hard | |||
| The value in bytes which represent the amount of additional disk | The value in bytes which represents the amount of additional disk | |||
| space beyond the current allocation that can be allocated to this | space beyond the current allocation that can be allocated to this | |||
| file or directory before further allocations will be refused. It is | file or directory before further allocations will be refused. It is | |||
| understood that this space may be consumed by allocations to other | understood that this space may be consumed by allocations to other | |||
| files or directories. | files or directories. | |||
| 5.7.2.29. Attribute 39: quota_avail_soft | 5.7.2.29. Attribute 39: quota_avail_soft | |||
| The value in bytes which represents the amount of additional disk | The value in bytes which represents the amount of additional disk | |||
| space that can be allocated to this file or directory before the user | space that can be allocated to this file or directory before the user | |||
| may reasonably be warned. It is understood that this space may be | may reasonably be warned. It is understood that this space may be | |||
| skipping to change at page 108, line 9 | skipping to change at page 108, line 11 | |||
| files or directories for which a quota_used value is maintained. | files or directories for which a quota_used value is maintained. | |||
| E.g. "all files with a given owner", "all files with a given group | E.g. "all files with a given owner", "all files with a given group | |||
| owner", etc. | owner", etc. | |||
| The server is at liberty to choose any of those sets but should do so | The server is at liberty to choose any of those sets but should do so | |||
| in a repeatable way. The rule may be configured per file system or | in a repeatable way. The rule may be configured per file system or | |||
| may be "choose the set with the smallest quota". | may be "choose the set with the smallest quota". | |||
| 5.7.2.31. Attribute 41: rawdev | 5.7.2.31. Attribute 41: rawdev | |||
| Raw device identifier. UNIX device major/minor node information. If | Raw device identifier; the UNIX device major/minor node information. | |||
| the value of type is not NF4BLK or NF4CHR, the value return SHOULD | If the value of type is not NF4BLK or NF4CHR, the value returned | |||
| NOT be considered useful. | SHOULD NOT be considered useful. | |||
| 5.7.2.32. Attribute 42: space_avail | 5.7.2.32. Attribute 42: space_avail | |||
| Disk space in bytes available to this user on the file system | Disk space in bytes available to this user on the file system | |||
| containing this object - this should be the smallest relevant limit. | containing this object - this should be the smallest relevant limit. | |||
| 5.7.2.33. Attribute 43: space_free | 5.7.2.33. Attribute 43: space_free | |||
| Free disk space in bytes on the file system containing this object - | Free disk space in bytes on the file system containing this object - | |||
| this should be the smallest relevant limit. | this should be the smallest relevant limit. | |||
| skipping to change at page 108, line 33 | skipping to change at page 108, line 35 | |||
| 5.7.2.34. Attribute 44: space_total | 5.7.2.34. Attribute 44: space_total | |||
| Total disk space in bytes on the file system containing this object. | Total disk space in bytes on the file system containing this object. | |||
| 5.7.2.35. Attribute 45: space_used | 5.7.2.35. Attribute 45: space_used | |||
| Number of file system bytes allocated to this object. | Number of file system bytes allocated to this object. | |||
| 5.7.2.36. Attribute 46: system | 5.7.2.36. Attribute 46: system | |||
| True, if this file is a "system" file with respect to the Windows | This attribute is TRUE if this file is a "system" file with respect | |||
| API. | to the Windows operating environment. | |||
| 5.7.2.37. Attribute 47: time_access | 5.7.2.37. Attribute 47: time_access | |||
| The time_access attribute represents the time of last access to the | The time_access attribute represents the time of last access to the | |||
| object by a read that was satisfied by the server. The notion of | object by a read that was satisfied by the server. The notion of | |||
| what is an "access" depends on the server's operating environment and/or | what is an "access" depends on the server's operating environment and/or | |||
| the server's file system semantics. For example, for servers obeying | the server's file system semantics. For example, for servers obeying | |||
| POSIX semantics, time_access would be updated only by the READLINK, | POSIX semantics, time_access would be updated only by the READLINK, | |||
| READ, and READDIR operations and not any of the operations that | READ, and READDIR operations and not any of the operations that | |||
| modify the content of the object. Of course, setting the | modify the content of the object. Of course, setting the | |||
| skipping to change at page 109, line 29 | skipping to change at page 109, line 30 | |||
| The time of creation of the object. This attribute does not have any | The time of creation of the object. This attribute does not have any | |||
| relation to the traditional UNIX file attribute "ctime" or "change | relation to the traditional UNIX file attribute "ctime" or "change | |||
| time". | time". | |||
| 5.7.2.41. Attribute 51: time_delta | 5.7.2.41. Attribute 51: time_delta | |||
| Smallest useful server time granularity. | Smallest useful server time granularity. | |||
| 5.7.2.42. Attribute 52: time_metadata | 5.7.2.42. Attribute 52: time_metadata | |||
| The time of last meta-data modification of the object. | The time of last metadata modification of the object. | |||
| 5.7.2.43. Attribute 53: time_modify | 5.7.2.43. Attribute 53: time_modify | |||
| The time of last modification to the object. | The time of last modification to the object. | |||
| 5.7.2.44. Attribute 54: time_modify_set | 5.7.2.44. Attribute 54: time_modify_set | |||
| Set the time of last modification to the object. SETATTR use only. | Set the time of last modification to the object. SETATTR use only. | |||
| 5.8. Interpreting owner and owner_group | 5.8. Interpreting owner and owner_group | |||
| skipping to change at page 110, line 31 | skipping to change at page 110, line 32 | |||
| service may also be used to accomplish the translation. A server may | service may also be used to accomplish the translation. A server may | |||
| provide a more general service, not limited by any particular | provide a more general service, not limited by any particular | |||
| translation (which would only translate a limited set of possible | translation (which would only translate a limited set of possible | |||
| strings) by storing the owner and owner_group attributes in local | strings) by storing the owner and owner_group attributes in local | |||
| storage without any translation or it may augment a translation | storage without any translation or it may augment a translation | |||
| method by storing the entire string for attributes for which no | method by storing the entire string for attributes for which no | |||
| translation is available while using the local representation for | translation is available while using the local representation for | |||
| those cases in which a translation is available. | those cases in which a translation is available. | |||
| Servers that do not provide support for all possible values of the | Servers that do not provide support for all possible values of the | |||
| owner and owner_group attributes, should return an error | owner and owner_group attributes, SHOULD return an error | |||
| (NFS4ERR_BADOWNER) when a string is presented that has no | (NFS4ERR_BADOWNER) when a string is presented that has no | |||
| translation, as the value to be set for a SETATTR of the owner, | translation, as the value to be set for a SETATTR of the owner, | |||
| owner_group, or acl attributes. When a server does accept an owner | owner_group, or acl attributes. When a server does accept an owner | |||
| or owner_group value as valid on a SETATTR (and similarly for the | or owner_group value as valid on a SETATTR (and similarly for the | |||
| owner and group strings in an acl), it is promising to return that | owner and group strings in an acl), it is promising to return that | |||
| same string when a corresponding GETATTR is done. Configuration | same string when a corresponding GETATTR is done. Configuration | |||
| changes and ill-constructed name translations (those that contain | changes (including changes from the mapping of the string to the | |||
| aliasing) may make that promise impossible to honor. Servers should | local representation) and ill-constructed name translations (those | |||
| make appropriate efforts to avoid a situation in which these | that contain aliasing) may make that promise impossible to honor. | |||
| attributes have their values changed when no real change to ownership | Servers should make appropriate efforts to avoid a situation in which | |||
| has occurred. | these attributes have their values changed when no real change to | |||
| ownership has occurred. | ||||
| The "dns_domain" portion of the owner string is meant to be a DNS | The "dns_domain" portion of the owner string is meant to be a DNS | |||
| domain name. For example, user@ietf.org. Servers should accept as | domain name. For example, user@ietf.org. Servers should accept as | |||
| valid a set of users for at least one domain. A server may treat | valid a set of users for at least one domain. A server may treat | |||
| other domains as having no valid translations. A more general | other domains as having no valid translations. A more general | |||
| service is provided when a server is capable of accepting users for | service is provided when a server is capable of accepting users for | |||
| multiple domains, or for all domains, subject to security | multiple domains, or for all domains, subject to security | |||
| constraints. | constraints. | |||
| In the case where there is no translation available to the client or | In the case where there is no translation available to the client or | |||
| server, the attribute value must be constructed without the "@". | server, the attribute value must be constructed without the "@". | |||
| Therefore, the absence of the @ from the owner or owner_group | Therefore, the absence of the @ from the owner or owner_group | |||
| attribute signifies that no translation was available at the sender | attribute signifies that no translation was available at the sender | |||
| and that the receiver of the attribute should not use that string as | and that the receiver of the attribute should not use that string as | |||
| a basis for translation into its own internal format. Even though | a basis for translation into its own internal format. Even though | |||
| the attribute value can not be translated, it may still be useful. | the attribute value can not be translated, it may still be useful. | |||
| In the case of a client, the attribute string may be used for local | In the case of a client, the attribute string may be used for local | |||
| display of ownership. | display of ownership. | |||
| To provide a greater degree of compatibility with NFSv3, which | To provide a greater degree of compatibility with NFSv3, which | |||
| identified users and groups by 32-bit unsigned uid's and gid's, owner | identified users and groups by 32-bit unsigned user identifiers and | |||
| and group strings that consist of decimal numeric values with no | group identifiers, owner and group strings that consist of decimal | |||
| leading zeros can be given a special interpretation by clients and | numeric values with no leading zeros can be given a special | |||
| servers which choose to provide such support. The receiver may treat | interpretation by clients and servers which choose to provide such | |||
| such a user or group string as representing the same user as would be | support. The receiver may treat such a user or group string as | |||
| represented by an NFSv3 uid or gid having the corresponding numeric | representing the same user as would be represented by an NFSv3 uid or | |||
| value. A server is not obligated to accept such a string, but may | gid having the corresponding numeric value. A server is not | |||
| return an NFS4ERR_BADOWNER instead. To avoid this mechanism being | obligated to accept such a string, but may return an NFS4ERR_BADOWNER | |||
| used to subvert user and group translation, so that a client might | instead. To avoid this mechanism being used to subvert user and | |||
| pass all of the owners and groups in numeric form, a server SHOULD | group translation, so that a client might pass all of the owners and | |||
| return an NFS4ERR_BADOWNER error when there is a valid translation | groups in numeric form, a server SHOULD return an NFS4ERR_BADOWNER | |||
| for the user or owner designated in this way. In that case, the | error when there is a valid translation for the user or owner | |||
| client must use the appropriate name@domain string and not the | designated in this way. In that case, the client must use the | |||
| special form for compatibility. | appropriate name@domain string and not the special form for | |||
| compatibility. | ||||
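The numeric-string rule above can be illustrated with a minimal sketch. It is not taken from either draft: the function names, the id_to_name table, and the choice to treat "0" as a value with no leading zeros are all assumptions made for illustration. The 32-bit bound follows from the NFSv3 uid/gid width mentioned in the text.

```python
import re

# Decimal digits with no leading zeros. Accepting a lone "0" is an
# assumption of this sketch, not something the draft states.
_NUMERIC_OWNER = re.compile(r"(0|[1-9][0-9]*)\Z")
UINT32_MAX = 0xFFFFFFFF  # NFSv3 uids and gids are 32-bit unsigned


def parse_numeric_owner(owner):
    """Return the numeric id if owner uses the special numeric form, else None."""
    if not _NUMERIC_OWNER.match(owner):
        return None
    value = int(owner)
    return value if value <= UINT32_MAX else None


def check_numeric_owner(owner, id_to_name):
    """Hypothetical server policy for an owner or owner_group string.

    id_to_name is an assumed table mapping numeric ids to name@domain
    strings. If a valid translation exists, the numeric form is refused
    so that clients cannot bypass name translation.
    """
    numeric_id = parse_numeric_owner(owner)
    if numeric_id is None:
        return "NOT_NUMERIC"  # ordinary name@domain handling applies
    if numeric_id in id_to_name:
        return "NFS4ERR_BADOWNER"
    return "NFS4_OK"
```

With a table such as {1000: "alice@example.com"}, the string "1000" would be refused with NFS4ERR_BADOWNER, while "2000" would be accepted by a server that chooses to support the numeric form.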
| The owner string "nobody" may be used to designate an anonymous user, | The owner string "nobody" may be used to designate an anonymous user, | |||
| which will be associated with a file created by a security principal | which will be associated with a file created by a security principal | |||
| that cannot be mapped through normal means to the owner attribute. | that cannot be mapped through normal means to the owner attribute. | |||
| 5.9. Character Case Attributes | 5.9. Character Case Attributes | |||
| With respect to the case_insensitive and case_preserving attributes, | With respect to the case_insensitive and case_preserving attributes, | |||
| each UCS-4 character (which UTF-8 encodes) has a "long descriptive | each UCS-4 character (which UTF-8 encodes) has a "long descriptive | |||
| name" RFC1345 [35] which may or may not included the word "CAPITAL" | name" RFC1345 [35] which may or may not include the word "CAPITAL" or | |||
| or "SMALL". The presence of SMALL or CAPITAL allows an NFS server to | "SMALL". The presence of SMALL or CAPITAL allows an NFS server to | |||
| implement unambiguous and efficient table driven mappings for case | implement unambiguous and efficient table driven mappings for case | |||
| insensitive comparisons, and non-case-preserving storage. For | insensitive comparisons, and non-case-preserving storage. For | |||
| general character handling and internationalization issues, see | general character handling and internationalization issues, see | |||
| Section 14. | Section 14. | |||
| 5.10. Directory Notification Attributes | 5.10. Directory Notification Attributes | |||
| As described in Section 18.39, the client can request a minimum delay | As described in Section 18.39, the client can request a minimum delay | |||
| for notifications of changes to attributes, but the server is free to | for notifications of changes to attributes, but the server is free to | |||
| ignore what the client requests. The client can determine in advance | ignore what the client requests. The client can determine in advance | |||
| skipping to change at page 112, line 24 | skipping to change at page 112, line 27 | |||
| 5.10.2. Attribute 57: dirent_notif_delay | 5.10.2. Attribute 57: dirent_notif_delay | |||
| The dirent_notif_delay attribute is the minimum number of seconds the | The dirent_notif_delay attribute is the minimum number of seconds the | |||
| server will delay before notifying the client of a change to a file | server will delay before notifying the client of a change to a file | |||
| object that has an entry in the directory. | object that has an entry in the directory. | |||
| 5.11. pNFS Attribute Definitions | 5.11. pNFS Attribute Definitions | |||
| 5.11.1. Attribute 62: fs_layout_type | 5.11.1. Attribute 62: fs_layout_type | |||
| The fs_layout_type attribute (data type layouttype4 (Section 3.3.13)) | The fs_layout_type attribute (see Section 3.3.13) applies to a file | |||
| applies to a file system and indicates what layout types are | system and indicates what layout types are supported by the file | |||
| supported by the file system. When the client encounters a new fsid, | system. When the client encounters a new fsid, the client SHOULD | |||
| the client should obtain the value for the fs_layout_type attribute | obtain the value for the fs_layout_type attribute associated with the | |||
| associated with the new file system. This attribute is used by the | new file system. This attribute is used by the client to determine | |||
| client to determine if the layout types supported by the server match | if the layout types supported by the server match any of the client's | |||
| any of the client's supported layout types. | supported layout types. | |||
| 5.11.2. Attribute 66: layout_alignment | 5.11.2. Attribute 66: layout_alignment | |||
| When a client has layouts for a file system, the layout_alignment | When a client has layouts for a file system, the layout_alignment | |||
| attribute indicates the preferred alignment for I/O to files on that | attribute indicates the preferred alignment for I/O to files on that | |||
| file system. Where possible, the client should send READ and WRITE | file system. Where possible, the client should send READ and WRITE | |||
| operations with offsets that are whole multiples of the | operations with offsets that are whole multiples of the | |||
| layout_alignment attribute. | layout_alignment attribute. | |||
| 5.11.3. Attribute 65: layout_blksize | 5.11.3. Attribute 65: layout_blksize | |||
| When a client has layouts for a file system, the layout_blksize | When a client has layouts for a file system, the layout_blksize | |||
| attribute indicates the preferred block size for I/O to files on that | attribute indicates the preferred block size for I/O to files on that | |||
| file system. Where possible, the client should send READ operations | file system. Where possible, the client should send READ operations | |||
| with a count argument that is a whole multiple of layout_blksize, and | with a count argument that is a whole multiple of layout_blksize, and | |||
| WRITE operations with a data argument of size that is a whole | WRITE operations with a data argument of size that is a whole | |||
| multiple of layout_blksize. | multiple of layout_blksize. | |||
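As an illustration of the two preferences above, the following sketch widens a READ so that its offset is a whole multiple of layout_alignment and its count is a whole multiple of layout_blksize. The function name and the decision to widen rather than split the request are assumptions of this example. A WRITE cannot be widened this way without first reading the surrounding data, so only READ is shown.

```python
def align_read(offset, count, alignment, blksize):
    """Return (offset, count) for a READ that covers the requested range.

    The returned offset is a whole multiple of alignment and the returned
    count is a whole multiple of blksize. Both alignment and blksize are
    assumed to be positive integers taken from the layout_alignment and
    layout_blksize attributes.
    """
    start = (offset // alignment) * alignment   # round the offset down
    length = (offset + count) - start           # bytes needed from start
    blocks = -(-length // blksize)              # ceiling division
    return start, blocks * blksize
```

For example, a request for 3000 bytes at offset 5000 with both attributes equal to 4096 becomes a single 4096-byte READ at offset 4096.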
| 5.11.4. Attribute 63: layout_hint | 5.11.4. Attribute 63: layout_hint | |||
| The layout_hint attribute (data type layouthint4 (Section 3.3.19)) | The layout_hint attribute (see Section 3.3.19) may be set on newly | |||
| may be set on newly created files to influence the metadata server's | created files to influence the metadata server's choice for the | |||
| choice for the file's layout. If possible, this attribute is one of | file's layout. If possible, this attribute is one of those set in | |||
| those set in the initial attributes within the OPEN operation. The | the initial attributes within the OPEN operation. The metadata | |||
| metadata server may choose to ignore this attribute. The layout_hint | server may choose to ignore this attribute. The layout_hint | |||
| attribute is a sub-set of the layout structure returned by LAYOUTGET. | attribute is a sub-set of the layout structure returned by LAYOUTGET. | |||
| For example, instead of specifying particular devices, this would be | For example, instead of specifying particular devices, this would be | |||
| used to suggest the stripe width of a file. The server | used to suggest the stripe width of a file. The server | |||
| implementation determines which fields within the layout will be | implementation determines which fields within the layout will be | |||
| used. | used. | |||
| 5.11.5. Attribute 64: layout_type | 5.11.5. Attribute 64: layout_type | |||
| This attribute lists the layout type(s) available for a file. The | This attribute lists the layout type(s) available for a file. The | |||
| value returned by the server is for informational purposes only. The | value returned by the server is for informational purposes only. The | |||
| skipping to change at page 113, line 33 | skipping to change at page 113, line 33 | |||
| needed in order to perform I/O. For example, the specific device | needed in order to perform I/O. For example, the specific device | |||
| information for the file and its layout. | information for the file and its layout. | |||
| 5.11.6. Attribute 68: mdsthreshold | 5.11.6. Attribute 68: mdsthreshold | |||
| This attribute is a server provided hint used to communicate to the | This attribute is a server provided hint used to communicate to the | |||
| client when it is more efficient to send READ and WRITE operations to | client when it is more efficient to send READ and WRITE operations to | |||
| the metadata server or the data server. The two types of thresholds | the metadata server or the data server. The two types of thresholds | |||
| described are file size thresholds and I/O size thresholds. If a | described are file size thresholds and I/O size thresholds. If a | |||
| file's size is smaller than the file size threshold, data accesses | file's size is smaller than the file size threshold, data accesses | |||
| should be sent to the metadata server. If an I/O is below the I/O | SHOULD be sent to the metadata server. If an I/O request has a | |||
| size threshold, the I/O should be sent to the metadata server. As | length that is below the I/O size threshold, the I/O SHOULD be sent | |||
| defined, each threshold type is specified separately for READ and | to the metadata server. Each threshold type is specified separately | |||
| WRITE. | for READ and WRITE. | |||
| The server may provide both types of thresholds for a file. If both | The server MAY provide both types of thresholds for a file. If both | |||
| file size and I/O size are provided, the client should exceed both | file size and I/O size are provided, the client SHOULD reach or | |||
| thresholds before issuing its READ or WRITE requests to the data | exceed both thresholds before issuing its READ or WRITE requests to | |||
| server. Alternatively, if only one of the specified thresholds is | the data server. Alternatively, if only one of the specified | |||
| exceeded, the I/O requests are sent to the metadata server. | thresholds is reached or exceeded, the I/O requests are sent to the | |||
| metadata server. | ||||
| For each threshold type, a value of 0 indicates no READ or WRITE | For each threshold type, a value of 0 indicates no READ or WRITE | |||
| should be sent to the metadata server, while a value of all 1s | should be sent to the metadata server, while a value of all 1s | |||
| indicates all READS or WRITES should be sent to the metadata server. | indicates all READS or WRITES should be sent to the metadata server. | |||
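The threshold rules can be read as a small decision procedure. The sketch below follows the wording in the right-hand (newer) column, where a threshold counts as met when it is reached or exceeded. The function names, the use of None for a threshold the server did not provide, the 64-bit width assumed for "all 1s", and the fallback to the data server when no hint is present are all assumptions of this example.

```python
UINT64_MAX = 0xFFFFFFFFFFFFFFFF  # "all 1s", assuming 64-bit threshold values


def threshold_met(value, threshold):
    """True if value reaches or exceeds threshold, honoring the special values."""
    if threshold == 0:
        return True    # 0: no I/O should be sent to the metadata server
    if threshold == UINT64_MAX:
        return False   # all 1s: all I/O should be sent to the metadata server
    return value >= threshold


def choose_target(file_size, io_size, size_threshold=None, io_threshold=None):
    """Pick where a READ or WRITE goes, using the thresholds for that operation.

    Every threshold that the server provided must be met before the
    request is sent to the data server; otherwise it goes to the
    metadata server.
    """
    checks = []
    if size_threshold is not None:
        checks.append(threshold_met(file_size, size_threshold))
    if io_threshold is not None:
        checks.append(threshold_met(io_size, io_threshold))
    if not checks:
        return "data_server"  # assumption: no hint means no redirection
    return "data_server" if all(checks) else "metadata_server"
```

Because thresholds are specified separately for READ and WRITE, a client following this sketch would call choose_target with the READ thresholds for reads and the WRITE thresholds for writes.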
| The attribute is available on a per filehandle basis. If the current | The attribute is available on a per filehandle basis. If the current | |||
| filehandle refers to a non-pNFS file or directory, the metadata | filehandle refers to a non-pNFS file or directory, the metadata | |||
| server should return an attribute that is representative of the | server should return an attribute that is representative of the | |||
| filehandle's file system. It is suggested that this attribute is | filehandle's file system. It is suggested that this attribute is | |||
| queried as part of the OPEN operation. Due to dynamic system | queried as part of the OPEN operation. Due to dynamic system | |||
| skipping to change at page 114, line 24 | skipping to change at page 114, line 25 | |||
| reached. | reached. | |||
| When retention is enabled, retention MUST extend to the data of the | When retention is enabled, retention MUST extend to the data of the | |||
| file, and the name of the file. The server MAY extend retention to any | file, and the name of the file. The server MAY extend retention to any | |||
| other property of the file, including any subset of REQUIRED, | other property of the file, including any subset of REQUIRED, | |||
| RECOMMENDED, and named attributes, with the exceptions noted in this | RECOMMENDED, and named attributes, with the exceptions noted in this | |||
| section. | section. | |||
| Servers MAY support or not support retention on any file object type. | Servers MAY support or not support retention on any file object type. | |||
| The five retention attributes are as follows: | The five retention attributes are explained in the next subsections. | |||
| 5.12.1. Attribute 69: retention_get | 5.12.1. Attribute 69: retention_get | |||
| If retention is enabled for the associated file, this attribute's | If retention is enabled for the associated file, this attribute's | |||
| value represents the retention begin time of the file object. This | value represents the retention begin time of the file object. This | |||
| attribute's value is only readable with the GETATTR operation and may | attribute's value is only readable with the GETATTR operation and may | |||
| not be modified by the SETATTR operation. The value of the attribute | not be modified by the SETATTR operation. The value of the attribute | |||
| consists of: | consists of: | |||
| const RET4_DURATION_INFINITE = 0xffffffffffffffff; | const RET4_DURATION_INFINITE = 0xffffffffffffffff; | |||
| skipping to change at page 115, line 43 | skipping to change at page 115, line 48 | |||
| 5.12.4. Attribute 72: retentevt_set | 5.12.4. Attribute 72: retentevt_set | |||
| Set the event-based retention duration, and optionally enable event- | Set the event-based retention duration, and optionally enable event- | |||
| based retention on the file object. This attribute corresponds to | based retention on the file object. This attribute corresponds to | |||
| retentevt_get, is like retention_set, but refers to event-based | retentevt_get, is like retention_set, but refers to event-based | |||
| retention. When event based retention is set, the file MUST be | retention. When event based retention is set, the file MUST be | |||
| retained even if non-event-based retention has been set, and the | retained even if non-event-based retention has been set, and the | |||
| duration of non-event-based retention has been reached. Conversely, | duration of non-event-based retention has been reached. Conversely, | |||
| when non-event-based retention has been set, the file MUST be | when non-event-based retention has been set, the file MUST be | |||
| retained even the event-based retention has been set, and the | retained even if event-based retention has been set, and the duration | |||
| duration of event-based retention has been reached. The server MAY | of event-based retention has been reached. The server MAY restrict | |||
| restrict the enabling of event-based retention or the duration of | the enabling of event-based retention or the duration of event-based | |||
| event-based retention on the basis of the ACE4_WRITE_RETENTION ACL | retention on the basis of the ACE4_WRITE_RETENTION ACL permission. | |||
| permission. The enabling of event-based retention does not prevent | The enabling of event-based retention does not prevent the enabling | |||
| the enabling of non-event-based retention nor the modification of the | of non-event-based retention nor the modification of the | |||
| retention_hold attribute. | retention_hold attribute. | |||
| 5.12.5. Attribute 73: retention_hold | 5.12.5. Attribute 73: retention_hold | |||
| Get or set administrative retention holds, one hold per bit position. | Get or set administrative retention holds, one hold per bit position. | |||
| This attribute allows one to 64 administrative holds, one hold per | This attribute allows one to 64 administrative holds, one hold per | |||
| bit on the attribute. If retention_hold is not zero, then the file | bit on the attribute. If retention_hold is not zero, then the file | |||
| MUST NOT be deleted, renamed, or modified, even if the duration on | MUST NOT be deleted, renamed, or modified, even if the duration on | |||
| enabled event or non-event-based retention has been reached. The | enabled event or non-event-based retention has been reached. The | |||
| skipping to change at page 160, line 13 | skipping to change at page 160, line 13 | |||
| type locking requests are allowed, unless the server is able to | type locking requests are allowed, unless the server is able to | |||
| reliably determine (through state persistently maintained across | reliably determine (through state persistently maintained across | |||
| reboot instances), that granting any such lock cannot possibly | reboot instances), that granting any such lock cannot possibly | |||
| conflict with a subsequent reclaim. When a request is made to obtain | conflict with a subsequent reclaim. When a request is made to obtain | |||
| a new lock (i.e. not a reclaim-type request) during the grace period | a new lock (i.e. not a reclaim-type request) during the grace period | |||
| and such a determination cannot be made, the server must return the | and such a determination cannot be made, the server must return the | |||
| error NFS4ERR_GRACE. | error NFS4ERR_GRACE. | |||
| Once a session is established using the new client ID, the client | Once a session is established using the new client ID, the client | |||
| will use reclaim-type locking requests (e.g. LOCK requests with | will use reclaim-type locking requests (e.g. LOCK requests with | |||
| reclaim set to true and OPEN operations with a claim type of | reclaim set to TRUE and OPEN operations with a claim type of | |||
| CLAIM_PREVIOUS. See Section 9.11) to re-establish its locking state. | CLAIM_PREVIOUS. See Section 9.11) to re-establish its locking state. | |||
| Once this is done, or if there is no such locking state to reclaim, | Once this is done, or if there is no such locking state to reclaim, | |||
| the client sends a global RECLAIM_COMPLETE operation, i.e. one with | the client sends a global RECLAIM_COMPLETE operation, i.e. one with | |||
| the one_fs argument set to false, to indicate that it has reclaimed | the rca_one_fs argument set to FALSE, to indicate that it has | |||
| all of the locking state that it will reclaim. Once a client sends | reclaimed all of the locking state that it will reclaim. Once a | |||
| such a RECLAIM_COMPLETE operation, it may attempt non-reclaim locking | client sends such a RECLAIM_COMPLETE operation, it may attempt non- | |||
| operations, although it may get NFS4ERR_GRACE errors from the operations | reclaim locking operations, although it may get NFS4ERR_GRACE errors | |||
| until the period of special handling is over. See Section 11.7.7 for | from the operations until the period of special handling is over. See | |||
| a discussion of the analogous handling of lock reclamation in the case | Section 11.7.7 for a discussion of the analogous handling of lock | |||
| of file systems transitioning from server to server. | reclamation in the case of file systems transitioning from server to | |||
| server. | ||||
| During the grace period, the server must reject READ and WRITE | During the grace period, the server must reject READ and WRITE | |||
| operations and non-reclaim locking requests (i.e. other LOCK and OPEN | operations and non-reclaim locking requests (i.e. other LOCK and OPEN | |||
| operations) with an error of NFS4ERR_GRACE, unless it is able to | operations) with an error of NFS4ERR_GRACE, unless it is able to | |||
| guarantee that these may be done safely, as described below. | guarantee that these may be done safely, as described below. | |||
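A minimal sketch of the grace-period check described above, seen from the server side. The function name and the provably_safe flag are hypothetical. The flag stands in for the server's own determination, from persistently recorded state, that the request cannot conflict with a later reclaim. "NFS4_OK" here only means that the request proceeds to normal processing.

```python
def check_during_grace(op, is_reclaim, provably_safe=False):
    """Status a server might return for op while the grace period is in effect.

    op is an operation name such as "READ", "WRITE", "LOCK", or "OPEN".
    is_reclaim is True for reclaim-type requests (LOCK with reclaim set
    to TRUE, OPEN with a claim type of CLAIM_PREVIOUS).
    """
    if op in ("LOCK", "OPEN") and is_reclaim:
        return "NFS4_OK"          # reclaim-type requests are allowed
    if op in ("READ", "WRITE", "LOCK", "OPEN") and not provably_safe:
        return "NFS4ERR_GRACE"    # cannot rule out a conflicting reclaim
    return "NFS4_OK"
```

A server that keeps no persistent lock state would always pass provably_safe=False, so every READ, WRITE, and non-reclaim LOCK or OPEN would be rejected until the grace period ends.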
| The grace period may last until all clients who are known to possibly | The grace period may last until all clients who are known to possibly | |||
| have had locks have done a global RECLAIM_COMPLETE operation, | have had locks have done a global RECLAIM_COMPLETE operation, | |||
| indicating that they have finished reclaiming the locks they held | indicating that they have finished reclaiming the locks they held | |||
| before the server reboot. This means that a client which has done a | before the server reboot. This means that a client which has done a | |||
| skipping to change at page 196, line 34 | skipping to change at page 196, line 34 | |||
| storage is OPTIONAL. | storage is OPTIONAL. | |||
| As discussed earlier in this section, the client MAY return the same | As discussed earlier in this section, the client MAY return the same | |||
| cc value on subsequent CB_GETATTR calls, even if the file was | cc value on subsequent CB_GETATTR calls, even if the file was | |||
| modified in the client's cache yet again between successive | modified in the client's cache yet again between successive | |||
| CB_GETATTR calls. Therefore, the server must assume that the file | CB_GETATTR calls. Therefore, the server must assume that the file | |||
| has been modified yet again, and MUST take care to ensure that the | has been modified yet again, and MUST take care to ensure that the | |||
| new nsc it constructs and returns is greater than the previous nsc it | new nsc it constructs and returns is greater than the previous nsc it | |||
| returned. An example implementation's delegation record would | returned. An example implementation's delegation record would | |||
| satisfy this mandate by including a boolean field (let us call it | satisfy this mandate by including a boolean field (let us call it | |||
| "modified") that is set to false when the delegation is granted, and | "modified") that is set to FALSE when the delegation is granted, and | |||
| an sc value set at the time of grant to the change attribute value. | an sc value set at the time of grant to the change attribute value. | |||
| The modified field would be set to true the first time cc != sc, and | The modified field would be set to true the first time cc != sc, and | |||
| would stay true until the delegation is returned or revoked. The | would stay true until the delegation is returned or revoked. The | |||
| processing for constructing nsc, time_modify, and time_metadata would | processing for constructing nsc, time_modify, and time_metadata would | |||
| use this pseudo code: | use this pseudo code: | |||
| if (!modified) { | if (!modified) { | |||
| do CB_GETATTR for change and size; | do CB_GETATTR for change and size; | |||
| if (cc != sc) | if (cc != sc) | |||
| skipping to change at page 231, line 15 | skipping to change at page 231, line 15 | |||
| reclaim after server reboot (although in the case of the planned | reclaim after server reboot (although in the case of the planned | |||
| state transfer associated with migration, these can be avoided by | state transfer associated with migration, these can be avoided by | |||
| securely recording lock state as part of state migration). Unless | securely recording lock state as part of state migration). Unless | |||
| the destination server can guarantee that locks will not be | the destination server can guarantee that locks will not be | |||
| incorrectly granted, the destination server should not allow lock | incorrectly granted, the destination server should not allow lock | |||
| reclaims and should avoid establishing a grace period. | reclaims and should avoid establishing a grace period. | |||
| Once all locks have been reclaimed, or there were no locks to | Once all locks have been reclaimed, or there were no locks to | |||
| reclaim, the client indicates that there are no more reclaims to be | reclaim, the client indicates that there are no more reclaims to be | |||
| done for the file system in question by issuing a RECLAIM_COMPLETE | done for the file system in question by issuing a RECLAIM_COMPLETE | |||
| operation with the one_fs parameter set to true. Once this has been | operation with the rca_one_fs parameter set to true. Once this has | |||
| done, non-reclaim locking operations may be done, and any subsequent | been done, non-reclaim locking operations may be done, and any | |||
| request to do reclaims will be rejected with the error | subsequent request to do reclaims will be rejected with the error | |||
| NFS4ERR_NO_GRACE. | NFS4ERR_NO_GRACE. | |||
| Information about client identity may be propagated between servers | Information about client identity may be propagated between servers | |||
| in the form of client_owner4 and associated verifiers, under the | in the form of client_owner4 and associated verifiers, under the | |||
| assumption that the client presents the same values to all the | assumption that the client presents the same values to all the | |||
| servers with which it deals. | servers with which it deals. | |||
| Servers are encouraged to provide facilities to allow locks to be | Servers are encouraged to provide facilities to allow locks to be | |||
| reclaimed on the new server after a file system transition. Often, | reclaimed on the new server after a file system transition. Often, | |||
| however, in cases in which the two servers do not share a server | however, in cases in which the two servers do not share a server | |||
| skipping to change at page 268, line 20 | skipping to change at page 268, line 20 | |||
| the server supports and the client is prepared to use. The layout | the server supports and the client is prepared to use. The layout | |||
| returned to the client may not exactly align with the requested byte | returned to the client may not exactly align with the requested byte | |||
| range. A field within the LAYOUTGET request, loga_minlength, | range. A field within the LAYOUTGET request, loga_minlength, | |||
| specifies the minimum length of the layout. The loga_minlength field | specifies the minimum length of the layout. The loga_minlength field | |||
| should be at least one. As needed a client may make multiple | should be at least one. As needed a client may make multiple | |||
| LAYOUTGET requests; these will result in multiple overlapping, non- | LAYOUTGET requests; these will result in multiple overlapping, non- | |||
| conflicting layouts. | conflicting layouts. | |||
| In order to get a layout, the client must first have opened the file | In order to get a layout, the client must first have opened the file | |||
| via the OPEN operation. When a client has no layout on a file, it | via the OPEN operation. When a client has no layout on a file, it | |||
| presents a stateid as returned by OPEN, a delegation stateid, or a | MUST present a stateid as returned by OPEN, a delegation stateid, or | |||
| byte-range lock stateid in the loga_stateid argument. A successful | a byte-range lock stateid in the loga_stateid argument. A successful | |||
| LAYOUTGET result includes a layout stateid. The first successful | LAYOUTGET result includes a layout stateid. The first successful | |||
| LAYOUTGET processed by the server using a non-layout stateid as an | LAYOUTGET processed by the server using a non-layout stateid as an | |||
| argument MUST have the "seqid" field of the layout stateid in the | argument MUST have the "seqid" field of the layout stateid in the | |||
| response set to one. Thereafter, the client uses a layout stateid | response set to one. Thereafter, the client uses a layout stateid | |||
| (see Section 12.5.3) on future invocations of LAYOUTGET on the file, | (see Section 12.5.3) on future invocations of LAYOUTGET on the file, | |||
| and the "seqid" MUST NOT ever be set to zero. Once the layout has | and the "seqid" MUST NOT ever be set to zero. Once the layout has | |||
| been retrieved, it can be held across multiple OPEN and CLOSE | been retrieved, it can be held across multiple OPEN and CLOSE | |||
| sequences. Therefore, a client may hold a layout for a file that is | sequences. Therefore, a client may hold a layout for a file that is | |||
| not currently open by any user on the client. This allows for the | not currently open by any user on the client. This allows for the | |||
| caching of layouts beyond CLOSE. | caching of layouts beyond CLOSE. | |||
| skipping to change at page 270, line 10 | skipping to change at page 270, line 10 | |||
| CB_LAYOUTRECALL request. Simply seeing the result or the | CB_LAYOUTRECALL request. Simply seeing the result or the | |||
| CB_LAYOUTRECALL request is not sufficient cause to use the seqid. | CB_LAYOUTRECALL request is not sufficient cause to use the seqid. | |||
| For LAYOUTGET results, if the client is not using the forgetful model | For LAYOUTGET results, if the client is not using the forgetful model | |||
| (Section 12.5.5.1), it MUST first update its record of what ranges of | (Section 12.5.5.1), it MUST first update its record of what ranges of | |||
| the file's layout it has before using the seqid. For LAYOUTRETURN | the file's layout it has before using the seqid. For LAYOUTRETURN | |||
| results, the client MUST delete the range from its record of what | results, the client MUST delete the range from its record of what | |||
| ranges of the file's layout it had before using the seqid. For | ranges of the file's layout it had before using the seqid. For | |||
| CB_LAYOUTRECALL arguments, the client MUST send a response to the | CB_LAYOUTRECALL arguments, the client MUST send a response to the | |||
| recall before using the seqid. | recall before using the seqid. | |||
| Once a client has no more layouts on a file, the layout stateid is no | ||||
| longer valid, and MUST NOT be used. Any attempt to use such a layout | ||||
| stateid will result in NFS4ERR_BAD_STATEID. | ||||
| 12.5.4. Committing a Layout | 12.5.4. Committing a Layout | |||
| Allowing for varying storage protocol capabilities, the pNFS | Allowing for varying storage protocol capabilities, the pNFS | |||
| protocol does not require the metadata server and storage devices to | protocol does not require the metadata server and storage devices to | |||
| have a consistent view of file attributes and data location mappings. | have a consistent view of file attributes and data location mappings. | |||
| Data location mapping refers to aspects such as which offsets store | Data location mapping refers to aspects such as which offsets store | |||
| data as opposed to storing holes (see Section 13.4.4 for a | data as opposed to storing holes (see Section 13.4.4 for a | |||
| discussion). Related issues arise for storage protocols where a | discussion). Related issues arise for storage protocols where a | |||
| layout may hold provisionally allocated blocks where the allocation | layout may hold provisionally allocated blocks where the allocation | |||
| of those blocks does not survive a complete restart of both the | of those blocks does not survive a complete restart of both the | |||
| skipping to change at page 271, line 5 | skipping to change at page 271, line 8 | |||
| The control protocol is free to synchronize the attributes before it | The control protocol is free to synchronize the attributes before it | |||
| receives a LAYOUTCOMMIT, however upon successful completion of a | receives a LAYOUTCOMMIT, however upon successful completion of a | |||
| LAYOUTCOMMIT, state that exists on the metadata server that describes | LAYOUTCOMMIT, state that exists on the metadata server that describes | |||
| the file MUST be in sync with the state existing on the storage | the file MUST be in sync with the state existing on the storage | |||
| devices that comprise that file as of the issuing client's last | devices that comprise that file as of the issuing client's last | |||
| operation. Thus, a client that queries the size of a file between a | operation. Thus, a client that queries the size of a file between a | |||
| WRITE to a storage device and the LAYOUTCOMMIT may observe a size | WRITE to a storage device and the LAYOUTCOMMIT may observe a size | |||
| that does not reflect the actual data written. | that does not reflect the actual data written. | |||
| The client MUST have a layout in order to issue LAYOUTCOMMIT. | ||||
| 12.5.4.1. LAYOUTCOMMIT and change/time_modify | 12.5.4.1. LAYOUTCOMMIT and change/time_modify | |||
| The change and time_modify attributes may be updated by the server | The change and time_modify attributes may be updated by the server | |||
| when the LAYOUTCOMMIT operation is processed. The reason for this is | when the LAYOUTCOMMIT operation is processed. The reason for this is | |||
| that some layout types do not support the update of these attributes | that some layout types do not support the update of these attributes | |||
| when the storage devices process I/O operations. The client is | when the storage devices process I/O operations. If the client has a | |||
| capable of providing a suggested value to the server for time_modify | layout with the LAYOUTIOMODE4_RW iomode on the file, the client MAY | |||
| within the arguments to LAYOUTCOMMIT. Based on layout type, the | provide a suggested value to the server for time_modify within the | |||
| provided value may or may not be used. The server should sanity | arguments to LAYOUTCOMMIT. Based on the layout type, the provided | |||
| check the client provided values before they are used. For example, | value may or may not be used. The server should sanity check the | |||
| the server should ensure that time does not flow backwards. The | client provided values before they are used. For example, the server | |||
| client always has the option to set time_modify through an explicit | should ensure that time does not flow backwards. The client always | |||
| SETATTR operation. | has the option to set time_modify through an explicit SETATTR | |||
| operation. | ||||
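The sanity check on a client-suggested time_modify can be sketched as follows. This is a minimal illustration of one possible server policy, not behaviour mandated by the text: the function name and the (seconds, nanoseconds) tuple representation are assumptions made for the example.

```python
from typing import Optional, Tuple

# Hypothetical representation of an nfstime4 value: (seconds, nanoseconds).
NfsTime = Tuple[int, int]


def choose_time_modify(current: NfsTime, suggested: Optional[NfsTime]) -> NfsTime:
    """Pick the time_modify value a server might record at LAYOUTCOMMIT.

    Assumed policy: ignore a missing suggestion, and ignore any suggestion
    that would make time flow backwards relative to the current value.
    """
    if suggested is None:
        return current
    # Tuples compare element by element, so seconds are compared first,
    # then nanoseconds.
    if suggested < current:
        return current
    return suggested
```

A server that rejects such values with an error instead of ignoring them would be an equally plausible reading; the text only says the values should be sanity checked.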
| For some layout protocols, the storage device is able to notify the | For some layout protocols, the storage device is able to notify the | |||
| metadata server of the occurrence of an I/O and as a result the | metadata server of the occurrence of an I/O and as a result the | |||
| change and time_modify attributes may be updated at the metadata | change and time_modify attributes may be updated at the metadata | |||
| server. For a metadata server that is capable of monitoring updates | server. For a metadata server that is capable of monitoring updates | |||
| to the change and time_modify attributes, LAYOUTCOMMIT processing is | to the change and time_modify attributes, LAYOUTCOMMIT processing is | |||
| not required to update the change attribute; in this case the | not required to update the change attribute; in this case the | |||
| metadata server must ensure that no further update to the data has | metadata server must ensure that no further update to the data has | |||
| occurred since the last update of the attributes; file-based | occurred since the last update of the attributes; file-based | |||
| protocols may have enough information to make this determination or | protocols may have enough information to make this determination or | |||
| skipping to change at page 271, line 45 | skipping to change at page 271, line 51 | |||
| 12.5.4.2. LAYOUTCOMMIT and size | 12.5.4.2. LAYOUTCOMMIT and size | |||
| The size of a file may be updated when the LAYOUTCOMMIT operation is | The size of a file may be updated when the LAYOUTCOMMIT operation is | |||
| used by the client. One of the fields in the argument to | used by the client. One of the fields in the argument to | |||
| LAYOUTCOMMIT is loca_last_write_offset; this field indicates the | LAYOUTCOMMIT is loca_last_write_offset; this field indicates the | |||
| highest byte offset written but not yet committed with the | highest byte offset written but not yet committed with the | |||
| LAYOUTCOMMIT operation. The data type of loca_last_write_offset is | LAYOUTCOMMIT operation. The data type of loca_last_write_offset is | |||
| newoffset4 and is switched on a boolean value, no_newoffset, that | newoffset4 and is switched on a boolean value, no_newoffset, that | |||
| indicates if a previous write occurred or not. If no_newoffset is | indicates if a previous write occurred or not. If no_newoffset is | |||
| FALSE, an offset is not given. A loca_last_write_offset value of | FALSE, an offset is not given. If the client has a layout with | |||
| zero means that one byte was written at offset zero. | LAYOUTIOMODE4_RW iomode on the file, with an lo_offset and lo_length | |||
| that overlaps loca_last_write_offset, then the client MAY set | ||||
| no_newoffset to TRUE and provide an offset that will update the file | ||||
| size. Keep in mind that offset is not the same as length, though | ||||
| they are related. For example, a loca_last_write_offset value of | ||||
| zero means that one byte was written at offset zero, and so the | ||||
| length of the file is at least one byte. | ||||
| The metadata server may do one of the following: | The metadata server may do one of the following: | |||
| 1. Update the file's size using the last write offset provided by | 1. Update the file's size using the last write offset provided by | |||
| the client as either the true file size or as a hint of the file | the client as either the true file size or as a hint of the file | |||
| size. If the metadata server has a method available, any new | size. If the metadata server has a method available, any new | |||
| value for file size should be sanity checked. For example, the | value for file size should be sanity checked. For example, the | |||
| file must not be truncated if the client presents a last write | file must not be truncated if the client presents a last write | |||
| offset less than the file's current size. | offset less than the file's current size. | |||
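The relationship between loca_last_write_offset and file size, together with the no-truncation check, can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, and a value of None stands in for the case where the client supplies no offset.

```python
from typing import Optional


def size_after_layoutcommit(current_size: int,
                            last_write_offset: Optional[int]) -> int:
    """Return the file size a metadata server might record after LAYOUTCOMMIT.

    last_write_offset is None when the client supplied no offset.
    """
    if last_write_offset is None:
        return current_size
    # An offset of N means byte N was written, so the file is at least
    # N + 1 bytes long (offset zero implies a length of at least one).
    implied_size = last_write_offset + 1
    # Never truncate: a last write offset below the current size must not
    # shrink the file.
    return max(current_size, implied_size)
```

Here the offset is treated as a hint that can only grow the file; a server that trusts the client's value as the true size would still need the same guard against truncation.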
| skipping to change at page 281, line 46 | skipping to change at page 282, line 11 | |||
| LAYOUTCOMMIT to commit the modification time and the new size of the | LAYOUTCOMMIT to commit the modification time and the new size of the | |||
| file (if it believes it extended the file size) to the metadata | file (if it believes it extended the file size) to the metadata | |||
| server and the modified data to the file system. | server and the modified data to the file system. | |||
| 12.7. Recovery | 12.7. Recovery | |||
| Recovery is complicated by the distributed nature of the pNFS | Recovery is complicated by the distributed nature of the pNFS | |||
| protocol. In general, crash recovery for layouts is similar to crash | protocol. In general, crash recovery for layouts is similar to crash | |||
| recovery for delegations in the base NFSv4.1 protocol. However, the | recovery for delegations in the base NFSv4.1 protocol. However, the | |||
| client's ability to perform I/O without contacting the metadata | client's ability to perform I/O without contacting the metadata | |||
| server subtleties that must be handled correctly if the possibility | server introduces subtleties that must be handled correctly if the | |||
| of file system corruption is to be avoided. [[Comment.4: mre: | possibility of file system corruption is to be avoided. | |||
| layouts are bound to stateids]] | ||||
| 12.7.1. Recovery from Client Restart | 12.7.1. Recovery from Client Restart | |||
| Client recovery for layouts is similar to client recovery for other | Client recovery for layouts is similar to client recovery for other | |||
| lock and delegation state. When a pNFS client restarts, it will | lock and delegation state. When a pNFS client restarts, it will | |||
| lose all information about the layouts that it previously owned. | lose all information about the layouts that it previously owned. | |||
| There are two methods by which the server can reclaim these resources | There are two methods by which the server can reclaim these resources | |||
| and allow otherwise conflicting layouts to be provided to other | and allow otherwise conflicting layouts to be provided to other | |||
| clients. | clients. | |||
| skipping to change at page 290, line 39 | skipping to change at page 290, line 45 | |||
| If a server is both a metadata server and a data server, the server | If a server is both a metadata server and a data server, the server | |||
| might need to distinguish operations on files that are directed to | might need to distinguish operations on files that are directed to | |||
| the metadata server from those that are directed to the data server. | the metadata server from those that are directed to the data server. | |||
| It is RECOMMENDED that the values of the filehandles returned by the | It is RECOMMENDED that the values of the filehandles returned by the | |||
| LAYOUTGET operation be different from the value of the filehandle | LAYOUTGET operation be different from the value of the filehandle | |||
| returned by the OPEN of the same file. | returned by the OPEN of the same file. | |||
| Another scenario is for the metadata server and the storage device to | Another scenario is for the metadata server and the storage device to | |||
| be distinct from one client's point of view, and the roles reversed | be distinct from one client's point of view, and the roles reversed | |||
| from another client's point of view. For example, in the cluster | from another client's point of view. For example, in the cluster | |||
| file system model a metadata server to one client, may be a data | file system model, a metadata server to one client may be a data | |||
| server to another client. If NFSv4.1 is being used as the storage | server to another client. If NFSv4.1 is being used as the storage | |||
| protocol, then pNFS servers need to encode the values of filehandles | protocol, then pNFS servers need to encode the values of filehandles | |||
| according to their specific roles. | according to their specific roles. | |||
| 13.1.1. Sessions Considerations for Data Servers | ||||
| Section 2.10.9.2 states that a client has to keep its lease renewed | ||||
| in order to prevent a session from being deleted by the server. If | ||||
| the reply to EXCHANGE_ID has just the EXCHGID4_FLAG_USE_PNFS_DS role | ||||
| set, then as noted in Section 13.6 the client will not be able to | ||||
| determine the data server's lease_time attribute, because GETATTR | ||||
| will not be permitted. Instead, the rule is that any time a client | ||||
| receives a layout referring it to a data server that returns just the | ||||
| EXCHGID4_FLAG_USE_PNFS_DS role, the client MAY assume that the | ||||
| lease_time attribute from the metadata server that returned the | ||||
| layout applies to the data server. Thus the data server MUST be | ||||
| aware of the values of all lease_time attributes of all metadata | ||||
| servers it is providing I/O for, and MUST use the maximum of all such | ||||
| lease_time values as the lease interval for all client IDs and | ||||
| sessions established on it. | ||||
| For example, if one metadata server has a lease_time attribute of 20 | ||||
| seconds, and a second metadata server has a lease_time attribute of | ||||
| 10 seconds, then if both servers return layouts that refer to an | ||||
| EXCHGID4_FLAG_USE_PNFS_DS-only data server, the data server MUST | ||||
| renew a client's lease if the interval between two SEQUENCE | ||||
| operations on different COMPOUND requests is less than 20 seconds. | ||||
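The lease rule for a data-server-only role can be sketched as follows. The function names are hypothetical, and the list of lease_time values is assumed to be learned by the data server through some means outside this example.

```python
from typing import Iterable


def data_server_lease_interval(mds_lease_times: Iterable[int]) -> int:
    """Lease interval (seconds) a DS-only server uses for all client IDs
    and sessions: the maximum lease_time of the metadata servers it serves."""
    lease_times = list(mds_lease_times)
    if not lease_times:
        raise ValueError("data server is not serving any metadata server")
    return max(lease_times)


def lease_is_renewed(seconds_between_sequences: int,
                     mds_lease_times: Iterable[int]) -> bool:
    """True if two SEQUENCE operations this far apart keep the lease alive."""
    return seconds_between_sequences < data_server_lease_interval(mds_lease_times)
```

With metadata servers advertising 20 and 10 seconds, the data server uses 20 seconds, so a client that sends SEQUENCE every 15 seconds keeps its lease even though 15 exceeds the smaller value.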
| 13.2. File Layout Definitions | 13.2. File Layout Definitions | |||
| The following definitions apply to the LAYOUT4_NFSV4_1_FILES layout | The following definitions apply to the LAYOUT4_NFSV4_1_FILES layout | |||
| type, and may be applicable to other layout types. | type, and may be applicable to other layout types. | |||
| Unit. A unit is a fixed size quantity of data written to a data | Unit. A unit is a fixed size quantity of data written to a data | |||
| server. | server. | |||
| Pattern. A pattern is a method of distributing one or more equal | Pattern. A pattern is a method of distributing one or more equal | |||
| sized units across a set of data servers. A pattern is iterated | sized units across a set of data servers. A pattern is iterated | |||
| skipping to change at page 304, line 20 | skipping to change at page 305, line 20 | |||
| personalities, each COMPOUND sent by the client MUST be constructed | personalities, each COMPOUND sent by the client MUST be constructed | |||
| so that it is appropriate to one of the two personalities, and must | so that it is appropriate to one of the two personalities, and must | |||
| not contain operations directed to a mix of those personalities. The | not contain operations directed to a mix of those personalities. The | |||
| server MUST enforce this. To understand the constraints, operations | server MUST enforce this. To understand the constraints, operations | |||
| within a COMPOUND are divided into the following three classes: | within a COMPOUND are divided into the following three classes: | |||
| 1. An operation which is ambiguous regarding its personality | 1. An operation which is ambiguous regarding its personality | |||
| assignment. These include all of the data-server housekeeping | assignment. These include all of the data-server housekeeping | |||
| operations. Additionally, if the server has assigned filehandles | operations. Additionally, if the server has assigned filehandles | |||
| so that the ones defined by the layout are the same as those used | so that the ones defined by the layout are the same as those used | |||
| by the meta-data server, all operations in the second class are | by the metadata server, all operations in the second class are | |||
| within this group unless a stateid used is incompatible with a | within this group unless a stateid used is incompatible with a | |||
| data-server personality in that it is a special stateid or has a | data-server personality in that it is a special stateid or has a | |||
| non-zero seqid field. | non-zero seqid field. | |||
| 2. An operation which is referable to the data server personality. | 2. An operation which is referable to the data server personality. | |||
| These are data-server I/O operations where the filehandle is one | These are data-server I/O operations where the filehandle is one | |||
| that can only be validly directed to the data-server personality. | that can only be validly directed to the data-server personality. | |||
| 3. An operation which is referable to the non-data-server | 3. An operation which is referable to the non-data-server | |||
| personality. These include all COMPOUND operations that are | personality. These include all COMPOUND operations that are | |||
| skipping to change at page 305, line 41 | skipping to change at page 306, line 41 | |||
| has completed (see Section 12.5.4.2). Section 13.10 describes the | has completed (see Section 12.5.4.2). Section 13.10 describes the | |||
| mechanism by which the client is to handle data server files that do | mechanism by which the client is to handle data server files that do | |||
| not reflect the metadata server's size. | not reflect the metadata server's size. | |||
| 13.7. COMMIT Through Metadata Server | 13.7. COMMIT Through Metadata Server | |||
| The file layout provides two alternate means of providing for the | The file layout provides two alternate means of providing for the | |||
| commit of data written through data servers. The flag | commit of data written through data servers. The flag | |||
| NFL4_UFLG_COMMIT_THRU_MDS in the field nfl_util of the file layout | NFL4_UFLG_COMMIT_THRU_MDS in the field nfl_util of the file layout | |||
| (data type nfsv4_1_file_layout4) is an indication from the metadata | (data type nfsv4_1_file_layout4) is an indication from the metadata | |||
| server to the client of the preferred way of performing COMMIT, | server to the client of the REQUIRED way of performing COMMIT, either | |||
| either by sending the COMMIT to the data server or the metadata | by sending the COMMIT to the data server or the metadata server. | |||
| server. These two methods of dealing with the issue correspond to | These two methods of dealing with the issue correspond to broad | |||
| broad styles of implementation for a pNFS server supporting the files | styles of implementation for a pNFS server supporting the files | |||
| layout type. | layout type. | |||
| o When the flag is false, COMMIT operations are to be done to the | o When the flag is FALSE, COMMIT operations MUST be sent to the | |||
| data server to which the corresponding writes were done. This | data server to which the corresponding WRITE operations were sent. | |||
| approach is most useful when striping of files is implemented as | This approach is most useful when striping of files is implemented | |||
| part of the pNFS server, with the individual data servers each | as part of the pNFS server, with the individual data servers each | |||
| implementing their own file systems. | implementing their own file systems. | |||
| o When the flag is true, COMMIT operations are done to the metadata | o When the flag is TRUE, COMMIT operations MUST be sent to the | |||
| server, rather than to the individual data servers. This approach | metadata server, rather than to the individual data servers. This | |||
| is most useful when the pNFS server is implemented on top of a | approach is most useful when the pNFS server is implemented on top | |||
| clustered file system. In such an implementation, sending | of a clustered file system. In such an implementation, sending | |||
| COMMITs to multiple data servers may result in repeated writes of | COMMITs to multiple data servers may result in repeated writes of | |||
| metadata blocks as each individual COMMIT is executed, to the | metadata blocks as each individual COMMIT is executed, to the | |||
| detriment of write performance. Sending a single COMMIT to the | detriment of write performance. Sending a single COMMIT to the | |||
| metadata server can provide more efficiency when there exists a | metadata server can provide more efficiency when there exists a | |||
| clustered file system capable of implementing such a co-ordinated | clustered file system capable of implementing such a co-ordinated | |||
| COMMIT. | COMMIT. | |||
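The two cases above can be sketched from the client's side as follows. This is a minimal sketch: the function name and the string server identifiers are hypothetical, and the flag value 0x00000002 is an assumption taken to match the protocol's XDR definition, which should be consulted for the authoritative value.

```python
from typing import List

# Assumed bit value for NFL4_UFLG_COMMIT_THRU_MDS within nfl_util.
NFL4_UFLG_COMMIT_THRU_MDS = 0x00000002


def commit_targets(nfl_util: int,
                   metadata_server: str,
                   data_servers_written: List[str]) -> List[str]:
    """Return the servers to which a client would send COMMIT."""
    if nfl_util & NFL4_UFLG_COMMIT_THRU_MDS:
        # Flag set: a single COMMIT goes to the metadata server.
        return [metadata_server]
    # Flag clear: one COMMIT per data server that received WRITEs,
    # de-duplicated while preserving first-seen order.
    return list(dict.fromkeys(data_servers_written))
```

For example, commit_targets(0x2, "mds", ["ds1", "ds2"]) yields only the metadata server, while the same call with the flag clear yields each data server that was written to.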
| If nfl_util & NFL4_UFLG_COMMIT_THRU_MDS is TRUE, then in order to | If nfl_util & NFL4_UFLG_COMMIT_THRU_MDS is TRUE, then in order to | |||
| maintain the current NFSv4.1 commit and recovery model, the data | maintain the current NFSv4.1 commit and recovery model, the data | |||
| servers MUST return a common writeverf verifier in all WRITE | servers MUST return a common writeverf verifier in all WRITE | |||
| skipping to change at page 314, line 32 | skipping to change at page 315, line 32 | |||
| Table B.1 | Table B.1 | |||
| Table B.2 is normally not part of the nfs4_cs_prep profile as it is | Table B.2 is normally not part of the nfs4_cs_prep profile as it is | |||
| primarily for dealing with case-insensitive comparisons. However, if | primarily for dealing with case-insensitive comparisons. However, if | |||
| the NFSv4.1 file server supports the case_insensitive file system | the NFSv4.1 file server supports the case_insensitive file system | |||
| attribute, and if case_insensitive is true, the NFSv4.1 server MUST | attribute, and if case_insensitive is true, the NFSv4.1 server MUST | |||
| use Table B.2 (in addition to Table B.1) when processing utf8str_cs | use Table B.2 (in addition to Table B.1) when processing utf8str_cs | |||
| strings, and the NFSv4.1 client MUST assume Table B.2 (in addition to | strings, and the NFSv4.1 client MUST assume Table B.2 (in addition to | |||
| Table B.1) are being used. | Table B.1) are being used. | |||
| If the case_preserving attribute is present and set to false, then | If the case_preserving attribute is present and set to FALSE, then | |||
| the NFSv4.1 server MUST use Table B.2 to map case when processing | the NFSv4.1 server MUST use Table B.2 to map case when processing | |||
| utf8str_cs strings. Whether the server maps from lower to upper case | utf8str_cs strings. Whether the server maps from lower to upper case | |||
| or from upper to lower case is an implementation dependency. | or from upper to lower case is an implementation dependency. | |||
| 14.1.4. Normalization used by nfs4_cs_prep | 14.1.4. Normalization used by nfs4_cs_prep | |||
| The nfs4_cs_prep profile does not specify a normalization form. A | The nfs4_cs_prep profile does not specify a normalization form. A | |||
| later revision of this specification may specify a particular | later revision of this specification may specify a particular | |||
| normalization form. Therefore, the server and client can expect that | normalization form. Therefore, the server and client can expect that | |||
| they may receive unnormalized characters within protocol requests and | they may receive unnormalized characters within protocol requests and | |||
| skipping to change at page 342, line 35 | skipping to change at page 343, line 35 | |||
| | GETFH | NFS4ERR_FHEXPIRED, NFS4ERR_MOVED, | | | GETFH | NFS4ERR_FHEXPIRED, NFS4ERR_MOVED, | | |||
| | | NFS4ERR_NOFILEHANDLE, | | | | NFS4ERR_NOFILEHANDLE, | | |||
| | | NFS4ERR_OP_NOT_IN_SESSION, NFS4ERR_STALE | | | | NFS4ERR_OP_NOT_IN_SESSION, NFS4ERR_STALE | | |||
| | ILLEGAL | NFS4ERR_BADXDR, NFS4ERR_OP_ILLEGAL | | | ILLEGAL | NFS4ERR_BADXDR, NFS4ERR_OP_ILLEGAL | | |||
| | LAYOUTCOMMIT | NFS4ERR_ACCESS, NFS4ERR_ADMIN_REVOKED, | | | LAYOUTCOMMIT | NFS4ERR_ACCESS, NFS4ERR_ADMIN_REVOKED, | | |||
| | | NFS4ERR_ATTRNOTSUPP, NFS4ERR_BADIOMODE, | | | | NFS4ERR_ATTRNOTSUPP, NFS4ERR_BADIOMODE, | | |||
| | | NFS4ERR_BADLAYOUT, NFS4ERR_BADXDR, | | | | NFS4ERR_BADLAYOUT, NFS4ERR_BADXDR, | | |||
| | | NFS4ERR_DEADSESSION, NFS4ERR_DELAY, | | | | NFS4ERR_DEADSESSION, NFS4ERR_DELAY, | | |||
| | | NFS4ERR_EXPIRED, NFS4ERR_FBIG, | | | | NFS4ERR_EXPIRED, NFS4ERR_FBIG, | | |||
| | | NFS4ERR_FHEXPIRED, NFS4ERR_GRACE, | | | | NFS4ERR_FHEXPIRED, NFS4ERR_GRACE, | | |||
| | | NFS4ERR_IO, NFS4ERR_ISDIR, NFS4ERR_MOVED, | | | | NFS4ERR_INVAL, NFS4ERR_IO, NFS4ERR_ISDIR, | | |||
| | | NFS4ERR_NOFILEHANDLE, NFS4ERR_NOTSUPP, | | | | NFS4ERR_MOVED, NFS4ERR_NOFILEHANDLE, | | |||
| | | NFS4ERR_NO_GRACE, | | | | NFS4ERR_NOTSUPP, NFS4ERR_NO_GRACE, | | |||
| | | NFS4ERR_OP_NOT_IN_SESSION, | | | | NFS4ERR_OP_NOT_IN_SESSION, | | |||
| | | NFS4ERR_RECLAIM_BAD, | | | | NFS4ERR_RECLAIM_BAD, | | |||
| | | NFS4ERR_RECLAIM_CONFLICT, | | | | NFS4ERR_RECLAIM_CONFLICT, | | |||
| | | NFS4ERR_REP_TOO_BIG, | | | | NFS4ERR_REP_TOO_BIG, | | |||
| | | NFS4ERR_REP_TOO_BIG_TO_CACHE, | | | | NFS4ERR_REP_TOO_BIG_TO_CACHE, | | |||
| | | NFS4ERR_REQ_TOO_BIG, NFS4ERR_SERVERFAULT, | | | | NFS4ERR_REQ_TOO_BIG, NFS4ERR_SERVERFAULT, | | |||
| | | NFS4ERR_STALE, NFS4ERR_SYMLINK, | | | | NFS4ERR_STALE, NFS4ERR_SYMLINK, | | |||
| | | NFS4ERR_TOO_MANY_OPS, | | | | NFS4ERR_TOO_MANY_OPS, | | |||
| | | NFS4ERR_UNKNOWN_LAYOUTTYPE, | | | | NFS4ERR_UNKNOWN_LAYOUTTYPE, | | |||
| | | NFS4ERR_WRONG_CRED | | | | NFS4ERR_WRONG_CRED | | |||
| skipping to change at page 361, line 38 | skipping to change at page 362, line 38 | |||
| | NFS4ERR_INVAL | ACCESS, BACKCHANNEL_CTL, | | | NFS4ERR_INVAL | ACCESS, BACKCHANNEL_CTL, | | |||
| | | BIND_CONN_TO_SESSION, | | | | BIND_CONN_TO_SESSION, | | |||
| | | CB_GETATTR, CB_LAYOUTRECALL, | | | | CB_GETATTR, CB_LAYOUTRECALL, | | |||
| | | CB_NOTIFY, CB_PUSH_DELEG, | | | | CB_NOTIFY, CB_PUSH_DELEG, | | |||
| | | CB_RECALLABLE_OBJ_AVAIL, | | | | CB_RECALLABLE_OBJ_AVAIL, | | |||
| | | CB_RECALL_ANY, CREATE, | | | | CB_RECALL_ANY, CREATE, | | |||
| | | CREATE_SESSION, DELEGRETURN, | | | | CREATE_SESSION, DELEGRETURN, | | |||
| | | EXCHANGE_ID, GETATTR, | | | | EXCHANGE_ID, GETATTR, | | |||
| | | GETDEVICEINFO, GETDEVICELIST, | | | | GETDEVICEINFO, GETDEVICELIST, | | |||
| | | GET_DIR_DELEGATION, | | | | GET_DIR_DELEGATION, | | |||
| | | LAYOUTGET, LAYOUTRETURN, | | | | LAYOUTCOMMIT, LAYOUTGET, | | |||
| | | LINK, LOCK, LOCKT, LOCKU, | | | | LAYOUTRETURN, LINK, LOCK, | | |||
| | | LOOKUP, NVERIFY, OPEN, | | | | LOCKT, LOCKU, LOOKUP, | | |||
| | | NVERIFY, OPEN, | | ||||
| | | OPEN_DOWNGRADE, READ, | | | | OPEN_DOWNGRADE, READ, | | |||
| | | READDIR, READLINK, | | | | READDIR, READLINK, | | |||
| | | RECLAIM_COMPLETE, REMOVE, | | | | RECLAIM_COMPLETE, REMOVE, | | |||
| | | RENAME, SECINFO, | | | | RENAME, SECINFO, | | |||
| | | SECINFO_NO_NAME, SETATTR, | | | | SECINFO_NO_NAME, SETATTR, | | |||
| | | VERIFY, WANT_DELEGATION, | | | | VERIFY, WANT_DELEGATION, | | |||
| | | WRITE | | | | WRITE | | |||
| | NFS4ERR_IO | ACCESS, COMMIT, CREATE, | | | NFS4ERR_IO | ACCESS, COMMIT, CREATE, | | |||
| | | GETATTR, GETDEVICELIST, | | | | GETATTR, GETDEVICELIST, | | |||
| | | GET_DIR_DELEGATION, | | | | GET_DIR_DELEGATION, | | |||
| skipping to change at page 426, line 48 | skipping to change at page 427, line 48 | |||
| In the absence of a persistent session, the client invokes exclusive | In the absence of a persistent session, the client invokes exclusive | |||
| create by setting the how parameter to EXCLUSIVE4 or EXCLUSIVE4_1. | create by setting the how parameter to EXCLUSIVE4 or EXCLUSIVE4_1. | |||
| In these cases, the client provides a verifier that can reasonably be | In these cases, the client provides a verifier that can reasonably be | |||
| expected to be unique. A combination of a client identifier, perhaps | expected to be unique. A combination of a client identifier, perhaps | |||
| the client network address, and a unique number generated by the | the client network address, and a unique number generated by the | |||
| client, perhaps the RPC transaction identifier, may be appropriate. | client, perhaps the RPC transaction identifier, may be appropriate. | |||
| If the object does not exist, the server creates the object and | If the object does not exist, the server creates the object and | |||
| stores the verifier in stable storage. For file systems that do not | stores the verifier in stable storage. For file systems that do not | |||
| provide a mechanism for the storage of arbitrary file attributes, the | provide a mechanism for the storage of arbitrary file attributes, the | |||
| server may use one or more elements of the object meta-data to store | server may use one or more elements of the object metadata to store | |||
| the verifier. The verifier must be stored in stable storage to | the verifier. The verifier must be stored in stable storage to | |||
| prevent erroneous failure on retransmission of the request. It is | prevent erroneous failure on retransmission of the request. It is | |||
| assumed that an exclusive create is being performed because exclusive | assumed that an exclusive create is being performed because exclusive | |||
| semantics are critical to the application. Because of the expected | semantics are critical to the application. Because of the expected | |||
| usage, exclusive CREATE does not rely solely on the server's reply | usage, exclusive CREATE does not rely solely on the server's reply | |||
| cache for storage of the verifier. A nonpersistent reply cache does | cache for storage of the verifier. A nonpersistent reply cache does | |||
| not survive a crash and the session and reply cache may be deleted | not survive a crash and the session and reply cache may be deleted | |||
| after a network partition that exceeds the lease time, thus opening | after a network partition that exceeds the lease time, thus opening | |||
| failure windows. | failure windows. | |||
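One way a client might build such a verifier is sketched below. The sketch assumes an 8-byte verifier, an IPv4 client address, and a 32-bit RPC transaction identifier; the function name is hypothetical, and this is only one of many combinations that could reasonably be expected to be unique.

```python
import ipaddress
import struct


def exclusive_create_verifier(client_ipv4: str, xid: int) -> bytes:
    """Build an 8-byte verifier from the client's IPv4 address and RPC XID.

    Layout (assumed for illustration): 4 bytes of address followed by
    4 bytes of transaction identifier, both big-endian.
    """
    address = int(ipaddress.IPv4Address(client_ipv4))
    # Mask the XID to 32 bits so that struct.pack never overflows.
    return struct.pack(">II", address, xid & 0xFFFFFFFF)
```

The client must reuse the same verifier when it retransmits the request, so in practice the value would be computed once and stored with the pending request rather than regenerated on each retry.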
| skipping to change at page 485, line 31 | skipping to change at page 486, line 31 | |||
| uses (which will be either what the client offered, or what the | uses (which will be either what the client offered, or what the | |||
| server is insisting on). return the value used to the client. These | server is insisting on). return the value used to the client. These | |||
| parameters have the following interpretation. | parameters have the following interpretation. | |||
| csa_flags: | csa_flags: | |||
| The csa_flags field contains a list of the following flag bits: | The csa_flags field contains a list of the following flag bits: | |||
| CREATE_SESSION4_FLAG_PERSIST: | CREATE_SESSION4_FLAG_PERSIST: | |||
| If CREATE_SESSION4_FLAG_PERSIST is set, the client desires | If CREATE_SESSION4_FLAG_PERSIST is set, the client wants the | |||
| server support for persistent reply cache. For sessions in | server to provide a persistent reply cache. For sessions in | |||
| which only idempotent operations will be used (e.g. a read-only | which only idempotent operations will be used (e.g. a read-only | |||
| session), clients SHOULD NOT set CREATE_SESSION4_FLAG_PERSIST. | session), clients SHOULD NOT set CREATE_SESSION4_FLAG_PERSIST. | |||
| If the server does not or cannot provide a persistent reply | If the server does not or cannot provide a persistent reply | |||
| cache, the server MUST NOT set CREATE_SESSION4_FLAG_PERSIST in | cache, the server MUST NOT set CREATE_SESSION4_FLAG_PERSIST in | |||
| the field csr_flags. | the field csr_flags. | |||
| If the server is a pNFS metadata server, for reasons described | If the server is a pNFS metadata server, for reasons described | |||
| in Section 12.5.2 it SHOULD support | in Section 12.5.2 it SHOULD support | |||
| CREATE_SESSION4_FLAG_PERSIST if it supports the layout_hint | CREATE_SESSION4_FLAG_PERSIST if it supports the layout_hint | |||
| (Section 5.11.4) attribute. | (Section 5.11.4) attribute. | |||
| skipping to change at page 493, line 20 | skipping to change at page 494, line 20 | |||
| 18.37.2. RESULT | 18.37.2. RESULT | |||
| struct DESTROY_SESSION4res { | struct DESTROY_SESSION4res { | |||
| nfsstat4 dsr_status; | nfsstat4 dsr_status; | |||
| }; | }; | |||
| 18.37.3. DESCRIPTION | 18.37.3. DESCRIPTION | |||
| The DESTROY_SESSION operation closes the session and discards the | The DESTROY_SESSION operation closes the session and discards the | |||
| session's its reply cache, if any. Any remaining connections | session's reply cache, if any. Any remaining connections associated | |||
| associated with the session are immediately disassociated and, if not | with the session are immediately disassociated and, if not associated | |||
| associated with other sessions, MAY be closed by the server. Locks, | with other sessions, MAY be closed by the server. Locks, delegations, | |||
| delegations, layouts, wants, and the lease, which are all tied to the | layouts, wants, and the lease, which are all tied to the client ID, | |||
| client ID, are not affected by DESTROY_SESSION. | are not affected by DESTROY_SESSION. | |||
| DESTROY_SESSION MUST be invoked on a connection that is associated | DESTROY_SESSION MUST be invoked on a connection that is associated | |||
| with the session being destroyed. In addition if SP4_MACH_CRED state | with the session being destroyed. In addition if SP4_MACH_CRED state | |||
| protection was specified when the client ID was created, the | protection was specified when the client ID was created, the | |||
| RPCSEC_GSS principal that created the session MUST be the one that | RPCSEC_GSS principal that created the session MUST be the one that | |||
| destroys the session, using RPCSEC_GSS privacy or integrity. If | destroys the session, using RPCSEC_GSS privacy or integrity. If | |||
| SP4_SSV state protection was specified when the client ID was | SP4_SSV state protection was specified when the client ID was | |||
| created, RPCSEC_GSS using the SSV mechanism (Section 2.10.8) MUST be | created, RPCSEC_GSS using the SSV mechanism (Section 2.10.8) MUST be | |||
| used, with integrity or privacy. | used, with integrity or privacy. | |||
| If the COMPOUND request starts with SEQUENCE, and if the sessions | If the COMPOUND request starts with SEQUENCE, and if the sessions | |||
| referred to by SEQUENCE and DESTROY_SESSION are the same, then | referred to by SEQUENCE and DESTROY_SESSION are the same, then | |||
| o DESTROY_SESSION MUST be the final operation in the COMPOUND | o DESTROY_SESSION MUST be the final operation in the COMPOUND | |||
| request. | request. | |||
| o It is advisable to not place DESTROY_SESSION in a COMPOUND request | o It is advisable to not place DESTROY_SESSION in a COMPOUND request | |||
| with other state-modifying operations, because the DESTROY_SESSION | with other state-modifying operations, because the DESTROY_SESSION | |||
| will destroy reply cache. | will destroy the reply cache. | |||
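
The placement rule above can be sketched as a small check. This is only an illustration: the function name and the representation of a COMPOUND as a list of (operation name, session ID) tuples are assumptions made for the example, not structures defined by the draft.

```python
def destroy_session_placement_ok(ops):
    """Check the DESTROY_SESSION placement rule for one COMPOUND.

    ops is a hypothetical list of (op_name, sessionid) tuples in request
    order; sessionid is None for operations that carry no session ID.
    """
    if not ops or ops[0][0] != "SEQUENCE":
        # The rule only applies when the COMPOUND starts with SEQUENCE.
        return True
    seq_session = ops[0][1]
    last_index = len(ops) - 1
    for index, (name, sessionid) in enumerate(ops):
        if name == "DESTROY_SESSION" and sessionid == seq_session:
            # Same session as SEQUENCE: must be the final operation.
            if index != last_index:
                return False
    return True
```

A request that destroys a different session than the one named in SEQUENCE is not constrained by this particular rule, which is why the check compares session IDs before looking at position.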
| DESTROY_SESSION MAY be the only operation in a COMPOUND request. | DESTROY_SESSION MAY be the only operation in a COMPOUND request. | |||
| Because the session is destroyed, a client that retries the request | Because the session is destroyed, a client that retries the request | |||
| may receive an error in reply to the retry, even though the original | may receive an error in reply to the retry, even though the original | |||
| request was successful. | request was successful. | |||
| If there is a backchannel on the session and the server has | If there is a backchannel on the session and the server has | |||
| outstanding CB_COMPOUND operations for the session which have not | outstanding CB_COMPOUND operations for the session which have not | |||
| been replied to, then the server MAY refuse to destroy the session | been replied to, then the server MAY refuse to destroy the session | |||
| skipping to change at page 504, line 32 | skipping to change at page 505, line 32 | |||
| void; | void; | |||
| }; | }; | |||
| 18.42.3. DESCRIPTION | 18.42.3. DESCRIPTION | |||
| Commits changes in the layout represented by the current filehandle, | Commits changes in the layout represented by the current filehandle, | |||
| client ID (derived from the sessionid in the preceding SEQUENCE | client ID (derived from the sessionid in the preceding SEQUENCE | |||
| operation), byte range, and stateid. Since layouts are sub- | operation), byte range, and stateid. Since layouts are sub- | |||
| dividable, a smaller portion of a layout, retrieved via LAYOUTGET, | dividable, a smaller portion of a layout, retrieved via LAYOUTGET, | |||
| may be committed. The region being committed is specified through | may be committed. The region being committed is specified through | |||
| the byte range (loca_offset and loca_length). | the byte range (loca_offset and loca_length). This region MUST | |||
| overlap with one or more existing layouts previously granted via | ||||
| LAYOUTGET (Section 18.43), each with an iomode of LAYOUTIOMODE4_RW. | ||||
| The LAYOUTCOMMIT operation indicates that the client has completed | The LAYOUTCOMMIT operation indicates that the client has completed | |||
| writes using a layout obtained by a previous LAYOUTGET. The client | writes using a layout obtained by a previous LAYOUTGET. The client | |||
| may have only written a subset of the data range it previously | may have only written a subset of the data range it previously | |||
| requested. LAYOUTCOMMIT allows it to commit or discard provisionally | requested. LAYOUTCOMMIT allows it to commit or discard provisionally | |||
| allocated space and to update the server with a new end of file. The | allocated space and to update the server with a new end of file. The | |||
| layout referenced by LAYOUTCOMMIT is still valid after the operation | layout referenced by LAYOUTCOMMIT is still valid after the operation | |||
| completes and can continue to be referenced by the client ID, | completes and can continue to be referenced by the client ID, | |||
| filehandle, byte range, layout type, and stateid. | filehandle, byte range, layout type, and stateid. | |||
| If the loca_reclaim field is set to TRUE, this indicates that the | If the loca_reclaim field is set to TRUE, this indicates that the | |||
| client is attempting to commit changes to a layout after the reboot | client is attempting to commit changes to a layout after the reboot | |||
| of the metadata server during the metadata server's recovery grace | of the metadata server during the metadata server's recovery grace | |||
| period. This type of request may be necessary when the client has | period (see Section 12.7.4). This type of request may be necessary | |||
| uncommitted writes to provisionally allocated regions of a file which | when the client has uncommitted writes to provisionally allocated | |||
| were sent to the storage devices before the reboot of the metadata | regions of a file which were sent to the storage devices before the | |||
| server. In this case the layout provided by the client MUST be a | reboot of the metadata server. In this case the layout provided by | |||
| subset of a writable layout that the client held immediately before | the client MUST be a subset of a writable layout that the client held | |||
| the reboot of the metadata server. The metadata server is free to | immediately before the reboot of the metadata server. The metadata | |||
| accept or reject this request based on its own internal metadata | server is free to accept or reject this request based on its own | |||
| consistency checks. If the metadata server finds that the layout | internal metadata consistency checks. If the metadata server finds | |||
| provided by the client does not pass its consistency checks, it MUST | that the layout provided by the client does not pass its consistency | |||
| reject the request with the status NFS4ERR_RECLAIM_BAD. The | checks, it MUST reject the request with the status | |||
| successful completion of the LAYOUTCOMMIT request with loca_reclaim | NFS4ERR_RECLAIM_BAD. The successful completion of the LAYOUTCOMMIT | |||
| set to TRUE does NOT provide the client with a layout for the file. | request with loca_reclaim set to TRUE does NOT provide the client | |||
| It simply commits the changes to the layout specified in the | with a layout for the file. It simply commits the changes to the | |||
| loca_layoutupdate field. To obtain a layout for the file the client | layout specified in the loca_layoutupdate field. To obtain a layout | |||
| must send a LAYOUTGET request to the server after the server's grace | for the file the client must send a LAYOUTGET request to the server | |||
| period has expired. If the metadata server receives a LAYOUTCOMMIT | after the server's grace period has expired. If the metadata server | |||
| request with loca_reclaim set to TRUE when the metadata server is not | receives a LAYOUTCOMMIT request with loca_reclaim set to TRUE when | |||
| in its recovery grace period, it MUST reject the request with the | the metadata server is not in its recovery grace period, it MUST | |||
| status NFS4ERR_NO_GRACE. | reject the request with the status NFS4ERR_NO_GRACE. | |||
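
The reclaim handling described above can be summarized as a decision function. This is a sketch under stated assumptions: the function name, the boolean inputs, and the order in which the two checks are made are all illustrative choices, since the text does not fix an order. Status codes are shown as strings for readability.

```python
def layoutcommit_reclaim_status(loca_reclaim, in_grace, passes_consistency_checks):
    """Return the status a metadata server might give a reclaim LAYOUTCOMMIT.

    in_grace and passes_consistency_checks are hypothetical inputs that
    stand in for server-internal state.
    """
    if not loca_reclaim:
        return None  # not a reclaim request; outside the scope of this sketch
    if not in_grace:
        # Reclaim attempted outside the recovery grace period.
        return "NFS4ERR_NO_GRACE"
    if not passes_consistency_checks:
        # Layout supplied by the client failed the server's own checks.
        return "NFS4ERR_RECLAIM_BAD"
    # Success commits the changes but does NOT grant the client a layout.
    return "NFS4_OK"
```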
| Setting the loca_reclaim field to TRUE is required if and only if the | Setting the loca_reclaim field to TRUE is required if and only if the | |||
| committed layout was acquired before the metadata server reboot. If | committed layout was acquired before the metadata server reboot. If | |||
| the client is committing a layout that was acquired during the | the client is committing a layout that was acquired during the | |||
| metadata server's grace period, it MUST set the "reclaim" field to | metadata server's grace period, it MUST set the "reclaim" field to | |||
| FALSE. | FALSE. | |||
| The loca_stateid is a layout stateid value as returned by previously | The loca_stateid is a layout stateid value as returned by previously | |||
| successful layout operations (see Section 12.5.3). | successful layout operations (see Section 12.5.3). | |||
| The loca_last_write_offset field specifies the offset of the last | The loca_last_write_offset field specifies the offset of the last | |||
| byte written by the client previous to the LAYOUTCOMMIT. Note that | byte written by the client previous to the LAYOUTCOMMIT. Note that | |||
| this value is never equal to the file's size (at most it is one byte | this value is never equal to the file's size (at most it is one byte | |||
| less than the file's size) and MUST be less than or equal to | less than the file's size) and MUST be less than or equal to | |||
| NFS4_MAXFILEOFF. The metadata server may use this information to | NFS4_MAXFILEOFF. Also, loca_last_write_offset MUST overlap the range | |||
| determine whether the file's size needs to be updated. If the | described by loca_offset and loca_length. The metadata server may | |||
| metadata server updates the file's size as the result of the | use this information to determine whether the file's size needs to be | |||
| LAYOUTCOMMIT operation, it must return the new size | updated. If the metadata server updates the file's size as the | |||
| result of the LAYOUTCOMMIT operation, it must return the new size | ||||
| (locr_newsize.ns_size) as part of the results. | (locr_newsize.ns_size) as part of the results. | |||
| The loca_time_modify field allows the client to suggest a | The loca_time_modify field allows the client to suggest a | |||
| modification time it would like the metadata server to set. The | modification time it would like the metadata server to set. The | |||
| metadata server may use the suggestion or it may use the time of the | metadata server may use the suggestion or it may use the time of the | |||
| LAYOUTCOMMIT operation to set the modification time. If the metadata | LAYOUTCOMMIT operation to set the modification time. If the metadata | |||
| server uses the client provided modification time, it should ensure | server uses the client provided modification time, it should ensure | |||
| time does not flow backwards. If the client wants to force the | time does not flow backwards. If the client wants to force the | |||
| metadata server to set an exact time, the client should use a SETATTR | metadata server to set an exact time, the client should use a SETATTR | |||
| operation in a compound right after LAYOUTCOMMIT. See Section 12.5.4 | operation in a compound right after LAYOUTCOMMIT. See Section 12.5.4 | |||
| skipping to change at page 508, line 11 | skipping to change at page 509, line 11 | |||
| The LAYOUTGET operation returns layout information for the specified | The LAYOUTGET operation returns layout information for the specified | |||
| byte range: a layout. To get a layout from a specific offset through | byte range: a layout. To get a layout from a specific offset through | |||
| the end-of-file, regardless of the file's length, a loga_length field | the end-of-file, regardless of the file's length, a loga_length field | |||
| with all bits set to 1 (one) should be used. If loga_length is zero, | with all bits set to 1 (one) should be used. If loga_length is zero, | |||
| or if a loga_length which is not all bits set to one is specified, | or if a loga_length which is not all bits set to one is specified, | |||
| and loga_length when added to loga_offset exceeds the maximum 64-bit | and loga_length when added to loga_offset exceeds the maximum 64-bit | |||
| unsigned integer value, the error NFS4ERR_INVAL will result. | unsigned integer value, the error NFS4ERR_INVAL will result. | |||
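
The range rule above amounts to a short arithmetic check. The following is a minimal sketch; the function name is an assumption, and Python integers are used so that the 64-bit overflow condition can be tested directly rather than by wraparound.

```python
UINT64_MAX = 2**64 - 1  # a length with all 64 bits set to one


def layoutget_range_is_valid(loga_offset, loga_length):
    """Return False where the text above says NFS4ERR_INVAL will result."""
    if loga_length == 0:
        return False
    if loga_length == UINT64_MAX:
        # Special value: from loga_offset through end-of-file.
        return True
    # Otherwise offset + length must not exceed the largest 64-bit value.
    return loga_offset + loga_length <= UINT64_MAX
```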
| The loga_minlength field specifies the minimum length of layout the | The loga_minlength field specifies the minimum length of layout the | |||
| server MUST return. If this requirement cannot be met, no layout | server MUST return with two exceptions: | |||
| must be returned; the error NFS4ERR_BADLAYOUT will be returned. | ||||
| 1. The argument loga_iomode was set to LAYOUTIOMODE_READ, and | ||||
| loga_offset plus loga_minlength goes past the end of the file. | ||||
| 2. The range from loga_offset through loga_offset + loga_minlength - | ||||
| 1 overlaps two or more striping patterns. In which case, | ||||
| logr_layout will contain two or more elements, and the sum of the | ||||
| lo_length fields of each element MUST be at least loga_minlength | ||||
| unless the first exception also applies. | ||||
| If this requirement cannot be met, the server MUST NOT return a | ||||
| layout and the error NFS4ERR_BADLAYOUT MUST be returned. | ||||
| The loga_stateid field specifies a valid stateid. If a layout is not | The loga_stateid field specifies a valid stateid. If a layout is not | |||
| currently held by the client, the loga_stateid field represents a | currently held by the client, the loga_stateid field represents a | |||
| stateid reflecting the correspondingly valid open, record lock, or | stateid reflecting the correspondingly valid open, record lock, or | |||
| delegation stateid. Once a layout is held by the client for the | delegation stateid. Once a layout is held by the client for the | |||
| file, the loga_stateid field is a stateid as returned from a previous | file, the loga_stateid field is a stateid as returned from a previous | |||
| LAYOUTGET or LAYOUTRETURN operation or provided by a CB_LAYOUTRECALL | LAYOUTGET or LAYOUTRETURN operation or provided by a CB_LAYOUTRECALL | |||
| operation (see Section 12.5.3). | operation (see Section 12.5.3). | |||
| The loga_maxcount field specifies the maximum layout size (in bytes) | The loga_maxcount field specifies the maximum layout size (in bytes) | |||
| skipping to change at page 508, line 39 | skipping to change at page 509, line 50 | |||
| then logr_layout will contain just one entry. Otherwise, if the | then logr_layout will contain just one entry. Otherwise, if the | |||
| requested range overlaps more than one striping pattern, logr_layout | requested range overlaps more than one striping pattern, logr_layout | |||
| will contain the required number of entries. The elements of | will contain the required number of entries. The elements of | |||
| logr_layout MUST be sorted in ascending order of the value of the | logr_layout MUST be sorted in ascending order of the value of the | |||
| lo_offset field of each element. There MUST be no gaps or overlaps | lo_offset field of each element. There MUST be no gaps or overlaps | |||
| in the range between two successive elements of logr_layout. The | in the range between two successive elements of logr_layout. The | |||
| lo_iomode field in each element of logr_layout MUST be the same. | lo_iomode field in each element of logr_layout MUST be the same. | |||
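
A client could verify the ordering constraints on logr_layout with a check along these lines. This is an illustrative sketch: the Layout tuple is a hypothetical stand-in for the three fields involved, and it assumes every element has a finite, non-zero lo_length so that "ascending with no gaps or overlaps" reduces to each element starting exactly where the previous one ends.

```python
from collections import namedtuple

# Hypothetical stand-in for the layout element fields used in this check.
Layout = namedtuple("Layout", ["lo_offset", "lo_length", "lo_iomode"])


def logr_layout_is_well_formed(layouts):
    """Check ordering, contiguity, and uniform iomode of returned layouts."""
    for prev, cur in zip(layouts, layouts[1:]):
        if cur.lo_offset != prev.lo_offset + prev.lo_length:
            return False  # gap, overlap, or out-of-order element
        if cur.lo_iomode != prev.lo_iomode:
            return False  # lo_iomode must be the same in every element
    return True
```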
| The metadata server may adjust the range of the returned layout based | The metadata server may adjust the range of the returned layout based | |||
| on the usage implied by the loga_iomode. The client MUST be prepared | on the usage implied by the loga_iomode. The client MUST be prepared | |||
| to get a layout that does not align exactly with its request. The | to get a layout that does not align exactly with its request. See | |||
| lo_length field in each element of logr_layout SHOULD be at least as | ||||
| long as loga_minlength or the server SHOULD reject the request. See | ||||
| Section 12.5.2 for more details. | Section 12.5.2 for more details. | |||
| The metadata server may also return a layout with an lo_iomode other | The metadata server may also return a layout with an lo_iomode other | |||
| than that requested by the client. If it does so, it must ensure | than that requested by the client. If it does so, it MUST ensure | |||
| that the lo_iomode is more permissive than the loga_iomode requested. | that the lo_iomode is more permissive than the loga_iomode requested. | |||
| For example, this behavior allows an implementation to upgrade read- | For example, this behavior allows an implementation to upgrade read- | |||
| only requests to read/write requests at its discretion, within the | only requests to read/write requests at its discretion, within the | |||
| limits of the layout type specific protocol. A lo_iomode of either | limits of the layout type specific protocol. A lo_iomode of either | |||
| LAYOUTIOMODE4_READ or LAYOUTIOMODE4_RW must be returned. | LAYOUTIOMODE4_READ or LAYOUTIOMODE4_RW MUST be returned. | |||
| The logr_return_on_close result field is a directive to return the | The logr_return_on_close result field is a directive to return the | |||
| layout before closing the file. When the server sets this return | layout before closing the file. When the server sets this return | |||
| value to TRUE, it must be prepared to recall the layout in the case | value to TRUE, it MUST be prepared to recall the layout in the case | |||
| the client fails to return the layout before close. For the server | the client fails to return the layout before close. For the server | |||
| that knows a layout must be returned before a close of the file, this | that knows a layout must be returned before a close of the file, this | |||
| return value can be used to communicate the desired behavior to the | return value can be used to communicate the desired behavior to the | |||
| client and thus remove one extra step from the client's and server's | client and thus remove one extra step from the client's and server's | |||
| interaction. | interaction. | |||
| The logr_stateid, as with all stateid processing, is returned to the | The logr_stateid, as with all stateid processing, is returned to the | |||
| client for use in subsequent layout related operations. See | client for use in subsequent layout related operations. See | |||
| Section 8.2 for a further discussion. | Section 8.2 for a further discussion. | |||
| skipping to change at page 509, line 36 | skipping to change at page 510, line 44 | |||
| If layouts are not supported for the requested file or its containing | If layouts are not supported for the requested file or its containing | |||
| file system the server SHOULD return NFS4ERR_LAYOUTUNAVAILABLE. If | file system the server SHOULD return NFS4ERR_LAYOUTUNAVAILABLE. If | |||
| the layout type is not supported, the metadata server should return | the layout type is not supported, the metadata server should return | |||
| NFS4ERR_UNKNOWN_LAYOUTTYPE. If layouts are supported but no layout | NFS4ERR_UNKNOWN_LAYOUTTYPE. If layouts are supported but no layout | |||
| matches the client provided layout identification, the server should | matches the client provided layout identification, the server should | |||
| return NFS4ERR_BADLAYOUT. If an invalid loga_iomode is specified, or | return NFS4ERR_BADLAYOUT. If an invalid loga_iomode is specified, or | |||
| a loga_iomode of LAYOUTIOMODE4_ANY is specified, the server should | a loga_iomode of LAYOUTIOMODE4_ANY is specified, the server should | |||
| return NFS4ERR_BADIOMODE. | return NFS4ERR_BADIOMODE. | |||
| If the layout for the file is unavailable due to transient | If the layout for the file is unavailable due to transient | |||
| conditions, e.g. file sharing prohibits layouts, the server must | conditions, e.g. file sharing prohibits layouts, the server MUST | |||
| return NFS4ERR_LAYOUTTRYLATER. | return NFS4ERR_LAYOUTTRYLATER. | |||
| If the layout request is rejected due to an overlapping layout | If the layout request is rejected due to an overlapping layout | |||
| recall, the server must return NFS4ERR_RECALLCONFLICT. See | recall, the server MUST return NFS4ERR_RECALLCONFLICT. See | |||
| Section 12.5.5.2 for details. | Section 12.5.5.2 for details. | |||
| If the layout conflicts with a mandatory byte range lock held on the | If the layout conflicts with a mandatory byte range lock held on the | |||
| file, and if the storage devices have no method of enforcing | file, and if the storage devices have no method of enforcing | |||
| mandatory locks, other than through the restriction of layouts, the | mandatory locks, other than through the restriction of layouts, the | |||
| metadata server should return NFS4ERR_LOCKED. | metadata server should return NFS4ERR_LOCKED. | |||
| If the client sets loga_signal_layout_avail to TRUE, then it is | If the client sets loga_signal_layout_avail to TRUE, then it is | |||
| registering with the server a "want" for a layout in the event the | registering with the server a "want" for a layout in the event the | |||
| layout cannot be obtained due to resource exhaustion. If the server | layout cannot be obtained due to resource exhaustion. If the server | |||
| skipping to change at page 514, line 22 | skipping to change at page 515, line 22 | |||
| layout. See Section 12.5.5 for more details. | layout. See Section 12.5.5 for more details. | |||
| If the LAYOUTRETURN request sets the lora_reclaim field to TRUE after | If the LAYOUTRETURN request sets the lora_reclaim field to TRUE after | |||
| the metadata server's grace period, NFS4ERR_NO_GRACE is returned. | the metadata server's grace period, NFS4ERR_NO_GRACE is returned. | |||
| If the LAYOUTRETURN request sets the lora_reclaim field to TRUE and | If the LAYOUTRETURN request sets the lora_reclaim field to TRUE and | |||
| lr_returntype is set to LAYOUTRETURN4_FSID or LAYOUTRETURN4_ALL, | lr_returntype is set to LAYOUTRETURN4_FSID or LAYOUTRETURN4_ALL, | |||
| NFS4ERR_INVAL is returned. | NFS4ERR_INVAL is returned. | |||
| If the operation specified lr_returntype of LAYOUTRETURN4_FILE, then | If the operation specified lr_returntype of LAYOUTRETURN4_FILE, then | |||
| the lorr_stateid will represent the layout stateid as updated for | lrs_stateid will represent the layout stateid as updated for this | |||
| this operation's processing; the current stateid will also be updated | operation's processing; the current stateid will also be updated to | |||
| to match the returned value. If the last byte of any layout for the | match the returned value. If the last byte of any layout for the | |||
| current file, client ID, and layout type is being returned and there | current file, client ID, and layout type is being returned and there | |||
| are not remaining pending CB_LAYOUTRECALL operations for which a | are no remaining pending CB_LAYOUTRECALL operations for which a | |||
| LAYOUTRETURN operation must be done as a completing operation, this | LAYOUTRETURN operation must be done as a completing operation, | |||
| stateid value may be the special stateid consisting of all zeros. | lrs_present MUST be FALSE, and thus no stateid will be returned. | |||
| On success, the current filehandle retains its value. | On success, the current filehandle retains its value. | |||
| The server MAY require that the principal, security flavor, and if | The server MAY require that the principal, security flavor, and if | |||
| applicable, the GSS mechanism, combination that acquired the layout | applicable, the GSS mechanism, combination that acquired the layout | |||
| also be the one to send LAYOUTRETURN. This might not be possible if | also be the one to send LAYOUTRETURN. This might not be possible if | |||
| credentials for the principal are no longer available. The server | credentials for the principal are no longer available. The server | |||
| MAY allow the machine credential or SSV credential (see | MAY allow the machine credential or SSV credential (see | |||
| Section 18.35) to send LAYOUTRETURN. | Section 18.35) to send LAYOUTRETURN. | |||
| skipping to change at page 518, line 28 | skipping to change at page 519, line 28 | |||
| a request outstanding for; it could be equal to sa_slotid. The | a request outstanding for; it could be equal to sa_slotid. The | |||
| server returns two "highest_slotid" values: sr_highest_slotid, and | server returns two "highest_slotid" values: sr_highest_slotid, and | |||
| sr_target_highest_slotid. The former is the highest slot id the | sr_target_highest_slotid. The former is the highest slot id the | |||
| server will accept in a future SEQUENCE operation, and SHOULD NOT be | server will accept in a future SEQUENCE operation, and SHOULD NOT be | |||
| less than the value of sa_highest_slotid (but see Section 2.10.5.1 | less than the value of sa_highest_slotid (but see Section 2.10.5.1 | |||
| for an exception). The latter is the highest slot id the server | for an exception). The latter is the highest slot id the server | |||
| would prefer the client use on a future SEQUENCE operation. | would prefer the client use on a future SEQUENCE operation. | |||
| If sa_cachethis is TRUE, then the client is requesting that the | If sa_cachethis is TRUE, then the client is requesting that the | |||
| server cache the entire reply in the server's reply cache; therefore | server cache the entire reply in the server's reply cache; therefore | |||
| the server MUST cache the reply (see Section 2.10.5.1.2). The server | the server MUST cache the reply (see Section 2.10.5.1.3). The server | |||
| MAY cache the reply if sa_cachethis is FALSE. If the server does not | MAY cache the reply if sa_cachethis is FALSE. If the server does not | |||
| cache the entire reply, it MUST still record that it executed the | cache the entire reply, it MUST still record that it executed the | |||
| request at the specified slot and sequence id. | request at the specified slot and sequence id. | |||
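
The sa_cachethis behavior above can be sketched as follows. The Slot class and record_execution function are hypothetical names introduced for illustration. The sketch drops the reply when sa_cachethis is FALSE, which is one permitted choice; a server MAY cache it anyway.

```python
class Slot:
    """Hypothetical per-slot server record; not a structure from the draft."""

    def __init__(self):
        self.seqid = 0
        self.cached_reply = None


def record_execution(slot, sa_sequenceid, reply, sa_cachethis):
    """Record that a request ran at this slot, caching the reply on request."""
    # Always remember the sequence ID, even if the reply is not cached.
    slot.seqid = sa_sequenceid
    # Cache the entire reply only when the client asked for it.
    slot.cached_reply = reply if sa_cachethis else None
```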
| The response to the SEQUENCE operation contains a word of status | The response to the SEQUENCE operation contains a word of status | |||
| flags (sr_status_flags) that can provide to the client information | flags (sr_status_flags) that can provide to the client information | |||
| related to the status of the client's lock state and communications | related to the status of the client's lock state and communications | |||
| paths. Note that any status bits relating to lock state MAY be reset | paths. Note that any status bits relating to lock state MAY be reset | |||
| when lock state is lost due to a server reboot (even if the session | when lock state is lost due to a server reboot (even if the session | |||
| is persistent across reboots; session persistence does not imply lock | is persistent across reboots; session persistence does not imply lock | |||
| skipping to change at page 520, line 36 | skipping to change at page 521, line 36 | |||
| transferred to one or more new servers. This condition will | transferred to one or more new servers. This condition will | |||
| continue until the client receives an NFS4ERR_MOVED error and the | continue until the client receives an NFS4ERR_MOVED error and the | |||
| server receives the subsequent GETATTR for the fs_locations or | server receives the subsequent GETATTR for the fs_locations or | |||
| fs_locations_info attribute for an access to each file system for | fs_locations_info attribute for an access to each file system for | |||
| which a lease has been moved to a new server. See | which a lease has been moved to a new server. See | |||
| Section 11.7.7.1. | Section 11.7.7.1. | |||
| SEQ4_STATUS_RESTART_RECLAIM_NEEDED | SEQ4_STATUS_RESTART_RECLAIM_NEEDED | |||
| When set, indicates that, due to server restart or reboot, the client | When set, indicates that, due to server restart or reboot, the client | |||
| must reclaim locking state. Until the client sends a global | must reclaim locking state. Until the client sends a global | |||
| RECLAIM_COMPLETE (Section 18.51, every SEQUENCE operation will | RECLAIM_COMPLETE (Section 18.51), every SEQUENCE operation will | |||
| return SEQ4_STATUS_RESTART_RECLAIM_NEEDED. | return SEQ4_STATUS_RESTART_RECLAIM_NEEDED. | |||
| SEQ4_STATUS_BACKCHANNEL_FAULT | SEQ4_STATUS_BACKCHANNEL_FAULT | |||
| The server has encountered an unrecoverable fault with the | The server has encountered an unrecoverable fault with the | |||
| backchannel (e.g. it has lost track of the sequence id for a slot | backchannel (e.g. it has lost track of the sequence id for a slot | |||
| in the backchannel). The client MUST stop sending more requests | in the backchannel). The client MUST stop sending more requests | |||
| on the session's fore channel, wait for all outstanding requests | on the session's fore channel, wait for all outstanding requests | |||
| to complete on the fore and back channel, and then destroy the | to complete on the fore and back channel, and then destroy the | |||
| session. | session. | |||
| skipping to change at page 525, line 48 | skipping to change at page 526, line 48 | |||
| o Special stateids are always considered invalid (they result in the | o Special stateids are always considered invalid (they result in the | |||
| error code NFS4ERR_BAD_STATEID). | error code NFS4ERR_BAD_STATEID). | |||
| All stateids are interpreted as being associated with the client for | All stateids are interpreted as being associated with the client for | |||
| the current session. Any possible association with a previous | the current session. Any possible association with a previous | |||
| instance of the client (as stale stateids) is not considered. | instance of the client (as stale stateids) is not considered. | |||
| The errors which are validly returned within the status_code array | The errors which are validly returned within the status_code array | |||
| are: NFS4ERR_OK, NFS4ERR_BAD_STATEID, NFS4ERR_OLD_STATEID, | are: NFS4ERR_OK, NFS4ERR_BAD_STATEID, NFS4ERR_OLD_STATEID, | |||
| NFS4ERR_EXPIRED, NFS4ERR_ADMIN_REVOKED, and NFS4ERR_DELEG_REVOKED. | NFS4ERR_EXPIRED, NFS4ERR_ADMIN_REVOKED, and NFS4ERR_DELEG_REVOKED. | |||
| [[Comment.5: _LAYOUT_REVOKED]]. | [[Comment.4: _LAYOUT_REVOKED]]. | |||
| 18.48.4. IMPLEMENTATION | 18.48.4. IMPLEMENTATION | |||
| See Section 8.2.2 and Section 8.2.4 for a discussion of stateid | See Section 8.2.2 and Section 8.2.4 for a discussion of stateid | |||
| structure, lifetime, and validation. | structure, lifetime, and validation. | |||
| 18.49. Operation 56: WANT_DELEGATION - Request Delegation | 18.49. Operation 56: WANT_DELEGATION - Request Delegation | |||
| 18.49.1. ARGUMENT | 18.49.1. ARGUMENT | |||
| skipping to change at page 530, line 44 | skipping to change at page 531, line 44 | |||
| }; | }; | |||
| 18.51.3. DESCRIPTION | 18.51.3. DESCRIPTION | |||
| A RECLAIM_COMPLETE operation must be used to indicate that the client | A RECLAIM_COMPLETE operation must be used to indicate that the client | |||
| has reclaimed all of the locking state that it will recover, when it | has reclaimed all of the locking state that it will recover, when it | |||
| is recovering state due to either a server restart or the transfer of | is recovering state due to either a server restart or the transfer of | |||
| a file system to another server. There are two types of | a file system to another server. There are two types of | |||
| RECLAIM_COMPLETE operations: | RECLAIM_COMPLETE operations: | |||
| o When one_fs is false, a global RECLAIM_COMPLETE is being done. | o When rca_one_fs is FALSE, a global RECLAIM_COMPLETE is being done. | |||
| This indicates that recovery of all locks that the client held on | This indicates that recovery of all locks that the client held on | |||
| the previous server instance has been completed. | the previous server instance has been completed. | |||
| o When one_fs is true, a file system-specific RECLAIM_COMPLETE is | o When rca_one_fs is TRUE, a file system-specific RECLAIM_COMPLETE | |||
| being done. This indicates that recovery of locks for a single fs | is being done. This indicates that recovery of locks for a single | |||
| (the one designated by the current filehandle) due to a file | fs (the one designated by the current filehandle) due to a file | |||
| system transition has been completed. Presence of a current | system transition has been completed. Presence of a current | |||
| filehandle is only required when one_fs is true. | filehandle is only required when rca_one_fs is true. | |||
| Once a RECLAIM_COMPLETE is done, there can be no further reclaim | Once a RECLAIM_COMPLETE is done, there can be no further reclaim | |||
| operations for locks whose scope is defined as having completed | operations for locks whose scope is defined as having completed | |||
| recovery. Once the client sends RECLAIM_COMPLETE, the server will | recovery. Once the client sends RECLAIM_COMPLETE, the server will | |||
| not allow the client to do subsequent reclaims of locking state for | not allow the client to do subsequent reclaims of locking state for | |||
| that scope and will return NFS4ERR_NO_GRACE, if these are attempted. | that scope and will return NFS4ERR_NO_GRACE, if these are attempted. | |||
| Whenever a client establishes a new client ID and before it does the | Whenever a client establishes a new client ID and before it does the | |||
| first non-reclaim operation that obtains a lock, it MUST do a global | first non-reclaim operation that obtains a lock, it MUST do a global | |||
| RECLAIM_COMPLETE, even if there are no locks to reclaim. If non- | RECLAIM_COMPLETE, even if there are no locks to reclaim. If non- | |||
| reclaim locking operations are done before the RECLAIM_COMPLETE, a | reclaim locking operations are done before the RECLAIM_COMPLETE, a | |||
| NFS4ERR_GRACE will be returned. | NFS4ERR_GRACE will be returned. | |||
| Similarly, when the client accesses a file system on a new server, | Similarly, when the client accesses a file system on a new server, | |||
| before it sends the first non-reclaim operation that obtains a lock | before it sends the first non-reclaim operation that obtains a lock | |||
| on this new server, it must do a RECLAIM_COMPLETE with one_fs true | on this new server, it must do a RECLAIM_COMPLETE with rca_one_fs | |||
| and current filehandle within that file system, even if there are no | true and current filehandle within that file system, even if there | |||
| locks to reclaim. If non-reclaim locking operations are done on that | are no locks to reclaim. If non-reclaim locking operations are done | |||
| file system before the RECLAIM_COMPLETE, a NFS4ERR_GRACE will be | on that file system before the RECLAIM_COMPLETE, a NFS4ERR_GRACE will | |||
| returned. | be returned. | |||
| Any locks not reclaimed at the point at which RECLAIM_COMPLETE is | Any locks not reclaimed at the point at which RECLAIM_COMPLETE is | |||
| done become non-reclaimable. The client MUST NOT attempt to reclaim | done become non-reclaimable. The client MUST NOT attempt to reclaim | |||
| them, either during the current server instance or in any subsequent | them, either during the current server instance or in any subsequent | |||
| server instance, or on another server to which responsibility for | server instance, or on another server to which responsibility for | |||
| that file system is transferred. If the client were to do so, it | that file system is transferred. If the client were to do so, it | |||
| would be violating the protocol by representing itself as owning | would be violating the protocol by representing itself as owning | |||
| locks that it does not own, and so has no right to reclaim. See | locks that it does not own, and so has no right to reclaim. See | |||
| Section 8.4.3 for a discussion of edge conditions related to lock | Section 8.4.3 for a discussion of edge conditions related to lock | |||
| reclaim. | reclaim. | |||
| skipping to change at page 533, line 6 | skipping to change at page 534, line 6 | |||
| 18.52.4. IMPLEMENTATION | 18.52.4. IMPLEMENTATION | |||
| A client will probably not send an operation with code OP_ILLEGAL but | A client will probably not send an operation with code OP_ILLEGAL but | |||
| if it does, the response will be ILLEGAL4res just as it would be with | if it does, the response will be ILLEGAL4res just as it would be with | |||
| any other invalid operation code. Note that if the server gets an | any other invalid operation code. Note that if the server gets an | |||
| illegal operation code that is not OP_ILLEGAL, and if the server | illegal operation code that is not OP_ILLEGAL, and if the server | |||
| checks for legal operation codes during the XDR decode phase, then | checks for legal operation codes during the XDR decode phase, then | |||
| the ILLEGAL4res would not be returned. | the ILLEGAL4res would not be returned. | |||
| 19. NFSv44.1 Callback Procedures | 19. NFSv4.1 Callback Procedures | |||
| The procedures used for callbacks are defined in the following | The procedures used for callbacks are defined in the following | |||
| sections. In the interest of clarity, the terms "client" and | sections. In the interest of clarity, the terms "client" and | |||
| "server" refer to NFS clients and servers, despite the fact that for | "server" refer to NFS clients and servers, despite the fact that for | |||
| an individual callback RPC, the sense of these terms would be | an individual callback RPC, the sense of these terms would be | |||
| precisely the opposite. | precisely the opposite. | |||
| 19.1. Procedure 0: CB_NULL - No Operation | 19.1. Procedure 0: CB_NULL - No Operation | |||
| 19.1.1. ARGUMENTS | 19.1.1. ARGUMENTS | |||
| skipping to change at page 549, line 49 | skipping to change at page 550, line 49 | |||
| The server may decide that it cannot hold all of the state for | The server may decide that it cannot hold all of the state for | |||
| recallable objects, such as delegations and layouts, without running | recallable objects, such as delegations and layouts, without running | |||
| out of resources. In such a case, it is free to recall individual | out of resources. In such a case, it is free to recall individual | |||
| objects to reduce the load but this would be far from optimal. | objects to reduce the load but this would be far from optimal. | |||
| Because the general purpose of such recallable objects as delegations | Because the general purpose of such recallable objects as delegations | |||
| is to eliminate client interaction with the server, the server cannot | is to eliminate client interaction with the server, the server cannot | |||
| interpret lack of recent use as indicating that the object is no | interpret lack of recent use as indicating that the object is no | |||
| longer useful. The absence of visible use may be the result of a | longer useful. The absence of visible use may be the result of a | |||
| large number of potential operations eliminated. In the case of | large number of potential operations eliminated. In the case of | |||
| layouts, the layout will be used explicitly but the meta-data server | layouts, the layout will be used explicitly but the metadata server | |||
| does not have direct knowledge of such use. | does not have direct knowledge of such use. | |||
| In order to implement an effective reclaim scheme for such objects, | In order to implement an effective reclaim scheme for such objects, | |||
| the server's knowledge of available resources must be used to | the server's knowledge of available resources must be used to | |||
| determine when objects must be recalled with the clients selecting | determine when objects must be recalled with the clients selecting | |||
| the actual objects to be returned. | the actual objects to be returned. | |||
| Server implementations may differ in their resource allocation | Server implementations may differ in their resource allocation | |||
| requirements. For example, one server may share resources among all | requirements. For example, one server may share resources among all | |||
| classes of recallable objects whereas another may use separate | classes of recallable objects whereas another may use separate | |||
| skipping to change at page 553, line 9 | skipping to change at page 554, line 9 | |||
| slots, and if applicable, transport credits (e.g. RDMA credits for | slots, and if applicable, transport credits (e.g. RDMA credits for | |||
| connections associated with the operations channel) to the server. | connections associated with the operations channel) to the server. | |||
| CB_RECALL_SLOT specifies rsa_target_highest_slotid, the target | CB_RECALL_SLOT specifies rsa_target_highest_slotid, the target | |||
| highest_slot the server wants for the session. The client should | highest_slot the server wants for the session. The client should | |||
| then work toward reducing the highest_slot to the target. | then work toward reducing the highest_slot to the target. | |||
| If the session has only non-RDMA connections associated with its | If the session has only non-RDMA connections associated with its | |||
| operations channel, then the client need only wait for all | operations channel, then the client need only wait for all | |||
| outstanding requests with a slotid > rsa_target_highest_slotid to | outstanding requests with a slotid > rsa_target_highest_slotid to | |||
| complete, then send a single COMPOUND consisting of a single SEQUENCE | complete, then send a single COMPOUND consisting of a single SEQUENCE | |||
| operation, with the sa_highslot field set to | operation, with the sa_highestslot field set to | |||
| rsa_target_highest_slotid. If there are RDMA-based connections | rsa_target_highest_slotid. If there are RDMA-based connections | |||
| associated with the operations channel, then the client also needs to send | associated with the operations channel, then the client also needs to send | |||
| enough zero-length RDMA Sends to take the total RDMA credit count to | enough zero-length RDMA Sends to take the total RDMA credit count to | |||
| rsa_target_highest_slotid + 1 or below. | rsa_target_highest_slotid + 1 or below. | |||
| 20.8.4. IMPLEMENTATION | 20.8.4. IMPLEMENTATION | |||
| If the client fails to reduce the highest slot it has on the fore channel | If the client fails to reduce the highest slot it has on the fore channel | |||
| to what the server requests, the server can force the issue by | to what the server requests, the server can force the issue by | |||
| asserting flow control on the receive side of all connections bound | asserting flow control on the receive side of all connections bound | |||
| skipping to change at page 554, line 36 | skipping to change at page 555, line 36 | |||
| contents include the session to which this request belongs, slot id | contents include the session to which this request belongs, slot id | |||
| and sequence id used by the server to implement session request | and sequence id used by the server to implement session request | |||
| control and exactly once semantics, and exchanged slot maximums which | control and exactly once semantics, and exchanged slot maximums which | |||
| are used to adjust the size of the reply cache. This operation MUST | are used to adjust the size of the reply cache. This operation MUST | |||
| appear once as the first operation in each CB_COMPOUND request or a | appear once as the first operation in each CB_COMPOUND request or a | |||
| protocol error must result. See Section 18.46.3 for a description of | protocol error must result. See Section 18.46.3 for a description of | |||
| how slots are processed. | how slots are processed. | |||
| If csa_cachethis is TRUE, then the server is requesting that the | If csa_cachethis is TRUE, then the server is requesting that the | |||
| client cache the reply in the callback reply cache. The client MUST | client cache the reply in the callback reply cache. The client MUST | |||
| cache the reply (see Section 2.10.5.1.2). | cache the reply (see Section 2.10.5.1.3). | |||
| The csa_referring_call_lists array is the list of COMPOUND requests, | The csa_referring_call_lists array is the list of COMPOUND requests, | |||
| identified by sessionid, slot id and sequence id. These are requests | identified by sessionid, slot id and sequence id. These are requests | |||
| that the client previously sent to the server. These previous | that the client previously sent to the server. These previous | |||
| requests created state that some operation(s) in the same | requests created state that some operation(s) in the same | |||
| CB_COMPOUND as the csa_referring_call_lists is identifying. A | CB_COMPOUND as the csa_referring_call_lists is identifying. A | |||
| sessionid is included because leased state is tied to a client ID, | sessionid is included because leased state is tied to a client ID, | |||
| and a client ID can have multiple sessions. See Section 2.10.5.3. | and a client ID can have multiple sessions. See Section 2.10.5.3. | |||
| The value of csa_sequenceid argument relative to the cached sequence | The value of csa_sequenceid argument relative to the cached sequence | |||
| End of changes. 126 change blocks. | ||||
| 353 lines changed or deleted | 441 lines changed or added | |||
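The RECLAIM_COMPLETE rules in Section 18.51.3 of the diff above can be read as a small per-client state machine on the server. The following is a minimal sketch of that reading, not the draft's implementation. The class name `ClientReclaimState`, the `transitioned` set, the string-valued status codes, and the use of NFS4ERR_NOFILEHANDLE for a missing current filehandle are all assumptions made for illustration. The sketch also ignores grace-period timeouts and repeated RECLAIM_COMPLETE calls.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Status codes modelled as plain strings for illustration only.
NFS4_OK = "NFS4_OK"
NFS4ERR_GRACE = "NFS4ERR_GRACE"
NFS4ERR_NO_GRACE = "NFS4ERR_NO_GRACE"
NFS4ERR_NOFILEHANDLE = "NFS4ERR_NOFILEHANDLE"  # assumed error for a missing filehandle


@dataclass
class ClientReclaimState:
    """Hypothetical per-client reclaim bookkeeping kept by a server."""
    # File systems that reached this server through a file system transition
    # (assumption: these need their own rca_one_fs = TRUE completion).
    transitioned: Set[str] = field(default_factory=set)
    global_done: bool = False
    fs_done: Set[str] = field(default_factory=set)

    def _scope_complete(self, fsid: str) -> bool:
        # A transitioned fs is covered by its per-fs RECLAIM_COMPLETE,
        # everything else by the global one.
        if fsid in self.transitioned:
            return fsid in self.fs_done
        return self.global_done

    def reclaim_complete(self, rca_one_fs: bool,
                         current_fsid: Optional[str] = None) -> str:
        if rca_one_fs:
            # A current filehandle is only required when rca_one_fs is TRUE.
            if current_fsid is None:
                return NFS4ERR_NOFILEHANDLE
            self.fs_done.add(current_fsid)
        else:
            self.global_done = True
        return NFS4_OK

    def reclaim_lock(self, fsid: str) -> str:
        # Reclaims are refused once the scope has completed recovery.
        return NFS4ERR_NO_GRACE if self._scope_complete(fsid) else NFS4_OK

    def new_lock(self, fsid: str) -> str:
        # Non-reclaim locking is refused until the scope has completed recovery.
        return NFS4_OK if self._scope_complete(fsid) else NFS4ERR_GRACE
```

For example, a client with one migrated file system `fsB` has to send both a global RECLAIM_COMPLETE and one with rca_one_fs set to TRUE before it can take new locks everywhere.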
This html diff was produced by rfcdiff 1.33. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ | ||||
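The client-side handling of CB_RECALL_SLOT described in the diff above (wait for requests on slots above the target, send one SEQUENCE with sa_highestslot set to the target, and return RDMA credits if needed) can be sketched as a planning function. This is an illustrative sketch only. The function name `plan_slot_recall`, the returned dictionary, and the assumption that each zero-length RDMA Send lowers the credit count by exactly one are not taken from the draft.

```python
from typing import Dict, Iterable, Optional


def plan_slot_recall(outstanding_slotids: Iterable[int],
                     rsa_target_highest_slotid: int,
                     rdma_credits: Optional[int] = None) -> Dict[str, object]:
    """Work out what a client would do in response to CB_RECALL_SLOT.

    rdma_credits is None when the operations channel has only non-RDMA
    connections; otherwise it is the current total RDMA credit count.
    """
    # Requests on slots above the target must complete first.
    wait_for = sorted(s for s in outstanding_slotids
                      if s > rsa_target_highest_slotid)

    # Credits must end up at rsa_target_highest_slotid + 1 or below.
    zero_length_sends = 0
    if rdma_credits is not None:
        zero_length_sends = max(0, rdma_credits - (rsa_target_highest_slotid + 1))

    return {
        "wait_for": wait_for,
        # Value for the single SEQUENCE sent once the waits are over.
        "sa_highestslot": rsa_target_highest_slotid,
        "zero_length_rdma_sends": zero_length_sends,
    }
```

With outstanding slots 0, 3, 7 and 9, a target of 4, and 10 RDMA credits, the plan waits on slots 7 and 9 and then issues five zero-length Sends alongside the SEQUENCE.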