draft-ietf-nfsv4-minorversion1-PAv2.txt   draft-ietf-nfsv4-minorversion1-PAv3.txt 
NFSv4 S. Shepler NFSv4 S. Shepler
Internet-Draft M. Eisler Internet-Draft M. Eisler
Intended status: Standards Track D. Noveck Intended status: Standards Track D. Noveck
Expires: October 7, 2009 Editors Expires: October 13, 2009 Editors
April 05, 2009 April 11, 2009
NFS Version 4 Minor Version 1 NFS Version 4 Minor Version 1
draft-ietf-nfsv4-minorversion1-PAv2.txt draft-ietf-nfsv4-minorversion1-PAv3.txt
Status of this Memo Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Drafts.
skipping to change at page 1, line 33 skipping to change at page 1, line 33
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on October 7, 2009. This Internet-Draft will expire on October 13, 2009.
Copyright Notice Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 3, line 52 skipping to change at page 3, line 52
2.10. Session . . . . . . . . . . . . . . . . . . . . . . . . 42 2.10. Session . . . . . . . . . . . . . . . . . . . . . . . . 42
2.10.1. Motivation and Overview . . . . . . . . . . . . . . 42 2.10.1. Motivation and Overview . . . . . . . . . . . . . . 42
2.10.2. NFSv4 Integration . . . . . . . . . . . . . . . . . 44 2.10.2. NFSv4 Integration . . . . . . . . . . . . . . . . . 44
2.10.3. Channels . . . . . . . . . . . . . . . . . . . . . . 45 2.10.3. Channels . . . . . . . . . . . . . . . . . . . . . . 45
2.10.4. Server Scope . . . . . . . . . . . . . . . . . . . . 46 2.10.4. Server Scope . . . . . . . . . . . . . . . . . . . . 46
2.10.5. Trunking . . . . . . . . . . . . . . . . . . . . . . 49 2.10.5. Trunking . . . . . . . . . . . . . . . . . . . . . . 49
2.10.6. Exactly Once Semantics . . . . . . . . . . . . . . . 52 2.10.6. Exactly Once Semantics . . . . . . . . . . . . . . . 52
2.10.7. RDMA Considerations . . . . . . . . . . . . . . . . 65 2.10.7. RDMA Considerations . . . . . . . . . . . . . . . . 65
2.10.8. Sessions Security . . . . . . . . . . . . . . . . . 68 2.10.8. Sessions Security . . . . . . . . . . . . . . . . . 68
2.10.9. The Secret State Verifier (SSV) GSS Mechanism . . . 73 2.10.9. The Secret State Verifier (SSV) GSS Mechanism . . . 73
2.10.10. Security Considerations for RPCSEC_GSS when using
the SSV Mechanism . . . . . . . . . . . . . . . . . 78
2.10.10. Session Mechanics - Steady State . . . . . . . . . . 78 2.10.11. Session Mechanics - Steady State . . . . . . . . . . 79
2.10.11. Session Inactivity Timer . . . . . . . . . . . . . . 80 2.10.12. Session Inactivity Timer . . . . . . . . . . . . . . 81
2.10.12. Session Mechanics - Recovery . . . . . . . . . . . . 80 2.10.13. Session Mechanics - Recovery . . . . . . . . . . . . 81
2.10.13. Parallel NFS and Sessions . . . . . . . . . . . . . 85 2.10.14. Parallel NFS and Sessions . . . . . . . . . . . . . 86
3. Protocol Constants and Data Types . . . . . . . . . . . . . . 85 3. Protocol Constants and Data Types . . . . . . . . . . . . . . 86
3.1. Basic Constants . . . . . . . . . . . . . . . . . . . . 85 3.1. Basic Constants . . . . . . . . . . . . . . . . . . . . 87
3.2. Basic Data Types . . . . . . . . . . . . . . . . . . . . 86 3.2. Basic Data Types . . . . . . . . . . . . . . . . . . . . 87
3.3. Structured Data Types . . . . . . . . . . . . . . . . . 88 3.3. Structured Data Types . . . . . . . . . . . . . . . . . 89
4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . . 96 4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 96 4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 98
4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . 97 4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . 98
4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . 97 4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . 98
4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 97 4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 99
4.2.1. General Properties of a Filehandle . . . . . . . . . 98 4.2.1. General Properties of a Filehandle . . . . . . . . . 99
4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . 99 4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . 100
4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . 99 4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . 100
4.3. One Method of Constructing a Volatile Filehandle . . . . 100 4.3. One Method of Constructing a Volatile Filehandle . . . . 102
4.4. Client Recovery from Filehandle Expiration . . . . . . . 101 4.4. Client Recovery from Filehandle Expiration . . . . . . . 102
5. File Attributes . . . . . . . . . . . . . . . . . . . . . . . 102 5. File Attributes . . . . . . . . . . . . . . . . . . . . . . . 103
5.1. REQUIRED Attributes . . . . . . . . . . . . . . . . . . 103 5.1. REQUIRED Attributes . . . . . . . . . . . . . . . . . . 104
5.2. RECOMMENDED Attributes . . . . . . . . . . . . . . . . . 103 5.2. RECOMMENDED Attributes . . . . . . . . . . . . . . . . . 104
5.3. Named Attributes . . . . . . . . . . . . . . . . . . . . 104 5.3. Named Attributes . . . . . . . . . . . . . . . . . . . . 105
5.4. Classification of Attributes . . . . . . . . . . . . . . 105 5.4. Classification of Attributes . . . . . . . . . . . . . . 106
5.5. Set-Only and Get-Only Attributes . . . . . . . . . . . . 106 5.5. Set-Only and Get-Only Attributes . . . . . . . . . . . . 107
5.6. REQUIRED Attributes - List and Definition References . . 106 5.6. REQUIRED Attributes - List and Definition References . . 108
5.7. RECOMMENDED Attributes - List and Definition 5.7. RECOMMENDED Attributes - List and Definition
References . . . . . . . . . . . . . . . . . . . . . . . 107 References . . . . . . . . . . . . . . . . . . . . . . . 108
5.8. Attribute Definitions . . . . . . . . . . . . . . . . . 109 5.8. Attribute Definitions . . . . . . . . . . . . . . . . . 110
5.8.1. Definitions of REQUIRED Attributes . . . . . . . . . 109 5.8.1. Definitions of REQUIRED Attributes . . . . . . . . . 110
5.8.2. Definitions of Uncategorized RECOMMENDED 5.8.2. Definitions of Uncategorized RECOMMENDED
Attributes . . . . . . . . . . . . . . . . . . . . . 111 Attributes . . . . . . . . . . . . . . . . . . . . . 112
5.9. Interpreting owner and owner_group . . . . . . . . . . . 118 5.9. Interpreting owner and owner_group . . . . . . . . . . . 119
5.10. Character Case Attributes . . . . . . . . . . . . . . . 120 5.10. Character Case Attributes . . . . . . . . . . . . . . . 121
5.11. Directory Notification Attributes . . . . . . . . . . . 120 5.11. Directory Notification Attributes . . . . . . . . . . . 121
5.12. pNFS Attribute Definitions . . . . . . . . . . . . . . . 120 5.12. pNFS Attribute Definitions . . . . . . . . . . . . . . . 122
5.13. Retention Attributes . . . . . . . . . . . . . . . . . . 122 5.13. Retention Attributes . . . . . . . . . . . . . . . . . . 123
6. Access Control Attributes . . . . . . . . . . . . . . . . . . 125 6. Access Control Attributes . . . . . . . . . . . . . . . . . . 126
6.1. Goals . . . . . . . . . . . . . . . . . . . . . . . . . 125 6.1. Goals . . . . . . . . . . . . . . . . . . . . . . . . . 126
6.2. File Attributes Discussion . . . . . . . . . . . . . . . 126 6.2. File Attributes Discussion . . . . . . . . . . . . . . . 127
6.2.1. Attribute 12: acl . . . . . . . . . . . . . . . . . 126 6.2.1. Attribute 12: acl . . . . . . . . . . . . . . . . . 127
6.2.2. Attribute 58: dacl . . . . . . . . . . . . . . . . . 142 6.2.2. Attribute 58: dacl . . . . . . . . . . . . . . . . . 143
6.2.3. Attribute 59: sacl . . . . . . . . . . . . . . . . . 142 6.2.3. Attribute 59: sacl . . . . . . . . . . . . . . . . . 143
6.2.4. Attribute 33: mode . . . . . . . . . . . . . . . . . 142 6.2.4. Attribute 33: mode . . . . . . . . . . . . . . . . . 143
6.2.5. Attribute 74: mode_set_masked . . . . . . . . . . . 142 6.2.5. Attribute 74: mode_set_masked . . . . . . . . . . . 143
6.3. Common Methods . . . . . . . . . . . . . . . . . . . . . 143 6.3. Common Methods . . . . . . . . . . . . . . . . . . . . . 144
6.3.1. Interpreting an ACL . . . . . . . . . . . . . . . . 143 6.3.1. Interpreting an ACL . . . . . . . . . . . . . . . . 144
6.3.2. Computing a Mode Attribute from an ACL . . . . . . . 144 6.3.2. Computing a Mode Attribute from an ACL . . . . . . . 145
6.4. Requirements . . . . . . . . . . . . . . . . . . . . . . 145 6.4. Requirements . . . . . . . . . . . . . . . . . . . . . . 146
6.4.1. Setting the mode and/or ACL Attributes . . . . . . . 146 6.4.1. Setting the mode and/or ACL Attributes . . . . . . . 147
6.4.2. Retrieving the mode and/or ACL Attributes . . . . . 147 6.4.2. Retrieving the mode and/or ACL Attributes . . . . . 148
6.4.3. Creating New Objects . . . . . . . . . . . . . . . . 148 6.4.3. Creating New Objects . . . . . . . . . . . . . . . . 149
7. Single-server Namespace . . . . . . . . . . . . . . . . . . . 152 7. Single-server Namespace . . . . . . . . . . . . . . . . . . . 153
7.1. Server Exports . . . . . . . . . . . . . . . . . . . . . 152 7.1. Server Exports . . . . . . . . . . . . . . . . . . . . . 153
7.2. Browsing Exports . . . . . . . . . . . . . . . . . . . . 152 7.2. Browsing Exports . . . . . . . . . . . . . . . . . . . . 153
7.3. Server Pseudo File System . . . . . . . . . . . . . . . 153 7.3. Server Pseudo File System . . . . . . . . . . . . . . . 154
7.4. Multiple Roots . . . . . . . . . . . . . . . . . . . . . 153 7.4. Multiple Roots . . . . . . . . . . . . . . . . . . . . . 154
7.5. Filehandle Volatility . . . . . . . . . . . . . . . . . 153 7.5. Filehandle Volatility . . . . . . . . . . . . . . . . . 154
7.6. Exported Root . . . . . . . . . . . . . . . . . . . . . 154 7.6. Exported Root . . . . . . . . . . . . . . . . . . . . . 155
7.7. Mount Point Crossing . . . . . . . . . . . . . . . . . . 154 7.7. Mount Point Crossing . . . . . . . . . . . . . . . . . . 155
7.8. Security Policy and Namespace Presentation . . . . . . . 155 7.8. Security Policy and Namespace Presentation . . . . . . . 156
8. State Management . . . . . . . . . . . . . . . . . . . . . . 156 8. State Management . . . . . . . . . . . . . . . . . . . . . . 157
8.1. Client and Session ID . . . . . . . . . . . . . . . . . 157 8.1. Client and Session ID . . . . . . . . . . . . . . . . . 158
8.2. Stateid Definition . . . . . . . . . . . . . . . . . . . 157 8.2. Stateid Definition . . . . . . . . . . . . . . . . . . . 158
8.2.1. Stateid Types . . . . . . . . . . . . . . . . . . . 158 8.2.1. Stateid Types . . . . . . . . . . . . . . . . . . . 159
8.2.2. Stateid Structure . . . . . . . . . . . . . . . . . 159 8.2.2. Stateid Structure . . . . . . . . . . . . . . . . . 160
8.2.3. Special Stateids . . . . . . . . . . . . . . . . . . 160 8.2.3. Special Stateids . . . . . . . . . . . . . . . . . . 161
8.2.4. Stateid Lifetime and Validation . . . . . . . . . . 161 8.2.4. Stateid Lifetime and Validation . . . . . . . . . . 162
8.2.5. Stateid Use for I/O Operations . . . . . . . . . . . 164 8.2.5. Stateid Use for I/O Operations . . . . . . . . . . . 165
8.2.6. Stateid Use for SETATTR Operations . . . . . . . . . 165 8.2.6. Stateid Use for SETATTR Operations . . . . . . . . . 166
8.3. Lease Renewal . . . . . . . . . . . . . . . . . . . . . 166 8.3. Lease Renewal . . . . . . . . . . . . . . . . . . . . . 167
8.4. Crash Recovery . . . . . . . . . . . . . . . . . . . . . 168 8.4. Crash Recovery . . . . . . . . . . . . . . . . . . . . . 169
8.4.1. Client Failure and Recovery . . . . . . . . . . . . 168 8.4.1. Client Failure and Recovery . . . . . . . . . . . . 169
8.4.2. Server Failure and Recovery . . . . . . . . . . . . 169 8.4.2. Server Failure and Recovery . . . . . . . . . . . . 170
8.4.3. Network Partitions and Recovery . . . . . . . . . . 174 8.4.3. Network Partitions and Recovery . . . . . . . . . . 175
8.5. Server Revocation of Locks . . . . . . . . . . . . . . . 179 8.5. Server Revocation of Locks . . . . . . . . . . . . . . . 180
8.6. Short and Long Leases . . . . . . . . . . . . . . . . . 180 8.6. Short and Long Leases . . . . . . . . . . . . . . . . . 181
8.7. Clocks, Propagation Delay, and Calculating Lease 8.7. Clocks, Propagation Delay, and Calculating Lease
Expiration . . . . . . . . . . . . . . . . . . . . . . . 180 Expiration . . . . . . . . . . . . . . . . . . . . . . . 181
8.8. Obsolete Locking Infrastructure From NFSv4.0 . . . . . . 181 8.8. Obsolete Locking Infrastructure From NFSv4.0 . . . . . . 182
9. File Locking and Share Reservations . . . . . . . . . . . . . 182 9. File Locking and Share Reservations . . . . . . . . . . . . . 183
9.1. Opens and Byte-Range Locks . . . . . . . . . . . . . . . 182 9.1. Opens and Byte-Range Locks . . . . . . . . . . . . . . . 183
9.1.1. State-owner Definition . . . . . . . . . . . . . . . 182 9.1.1. State-owner Definition . . . . . . . . . . . . . . . 183
9.1.2. Use of the Stateid and Locking . . . . . . . . . . . 183 9.1.2. Use of the Stateid and Locking . . . . . . . . . . . 184
9.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 186 9.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 187
9.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 186 9.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 187
9.4. Stateid Seqid Values and Byte-Range Locks . . . . . . . 187 9.4. Stateid Seqid Values and Byte-Range Locks . . . . . . . 188
9.5. Issues with Multiple Open-Owners . . . . . . . . . . . . 187 9.5. Issues with Multiple Open-Owners . . . . . . . . . . . . 188
9.6. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 188 9.6. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 189
9.7. Share Reservations . . . . . . . . . . . . . . . . . . . 189 9.7. Share Reservations . . . . . . . . . . . . . . . . . . . 190
9.8. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 190 9.8. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 191
9.9. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 190 9.9. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 191
9.10. Parallel OPENs . . . . . . . . . . . . . . . . . . . . . 191 9.10. Parallel OPENs . . . . . . . . . . . . . . . . . . . . . 192
9.11. Reclaim of Open and Byte-Range Locks . . . . . . . . . . 192 9.11. Reclaim of Open and Byte-Range Locks . . . . . . . . . . 193
10. Client-Side Caching . . . . . . . . . . . . . . . . . . . . . 192 10. Client-Side Caching . . . . . . . . . . . . . . . . . . . . . 193
10.1. Performance Challenges for Client-Side Caching . . . . . 193 10.1. Performance Challenges for Client-Side Caching . . . . . 194
10.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 194 10.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 195
10.2.1. Delegation Recovery . . . . . . . . . . . . . . . . 196 10.2.1. Delegation Recovery . . . . . . . . . . . . . . . . 197
10.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 198 10.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 199
10.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . 198 10.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . 199
10.3.2. Data Caching and File Locking . . . . . . . . . . . 200 10.3.2. Data Caching and File Locking . . . . . . . . . . . 201
10.3.3. Data Caching and Mandatory File Locking . . . . . . 201 10.3.3. Data Caching and Mandatory File Locking . . . . . . 202
10.3.4. Data Caching and File Identity . . . . . . . . . . . 202 10.3.4. Data Caching and File Identity . . . . . . . . . . . 203
10.4. Open Delegation . . . . . . . . . . . . . . . . . . . . 203 10.4. Open Delegation . . . . . . . . . . . . . . . . . . . . 204
10.4.1. Open Delegation and Data Caching . . . . . . . . . . 205 10.4.1. Open Delegation and Data Caching . . . . . . . . . . 206
10.4.2. Open Delegation and File Locks . . . . . . . . . . . 206 10.4.2. Open Delegation and File Locks . . . . . . . . . . . 207
10.4.3. Handling of CB_GETATTR . . . . . . . . . . . . . . . 207 10.4.3. Handling of CB_GETATTR . . . . . . . . . . . . . . . 208
10.4.4. Recall of Open Delegation . . . . . . . . . . . . . 210 10.4.4. Recall of Open Delegation . . . . . . . . . . . . . 211
10.4.5. Clients that Fail to Honor Delegation Recalls . . . 212 10.4.5. Clients that Fail to Honor Delegation Recalls . . . 213
10.4.6. Delegation Revocation . . . . . . . . . . . . . . . 212 10.4.6. Delegation Revocation . . . . . . . . . . . . . . . 213
10.4.7. Delegations via WANT_DELEGATION . . . . . . . . . . 213 10.4.7. Delegations via WANT_DELEGATION . . . . . . . . . . 214
10.5. Data Caching and Revocation . . . . . . . . . . . . . . 214 10.5. Data Caching and Revocation . . . . . . . . . . . . . . 215
10.5.1. Revocation Recovery for Write Open Delegation . . . 214 10.5.1. Revocation Recovery for Write Open Delegation . . . 215
10.6. Attribute Caching . . . . . . . . . . . . . . . . . . . 215 10.6. Attribute Caching . . . . . . . . . . . . . . . . . . . 216
10.7. Data and Metadata Caching and Memory Mapped Files . . . 217 10.7. Data and Metadata Caching and Memory Mapped Files . . . 218
10.8. Name and Directory Caching without Directory 10.8. Name and Directory Caching without Directory
Delegations . . . . . . . . . . . . . . . . . . . . . . 219 Delegations . . . . . . . . . . . . . . . . . . . . . . 220
10.8.1. Name Caching . . . . . . . . . . . . . . . . . . . . 219 10.8.1. Name Caching . . . . . . . . . . . . . . . . . . . . 220
10.8.2. Directory Caching . . . . . . . . . . . . . . . . . 221 10.8.2. Directory Caching . . . . . . . . . . . . . . . . . 222
10.9. Directory Delegations . . . . . . . . . . . . . . . . . 222 10.9. Directory Delegations . . . . . . . . . . . . . . . . . 223
10.9.1. Introduction to Directory Delegations . . . . . . . 222 10.9.1. Introduction to Directory Delegations . . . . . . . 223
10.9.2. Directory Delegation Design . . . . . . . . . . . . 223 10.9.2. Directory Delegation Design . . . . . . . . . . . . 224
10.9.3. Attributes in Support of Directory Notifications . . 224 10.9.3. Attributes in Support of Directory Notifications . . 225
10.9.4. Directory Delegation Recall . . . . . . . . . . . . 224 10.9.4. Directory Delegation Recall . . . . . . . . . . . . 225
10.9.5. Directory Delegation Recovery . . . . . . . . . . . 225 10.9.5. Directory Delegation Recovery . . . . . . . . . . . 226
11. Multi-Server Namespace . . . . . . . . . . . . . . . . . . . 225 11. Multi-Server Namespace . . . . . . . . . . . . . . . . . . . 226
11.1. Location Attributes . . . . . . . . . . . . . . . . . . 225 11.1. Location Attributes . . . . . . . . . . . . . . . . . . 226
11.2. File System Presence or Absence . . . . . . . . . . . . 226 11.2. File System Presence or Absence . . . . . . . . . . . . 227
11.3. Getting Attributes for an Absent File System . . . . . . 227 11.3. Getting Attributes for an Absent File System . . . . . . 228
11.3.1. GETATTR Within an Absent File System . . . . . . . . 227 11.3.1. GETATTR Within an Absent File System . . . . . . . . 228
11.3.2. READDIR and Absent File Systems . . . . . . . . . . 228 11.3.2. READDIR and Absent File Systems . . . . . . . . . . 229
11.4. Uses of Location Information . . . . . . . . . . . . . . 229 11.4. Uses of Location Information . . . . . . . . . . . . . . 230
11.4.1. File System Replication . . . . . . . . . . . . . . 230 11.4.1. File System Replication . . . . . . . . . . . . . . 231
11.4.2. File System Migration . . . . . . . . . . . . . . . 230 11.4.2. File System Migration . . . . . . . . . . . . . . . 231
11.4.3. Referrals . . . . . . . . . . . . . . . . . . . . . 232 11.4.3. Referrals . . . . . . . . . . . . . . . . . . . . . 233
11.5. Location Entries and Server Identity . . . . . . . . . . 233 11.5. Location Entries and Server Identity . . . . . . . . . . 234
11.6. Additional Client-side Considerations . . . . . . . . . 234 11.6. Additional Client-side Considerations . . . . . . . . . 235
11.7. Effecting File System Transitions . . . . . . . . . . . 234 11.7. Effecting File System Transitions . . . . . . . . . . . 235
11.7.1. File System Transitions and Simultaneous Access . . 236 11.7.1. File System Transitions and Simultaneous Access . . 237
11.7.2. Simultaneous Use and Transparent Transitions . . . . 236 11.7.2. Simultaneous Use and Transparent Transitions . . . . 237
11.7.3. Filehandles and File System Transitions . . . . . . 239 11.7.3. Filehandles and File System Transitions . . . . . . 240
11.7.4. Fileids and File System Transitions . . . . . . . . 239 11.7.4. Fileids and File System Transitions . . . . . . . . 240
11.7.5. Fsids and File System Transitions . . . . . . . . . 241 11.7.5. Fsids and File System Transitions . . . . . . . . . 242
11.7.6. The Change Attribute and File System Transitions . . 241 11.7.6. The Change Attribute and File System Transitions . . 242
11.7.7. Lock State and File System Transitions . . . . . . . 242 11.7.7. Lock State and File System Transitions . . . . . . . 243
11.7.8. Write Verifiers and File System Transitions . . . . 246 11.7.8. Write Verifiers and File System Transitions . . . . 247
11.7.9. Readdir Cookies and Verifiers and File System 11.7.9. Readdir Cookies and Verifiers and File System
Transitions . . . . . . . . . . . . . . . . . . . . 246 Transitions . . . . . . . . . . . . . . . . . . . . 247
11.7.10. File System Data and File System Transitions . . . . 246 11.7.10. File System Data and File System Transitions . . . . 247
11.8. Effecting File System Referrals . . . . . . . . . . . . 248 11.8. Effecting File System Referrals . . . . . . . . . . . . 249
11.8.1. Referral Example (LOOKUP) . . . . . . . . . . . . . 248 11.8.1. Referral Example (LOOKUP) . . . . . . . . . . . . . 249
11.8.2. Referral Example (READDIR) . . . . . . . . . . . . . 252 11.8.2. Referral Example (READDIR) . . . . . . . . . . . . . 253
11.9. The Attribute fs_locations . . . . . . . . . . . . . . . 255 11.9. The Attribute fs_locations . . . . . . . . . . . . . . . 256
11.10. The Attribute fs_locations_info . . . . . . . . . . . . 258 11.10. The Attribute fs_locations_info . . . . . . . . . . . . 259
11.10.1. The fs_locations_server4 Structure . . . . . . . . . 261 11.10.1. The fs_locations_server4 Structure . . . . . . . . . 262
11.10.2. The fs_locations_info4 Structure . . . . . . . . . . 267 11.10.2. The fs_locations_info4 Structure . . . . . . . . . . 268
11.10.3. The fs_locations_item4 Structure . . . . . . . . . . 268 11.10.3. The fs_locations_item4 Structure . . . . . . . . . . 269
11.11. The Attribute fs_status . . . . . . . . . . . . . . . . 269 11.11. The Attribute fs_status . . . . . . . . . . . . . . . . 270
12. Parallel NFS (pNFS) . . . . . . . . . . . . . . . . . . . . . 273 12. Parallel NFS (pNFS) . . . . . . . . . . . . . . . . . . . . . 274
12.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 273 12.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 274
12.2. pNFS Definitions . . . . . . . . . . . . . . . . . . . . 274 12.2. pNFS Definitions . . . . . . . . . . . . . . . . . . . . 275
12.2.1. Metadata . . . . . . . . . . . . . . . . . . . . . . 275 12.2.1. Metadata . . . . . . . . . . . . . . . . . . . . . . 276
12.2.2. Metadata Server . . . . . . . . . . . . . . . . . . 275 12.2.2. Metadata Server . . . . . . . . . . . . . . . . . . 276
12.2.3. pNFS Client . . . . . . . . . . . . . . . . . . . . 275 12.2.3. pNFS Client . . . . . . . . . . . . . . . . . . . . 276
12.2.4. Storage Device . . . . . . . . . . . . . . . . . . . 275 12.2.4. Storage Device . . . . . . . . . . . . . . . . . . . 276
12.2.5. Storage Protocol . . . . . . . . . . . . . . . . . . 276 12.2.5. Storage Protocol . . . . . . . . . . . . . . . . . . 277
12.2.6. Control Protocol . . . . . . . . . . . . . . . . . . 276 12.2.6. Control Protocol . . . . . . . . . . . . . . . . . . 277
12.2.7. Layout Types . . . . . . . . . . . . . . . . . . . . 277 12.2.7. Layout Types . . . . . . . . . . . . . . . . . . . . 278
12.2.8. Layout . . . . . . . . . . . . . . . . . . . . . . . 277 12.2.8. Layout . . . . . . . . . . . . . . . . . . . . . . . 278
12.2.9. Layout Iomode . . . . . . . . . . . . . . . . . . . 278 12.2.9. Layout Iomode . . . . . . . . . . . . . . . . . . . 279
12.2.10. Device IDs . . . . . . . . . . . . . . . . . . . . . 278 12.2.10. Device IDs . . . . . . . . . . . . . . . . . . . . . 279
12.3. pNFS Operations . . . . . . . . . . . . . . . . . . . . 280 12.3. pNFS Operations . . . . . . . . . . . . . . . . . . . . 281
12.4. pNFS Attributes . . . . . . . . . . . . . . . . . . . . 281 12.4. pNFS Attributes . . . . . . . . . . . . . . . . . . . . 282
12.5. Layout Semantics . . . . . . . . . . . . . . . . . . . . 281 12.5. Layout Semantics . . . . . . . . . . . . . . . . . . . . 282
12.5.1. Guarantees Provided by Layouts . . . . . . . . . . . 281 12.5.1. Guarantees Provided by Layouts . . . . . . . . . . . 282
12.5.2. Getting a Layout . . . . . . . . . . . . . . . . . . 282 12.5.2. Getting a Layout . . . . . . . . . . . . . . . . . . 283
12.5.3. Layout Stateid . . . . . . . . . . . . . . . . . . . 283 12.5.3. Layout Stateid . . . . . . . . . . . . . . . . . . . 284
12.5.4. Committing a Layout . . . . . . . . . . . . . . . . 284 12.5.4. Committing a Layout . . . . . . . . . . . . . . . . 285
12.5.5. Recalling a Layout . . . . . . . . . . . . . . . . . 287 12.5.5. Recalling a Layout . . . . . . . . . . . . . . . . . 288
12.5.6. Revoking Layouts . . . . . . . . . . . . . . . . . . 295 12.5.6. Revoking Layouts . . . . . . . . . . . . . . . . . . 296
12.5.7. Metadata Server Write Propagation . . . . . . . . . 296 12.5.7. Metadata Server Write Propagation . . . . . . . . . 297
12.6. pNFS Mechanics . . . . . . . . . . . . . . . . . . . . . 296 12.6. pNFS Mechanics . . . . . . . . . . . . . . . . . . . . . 297
12.7. Recovery . . . . . . . . . . . . . . . . . . . . . . . . 297 12.7. Recovery . . . . . . . . . . . . . . . . . . . . . . . . 298
12.7.1. Recovery from Client Restart . . . . . . . . . . . . 298 12.7.1. Recovery from Client Restart . . . . . . . . . . . . 299
12.7.2. Dealing with Lease Expiration on the Client . . . . 298 12.7.2. Dealing with Lease Expiration on the Client . . . . 299
12.7.3. Dealing with Loss of Layout State on the Metadata 12.7.3. Dealing with Loss of Layout State on the Metadata
Server . . . . . . . . . . . . . . . . . . . . . . . 299 Server . . . . . . . . . . . . . . . . . . . . . . . 300
12.7.4. Recovery from Metadata Server Restart . . . . . . . 300 12.7.4. Recovery from Metadata Server Restart . . . . . . . 301
12.7.5. Operations During Metadata Server Grace Period . . . 302 12.7.5. Operations During Metadata Server Grace Period . . . 303
12.7.6. Storage Device Recovery . . . . . . . . . . . . . . 302 12.7.6. Storage Device Recovery . . . . . . . . . . . . . . 303
12.8. Metadata and Storage Device Roles . . . . . . . . . . . 302 12.8. Metadata and Storage Device Roles . . . . . . . . . . . 303
12.9. Security Considerations for pNFS . . . . . . . . . . . . 303 12.9. Security Considerations for pNFS . . . . . . . . . . . . 304
13. NFSv4.1 as a Storage Protocol in pNFS: the File Layout Type . 304 13. NFSv4.1 as a Storage Protocol in pNFS: the File Layout Type . 305
13.1. Client ID and Session Considerations . . . . . . . . . . 304 13.1. Client ID and Session Considerations . . . . . . . . . . 305
13.1.1. Sessions Considerations for Data Servers . . . . . . 307 13.1.1. Sessions Considerations for Data Servers . . . . . . 308
13.2. File Layout Definitions . . . . . . . . . . . . . . . . 307 13.2. File Layout Definitions . . . . . . . . . . . . . . . . 308
13.3. File Layout Data Types . . . . . . . . . . . . . . . . . 308 13.3. File Layout Data Types . . . . . . . . . . . . . . . . . 309
13.4. Interpreting the File Layout . . . . . . . . . . . . . . 312 13.4. Interpreting the File Layout . . . . . . . . . . . . . . 313
13.4.1. Determining the Stripe Unit Number . . . . . . . . . 312 13.4.1. Determining the Stripe Unit Number . . . . . . . . . 313
13.4.2. Interpreting the File Layout Using Sparse Packing . 312 13.4.2. Interpreting the File Layout Using Sparse Packing . 313
13.4.3. Interpreting the File Layout Using Dense Packing . . 315 13.4.3. Interpreting the File Layout Using Dense Packing . . 316
13.4.4. Sparse and Dense Stripe Unit Packing . . . . . . . . 317 13.4.4. Sparse and Dense Stripe Unit Packing . . . . . . . . 318
13.5. Data Server Multipathing . . . . . . . . . . . . . . . . 319 13.5. Data Server Multipathing . . . . . . . . . . . . . . . . 320
13.6. Operations Sent to NFSv4.1 Data Servers . . . . . . . . 320 13.6. Operations Sent to NFSv4.1 Data Servers . . . . . . . . 321
13.7. COMMIT Through Metadata Server . . . . . . . . . . . . . 322 13.7. COMMIT Through Metadata Server . . . . . . . . . . . . . 323
13.8. The Layout Iomode . . . . . . . . . . . . . . . . . . . 324 13.8. The Layout Iomode . . . . . . . . . . . . . . . . . . . 325
13.9. Metadata and Data Server State Coordination . . . . . . 324 13.9. Metadata and Data Server State Coordination . . . . . . 325
13.9.1. Global Stateid Requirements . . . . . . . . . . . . 324 13.9.1. Global Stateid Requirements . . . . . . . . . . . . 325
13.9.2. Data Server State Propagation . . . . . . . . . . . 325 13.9.2. Data Server State Propagation . . . . . . . . . . . 326
13.10. Data Server Component File Size . . . . . . . . . . . . 327 13.10. Data Server Component File Size . . . . . . . . . . . . 328
13.11. Layout Revocation and Fencing . . . . . . . . . . . . . 328 13.11. Layout Revocation and Fencing . . . . . . . . . . . . . 329
13.12. Security Considerations for the File Layout Type . . . . 328 13.12. Security Considerations for the File Layout Type . . . . 329
14. Internationalization . . . . . . . . . . . . . . . . . . . . 329 14. Internationalization . . . . . . . . . . . . . . . . . . . . 330
14.1. Stringprep profile for the utf8str_cs type . . . . . . . 330 14.1. Stringprep profile for the utf8str_cs type . . . . . . . 331
14.2. Stringprep profile for the utf8str_cis type . . . . . . 332 14.2. Stringprep profile for the utf8str_cis type . . . . . . 333
14.3. Stringprep profile for the utf8str_mixed type . . . . . 333 14.3. Stringprep profile for the utf8str_mixed type . . . . . 334
14.4. UTF-8 Capabilities . . . . . . . . . . . . . . . . . . . 334 14.4. UTF-8 Capabilities . . . . . . . . . . . . . . . . . . . 335
14.5. UTF-8 Related Errors . . . . . . . . . . . . . . . . . . 335 14.5. UTF-8 Related Errors . . . . . . . . . . . . . . . . . . 336
15. Error Values . . . . . . . . . . . . . . . . . . . . . . . . 335 15. Error Values . . . . . . . . . . . . . . . . . . . . . . . . 336
15.1. Error Definitions . . . . . . . . . . . . . . . . . . . 336 15.1. Error Definitions . . . . . . . . . . . . . . . . . . . 337
15.1.1. General Errors . . . . . . . . . . . . . . . . . . . 338 15.1.1. General Errors . . . . . . . . . . . . . . . . . . . 339
15.1.2. Filehandle Errors . . . . . . . . . . . . . . . . . 340 15.1.2. Filehandle Errors . . . . . . . . . . . . . . . . . 341
15.1.3. Compound Structure Errors . . . . . . . . . . . . . 341 15.1.3. Compound Structure Errors . . . . . . . . . . . . . 342
15.1.4. File System Errors . . . . . . . . . . . . . . . . . 343 15.1.4. File System Errors . . . . . . . . . . . . . . . . . 344
15.1.5. State Management Errors . . . . . . . . . . . . . . 345 15.1.5. State Management Errors . . . . . . . . . . . . . . 346
15.1.6. Security Errors . . . . . . . . . . . . . . . . . . 345 15.1.6. Security Errors . . . . . . . . . . . . . . . . . . 346
15.1.7. Name Errors . . . . . . . . . . . . . . . . . . . . 346 15.1.7. Name Errors . . . . . . . . . . . . . . . . . . . . 347
15.1.8. Locking Errors . . . . . . . . . . . . . . . . . . . 347 15.1.8. Locking Errors . . . . . . . . . . . . . . . . . . . 348
15.1.9. Reclaim Errors . . . . . . . . . . . . . . . . . . . 348 15.1.9. Reclaim Errors . . . . . . . . . . . . . . . . . . . 349
15.1.10. pNFS Errors . . . . . . . . . . . . . . . . . . . . 349 15.1.10. pNFS Errors . . . . . . . . . . . . . . . . . . . . 350
15.1.11. Session Use Errors . . . . . . . . . . . . . . . . . 350 15.1.11. Session Use Errors . . . . . . . . . . . . . . . . . 351
15.1.12. Session Management Errors . . . . . . . . . . . . . 351 15.1.12. Session Management Errors . . . . . . . . . . . . . 352
15.1.13. Client Management Errors . . . . . . . . . . . . . . 352 15.1.13. Client Management Errors . . . . . . . . . . . . . . 353
15.1.14. Delegation Errors . . . . . . . . . . . . . . . . . 353 15.1.14. Delegation Errors . . . . . . . . . . . . . . . . . 354
15.1.15. Attribute Handling Errors . . . . . . . . . . . . . 353 15.1.15. Attribute Handling Errors . . . . . . . . . . . . . 354
15.1.16. Obsoleted Errors . . . . . . . . . . . . . . . . . . 354 15.1.16. Obsoleted Errors . . . . . . . . . . . . . . . . . . 355
15.2. Operations and their valid errors . . . . . . . . . . . 355 15.2. Operations and their valid errors . . . . . . . . . . . 356
15.3. Callback operations and their valid errors . . . . . . . 371 15.3. Callback operations and their valid errors . . . . . . . 372
15.4. Errors and the operations that use them . . . . . . . . 373 15.4. Errors and the operations that use them . . . . . . . . 374
16. NFSv4.1 Procedures . . . . . . . . . . . . . . . . . . . . . 388 16. NFSv4.1 Procedures . . . . . . . . . . . . . . . . . . . . . 389
16.1. Procedure 0: NULL - No Operation . . . . . . . . . . . . 388 16.1. Procedure 0: NULL - No Operation . . . . . . . . . . . . 389
16.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 389 16.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 390
17. Operations: REQUIRED, RECOMMENDED, or OPTIONAL . . . . . . . 400 17. Operations: REQUIRED, RECOMMENDED, or OPTIONAL . . . . . . . 401
18. NFSv4.1 Operations . . . . . . . . . . . . . . . . . . . . . 403 18. NFSv4.1 Operations . . . . . . . . . . . . . . . . . . . . . 404
18.1. Operation 3: ACCESS - Check Access Rights . . . . . . . 403 18.1. Operation 3: ACCESS - Check Access Rights . . . . . . . 404
18.2. Operation 4: CLOSE - Close File . . . . . . . . . . . . 409 18.2. Operation 4: CLOSE - Close File . . . . . . . . . . . . 410
18.3. Operation 5: COMMIT - Commit Cached Data . . . . . . . . 410 18.3. Operation 5: COMMIT - Commit Cached Data . . . . . . . . 411
18.4. Operation 6: CREATE - Create a Non-Regular File Object . 413 18.4. Operation 6: CREATE - Create a Non-Regular File Object . 414
18.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting 18.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting
Recovery . . . . . . . . . . . . . . . . . . . . . . . . 416 Recovery . . . . . . . . . . . . . . . . . . . . . . . . 417
18.6. Operation 8: DELEGRETURN - Return Delegation . . . . . . 417 18.6. Operation 8: DELEGRETURN - Return Delegation . . . . . . 418
18.7. Operation 9: GETATTR - Get Attributes . . . . . . . . . 417 18.7. Operation 9: GETATTR - Get Attributes . . . . . . . . . 418
18.8. Operation 10: GETFH - Get Current Filehandle . . . . . . 419 18.8. Operation 10: GETFH - Get Current Filehandle . . . . . . 420
18.9. Operation 11: LINK - Create Link to a File . . . . . . . 420 18.9. Operation 11: LINK - Create Link to a File . . . . . . . 421
18.10. Operation 12: LOCK - Create Lock . . . . . . . . . . . . 423 18.10. Operation 12: LOCK - Create Lock . . . . . . . . . . . . 424
18.11. Operation 13: LOCKT - Test For Lock . . . . . . . . . . 427 18.11. Operation 13: LOCKT - Test For Lock . . . . . . . . . . 428
18.12. Operation 14: LOCKU - Unlock File . . . . . . . . . . . 428 18.12. Operation 14: LOCKU - Unlock File . . . . . . . . . . . 429
18.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . . . 430 18.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . . . 431
18.14. Operation 16: LOOKUPP - Lookup Parent Directory . . . . 431 18.14. Operation 16: LOOKUPP - Lookup Parent Directory . . . . 432
18.15. Operation 17: NVERIFY - Verify Difference in 18.15. Operation 17: NVERIFY - Verify Difference in
Attributes . . . . . . . . . . . . . . . . . . . . . . . 433 Attributes . . . . . . . . . . . . . . . . . . . . . . . 434
18.16. Operation 18: OPEN - Open a Regular File . . . . . . . . 434 18.16. Operation 18: OPEN - Open a Regular File . . . . . . . . 435
18.17. Operation 19: OPENATTR - Open Named Attribute 18.17. Operation 19: OPENATTR - Open Named Attribute
Directory . . . . . . . . . . . . . . . . . . . . . . . 453 Directory . . . . . . . . . . . . . . . . . . . . . . . 454
18.18. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access . 454 18.18. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access . 455
18.19. Operation 22: PUTFH - Set Current Filehandle . . . . . . 456 18.19. Operation 22: PUTFH - Set Current Filehandle . . . . . . 457
18.20. Operation 23: PUTPUBFH - Set Public Filehandle . . . . . 456 18.20. Operation 23: PUTPUBFH - Set Public Filehandle . . . . . 457
18.21. Operation 24: PUTROOTFH - Set Root Filehandle . . . . . 458 18.21. Operation 24: PUTROOTFH - Set Root Filehandle . . . . . 459
18.22. Operation 25: READ - Read from File . . . . . . . . . . 459 18.22. Operation 25: READ - Read from File . . . . . . . . . . 460
18.23. Operation 26: READDIR - Read Directory . . . . . . . . . 461 18.23. Operation 26: READDIR - Read Directory . . . . . . . . . 462
18.24. Operation 27: READLINK - Read Symbolic Link . . . . . . 465 18.24. Operation 27: READLINK - Read Symbolic Link . . . . . . 466
18.25. Operation 28: REMOVE - Remove File System Object . . . . 466 18.25. Operation 28: REMOVE - Remove File System Object . . . . 467
18.26. Operation 29: RENAME - Rename Directory Entry . . . . . 468 18.26. Operation 29: RENAME - Rename Directory Entry . . . . . 469
18.27. Operation 31: RESTOREFH - Restore Saved Filehandle . . . 472 18.27. Operation 31: RESTOREFH - Restore Saved Filehandle . . . 473
18.28. Operation 32: SAVEFH - Save Current Filehandle . . . . . 473 18.28. Operation 32: SAVEFH - Save Current Filehandle . . . . . 474
18.29. Operation 33: SECINFO - Obtain Available Security . . . 474 18.29. Operation 33: SECINFO - Obtain Available Security . . . 475
18.30. Operation 34: SETATTR - Set Attributes . . . . . . . . . 478 18.30. Operation 34: SETATTR - Set Attributes . . . . . . . . . 479
18.31. Operation 37: VERIFY - Verify Same Attributes . . . . . 481 18.31. Operation 37: VERIFY - Verify Same Attributes . . . . . 482
18.32. Operation 38: WRITE - Write to File . . . . . . . . . . 482 18.32. Operation 38: WRITE - Write to File . . . . . . . . . . 483
18.33. Operation 40: BACKCHANNEL_CTL - Backchannel Control . . 486 18.33. Operation 40: BACKCHANNEL_CTL - Backchannel Control . . 487
18.34. Operation 41: BIND_CONN_TO_SESSION - Associate 18.34. Operation 41: BIND_CONN_TO_SESSION - Associate
Connection with Session . . . . . . . . . . . . . . . . 488 Connection with Session . . . . . . . . . . . . . . . . 489
18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID . . . 491 18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID . . . 492
18.36. Operation 43: CREATE_SESSION - Create New Session and 18.36. Operation 43: CREATE_SESSION - Create New Session and
Confirm Client ID . . . . . . . . . . . . . . . . . . . 509 Confirm Client ID . . . . . . . . . . . . . . . . . . . 510
18.37. Operation 44: DESTROY_SESSION - Destroy a Session . . . 519 18.37. Operation 44: DESTROY_SESSION - Destroy a Session . . . 520
18.38. Operation 45: FREE_STATEID - Free Stateid with No 18.38. Operation 45: FREE_STATEID - Free Stateid with No
Locks . . . . . . . . . . . . . . . . . . . . . . . . . 521 Locks . . . . . . . . . . . . . . . . . . . . . . . . . 522
18.39. Operation 46: GET_DIR_DELEGATION - Get a directory 18.39. Operation 46: GET_DIR_DELEGATION - Get a directory
delegation . . . . . . . . . . . . . . . . . . . . . . . 521 delegation . . . . . . . . . . . . . . . . . . . . . . . 522
18.40. Operation 47: GETDEVICEINFO - Get Device Information . . 525 18.40. Operation 47: GETDEVICEINFO - Get Device Information . . 526
18.41. Operation 48: GETDEVICELIST - Get All Device Mappings 18.41. Operation 48: GETDEVICELIST - Get All Device Mappings
for a File System . . . . . . . . . . . . . . . . . . . 528 for a File System . . . . . . . . . . . . . . . . . . . 529
18.42. Operation 49: LAYOUTCOMMIT - Commit Writes Made Using 18.42. Operation 49: LAYOUTCOMMIT - Commit Writes Made Using
a Layout . . . . . . . . . . . . . . . . . . . . . . . . 529 a Layout . . . . . . . . . . . . . . . . . . . . . . . . 530
18.43. Operation 50: LAYOUTGET - Get Layout Information . . . . 533 18.43. Operation 50: LAYOUTGET - Get Layout Information . . . . 534
18.44. Operation 51: LAYOUTRETURN - Release Layout 18.44. Operation 51: LAYOUTRETURN - Release Layout
Information . . . . . . . . . . . . . . . . . . . . . . 543 Information . . . . . . . . . . . . . . . . . . . . . . 544
18.45. Operation 52: SECINFO_NO_NAME - Get Security on 18.45. Operation 52: SECINFO_NO_NAME - Get Security on
Unnamed Object . . . . . . . . . . . . . . . . . . . . . 547 Unnamed Object . . . . . . . . . . . . . . . . . . . . . 548
18.46. Operation 53: SEQUENCE - Supply Per-Procedure 18.46. Operation 53: SEQUENCE - Supply Per-Procedure
Sequencing and Control . . . . . . . . . . . . . . . . . 548 Sequencing and Control . . . . . . . . . . . . . . . . . 549
18.47. Operation 54: SET_SSV - Update SSV for a Client ID . . . 554 18.47. Operation 54: SET_SSV - Update SSV for a Client ID . . . 555
18.48. Operation 55: TEST_STATEID - Test Stateids for 18.48. Operation 55: TEST_STATEID - Test Stateids for
Validity . . . . . . . . . . . . . . . . . . . . . . . . 556 Validity . . . . . . . . . . . . . . . . . . . . . . . . 558
18.49. Operation 56: WANT_DELEGATION - Request Delegation . . . 558 18.49. Operation 56: WANT_DELEGATION - Request Delegation . . . 559
18.50. Operation 57: DESTROY_CLIENTID - Destroy a Client ID . . 562 18.50. Operation 57: DESTROY_CLIENTID - Destroy a Client ID . . 563
18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims 18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims
Finished . . . . . . . . . . . . . . . . . . . . . . . . 562 Finished . . . . . . . . . . . . . . . . . . . . . . . . 563
18.52. Operation 10044: ILLEGAL - Illegal operation . . . . . . 565 18.52. Operation 10044: ILLEGAL - Illegal operation . . . . . . 566
19. NFSv4.1 Callback Procedures . . . . . . . . . . . . . . . . . 565 19. NFSv4.1 Callback Procedures . . . . . . . . . . . . . . . . . 566
19.1. Procedure 0: CB_NULL - No Operation . . . . . . . . . . 566 19.1. Procedure 0: CB_NULL - No Operation . . . . . . . . . . 567
19.2. Procedure 1: CB_COMPOUND - Compound Operations . . . . . 566 19.2. Procedure 1: CB_COMPOUND - Compound Operations . . . . . 567
20. NFSv4.1 Callback Operations . . . . . . . . . . . . . . . . . 570 20. NFSv4.1 Callback Operations . . . . . . . . . . . . . . . . . 571
20.1. Operation 3: CB_GETATTR - Get Attributes . . . . . . . . 570 20.1. Operation 3: CB_GETATTR - Get Attributes . . . . . . . . 571
20.2. Operation 4: CB_RECALL - Recall a Delegation . . . . . . 571 20.2. Operation 4: CB_RECALL - Recall a Delegation . . . . . . 572
20.3. Operation 5: CB_LAYOUTRECALL - Recall Layout from 20.3. Operation 5: CB_LAYOUTRECALL - Recall Layout from
Client . . . . . . . . . . . . . . . . . . . . . . . . . 572 Client . . . . . . . . . . . . . . . . . . . . . . . . . 573
20.4. Operation 6: CB_NOTIFY - Notify Client of Directory 20.4. Operation 6: CB_NOTIFY - Notify Client of Directory
Changes . . . . . . . . . . . . . . . . . . . . . . . . 576 Changes . . . . . . . . . . . . . . . . . . . . . . . . 577
20.5. Operation 7: CB_PUSH_DELEG - Offer Previously 20.5. Operation 7: CB_PUSH_DELEG - Offer Previously
Requested Delegation to Client . . . . . . . . . . . . . 580 Requested Delegation to Client . . . . . . . . . . . . . 581
20.6. Operation 8: CB_RECALL_ANY - Keep Any N Recallable 20.6. Operation 8: CB_RECALL_ANY - Keep Any N Recallable
Objects . . . . . . . . . . . . . . . . . . . . . . . . 581 Objects . . . . . . . . . . . . . . . . . . . . . . . . 582
20.7. Operation 9: CB_RECALLABLE_OBJ_AVAIL - Signal 20.7. Operation 9: CB_RECALLABLE_OBJ_AVAIL - Signal
Resources for Recallable Objects . . . . . . . . . . . . 584 Resources for Recallable Objects . . . . . . . . . . . . 585
20.8. Operation 10: CB_RECALL_SLOT - Change Flow Control 20.8. Operation 10: CB_RECALL_SLOT - Change Flow Control
Limits . . . . . . . . . . . . . . . . . . . . . . . . . 585 Limits . . . . . . . . . . . . . . . . . . . . . . . . . 586
20.9. Operation 11: CB_SEQUENCE - Supply Backchannel 20.9. Operation 11: CB_SEQUENCE - Supply Backchannel
Sequencing and Control . . . . . . . . . . . . . . . . . 586 Sequencing and Control . . . . . . . . . . . . . . . . . 587
20.10. Operation 12: CB_WANTS_CANCELLED - Cancel Pending 20.10. Operation 12: CB_WANTS_CANCELLED - Cancel Pending
Delegation Wants . . . . . . . . . . . . . . . . . . . . 588 Delegation Wants . . . . . . . . . . . . . . . . . . . . 589
20.11. Operation 13: CB_NOTIFY_LOCK - Notify Client of 20.11. Operation 13: CB_NOTIFY_LOCK - Notify Client of
Possible Lock Availability . . . . . . . . . . . . . . . 589 Possible Lock Availability . . . . . . . . . . . . . . . 590
20.12. Operation 14: CB_NOTIFY_DEVICEID - Notify Client of 20.12. Operation 14: CB_NOTIFY_DEVICEID - Notify Client of
Device ID Changes . . . . . . . . . . . . . . . . . . . 591 Device ID Changes . . . . . . . . . . . . . . . . . . . 592
20.13. Operation 10044: CB_ILLEGAL - Illegal Callback 20.13. Operation 10044: CB_ILLEGAL - Illegal Callback
Operation . . . . . . . . . . . . . . . . . . . . . . . 593 Operation . . . . . . . . . . . . . . . . . . . . . . . 594
21. Security Considerations . . . . . . . . . . . . . . . . . . . 593 21. Security Considerations . . . . . . . . . . . . . . . . . . . 594
22. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 595 22. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 596
22.1. Named Attribute Definitions . . . . . . . . . . . . . . 595 22.1. Named Attribute Definitions . . . . . . . . . . . . . . 596
22.1.1. Initial Registry . . . . . . . . . . . . . . . . . . 596 22.1.1. Initial Registry . . . . . . . . . . . . . . . . . . 597
22.1.2. Updating Registrations . . . . . . . . . . . . . . . 596 22.1.2. Updating Registrations . . . . . . . . . . . . . . . 597
22.2. Device ID Notifications . . . . . . . . . . . . . . . . 596 22.2. Device ID Notifications . . . . . . . . . . . . . . . . 597
22.2.1. Initial Registry . . . . . . . . . . . . . . . . . . 597 22.2.1. Initial Registry . . . . . . . . . . . . . . . . . . 598
22.2.2. Updating Registrations . . . . . . . . . . . . . . . 598 22.2.2. Updating Registrations . . . . . . . . . . . . . . . 599
22.3. Object Recall Types . . . . . . . . . . . . . . . . . . 598 22.3. Object Recall Types . . . . . . . . . . . . . . . . . . 599
22.3.1. Initial Registry . . . . . . . . . . . . . . . . . . 599 22.3.1. Initial Registry . . . . . . . . . . . . . . . . . . 600
22.3.2. Updating Registrations . . . . . . . . . . . . . . . 599 22.3.2. Updating Registrations . . . . . . . . . . . . . . . 600
22.4. Layout Types . . . . . . . . . . . . . . . . . . . . . . 599 22.4. Layout Types . . . . . . . . . . . . . . . . . . . . . . 600
22.4.1. Initial Registry . . . . . . . . . . . . . . . . . . 600 22.4.1. Initial Registry . . . . . . . . . . . . . . . . . . 601
22.4.2. Updating Registrations . . . . . . . . . . . . . . . 601 22.4.2. Updating Registrations . . . . . . . . . . . . . . . 602
22.4.3. Guidelines for Writing Layout Type Specifications . 601 22.4.3. Guidelines for Writing Layout Type Specifications . 602
22.5. Path Variable Definitions . . . . . . . . . . . . . . . 602 22.5. Path Variable Definitions . . . . . . . . . . . . . . . 603
22.5.1. Path Variables Registry . . . . . . . . . . . . . . 602 22.5.1. Path Variables Registry . . . . . . . . . . . . . . 603
22.5.2. Values for the ${ietf.org:CPU_ARCH} Variable . . . . 604 22.5.2. Values for the ${ietf.org:CPU_ARCH} Variable . . . . 605
22.5.3. Values for the ${ietf.org:OS_TYPE} Variable . . . . 605 22.5.3. Values for the ${ietf.org:OS_TYPE} Variable . . . . 606
23. References . . . . . . . . . . . . . . . . . . . . . . . . . 606 23. References . . . . . . . . . . . . . . . . . . . . . . . . . 607
23.1. Normative References . . . . . . . . . . . . . . . . . . 606 23.1. Normative References . . . . . . . . . . . . . . . . . . 607
23.2. Informative References . . . . . . . . . . . . . . . . . 608 23.2. Informative References . . . . . . . . . . . . . . . . . 609
Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 610 Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 611
Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 612 Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 613
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 613 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 614
1. Introduction 1. Introduction
1.1. The NFS Version 4 Minor Version 1 Protocol 1.1. The NFS Version 4 Minor Version 1 Protocol
The NFS version 4 minor version 1 (NFSv4.1) protocol is the second The NFS version 4 minor version 1 (NFSv4.1) protocol is the second
minor version of the NFS version 4 (NFSv4) protocol. The first minor minor version of the NFS version 4 (NFSv4) protocol. The first minor
version, NFSv4.0 is described in [29]. It generally follows the version, NFSv4.0 is described in [30]. It generally follows the
guidelines for minor versioning model listed in Section 10 of RFC guidelines for minor versioning model listed in Section 10 of RFC
3530. However, it diverges from guidelines 11 ("a client and server 3530. However, it diverges from guidelines 11 ("a client and server
that supports minor version X must support minor versions 0 through that supports minor version X must support minor versions 0 through
X-1"), and 12 ("no features may be introduced as mandatory in a minor X-1"), and 12 ("no features may be introduced as mandatory in a minor
version"). These divergences are due to the introduction of the version"). These divergences are due to the introduction of the
sessions model for managing non-idempotent operations and the sessions model for managing non-idempotent operations and the
RECLAIM_COMPLETE operation. These two new features are RECLAIM_COMPLETE operation. These two new features are
infrastructural in nature and simplify implementation of existing and infrastructural in nature and simplify implementation of existing and
other new features. Making them anything but REQUIRED would add other new features. Making them anything but REQUIRED would add
undue complexity to protocol definition and implementation. NFSv4.1 undue complexity to protocol definition and implementation. NFSv4.1
skipping to change at page 12, line 45 skipping to change at page 12, line 45
o describe the NFSv4.0 protocol, except where needed to contrast o describe the NFSv4.0 protocol, except where needed to contrast
with NFSv4.1. with NFSv4.1.
o modify the specification of the NFSv4.0 protocol. o modify the specification of the NFSv4.0 protocol.
o clarify the NFSv4.0 protocol. o clarify the NFSv4.0 protocol.
1.3. NFSv4 Goals 1.3. NFSv4 Goals
The NFSv4 protocol is a further revision of the NFS protocol defined The NFSv4 protocol is a further revision of the NFS protocol defined
already by NFSv3 [30]. It retains the essential characteristics of already by NFSv3 [31]. It retains the essential characteristics of
previous versions: easy recovery; independence of transport previous versions: easy recovery; independence of transport
protocols, operating systems and file systems; simplicity; and good protocols, operating systems and file systems; simplicity; and good
performance. NFSv4 has the following goals: performance. NFSv4 has the following goals:
o Improved access and good performance on the Internet. o Improved access and good performance on the Internet.
The protocol is designed to transit firewalls easily, perform well The protocol is designed to transit firewalls easily, perform well
where latency is high and bandwidth is low, and scale to very where latency is high and bandwidth is low, and scale to very
large numbers of clients per server. large numbers of clients per server.
skipping to change at page 18, line 27 skipping to change at page 18, line 27
filehandles. filehandles.
1.6.3.2. File Attributes 1.6.3.2. File Attributes
The NFSv4.1 protocol has a rich and extensible file object attribute The NFSv4.1 protocol has a rich and extensible file object attribute
structure, which is divided into REQUIRED, RECOMMENDED, and named structure, which is divided into REQUIRED, RECOMMENDED, and named
attributes (see Section 5). attributes (see Section 5).
Several (but not all) of the REQUIRED attributes are derived from the Several (but not all) of the REQUIRED attributes are derived from the
attributes of NFSv3 (see the definition of the fattr3 data type in attributes of NFSv3 (see the definition of the fattr3 data type in
[30]). An example of a REQUIRED attribute is the file object's type [31]). An example of a REQUIRED attribute is the file object's type
(Section 5.8.1.2) so that regular files can be distinguished from (Section 5.8.1.2) so that regular files can be distinguished from
directories (also known as folders in some operating environments) directories (also known as folders in some operating environments)
and other types of objects. REQUIRED attributes are discussed in and other types of objects. REQUIRED attributes are discussed in
Section 5.1. Section 5.1.
An example of three RECOMMENDED attributes are acl, sacl, and dacl. An example of three RECOMMENDED attributes are acl, sacl, and dacl.
These attributes define an Access Control List (ACL) on a file object These attributes define an Access Control List (ACL) on a file object
(Section 6). An ACL provides directory and file access control (Section 6). An ACL provides directory and file access control
beyond the model used in NFSv3. The ACL definition allows for beyond the model used in NFSv3. The ACL definition allows for
specification of specific sets of permissions for individual users specification of specific sets of permissions for individual users
skipping to change at page 21, line 25 skipping to change at page 21, line 25
o Data retention (Section 5.13). o Data retention (Section 5.13).
o Identification of the implementation of the NFS client and server o Identification of the implementation of the NFS client and server
(Section 18.35). (Section 18.35).
o Support for notification of the availability of byte-range locks o Support for notification of the availability of byte-range locks
(see the new OPEN4_RESULT_MAY_NOTIFY_LOCK reply flag in (see the new OPEN4_RESULT_MAY_NOTIFY_LOCK reply flag in
Section 18.16 and see Section 20.11). Section 18.16 and see Section 20.11).
o In NFSv4.1, LIPKEY and SPKM-3 are not required security mechanisms o In NFSv4.1, LIPKEY and SPKM-3 are not required security mechanisms
[31]. [32].
2. Core Infrastructure 2. Core Infrastructure
2.1. Introduction 2.1. Introduction
NFSv4.1 relies on core infrastructure common to nearly every NFSv4.1 relies on core infrastructure common to nearly every
operation. This core infrastructure is described in the remainder of operation. This core infrastructure is described in the remainder of
this section. this section.
2.2. RPC and XDR 2.2. RPC and XDR
skipping to change at page 23, line 47 skipping to change at page 23, line 47
------------------------------------------------------------------ ------------------------------------------------------------------
390003 krb5 1.2.840.113554.1.2.2 rpc_gss_svc_none yes yes 390003 krb5 1.2.840.113554.1.2.2 rpc_gss_svc_none yes yes
390004 krb5i 1.2.840.113554.1.2.2 rpc_gss_svc_integrity yes yes 390004 krb5i 1.2.840.113554.1.2.2 rpc_gss_svc_integrity yes yes
390005 krb5p 1.2.840.113554.1.2.2 rpc_gss_svc_privacy no yes 390005 krb5p 1.2.840.113554.1.2.2 rpc_gss_svc_privacy no yes
Note that the number and name of the pseudo flavor is presented here Note that the number and name of the pseudo flavor is presented here
as a mapping aid to the implementor. Because the NFSv4.1 protocol as a mapping aid to the implementor. Because the NFSv4.1 protocol
includes a method to negotiate security and it understands the GSS- includes a method to negotiate security and it understands the GSS-
API mechanism, the pseudo flavor is not needed. The pseudo flavor is API mechanism, the pseudo flavor is not needed. The pseudo flavor is
needed for the NFSv3 since the security negotiation is done via the needed for the NFSv3 since the security negotiation is done via the
MOUNT protocol as described in [32]. MOUNT protocol as described in [33].
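As a mapping aid only, the three rows shown above can be held in a small lookup table. This is a minimal sketch: the type name PseudoFlavor, the table name PSEUDO_FLAVORS, and the helper lookup_by_name are illustrative assumptions, not part of the protocol, and only the first four columns of the table are modeled.

```python
from typing import NamedTuple, Optional, Tuple


class PseudoFlavor(NamedTuple):
    """One row of the table above (hypothetical container type)."""
    name: str
    mechanism_oid: str
    service: str


# Kerberos V5 GSS-API mechanism OID, copied from the table rows.
KRB5_OID = "1.2.840.113554.1.2.2"

# Pseudo flavor number -> (name, mechanism OID, RPCSEC_GSS service).
PSEUDO_FLAVORS = {
    390003: PseudoFlavor("krb5", KRB5_OID, "rpc_gss_svc_none"),
    390004: PseudoFlavor("krb5i", KRB5_OID, "rpc_gss_svc_integrity"),
    390005: PseudoFlavor("krb5p", KRB5_OID, "rpc_gss_svc_privacy"),
}


def lookup_by_name(name: str) -> Optional[Tuple[int, PseudoFlavor]]:
    """Return (number, row) for a pseudo flavor name, or None if unknown."""
    for number, flavor in PSEUDO_FLAVORS.items():
        if flavor.name == name:
            return number, flavor
    return None
```

An NFSv4.1 implementation negotiates the mechanism and service directly, so a table like this would mainly help code that also has to handle NFSv3 mounts.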
At the time NFSv4.1 was specified, AES with HMAC-SHA1 was a REQUIRED At the time NFSv4.1 was specified, AES with HMAC-SHA1 was a REQUIRED
algorithm set for Kerberos V5. In contrast, when NFSv4.0 was algorithm set for Kerberos V5. In contrast, when NFSv4.0 was
specified, weaker algorithm sets were REQUIRED for Kerberos V5, and specified, weaker algorithm sets were REQUIRED for Kerberos V5, and
were REQUIRED in the NFSv4.0 specification, because the Kerberos V5 were REQUIRED in the NFSv4.0 specification, because the Kerberos V5
specification at the time did not specify stronger algorithms. The specification at the time did not specify stronger algorithms. The
NFSv4.1 specification does not specify REQUIRED algorithms for NFSv4.1 specification does not specify REQUIRED algorithms for
Kerberos V5, and instead, the implementor is expected to track the Kerberos V5, and instead, the implementor is expected to track the
evolution of the Kerberos V5 standard if and when stronger algorithms evolution of the Kerberos V5 standard if and when stronger algorithms
are specified. are specified.
skipping to change at page 27, line 17 skipping to change at page 27, line 17
same string. The implementor is cautioned from an approach that same string. The implementor is cautioned from an approach that
requires the string to be recorded in a local file because this requires the string to be recorded in a local file because this
precludes the use of the implementation in an environment where precludes the use of the implementation in an environment where
there is no local disk and all file access is from an NFSv4.1 there is no local disk and all file access is from an NFSv4.1
server. server.
o The string should be the same for each server network address that o The string should be the same for each server network address that
the client accesses. This way, if a server has multiple the client accesses. This way, if a server has multiple
interfaces, the client can trunk traffic over multiple network interfaces, the client can trunk traffic over multiple network
paths as described in Section 2.10.5. (Note: the precise opposite paths as described in Section 2.10.5. (Note: the precise opposite
was advised in the NFSv4.0 specification [29].) was advised in the NFSv4.0 specification [30].)
o The algorithm for generating the string should not assume that the o The algorithm for generating the string should not assume that the
client's network address will not change, unless the client client's network address will not change, unless the client
implementation knows it is using statically assigned network implementation knows it is using statically assigned network
addresses. This includes changes between client incarnations and addresses. This includes changes between client incarnations and
even changes while the client is still running in its current even changes while the client is still running in its current
incarnation. Thus with dynamic address assignment, if the client incarnation. Thus with dynamic address assignment, if the client
includes just the client's network address in the co_ownerid includes just the client's network address in the co_ownerid
string, there is a real risk that after the client gives up the string, there is a real risk that after the client gives up the
network address, another client, using a similar algorithm for network address, another client, using a similar algorithm for
skipping to change at page 29, line 19 skipping to change at page 29, line 19
2.4.1. Upgrade from NFSv4.0 to NFSv4.1 2.4.1. Upgrade from NFSv4.0 to NFSv4.1
To facilitate upgrade from NFSv4.0 to NFSv4.1, a server may compare a To facilitate upgrade from NFSv4.0 to NFSv4.1, a server may compare a
client_owner4 in an EXCHANGE_ID with an nfs_client_id4 established client_owner4 in an EXCHANGE_ID with an nfs_client_id4 established
using the SETCLIENTID operation of NFSv4.0. A server that does so using the SETCLIENTID operation of NFSv4.0. A server that does so
will allow an upgraded client to avoid waiting until the lease (i.e. will allow an upgraded client to avoid waiting until the lease (i.e.
the lease established by the NFSv4.0 instance client) expires. This the lease established by the NFSv4.0 instance client) expires. This
requires the client_owner4 be constructed the same way as the requires the client_owner4 be constructed the same way as the
nfs_client_id4. If the latter's contents included the server's nfs_client_id4. If the latter's contents included the server's
network address (per the recommendations of the NFSv4.0 specification network address (per the recommendations of the NFSv4.0 specification
[29]), and the NFSv4.1 client does not wish to use a client ID that [30]), and the NFSv4.1 client does not wish to use a client ID that
prevents trunking, it should send two EXCHANGE_ID operations. The prevents trunking, it should send two EXCHANGE_ID operations. The
first EXCHANGE_ID will have a client_owner4 equal to the first EXCHANGE_ID will have a client_owner4 equal to the
nfs_client_id4. This will clear the state created by the NFSv4.0 nfs_client_id4. This will clear the state created by the NFSv4.0
client. The second EXCHANGE_ID will not have the server's network client. The second EXCHANGE_ID will not have the server's network
address. The state created for the second EXCHANGE_ID will not have address. The state created for the second EXCHANGE_ID will not have
to wait for lease expiration, because there will be no state to to wait for lease expiration, because there will be no state to
expire. expire.
2.4.2. Server Release of Client ID 2.4.2. Server Release of Client ID
skipping to change at page 29, line 49 skipping to change at page 29, line 49
to unilaterally release the client ID in order to conserve resources. to unilaterally release the client ID in order to conserve resources.
If the client contacts the server after this release, the server MUST If the client contacts the server after this release, the server MUST
ensure the client receives the appropriate error so that it will use ensure the client receives the appropriate error so that it will use
the EXCHANGE_ID/CREATE_SESSION sequence to establish a new client ID. the EXCHANGE_ID/CREATE_SESSION sequence to establish a new client ID.
The server ought to be very hesitant to release a client ID since the The server ought to be very hesitant to release a client ID since the
resulting work on the client to recover from such an event will be resulting work on the client to recover from such an event will be
the same burden as if the server had failed and restarted. Typically the same burden as if the server had failed and restarted. Typically
a server would not release a client ID unless there had been no a server would not release a client ID unless there had been no
activity from that client for many minutes. As long as there are activity from that client for many minutes. As long as there are
sessions, opens, locks, delegations, layouts, or wants, the server sessions, opens, locks, delegations, layouts, or wants, the server
MUST NOT release the client ID. See Section 2.10.12.1.4 for a MUST NOT release the client ID. See Section 2.10.13.1.4 for a
discussion on releasing inactive sessions. discussion on releasing inactive sessions.
2.4.3. Resolving Client Owner Conflicts 2.4.3. Resolving Client Owner Conflicts
When the server gets an EXCHANGE_ID for a client owner that currently When the server gets an EXCHANGE_ID for a client owner that currently
has no state, or that has state, but the lease has expired, the has no state, or that has state, but the lease has expired, the
server MUST allow the EXCHANGE_ID, and confirm the new client ID if server MUST allow the EXCHANGE_ID, and confirm the new client ID if
followed by the appropriate CREATE_SESSION. followed by the appropriate CREATE_SESSION.
When the server gets an EXCHANGE_ID for a new incarnation of a client When the server gets an EXCHANGE_ID for a new incarnation of a client
skipping to change at page 37, line 14 skipping to change at page 37, line 14
2.7. Minor Versioning 2.7. Minor Versioning
To address the requirement of an NFS protocol that can evolve as the To address the requirement of an NFS protocol that can evolve as the
need arises, the NFSv4.1 protocol contains the rules and framework to need arises, the NFSv4.1 protocol contains the rules and framework to
allow for future minor changes or versioning. allow for future minor changes or versioning.
The base assumption with respect to minor versioning is that any The base assumption with respect to minor versioning is that any
future accepted minor version will be documented in one or more future accepted minor version will be documented in one or more
standards track RFCs. Minor version zero of the NFSv4 protocol is standards track RFCs. Minor version zero of the NFSv4 protocol is
represented by [29], and minor version one is represented by this represented by [30], and minor version one is represented by this
document [[Comment.1: RFC Editor: change "document" to "RFC" when we document [[Comment.1: RFC Editor: change "document" to "RFC" when we
publish]]. The COMPOUND and CB_COMPOUND procedures support the publish]]. The COMPOUND and CB_COMPOUND procedures support the
encoding of the minor version being requested by the client. encoding of the minor version being requested by the client.
The following items represent the basic rules for the development of The following items represent the basic rules for the development of
minor versions. Note that a future minor version may modify or add minor versions. Note that a future minor version may modify or add
to the following rules as part of the minor version definition. to the following rules as part of the minor version definition.
1. Procedures are not added or deleted 1. Procedures are not added or deleted
skipping to change at page 40, line 23 skipping to change at page 40, line 23
2.9. Transport Layers 2.9. Transport Layers
2.9.1. REQUIRED and RECOMMENDED Properties of Transports 2.9.1. REQUIRED and RECOMMENDED Properties of Transports
NFSv4.1 works over RDMA and non-RDMA-based transports with the NFSv4.1 works over RDMA and non-RDMA-based transports with the
following attributes: following attributes:
o The transport supports reliable delivery of data, which NFSv4.1 o The transport supports reliable delivery of data, which NFSv4.1
requires but neither NFSv4.1 nor RPC has facilities for ensuring. requires but neither NFSv4.1 nor RPC has facilities for ensuring.
[33] [34]
o The transport delivers data in the order it was sent. Ordered o The transport delivers data in the order it was sent. Ordered
delivery simplifies detection of transmit errors, and simplifies delivery simplifies detection of transmit errors, and simplifies
the sending of arbitrary sized requests and responses, via the the sending of arbitrary sized requests and responses, via the
record marking protocol [3]. record marking protocol [3].
Where an NFSv4.1 implementation supports operation over the IP Where an NFSv4.1 implementation supports operation over the IP
network protocol, any transport used between NFS and IP MUST be among network protocol, any transport used between NFS and IP MUST be among
the IETF-approved congestion control transport protocols. At the the IETF-approved congestion control transport protocols. At the
time this document was written, the only two transports that had the time this document was written, the only two transports that had the
skipping to change at page 42, line 31 skipping to change at page 42, line 31
contents must not be blindly used when replies are sent from it, contents must not be blindly used when replies are sent from it,
and credit information appropriate to the channel must be and credit information appropriate to the channel must be
refreshed by the RPC layer. refreshed by the RPC layer.
In addition, as described in Section 2.10.6.2, while a session is In addition, as described in Section 2.10.6.2, while a session is
active, the NFSv4.1 requester MUST NOT stop waiting for a reply. active, the NFSv4.1 requester MUST NOT stop waiting for a reply.
2.9.3. Ports 2.9.3. Ports
Historically, NFSv3 servers have listened over TCP port 2049. The Historically, NFSv3 servers have listened over TCP port 2049. The
registered port 2049 [34] for the NFS protocol should be the default registered port 2049 [35] for the NFS protocol should be the default
configuration. NFSv4.1 clients SHOULD NOT use the RPC binding configuration. NFSv4.1 clients SHOULD NOT use the RPC binding
protocols as described in [35]. protocols as described in [36].
2.10. Session 2.10. Session
NFSv4.1 clients and servers MUST support and MUST use the session NFSv4.1 clients and servers MUST support and MUST use the session
feature as described in this section. feature as described in this section.
2.10.1. Motivation and Overview 2.10.1. Motivation and Overview
Previous versions and minor versions of NFS have suffered from the Previous versions and minor versions of NFS have suffered from the
following: following:
skipping to change at page 56, line 39 skipping to change at page 56, line 39
Given that well formulated XIDs continue to be required, this begs Given that well formulated XIDs continue to be required, this begs
the question why SEQUENCE and CB_SEQUENCE replies have a session ID, the question why SEQUENCE and CB_SEQUENCE replies have a session ID,
slot ID and sequence ID? Having the session ID in the reply means slot ID and sequence ID? Having the session ID in the reply means
the requester does not have to use the XID to look up the session ID, the requester does not have to use the XID to look up the session ID,
which would be necessary if the connection were associated with which would be necessary if the connection were associated with
multiple sessions. Having the slot ID and sequence ID in the reply multiple sessions. Having the slot ID and sequence ID in the reply
means the requester does not have to use the XID to look up the slot means the requester does not have to use the XID to look up the slot
ID and sequence ID. Furthermore, since the XID is only 32 bits, it ID and sequence ID. Furthermore, since the XID is only 32 bits, it
is too small to guarantee the re-association of a reply with its is too small to guarantee the re-association of a reply with its
request ([36]); having session ID, slot ID, and sequence ID in the request ([37]); having session ID, slot ID, and sequence ID in the
reply allows the client to validate that the reply in fact belongs to reply allows the client to validate that the reply in fact belongs to
the matched request. the matched request.
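The validation described above can be sketched as follows. This is a minimal illustration, not taken from the draft: the names SeqInfo and PendingRequests are hypothetical, and the sketch assumes the requester keeps one outstanding request per XID.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SeqInfo:
    """Hypothetical record of the identifiers carried by SEQUENCE/CB_SEQUENCE."""
    session_id: bytes
    slot_id: int
    sequence_id: int


class PendingRequests:
    """Tracks outstanding requests by XID and validates replies against them."""

    def __init__(self) -> None:
        self._by_xid: Dict[int, SeqInfo] = {}

    def sent(self, xid: int, info: SeqInfo) -> None:
        # The XID is only 32 bits wide, so it is masked before use as a key.
        self._by_xid[xid & 0xFFFFFFFF] = info

    def match_reply(self, xid: int, reply: SeqInfo) -> bool:
        # The XID alone selects a candidate request; the session ID, slot ID,
        # and sequence ID in the reply confirm that it really belongs to it.
        key = xid & 0xFFFFFFFF
        expected = self._by_xid.get(key)
        if expected is None or expected != reply:
            return False
        del self._by_xid[key]
        return True
```

A reply whose XID matches but whose session ID, slot ID, or sequence ID differs is rejected rather than delivered to the waiting request.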
The SEQUENCE (and CB_SEQUENCE) operation also carries a The SEQUENCE (and CB_SEQUENCE) operation also carries a
"highest_slotid" value which carries additional requester slot usage "highest_slotid" value which carries additional requester slot usage
information. The requester MUST always indicate the slot ID information. The requester MUST always indicate the slot ID
representing the outstanding request with the highest-numbered slot representing the outstanding request with the highest-numbered slot
value. The requester should in all cases provide the most value. The requester should in all cases provide the most
conservative value possible, although it can be increased somewhat conservative value possible, although it can be increased somewhat
above the actual instantaneous usage to maintain some minimum or above the actual instantaneous usage to maintain some minimum or
skipping to change at page 57, line 32 skipping to change at page 57, line 32
requester is permitted to use on a subsequent SEQUENCE or requester is permitted to use on a subsequent SEQUENCE or
CB_SEQUENCE operation. The replier's enforced highest_slotid CB_SEQUENCE operation. The replier's enforced highest_slotid
SHOULD be no less than the highest_slotid the requester indicated SHOULD be no less than the highest_slotid the requester indicated
in the SEQUENCE or CB_SEQUENCE arguments. in the SEQUENCE or CB_SEQUENCE arguments.
A requester can be intransigent with respect to lowering its A requester can be intransigent with respect to lowering its
highest_slotid argument to a Sequence operation, i.e. the highest_slotid argument to a Sequence operation, i.e. the
requester continues to ignore the target highest_slotid in the requester continues to ignore the target highest_slotid in the
response to a Sequence operation, and continues to set its response to a Sequence operation, and continues to set its
highest_slotid argument to be higher than the target highest_slotid argument to be higher than the target
highest_slotid. This can be considered particularily egregious highest_slotid. This can be considered particularly egregious
behavior when the replier knows there are no outstanding requests behavior when the replier knows there are no outstanding requests
with slot IDs higher than its target highest_slotid. When faced with slot IDs higher than its target highest_slotid. When faced
with such intransigence, the replier is free to take more forceful with such intransigence, the replier is free to take more forceful
action, and MAY reply with a new enforced highest_slotid that is action, and MAY reply with a new enforced highest_slotid that is
less than its previous enforced highest_slotid. Thereafter, if less than its previous enforced highest_slotid. Thereafter, if
the requester continues to send requests with a highest_slotid the requester continues to send requests with a highest_slotid
that is greater than the replier's new enforced highest_slotid, that is greater than the replier's new enforced highest_slotid,
the server MAY return NFS4ERR_BAD_HIGH_SLOT, unless the slot ID in the server MAY return NFS4ERR_BAD_HIGH_SLOT, unless the slot ID in
the request is greater than the new enforced highest_slotid, and the request is greater than the new enforced highest_slotid, and
the request is a retry. the request is a retry.
skipping to change at page 59, line 26 skipping to change at page 59, line 26
cache entry for the slot whenever an error is returned from SEQUENCE cache entry for the slot whenever an error is returned from SEQUENCE
or CB_SEQUENCE. or CB_SEQUENCE.
2.10.6.1.3. Optional Reply Caching 2.10.6.1.3. Optional Reply Caching
On a per-request basis the requester can choose to direct the replier On a per-request basis the requester can choose to direct the replier
to cache the reply to all operations after the first operation to cache the reply to all operations after the first operation
(SEQUENCE or CB_SEQUENCE) via the sa_cachethis or csa_cachethis (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or csa_cachethis
fields of the arguments to SEQUENCE or CB_SEQUENCE. The reason it fields of the arguments to SEQUENCE or CB_SEQUENCE. The reason it
would not direct the replier to cache the entire reply is that the would not direct the replier to cache the entire reply is that the
request is composed of all idempotent operations [33]. Caching the request is composed of all idempotent operations [34]. Caching the
reply may offer little benefit. If the reply is too large (see reply may offer little benefit. If the reply is too large (see
Section 2.10.6.4), it may not be cacheable anyway. Even if the reply Section 2.10.6.4), it may not be cacheable anyway. Even if the reply
to an idempotent request is small enough to cache, unnecessarily caching to an idempotent request is small enough to cache, unnecessarily caching
the reply slows down the server and increases RPC latency. the reply slows down the server and increases RPC latency.
Whether the requester requests the reply to be cached or not has no Whether the requester requests the reply to be cached or not has no
effect on the slot processing. If the results of SEQUENCE or effect on the slot processing. If the results of SEQUENCE or
CB_SEQUENCE are NFS4_OK, then the slot's sequence ID MUST be CB_SEQUENCE are NFS4_OK, then the slot's sequence ID MUST be
incremented by one. If a requester does not direct the replier to incremented by one. If a requester does not direct the replier to
cache the reply, the replier MUST do one of the following: cache the reply, the replier MUST do one of the following:
skipping to change at page 65, line 33 skipping to change at page 65, line 33
view the problem is as a single transaction consisting of each view the problem is as a single transaction consisting of each
operation in the COMPOUND followed by storing the result in operation in the COMPOUND followed by storing the result in
persistent storage, then finally a transaction commit. If there is a persistent storage, then finally a transaction commit. If there is a
failure before the transaction is committed, then the server rolls failure before the transaction is committed, then the server rolls
back the transaction. If the server itself fails, then when it restarts, back the transaction. If the server itself fails, then when it restarts,
its recovery logic could roll back the transaction before starting its recovery logic could roll back the transaction before starting
the NFSv4.1 server. the NFSv4.1 server.
While the description of the implementation for atomic execution of While the description of the implementation for atomic execution of
the request and caching of the reply is beyond the scope of this the request and caching of the reply is beyond the scope of this
document, an example implementation for NFSv2 [37] is described in document, an example implementation for NFSv2 [38] is described in
[38]. [39].
2.10.7. RDMA Considerations 2.10.7. RDMA Considerations
A complete discussion of the operation of RPC-based protocols over A complete discussion of the operation of RPC-based protocols over
RDMA transports is in [8]. A discussion of the operation of NFSv4, RDMA transports is in [8]. A discussion of the operation of NFSv4,
including NFSv4.1, over RDMA is in [9]. Where RDMA is considered, including NFSv4.1, over RDMA is in [9]. Where RDMA is considered,
this specification assumes the use of such a layering; it addresses this specification assumes the use of such a layering; it addresses
only the upper layer issues relevant to making best use of RPC/RDMA. only the upper layer issues relevant to making best use of RPC/RDMA.
2.10.7.1. RDMA Connection Resources 2.10.7.1. RDMA Connection Resources
skipping to change at page 68, line 45 skipping to change at page 68, line 45
2.10.8.2. Backchannel RPC Security 2.10.8.2. Backchannel RPC Security
When the NFSv4.1 client establishes the backchannel, it informs the When the NFSv4.1 client establishes the backchannel, it informs the
server of the security flavors and principals to use when sending server of the security flavors and principals to use when sending
requests. If the security flavor is RPCSEC_GSS, the client expresses requests. If the security flavor is RPCSEC_GSS, the client expresses
the principal in the form of an established RPCSEC_GSS context. The the principal in the form of an established RPCSEC_GSS context. The
server is free to use any of the flavor/principal combinations the server is free to use any of the flavor/principal combinations the
client offers, but it MUST NOT use unoffered combinations. This way, client offers, but it MUST NOT use unoffered combinations. This way,
the client need not provide a target GSS principal for the the client need not provide a target GSS principal for the
backchannel as it did with NFSv4.0, nor the server have to implement backchannel as it did with NFSv4.0, nor the server have to implement
an RPCSEC_GSS initiator as it did with NFSv4.0 [29]. an RPCSEC_GSS initiator as it did with NFSv4.0 [30].
The CREATE_SESSION (Section 18.36) and BACKCHANNEL_CTL The CREATE_SESSION (Section 18.36) and BACKCHANNEL_CTL
(Section 18.33) operations allow the client to specify flavor/ (Section 18.33) operations allow the client to specify flavor/
principal combinations. principal combinations.
Also note that the SP4_SSV state protection mode (see Section 18.35 Also note that the SP4_SSV state protection mode (see Section 18.35
and Section 2.10.8.3) has the side benefit of providing SSV-derived and Section 2.10.8.3) has the side benefit of providing SSV-derived
RPCSEC_GSS contexts (Section 2.10.9). RPCSEC_GSS contexts (Section 2.10.9).
2.10.8.3. Protection from Unauthorized State Changes 2.10.8.3. Protection from Unauthorized State Changes
skipping to change at page 78, line 10 skipping to change at page 78, line 10
imply that the SSV or its GSS context have expired. imply that the SSV or its GSS context have expired.
The client MUST establish an SSV via SET_SSV before the SSV GSS The client MUST establish an SSV via SET_SSV before the SSV GSS
context can be used to emit tokens from GSS_Wrap() and GSS_GetMIC(). context can be used to emit tokens from GSS_Wrap() and GSS_GetMIC().
If SET_SSV has not been successfully called, attempts to emit tokens If SET_SSV has not been successfully called, attempts to emit tokens
MUST fail. MUST fail.
The SSV mechanism does not support replay detection and sequencing in The SSV mechanism does not support replay detection and sequencing in
its tokens because RPCSEC_GSS does not use those features (See its tokens because RPCSEC_GSS does not use those features (See
Section 5.2.2 "Context Creation Requests" in [4]). Section 5.2.2 "Context Creation Requests" in [4]). However,
Section 2.10.10 discusses special considerations for the SSV
mechanism when used with RPCSEC_GSS.
2.10.10. Session Mechanics - Steady State 2.10.10. Security Considerations for RPCSEC_GSS when using the SSV
Mechanism
2.10.10.1. Obligations of the Server When a client ID is created with SP4_SSV state protection (see
Section 18.35), the client is permitted to associate multiple
RPCSEC_GSS handles with the single SSV GSS context (see
Section 2.10.9). Because of the way RPCSEC_GSS (both version 1 and
version 2, see [4] and [12]) calculates the verifier of the reply,
special care must be taken by the implementation of the NFSv4.1
client to prevent attacks by a man-in-the-middle. The verifier of an
RPCSEC_GSS reply is the output of GSS_GetMIC() applied to the input
value of the seq_num field of the RPCSEC_GSS credential (data type
rpc_gss_cred_ver_1_t) (see Section 5.3.3.2 of [4]). If multiple
RPCSEC_GSS handles share the same GSS context, then if one handle is
used to send a request with the same seq_num value as another handle,
an attacker could block the reply, and replace it with the verifier
used for the other handle.
There are multiple ways to prevent the attack on the SSV RPCSEC_GSS
verifier in the reply. The simplest is believed to be as follows.
o Each time one or more new SSV RPCSEC_GSS handles are created via
EXCHANGE_ID, the client SHOULD send a SET_SSV operation to modify
the SSV. By changing the SSV, the new handles will not result in
the re-use of an SSV RPCSEC_GSS verifier in a reply.
o When a requester decides to use N SSV RPCSEC_GSS handles, it
SHOULD assign a unique and non-overlapping range of seq_nums to
each SSV RPCSEC_GSS handle. The size of each range SHOULD be
equal to MAXSEQ / N (see Section 5 of [4] for the definition of
MAXSEQ). When an SSV RPCSEC_GSS handle reaches its maximum, it
SHOULD force the replier to destroy the handle by sending a NULL
RPC request with seq_num set to MAXSEQ + 1 (see Section 5.3.3.3 of
[4]).
o When the requester wants to increase or decrease N, it SHOULD
force the replier to destroy all N handles by sending a NULL RPC
request on each handle with seq_num set to MAXSEQ + 1. If the
requester is the client, it SHOULD send a SET_SSV operation before
using new handles. If the requester is the server, then the
client SHOULD send a SET_SSV operation when it detects that the
server has forced it to destroy a backchannel's SSV RPCSEC_GSS
handle. By sending a SET_SSV operation, the SSV will change, and
so the attacker will be unable to successfully replay a
previous verifier in a reply to the requester.
Note that if the replier carefully creates the SSV RPCSEC_GSS
handles, the related risk of a man-in-the-middle splicing a forged
SSV RPCSEC_GSS credential with a verifier for another handle does not
exist. This is because the verifier in an RPCSEC_GSS request is
computed from input that includes both the RPCSEC_GSS handle and
seq_num (see Section 5.3.1 of [4]). Provided the replier takes care
to avoid re-using the value of an RPCSEC_GSS handle that it creates,
such as by including a generation number in the handle, the man-in-
the-middle will not be able to successfully replay a previous
verifier in the request to a replier.
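The seq_num partitioning recommended above can be sketched as follows. This is an illustration under stated assumptions rather than a normative algorithm: the function names are hypothetical, and MAXSEQ is assumed to be 0x80000000 as defined by the RPCSEC_GSS specification [4].

```python
from typing import List, Optional

# Assumed value of MAXSEQ from the RPCSEC_GSS specification [4].
MAXSEQ = 0x80000000


def seq_num_ranges(n: int, maxseq: int = MAXSEQ) -> List[range]:
    """Split [0, maxseq) into n non-overlapping ranges of size maxseq // n."""
    if n <= 0:
        raise ValueError("at least one handle is required")
    size = maxseq // n
    return [range(i * size, (i + 1) * size) for i in range(n)]


def next_seq_num(current: int, allowed: range) -> Optional[int]:
    """Return the next seq_num for a handle, or None once its range is used up.

    A result of None is the point at which the requester would force the
    replier to destroy the handle (a NULL RPC with seq_num MAXSEQ + 1).
    """
    nxt = current + 1
    return nxt if nxt in allowed else None
```

With N = 4, each handle receives 0x20000000 sequence numbers, and no seq_num value can be used under two handles that share the SSV GSS context.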
2.10.11. Session Mechanics - Steady State
2.10.11.1. Obligations of the Server
The server has the primary obligation to monitor the state of The server has the primary obligation to monitor the state of
backchannel resources that the client has created for the server backchannel resources that the client has created for the server
(RPCSEC_GSS contexts and backchannel connections). If these (RPCSEC_GSS contexts and backchannel connections). If these
resources vanish, the server takes action as specified in resources vanish, the server takes action as specified in
Section 2.10.12.2. Section 2.10.13.2.
2.10.10.2. Obligations of the Client 2.10.11.2. Obligations of the Client
The client SHOULD honor the following obligations in order to utilize The client SHOULD honor the following obligations in order to utilize
the session: the session:
o Keep a necessary session from going idle on the server. A client o Keep a necessary session from going idle on the server. A client
that requires a session, but nonetheless is not sending operations that requires a session, but nonetheless is not sending operations
risks having the server destroy the session. This is because risks having the server destroy the session. This is because
sessions consume resources, and resource limitations may force the sessions consume resources, and resource limitations may force the
server to cull an inactive session. A server MAY consider a server to cull an inactive session. A server MAY consider a
session to be inactive if the client has not used the session session to be inactive if the client has not used the session
before the session inactivity timer (Section 2.10.11) has expired. before the session inactivity timer (Section 2.10.12) has expired.
o Destroy the session when not needed. If a client has multiple o Destroy the session when not needed. If a client has multiple
sessions, one of which has no requests waiting for replies, and sessions, one of which has no requests waiting for replies, and
has been idle for some period of time, it SHOULD destroy the has been idle for some period of time, it SHOULD destroy the
session. session.
o Maintain GSS contexts for the backchannel. If the client requires o Maintain GSS contexts and RPCSEC_GSS handles for the backchannel.
the server to use the RPCSEC_GSS security flavor for callbacks, If the client requires the server to use the RPCSEC_GSS security
then it needs to be sure the contexts handed to the server via flavor for callbacks, then it needs to be sure the RPCSEC_GSS
BACKCHANNEL_CTL are unexpired. handles and/or their GSS contexts that are handed to the server
via BACKCHANNEL_CTL or CREATE_SESSION are unexpired.
o Preserve a connection for a backchannel. The server requires a o Preserve a connection for a backchannel. The server requires a
backchannel in order to gracefully recall recallable state, or backchannel in order to gracefully recall recallable state, or
notify the client of certain events. Note that if the connection notify the client of certain events. Note that if the connection
is not being used for the fore channel, there is no way for the is not being used for the fore channel, there is no way for the
client to tell if the connection is still alive (e.g., the server client to tell if the connection is still alive (e.g., the server
restarted without sending a disconnect). The onus is on the restarted without sending a disconnect). The onus is on the
server, not the client, to determine if the backchannel's server, not the client, to determine if the backchannel's
connection is alive, and to indicate in the response to a SEQUENCE connection is alive, and to indicate in the response to a SEQUENCE
operation when the last connection associated with a session's operation when the last connection associated with a session's
backchannel has disconnected. backchannel has disconnected.
2.10.10.3. Steps the Client Takes To Establish a Session 2.10.11.3. Steps the Client Takes To Establish a Session
If the client does not have a client ID, the client sends EXCHANGE_ID If the client does not have a client ID, the client sends EXCHANGE_ID
to establish a client ID. If it opts for SP4_MACH_CRED or SP4_SSV to establish a client ID. If it opts for SP4_MACH_CRED or SP4_SSV
protection, in the spo_must_enforce list of operations, it SHOULD at protection, in the spo_must_enforce list of operations, it SHOULD at
minimum specify: CREATE_SESSION, DESTROY_SESSION, minimum specify: CREATE_SESSION, DESTROY_SESSION,
BIND_CONN_TO_SESSION, BACKCHANNEL_CTL, and DESTROY_CLIENTID. If it opts BIND_CONN_TO_SESSION, BACKCHANNEL_CTL, and DESTROY_CLIENTID. If it opts
for SP4_SSV protection, the client needs to ask for SSV-based for SP4_SSV protection, the client needs to ask for SSV-based
RPCSEC_GSS handles. RPCSEC_GSS handles.
The client uses the client ID to send a CREATE_SESSION on a The client uses the client ID to send a CREATE_SESSION on a
skipping to change at page 80, line 5 skipping to change at page 81, line 14
If the client wants to use additional connections for the If the client wants to use additional connections for the
backchannel, then it needs to call BIND_CONN_TO_SESSION on each backchannel, then it needs to call BIND_CONN_TO_SESSION on each
connection it wants to use with the session. If the client wants to connection it wants to use with the session. If the client wants to
use additional connections for the fore channel, then it needs to use additional connections for the fore channel, then it needs to
call BIND_CONN_TO_SESSION if it specified SP4_SSV or SP4_MACH_CRED call BIND_CONN_TO_SESSION if it specified SP4_SSV or SP4_MACH_CRED
state protection when the client ID was created. state protection when the client ID was created.
At this point the session has reached steady state. At this point the session has reached steady state.
2.10.11. Session Inactivity Timer 2.10.12. Session Inactivity Timer
The server MAY maintain a session inactivity timer for each session. The server MAY maintain a session inactivity timer for each session.
If the session inactivity timer expires, then the server MAY destroy If the session inactivity timer expires, then the server MAY destroy
the session. To avoid losing a session due to inactivity, the client the session. To avoid losing a session due to inactivity, the client
MUST renew the session inactivity timer. The length of the session MUST renew the session inactivity timer. The length of the session
inactivity timer MUST NOT be less than the lease_time attribute inactivity timer MUST NOT be less than the lease_time attribute
(Section 5.8.1.11). As with lease renewal (Section 8.3), when the (Section 5.8.1.11). As with lease renewal (Section 8.3), when the
server receives a SEQUENCE operation, it resets the session server receives a SEQUENCE operation, it resets the session
inactivity timer, and MUST NOT allow the timer to expire while the inactivity timer, and MUST NOT allow the timer to expire while the
rest of the operations in the COMPOUND procedure's request are still rest of the operations in the COMPOUND procedure's request are still
executing. Once the last operation has finished, the server MUST set executing. Once the last operation has finished, the server MUST set
the session inactivity timer to expire no sooner than the sum of the the session inactivity timer to expire no sooner than the sum of the
current time and the value of the lease_time attribute. current time and the value of the lease_time attribute.
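The timer rules above can be sketched as follows. This is a minimal, single-threaded illustration: the class and method names are hypothetical, time is passed in explicitly as a number of seconds, and a real server would need locking around the counters.

```python
from typing import Optional


class SessionInactivityTimer:
    """Hypothetical per-session timer following the rules described above."""

    def __init__(self, lease_time: float, now: float,
                 timeout: Optional[float] = None) -> None:
        # The timer length must not be less than the lease_time attribute.
        self.timeout = lease_time if timeout is None else timeout
        if self.timeout < lease_time:
            raise ValueError("inactivity timeout is shorter than lease_time")
        self.in_flight = 0                      # COMPOUNDs still executing
        self.expires_at = now + self.timeout

    def on_sequence(self) -> None:
        # Receiving SEQUENCE resets the timer; while the rest of the
        # COMPOUND executes, the timer is not allowed to expire.
        self.in_flight += 1

    def on_compound_done(self, now: float) -> None:
        # After the last operation, expire no sooner than now + lease_time.
        self.in_flight -= 1
        self.expires_at = max(self.expires_at, now + self.timeout)

    def expired(self, now: float) -> bool:
        return self.in_flight == 0 and now >= self.expires_at
```

Tracking in-flight requests separately from the deadline keeps a long-running COMPOUND from losing its session while it is still executing.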
2.10.12. Session Mechanics - Recovery 2.10.13. Session Mechanics - Recovery
2.10.12.1. Events Requiring Client Action 2.10.13.1. Events Requiring Client Action
The following events require client action to recover. The following events require client action to recover.
2.10.12.1.1. RPCSEC_GSS Context Loss by Callback Path 2.10.13.1.1. RPCSEC_GSS Context Loss by Callback Path
If all RPCSEC_GSS contexts granted by the client to the server for If all RPCSEC_GSS handles granted by the client to the server for
callback use have expired, the client MUST establish a new context callback use have expired, the client MUST establish a new handle via
via BACKCHANNEL_CTL. The sr_status_flags field of the SEQUENCE BACKCHANNEL_CTL. The sr_status_flags field of the SEQUENCE results
results indicates when callback contexts are nearly expired, or fully indicates when callback handles are nearly expired, or fully expired
expired (see Section 18.46.3). (see Section 18.46.3).
2.10.12.1.2. Connection Loss 2.10.13.1.2. Connection Loss
If the client loses the last connection of the session, and if it wants If the client loses the last connection of the session, and if it wants
to retain the session, then it needs to create a new connection, and to retain the session, then it needs to create a new connection, and
if, when the client ID was created, BIND_CONN_TO_SESSION was if, when the client ID was created, BIND_CONN_TO_SESSION was
specified in the spo_must_enforce list, the client MUST use specified in the spo_must_enforce list, the client MUST use
BIND_CONN_TO_SESSION to associate the connection with the session. BIND_CONN_TO_SESSION to associate the connection with the session.
If there was a request outstanding at the time of the connection If there was a request outstanding at the time of the connection
loss, then if the client wants to continue to use the session, it MUST loss, then if the client wants to continue to use the session, it MUST
retry the request, as described in Section 2.10.6.2. Note that it is retry the request, as described in Section 2.10.6.2. Note that it is
skipping to change at page 81, line 11 skipping to change at page 82, line 20
disconnect. disconnect.
If the connection that was lost was the last one associated with the If the connection that was lost was the last one associated with the
backchannel, and the client wants to retain the backchannel and/or backchannel, and the client wants to retain the backchannel and/or
not put recallable state subject to revocation, the client needs to not put recallable state subject to revocation, the client needs to
reconnect, and if it does, it MUST associate the connection to the reconnect, and if it does, it MUST associate the connection to the
session and backchannel via BIND_CONN_TO_SESSION. The server SHOULD session and backchannel via BIND_CONN_TO_SESSION. The server SHOULD
indicate when it has no callback connection via the sr_status_flags indicate when it has no callback connection via the sr_status_flags
result from SEQUENCE. result from SEQUENCE.
2.10.12.1.3. Backchannel GSS Context Loss 2.10.13.1.3. Backchannel GSS Context Loss
Via the sr_status_flags result of the SEQUENCE operation or other Via the sr_status_flags result of the SEQUENCE operation or other
means, the client will learn if some or all of the RPCSEC_GSS means, the client will learn if some or all of the RPCSEC_GSS
contexts it assigned to the backchannel have been lost. If the contexts it assigned to the backchannel have been lost. If the
client wants to retain the backchannel and/or not put recallable client wants to retain the backchannel and/or not put recallable
state subject to revocation, the client needs to use state subject to revocation, the client needs to use
BACKCHANNEL_CTL to assign new contexts. BACKCHANNEL_CTL to assign new contexts.
2.10.12.1.4. Loss of Session 2.10.13.1.4. Loss of Session
The replier might lose a record of the session. Causes include: The replier might lose a record of the session. Causes include:
o Replier failure and restart o Replier failure and restart
o A catastrophe that causes the reply cache to be corrupted or lost o A catastrophe that causes the reply cache to be corrupted or lost
on the media it was stored on. This applies even if the replier on the media it was stored on. This applies even if the replier
indicated in the CREATE_SESSION results that it would persist the indicated in the CREATE_SESSION results that it would persist the
cache. cache.
skipping to change at page 84, line 6 skipping to change at page 85, line 15
state imply loss of session state, because the session depends on the state imply loss of session state, because the session depends on the
client ID; loss of client ID however does imply loss of session, client ID; loss of client ID however does imply loss of session,
lock, open, delegation, and layout state. See Section 8.4.2. A lock, open, delegation, and layout state. See Section 8.4.2. A
session can survive a server restart, but lock recovery may still be session can survive a server restart, but lock recovery may still be
needed. needed.
It is possible CREATE_SESSION will fail with NFS4ERR_STALE_CLIENTID It is possible CREATE_SESSION will fail with NFS4ERR_STALE_CLIENTID
(e.g. the server restarts and does not preserve client ID state). If (e.g. the server restarts and does not preserve client ID state). If
so, the client needs to call EXCHANGE_ID, followed by CREATE_SESSION. so, the client needs to call EXCHANGE_ID, followed by CREATE_SESSION.
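As an illustration that is not part of either draft revision, the fallback described above might look like the sketch below. The exception class and the two callables are hypothetical stand-ins for a client's RPC layer; the only behaviour taken from the text is that a stale client ID is answered with EXCHANGE_ID followed by a second CREATE_SESSION.

```python
class StaleClientID(Exception):
    """Hypothetical stand-in for an NFS4ERR_STALE_CLIENTID result."""


def recreate_session(exchange_id, create_session, client_id):
    """Return (client_id, session), re-establishing the client ID if needed.

    exchange_id() and create_session(client_id) are assumed callables
    supplied by the caller; they are not real library functions.
    """
    try:
        return client_id, create_session(client_id)
    except StaleClientID:
        # The server no longer knows the client ID, so obtain a new one
        # and create the session under it.
        new_client_id = exchange_id()
        return new_client_id, create_session(new_client_id)
```

A real client would also need to recover lock, open, delegation, and layout state after obtaining the new client ID; that step is left out of the sketch.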
2.10.12.2. Events Requiring Server Action 2.10.13.2. Events Requiring Server Action
The following events require server action to recover. The following events require server action to recover.
2.10.12.2.1. Client Crash and Restart 2.10.13.2.1. Client Crash and Restart
As described in Section 18.35, a restarted client sends EXCHANGE_ID As described in Section 18.35, a restarted client sends EXCHANGE_ID
in such a way it causes the server to delete any sessions it had. in such a way it causes the server to delete any sessions it had.
2.10.12.2.2. Client Crash with No Restart 2.10.13.2.2. Client Crash with No Restart
If a client crashes and never comes back, it will never send If a client crashes and never comes back, it will never send
EXCHANGE_ID with its old client owner. Thus the server has session EXCHANGE_ID with its old client owner. Thus the server has session
state that will never be used again. After an extended period of state that will never be used again. After an extended period of
time and if the server has resource constraints, it MAY destroy the time and if the server has resource constraints, it MAY destroy the
old session as well as locking state. old session as well as locking state.
2.10.12.2.3. Extended Network Partition 2.10.13.2.3. Extended Network Partition
To the server, the extended network partition may be no different To the server, the extended network partition may be no different
from a client crash with no restart (see Section 2.10.12.2.2). from a client crash with no restart (see Section 2.10.13.2.2).
Unless the server can discern that there is a network partition, it Unless the server can discern that there is a network partition, it
is free to treat the situation as if the client has crashed is free to treat the situation as if the client has crashed
permanently. permanently.
2.10.12.2.4. Backchannel Connection Loss 2.10.13.2.4. Backchannel Connection Loss
If there were callback requests outstanding at the time of a If there were callback requests outstanding at the time of a
connection loss, then the server MUST retry the request, as described connection loss, then the server MUST retry the request, as described
in Section 2.10.6.2. Note that it is not necessary to retry requests in Section 2.10.6.2. Note that it is not necessary to retry requests
over a connection with the same source network address or the same over a connection with the same source network address or the same
destination network address as the lost connection. As long as the destination network address as the lost connection. As long as the
session ID, slot ID, and sequence ID in the retry match that of the session ID, slot ID, and sequence ID in the retry match that of the
original request, the callback target will recognize the request as a original request, the callback target will recognize the request as a
retry even if it did see the request prior to disconnect. retry even if it did see the request prior to disconnect.
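As an illustration that is not part of either draft revision, the sketch below shows why the connection plays no role in recognizing a retry: the lookup key is built only from the session ID, slot ID, and sequence ID. The class and its method are assumptions for the example, and the sketch deliberately omits the other slot checks described in Section 2.10.6.2 (for instance, misordered sequence IDs).

```python
class ReplyCache:
    """Hypothetical callback-target reply cache keyed by session and slot."""

    def __init__(self):
        # (session_id, slot_id) -> (sequence_id, cached_reply)
        self._slots = {}

    def handle(self, session_id, slot_id, sequence_id, execute, connection=None):
        """Return (reply, is_retry). `connection` is accepted but never used."""
        key = (session_id, slot_id)
        cached = self._slots.get(key)
        if cached is not None and cached[0] == sequence_id:
            # Same session ID, slot ID, and sequence ID: this is a retry,
            # so return the cached reply without executing again.
            return cached[1], True
        reply = execute()
        self._slots[key] = (sequence_id, reply)
        return reply, False
```

Sending the same triple over a different connection returns the cached reply and does not re-execute the request.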
If the connection lost is the last one associated with the If the connection lost is the last one associated with the
backchannel, then the server MUST indicate that in the backchannel, then the server MUST indicate that in the
sr_status_flags field of every SEQUENCE reply until the backchannel sr_status_flags field of every SEQUENCE reply until the backchannel
is reestablished. There are two situations, each of which uses is reestablished. There are two situations, each of which uses
different status flags: no connectivity for the session's different status flags: no connectivity for the session's
backchannel, and no connectivity for any session backchannel of the backchannel, and no connectivity for any session backchannel of the
client. See Section 18.46 for a description of the appropriate flags client. See Section 18.46 for a description of the appropriate flags
in sr_status_flags. in sr_status_flags.
2.10.12.2.5. GSS Context Loss 2.10.13.2.5. GSS Context Loss
The server SHOULD monitor when the number of RPCSEC_GSS contexts The server SHOULD monitor when the number of RPCSEC_GSS contexts
assigned to the backchannel reaches one, and when that one context is assigned to the backchannel reaches one, and when that one context is
near expiry (i.e. between one and two periods of lease time), near expiry (i.e. between one and two periods of lease time),
indicate so in the sr_status_flags field of all SEQUENCE replies. indicate so in the sr_status_flags field of all SEQUENCE replies.
The server MUST indicate when the all of the backchannel's assigned The server MUST indicate when all of the backchannel's assigned
RPCSEC_GSS contexts have expired in the sr_status_flags field of all RPCSEC_GSS handles have expired via the sr_status_flags field of all
SEQUENCE replies. SEQUENCE replies.
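As an illustration that is not part of either draft revision, a server might derive the backchannel GSS status as sketched below. The function name and the returned labels are hypothetical, and the sketch assumes that "near expiry" means a remaining lifetime of at most two lease periods; the text itself only says "between one and two periods of lease time". A backchannel with no contexts at all is treated here as expired, which is also an assumption.

```python
def backchannel_gss_status(remaining_lifetimes, lease_time):
    """Classify the backchannel's RPCSEC_GSS contexts (or handles).

    remaining_lifetimes: seconds left for each assigned context;
    values <= 0 denote contexts that have already expired.
    Returns "expired", "expiring", or "ok" (labels invented for this sketch).
    """
    live = [r for r in remaining_lifetimes if r > 0]
    if not live:
        # All assigned contexts have expired: the server MUST report this.
        return "expired"
    if len(live) == 1 and live[0] <= 2 * lease_time:
        # Only one context is left and it is near expiry: SHOULD report this.
        return "expiring"
    return "ok"
```

The returned label would then be mapped onto the corresponding sr_status_flags bit in every SEQUENCE reply.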
2.10.13. Parallel NFS and Sessions 2.10.14. Parallel NFS and Sessions
A client and server can potentially be a non-pNFS implementation, a A client and server can potentially be a non-pNFS implementation, a
metadata server implementation, a data server implementation, or two metadata server implementation, a data server implementation, or two
or three types of implementations. The EXCHGID4_FLAG_USE_NON_PNFS, or three types of implementations. The EXCHGID4_FLAG_USE_NON_PNFS,
EXCHGID4_FLAG_USE_PNFS_MDS, and EXCHGID4_FLAG_USE_PNFS_DS flags (not EXCHGID4_FLAG_USE_PNFS_MDS, and EXCHGID4_FLAG_USE_PNFS_DS flags (not
mutually exclusive) are passed in the EXCHANGE_ID arguments and mutually exclusive) are passed in the EXCHANGE_ID arguments and
results to allow the client to indicate how it wants to use sessions results to allow the client to indicate how it wants to use sessions
created under the client ID, and to allow the server to indicate how created under the client ID, and to allow the server to indicate how
it will allow the sessions to be used. See Section 13.1 for pNFS it will allow the sessions to be used. See Section 13.1 for pNFS
sessions considerations. sessions considerations.
3. Protocol Constants and Data Types 3. Protocol Constants and Data Types
The syntax and semantics to describe the data types of the NFSv4.1 The syntax and semantics to describe the data types of the NFSv4.1
protocol are defined in the XDR RFC4506 [2] and RPC RFC1831 [3] protocol are defined in the XDR RFC4506 [2] and RPC RFC1831 [3]
documents. The next sections build upon the XDR data types to define documents. The next sections build upon the XDR data types to define
constants, types and structures specific to this protocol. The full constants, types and structures specific to this protocol. The full
list of XDR data types is in [12]. list of XDR data types is in [13].
3.1. Basic Constants 3.1. Basic Constants
const NFS4_FHSIZE = 128; const NFS4_FHSIZE = 128;
const NFS4_VERIFIER_SIZE = 8; const NFS4_VERIFIER_SIZE = 8;
const NFS4_OPAQUE_LIMIT = 1024; const NFS4_OPAQUE_LIMIT = 1024;
const NFS4_SESSIONID_SIZE = 16; const NFS4_SESSIONID_SIZE = 16;
const NFS4_INT64_MAX = 0x7fffffffffffffff; const NFS4_INT64_MAX = 0x7fffffffffffffff;
const NFS4_UINT64_MAX = 0xffffffffffffffff; const NFS4_UINT64_MAX = 0xffffffffffffffff;
skipping to change at page 87, line 47 skipping to change at page 89, line 16
| | Case-insensitive UTF-8 string. | | | Case-insensitive UTF-8 string. |
| utf8str_cs | typedef utf8string utf8str_cs; | | utf8str_cs | typedef utf8string utf8str_cs; |
| | Case-sensitive UTF-8 string. | | | Case-sensitive UTF-8 string. |
| utf8str_mixed | typedef utf8string utf8str_mixed; | | utf8str_mixed | typedef utf8string utf8str_mixed; |
| | UTF-8 strings with a case sensitive prefix and a | | | UTF-8 strings with a case sensitive prefix and a |
| | case insensitive suffix. | | | case insensitive suffix. |
| component4 | typedef utf8str_cs component4; | | component4 | typedef utf8str_cs component4; |
| | Represents path name components. | | | Represents path name components. |
| linktext4 | typedef utf8str_cs linktext4; | | linktext4 | typedef utf8str_cs linktext4; |
| | Symbolic link contents ("symbolic link" is | | | Symbolic link contents ("symbolic link" is |
| | defined in an Open Group [13] standard). | | | defined in an Open Group [14] standard). |
| pathname4 | typedef component4 pathname4<>; | | pathname4 | typedef component4 pathname4<>; |
| | Represents path name for fs_locations. | | | Represents path name for fs_locations. |
| verifier4 | typedef opaque verifier4[NFS4_VERIFIER_SIZE]; | | verifier4 | typedef opaque verifier4[NFS4_VERIFIER_SIZE]; |
| | Verifier used for various operations (COMMIT, | | | Verifier used for various operations (COMMIT, |
| | CREATE, EXCHANGE_ID, OPEN, READDIR, WRITE) | | | CREATE, EXCHANGE_ID, OPEN, READDIR, WRITE) |
| | NFS4_VERIFIER_SIZE is defined as 8. | | | NFS4_VERIFIER_SIZE is defined as 8. |
+---------------+---------------------------------------------------+ +---------------+---------------------------------------------------+
End of Base Data Types End of Base Data Types
skipping to change at page 90, line 48 skipping to change at page 92, line 15
3.3.9. netaddr4 3.3.9. netaddr4
struct netaddr4 { struct netaddr4 {
/* see struct rpcb in RFC 1833 */ /* see struct rpcb in RFC 1833 */
string na_r_netid<>; /* network id */ string na_r_netid<>; /* network id */
string na_r_addr<>; /* universal address */ string na_r_addr<>; /* universal address */
}; };
The netaddr4 data type is used to identify network transport The netaddr4 data type is used to identify network transport
endpoints. The r_netid and r_addr fields respectively contain a endpoints. The r_netid and r_addr fields respectively contain a
netid and uaddr. The netid and uaddr concepts are defined in [14]. netid and uaddr. The netid and uaddr concepts are defined in [15].
The netid and uaddr formats for TCP over IPv4 and TCP over IPv6 are The netid and uaddr formats for TCP over IPv4 and TCP over IPv6 are
defined in [14], specifically Tables 2 and 3 and Sections 4.2.3.3 and defined in [15], specifically Tables 2 and 3 and Sections 4.2.3.3 and
4.2.3.4. 4.2.3.4.
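As an illustration that is not part of either draft revision, the sketch below builds and parses a universal address for TCP over IPv4. It assumes the conventional dotted form in which the port is appended to the dotted-quad address as two decimal octets (high byte, then low byte); the cited reference remains the authority on the format. The function names are hypothetical.

```python
import ipaddress


def ipv4_uaddr(host, port):
    """Format an IPv4 host and port as a universal address string."""
    addr = ipaddress.IPv4Address(host)  # raises ValueError on a bad address
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port out of range")
    return f"{addr}.{port >> 8}.{port & 0xFF}"


def parse_ipv4_uaddr(uaddr):
    """Split a universal address back into (host, port)."""
    parts = uaddr.split(".")
    if len(parts) != 6:
        raise ValueError("expected four address octets and two port octets")
    host = str(ipaddress.IPv4Address(".".join(parts[:4])))
    high, low = int(parts[4]), int(parts[5])
    if not (0 <= high <= 255 and 0 <= low <= 255):
        raise ValueError("port octet out of range")
    return host, (high << 8) | low
```

Under that assumption, a netaddr4 for a server at 192.0.2.1 port 2049 would carry na_r_netid "tcp" and na_r_addr "192.0.2.1.8.1".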
3.3.10. state_owner4 3.3.10. state_owner4
struct state_owner4 { struct state_owner4 {
clientid4 clientid; clientid4 clientid;
opaque owner<NFS4_OPAQUE_LIMIT>; opaque owner<NFS4_OPAQUE_LIMIT>;
}; };
typedef state_owner4 open_owner4; typedef state_owner4 open_owner4;
skipping to change at page 92, line 33 skipping to change at page 93, line 48
The layouttype4 data type is 32 bits in length. The range The layouttype4 data type is 32 bits in length. The range
represented by the layout type is split into three parts. Type 0x0 represented by the layout type is split into three parts. Type 0x0
is reserved. Types within the range 0x00000001-0x7FFFFFFF are is reserved. Types within the range 0x00000001-0x7FFFFFFF are
globally unique and are assigned according to the description in globally unique and are assigned according to the description in
Section 22.4; they are maintained by IANA. Types within the range Section 22.4; they are maintained by IANA. Types within the range
0x80000000-0xFFFFFFFF are site specific and for private use only. 0x80000000-0xFFFFFFFF are site specific and for private use only.
The LAYOUT4_NFSV4_1_FILES enumeration specifies that the NFSv4.1 file The LAYOUT4_NFSV4_1_FILES enumeration specifies that the NFSv4.1 file
layout type, as defined in Section 13, is to be used. The layout type, as defined in Section 13, is to be used. The
LAYOUT4_OSD2_OBJECTS enumeration specifies that the object layout, as LAYOUT4_OSD2_OBJECTS enumeration specifies that the object layout, as
defined in [39], is to be used. Similarly, the LAYOUT4_BLOCK_VOLUME defined in [40], is to be used. Similarly, the LAYOUT4_BLOCK_VOLUME
enumeration specifies that the block/volume layout, as defined in enumeration specifies that the block/volume layout, as defined in
[40], is to be used. [41], is to be used.
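As an illustration that is not part of either draft revision, the three-way split of the layouttype4 range can be expressed as a small classifier. The function name and the returned labels are assumptions for the example; the range boundaries are those given in the text.

```python
def classify_layout_type(value):
    """Classify a layouttype4 value by the range it falls in."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("layouttype4 is a 32-bit unsigned value")
    if value == 0x0:
        return "reserved"
    if value <= 0x7FFFFFFF:
        return "iana"  # globally unique, maintained by IANA
    return "private"  # site specific, private use only
```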
3.3.14. deviceid4 3.3.14. deviceid4
const NFS4_DEVICEID4_SIZE = 16; const NFS4_DEVICEID4_SIZE = 16;
typedef opaque deviceid4[NFS4_DEVICEID4_SIZE]; typedef opaque deviceid4[NFS4_DEVICEID4_SIZE];
Layout information includes device IDs that specify a storage device Layout information includes device IDs that specify a storage device
through a compact handle. Addressing and type information is through a compact handle. Addressing and type information is
obtained with the GETDEVICEINFO operation. Device IDs are not obtained with the GETDEVICEINFO operation. Device IDs are not
skipping to change at page 96, line 50 skipping to change at page 98, line 10
for a file system object. The contents of the filehandle are opaque for a file system object. The contents of the filehandle are opaque
to the client. Therefore, the server is responsible for translating to the client. Therefore, the server is responsible for translating
the filehandle to an internal representation of the file system the filehandle to an internal representation of the file system
object. object.
4.1. Obtaining the First Filehandle 4.1. Obtaining the First Filehandle
The operations of the NFS protocol are defined in terms of one or The operations of the NFS protocol are defined in terms of one or
more filehandles. Therefore, the client needs a filehandle to more filehandles. Therefore, the client needs a filehandle to
initiate communication with the server. With the NFSv3 protocol initiate communication with the server. With the NFSv3 protocol
(RFC1813 [30]), there exists an ancillary protocol to obtain this (RFC1813 [31]), there exists an ancillary protocol to obtain this
first filehandle. The MOUNT protocol, RPC program number 100005, first filehandle. The MOUNT protocol, RPC program number 100005,
provides the mechanism of translating a string based file system path provides the mechanism of translating a string based file system path
name to a filehandle which can then be used by the NFS protocols. name to a filehandle which can then be used by the NFS protocols.
The MOUNT protocol has deficiencies in the area of security and use The MOUNT protocol has deficiencies in the area of security and use
via firewalls. This is one reason that the use of the public via firewalls. This is one reason that the use of the public
filehandle was introduced in RFC2054 [41] and RFC2055 [42]. With the filehandle was introduced in RFC2054 [42] and RFC2055 [43]. With the
use of the public filehandle in combination with the LOOKUP operation use of the public filehandle in combination with the LOOKUP operation
in the NFSv3 protocol, it has been demonstrated that the MOUNT in the NFSv3 protocol, it has been demonstrated that the MOUNT
protocol is unnecessary for viable interaction between NFS client and protocol is unnecessary for viable interaction between NFS client and
server. server.
Therefore, the NFSv4.1 protocol will not use an ancillary protocol Therefore, the NFSv4.1 protocol will not use an ancillary protocol
for translation from string based path names to a filehandle. Two for translation from string based path names to a filehandle. Two
special filehandles will be used as starting points for the NFS special filehandles will be used as starting points for the NFS
client. client.
skipping to change at page 107, line 8 skipping to change at page 108, line 13
MUST return NFS4ERR_INVAL. MUST return NFS4ERR_INVAL.
5.6. REQUIRED Attributes - List and Definition References 5.6. REQUIRED Attributes - List and Definition References
The list of REQUIRED attributes appears in Table 2. The meaning of The list of REQUIRED attributes appears in Table 2. The meaning of
the columns of the table are: the columns of the table are:
o Name: the name of attribute o Name: the name of attribute
o Id: the number assigned to the attribute. In the event of o Id: the number assigned to the attribute. In the event of
conflicts between the assigned number and [12], the latter is conflicts between the assigned number and [13], the latter is
likely authoritative, but should be resolved with Errata to this likely authoritative, but should be resolved with Errata to this
document and/or [12]. See [43] for the Errata process. document and/or [13]. See [44] for the Errata process.
o Data Type: The XDR data type of the attribute. o Data Type: The XDR data type of the attribute.
o Acc: Access allowed to the attribute. R means read-only (GETATTR o Acc: Access allowed to the attribute. R means read-only (GETATTR
may retrieve, SETATTR may not set). W means write-only (SETATTR may retrieve, SETATTR may not set). W means write-only (SETATTR
may set, GETATTR may not retrieve). R W means read/write (GETATTR may set, GETATTR may not retrieve). R W means read/write (GETATTR
may retrieve, SETATTR may set). may retrieve, SETATTR may set).
o Defined in: the section of this specification that describes the o Defined in: the section of this specification that describes the
attribute. attribute.
skipping to change at page 108, line 35 skipping to change at page 109, line 40
| layout_alignment | 66 | uint32_t | R | Section 5.12.2 | | layout_alignment | 66 | uint32_t | R | Section 5.12.2 |
| layout_blksize | 65 | uint32_t | R | Section 5.12.3 | | layout_blksize | 65 | uint32_t | R | Section 5.12.3 |
| layout_hint | 63 | layouthint4 | W | Section 5.12.4 | | layout_hint | 63 | layouthint4 | W | Section 5.12.4 |
| layout_type | 64 | layouttype4<> | R | Section 5.12.5 | | layout_type | 64 | layouttype4<> | R | Section 5.12.5 |
| maxfilesize | 27 | uint64_t | R | Section 5.8.2.17 | | maxfilesize | 27 | uint64_t | R | Section 5.8.2.17 |
| maxlink | 28 | uint32_t | R | Section 5.8.2.18 | | maxlink | 28 | uint32_t | R | Section 5.8.2.18 |
| maxname | 29 | uint32_t | R | Section 5.8.2.19 | | maxname | 29 | uint32_t | R | Section 5.8.2.19 |
| maxread | 30 | uint64_t | R | Section 5.8.2.20 | | maxread | 30 | uint64_t | R | Section 5.8.2.20 |
| maxwrite | 31 | uint64_t | R | Section 5.8.2.21 | | maxwrite | 31 | uint64_t | R | Section 5.8.2.21 |
| mdsthreshold | 68 | mdsthreshold4 | R | Section 5.12.6 | | mdsthreshold | 68 | mdsthreshold4 | R | Section 5.12.6 |
| mimetype | 32 | utf8<> | R W | Section 5.8.2.22 | | mimetype | 32 | utf8str_cs | R W | Section 5.8.2.22 |
| mode | 33 | mode4 | R W | Section 6.2.4 | | mode | 33 | mode4 | R W | Section 6.2.4 |
| mode_set_masked | 74 | mode_masked4 | W | Section 6.2.5 | | mode_set_masked | 74 | mode_masked4 | W | Section 6.2.5 |
| mounted_on_fileid | 55 | uint64_t | R | Section 5.8.2.23 | | mounted_on_fileid | 55 | uint64_t | R | Section 5.8.2.23 |
| no_trunc | 34 | bool | R | Section 5.8.2.24 | | no_trunc | 34 | bool | R | Section 5.8.2.24 |
| numlinks | 35 | uint32_t | R | Section 5.8.2.25 | | numlinks | 35 | uint32_t | R | Section 5.8.2.25 |
| owner | 36 | utf8<> | R W | Section 5.8.2.26 | | owner | 36 | utf8str_mixed | R W | Section 5.8.2.26 |
| owner_group | 37 | utf8<> | R W | Section 5.8.2.27 | | owner_group | 37 | utf8str_mixed | R W | Section 5.8.2.27 |
| quota_avail_hard | 38 | uint64_t | R | Section 5.8.2.28 | | quota_avail_hard | 38 | uint64_t | R | Section 5.8.2.28 |
| quota_avail_soft | 39 | uint64_t | R | Section 5.8.2.29 | | quota_avail_soft | 39 | uint64_t | R | Section 5.8.2.29 |
| quota_used | 40 | uint64_t | R | Section 5.8.2.30 | | quota_used | 40 | uint64_t | R | Section 5.8.2.30 |
| rawdev | 41 | specdata4 | R | Section 5.8.2.31 | | rawdev | 41 | specdata4 | R | Section 5.8.2.31 |
| retentevt_get | 71 | retention_get4 | R | Section 5.13.3 | | retentevt_get | 71 | retention_get4 | R | Section 5.13.3 |
| retentevt_set | 72 | retention_set4 | W | Section 5.13.4 | | retentevt_set | 72 | retention_set4 | W | Section 5.13.4 |
| retention_get | 69 | retention_get4 | R | Section 5.13.1 | | retention_get | 69 | retention_get4 | R | Section 5.13.1 |
| retention_hold | 73 | uint64_t | R W | Section 5.13.5 | | retention_hold | 73 | uint64_t | R W | Section 5.13.5 |
| retention_set | 70 | retention_set4 | W | Section 5.13.2 | | retention_set | 70 | retention_set4 | W | Section 5.13.2 |
| sacl | 59 | nfsacl41 | R W | Section 6.2.3 | | sacl | 59 | nfsacl41 | R W | Section 6.2.3 |
skipping to change at page 117, line 22 skipping to change at page 118, line 26
to the Windows operating environment. to the Windows operating environment.
5.8.2.37. Attribute 47: time_access 5.8.2.37. Attribute 47: time_access
The time_access attribute represents the time of last access to the The time_access attribute represents the time of last access to the
object by a read that was satisfied by the server. The notion of object by a read that was satisfied by the server. The notion of
what is an "access" depends on server's operating environment and/or what is an "access" depends on server's operating environment and/or
the server's file system semantics. For example, for servers obeying the server's file system semantics. For example, for servers obeying
POSIX semantics, time_access would be updated only by the READ and POSIX semantics, time_access would be updated only by the READ and
READDIR operations and not any of the operations that modify the READDIR operations and not any of the operations that modify the
content of the object [15], [16], [17]. Of course, setting the content of the object [16], [17], [18]. Of course, setting the
corresponding time_access_set attribute is another way to modify the corresponding time_access_set attribute is another way to modify the
time_access attribute. time_access attribute.
Whenever the file object resides on a writable file system, the Whenever the file object resides on a writable file system, the
server should make best efforts to record time_access into stable server should make best efforts to record time_access into stable
storage. However, to mitigate the performance effects of doing so, storage. However, to mitigate the performance effects of doing so,
and most especially whenever the server is satisfying the read of the and most especially whenever the server is satisfying the read of the
object's content from its cache, the server MAY cache access time object's content from its cache, the server MAY cache access time
updates and lazily write them to stable storage. It is also updates and lazily write them to stable storage. It is also
acceptable to give administrators of the server the option to disable acceptable to give administrators of the server the option to disable
skipping to change at page 118, line 23 skipping to change at page 119, line 27
5.8.2.44. Attribute 54: time_modify_set 5.8.2.44. Attribute 54: time_modify_set
Set the time of last modification to the object. SETATTR use only. Set the time of last modification to the object. SETATTR use only.
5.9. Interpreting owner and owner_group 5.9. Interpreting owner and owner_group
The RECOMMENDED attributes "owner" and "owner_group" (and also users The RECOMMENDED attributes "owner" and "owner_group" (and also users
and groups within the "acl" attribute) are represented in terms of a and groups within the "acl" attribute) are represented in terms of a
UTF-8 string. To avoid a representation that is tied to a particular UTF-8 string. To avoid a representation that is tied to a particular
underlying implementation at the client or server, the use of the underlying implementation at the client or server, the use of the
UTF-8 string has been chosen. Note that section 6.1 of RFC2624 [44] UTF-8 string has been chosen. Note that section 6.1 of RFC2624 [45]
provides additional rationale. It is expected that the client and provides additional rationale. It is expected that the client and
server will have their own local representation of owner and server will have their own local representation of owner and
owner_group that is used for local storage or presentation to the end owner_group that is used for local storage or presentation to the end
user. Therefore, it is expected that when these attributes are user. Therefore, it is expected that when these attributes are
transferred between the client and server that the local transferred between the client and server that the local
representation is translated to a syntax of the form "user@ representation is translated to a syntax of the form "user@
dns_domain". This will allow for a client and server that do not use dns_domain". This will allow for a client and server that do not use
the same local representation the ability to translate to a common the same local representation the ability to translate to a common
syntax that can be interpreted by both. syntax that can be interpreted by both.
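As an illustration that is not part of either draft revision, translation between a local representation and the "user@dns_domain" syntax might be sketched as follows. Both function names are hypothetical, and splitting on the last "@" is an assumption made for the example; mapping the user part to a local identity is left to the implementation.

```python
def to_wire_owner(local_user, dns_domain):
    """Build the string sent in the owner or owner_group attribute."""
    if not local_user or not dns_domain:
        raise ValueError("both user and domain are required")
    return f"{local_user}@{dns_domain}"


def from_wire_owner(owner):
    """Split a received owner string into (user, dns_domain)."""
    user, sep, domain = owner.rpartition("@")
    if not sep or not user or not domain:
        raise ValueError("owner is not of the form user@dns_domain")
    return user, domain
```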
skipping to change at page 120, line 19 skipping to change at page 121, line 23
The owner string "nobody" may be used to designate an anonymous user, The owner string "nobody" may be used to designate an anonymous user,
which will be associated with a file created by a security principal which will be associated with a file created by a security principal
that cannot be mapped through normal means to the owner attribute. that cannot be mapped through normal means to the owner attribute.
Users and implementations of NFSv4.1 SHOULD NOT use "nobody" to Users and implementations of NFSv4.1 SHOULD NOT use "nobody" to
designate a real user whose access is not anonymous. designate a real user whose access is not anonymous.
5.10. Character Case Attributes 5.10. Character Case Attributes
With respect to the case_insensitive and case_preserving attributes, With respect to the case_insensitive and case_preserving attributes,
each UCS-4 character (which UTF-8 encodes) can be mapped according to each UCS-4 character (which UTF-8 encodes) can be mapped according to
Appendix B.2 of RFC3454 [18]. For general character handling and Appendix B.2 of RFC3454 [19]. For general character handling and
internationalization issues, see Section 14. internationalization issues, see Section 14.
5.11. Directory Notification Attributes 5.11. Directory Notification Attributes
As described in Section 18.39, the client can request a minimum delay As described in Section 18.39, the client can request a minimum delay
for notifications of changes to attributes, but the server is free to for notifications of changes to attributes, but the server is free to
ignore what the client requests. The client can determine in advance ignore what the client requests. The client can determine in advance
what notification delays the server will accept by issuing a GETATTR what notification delays the server will accept by issuing a GETATTR
for either or both of two directory notification attributes. When for either or both of two directory notification attributes. When
the client calls the GET_DIR_DELEGATION operation and asks for the client calls the GET_DIR_DELEGATION operation and asks for
skipping to change at page 138, line 7 skipping to change at page 139, line 7
set unless the NFSv4.1 server has the means to set the set unless the NFSv4.1 server has the means to set the
ACE4_SYNCHRONIZE bit. The second copy will not have the ACE4_SYNCHRONIZE bit. The second copy will not have the
permission set unless the NFSv4.1 server has the means to permission set unless the NFSv4.1 server has the means to
retrieve the ACE4_SYNCHRONIZE bit. retrieve the ACE4_SYNCHRONIZE bit.
Server implementations need not provide the granularity of control Server implementations need not provide the granularity of control
that is implied by this list of masks. For example, POSIX-based that is implied by this list of masks. For example, POSIX-based
systems might not distinguish ACE4_APPEND_DATA (the ability to append systems might not distinguish ACE4_APPEND_DATA (the ability to append
to a file) from ACE4_WRITE_DATA (the ability to modify existing to a file) from ACE4_WRITE_DATA (the ability to modify existing
contents); both masks would be tied to a single "write" permission contents); both masks would be tied to a single "write" permission
[19]. When such a server returns attributes to the client, it would [20]. When such a server returns attributes to the client, it would
show both ACE4_APPEND_DATA and ACE4_WRITE_DATA if and only if the show both ACE4_APPEND_DATA and ACE4_WRITE_DATA if and only if the
write permission is enabled. write permission is enabled.
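
As a minimal sketch of the reporting rule above, assuming a server whose only relevant control is a single POSIX-style write permission: the helper name reported_write_mask is hypothetical, and the two mask values are the ACE4_WRITE_DATA and ACE4_APPEND_DATA constants from the NFSv4.1 ACE mask definitions.

```python
# ACE mask bits as defined for NFSv4.1 access control entries.
ACE4_WRITE_DATA = 0x00000002
ACE4_APPEND_DATA = 0x00000004


def reported_write_mask(posix_write_enabled):
    """Hypothetical helper: mask bits a coarse-grained server reports.

    Both bits are shown if and only if the single write permission
    is enabled; otherwise neither bit is shown.
    """
    if posix_write_enabled:
        return ACE4_WRITE_DATA | ACE4_APPEND_DATA
    return 0
```

Such a server never reports one of the two bits without the other, which is what lets a client recognize that the finer distinction is not available.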
If a server receives a SETATTR request that it cannot accurately If a server receives a SETATTR request that it cannot accurately
implement, it should err in the direction of more restricted access, implement, it should err in the direction of more restricted access,
except in the previously discussed cases of execute and read. For except in the previously discussed cases of execute and read. For
example, suppose a server cannot distinguish overwriting data from example, suppose a server cannot distinguish overwriting data from
appending new data, as described in the previous paragraph. If a appending new data, as described in the previous paragraph. If a
client submits an ALLOW ACE where ACE4_APPEND_DATA is set but client submits an ALLOW ACE where ACE4_APPEND_DATA is set but
ACE4_WRITE_DATA is not (or vice versa), the server should either turn ACE4_WRITE_DATA is not (or vice versa), the server should either turn
skipping to change at page 156, line 17 skipping to change at page 157, line 17
clients should use strong security mechanisms to access the pseudo clients should use strong security mechanisms to access the pseudo
file system in order to prevent man-in-the-middle attacks. file system in order to prevent man-in-the-middle attacks.
8. State Management 8. State Management
Integrating locking into the NFS protocol necessarily causes it to be Integrating locking into the NFS protocol necessarily causes it to be
stateful. With the inclusion of such features as share reservations, stateful. With the inclusion of such features as share reservations,
file and directory delegations, recallable layouts, and support for file and directory delegations, recallable layouts, and support for
mandatory byte-range locking, the protocol becomes substantially more mandatory byte-range locking, the protocol becomes substantially more
dependent on proper management of state than the traditional dependent on proper management of state than the traditional
combination of NFS and NLM [45]. These features include expanded combination of NFS and NLM [46]. These features include expanded
locking facilities, which provide some measure of interclient locking facilities, which provide some measure of interclient
exclusion, but the state also offers features not readily providable exclusion, but the state also offers features not readily providable
using a stateless model. There are three components to making this using a stateless model. There are three components to making this
state manageable: state manageable:
o Clear division between client and server o Clear division between client and server
o Ability to reliably detect inconsistency in state between client o Ability to reliably detect inconsistency in state between client
and server and server
skipping to change at page 167, line 50 skipping to change at page 168, line 50
(sr_status_flags) returned by sequence and take the appropriate (sr_status_flags) returned by sequence and take the appropriate
action (see Section 18.46.3 for details). action (see Section 18.46.3 for details).
o The status bits SEQ4_STATUS_CB_PATH_DOWN and o The status bits SEQ4_STATUS_CB_PATH_DOWN and
SEQ4_STATUS_CB_PATH_DOWN_SESSION indicate problems with the SEQ4_STATUS_CB_PATH_DOWN_SESSION indicate problems with the
backchannel which the client may need to address in order to backchannel which the client may need to address in order to
receive callback requests. receive callback requests.
o The status bits SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRING and o The status bits SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRING and
SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRED indicate problems with GSS SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRED indicate problems with GSS
contexts for the backchannel which the client may have to address contexts or RPCSEC_GSS handles for the backchannel which the
to allow callback requests to be sent to it. client may have to address to allow callback requests to be sent
to it.
o The status bits SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED, o The status bits SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED,
SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED, SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED,
SEQ4_STATUS_ADMIN_STATE_REVOKED, and SEQ4_STATUS_ADMIN_STATE_REVOKED, and
SEQ4_STATUS_RECALLABLE_STATE_REVOKED notify the client of lock SEQ4_STATUS_RECALLABLE_STATE_REVOKED notify the client of lock
revocation events. When these bits are set, the client should use revocation events. When these bits are set, the client should use
TEST_STATEID to find what stateids have been revoked and use TEST_STATEID to find what stateids have been revoked and use
FREE_STATEID to acknowledge loss of the associated state. FREE_STATEID to acknowledge loss of the associated state.
o The status bit SEQ4_STATUS_LEASE_MOVE indicates that o The status bit SEQ4_STATUS_LEASE_MOVE indicates that
skipping to change at page 172, line 49 skipping to change at page 174, line 4
requests to be processed during the grace period, it MUST determine requests to be processed during the grace period, it MUST determine
that no lock subsequently reclaimed will be rejected and that no lock that no lock subsequently reclaimed will be rejected and that no lock
subsequently reclaimed would have prevented any I/O operation subsequently reclaimed would have prevented any I/O operation
processed during the grace period. processed during the grace period.
Clients should be prepared for the return of NFS4ERR_GRACE errors for Clients should be prepared for the return of NFS4ERR_GRACE errors for
non-reclaim lock and I/O requests. In this case the client should non-reclaim lock and I/O requests. In this case the client should
employ a retry mechanism for the request. A delay (on the order of employ a retry mechanism for the request. A delay (on the order of
several seconds) between retries should be used to avoid overwhelming several seconds) between retries should be used to avoid overwhelming
the server. Further discussion of the general issue is included in the server. Further discussion of the general issue is included in
[46]. The client must account for the server that can perform I/O [47]. The client must account for the server that can perform I/O
and non-reclaim locking requests within the grace period as well as and non-reclaim locking requests within the grace period as well as
those that cannot do so. those that cannot do so.
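
The retry behavior described above might be sketched as follows. This is an illustration only: send_request is a hypothetical callable standing in for whatever issues the non-reclaim lock or I/O request and returns its status, and the 5-second delay and attempt limit are assumed values chosen to match "on the order of several seconds", not values taken from this document.

```python
import time

NFS4_OK = 0
NFS4ERR_GRACE = 10013  # NFSv4 error code for requests refused during grace


def retry_during_grace(send_request, delay_seconds=5.0, max_attempts=20,
                       sleep=time.sleep):
    """Reissue a request while the server answers NFS4ERR_GRACE.

    send_request is a hypothetical zero-argument callable that returns
    an NFSv4 status code. The delay between attempts keeps the client
    from overwhelming a server that is still in its grace period.
    """
    status = NFS4ERR_GRACE
    for attempt in range(max_attempts):
        status = send_request()
        if status != NFS4ERR_GRACE:
            return status  # success or an unrelated error; stop retrying
        if attempt + 1 < max_attempts:
            sleep(delay_seconds)
    return status
```

The same loop works whether or not the server permits I/O during the grace period: a server that does permit it simply never returns NFS4ERR_GRACE, so the first attempt returns immediately.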
A reclaim-type locking request outside the server's grace period can A reclaim-type locking request outside the server's grace period can
only succeed if the server can guarantee that no conflicting lock or only succeed if the server can guarantee that no conflicting lock or
I/O request has been granted since restart. I/O request has been granted since restart.
A server may, upon restart, establish a new value for the lease A server may, upon restart, establish a new value for the lease
period. Therefore, clients should, once a new client ID is period. Therefore, clients should, once a new client ID is
established, refetch the lease_time attribute and use it as the basis established, refetch the lease_time attribute and use it as the basis
skipping to change at page 216, line 36 skipping to change at page 217, line 36
attributes obtained via GETATTR. attributes obtained via GETATTR.
A client may validate its cached version of attributes for a file by A client may validate its cached version of attributes for a file by
fetching just the change and time_access attributes and assuming fetching just the change and time_access attributes and assuming
that if the change attribute has the same value as it did when the that if the change attribute has the same value as it did when the
attributes were cached, then no attributes other than time_access attributes were cached, then no attributes other than time_access
have changed. The reason why time_access is also fetched is because have changed. The reason why time_access is also fetched is because
many servers operate in environments where the operation that updates many servers operate in environments where the operation that updates
change does not update time_access. For example, POSIX file change does not update time_access. For example, POSIX file
semantics do not update access time when a file is modified by the semantics do not update access time when a file is modified by the
write system call [17]. Therefore, the client that wants a current write system call [18]. Therefore, the client that wants a current
time_access value should fetch it with change during the attribute time_access value should fetch it with change during the attribute
cache validation processing and update its cached time_access. cache validation processing and update its cached time_access.
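
The validation step above could look roughly like the sketch below. The AttrCache class and the revalidate function are hypothetical names introduced for illustration; the two fetched values are assumed to come from a GETATTR that requested only change and time_access.

```python
from dataclasses import dataclass, field


@dataclass
class AttrCache:
    """Hypothetical client-side attribute cache for one file."""
    change: int
    time_access: int
    other: dict = field(default_factory=dict)  # remaining cached attributes


def revalidate(cache, fetched_change, fetched_time_access):
    """Return True if the cached attributes other than time_access stand.

    time_access is always refreshed, because the operation that updates
    change does not necessarily update time_access.
    """
    cache.time_access = fetched_time_access
    if fetched_change == cache.change:
        return True
    # change moved: the other cached attributes can no longer be trusted
    cache.change = fetched_change
    cache.other.clear()
    return False
```

When revalidate returns False, the client would refetch whatever attributes it needs with a full GETATTR.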
The client may maintain a cache of modified attributes for those The client may maintain a cache of modified attributes for those
attributes intimately connected with data of modified regular files attributes intimately connected with data of modified regular files
(size, time_modify, and change). Other than those three attributes, (size, time_modify, and change). Other than those three attributes,
the client MUST NOT maintain a cache of modified attributes. the client MUST NOT maintain a cache of modified attributes.
Instead, attribute changes are immediately sent to the server. Instead, attribute changes are immediately sent to the server.
In some operating environments, the equivalent to time_access is In some operating environments, the equivalent to time_access is
skipping to change at page 255, line 27 skipping to change at page 256, line 27
}; };
The fs_location4 data type is used to represent the location of a The fs_location4 data type is used to represent the location of a
file system by providing a server name and the path to the root of file system by providing a server name and the path to the root of
the file system within that server's namespace. When a set of the file system within that server's namespace. When a set of
servers have corresponding file systems at the same path within their servers have corresponding file systems at the same path within their
namespaces, an array of server names may be provided. An entry in namespaces, an array of server names may be provided. An entry in
the server array is a UTF-8 string and represents one of a the server array is a UTF-8 string and represents one of a
traditional DNS host name, IPv4 address, or IPv6 address, or a zero- traditional DNS host name, IPv4 address, or IPv6 address, or a zero-
length string. An IPv4 or IPv6 address is represented as a universal length string. An IPv4 or IPv6 address is represented as a universal
address (see Section 3.3.9 and [14]), minus the netid, and either address (see Section 3.3.9 and [15]), minus the netid, and either
with or without the trailing ".p1.p2" suffix that represents the port with or without the trailing ".p1.p2" suffix that represents the port
number. If the suffix is omitted, then the default port, 2049, number. If the suffix is omitted, then the default port, 2049,
SHOULD be assumed. A zero-length string SHOULD be used to indicate SHOULD be assumed. A zero-length string SHOULD be used to indicate
the current address being used for the RPC call. It is not a the current address being used for the RPC call. It is not a
requirement that all servers that share the same rootpath be listed requirement that all servers that share the same rootpath be listed
in one fs_location4 instance. The array of server names is provided in one fs_location4 instance. The array of server names is provided
for convenience. Servers that share the same rootpath may also be for convenience. Servers that share the same rootpath may also be
listed in separate fs_location4 entries in the fs_locations listed in separate fs_location4 entries in the fs_locations
attribute. attribute.
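
A rough sketch of how a client might interpret one entry of the server array follows. The function name parse_server_entry and the returned tuple shape are hypothetical. The sketch assumes that the ".p1.p2" suffix encodes the port as p1 * 256 + p2, as in the universal address format, and that IPv6 entries do not use an embedded dotted-quad form. No port is returned for DNS host names, since the text above states the default only for addresses.

```python
DEFAULT_NFS_PORT = 2049


def _is_number(part):
    # ASCII decimal digits only, so int() is guaranteed to accept it
    return part.isascii() and part.isdigit()


def parse_server_entry(entry):
    """Return (kind, host, port) for one fs_location4 server entry.

    kind is one of "current", "ipv4", "ipv6", or "dns".
    """
    if entry == "":
        # zero-length string: the address currently used for the RPC call
        return ("current", None, None)
    parts = entry.split(".")
    if ":" in entry:
        # IPv6 (assumption: no embedded dotted-quad in the address)
        if len(parts) == 3 and _is_number(parts[1]) and _is_number(parts[2]):
            return ("ipv6", parts[0], int(parts[1]) * 256 + int(parts[2]))
        return ("ipv6", entry, DEFAULT_NFS_PORT)
    if len(parts) in (4, 6) and all(_is_number(p) for p in parts):
        host = ".".join(parts[:4])
        if len(parts) == 6:
            return ("ipv4", host, int(parts[4]) * 256 + int(parts[5]))
        return ("ipv4", host, DEFAULT_NFS_PORT)
    return ("dns", entry, None)
```

For instance, the suffix ".8.1" decodes to 8 * 256 + 1 = 2049, the same port that is assumed when the suffix is omitted.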
skipping to change at page 276, line 16 skipping to change at page 277, line 16
As noted in Figure 1, the storage protocol is the method used by As noted in Figure 1, the storage protocol is the method used by
the client to store and retrieve data directly from the storage the client to store and retrieve data directly from the storage
devices. devices.
The NFSv4.1 pNFS feature has been structured to allow for a variety The NFSv4.1 pNFS feature has been structured to allow for a variety
of storage protocols to be defined and used. One example storage of storage protocols to be defined and used. One example storage
protocol is NFSv4.1 itself (as documented in Section 13). Other protocol is NFSv4.1 itself (as documented in Section 13). Other
options for the storage protocol are described elsewhere and include: options for the storage protocol are described elsewhere and include:
o Block/volume protocols such as iSCSI ([47]), and FCP ([48]). The o Block/volume protocols such as iSCSI ([48]), and FCP ([49]). The
block/volume protocol support can be independent of the addressing block/volume protocol support can be independent of the addressing
structure of the block/volume protocol used, allowing more than structure of the block/volume protocol used, allowing more than
one protocol to access the same file data and enabling one protocol to access the same file data and enabling
extensibility to other block/volume protocols. See [40] for a extensibility to other block/volume protocols. See [41] for a
layout specification that allows pNFS to use block/volume storage layout specification that allows pNFS to use block/volume storage
protocols. protocols.
o Object protocols such as OSD over iSCSI or Fibre Channel [49]. o Object protocols such as OSD over iSCSI or Fibre Channel [50].
See [39] for a layout specification that allows pNFS to use object See [40] for a layout specification that allows pNFS to use object
storage protocols. storage protocols.
It is possible that various storage protocols are available to both It is possible that various storage protocols are available to both
client and server and it may be possible that a client and server do client and server and it may be possible that a client and server do
not have a matching storage protocol available to them. Because of not have a matching storage protocol available to them. Because of
this, the pNFS server MUST support normal NFSv4.1 access to any file this, the pNFS server MUST support normal NFSv4.1 access to any file
accessible by the pNFS feature; this will allow for continued accessible by the pNFS feature; this will allow for continued
interoperability between an NFSv4.1 client and server. interoperability between an NFSv4.1 client and server.
12.2.6. Control Protocol 12.2.6. Control Protocol
skipping to change at page 276, line 52 skipping to change at page 277, line 52
state required by the storage devices to perform client access state required by the storage devices to perform client access
control, and, depending on the storage protocol, the enforcement of control, and, depending on the storage protocol, the enforcement of
authentication and authorization so that restrictions that would be authentication and authorization so that restrictions that would be
enforced by the metadata server are also enforced by the storage enforced by the metadata server are also enforced by the storage
device. device.
A particular control protocol is not REQUIRED by NFSv4.1 but A particular control protocol is not REQUIRED by NFSv4.1 but
requirements are placed on the control protocol for maintaining requirements are placed on the control protocol for maintaining
attributes like modify time, the change attribute, and the end-of- attributes like modify time, the change attribute, and the end-of-
file (EOF) position. Note that if pNFS is layered over a clustered, file (EOF) position. Note that if pNFS is layered over a clustered,
parallel file system (e.g. PVFS [50]), the mechanisms that enable parallel file system (e.g. PVFS [51]), the mechanisms that enable
clustering and parallelism in that file system can be considered the clustering and parallelism in that file system can be considered the
control protocol. control protocol.
12.2.7. Layout Types 12.2.7. Layout Types
A layout describes the mapping of a file's data to the storage A layout describes the mapping of a file's data to the storage
devices that hold the data. A layout is said to belong to a specific devices that hold the data. A layout is said to belong to a specific
layout type (data type layouttype4, see Section 3.3.13). The layout layout type (data type layouttype4, see Section 3.3.13). The layout
type allows for variants to handle different storage protocols, such type allows for variants to handle different storage protocols, such
as those associated with block/volume [40], object [39], and file as those associated with block/volume [41], object [40], and file
(Section 13) layout types. A metadata server, along with its control (Section 13) layout types. A metadata server, along with its control
protocol, MUST support at least one layout type. A private sub-range protocol, MUST support at least one layout type. A private sub-range
of the layout type name space is also defined. Values from the of the layout type name space is also defined. Values from the
private layout type range MAY be used for internal testing or private layout type range MAY be used for internal testing or
experimentation. experimentation (see Section 3.3.13).
As an example, the organization of the file layout type could be an As an example, the organization of the file layout type could be an
array of tuples (e.g., device ID, filehandle), along with a array of tuples (e.g., device ID, filehandle), along with a
definition of how the data is stored across the devices (e.g., definition of how the data is stored across the devices (e.g.,
striping). A block/volume layout might be an array of tuples that striping). A block/volume layout might be an array of tuples that
store <device ID, block_number, block count> along with information store <device ID, block_number, block count> along with information
about block size and the associated file offset of the block number. about block size and the associated file offset of the block number.
An object layout might be an array of tuples <device ID, object ID> An object layout might be an array of tuples <device ID, object ID>
and an additional structure (i.e., the aggregation map) that defines and an additional structure (i.e., the aggregation map) that defines
how the logical byte sequence of the file data is serialized into the how the logical byte sequence of the file data is serialized into the
skipping to change at page 281, line 46 skipping to change at page 282, line 46
which a layout is held, does not necessarily conflict with the which a layout is held, does not necessarily conflict with the
holding of the layout that describes the file being modified. holding of the layout that describes the file being modified.
Therefore, it is the requirement of the storage protocol or layout Therefore, it is the requirement of the storage protocol or layout
type that determines the necessary behavior. For example, block/ type that determines the necessary behavior. For example, block/
volume layout types require that the layout's iomode agree with the volume layout types require that the layout's iomode agree with the
type of I/O being performed. type of I/O being performed.
Depending upon the layout type and storage protocol in use, storage Depending upon the layout type and storage protocol in use, storage
device access permissions may be granted by LAYOUTGET and may be device access permissions may be granted by LAYOUTGET and may be
encoded within the type-specific layout. For an example of storage encoded within the type-specific layout. For an example of storage
device access permissions, see an object based protocol such as [49]. device access permissions, see an object based protocol such as [50].
If access permissions are encoded within the layout, the metadata If access permissions are encoded within the layout, the metadata
server SHOULD recall the layout when those permissions become invalid server SHOULD recall the layout when those permissions become invalid
for any reason; for example when a file becomes unwritable or for any reason; for example when a file becomes unwritable or
inaccessible to a client. Note, clients are still required to inaccessible to a client. Note, clients are still required to
perform the appropriate access operations with open, lock and access perform the appropriate access operations with open, lock and access
as described above. The degree to which it is possible for the as described above. The degree to which it is possible for the
client to circumvent these access operations and the consequences of client to circumvent these access operations and the consequences of
doing so must be clearly specified by the individual layout type doing so must be clearly specified by the individual layout type
specifications. In addition, these specifications must be clear specifications. In addition, these specifications must be clear
about the requirements and non-requirements for the checking about the requirements and non-requirements for the checking
skipping to change at page 304, line 29 skipping to change at page 305, line 29
pNFS configuration. Such layout types SHOULD NOT be used when pNFS configuration. Such layout types SHOULD NOT be used when
client-only access checks do not provide sufficient assurance that client-only access checks do not provide sufficient assurance that
NFSv4.1 access control is being applied correctly. (This is not a NFSv4.1 access control is being applied correctly. (This is not a
problem for the file layout type described in Section 13 because the problem for the file layout type described in Section 13 because the
storage access protocol for LAYOUT4_NFSV4_1_FILES is NFSv4.1, and storage access protocol for LAYOUT4_NFSV4_1_FILES is NFSv4.1, and
thus the security model for storage device access via thus the security model for storage device access via
LAYOUT4_NFSV4_1_FILES is the same as that of the metadata server.) LAYOUT4_NFSV4_1_FILES is the same as that of the metadata server.)
For handling of access control specific to a layout, the reader For handling of access control specific to a layout, the reader
should examine the layout specification, such as the NFSv4.1/ should examine the layout specification, such as the NFSv4.1/
files-based layout (Section 13) of this document, the blocks layout files-based layout (Section 13) of this document, the blocks layout
[40], and objects layout [39]. [41], and objects layout [40].
13. NFSv4.1 as a Storage Protocol in pNFS: the File Layout Type 13. NFSv4.1 as a Storage Protocol in pNFS: the File Layout Type
This section describes the semantics and format of NFSv4.1 file-based This section describes the semantics and format of NFSv4.1 file-based
layouts for pNFS. NFSv4.1 file-based layouts use the layouts for pNFS. NFSv4.1 file-based layouts use the
LAYOUT4_NFSV4_1_FILES layout type. The LAYOUT4_NFSV4_1_FILES type LAYOUT4_NFSV4_1_FILES layout type. The LAYOUT4_NFSV4_1_FILES type
defines striping data across multiple NFSv4.1 data servers. defines striping data across multiple NFSv4.1 data servers.
13.1. Client ID and Session Considerations 13.1. Client ID and Session Considerations
skipping to change at page 307, line 8 skipping to change at page 308, line 8
Another scenario is for the metadata server and the storage device to Another scenario is for the metadata server and the storage device to
be distinct from one client's point of view, and the roles reversed be distinct from one client's point of view, and the roles reversed
from another client's point of view. For example, in the cluster from another client's point of view. For example, in the cluster
file system model, a metadata server to one client might be a data file system model, a metadata server to one client might be a data
server to another client. If NFSv4.1 is being used as the storage server to another client. If NFSv4.1 is being used as the storage
protocol, then pNFS servers need to encode the values of filehandles protocol, then pNFS servers need to encode the values of filehandles
according to their specific roles. according to their specific roles.
13.1.1. Sessions Considerations for Data Servers 13.1.1. Sessions Considerations for Data Servers
Section 2.10.10.2 states that a client has to keep its lease renewed Section 2.10.11.2 states that a client has to keep its lease renewed
in order to prevent a session from being deleted by the server. If in order to prevent a session from being deleted by the server. If
the reply to EXCHANGE_ID has just the EXCHGID4_FLAG_USE_PNFS_DS role the reply to EXCHANGE_ID has just the EXCHGID4_FLAG_USE_PNFS_DS role
set, then as noted in Section 13.6 the client will not be able to set, then as noted in Section 13.6 the client will not be able to
determine the data server's lease_time attribute, because GETATTR determine the data server's lease_time attribute, because GETATTR
will not be permitted. Instead, the rule is that any time a client will not be permitted. Instead, the rule is that any time a client
receives a layout referring it to a data server that returns just the receives a layout referring it to a data server that returns just the
EXCHGID4_FLAG_USE_PNFS_DS role, the client MAY assume that the EXCHGID4_FLAG_USE_PNFS_DS role, the client MAY assume that the
lease_time attribute from the metadata server that returned the lease_time attribute from the metadata server that returned the
layout applies to the data server. Thus the data server MUST be layout applies to the data server. Thus the data server MUST be
aware of the values of all lease_time attributes of all metadata aware of the values of all lease_time attributes of all metadata
skipping to change at page 329, line 41 skipping to change at page 330, line 41
layouts, then the implementation MUST support the SECINFO_NO_NAME layouts, then the implementation MUST support the SECINFO_NO_NAME
operation, on both the metadata and data servers. operation, on both the metadata and data servers.
14. Internationalization 14. Internationalization
The primary area in which NFSv4.1 needs to deal with The primary area in which NFSv4.1 needs to deal with
internationalization, or I18N, is that of file names and internationalization, or I18N, is that of file names and
other strings as used within the protocol. The choice of string other strings as used within the protocol. The choice of string
representation must allow reasonable name/string access to clients representation must allow reasonable name/string access to clients
which use various languages. The UTF-8 encoding of the UCS as which use various languages. The UTF-8 encoding of the UCS as
defined by ISO10646 [20] allows for this type of access and follows defined by ISO10646 [21] allows for this type of access and follows
the policy described in "IETF Policy on Character Sets and the policy described in "IETF Policy on Character Sets and
Languages", RFC2277 [21]. Languages", RFC2277 [22].
RFC3454 [18], otherwise known as "stringprep", documents a framework RFC3454 [19], otherwise known as "stringprep", documents a framework
for using Unicode/UTF-8 in networking protocols, so as "to increase for using Unicode/UTF-8 in networking protocols, so as "to increase
the likelihood that string input and string comparison work in ways the likelihood that string input and string comparison work in ways
that make sense for typical users throughout the world." A protocol that make sense for typical users throughout the world." A protocol
must define a profile of stringprep "in order to fully specify the must define a profile of stringprep "in order to fully specify the
processing options." The remainder of this Internationalization processing options." The remainder of this Internationalization
section defines the NFSv4.1 stringprep profiles. Much of the terminology section defines the NFSv4.1 stringprep profiles. Much of the terminology
used for the remainder of this section comes from stringprep. used for the remainder of this section comes from stringprep.
There are three UTF-8 string types defined for NFSv4.1: utf8str_cs, There are three UTF-8 string types defined for NFSv4.1: utf8str_cs,
utf8str_cis, and utf8str_mixed. Separate profiles are defined for utf8str_cis, and utf8str_mixed. Separate profiles are defined for
skipping to change at page 330, line 40 skipping to change at page 331, line 40
section 6 of stringprep) section 6 of stringprep)
o Any additional characters that are prohibited as output specific o Any additional characters that are prohibited as output specific
to the profile to the profile
Stringprep discusses Unicode characters, whereas NFSv4.1 renders Stringprep discusses Unicode characters, whereas NFSv4.1 renders
UTF-8 characters. Since there is a one-to-one mapping from UTF-8 to UTF-8 characters. Since there is a one-to-one mapping from UTF-8 to
Unicode, when the remainder of this document refers to Unicode, the Unicode, when the remainder of this document refers to Unicode, the
reader should assume UTF-8. reader should assume UTF-8.
Much of the text for the profiles comes from RFC3491 [22]. Much of the text for the profiles comes from RFC3491 [23].
14.1. Stringprep profile for the utf8str_cs type 14.1. Stringprep profile for the utf8str_cs type
Every use of the utf8str_cs type definition in the NFSv4 protocol Every use of the utf8str_cs type definition in the NFSv4 protocol
specification follows the profile named nfs4_cs_prep. specification follows the profile named nfs4_cs_prep.
14.1.1. Intended applicability of the nfs4_cs_prep profile 14.1.1. Intended applicability of the nfs4_cs_prep profile
The utf8str_cs type is a case sensitive string of UTF-8 characters. The utf8str_cs type is a case sensitive string of UTF-8 characters.
Its primary use in NFSv4.1 is for naming components and pathnames. Its primary use in NFSv4.1 is for naming components and pathnames.
skipping to change at page 348, line 20 skipping to change at page 349, line 20
A locking request was attempted which would require the upgrade or A locking request was attempted which would require the upgrade or
downgrade of a lock range already held by the owner when the server downgrade of a lock range already held by the owner when the server
does not support atomic upgrade or downgrade of locks. does not support atomic upgrade or downgrade of locks.
15.1.8.7. NFS4ERR_LOCK_RANGE (Error Code 10028) 15.1.8.7. NFS4ERR_LOCK_RANGE (Error Code 10028)
A lock request is operating on a range that overlaps in part a A lock request is operating on a range that overlaps in part a
currently held lock for the current lock-owner and does not precisely currently held lock for the current lock-owner and does not precisely
match a single such lock where the server does not support this type match a single such lock where the server does not support this type
of request, and thus does not implement POSIX locking semantics [23]. of request, and thus does not implement POSIX locking semantics [24].
See Section 18.10.4, Section 18.11.4, and Section 18.12.4 for a See Section 18.10.4, Section 18.11.4, and Section 18.12.4 for a
discussion of how this applies to LOCK, LOCKT, and LOCKU discussion of how this applies to LOCK, LOCKT, and LOCKU
respectively. respectively.
15.1.8.8. NFS4ERR_OPENMODE (Error Code 10038) 15.1.8.8. NFS4ERR_OPENMODE (Error Code 10038)
The client attempted a READ, WRITE, LOCK or other operation not The client attempted a READ, WRITE, LOCK or other operation not
sanctioned by the stateid passed (e.g. writing to a file opened only sanctioned by the stateid passed (e.g. writing to a file opened only
for read). for read).
skipping to change at page 406, line 5 skipping to change at page 407, line 5
o When a client executes a regular file, it has to read the file o When a client executes a regular file, it has to read the file
from the server. Strictly speaking, the server should not allow from the server. Strictly speaking, the server should not allow
the client to read a file being executed unless the user has read the client to read a file being executed unless the user has read
permissions on the file. Requiring users and administrators to set permissions on the file. Requiring users and administrators to set
read permissions on executable files in order to access them over read permissions on executable files in order to access them over
NFS is not going to be acceptable to some people. Historically, NFS is not going to be acceptable to some people. Historically,
NFS servers have allowed a user to READ a file if the user has NFS servers have allowed a user to READ a file if the user has
execute access to the file. execute access to the file.
As a practical example, the UNIX specification [51] states that an As a practical example, the UNIX specification [52] states that an
implementation claiming conformance to UNIX may indicate in the implementation claiming conformance to UNIX may indicate in the
access() programming interface's result that a privileged user has access() programming interface's result that a privileged user has
execute rights, even if no execute permission bits are set on the execute rights, even if no execute permission bits are set on the
regular file's attributes. It is possible to claim conformance to regular file's attributes. It is possible to claim conformance to
the UNIX specification and instead not indicate execute rights in the UNIX specification and instead not indicate execute rights in
that situation, which is true for some operating environments. that situation, which is true for some operating environments.
Suppose the operating environments of the client and server are Suppose the operating environments of the client and server are
implementing the access() semantics for privileged users differently, implementing the access() semantics for privileged users differently,
and the ACCESS operation implementations of the client and server and the ACCESS operation implementations of the client and server
follow their respective access() semantics. This can cause undesired follow their respective access() semantics. This can cause undesired
skipping to change at page 411, line 47 skipping to change at page 412, line 47
event or instantiation that may lead to a loss of uncommitted data. event or instantiation that may lead to a loss of uncommitted data.
Most commonly this occurs when the server is restarted; however, Most commonly this occurs when the server is restarted; however,
other events at the server may result in uncommitted data loss as other events at the server may result in uncommitted data loss as
well. well.
On success, the current filehandle retains its value. On success, the current filehandle retains its value.
18.3.4. IMPLEMENTATION 18.3.4. IMPLEMENTATION
The COMMIT operation is similar in operation and semantics to the The COMMIT operation is similar in operation and semantics to the
POSIX fsync() [24] system interface that synchronizes a file's state POSIX fsync() [25] system interface that synchronizes a file's state
with the disk (file data and metadata are flushed to disk or stable with the disk (file data and metadata are flushed to disk or stable
storage). COMMIT performs the same operation for a client, flushing storage). COMMIT performs the same operation for a client, flushing
any unsynchronized data and metadata on the server to the server's any unsynchronized data and metadata on the server to the server's
disk or stable storage for the specified file. Like fsync(2), it may disk or stable storage for the specified file. Like fsync(2), it may
be that there is some modified data or no modified data to be that there is some modified data or no modified data to
synchronize. The data may have been synchronized by the server's synchronize. The data may have been synchronized by the server's
normal periodic buffer synchronization activity. COMMIT should normal periodic buffer synchronization activity. COMMIT should
return NFS4_OK, unless there has been an unexpected error. return NFS4_OK, unless there has been an unexpected error.
COMMIT differs from fsync(2) in that it is possible for the client to COMMIT differs from fsync(2) in that it is possible for the client to
skipping to change at page 415, line 25 skipping to change at page 416, line 25
from the principal indicated in the RPC credentials of the call, but from the principal indicated in the RPC credentials of the call, but
the server's operating environment or file system semantics may the server's operating environment or file system semantics may
dictate other methods of derivation. Similarly, if createattrs dictate other methods of derivation. Similarly, if createattrs
includes neither the group attribute nor a group ACE, and if the includes neither the group attribute nor a group ACE, and if the
server's file system both supports and requires the notion of a group server's file system both supports and requires the notion of a group
attribute (or group ACE), the server MUST derive the group attribute attribute (or group ACE), the server MUST derive the group attribute
(or the corresponding owner ACE) for the file. This could be from (or the corresponding owner ACE) for the file. This could be from
the RPC call's credentials, such as the group principal if the the RPC call's credentials, such as the group principal if the
credentials include it (such as with AUTH_SYS), from the group credentials include it (such as with AUTH_SYS), from the group
identifier associated with the principal in the credentials (e.g., identifier associated with the principal in the credentials (e.g.,
POSIX systems have a user database [25] that has a group identifier POSIX systems have a user database [26] that has a group identifier
for every user identifier), inherited from the directory the object is for every user identifier), inherited from the directory the object is
created in, or whatever else the server's operating environment or created in, or whatever else the server's operating environment or
file system semantics dictate. This applies to the OPEN operation file system semantics dictate. This applies to the OPEN operation
too. too.
Conversely, it is possible the client will specify in createattrs an Conversely, it is possible the client will specify in createattrs an
owner attribute, group attribute, or ACL that the principal indicated owner attribute, group attribute, or ACL that the principal indicated
in the RPC call's credentials does not have permissions to create files in the RPC call's credentials does not have permissions to create files
for. The error to be returned in this instance is NFS4ERR_PERM. for. The error to be returned in this instance is NFS4ERR_PERM.
This applies to the OPEN operation too. This applies to the OPEN operation too.
skipping to change at page 429, line 51 skipping to change at page 430, line 51
Section 18.35) to send LOCKU. Section 18.35) to send LOCKU.
18.12.4. IMPLEMENTATION 18.12.4. IMPLEMENTATION
If the area to be unlocked does not correspond exactly to a lock If the area to be unlocked does not correspond exactly to a lock
actually held by the lock-owner the server may return the error actually held by the lock-owner the server may return the error
NFS4ERR_LOCK_RANGE. This includes the case in which the area is not NFS4ERR_LOCK_RANGE. This includes the case in which the area is not
locked, where the area is a sub-range of the area locked, where it locked, where the area is a sub-range of the area locked, where it
overlaps the area locked without matching exactly or the area overlaps the area locked without matching exactly or the area
specified includes multiple locks held by the lock-owner. In all of specified includes multiple locks held by the lock-owner. In all of
these cases, allowed by POSIX locking [23] semantics, a client these cases, allowed by POSIX locking [24] semantics, a client
receiving this error should, if it desires support for such receiving this error should, if it desires support for such
operations, simulate the operation using LOCKU on ranges operations, simulate the operation using LOCKU on ranges
corresponding to locks it actually holds, possibly followed by LOCK corresponding to locks it actually holds, possibly followed by LOCK
requests for the sub-ranges not being unlocked. requests for the sub-ranges not being unlocked.
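The simulation described above can be sketched as follows. This is only an illustration under stated assumptions: byte ranges are modeled as half-open (start, end) pairs, the client is assumed to track its held locks as a list of non-overlapping ranges, and the function name simulate_unlock is hypothetical rather than part of the protocol. Note that the LOCKU followed by LOCK sequence is not atomic, so another lock-owner could acquire a sub-range between the two steps.

```python
from typing import List, Tuple

# Hypothetical representation: half-open byte range [start, end).
Range = Tuple[int, int]


def simulate_unlock(held: List[Range], unlock: Range) -> Tuple[List[Range], List[Range]]:
    """Plan an unlock against a server that returns NFS4ERR_LOCK_RANGE
    unless LOCKU exactly matches a held lock.

    Returns (locku_ranges, relock_ranges): the held locks to release with
    LOCKU, and the sub-ranges to re-acquire with LOCK afterwards.
    """
    u_start, u_end = unlock
    locku: List[Range] = []
    relock: List[Range] = []
    for h_start, h_end in held:
        # Held locks that do not overlap the unlock range are left alone.
        if h_end <= u_start or h_start >= u_end:
            continue
        # Release the whole held lock, since only exact matches are accepted.
        locku.append((h_start, h_end))
        # Re-acquire the parts of the held lock outside the unlock range.
        if h_start < u_start:
            relock.append((h_start, u_start))
        if h_end > u_end:
            relock.append((u_end, h_end))
    return locku, relock


locku_ranges, relock_ranges = simulate_unlock([(0, 100), (200, 300)], (50, 250))
```

In this usage, the client would send LOCKU for both held locks and then LOCK for the two leftover sub-ranges.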
When a client holds a write delegation, it may choose (See When a client holds a write delegation, it may choose (See
Section 18.10.4) to handle LOCK requests locally. In such a case, Section 18.10.4) to handle LOCK requests locally. In such a case,
LOCKU requests will similarly be handled locally. LOCKU requests will similarly be handled locally.
18.13. Operation 15: LOOKUP - Lookup Filename 18.13. Operation 15: LOOKUP - Lookup Filename
skipping to change at page 445, line 26 skipping to change at page 446, line 26
may specify an immediate recall in the delegation structure. may specify an immediate recall in the delegation structure.
The rflags returned by a successful OPEN allow the server to return The rflags returned by a successful OPEN allow the server to return
information governing how the open file is to be handled. information governing how the open file is to be handled.
o OPEN4_RESULT_CONFIRM is deprecated and MUST NOT be returned by an o OPEN4_RESULT_CONFIRM is deprecated and MUST NOT be returned by an
NFSv4.1 server. NFSv4.1 server.
o OPEN4_RESULT_LOCKTYPE_POSIX indicates the server's file locking o OPEN4_RESULT_LOCKTYPE_POSIX indicates the server's file locking
behavior supports the complete set of POSIX locking techniques behavior supports the complete set of POSIX locking techniques
[23]. From this the client can choose to manage file locking [24]. From this the client can choose to manage file locking
state in a way to handle a mismatch of file locking management. state in a way to handle a mismatch of file locking management.
o OPEN4_RESULT_PRESERVE_UNLINKED indicates the server will preserve o OPEN4_RESULT_PRESERVE_UNLINKED indicates the server will preserve
the open file if the client (or any other client) removes the file the open file if the client (or any other client) removes the file
as long as it is open. Furthermore, the server promises to as long as it is open. Furthermore, the server promises to
preserve the file through the grace period after server restart, preserve the file through the grace period after server restart,
thereby giving the client the opportunity to reclaim its open. thereby giving the client the opportunity to reclaim its open.
o OPEN4_RESULT_MAY_NOTIFY_LOCK indicates that the server may attempt o OPEN4_RESULT_MAY_NOTIFY_LOCK indicates that the server may attempt
CB_NOTIFY_LOCK callbacks for locks on this file. This flag is a CB_NOTIFY_LOCK callbacks for locks on this file. This flag is a
skipping to change at page 457, line 25 skipping to change at page 458, line 25
18.20.3. DESCRIPTION 18.20.3. DESCRIPTION
Replaces the current filehandle with the filehandle that represents Replaces the current filehandle with the filehandle that represents
the public filehandle of the server's name space. This filehandle the public filehandle of the server's name space. This filehandle
may be different from the "root" filehandle which may be associated may be different from the "root" filehandle which may be associated
with some other directory on the server. with some other directory on the server.
PUTPUBFH also clears the current stateid. PUTPUBFH also clears the current stateid.
The public filehandle represents the concepts embodied in RFC2054 The public filehandle represents the concepts embodied in RFC2054
[41], RFC2055 [42], and RFC2224 [52]. The intent for NFSv4.1 is that [42], RFC2055 [43], and RFC2224 [53]. The intent for NFSv4.1 is that
the public filehandle (represented by the PUTPUBFH operation) be used the public filehandle (represented by the PUTPUBFH operation) be used
as a method of providing WebNFS server compatibility with NFSv3. as a method of providing WebNFS server compatibility with NFSv3.
The public filehandle and the root filehandle (represented by the The public filehandle and the root filehandle (represented by the
PUTROOTFH operation) SHOULD be equivalent. If the public and root PUTROOTFH operation) SHOULD be equivalent. If the public and root
filehandles are not equivalent, then the directory corresponding to filehandles are not equivalent, then the directory corresponding to
the public filehandle MUST be a descendant of the directory the public filehandle MUST be a descendant of the directory
corresponding to the root filehandle. corresponding to the root filehandle.
See Section 16.2.3.1.1 for more details on the current filehandle. See Section 16.2.3.1.1 for more details on the current filehandle.
skipping to change at page 457, line 48 skipping to change at page 458, line 48
18.20.4. IMPLEMENTATION 18.20.4. IMPLEMENTATION
Used as the second operator (after SEQUENCE) in an NFS request to set Used as the second operator (after SEQUENCE) in an NFS request to set
the context for file accessing operations that follow in the same the context for file accessing operations that follow in the same
COMPOUND request. COMPOUND request.
With the NFSv3 public filehandle, the client is able to specify With the NFSv3 public filehandle, the client is able to specify
whether the path name provided in the LOOKUP should be evaluated as whether the path name provided in the LOOKUP should be evaluated as
either an absolute path relative to the server's root or relative to either an absolute path relative to the server's root or relative to
the public filehandle. RFC2224 [52] contains further discussion of the public filehandle. RFC2224 [53] contains further discussion of
the functionality. With NFSv4.1, that type of specification is not the functionality. With NFSv4.1, that type of specification is not
directly available in the LOOKUP operation. This is because the directly available in the LOOKUP operation. This is because the
component separators needed to specify absolute vs. relative are not component separators needed to specify absolute vs. relative are not
allowed in NFSv4. Therefore, the client is responsible for allowed in NFSv4. Therefore, the client is responsible for
constructing its request such that either PUTROOTFH or PUTPUBFH is constructing its request such that either PUTROOTFH or PUTPUBFH is
used to signify absolute or relative evaluation of an NFS URL, used to signify absolute or relative evaluation of an NFS URL,
respectively. respectively.
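A minimal sketch of how a client might build such a request is shown below. It assumes the client has already decided, from the NFS URL, whether evaluation is absolute or relative; the function name build_lookup_compound and the list-of-operations representation are hypothetical and do not reflect any wire format. Because component separators are not allowed in a LOOKUP argument, the path is split into one LOOKUP per component.

```python
from typing import List, Tuple, Union

# Hypothetical in-memory form of a COMPOUND: operation names, with
# LOOKUP carrying its single component name.
Op = Union[str, Tuple[str, str]]


def build_lookup_compound(path: str, absolute: bool) -> List[Op]:
    """Build the operation list for resolving `path`.

    absolute=True  -> start from the root filehandle (PUTROOTFH)
    absolute=False -> start from the public filehandle (PUTPUBFH)
    """
    ops: List[Op] = ["SEQUENCE", "PUTROOTFH" if absolute else "PUTPUBFH"]
    # One LOOKUP per non-empty component; separators never reach the server.
    ops += [("LOOKUP", component) for component in path.split("/") if component]
    return ops


relative_ops = build_lookup_compound("export/data", absolute=False)
absolute_ops = build_lookup_compound("/export/data", absolute=True)
```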
Note that there are warnings mentioned in RFC2224 [52] with respect Note that there are warnings mentioned in RFC2224 [53] with respect
to the use of absolute evaluation and the restrictions the server may to the use of absolute evaluation and the restrictions the server may
place on that evaluation with respect to how much of its namespace place on that evaluation with respect to how much of its namespace
has been made available. These same warnings apply to NFSv4.1. It has been made available. These same warnings apply to NFSv4.1. It
is likely, therefore, that because of server implementation details, is likely, therefore, that because of server implementation details,
an NFSv3 absolute public filehandle lookup may behave differently an NFSv3 absolute public filehandle lookup may behave differently
than an NFSv4.1 absolute resolution. than an NFSv4.1 absolute resolution.
There is a form of security negotiation as described in RFC2755 [53] There is a form of security negotiation as described in RFC2755 [54]
that uses the public filehandle and an overloading of the pathname. that uses the public filehandle and an overloading of the pathname.
This method is not available with NFSv4.1 as filehandles are not This method is not available with NFSv4.1 as filehandles are not
overloaded with special meaning and therefore do not provide the same overloaded with special meaning and therefore do not provide the same
framework as NFSv3. Clients should therefore use the security framework as NFSv3. Clients should therefore use the security
negotiation mechanisms described in Section 2.6. negotiation mechanisms described in Section 2.6.
18.21. Operation 24: PUTROOTFH - Set Root Filehandle 18.21. Operation 24: PUTROOTFH - Set Root Filehandle
18.21.1. ARGUMENTS 18.21.1. ARGUMENTS
skipping to change at page 467, line 5 skipping to change at page 468, line 5
the UTF-8 definition (and the server is enforcing UTF-8 encoding, see the UTF-8 definition (and the server is enforcing UTF-8 encoding, see
Section 14.4), the error NFS4ERR_INVAL will be returned. Section 14.4), the error NFS4ERR_INVAL will be returned.
On success, the current filehandle retains its value. On success, the current filehandle retains its value.
18.25.4. IMPLEMENTATION 18.25.4. IMPLEMENTATION
NFSv3 required a different operator RMDIR for directory removal and NFSv3 required a different operator RMDIR for directory removal and
REMOVE for non-directory removal. This allowed clients to skip REMOVE for non-directory removal. This allowed clients to skip
checking the file type when being passed a non-directory delete checking the file type when being passed a non-directory delete
system call (e.g. unlink() [26] in POSIX) to remove a directory, as system call (e.g. unlink() [27] in POSIX) to remove a directory, as
well as the converse (e.g. a rmdir() on a non-directory) because they well as the converse (e.g. a rmdir() on a non-directory) because they
knew the server would check the file type. NFSv4.1 REMOVE can be knew the server would check the file type. NFSv4.1 REMOVE can be
used to delete any directory entry independent of its file type. The used to delete any directory entry independent of its file type. The
implementor of an NFSv4.1 client's entry points from the unlink() and implementor of an NFSv4.1 client's entry points from the unlink() and
rmdir() system calls should first check the file type against the rmdir() system calls should first check the file type against the
types the system call is allowed to remove before issuing a REMOVE. types the system call is allowed to remove before issuing a REMOVE.
Alternatively, the implementor can produce a COMPOUND call that Alternatively, the implementor can produce a COMPOUND call that
includes a LOOKUP/VERIFY sequence to verify the file type before a includes a LOOKUP/VERIFY sequence to verify the file type before a
REMOVE operation in the same COMPOUND call. REMOVE operation in the same COMPOUND call.
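The client-side file type check described above might look like the following sketch. The function name check_remove_allowed is hypothetical, and the errno values chosen (EPERM for unlink() on a directory, ENOTDIR for rmdir() on a non-directory) are assumptions about the client's operating environment; some systems report EISDIR for the first case.

```python
import errno


def check_remove_allowed(syscall: str, is_directory: bool) -> int:
    """Return 0 if the client may issue REMOVE for this system call,
    otherwise the errno the client should report without contacting
    the server.
    """
    if syscall == "unlink" and is_directory:
        # Assumed policy: unlink() must not remove directories.
        return errno.EPERM
    if syscall == "rmdir" and not is_directory:
        # rmdir() is only valid on directories.
        return errno.ENOTDIR
    return 0


unlink_file = check_remove_allowed("unlink", is_directory=False)
rmdir_file = check_remove_allowed("rmdir", is_directory=False)
```

The check requires the client to know the object's type beforehand, for example from cached attributes; the LOOKUP/VERIFY alternative moves that check into the same COMPOUND as the REMOVE.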
skipping to change at page 487, line 38 skipping to change at page 488, line 38
18.33.2. RESULT 18.33.2. RESULT
struct BACKCHANNEL_CTL4res { struct BACKCHANNEL_CTL4res {
nfsstat4 bcr_status; nfsstat4 bcr_status;
}; };
18.33.3. DESCRIPTION 18.33.3. DESCRIPTION
The BACKCHANNEL_CTL operation replaces the backchannel's callback The BACKCHANNEL_CTL operation replaces the backchannel's callback
program number and adds (not replaces) RPCSEC_GSS contexts for use by program number and adds (not replaces) RPCSEC_GSS handles for use by
the backchannel. the backchannel.
The arguments of the BACKCHANNEL_CTL call are a subset of the The arguments of the BACKCHANNEL_CTL call are a subset of the
CREATE_SESSION parameters. In the arguments of BACKCHANNEL_CTL, the CREATE_SESSION parameters. In the arguments of BACKCHANNEL_CTL, the
bca_cb_program field and bca_sec_parms fields correspond respectively bca_cb_program field and bca_sec_parms fields correspond respectively
to the csa_cb_program and csa_sec_parms fields of the arguments of to the csa_cb_program and csa_sec_parms fields of the arguments of
CREATE_SESSION (Section 18.36). CREATE_SESSION (Section 18.36).
BACKCHANNEL_CTL MUST appear in a COMPOUND that starts with SEQUENCE. BACKCHANNEL_CTL MUST appear in a COMPOUND that starts with SEQUENCE.
If the RPCSEC_GSS handle identified by gcbp_handle_from_server does If the RPCSEC_GSS handle identified by gcbp_handle_from_server does
not exist on the server, the server MUST return NFS4ERR_NOENT. not exist on the server, the server MUST return NFS4ERR_NOENT.
If an RPCSEC_GSS handle is using the SSV context (see
Section 2.10.9), then because each SSV RPCSEC_GSS handle shares a
common SSV GSS context, there are security considerations specific to
this situation discussed in Section 2.10.10.
18.34. Operation 41: BIND_CONN_TO_SESSION - Associate Connection with 18.34. Operation 41: BIND_CONN_TO_SESSION - Associate Connection with
Session Session
18.34.1. ARGUMENT 18.34.1. ARGUMENT
enum channel_dir_from_client4 { enum channel_dir_from_client4 {
CDFC4_FORE = 0x1, CDFC4_FORE = 0x1,
CDFC4_BACK = 0x2, CDFC4_BACK = 0x2,
CDFC4_FORE_OR_BOTH = 0x3, CDFC4_FORE_OR_BOTH = 0x3,
CDFC4_BACK_OR_BOTH = 0x7 CDFC4_BACK_OR_BOTH = 0x7
skipping to change at page 501, line 13 skipping to change at page 502, line 13
spo_must_allow and the server agrees. spo_must_allow and the server agrees.
The SP4_SSV protection parameters also have: The SP4_SSV protection parameters also have:
ssp_hash_algs: ssp_hash_algs:
This is the set of algorithms the client supports for the purpose This is the set of algorithms the client supports for the purpose
of computing the digests needed for the internal SSV GSS mechanism of computing the digests needed for the internal SSV GSS mechanism
and for the SET_SSV operation. Each algorithm is specified as an and for the SET_SSV operation. Each algorithm is specified as an
object identifier (OID). The REQUIRED algorithms for a server are object identifier (OID). The REQUIRED algorithms for a server are
id-sha1, id-sha224, id-sha256, id-sha384, and id-sha512 [27]. The id-sha1, id-sha224, id-sha256, id-sha384, and id-sha512 [28]. The
algorithm the server selects among the set is indicated in algorithm the server selects among the set is indicated in
spi_hash_alg, a field of spr_ssv_prot_info. The field spi_hash_alg, a field of spr_ssv_prot_info. The field
spi_hash_alg is an index into the array ssp_hash_algs. If the spi_hash_alg is an index into the array ssp_hash_algs. If the
server does not support any of the offered algorithms, it returns server does not support any of the offered algorithms, it returns
NFS4ERR_HASH_ALG_UNSUPP. If ssp_hash_algs is empty, the server NFS4ERR_HASH_ALG_UNSUPP. If ssp_hash_algs is empty, the server
MUST return NFS4ERR_INVAL. MUST return NFS4ERR_INVAL.
ssp_encr_algs: ssp_encr_algs:
This is the set of algorithms the client supports for the purpose This is the set of algorithms the client supports for the purpose
of providing privacy protection for the internal SSV GSS of providing privacy protection for the internal SSV GSS
mechanism. Each algorithm is specified as an OID. The REQUIRED mechanism. Each algorithm is specified as an OID. The REQUIRED
algorithm for a server is id-aes256-CBC. The RECOMMENDED algorithm for a server is id-aes256-CBC. The RECOMMENDED
algorithms are id-aes192-CBC and id-aes128-CBC [28]. The selected algorithms are id-aes192-CBC and id-aes128-CBC [29]. The selected
algorithm is returned in spi_encr_alg, an index into algorithm is returned in spi_encr_alg, an index into
ssp_encr_algs. If the server does not support any of the offered ssp_encr_algs. If the server does not support any of the offered
algorithms, it returns NFS4ERR_ENCR_ALG_UNSUPP. If ssp_encr_algs algorithms, it returns NFS4ERR_ENCR_ALG_UNSUPP. If ssp_encr_algs
is empty, the server MUST return NFS4ERR_INVAL. Note that due to is empty, the server MUST return NFS4ERR_INVAL. Note that due to
previously stated requirements and recommendations on the previously stated requirements and recommendations on the
relationships between key length and hash length, some relationships between key length and hash length, some
combinations of RECOMMENDED and REQUIRED encryption algorithm and combinations of RECOMMENDED and REQUIRED encryption algorithm and
hash algorithm either SHOULD NOT or MUST NOT be used. Table 12 hash algorithm either SHOULD NOT or MUST NOT be used. Table 12
summarizes the illegal and discouraged combinations. summarizes the illegal and discouraged combinations.
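The selection rules for ssp_hash_algs and ssp_encr_algs can be sketched as below. This is an illustration only: algorithms are represented by their symbolic names rather than encoded OIDs, the function name select_algorithm is hypothetical, "id-md5" is a placeholder for an algorithm the server does not support, and picking the first supported entry is an assumed policy, since the text does not say how the server chooses among the offered set. The sketch also does not apply the combination restrictions of Table 12.

```python
from typing import List, Optional, Set, Tuple


def select_algorithm(offered: List[str], supported: Set[str],
                     unsupported_error: str) -> Tuple[str, Optional[int]]:
    """Return (status, index), where index points into the client's
    offered array (as spi_hash_alg / spi_encr_alg do)."""
    if not offered:
        # An empty array is a protocol error.
        return ("NFS4ERR_INVAL", None)
    for index, oid in enumerate(offered):
        if oid in supported:
            # Assumed policy: first offered algorithm the server supports.
            return ("NFS4_OK", index)
    return (unsupported_error, None)


SERVER_HASH_ALGS = {"id-sha1", "id-sha224", "id-sha256", "id-sha384", "id-sha512"}
SERVER_ENCR_ALGS = {"id-aes256-CBC"}

hash_status, spi_hash_alg = select_algorithm(
    ["id-md5", "id-sha256"], SERVER_HASH_ALGS, "NFS4ERR_HASH_ALG_UNSUPP")
encr_status, spi_encr_alg = select_algorithm(
    ["id-aes128-CBC"], SERVER_ENCR_ALGS, "NFS4ERR_ENCR_ALG_UNSUPP")
```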
skipping to change at page 502, line 17 skipping to change at page 503, line 17
This is the number of RPCSEC_GSS handles the server should create This is the number of RPCSEC_GSS handles the server should create
that are based on the GSS SSV mechanism (Section 2.10.9). It is that are based on the GSS SSV mechanism (Section 2.10.9). It is
not the total number of RPCSEC_GSS handles for the client ID. not the total number of RPCSEC_GSS handles for the client ID.
Indeed, subsequent calls to EXCHANGE_ID will add RPCSEC_GSS Indeed, subsequent calls to EXCHANGE_ID will add RPCSEC_GSS
handles. The server responds with a list of handles in handles. The server responds with a list of handles in
spi_handles. If the client asks for at least one handle and the spi_handles. If the client asks for at least one handle and the
server cannot create it, the server MUST return an error. The server cannot create it, the server MUST return an error. The
handles in spi_handles are not available for use until the client handles in spi_handles are not available for use until the client
ID is confirmed, which could be immediately if EXCHANGE_ID returns ID is confirmed, which could be immediately if EXCHANGE_ID returns
EXCHGID4_FLAG_CONFIRMED_R, or upon successful confirmation from EXCHGID4_FLAG_CONFIRMED_R, or upon successful confirmation from
CREATE_SESSION. While a client ID can span all the connections CREATE_SESSION.
that are connected to a server sharing the same
eir_server_owner.so_major_id, the RPCSEC_GSS handles returned in While a client ID can span all the connections that are connected
spi_handles can only be used on connections connected to a server to a server sharing the same eir_server_owner.so_major_id, the
that returns the same eir_server_owner.so_major_id and RPCSEC_GSS handles returned in spi_handles can only be used on
eir_server_owner.so_minor_id on each connection. It is connections connected to a server that returns the same
permissible for the client to set ssp_num_gss_handles to zero (0); eir_server_owner.so_major_id and eir_server_owner.so_minor_id on
the client can create more handles with another EXCHANGE_ID call. each connection. It is permissible for the client to set
ssp_num_gss_handles to zero (0); the client can create more
handles with another EXCHANGE_ID call.
Because each SSV RPCSEC_GSS handle shares a common SSV GSS
context, there are security considerations specific to this
situation discussed in Section 2.10.10.
The seq_window (see Section 5.2.3.1 of RFC2203 [4]) of each The seq_window (see Section 5.2.3.1 of RFC2203 [4]) of each
RPCSEC_GSS handle in spi_handles MUST be the same as the seq_window RPCSEC_GSS handle in spi_handles MUST be the same as the seq_window
of the RPCSEC_GSS handle used for the credential of the RPC of the RPCSEC_GSS handle used for the credential of the RPC
request that the EXCHANGE_ID request was sent with. request that the EXCHANGE_ID request was sent with.
+-------------------+----------------------+------------------------+ +-------------------+----------------------+------------------------+
| Encryption | MUST NOT be combined | SHOULD NOT be combined | | Encryption | MUST NOT be combined | SHOULD NOT be combined |
| Algorithm | with | with | | Algorithm | with | with |
+-------------------+----------------------+------------------------+ +-------------------+----------------------+------------------------+
skipping to change at page 515, line 37 skipping to change at page 516, line 37
NFS4ERR_NOENT. NFS4ERR_NOENT.
Within each element of csa_sec_parms, the fore and back RPCSEC_GSS Within each element of csa_sec_parms, the fore and back RPCSEC_GSS
contexts MUST share the same GSS context and MUST have the same contexts MUST share the same GSS context and MUST have the same
seq_window (see Section 5.2.3.1 of RFC2203 [4]). The fore and seq_window (see Section 5.2.3.1 of RFC2203 [4]). The fore and
back RPCSEC_GSS context state are independent of each other as far back RPCSEC_GSS context state are independent of each other as far
as the RPCSEC_GSS sequence number (see the seq_num field in the as the RPCSEC_GSS sequence number (see the seq_num field in the
rpc_gss_cred_t data type of Section 5 and of Section 5.3.1, "RPC rpc_gss_cred_t data type of Section 5 and of Section 5.3.1, "RPC
Request Header", of RFC2203). Request Header", of RFC2203).
If an RPCSEC_GSS handle is using the SSV context (see
Section 2.10.9), then because each SSV RPCSEC_GSS handle shares a
common SSV GSS context, there are security considerations specific
to this situation discussed in Section 2.10.10.
Once the session is created, the first SEQUENCE or CB_SEQUENCE Once the session is created, the first SEQUENCE or CB_SEQUENCE
received on a slot MUST have a sequence ID equal to 1; if not the received on a slot MUST have a sequence ID equal to 1; if not the
server MUST return NFS4ERR_SEQ_MISORDERED. server MUST return NFS4ERR_SEQ_MISORDERED.
18.36.4. IMPLEMENTATION 18.36.4. IMPLEMENTATION
To describe a possible implementation, the same notation for client To describe a possible implementation, the same notation for client
records introduced in the description of EXCHANGE_ID is used with the records introduced in the description of EXCHANGE_ID is used with the
following addition: following addition:
skipping to change at page 551, line 37 skipping to change at page 552, line 37
callback operation or not, and so, per rules on request retry, the callback operation or not, and so, per rules on request retry, the
server MUST retry the callback operation over the same session. server MUST retry the callback operation over the same session.
The SEQ4_STATUS_CB_PATH_DOWN_SESSION bit is the indication to the The SEQ4_STATUS_CB_PATH_DOWN_SESSION bit is the indication to the
client that it needs to associate a connection to the session's client that it needs to associate a connection to the session's
backchannel. This bit remains set on all SEQUENCE responses on backchannel. This bit remains set on all SEQUENCE responses on
the session until a backchannel path on the session is the session until a backchannel path on the session is
available. If the client fails to re-establish a backchannel for available. If the client fails to re-establish a backchannel for
the session, it is subject to having recallable state revoked. the session, it is subject to having recallable state revoked.
SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRING SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRING
When set, indicates that all GSS contexts assigned to the When set, indicates that all GSS contexts or RPCSEC_GSS handles
session's backchannel will expire within a period equal to the assigned to the session's backchannel will expire within a period
lease time. This bit remains set on all SEQUENCE replies until equal to the lease time. This bit remains set on all SEQUENCE
the expiration time of at least one context is beyond the lease replies until at least one of the following is true:
period from the current time (relative to the time of when a
SEQUENCE response was sent) or until all GSS contexts for the
session's backchannel have expired.
* All SSV RPCSEC_GSS handles on the session's backchannel have
been destroyed and all non-SSV GSS contexts have expired.
* At least one more SSV RPCSEC_GSS handle has been added to the
backchannel.
* The expiration time of at least one non-SSV GSS context of an
RPCSEC_GSS handle is beyond the lease period from the current
time (relative to the time of when a SEQUENCE response was
sent).
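The three clearing conditions in the revised text can be read as a single predicate. The following is a minimal sketch only, not server code from the draft: the structure bc_handle, its fields, the ssv_added_since_set counter, and the function name are all hypothetical, and times are assumed to be in seconds on a common clock.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one RPCSEC_GSS handle on a backchannel. */
struct bc_handle {
    bool     is_ssv;     /* SSV handle (true) or non-SSV GSS context */
    bool     destroyed;  /* meaningful for SSV handles only */
    uint64_t expiry;     /* meaningful for non-SSV contexts only */
};

/* Returns true when SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRING may be
 * cleared, i.e. when at least one of the three listed conditions
 * holds. ssv_added_since_set is an assumed counter of SSV handles
 * added after the bit was first set. */
static bool expiring_bit_may_clear(const struct bc_handle *h, size_t n,
                                   size_t ssv_added_since_set,
                                   uint64_t now, uint64_t lease)
{
    bool all_gone = true;      /* condition 1 */
    bool beyond_lease = false; /* condition 3 */

    for (size_t i = 0; i < n; i++) {
        if (h[i].is_ssv) {
            if (!h[i].destroyed)
                all_gone = false;
        } else {
            if (h[i].expiry > now)
                all_gone = false;      /* context not yet expired */
            if (h[i].expiry > now + lease)
                beyond_lease = true;   /* outlives the lease period */
        }
    }
    /* condition 2: at least one more SSV handle has been added */
    return all_gone || ssv_added_since_set > 0 || beyond_lease;
}
```

In this sketch "now" stands in for the time at which the SEQUENCE response is sent, matching the parenthetical in the third condition.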
SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRED SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRED
When set, indicates all GSS contexts assigned to the session's When set, indicates all non-SSV GSS contexts and all SSV
backchannel have expired. This bit remains set on all SEQUENCE RPCSEC_GSS handles assigned to the session's backchannel have
replies until at least one non-expired context for the session's expired or have been destroyed. This bit remains set on all
backchannel has been established. SEQUENCE replies until at least one non-expired non-SSV GSS
context for the session's backchannel has been established or at
least one SSV RPCSEC_GSS handle has been assigned to the
backchannel.
SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED
When set, indicates that the lease has expired and as a result the When set, indicates that the lease has expired and as a result the
server released all of the client's locking state. This status server released all of the client's locking state. This status
bit remains set on all SEQUENCE replies until the loss of all such bit remains set on all SEQUENCE replies until the loss of all such
locks has been acknowledged by use of FREE_STATEID (see locks has been acknowledged by use of FREE_STATEID (see
Section 18.38), or by establishing a new client instance by Section 18.38), or by establishing a new client instance by
destroying all sessions (via DESTROY_SESSION), the client ID (via destroying all sessions (via DESTROY_SESSION), the client ID (via
DESTROY_CLIENTID), and then invoking EXCHANGE_ID and DESTROY_CLIENTID), and then invoking EXCHANGE_ID and
CREATE_SESSION to establish a new client ID. CREATE_SESSION to establish a new client ID.
skipping to change at page 583, line 4 skipping to change at page 584, line 4
RCA4_TYPE_MASK_DIR_DLG RCA4_TYPE_MASK_DIR_DLG
The client is to return directory delegations. The client is to return directory delegations.
RCA4_TYPE_MASK_FILE_LAYOUT RCA4_TYPE_MASK_FILE_LAYOUT
The client is to return layouts of type LAYOUT4_NFSV4_1_FILES. The client is to return layouts of type LAYOUT4_NFSV4_1_FILES.
RCA4_TYPE_MASK_BLK_LAYOUT RCA4_TYPE_MASK_BLK_LAYOUT
See [40] for a description. See [41] for a description.
RCA4_TYPE_MASK_OBJ_LAYOUT_MIN to RCA4_TYPE_MASK_OBJ_LAYOUT_MAX RCA4_TYPE_MASK_OBJ_LAYOUT_MIN to RCA4_TYPE_MASK_OBJ_LAYOUT_MAX
See [39] for a description. See [40] for a description.
RCA4_TYPE_MASK_OTHER_LAYOUT_MIN to RCA4_TYPE_MASK_OTHER_LAYOUT_MAX RCA4_TYPE_MASK_OTHER_LAYOUT_MIN to RCA4_TYPE_MASK_OTHER_LAYOUT_MAX
This range is reserved for telling the client to recall layouts of This range is reserved for telling the client to recall layouts of
experimental or site specific layout types (see Section 3.3.13). experimental or site specific layout types (see Section 3.3.13).
When a bit is set in the type mask that corresponds to an undefined When a bit is set in the type mask that corresponds to an undefined
type of recallable object, NFS4ERR_INVAL MUST be returned. When a type of recallable object, NFS4ERR_INVAL MUST be returned. When a
bit is set that corresponds to a defined type of object, but the bit is set that corresponds to a defined type of object, but the
client does not support an object of the type, NFS4ERR_INVAL MUST NOT client does not support an object of the type, NFS4ERR_INVAL MUST NOT
skipping to change at page 595, line 5 skipping to change at page 596, line 5
attack has two steps. First the attacker modifies the unprotected attack has two steps. First the attacker modifies the unprotected
results of some operation to return NFS4ERR_MOVED. Second, when results of some operation to return NFS4ERR_MOVED. Second, when
the client follows up with a GETATTR for the fs_locations or the client follows up with a GETATTR for the fs_locations or
fs_locations_info attributes, the attacker modifies the results to fs_locations_info attributes, the attacker modifies the results to
cause the client to migrate its traffic to a server controlled by the cause the client to migrate its traffic to a server controlled by the
attacker. With integrity protection, this attack is mitigated. attacker. With integrity protection, this attack is mitigated.
Relative to previous NFS versions, NFSv4.1 has additional security Relative to previous NFS versions, NFSv4.1 has additional security
considerations for pNFS (see Section 12.9 and Section 13.12), locking considerations for pNFS (see Section 12.9 and Section 13.12), locking
and session state (see Section 2.10.8.3), and state recovery during and session state (see Section 2.10.8.3), and state recovery during
grace period (see Section 8.4.2.1.1). grace period (see Section 8.4.2.1.1). With respect to locking and
session state, if SP4_SSV state protection is being used,
Section 2.10.10 has specific security considerations for the NFSv4.1
client and server.
22. IANA Considerations 22. IANA Considerations
This section uses terms that are defined in [54]. This section uses terms that are defined in [55].
22.1. Named Attribute Definitions 22.1. Named Attribute Definitions
IANA will create a registry called the "NFSv4 Named Attribute IANA will create a registry called the "NFSv4 Named Attribute
Definitions Registry". Definitions Registry".
The NFSv4.1 protocol supports the association of a file with zero or The NFSv4.1 protocol supports the association of a file with zero or
more named attributes. The name space identifiers for these more named attributes. The name space identifiers for these
attributes are defined as string names. The protocol does not define attributes are defined as string names. The protocol does not define
the specific assignment of the name space for these file attributes. the specific assignment of the name space for these file attributes.
skipping to change at page 595, line 32 skipping to change at page 596, line 35
attributes as needed, they are encouraged to register the attributes attributes as needed, they are encouraged to register the attributes
with IANA. with IANA.
Such registered named attributes are presumed to apply to all minor Such registered named attributes are presumed to apply to all minor
versions of NFSv4, including those defined subsequently to the versions of NFSv4, including those defined subsequently to the
registration. Where the named attribute is intended to be limited registration. Where the named attribute is intended to be limited
with regard to the minor versions for which they are not to be used, the with regard to the minor versions for which they are not to be used, the
assignment in the registry will clearly state the applicable limits. assignment in the registry will clearly state the applicable limits.
All assignments to the registry are made on a First Come First Served All assignments to the registry are made on a First Come First Served
basis, per section 4.1 of [54]. The policy for each assignment is basis, per section 4.1 of [55]. The policy for each assignment is
Specification Required, per section 4.1 of [54]. Specification Required, per section 4.1 of [55].
Under the NFSv4.1 specification, the name of a named attribute can in Under the NFSv4.1 specification, the name of a named attribute can in
theory be up to 2^32 - 1 bytes in length, but in practice NFSv4.1 theory be up to 2^32 - 1 bytes in length, but in practice NFSv4.1
clients and servers will be unable to handle a string that long. clients and servers will be unable to handle a string that long.
IANA should reject any assignment request with a named attribute that IANA should reject any assignment request with a named attribute that
exceeds 128 UTF-8 characters. To give IESG the flexibility to set up exceeds 128 UTF-8 characters. To give IESG the flexibility to set up
bases of assignment of Experimental Use and Standards Action, the bases of assignment of Experimental Use and Standards Action, the
prefixes of "EXPE" and "STDS" are Reserved. The zero length named prefixes of "EXPE" and "STDS" are Reserved. The zero length named
attribute name is Reserved. attribute name is Reserved.
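The naming rules above (128-character limit, reserved "EXPE" and "STDS" prefixes, reserved zero-length name) can be sketched as a simple check. This is an illustration only, under stated assumptions: the function and macro names are hypothetical, the input is assumed to be well-formed UTF-8, and the prefix comparison is assumed to be case-sensitive, which the text does not specify.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical macro name; the value 128 comes from the text above. */
#define NAMED_ATTR_MAX_CHARS 128

/* Count UTF-8 characters by counting every byte that is not a
 * continuation byte (10xxxxxx). Assumes well-formed UTF-8 input. */
static size_t utf8_char_count(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* Returns 1 if the name passes the checks described above,
 * 0 if it should be rejected. */
static int named_attr_name_registrable(const char *name)
{
    if (name == NULL || name[0] == '\0')
        return 0;   /* the zero-length name is Reserved */
    if (strncmp(name, "EXPE", 4) == 0 || strncmp(name, "STDS", 4) == 0)
        return 0;   /* Reserved prefixes */
    if (utf8_char_count(name) > NAMED_ATTR_MAX_CHARS)
        return 0;   /* exceeds 128 UTF-8 characters */
    return 1;
}
```

The limit is counted in characters rather than bytes, so a 128-character name containing multi-byte sequences still passes.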
skipping to change at page 596, line 46 skipping to change at page 597, line 49
The potential exists for new notification types to be added to the The potential exists for new notification types to be added to the
CB_NOTIFY_DEVICEID operation (see Section 20.12). This can be done via CB_NOTIFY_DEVICEID operation (see Section 20.12). This can be done via
changes to the operations that register notifications, or by adding changes to the operations that register notifications, or by adding
new operations to NFSv4. This requires a new minor version of NFSv4, new operations to NFSv4. This requires a new minor version of NFSv4,
and requires a standards track document from IETF. Another way to and requires a standards track document from IETF. Another way to
add a notification is to specify a new layout type (see add a notification is to specify a new layout type (see
Section 22.4). Section 22.4).
Hence all assignments to the registry are made on a Standards Action Hence all assignments to the registry are made on a Standards Action
basis per section 4.1 of [54], with Expert Review required. basis per section 4.1 of [55], with Expert Review required.
The registry is a list of assignments, each containing five fields The registry is a list of assignments, each containing five fields
per assignment. per assignment.
1. The name of the notification type. This name must have the 1. The name of the notification type. This name must have the
prefix: "NOTIFY_DEVICEID4_". This name must be unique. prefix: "NOTIFY_DEVICEID4_". This name must be unique.
2. The value of the notification. IANA will assign this number, and 2. The value of the notification. IANA will assign this number, and
the request from the registrant will use TBD1 instead of an the request from the registrant will use TBD1 instead of an
actual value. IANA MUST use a whole number which can be no actual value. IANA MUST use a whole number which can be no
higher than 2^32-1, and should be the next available value. The higher than 2^32-1, and should be the next available value. The
value assigned must be unique. A Designated Expert must be used value assigned must be unique. A Designated Expert must be used
to ensure that when the name of the notification type and its to ensure that when the name of the notification type and its
value are added to the NFSv4.1 notify_deviceid_type4 enumerated value are added to the NFSv4.1 notify_deviceid_type4 enumerated
data type in the NFSv4.1 XDR description ([12]), the result data type in the NFSv4.1 XDR description ([13]), the result
continues to be a valid XDR description. continues to be a valid XDR description.
3. The Standards Track RFC(s) that describe the notification. If 3. The Standards Track RFC(s) that describe the notification. If
the RFC(s) have not yet been published, the registrant will use the RFC(s) have not yet been published, the registrant will use
RFCTBD2, RFCTBD3, etc. instead of an actual RFC number. RFCTBD2, RFCTBD3, etc. instead of an actual RFC number.
4. How the RFC introduces the notification. This is indicated by a 4. How the RFC introduces the notification. This is indicated by a
single US-ASCII value. If the value is N, it means a minor single US-ASCII value. If the value is N, it means a minor
revision to the NFSv4 protocol. If the value is L, it means a revision to the NFSv4 protocol. If the value is L, it means a
new pNFS layout type. Other values can be used with IESG new pNFS layout type. Other values can be used with IESG
skipping to change at page 598, line 24 skipping to change at page 599, line 24
The potential exists for new object types to be added to the The potential exists for new object types to be added to the
CB_RECALL_ANY operation (see Section 20.6). This can be done via CB_RECALL_ANY operation (see Section 20.6). This can be done via
changes to the operations that add recallable types, or by adding new changes to the operations that add recallable types, or by adding new
operations to NFSv4. This requires a new minor version of NFSv4, and operations to NFSv4. This requires a new minor version of NFSv4, and
requires a standards track document from IETF. Another way to add a requires a standards track document from IETF. Another way to add a
new recallable object is to specify a new layout type (see new recallable object is to specify a new layout type (see
Section 22.4). Section 22.4).
All assignments to the registry are made on a Standards Action basis All assignments to the registry are made on a Standards Action basis
per section 4.1 of [54], with Expert Review required. per section 4.1 of [55], with Expert Review required.
Recallable object types are 32 bit unsigned numbers. There are no Recallable object types are 32 bit unsigned numbers. There are no
Reserved values. Values in the range 12 through 15, inclusive, are Reserved values. Values in the range 12 through 15, inclusive, are
for Private Use. for Private Use.
The registry is a list of assignments, each containing five fields The registry is a list of assignments, each containing five fields
per assignment. per assignment.
1. The name of the recallable object type. This name must have the 1. The name of the recallable object type. This name must have the
prefix: "RCA4_TYPE_MASK_". The name must be unique. prefix: "RCA4_TYPE_MASK_". The name must be unique.
2. The value of the recallable object type. IANA will assign this 2. The value of the recallable object type. IANA will assign this
number, and the request from the registrant will use TBD1 instead number, and the request from the registrant will use TBD1 instead
of an actual value. IANA MUST use a whole number which can be no of an actual value. IANA MUST use a whole number which can be no
higher than 2^32-1, and should be the next available value. The higher than 2^32-1, and should be the next available value. The
value must be unique. A Designated Expert must be used to ensure value must be unique. A Designated Expert must be used to ensure
that when the name of the recallable type and its value are added that when the name of the recallable type and its value are added
to the NFSv4 XDR description [12], the result continues to be a to the NFSv4 XDR description [13], the result continues to be a
valid XDR description. valid XDR description.
3. The Standards Track RFC(s) that describe the recallable object 3. The Standards Track RFC(s) that describe the recallable object
type. If the RFC(s) have not yet been published, the registrant type. If the RFC(s) have not yet been published, the registrant
will use RFCTBD2, RFCTBD3, etc. instead of an actual RFC number. will use RFCTBD2, RFCTBD3, etc. instead of an actual RFC number.
4. How the RFC introduces the recallable object type. This is 4. How the RFC introduces the recallable object type. This is
indicated by a single US-ASCII value. If the value is N, it indicated by a single US-ASCII value. If the value is N, it
means a minor revision to the NFSv4 protocol. If the value is L, means a minor revision to the NFSv4 protocol. If the value is L,
it means a new pNFS layout type. Other values can be used with it means a new pNFS layout type. Other values can be used with
skipping to change at page 599, line 29 skipping to change at page 600, line 29
| Recallable Object Type Name | Value | RFC | How | Minor | | Recallable Object Type Name | Value | RFC | How | Minor |
| | | | | Versions | | | | | | Versions |
+-------------------------------+-------+----------+-----+----------+ +-------------------------------+-------+----------+-----+----------+
| RCA4_TYPE_MASK_RDATA_DLG | 0 | RFCTBD10 | N | 1 | | RCA4_TYPE_MASK_RDATA_DLG | 0 | RFCTBD10 | N | 1 |
| RCA4_TYPE_MASK_WDATA_DLG | 1 | RFCTBD10 | N | 1 | | RCA4_TYPE_MASK_WDATA_DLG | 1 | RFCTBD10 | N | 1 |
| RCA4_TYPE_MASK_DIR_DLG | 2 | RFCTBD10 | N | 1 | | RCA4_TYPE_MASK_DIR_DLG | 2 | RFCTBD10 | N | 1 |
| RCA4_TYPE_MASK_FILE_LAYOUT | 3 | RFCTBD10 | N | 1 | | RCA4_TYPE_MASK_FILE_LAYOUT | 3 | RFCTBD10 | N | 1 |
| RCA4_TYPE_MASK_BLK_LAYOUT | 4 | RFCTBD20 | L | 1 | | RCA4_TYPE_MASK_BLK_LAYOUT | 4 | RFCTBD20 | L | 1 |
| RCA4_TYPE_MASK_OBJ_LAYOUT_MIN | 8 | RFCTBD30 | L | 1 | | RCA4_TYPE_MASK_OBJ_LAYOUT_MIN | 8 | RFCTBD30 | L | 1 |
| RCA4_TYPE_MASK_OBJ_LAYOUT_MAX | 9 | RFCTBD30 | L | 1 | | RCA4_TYPE_MASK_OBJ_LAYOUT_MAX | 9 | RFCTBD30 | L | 1 |
| Private Use | 12-15 | RFCTBD10 | L | 1 |
+-------------------------------+-------+----------+-----+----------+ +-------------------------------+-------+----------+-----+----------+
Table 17: Initial Recallable Object Type Assignments Table 17: Initial Recallable Object Type Assignments
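The values in Table 17 are bit positions in the CB_RECALL_ANY type mask. A minimal sketch of testing them follows; it assumes the mask fits in a single 32-bit word, and the helper names rca4_bit_is_set and rca4_is_private_use are hypothetical, not taken from the XDR description.

```c
#include <stdint.h>

/* Bit positions copied from Table 17. */
enum {
    RCA4_TYPE_MASK_RDATA_DLG      = 0,
    RCA4_TYPE_MASK_WDATA_DLG      = 1,
    RCA4_TYPE_MASK_DIR_DLG        = 2,
    RCA4_TYPE_MASK_FILE_LAYOUT    = 3,
    RCA4_TYPE_MASK_BLK_LAYOUT     = 4,
    RCA4_TYPE_MASK_OBJ_LAYOUT_MIN = 8,
    RCA4_TYPE_MASK_OBJ_LAYOUT_MAX = 9
};

/* Returns 1 if the given bit position is set in the mask word.
 * Positions of 32 or more are reported as not set. */
static int rca4_bit_is_set(uint32_t mask_word, unsigned bit)
{
    return bit < 32 && ((mask_word >> bit) & 1u) != 0;
}

/* Returns 1 if the bit position lies in the Private Use range
 * (12 through 15, inclusive). */
static int rca4_is_private_use(unsigned bit)
{
    return bit >= 12 && bit <= 15;
}
```

A receiver could use such helpers to separate registered types from Private Use bits before deciding how to respond to the recall.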
22.3.2. Updating Registrations 22.3.2. Updating Registrations
The update of a registration will require IESG Approval on the advice The update of a registration will require IESG Approval on the advice
of a Designated Expert. of a Designated Expert.
22.4. Layout Types 22.4. Layout Types
skipping to change at page 600, line 13 skipping to change at page 601, line 13
The registry is a list of assignments, each containing five fields. The registry is a list of assignments, each containing five fields.
1. The name of the layout type. This name must have the prefix: 1. The name of the layout type. This name must have the prefix:
"LAYOUT4_". The name must be unique. "LAYOUT4_". The name must be unique.
2. The value of the layout type. IANA will assign this number, and 2. The value of the layout type. IANA will assign this number, and
the request from the registrant will use TBD1 instead of an the request from the registrant will use TBD1 instead of an
actual value. The value assigned must be unique. A Designated actual value. The value assigned must be unique. A Designated
Expert must be used to ensure that when the name of the layout Expert must be used to ensure that when the name of the layout
type and its value are added to the NFSv4.1 layouttype4 type and its value are added to the NFSv4.1 layouttype4
enumerated data type in the NFSv4.1 XDR description ([12]), the enumerated data type in the NFSv4.1 XDR description ([13]), the
result continues to be a valid XDR description. result continues to be a valid XDR description.
3. The Standards Track RFC(s) that describe the layout type. If 3. The Standards Track RFC(s) that describe the layout type. If
the RFC(s) have not yet been published, the registrant will use the RFC(s) have not yet been published, the registrant will use
RFCTBD2, RFCTBD3, etc. instead of an actual RFC number. RFCTBD2, RFCTBD3, etc. instead of an actual RFC number.
Collectively, the RFC(s) must adhere to the guidelines listed in Collectively, the RFC(s) must adhere to the guidelines listed in
Section 22.4.3. Section 22.4.3.
4. How the RFC introduces the layout type. This is indicated by a 4. How the RFC introduces the layout type. This is indicated by a
single US-ASCII value. If the value is N, it means a minor single US-ASCII value. If the value is N, it means a minor
skipping to change at page 606, line 46 skipping to change at page 607, line 46
Placement", draft-ietf-nfsv4-nfsdirect-08 (work in progress), Placement", draft-ietf-nfsv4-nfsdirect-08 (work in progress),
April 2008. April 2008.
[10] Recio, P., Metzler, B., Culley, P., Hilland, J., and D. Garcia, [10] Recio, P., Metzler, B., Culley, P., Hilland, J., and D. Garcia,
"A Remote Direct Memory Access Protocol Specification", "A Remote Direct Memory Access Protocol Specification",
RFC 5040, October 2007. RFC 5040, October 2007.
[11] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-Hashing [11] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-Hashing
for Message Authentication", RFC 2104, February 1997. for Message Authentication", RFC 2104, February 1997.
[12] Shepler, S., Eisler, M., and D. Noveck, "NFSv4 Minor Version 1 [12] Eisler, M., "RPCSEC_GSS Version 2", RFC 5403, February 2009.
[13] Shepler, S., Eisler, M., and D. Noveck, "NFSv4 Minor Version 1
XDR Description", draft-ietf-nfsv4-minorversion1-dot-x-12 (work XDR Description", draft-ietf-nfsv4-minorversion1-dot-x-12 (work
in progress), Dec 2008. in progress), Dec 2008.
[13] The Open Group, "Section 3.372 of Chapter 3 of Base Definitions [14] The Open Group, "Section 3.372 of Chapter 3 of Base Definitions
of The Open Group Base Specifications Issue 6 IEEE Std 1003.1, of The Open Group Base Specifications Issue 6 IEEE Std 1003.1,
2004 Edition, HTML Version (www.opengroup.org), ISBN 2004 Edition, HTML Version (www.opengroup.org), ISBN
1931624232", 2004. 1931624232", 2004.
[14] Eisler, M., "IANA Considerations for RPC Net Identifiers and [15] Eisler, M., "IANA Considerations for RPC Net Identifiers and
Universal Address Formats", draft-ietf-nfsv4-rpc-netid-04 (work Universal Address Formats", draft-ietf-nfsv4-rpc-netid-04 (work
in progress), December 2008. in progress), December 2008.
[15] The Open Group, "Section 'read()' of System Interfaces of The [16] The Open Group, "Section 'read()' of System Interfaces of The
Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004 Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004
Edition, HTML Version (www.opengroup.org), ISBN 1931624232", Edition, HTML Version (www.opengroup.org), ISBN 1931624232",
2004. 2004.
[16] The Open Group, "Section 'readdir()' of System Interfaces of [17] The Open Group, "Section 'readdir()' of System Interfaces of
The Open Group Base Specifications Issue 6 IEEE Std 1003.1, The Open Group Base Specifications Issue 6 IEEE Std 1003.1,
2004 Edition, HTML Version (www.opengroup.org), ISBN 2004 Edition, HTML Version (www.opengroup.org), ISBN
1931624232", 2004. 1931624232", 2004.
[17] The Open Group, "Section 'write()' of System Interfaces of The [18] The Open Group, "Section 'write()' of System Interfaces of The
Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004 Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004
Edition, HTML Version (www.opengroup.org), ISBN 1931624232", Edition, HTML Version (www.opengroup.org), ISBN 1931624232",
2004. 2004.
[18] Hoffman, P. and M. Blanchet, "Preparation of Internationalized [19] Hoffman, P. and M. Blanchet, "Preparation of Internationalized
Strings ("stringprep")", RFC 3454, December 2002. Strings ("stringprep")", RFC 3454, December 2002.
[19] The Open Group, "Section 'chmod()' of System Interfaces of The [20] The Open Group, "Section 'chmod()' of System Interfaces of The
Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004 Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004
Edition, HTML Version (www.opengroup.org), ISBN 1931624232", Edition, HTML Version (www.opengroup.org), ISBN 1931624232",
2004. 2004.
[20] International Organization for Standardization, "Information [21] International Organization for Standardization, "Information
Technology - Universal Multiple-octet coded Character Set (UCS) Technology - Universal Multiple-octet coded Character Set (UCS)
- Part 1: Architecture and Basic Multilingual Plane", - Part 1: Architecture and Basic Multilingual Plane",
ISO Standard 10646-1, May 1993. ISO Standard 10646-1, May 1993.
[21] Alvestrand, H., "IETF Policy on Character Sets and Languages", [22] Alvestrand, H., "IETF Policy on Character Sets and Languages",
BCP 18, RFC 2277, January 1998. BCP 18, RFC 2277, January 1998.
[22] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep Profile [23] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep Profile
for Internationalized Domain Names (IDN)", RFC 3491, for Internationalized Domain Names (IDN)", RFC 3491,
March 2003. March 2003.
[23] The Open Group, "Section 'fcntl()' of System Interfaces of The [24] The Open Group, "Section 'fcntl()' of System Interfaces of The
Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004 Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004
Edition, HTML Version (www.opengroup.org), ISBN 1931624232", Edition, HTML Version (www.opengroup.org), ISBN 1931624232",
2004. 2004.
[24] The Open Group, "Section 'fsync()' of System Interfaces of The [25] The Open Group, "Section 'fsync()' of System Interfaces of The
Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004 Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004
Edition, HTML Version (www.opengroup.org), ISBN 1931624232", Edition, HTML Version (www.opengroup.org), ISBN 1931624232",
2004. 2004.
[25] The Open Group, "Section 'getpwnam()' of System Interfaces of [26] The Open Group, "Section 'getpwnam()' of System Interfaces of
The Open Group Base Specifications Issue 6 IEEE Std 1003.1, The Open Group Base Specifications Issue 6 IEEE Std 1003.1,
2004 Edition, HTML Version (www.opengroup.org), ISBN 2004 Edition, HTML Version (www.opengroup.org), ISBN
1931624232", 2004. 1931624232", 2004.
[26] The Open Group, "Section 'unlink()' of System Interfaces of The [27] The Open Group, "Section 'unlink()' of System Interfaces of The
Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004 Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004
Edition, HTML Version (www.opengroup.org), ISBN 1931624232", Edition, HTML Version (www.opengroup.org), ISBN 1931624232",
2004. 2004.
[27] Schaad, J., Kaliski, B., and R. Housley, "Additional Algorithms [28] Schaad, J., Kaliski, B., and R. Housley, "Additional Algorithms
and Identifiers for RSA Cryptography for use in the Internet and Identifiers for RSA Cryptography for use in the Internet
X.509 Public Key Infrastructure Certificate and Certificate X.509 Public Key Infrastructure Certificate and Certificate
Revocation List (CRL) Profile", RFC 4055, June 2005. Revocation List (CRL) Profile", RFC 4055, June 2005.
[28] National Institute of Standards and Technology, "Cryptographic [29] National Institute of Standards and Technology, "Cryptographic
Algorithm Object Registration", URL http://csrc.nist.gov/ Algorithm Object Registration", URL http://csrc.nist.gov/
groups/ST/crypto_apps_infra/csor/algorithms.html, groups/ST/crypto_apps_infra/csor/algorithms.html,
November 2007. November 2007.
23.2. Informative References 23.2. Informative References
[29] Shepler, S., Callaghan, B., Robinson, D., Thurlow, R., Beame, [30] Shepler, S., Callaghan, B., Robinson, D., Thurlow, R., Beame,
C., Eisler, M., and D. Noveck, "Network File System (NFS) C., Eisler, M., and D. Noveck, "Network File System (NFS)
version 4 Protocol", RFC 3530, April 2003. version 4 Protocol", RFC 3530, April 2003.
[30] Callaghan, B., Pawlowski, B., and P. Staubach, "NFS Version 3 [31] Callaghan, B., Pawlowski, B., and P. Staubach, "NFS Version 3
Protocol Specification", RFC 1813, June 1995. Protocol Specification", RFC 1813, June 1995.
[31] Eisler, M., "LIPKEY - A Low Infrastructure Public Key Mechanism [32] Eisler, M., "LIPKEY - A Low Infrastructure Public Key Mechanism
Using SPKM", RFC 2847, June 2000. Using SPKM", RFC 2847, June 2000.
[32] Eisler, M., "NFS Version 2 and Version 3 Security Issues and [33] Eisler, M., "NFS Version 2 and Version 3 Security Issues and
the NFS Protocol's Use of RPCSEC_GSS and Kerberos V5", the NFS Protocol's Use of RPCSEC_GSS and Kerberos V5",
RFC 2623, June 1999. RFC 2623, June 1999.
[33] Juszczak, C., "Improving the Performance and Correctness of an [34] Juszczak, C., "Improving the Performance and Correctness of an
NFS Server", USENIX Conference Proceedings , June 1990. NFS Server", USENIX Conference Proceedings , June 1990.
[34] Reynolds, J., "Assigned Numbers: RFC 1700 is Replaced by an On- [35] Reynolds, J., "Assigned Numbers: RFC 1700 is Replaced by an On-
line Database", RFC 3232, January 2002. line Database", RFC 3232, January 2002.
[35] Srinivasan, R., "Binding Protocols for ONC RPC Version 2", [36] Srinivasan, R., "Binding Protocols for ONC RPC Version 2",
RFC 1833, August 1995. RFC 1833, August 1995.
[36] Werme, R., "RPC XID Issues", USENIX Conference Proceedings , [37] Werme, R., "RPC XID Issues", USENIX Conference Proceedings ,
February 1996. February 1996.
[37] Nowicki, B., "NFS: Network File System Protocol specification", [38] Nowicki, B., "NFS: Network File System Protocol specification",
RFC 1094, March 1989. RFC 1094, March 1989.
[38] Bhide, A., Elnozahy, E., and S. Morgan, "A Highly Available [39] Bhide, A., Elnozahy, E., and S. Morgan, "A Highly Available
Network Server", USENIX Conference Proceedings , January 1991. Network Server", USENIX Conference Proceedings , January 1991.
[39] Halevy, B., Welch, B., and J. Zelenka, "Object-based pNFS [40] Halevy, B., Welch, B., and J. Zelenka, "Object-based pNFS
Operations", draft-ietf-nfsv4-pnfs-obj-11 (work in progress), Operations", draft-ietf-nfsv4-pnfs-obj-11 (work in progress),
December 2008. December 2008.
[40] Black, D., Fridella, S., and J. Glasgow, "pNFS Block/Volume [41] Black, D., Fridella, S., and J. Glasgow, "pNFS Block/Volume
Layout", draft-ietf-nfsv4-pnfs-block-11 (work in progress), Layout", draft-ietf-nfsv4-pnfs-block-11 (work in progress),
December 2008. December 2008.
[41] Callaghan, B., "WebNFS Client Specification", RFC 2054, [42] Callaghan, B., "WebNFS Client Specification", RFC 2054,
October 1996. October 1996.
[42] Callaghan, B., "WebNFS Server Specification", RFC 2055, [43] Callaghan, B., "WebNFS Server Specification", RFC 2055,
October 1996. October 1996.
[43] IESG, "IESG Processing of RFC Errata for the IETF Stream", [44] IESG, "IESG Processing of RFC Errata for the IETF Stream",
July 2008. July 2008.
[44] Shepler, S., "NFS Version 4 Design Considerations", RFC 2624, [45] Shepler, S., "NFS Version 4 Design Considerations", RFC 2624,
June 1999. June 1999.
[45] The Open Group, "Protocols for Interworking: XNFS, Version 3W, [46] The Open Group, "Protocols for Interworking: XNFS, Version 3W,
ISBN 1-85912-184-5", February 1998. ISBN 1-85912-184-5", February 1998.
[46] Floyd, S. and V. Jacobson, "The Synchronization of Periodic [47] Floyd, S. and V. Jacobson, "The Synchronization of Periodic
Routing Messages", IEEE/ACM Transactions on Networking 2(2), Routing Messages", IEEE/ACM Transactions on Networking 2(2),
pp. 122-136, April 1994. pp. 122-136, April 1994.
[47] Satran, J., Meth, K., Sapuntzakis, C., Chadalapaka, M., and E. [48] Satran, J., Meth, K., Sapuntzakis, C., Chadalapaka, M., and E.
Zeidner, "Internet Small Computer Systems Interface (iSCSI)", Zeidner, "Internet Small Computer Systems Interface (iSCSI)",
RFC 3720, April 2004. RFC 3720, April 2004.
[48] Snively, R., "Fibre Channel Protocol for SCSI, 2nd Version [49] Snively, R., "Fibre Channel Protocol for SCSI, 2nd Version
(FCP-2)", ANSI/INCITS 350-2003, Oct 2003. (FCP-2)", ANSI/INCITS 350-2003, Oct 2003.
[49] Weber, R., "Object-Based Storage Device Commands (OSD)", ANSI/ [50] Weber, R., "Object-Based Storage Device Commands (OSD)", ANSI/
INCITS 400-2004, July 2004, INCITS 400-2004, July 2004,
<http://www.t10.org/ftp/t10/drafts/osd/osd-r10.pdf>. <http://www.t10.org/ftp/t10/drafts/osd/osd-r10.pdf>.
[50] Carns, P., Ligon III, W., Ross, R., and R. Thakur, "PVFS: A [51] Carns, P., Ligon III, W., Ross, R., and R. Thakur, "PVFS: A
Parallel File System for Linux Clusters.", Proceedings of the Parallel File System for Linux Clusters.", Proceedings of the
4th Annual Linux Showcase and Conference , 2000. 4th Annual Linux Showcase and Conference , 2000.
[51] The Open Group, "The Open Group Base Specifications Issue 6, [52] The Open Group, "The Open Group Base Specifications Issue 6,
IEEE Std 1003.1, 2004 Edition", 2004. IEEE Std 1003.1, 2004 Edition", 2004.
[52] Callaghan, B., "NFS URL Scheme", RFC 2224, October 1997. [53] Callaghan, B., "NFS URL Scheme", RFC 2224, October 1997.
[53] Chiu, A., Eisler, M., and B. Callaghan, "Security Negotiation [54] Chiu, A., Eisler, M., and B. Callaghan, "Security Negotiation
for WebNFS", RFC 2755, January 2000. for WebNFS", RFC 2755, January 2000.
[54] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA [55] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA
Considerations Section in RFCs", BCP 26, RFC 5226, May 2008. Considerations Section in RFCs", BCP 26, RFC 5226, May 2008.
Appendix A. Acknowledgments Appendix A. Acknowledgments
The initial drafts for the SECINFO extensions were edited by Mike The initial drafts for the SECINFO extensions were edited by Mike
Eisler with contributions from Peng Dai, Sergey Klyushin, and Carl Eisler with contributions from Peng Dai, Sergey Klyushin, and Carl
Burnett. Burnett.
The initial drafts for the SESSIONS extensions were edited by Tom The initial drafts for the SESSIONS extensions were edited by Tom
Talpey, Spencer Shepler, Jon Bauman with contributions from Charles Talpey, Spencer Shepler, Jon Bauman with contributions from Charles
skipping to change at page 612, line 43 skipping to change at page 613, line 44
[RFC Editor: please remove this section prior to publishing this [RFC Editor: please remove this section prior to publishing this
document as an RFC] document as an RFC]
[RFC Editor: prior to publishing this document as an RFC, please [RFC Editor: prior to publishing this document as an RFC, please
replace all occurrences of RFCTBD10 with RFCxxxx where xxxx is the replace all occurrences of RFCTBD10 with RFCxxxx where xxxx is the
RFC number of this document] RFC number of this document]
[RFC Editor: prior to publishing this document as an RFC, please [RFC Editor: prior to publishing this document as an RFC, please
replace all occurrences of RFCTBD20 with RFCyyyy where yyyy is the replace all occurrences of RFCTBD20 with RFCyyyy where yyyy is the
RFC number of the document referenced in [40]] RFC number of the document referenced in [41]]
[RFC Editor: prior to publishing this document as an RFC, please [RFC Editor: prior to publishing this document as an RFC, please
replace all occurrences of RFCTBD30 with RFCzzzz where zzzz is the replace all occurrences of RFCTBD30 with RFCzzzz where zzzz is the
RFC number of the document referenced in [39]] RFC number of the document referenced in [40]]
[RFC Editor: prior to publishing this document as an RFC, please [RFC Editor: prior to publishing this document as an RFC, please
ensure all section references to [14], including the reference from ensure all section references to [15], including the reference from
Section 3.3.9 are accurate if document referenced by [14] has been Section 3.3.9 are accurate if document referenced by [15] has been
finalized for RFC publication. If not finalized for publication, finalized for RFC publication. If not finalized for publication,
please remove section number references to [14]. please remove section number references to [15].
Authors' Addresses Authors' Addresses
Spencer Shepler Spencer Shepler
Storspeed, Inc. Storspeed, Inc.
7808 Moonflower Drive 7808 Moonflower Drive
Austin, TX 78750 Austin, TX 78750
USA USA
Phone: +1-512-402-5811 ext 8530 Phone: +1-512-402-5811 ext 8530
 End of changes. 189 change blocks. 
515 lines changed or deleted 610 lines changed or added
