Diff: draft-pre-ch-7.txt - draft-ietf-nfsv4-minorversion1-22.txt

	draft-pre-ch-7.txt		draft-ietf-nfsv4-minorversion1-22.txt

	NFSv4 S. Shepler		NFSv4 S. Shepler
	Internet-Draft M. Eisler		Internet-Draft M. Eisler
	Intended status: Standards Track D. Noveck		Intended status: Standards Track D. Noveck

	Expires: September 14, 2008 Editors		Expires: September 19, 2008 Editors
	March 13, 2008		March 18, 2008

	NFS Version 4 Minor Version 1		NFS Version 4 Minor Version 1
	draft-ietf-nfsv4-minorversion1-22.txt		draft-ietf-nfsv4-minorversion1-22.txt

	Status of this Memo		Status of this Memo

	By submitting this Internet-Draft, each author represents that any		By submitting this Internet-Draft, each author represents that any
	applicable patent or other IPR claims of which he or she is aware		applicable patent or other IPR claims of which he or she is aware
	have been or will be disclosed, and any of which he or she becomes		have been or will be disclosed, and any of which he or she becomes
	aware will be disclosed, in accordance with Section 6 of BCP 79.		aware will be disclosed, in accordance with Section 6 of BCP 79.

	skipping to change at page 1, line 35		skipping to change at page 1, line 35
	and may be updated, replaced, or obsoleted by other documents at any		and may be updated, replaced, or obsoleted by other documents at any
	time. It is inappropriate to use Internet-Drafts as reference		time. It is inappropriate to use Internet-Drafts as reference
	material or to cite them other than as "work in progress."		material or to cite them other than as "work in progress."

	The list of current Internet-Drafts can be accessed at		The list of current Internet-Drafts can be accessed at
	http://www.ietf.org/ietf/1id-abstracts.txt.		http://www.ietf.org/ietf/1id-abstracts.txt.

	The list of Internet-Draft Shadow Directories can be accessed at		The list of Internet-Draft Shadow Directories can be accessed at
	http://www.ietf.org/shadow.html.		http://www.ietf.org/shadow.html.


	This Internet-Draft will expire on September 14, 2008.		This Internet-Draft will expire on September 19, 2008.

	Copyright Notice		Copyright Notice

	Copyright (C) The IETF Trust (2008).		Copyright (C) The IETF Trust (2008).

	Abstract		Abstract

	This Internet-Draft describes NFS version 4 minor version one,		This Internet-Draft describes NFS version 4 minor version one,
	including features retained from the base protocol and protocol		including features retained from the base protocol and protocol
	extensions made subsequently. Major extensions introduced in NFS		extensions made subsequently. Major extensions introduced in NFS

	skipping to change at page 4, line 30		skipping to change at page 4, line 30
	8. State Management . . . . . . . . . . . . . . . . . . . . . . 147		8. State Management . . . . . . . . . . . . . . . . . . . . . . 147
	8.1. Client and Session ID . . . . . . . . . . . . . . . . . 148		8.1. Client and Session ID . . . . . . . . . . . . . . . . . 148
	8.2. Stateid Definition . . . . . . . . . . . . . . . . . . . 148		8.2. Stateid Definition . . . . . . . . . . . . . . . . . . . 148
	8.2.1. Stateid Types . . . . . . . . . . . . . . . . . . . 149		8.2.1. Stateid Types . . . . . . . . . . . . . . . . . . . 149
	8.2.2. Stateid Structure . . . . . . . . . . . . . . . . . 150		8.2.2. Stateid Structure . . . . . . . . . . . . . . . . . 150
	8.2.3. Special Stateids . . . . . . . . . . . . . . . . . . 151		8.2.3. Special Stateids . . . . . . . . . . . . . . . . . . 151
	8.2.4. Stateid Lifetime and Validation . . . . . . . . . . 152		8.2.4. Stateid Lifetime and Validation . . . . . . . . . . 152
	8.2.5. Stateid Use for I/O Operations . . . . . . . . . . . 155		8.2.5. Stateid Use for I/O Operations . . . . . . . . . . . 155
	8.3. Lease Renewal . . . . . . . . . . . . . . . . . . . . . 156		8.3. Lease Renewal . . . . . . . . . . . . . . . . . . . . . 156
	8.4. Crash Recovery . . . . . . . . . . . . . . . . . . . . . 158		8.4. Crash Recovery . . . . . . . . . . . . . . . . . . . . . 158

	8.4.1. Client Failure and Recovery . . . . . . . . . . . . 158		8.4.1. Client Failure and Recovery . . . . . . . . . . . . 159
	8.4.2. Server Failure and Recovery . . . . . . . . . . . . 159		8.4.2. Server Failure and Recovery . . . . . . . . . . . . 159
	8.4.3. Network Partitions and Recovery . . . . . . . . . . 163		8.4.3. Network Partitions and Recovery . . . . . . . . . . 163

	8.5. Server Revocation of Locks . . . . . . . . . . . . . . . 167		8.5. Server Revocation of Locks . . . . . . . . . . . . . . . 168
	8.6. Short and Long Leases . . . . . . . . . . . . . . . . . 168		8.6. Short and Long Leases . . . . . . . . . . . . . . . . . 169
	8.7. Clocks, Propagation Delay, and Calculating Lease		8.7. Clocks, Propagation Delay, and Calculating Lease
	Expiration . . . . . . . . . . . . . . . . . . . . . . . 169		Expiration . . . . . . . . . . . . . . . . . . . . . . . 169

	8.8. Obsolete Locking Infrastructure From NFSv4.0 . . . . . . 169		8.8. Obsolete Locking Infrastructure From NFSv4.0 . . . . . . 170
	9. File Locking and Share Reservations . . . . . . . . . . . . . 170		9. File Locking and Share Reservations . . . . . . . . . . . . . 171
	9.1. Opens and Byte-range Locks . . . . . . . . . . . . . . . 171		9.1. Opens and Byte-range Locks . . . . . . . . . . . . . . . 171
	9.1.1. State-owner Definition . . . . . . . . . . . . . . . 171		9.1.1. State-owner Definition . . . . . . . . . . . . . . . 171

	9.1.2. Use of the Stateid and Locking . . . . . . . . . . . 171		9.1.2. Use of the Stateid and Locking . . . . . . . . . . . 172
	9.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 174		9.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 175
	9.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 175		9.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 175

	9.4. Stateid Seqid Values and Byte-range Locks . . . . . . . 175		9.4. Stateid Seqid Values and Byte-range Locks . . . . . . . 176
	9.5. Issues with Multiple Open-owners . . . . . . . . . . . . 175		9.5. Issues with Multiple Open-owners . . . . . . . . . . . . 176
	9.6. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 176		9.6. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 176

	9.7. Share Reservations . . . . . . . . . . . . . . . . . . . 177		9.7. Share Reservations . . . . . . . . . . . . . . . . . . . 178
	9.8. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 178		9.8. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 178
	9.9. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 179		9.9. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 179

	9.10. Parallel OPENs . . . . . . . . . . . . . . . . . . . . . 179		9.10. Parallel OPENs . . . . . . . . . . . . . . . . . . . . . 180
	9.11. Reclaim of Open and Byte-range Locks . . . . . . . . . . 180		9.11. Reclaim of Open and Byte-range Locks . . . . . . . . . . 180

	10. Client-Side Caching . . . . . . . . . . . . . . . . . . . . . 180		10. Client-Side Caching . . . . . . . . . . . . . . . . . . . . . 181
	10.1. Performance Challenges for Client-Side Caching . . . . . 181		10.1. Performance Challenges for Client-Side Caching . . . . . 181
	10.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 182		10.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 182

	10.2.1. Delegation Recovery . . . . . . . . . . . . . . . . 184		10.2.1. Delegation Recovery . . . . . . . . . . . . . . . . 185
	10.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 186		10.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 187
	10.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . 187		10.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . 187
	10.3.2. Data Caching and File Locking . . . . . . . . . . . 188		10.3.2. Data Caching and File Locking . . . . . . . . . . . 188

	10.3.3. Data Caching and Mandatory File Locking . . . . . . 189		10.3.3. Data Caching and Mandatory File Locking . . . . . . 190
	10.3.4. Data Caching and File Identity . . . . . . . . . . . 190		10.3.4. Data Caching and File Identity . . . . . . . . . . . 190
	10.4. Open Delegation . . . . . . . . . . . . . . . . . . . . 191		10.4. Open Delegation . . . . . . . . . . . . . . . . . . . . 191

	10.4.1. Open Delegation and Data Caching . . . . . . . . . . 193		10.4.1. Open Delegation and Data Caching . . . . . . . . . . 194
	10.4.2. Open Delegation and File Locks . . . . . . . . . . . 195		10.4.2. Open Delegation and File Locks . . . . . . . . . . . 195
	10.4.3. Handling of CB_GETATTR . . . . . . . . . . . . . . . 195		10.4.3. Handling of CB_GETATTR . . . . . . . . . . . . . . . 195
	10.4.4. Recall of Open Delegation . . . . . . . . . . . . . 198		10.4.4. Recall of Open Delegation . . . . . . . . . . . . . 198
	10.4.5. Clients that Fail to Honor Delegation Recalls . . . 200		10.4.5. Clients that Fail to Honor Delegation Recalls . . . 200

	10.4.6. Delegation Revocation . . . . . . . . . . . . . . . 200		10.4.6. Delegation Revocation . . . . . . . . . . . . . . . 201
	10.4.7. Delegations via WANT_DELEGATION . . . . . . . . . . 201		10.4.7. Delegations via WANT_DELEGATION . . . . . . . . . . 201
	10.5. Data Caching and Revocation . . . . . . . . . . . . . . 202		10.5. Data Caching and Revocation . . . . . . . . . . . . . . 202

	10.5.1. Revocation Recovery for Write Open Delegation . . . 202		10.5.1. Revocation Recovery for Write Open Delegation . . . 203
	10.6. Attribute Caching . . . . . . . . . . . . . . . . . . . 203		10.6. Attribute Caching . . . . . . . . . . . . . . . . . . . 203
	10.7. Data and Metadata Caching and Memory Mapped Files . . . 205		10.7. Data and Metadata Caching and Memory Mapped Files . . . 205
	10.8. Name and Directory Caching without Directory		10.8. Name and Directory Caching without Directory

	Delegations . . . . . . . . . . . . . . . . . . . . . . 207		Delegations . . . . . . . . . . . . . . . . . . . . . . 208
	10.8.1. Name Caching . . . . . . . . . . . . . . . . . . . . 207		10.8.1. Name Caching . . . . . . . . . . . . . . . . . . . . 208
	10.8.2. Directory Caching . . . . . . . . . . . . . . . . . 209		10.8.2. Directory Caching . . . . . . . . . . . . . . . . . 209
	10.9. Directory Delegations . . . . . . . . . . . . . . . . . 210		10.9. Directory Delegations . . . . . . . . . . . . . . . . . 210
	10.9.1. Introduction to Directory Delegations . . . . . . . 210		10.9.1. Introduction to Directory Delegations . . . . . . . 210
	10.9.2. Directory Delegation Design . . . . . . . . . . . . 211		10.9.2. Directory Delegation Design . . . . . . . . . . . . 211
	10.9.3. Attributes in Support of Directory Notifications . . 212		10.9.3. Attributes in Support of Directory Notifications . . 212
	10.9.4. Directory Delegation Recall . . . . . . . . . . . . 212		10.9.4. Directory Delegation Recall . . . . . . . . . . . . 212
	10.9.5. Directory Delegation Recovery . . . . . . . . . . . 213		10.9.5. Directory Delegation Recovery . . . . . . . . . . . 213
	11. Multi-Server Namespace . . . . . . . . . . . . . . . . . . . 213		11. Multi-Server Namespace . . . . . . . . . . . . . . . . . . . 213

	11.1. Location Attributes . . . . . . . . . . . . . . . . . . 213		11.1. Location Attributes . . . . . . . . . . . . . . . . . . 214
	11.2. File System Presence or Absence . . . . . . . . . . . . 214		11.2. File System Presence or Absence . . . . . . . . . . . . 214
	11.3. Getting Attributes for an Absent File System . . . . . . 215		11.3. Getting Attributes for an Absent File System . . . . . . 215
	11.3.1. GETATTR Within an Absent File System . . . . . . . . 215		11.3.1. GETATTR Within an Absent File System . . . . . . . . 215

	11.3.2. READDIR and Absent File Systems . . . . . . . . . . 216		11.3.2. READDIR and Absent File Systems . . . . . . . . . . 217
	11.4. Uses of Location Information . . . . . . . . . . . . . . 217		11.4. Uses of Location Information . . . . . . . . . . . . . . 217
	11.4.1. File System Replication . . . . . . . . . . . . . . 218		11.4.1. File System Replication . . . . . . . . . . . . . . 218
	11.4.2. File System Migration . . . . . . . . . . . . . . . 219		11.4.2. File System Migration . . . . . . . . . . . . . . . 219
	11.4.3. Referrals . . . . . . . . . . . . . . . . . . . . . 220		11.4.3. Referrals . . . . . . . . . . . . . . . . . . . . . 220
	11.5. Location Entries and Server Identity . . . . . . . . . . 221		11.5. Location Entries and Server Identity . . . . . . . . . . 221
	11.6. Additional Client-side Considerations . . . . . . . . . 222		11.6. Additional Client-side Considerations . . . . . . . . . 222
	11.7. Effecting File System Transitions . . . . . . . . . . . 223		11.7. Effecting File System Transitions . . . . . . . . . . . 223
	11.7.1. File System Transitions and Simultaneous Access . . 224		11.7.1. File System Transitions and Simultaneous Access . . 224

	11.7.2. Simultaneous Use and Transparent Transitions . . . . 224		11.7.2. Simultaneous Use and Transparent Transitions . . . . 225
	11.7.3. Filehandles and File System Transitions . . . . . . 227		11.7.3. Filehandles and File System Transitions . . . . . . 227

	11.7.4. Fileids and File System Transitions . . . . . . . . 227		11.7.4. Fileids and File System Transitions . . . . . . . . 228
	11.7.5. Fsids and File System Transitions . . . . . . . . . 229		11.7.5. Fsids and File System Transitions . . . . . . . . . 229

	11.7.6. The Change Attribute and File System Transitions . . 229		11.7.6. The Change Attribute and File System Transitions . . 230
	11.7.7. Lock State and File System Transitions . . . . . . . 230		11.7.7. Lock State and File System Transitions . . . . . . . 230
	11.7.8. Write Verifiers and File System Transitions . . . . 234		11.7.8. Write Verifiers and File System Transitions . . . . 234
	11.7.9. Readdir Cookies and Verifiers and File System		11.7.9. Readdir Cookies and Verifiers and File System
	Transitions . . . . . . . . . . . . . . . . . . . . 234		Transitions . . . . . . . . . . . . . . . . . . . . 234

	11.7.10. File System Data and File System Transitions . . . . 234		11.7.10. File System Data and File System Transitions . . . . 235
	11.8. Effecting File System Referrals . . . . . . . . . . . . 236		11.8. Effecting File System Referrals . . . . . . . . . . . . 236
	11.8.1. Referral Example (LOOKUP) . . . . . . . . . . . . . 236		11.8.1. Referral Example (LOOKUP) . . . . . . . . . . . . . 236
	11.8.2. Referral Example (READDIR) . . . . . . . . . . . . . 240		11.8.2. Referral Example (READDIR) . . . . . . . . . . . . . 240

	11.9. The Attribute fs_locations . . . . . . . . . . . . . . . 242		11.9. The Attribute fs_locations . . . . . . . . . . . . . . . 243
	11.10. The Attribute fs_locations_info . . . . . . . . . . . . 245		11.10. The Attribute fs_locations_info . . . . . . . . . . . . 245
	11.10.1. The fs_locations_server4 Structure . . . . . . . . . 248		11.10.1. The fs_locations_server4 Structure . . . . . . . . . 248

	11.10.2. The fs_locations_info4 Structure . . . . . . . . . . 253		11.10.2. The fs_locations_info4 Structure . . . . . . . . . . 254
	11.10.3. The fs_locations_item4 Structure . . . . . . . . . . 254		11.10.3. The fs_locations_item4 Structure . . . . . . . . . . 255
	11.11. The Attribute fs_status . . . . . . . . . . . . . . . . 256		11.11. The Attribute fs_status . . . . . . . . . . . . . . . . 257
	12. Parallel NFS (pNFS) . . . . . . . . . . . . . . . . . . . . . 260		12. Parallel NFS (pNFS) . . . . . . . . . . . . . . . . . . . . . 260
	12.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 260		12.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 260
	12.2. pNFS Definitions . . . . . . . . . . . . . . . . . . . . 262		12.2. pNFS Definitions . . . . . . . . . . . . . . . . . . . . 262
	12.2.1. Metadata . . . . . . . . . . . . . . . . . . . . . . 262		12.2.1. Metadata . . . . . . . . . . . . . . . . . . . . . . 262
	12.2.2. Metadata Server . . . . . . . . . . . . . . . . . . 262		12.2.2. Metadata Server . . . . . . . . . . . . . . . . . . 262
	12.2.3. pNFS Client . . . . . . . . . . . . . . . . . . . . 263		12.2.3. pNFS Client . . . . . . . . . . . . . . . . . . . . 263
	12.2.4. Storage Device . . . . . . . . . . . . . . . . . . . 263		12.2.4. Storage Device . . . . . . . . . . . . . . . . . . . 263
	12.2.5. Storage Protocol . . . . . . . . . . . . . . . . . . 263		12.2.5. Storage Protocol . . . . . . . . . . . . . . . . . . 263
	12.2.6. Control Protocol . . . . . . . . . . . . . . . . . . 263		12.2.6. Control Protocol . . . . . . . . . . . . . . . . . . 263
	12.2.7. Layout Types . . . . . . . . . . . . . . . . . . . . 263		12.2.7. Layout Types . . . . . . . . . . . . . . . . . . . . 263

	skipping to change at page 147, line 6		skipping to change at page 147, line 6
	a particular file system, as opposed to all of the data within it,		a particular file system, as opposed to all of the data within it,
	the server can apply the security policy of a shared resource in the		the server can apply the security policy of a shared resource in the
	server's namespace to components of the resource's ancestors. For		server's namespace to components of the resource's ancestors. For
	example:		example:

	/ (place holder/not exported)		/ (place holder/not exported)
	/a/b (file system 1)		/a/b (file system 1)
	/a/b/MySecretProject (file system 2)		/a/b/MySecretProject (file system 2)
	The /a/b/MySecretProject directory is a real file system and is the		The /a/b/MySecretProject directory is a real file system and is the
	shared resource. Suppose the security policy for /a/b/		shared resource. Suppose the security policy for /a/b/

	MySecretProject is Kerberos with integrity and it desired that		MySecretProject is Kerberos with integrity and it is desired to limit
	knowledge of the existence of this file system to be very limited.		knowledge of the existence of this file system. In this case, the
	In this case the server should apply the same security policy to		server should apply the same security policy to /a/b. This allows
	/a/b. This allows for knowledge the existence of a file system to be		for knowledge of the existence of a file system to be secured when
	secured in cases where this is desirable.		desirable.

	For the case of the use of multiple, disjoint security mechanisms in		For the case of the use of multiple, disjoint security mechanisms in
	the server's resources, applying that sort of policy would result in		the server's resources, applying that sort of policy would result in
	the higher-level file system not being accessible using any security		the higher-level file system not being accessible using any security
	flavor, which would make the that higher-level file system		flavor, which would make the that higher-level file system
	inaccessible. Therefore, that sort of configuration is not		inaccessible. Therefore, that sort of configuration is not
	compatible with hiding the existence (as opposed to the contents)		compatible with hiding the existence (as opposed to the contents)
	from clients using multiple disjoint sets of security flavors.		from clients using multiple disjoint sets of security flavors.

	In other circumstances, a desirable policy is for the security of a		In other circumstances, a desirable policy is for the security of a
	particular object in the server's namespace should include the union		particular object in the server's namespace should include the union
	of all security mechanisms of all direct descendants. A common and		of all security mechanisms of all direct descendants. A common and
	convenient practice, unless strong security requirements dictate		convenient practice, unless strong security requirements dictate
	otherwise, is to make all of the pseudo file system accessible by all		otherwise, is to make all of the pseudo file system accessible by all
	of the valid security mechanisms.		of the valid security mechanisms.
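As an illustration of the "union of all security mechanisms of all direct descendants" policy described above, the following is a minimal sketch, not taken from either draft. The function name, the dictionary layout, the extra /a/b/public entry, and the flavor labels ("sys", "krb5", "krb5i") are all assumptions made for the example.

```python
# Hypothetical sketch: compute the flavors a namespace object would accept
# under the "union of direct descendants" policy. All names are illustrative.

def effective_flavors(path, configured, children):
    """Return the object's own flavors plus those of its direct descendants."""
    flavors = set(configured.get(path, ()))
    for child in children.get(path, ()):
        flavors |= set(configured.get(child, ()))
    return flavors


# Assumed example namespace, loosely modelled on the /a/b example above.
configured = {
    "/a/b": {"sys"},
    "/a/b/MySecretProject": {"krb5i"},
    "/a/b/public": {"sys", "krb5"},
}
children = {"/a/b": ["/a/b/MySecretProject", "/a/b/public"]}

print(sorted(effective_flavors("/a/b", configured, children)))
```

Under this policy /a/b is reachable with any flavor that one of its children requires, which is the opposite trade-off from the existence-hiding configuration discussed earlier in the section.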


	Where there is concern about the security of data on the wire,		Where there is concern about the security of data on the network,
	clients should use strong security mechanisms to access the pseudo		clients should use strong security mechanisms to access the pseudo

	file system in order to prevent man-in-the-middle-attacks from		file system in order to prevent man-in-the-middle attacks.
	directing LOOKUPs within the pseudo file system from compromising the
	existence of sensitive data, or getting access to data that the
	client is sending by directing the client to send it using weak
	security mechanisms.

	8. State Management		8. State Management

	Integrating locking into the NFS protocol necessarily causes it to be		Integrating locking into the NFS protocol necessarily causes it to be
	stateful. With the inclusion of such features as share reservations,		stateful. With the inclusion of such features as share reservations,
	file and directory delegations, recallable layouts, and support for		file and directory delegations, recallable layouts, and support for

	mandatory record locking the protocol becomes substantially more		mandatory record locking, the protocol becomes substantially more
	dependent on proper management of state than the traditional		dependent on proper management of state than the traditional
	combination of NFS and NLM [36]. These features include expanded		combination of NFS and NLM [36]. These features include expanded
	locking facilities, which provide some measure of interclient		locking facilities, which provide some measure of interclient

	exclusion, but the state is also valuable to providing other useful		exclusion, but the state is also valuable to offering features not
	features not readily providable using a stateless model. There are		readily providable using a stateless model. There are three
	three components to making this state manageable:		components to making this state manageable:

	o Clear division between client and server		o Clear division between client and server


	o Ability to reliably detect inconsistency in state between client		o Ability to reliably detect inconsistency in state between client
	and server		and server

	o Simple and robust recovery mechanisms		o Simple and robust recovery mechanisms


	In this model, the server owns the state information. The client		In this model, the server owns the state information. The client
	requests changes in locks and the server responds with the changes		requests changes in locks and the server responds with the changes
	made. Non-client-initiated changes in locking state are infrequent		made. Non-client-initiated changes in locking state are infrequent
	and the client receives prompt notification of them and can adjust		and the client receives prompt notification of them and can adjust
	its view of the locking state to reflect the server's changes.		its view of the locking state to reflect the server's changes.

	Individual pieces of state created by the server and passed to the		Individual pieces of state created by the server and passed to the
	client at its request are represented by 128-bit stateids. These		client at its request are represented by 128-bit stateids. These
	stateids may represent a particular open file, a set of byte-range		stateids may represent a particular open file, a set of byte-range
	locks held by a particular owner, or a recallable delegation of		locks held by a particular owner, or a recallable delegation of

	skipping to change at page 149, line 4		skipping to change at page 148, line 47
	and a unitary client.		and a unitary client.

	8.2. Stateid Definition		8.2. Stateid Definition

	When the server grants a lock of any type (including opens, record		When the server grants a lock of any type (including opens, record
	locks, delegations, and layouts) it responds with a unique stateid,		locks, delegations, and layouts) it responds with a unique stateid,
	that represents a set of locks (often a single lock) for the same		that represents a set of locks (often a single lock) for the same
	file, of the same type, and sharing the same ownership		file, of the same type, and sharing the same ownership
	characteristics. Thus opens of the same file by different open-		characteristics. Thus opens of the same file by different open-
	owners each have an identifying stateid. Similarly, each set of		owners each have an identifying stateid. Similarly, each set of

	record locks on a file owned by a specific lock-owner and gotten via		record locks on a file owned by a specific lock-owner has its own
	an open for a specific open-owner, has its own identifying stateid.		identifying stateid. Delegations and layouts also have associated
	Delegations and layouts also have associated stateids by which they		stateids by which they may be referenced. The stateid is used as a
	may be referenced. The stateid is used as a shorthand reference to a		shorthand reference to a lock or set of locks and given a stateid the
	lock or set of locks and given a stateid the server can determine the		server can determine the associated state-owner or state-owners (in
	associated state-owner or state-owners (in the case of an open-owner/		the case of an open-owner/lock-owner pair) and the associated
	lock-owner pair) and the associated filehandle. When stateids are		filehandle. When stateids are used, the current filehandle must be
	used, the current filehandle must be the one associated with that		the one associated with that stateid.
	stateid.


	All stateids associated with a given clientid are associated with a		All stateids associated with a given client ID are associated with a
	common lease which represents the claim of those stateids and the		common lease which represents the claim of those stateids and the
	objects they represent to be maintained by the server. See		objects they represent to be maintained by the server. See
	Section 8.3 for a discussion of leases.		Section 8.3 for a discussion of leases.

	The server may assign stateids independently for different clients.		The server may assign stateids independently for different clients.
	A stateid with the same bit pattern for one client may designate an		A stateid with the same bit pattern for one client may designate an
	entirely different set of locks for a different client. The stateid		entirely different set of locks for a different client. The stateid
	is always interpreted with respect to the client ID associated with		is always interpreted with respect to the client ID associated with
	the current session. Stateids apply to all sessions associated with		the current session. Stateids apply to all sessions associated with
	the given client ID and the client may use a stateid obtained from		the given client ID and the client may use a stateid obtained from

	skipping to change at page 149, line 38		skipping to change at page 149, line 32

	With the exception of special stateids, to be discussed later, each		With the exception of special stateids, to be discussed later, each
	stateid represents locking objects of one of a set of types defined		stateid represents locking objects of one of a set of types defined
	by the NFSv4.1 protocol. Note that in all these cases, where we		by the NFSv4.1 protocol. Note that in all these cases, where we
	speak of guarantee, it is understood there are situations such as a		speak of guarantee, it is understood there are situations such as a
	client restart, or lock revocation, that allow the guarantee to be		client restart, or lock revocation, that allow the guarantee to be
	voided.		voided.

	o Stateids may represent opens of files.		o Stateids may represent opens of files.


	Each stateid in this case represents the open for a given		Each stateid in this case represents the open for a given client
	clientid/open-owner/filehandle triple. Such stateids are subject		ID/open-owner/filehandle triple. Such stateids are subject to
	to change (with consequent bumping of the seqid) in response to		change (with consequent incrementing of the stateid's seqid) in
	OPENs that result in upgrade and OPEN_DOWNGRADE operations.		response to OPENs that result in upgrade and OPEN_DOWNGRADE
			operations.

	o Stateids may represent sets of byte-range locks.		o Stateids may represent sets of byte-range locks.

	All locks held on a particular file by a particular owner and all		All locks held on a particular file by a particular owner and all
	gotten under the aegis of a particular open file are associated		gotten under the aegis of a particular open file are associated

	with a single stateid with the seqid being bumped as LOCK and		with a single stateid with the seqid being increment whenever LOCK
	LOCKU operation affect that set of locks.		and LOCKU operations affect that set of locks.

	o Stateids may represent file delegations, which are recallable		o Stateids may represent file delegations, which are recallable
	guarantees by the server to the client, that other clients will		guarantees by the server to the client, that other clients will
	not reference, or will not modify a particular file, until the		not reference, or will not modify a particular file, until the
	delegation is returned. In NFSv4.1, file delegations may be		delegation is returned. In NFSv4.1, file delegations may be
	obtained on both regular and non-regular files.		obtained on both regular and non-regular files.

	A stateid represents a single delegation held by a client for a		A stateid represents a single delegation held by a client for a
	particular filehandle.		particular filehandle.


	skipping to change at page 150, line 25		skipping to change at page 150, line 20
	A stateid represents a single delegation held by a client for a		A stateid represents a single delegation held by a client for a
	particular directory filehandle.		particular directory filehandle.

	o Stateids may represent layouts, which are recallable guarantees by		o Stateids may represent layouts, which are recallable guarantees by
	the server to the client, that particular files may be accessed		the server to the client, that particular files may be accessed
	via an alternate data access protocol at specific locations. Such		via an alternate data access protocol at specific locations. Such
	access is limited to particular sets of byte ranges and may		access is limited to particular sets of byte ranges and may
	proceed until those byte ranges are reduced or the layout is		proceed until those byte ranges are reduced or the layout is
	returned.		returned.


	A stateid represents all layouts held by a particular client for a		A stateid represents the set of all layouts held by a particular
	particular filehandle with a given layout type. The seqid is		client for a particular filehandle with a given layout type. The
	updated as the contents of that set changes with LAYOUT		seqid is updated as the layouts of that set changes with layout
			stateid changing operations such as LAYOUTGET and LAYOUTRETURN.
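The open-stateid case in the list above (one stateid per client ID/open-owner/filehandle triple, with the seqid incremented on upgrade and downgrade) can be sketched as follows. This is a hedged illustration only: the class, method names, access constants, and the choice of 1 as the initial seqid are assumptions of the sketch, and 32-bit seqid wraparound is ignored.

```python
import itertools

# Hypothetical share-access bits, for illustration only.
READ, WRITE = 1, 2


class OpenStateTable:
    """Toy server-side table of open stateids keyed by the triple."""

    def __init__(self):
        self._by_triple = {}
        self._next_other = itertools.count(1)  # stand-in for the "other" field

    def open(self, client_id, open_owner, filehandle, access):
        key = (client_id, open_owner, filehandle)
        entry = self._by_triple.get(key)
        if entry is None:
            # First open for this triple: allocate a new stateid.
            entry = {"other": next(self._next_other), "seqid": 1, "access": access}
            self._by_triple[key] = entry
        elif entry["access"] | access != entry["access"]:
            # The OPEN results in an upgrade: same stateid, higher seqid.
            entry["access"] |= access
            entry["seqid"] += 1
        return entry["other"], entry["seqid"]

    def downgrade(self, client_id, open_owner, filehandle, access):
        # OPEN_DOWNGRADE changes the open, so the seqid is incremented.
        entry = self._by_triple[(client_id, open_owner, filehandle)]
        entry["access"] = access
        entry["seqid"] += 1
        return entry["other"], entry["seqid"]
```

For example, two different open-owners opening the same file receive stateids with different "other" values, while an upgrade by the same open-owner returns the same "other" value with a larger seqid.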

	8.2.2. Stateid Structure		8.2.2. Stateid Structure

	Stateids are divided into two fields, a 96-bit "other" field		Stateids are divided into two fields, a 96-bit "other" field
	identifying the specific set of locks and a 32-bit "seqid" sequence		identifying the specific set of locks and a 32-bit "seqid" sequence
	value. Except in the case of special stateids, to be discussed		value. Except in the case of special stateids, to be discussed
	below, a particular value of the "other" field denotes a set of locks		below, a particular value of the "other" field denotes a set of locks
	of the same type (for example byte-range locks, opens, delegations,		of the same type (for example byte-range locks, opens, delegations,
	or layouts), for a specific file or directory, and sharing the same		or layouts), for a specific file or directory, and sharing the same
	ownership characteristics. The seqid designates a specific instance		ownership characteristics. The seqid designates a specific instance
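
The two-field layout described in this hunk (a 96-bit "other" field and a 32-bit "seqid") can be sketched as a 16-byte value. The class below is a hypothetical illustration, not text from either draft. In particular, placing the seqid before the "other" field and using big-endian encoding are assumptions of the sketch, since the excerpt does not specify the wire order.

```python
import struct
from typing import NamedTuple


class Stateid(NamedTuple):
    """Toy 128-bit stateid: 32-bit seqid plus 96-bit (12-byte) 'other'."""

    seqid: int
    other: bytes

    def pack(self) -> bytes:
        if len(self.other) != 12:
            raise ValueError("'other' must be exactly 12 bytes (96 bits)")
        if not 0 <= self.seqid < 2**32:
            raise ValueError("seqid must fit in 32 bits")
        # Assumed layout: big-endian seqid followed by the opaque 'other'.
        return struct.pack(">I12s", self.seqid, self.other)

    @classmethod
    def unpack(cls, data: bytes) -> "Stateid":
        seqid, other = struct.unpack(">I12s", data)
        return cls(seqid, other)

    def same_lock_set(self, stateid: "Stateid") -> bool:
        # The 'other' field identifies the set of locks; the seqid only
        # distinguishes successive instances of that set.
        return self.other == stateid.other
```

Two stateids with equal "other" fields and different seqids would then refer to the same set of locks at different points in its history.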

	skipping to change at page 156, line 8		skipping to change at page 156, line 4
	8.2.5. Stateid Use for I/O Operations		8.2.5. Stateid Use for I/O Operations

	Clients performing I/O operations (and SETATTR's modifying the file		Clients performing I/O operations (and SETATTR's modifying the file
	size), need to select an appropriate stateid based on the locks		size), need to select an appropriate stateid based on the locks
	(including opens and delegations) held by the client and the various		(including opens and delegations) held by the client and the various
	types of state-owners issuing the I/O requests.		types of state-owners issuing the I/O requests.

	The following rules, applied in order of decreasing priority, govern		The following rules, applied in order of decreasing priority, govern
	the selection of the appropriate stateid. Note that the rules are		the selection of the appropriate stateid. Note that the rules are
	slightly different in the case of I/O to data servers when file		slightly different in the case of I/O to data servers when file

	layouts are being used. (See Section 13.9.1).		layouts are being used (see Section 13.9.1).

	o If the client holds a delegation for the file in question, the		o If the client holds a delegation for the file in question, the

	delegation stateid should be used.		delegation stateid SHOULD be used.

	o Otherwise, if the entity corresponding to the lock-owner (e.g. process)		o Otherwise, if the entity corresponding to the lock-owner (e.g. process)
	issuing the I/O has a lock stateid for the associated open file,		issuing the I/O has a lock stateid for the associated open file,

	then the lock stateid for that lock-owner and open file should be		then the lock stateid for that lock-owner and open file SHOULD be
	used.		used.

	o If there is no lock stateid, then the open stateid for the open		o If there is no lock stateid, then the open stateid for the open

	file in question is used.		file in question SHOULD be used.


	o Finally, if none of the above apply, then a special stateid should		o Finally, if none of the above apply, then a special stateid SHOULD
	be used.		be used.
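
The priority order above can be sketched as a small selection function. This is only an illustration: the FileState bookkeeping class, the string stateids, and the SPECIAL_STATEID placeholder are assumptions made for the example, not protocol structures.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical placeholder; the actual special stateid encodings are
# defined elsewhere in the specification.
SPECIAL_STATEID = "special"


@dataclass
class FileState:
    """Hypothetical per-file client bookkeeping (not a protocol type)."""
    delegation_stateid: Optional[str] = None
    open_stateid: Optional[str] = None
    # Maps a lock-owner name to the lock stateid held for this open file.
    lock_stateids: Dict[str, str] = field(default_factory=dict)


def select_stateid(state: FileState, lock_owner: str) -> str:
    """Pick the stateid for an I/O request, highest-priority rule first."""
    if state.delegation_stateid is not None:
        return state.delegation_stateid
    lock_stateid = state.lock_stateids.get(lock_owner)
    if lock_stateid is not None:
        return lock_stateid
    if state.open_stateid is not None:
        return state.open_stateid
    return SPECIAL_STATEID
```

Note that this sketch does not cover the different rules that apply to I/O sent to data servers when file layouts are in use.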

	8.3. Lease Renewal		8.3. Lease Renewal


	The purpose of a lease is to provide allow the client to indicate to		The purpose of a lease is to allow the client to indicate to the
	the server, in a low-overhead way, that it is active, and thus that		server, in a low-overhead way, that it is active, and thus that the
	the server is to retain its locks. This arrangement allows the		server is to retain the client's locks. This arrangement allows the
	server to remove stale locking-related objects that are held by a		server to remove stale locking-related objects that are held by a
	client that has crashed or is otherwise unreachable, once the		client that has crashed or is otherwise unreachable, once the

	relevant lease expires. This allows other clients to obtain		relevant lease expires. This in turn allows other clients to obtain
	conflicting locks without being delayed indefinitely by inactive or		conflicting locks without being delayed indefinitely by inactive or
	unreachable clients. It is not a mechanism for cache consistency and		unreachable clients. It is not a mechanism for cache consistency and
	lease renewals may not be denied if the lease interval has not		lease renewals may not be denied if the lease interval has not
	expired.		expired.

	Since each session is associated with a specific client (identified		Since each session is associated with a specific client (identified
	by the client's client ID), any operation sent on that session is an		by the client's client ID), any operation sent on that session is an
	indication that the associated client is reachable. When a request		indication that the associated client is reachable. When a request
	is sent for a given session, successful execution of a SEQUENCE		is sent for a given session, successful execution of a SEQUENCE
	operation (or successful retrieval of the result of SEQUENCE from the		operation (or successful retrieval of the result of SEQUENCE from the
	reply cache) on an unexpired lease will result in the lease being		reply cache) on an unexpired lease will result in the lease being

	implicitly renewed, for the standard renewal period.		implicitly renewed, for the standard renewal period (equal to the
			lease_time attribute).

	If the client ID's lease has not expired when the server receives a		If the client ID's lease has not expired when the server receives a
	SEQUENCE operation, then the server MUST renew the lease. If the		SEQUENCE operation, then the server MUST renew the lease. If the
	client ID's lease has expired when the server receives a SEQUENCE		client ID's lease has expired when the server receives a SEQUENCE
	operation, the server MAY renew the lease; this depends on whether		operation, the server MAY renew the lease; this depends on whether
	any state was revoked as a result of the client's failure to renew		any state was revoked as a result of the client's failure to renew
	the lease before expiration.		the lease before expiration.

	Absent other activity that would renew the lease, a COMPOUND		Absent other activity that would renew the lease, a COMPOUND
	consisting of a single SEQUENCE operation will suffice. The client		consisting of a single SEQUENCE operation will suffice. The client
	should also take communication-related delays into account and take		should also take communication-related delays into account and take
	steps to ensure that the renewal messages actually reach the server		steps to ensure that the renewal messages actually reach the server
	in good time. For example:		in good time. For example:

	o When trunking is in effect, the client should consider issuing		o When trunking is in effect, the client should consider issuing
	multiple requests on different connections, in order to ensure		multiple requests on different connections, in order to ensure
	that renewal occurs, even in the event of blockage in the path		that renewal occurs, even in the event of blockage in the path
	used for one of those connections.		used for one of those connections.


	o TCP retransmission delays might become so large as to approach or		o Transport retransmission delays might become so large as to
	exceed the length of the lease period. This may be particularly		approach or exceed the length of the lease period. This may be
	likely when the server is unresponsive due to a restart; see		particularly likely when the server is unresponsive due to a
	Section 8.4.2.1		restart; see Section 8.4.2.1. If the client implementation is not
			careful, transport retransmission delays can result in the client
			failing to detect a server restart before the grace period ends.
			The scenario is that the client is using a transport with
			exponential back off, such that the maximum retransmission timeout
			exceeds both the grace period and the lease_time attribute. A
			network partition causes the client's connection's retransmission
			interval to back off, and even after the partition heals, the next
			transport-level retransmission is sent after the server has
			restarted and its grace period ends.

			The client MUST either recover from the ensuing NFS4ERR_NOGRACE
			errors, or it MUST ensure that despite transport level
			retransmission intervals that exceed the lease_time, nonetheless a
			SEQUENCE operation is sent that renews the lease before
			expiration. The client can achieve this by associating a new
			connection with the session, and sending a SEQUENCE operation on
			it. However, if the attempt to establish a new connection is
			delayed for the same reason (exponential backoff of the connection
			establishment packets), the client will have to abort the
			connection establishment attempt before the lease expires, and try
			again.
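
The retransmission scenario can be made concrete with a little arithmetic. The sketch below assumes a simple doubling backoff capped at a maximum timeout, which may not match any particular transport, and a safety margin chosen by the client. Both function names are hypothetical.

```python
from typing import List


def retransmit_times(first_send: float, initial_rto: float,
                     max_rto: float, count: int) -> List[float]:
    """Times of successive retransmissions under a doubling backoff."""
    times = []
    t, rto = first_send, initial_rto
    for _ in range(count):
        t += rto
        times.append(t)
        rto = min(rto * 2, max_rto)  # back off, capped at max_rto
    return times


def needs_new_connection(next_retransmit: float, lease_expiry: float,
                         margin: float) -> bool:
    """True if waiting for the transport's next retransmission would
    leave less than `margin` seconds before the lease expires."""
    return next_retransmit > lease_expiry - margin
```

When needs_new_connection() returns True, a client following the text above would associate a new connection with the session and send a SEQUENCE operation on it, instead of waiting for the transport to retry.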

	If the server renews the lease upon receiving a SEQUENCE operation,		If the server renews the lease upon receiving a SEQUENCE operation,
	the server MUST NOT allow the lease to expire while the rest of the		the server MUST NOT allow the lease to expire while the rest of the
	operations in the COMPOUND procedure's request are still executing.		operations in the COMPOUND procedure's request are still executing.
	Once the last operation has finished, and the response to COMPOUND		Once the last operation has finished, and the response to COMPOUND
	has been sent, the server MUST set the lease to expire no sooner than		has been sent, the server MUST set the lease to expire no sooner than
	the sum of current time and the value of the lease_time attribute.		the sum of current time and the value of the lease_time attribute.

	A client ID's lease can expire when it has been at least the lease		A client ID's lease can expire when it has been at least the lease
	interval (lease_time) since the last lease-renewing SEQUENCE		interval (lease_time) since the last lease-renewing SEQUENCE

	operation was sent on any of the client ID's sessions and there must		operation was sent on any of the client ID's sessions and there are
	be no active COMPOUND operations on any such session.		no active COMPOUND operations on any such sessions.
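
A minimal sketch of the server-side bookkeeping implied by these rules follows. The Lease class and its method names are assumptions made for illustration; times are plain seconds supplied by the caller.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    """Hypothetical per-client-ID lease record kept by a server."""
    lease_time: float          # value of the lease_time attribute, seconds
    last_renewal: float        # time the lease was last (re)started
    active_compounds: int = 0  # COMPOUNDs still executing on any session

    def start_compound(self, now: float) -> None:
        """A SEQUENCE on an unexpired lease renews it."""
        self.active_compounds += 1
        self.last_renewal = now

    def finish_compound(self, now: float) -> None:
        """After the reply is sent, expiry is no sooner than now + lease_time."""
        self.active_compounds -= 1
        self.last_renewal = now

    def may_expire(self, now: float) -> bool:
        """The lease can expire only when idle for at least lease_time."""
        if self.active_compounds > 0:
            return False
        return now - self.last_renewal >= self.lease_time
```

The sketch restarts the full interval when the COMPOUND finishes, which is the earliest expiry the text permits. A server is free to be more generous.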

	Because the SEQUENCE operation is the basic mechanism to renew a		Because the SEQUENCE operation is the basic mechanism to renew a
	lease, and because it must be done at least once for each lease		lease, and because it must be done at least once for each lease
	period, it is the natural mechanism whereby the server will inform		period, it is the natural mechanism whereby the server will inform
	the client of changes in the lease status that the client needs to be		the client of changes in the lease status that the client needs to be
	informed of. The client should inspect the status flags		informed of. The client should inspect the status flags
	(sr_status_flags) returned by SEQUENCE and take the appropriate		(sr_status_flags) returned by SEQUENCE and take the appropriate

	action. (See Section 18.46.3 for details).		action (see Section 18.46.3 for details).

	o The status bits SEQ4_STATUS_CB_PATH_DOWN and		o The status bits SEQ4_STATUS_CB_PATH_DOWN and
	SEQ4_STATUS_CB_PATH_DOWN_SESSION indicate problems with the		SEQ4_STATUS_CB_PATH_DOWN_SESSION indicate problems with the
	backchannel which the client may need to address in order to		backchannel which the client may need to address in order to
	receive callback requests.		receive callback requests.

	o The status bits SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRING and		o The status bits SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRING and

	SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRED indicates actual problems with		SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRED indicate problems with GSS
	GSS contexts for the backchannel which the client may have to		contexts for the backchannel which the client may have to address
	address to allow callback requests to be sent to it.		to allow callback requests to be sent to it.

	o The status bits SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED,		o The status bits SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED,
	SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED,		SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED,
	SEQ4_STATUS_ADMIN_STATE_REVOKED, and		SEQ4_STATUS_ADMIN_STATE_REVOKED, and
	SEQ4_STATUS_RECALLABLE_STATE_REVOKED notify the client of lock		SEQ4_STATUS_RECALLABLE_STATE_REVOKED notify the client of lock
	revocation events. When these bits are set, the client should use		revocation events. When these bits are set, the client should use
	TEST_STATEID to find what stateids have been revoked and use		TEST_STATEID to find what stateids have been revoked and use
	FREE_STATEID to acknowledge loss of the associated state.		FREE_STATEID to acknowledge loss of the associated state.

	o The status bit SEQ4_STATUS_LEASE_MOVE indicates that		o The status bit SEQ4_STATUS_LEASE_MOVE indicates that
	responsibility for lease renewal has been transferred to one or		responsibility for lease renewal has been transferred to one or
	more new servers.		more new servers.

	o The status bit SEQ4_STATUS_RESTART_RECLAIM_NEEDED indicates that		o The status bit SEQ4_STATUS_RESTART_RECLAIM_NEEDED indicates that
	due to server restart the client must reclaim locking state.		due to server restart the client must reclaim locking state.


	o The status bit SEQ4_STATUS_BACKCHANNEL_FAULT indicates server has		o The status bit SEQ4_STATUS_BACKCHANNEL_FAULT indicates the server
	encountered an unrecoverable fault with the backchannel (e.g. it		has encountered an unrecoverable fault with the backchannel (e.g.
	has lost track of a sequence id for a slot in the backchannel).		it has lost track of a sequence id for a slot in the backchannel).
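
One way a client might dispatch on sr_status_flags is sketched below. The flag names follow the list above, but the bit positions are illustrative only and are NOT the protocol's values. The action strings are informal summaries, and the handling shown for SEQ4_STATUS_LEASE_MOVE and SEQ4_STATUS_BACKCHANNEL_FAULT is an assumption, since the text above does not prescribe a client action for them.

```python
from enum import IntFlag, auto
from typing import List


class SeqStatus(IntFlag):
    # Bit positions here are illustrative, not the protocol's values.
    CB_PATH_DOWN = auto()
    CB_PATH_DOWN_SESSION = auto()
    CB_GSS_CONTEXTS_EXPIRING = auto()
    CB_GSS_CONTEXTS_EXPIRED = auto()
    EXPIRED_ALL_STATE_REVOKED = auto()
    EXPIRED_SOME_STATE_REVOKED = auto()
    ADMIN_STATE_REVOKED = auto()
    RECALLABLE_STATE_REVOKED = auto()
    LEASE_MOVE = auto()
    RESTART_RECLAIM_NEEDED = auto()
    BACKCHANNEL_FAULT = auto()


S = SeqStatus

# Each entry pairs a group of related bits with an informal client action.
CHECKS = [
    (S.CB_PATH_DOWN | S.CB_PATH_DOWN_SESSION, "address backchannel path"),
    (S.CB_GSS_CONTEXTS_EXPIRING | S.CB_GSS_CONTEXTS_EXPIRED,
     "address backchannel GSS contexts"),
    (S.EXPIRED_ALL_STATE_REVOKED | S.EXPIRED_SOME_STATE_REVOKED
     | S.ADMIN_STATE_REVOKED | S.RECALLABLE_STATE_REVOKED,
     "TEST_STATEID, then FREE_STATEID"),
    (S.LEASE_MOVE, "find servers now responsible for the lease"),
    (S.RESTART_RECLAIM_NEEDED, "reclaim locking state"),
    (S.BACKCHANNEL_FAULT, "handle backchannel fault"),
]


def client_actions(flags: int) -> List[str]:
    """Return the actions suggested by the bits set in sr_status_flags."""
    return [action for mask, action in CHECKS if flags & mask]
```

Grouping related bits under one action keeps the client from repeating the same recovery step when several bits in a group are set at once.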

	8.4. Crash Recovery		8.4. Crash Recovery

	A critical requirement in crash recovery is that both the client and		A critical requirement in crash recovery is that both the client and
	the server know when the other has failed. Additionally, it is		the server know when the other has failed. Additionally, it is
	required that a client sees a consistent view of data across server		required that a client sees a consistent view of data across server
	restarts. All READ and WRITE operations that may have been queued		restarts. All READ and WRITE operations that may have been queued
	within the client or network buffers must wait until the client has		within the client or network buffers must wait until the client has
	successfully recovered the locks protecting the READ and WRITE		successfully recovered the locks protecting the READ and WRITE
	operations. Any that reach the server before the server can safely		operations. Any that reach the server before the server can safely
	determine that the client has recovered enough locking state to be		determine that the client has recovered enough locking state to be
	sure that such operations can be safely processed must be rejected.		sure that such operations can be safely processed must be rejected.
	This will happen because either:		This will happen because either:

	o The state presented is no longer valid since it is associated with		o The state presented is no longer valid since it is associated with

	a now invalid clientid. In this case the client will receive		a now invalid client ID. In this case the client will receive
	either an NFS4ERR_BADSESSION or NFS4ERR_DEADSESSION error, and any		either an NFS4ERR_BADSESSION or NFS4ERR_DEADSESSION error, and any

	attempt to attach a new session to the existing clientid will		attempt to attach a new session to the existing client ID will
	encounter an NFS4ERR_STALE_CLIENTID error.		result in an NFS4ERR_STALE_CLIENTID error.

	o Subsequent recovery of locks may make execution of the operation		o Subsequent recovery of locks may make execution of the operation
	inappropriate (NFS4ERR_GRACE).		inappropriate (NFS4ERR_GRACE).

	8.4.1. Client Failure and Recovery		8.4.1. Client Failure and Recovery

	In the event that a client fails, the server may release the client's		In the event that a client fails, the server may release the client's
	locks when the associated lease has expired. Conflicting locks from		locks when the associated lease has expired. Conflicting locks from
	another client may only be granted after this lease expiration. As		another client may only be granted after this lease expiration. As
	discussed in Section 8.3, when a client has not failed and re-		discussed in Section 8.3, when a client has not failed and re-
	establishes its lease before expiration occurs, requests for		establishes its lease before expiration occurs, requests for
	conflicting locks will not be granted.		conflicting locks will not be granted.

	To minimize client delay upon restart, lock requests are associated		To minimize client delay upon restart, lock requests are associated
	with an instance of the client by a client-supplied verifier. This		with an instance of the client by a client-supplied verifier. This
	verifier is part of the client_owner4 sent in the initial EXCHANGE_ID		verifier is part of the client_owner4 sent in the initial EXCHANGE_ID
	call made by the client. The server returns a client ID as a result		call made by the client. The server returns a client ID as a result
	of the EXCHANGE_ID operation. The client then confirms the use of		of the EXCHANGE_ID operation. The client then confirms the use of
	the client ID by establishing a session associated with that client		the client ID by establishing a session associated with that client

	ID. See Section 18.36.3 for a description of how this is done. All		ID (see Section 18.36.3 for a description of how this is done). All
	locks, including opens, record locks, delegations, and layouts		locks, including opens, record locks, delegations, and layouts
	obtained by sessions using that client ID are associated with that		obtained by sessions using that client ID are associated with that
	client ID.		client ID.

	Since the verifier will be changed by the client upon each		Since the verifier will be changed by the client upon each
	initialization, the server can compare a new verifier to the verifier		initialization, the server can compare a new verifier to the verifier
	associated with currently held locks and determine that they do not		associated with currently held locks and determine that they do not
	match. This signifies the client's new instantiation and subsequent		match. This signifies the client's new instantiation and subsequent

	loss of locking state. As a result, the server is free to release		loss (upon confirmation of the new client ID) of locking state. As a
	all locks held which are associated with the old client ID which was		result, the server is free to release all locks held which are
	derived from the old verifier. At this point conflicting locks from		associated with the old client ID which was derived from the old
	other clients, kept waiting while the lease had not yet expired, can		verifier. At this point conflicting locks from other clients, kept
	be granted. In addition, all stateids associated with the old		waiting while the lease had not yet expired, can be granted. In
	clientid can also be freed, as they are no longer reference-able.		addition, all stateids associated with the old client ID can also be
			freed, as they are no longer reference-able.

	Note that the verifier must have the same uniqueness properties as		Note that the verifier must have the same uniqueness properties as
	the verifier for the COMMIT operation.		the verifier for the COMMIT operation.
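
The verifier comparison can be sketched as follows. The Server and ClientRecord classes are hypothetical simplifications: client owners are plain byte strings, locks are names in a set, and create_session() stands in for the confirmation step. Real EXCHANGE_ID and CREATE_SESSION processing involves considerably more cases.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class ClientRecord:
    verifier: bytes
    client_id: int
    locks: Set[str] = field(default_factory=set)


class Server:
    """Hypothetical server state, reduced to what the comparison needs."""

    def __init__(self) -> None:
        self.confirmed: Dict[bytes, ClientRecord] = {}
        self.pending: Dict[int, Tuple[bytes, ClientRecord]] = {}
        self._next_id = 1

    def exchange_id(self, owner_id: bytes, verifier: bytes) -> int:
        rec = self.confirmed.get(owner_id)
        if rec is not None and rec.verifier == verifier:
            return rec.client_id  # same client instance, same client ID
        # New owner, or a known owner with a new verifier (client restart).
        new = ClientRecord(verifier, self._next_id)
        self._next_id += 1
        self.pending[new.client_id] = (owner_id, new)
        return new.client_id

    def create_session(self, client_id: int) -> Set[str]:
        """Confirm a pending client ID; return the old locks released."""
        if client_id not in self.pending:
            return set()
        owner_id, new = self.pending.pop(client_id)
        old = self.confirmed.get(owner_id)
        released = set(old.locks) if old is not None else set()
        self.confirmed[owner_id] = new  # old record and its locks are gone
        return released
```

In this sketch the old locks remain in place until the new client ID is confirmed, matching the parenthetical in the newer text.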

	8.4.2. Server Failure and Recovery		8.4.2. Server Failure and Recovery

	If the server loses locking state (usually as a result of a restart),		If the server loses locking state (usually as a result of a restart),
	it must allow clients time to discover this fact and re-establish the		it must allow clients time to discover this fact and re-establish the
	lost locking state. The client must be able to re-establish the		lost locking state. The client must be able to re-establish the
	locking state without having the server deny valid requests because		locking state without having the server deny valid requests because

	skipping to change at page 159, line 50		skipping to change at page 160, line 22

	A client can determine that loss of locking state has occurred via		A client can determine that loss of locking state has occurred via
	several methods.		several methods.

	1. When a SEQUENCE (most common) or other operation returns		1. When a SEQUENCE (most common) or other operation returns
	NFS4ERR_BADSESSION, this may mean the session has been destroyed,		NFS4ERR_BADSESSION, this may mean the session has been destroyed,
	but the client ID is still valid. The client sends a		but the client ID is still valid. The client sends a
	CREATE_SESSION request with the client ID to re-establish the		CREATE_SESSION request with the client ID to re-establish the
	session. If CREATE_SESSION fails with NFS4ERR_STALE_CLIENTID,		session. If CREATE_SESSION fails with NFS4ERR_STALE_CLIENTID,
	the client must establish a new client ID (see Section 8.1) and		the client must establish a new client ID (see Section 8.1) and

	re-establish its lock state after the CREATE_SESSION, with the		re-establish its lock state with the new client ID, after the
	new client ID CREATE_SESSION succeeds, (Section 8.4.2.1).		CREATE_SESSION operation succeeds (see Section 8.4.2.1).

	2. When a SEQUENCE (most common) or other operation on a persistent		2. When a SEQUENCE (most common) or other operation on a persistent
	session returns NFS4ERR_DEADSESSION, this indicates that a		session returns NFS4ERR_DEADSESSION, this indicates that a
	session is no longer usable for new, i.e. not satisfied from the		session is no longer usable for new, i.e. not satisfied from the
	reply cache, operations. Once all pending operations are		reply cache, operations. Once all pending operations are
	determined to be either performed before the retry or not		determined to be either performed before the retry or not
	performed, the client sends a CREATE_SESSION request with the		performed, the client sends a CREATE_SESSION request with the
	client ID to re-establish the session. If CREATE_SESSION fails		client ID to re-establish the session. If CREATE_SESSION fails
	with NFS4ERR_STALE_CLIENTID, the client must establish a new		with NFS4ERR_STALE_CLIENTID, the client must establish a new
	client ID (see Section 8.1) and re-establish its lock state after		client ID (see Section 8.1) and re-establish its lock state after

	skipping to change at page 160, line 52		skipping to change at page 161, line 24
	reliably determine (through state persistently maintained across		reliably determine (through state persistently maintained across
	restart instances), that granting any such lock cannot possibly		restart instances), that granting any such lock cannot possibly
	conflict with a subsequent reclaim. When a request is made to obtain		conflict with a subsequent reclaim. When a request is made to obtain
	a new lock (i.e. not a reclaim-type request) during the grace period		a new lock (i.e. not a reclaim-type request) during the grace period
	and such a determination cannot be made, the server must return the		and such a determination cannot be made, the server must return the
	error NFS4ERR_GRACE.		error NFS4ERR_GRACE.

	Once a session is established using the new client ID, the client		Once a session is established using the new client ID, the client
	will use reclaim-type locking requests (e.g. LOCK requests with		will use reclaim-type locking requests (e.g. LOCK requests with
	reclaim set to TRUE and OPEN operations with a claim type of		reclaim set to TRUE and OPEN operations with a claim type of

	CLAIM_PREVIOUS. See Section 9.11) to re-establish its locking state.		CLAIM_PREVIOUS; see Section 9.11) to re-establish its locking state.

	Once this is done, or if there is no such locking state to reclaim,		Once this is done, or if there is no such locking state to reclaim,
	the client sends a global RECLAIM_COMPLETE operation, i.e. one with		the client sends a global RECLAIM_COMPLETE operation, i.e. one with
	the rca_one_fs argument set to FALSE, to indicate that it has		the rca_one_fs argument set to FALSE, to indicate that it has
	reclaimed all of the locking state that it will reclaim. Once a		reclaimed all of the locking state that it will reclaim. Once a
	client sends such a RECLAIM_COMPLETE operation, it may attempt non-		client sends such a RECLAIM_COMPLETE operation, it may attempt non-
	reclaim locking operations, although it may get NFS4ERR_GRACE errors		reclaim locking operations, although it may get NFS4ERR_GRACE errors
	on the operations until the period of special handling is over. See		on the operations until the period of special handling is over. See
	Section 11.7.7 for a discussion of the analogous handling of lock		Section 11.7.7 for a discussion of the analogous handling of lock
	reclamation in the case of file systems transitioning from server to		reclamation in the case of file systems transitioning from server to
	server.		server.
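
The client-side reclaim sequence can be outlined as an ordered list of operations. The tuple representation and the choice to reclaim opens before byte-range locks are assumptions of this sketch. The text above only requires that the global RECLAIM_COMPLETE come last.

```python
from typing import List, Tuple

Op = Tuple[str, str, str]  # (operation, file name, key argument)


def reclaim_plan(open_files: List[str], locked_files: List[str]) -> List[Op]:
    """Build the reclaim-type requests to send after a server restart."""
    plan: List[Op] = []
    for name in open_files:
        plan.append(("OPEN", name, "claim=CLAIM_PREVIOUS"))
    for name in locked_files:
        plan.append(("LOCK", name, "reclaim=TRUE"))
    # Sent even when there was nothing to reclaim.
    plan.append(("RECLAIM_COMPLETE", "", "rca_one_fs=FALSE"))
    return plan
```

Only after the final entry has been sent would the client attempt non-reclaim locking operations, and it would still need to tolerate NFS4ERR_GRACE on them.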

	skipping to change at page 161, line 26		skipping to change at page 161, line 46
	During the grace period, the server must reject READ and WRITE		During the grace period, the server must reject READ and WRITE
	operations and non-reclaim locking requests (i.e. other LOCK and OPEN		operations and non-reclaim locking requests (i.e. other LOCK and OPEN
	operations) with an error of NFS4ERR_GRACE, unless it is able to		operations) with an error of NFS4ERR_GRACE, unless it is able to
	guarantee that these may be done safely, as described below.		guarantee that these may be done safely, as described below.

	The grace period may last until all clients which are known to		The grace period may last until all clients which are known to
	possibly have had locks have done a global RECLAIM_COMPLETE		possibly have had locks have done a global RECLAIM_COMPLETE
	operation, indicating that they have finished reclaiming the locks		operation, indicating that they have finished reclaiming the locks
	they held before the server restart. This means that a client which		they held before the server restart. This means that a client which
	has done a RECLAIM_COMPLETE must be prepared to receive an		has done a RECLAIM_COMPLETE must be prepared to receive an

	NFS4ERR_GRACE when attempting to acquire new locks. The server is		NFS4ERR_GRACE when attempting to acquire new locks. In order for the
	assumed to maintain in stable storage a list of clients which may		server to know that all clients with possible prior lock state have
	have such locks. The server may also terminate the grace period		done a RECLAIM_COMPLETE, the server must maintain in stable storage a
	before all clients have done a global RECLAIM_COMPLETE. The server		list of clients which may have such locks. The server may also
	SHOULD NOT terminate the grace period before a time equal to the		terminate the grace period before all clients have done a global
	lease period in order to give clients an opportunity to find out		RECLAIM_COMPLETE. The server SHOULD NOT terminate the grace period
	about the server restart, as a result of issuing requests on		before a time equal to the lease period in order to give clients an
	associated sessions with a frequency governed by the lease time.		opportunity to find out about the server restart, as a result of
	Note that when a client does not issue such requests (or they are		issuing requests on associated sessions with a frequency governed by
	issued by the client but not received by the server), it is possible		the lease time. Note that when a client does not issue such requests
	for the grace period to expire before the client finds out that the		(or they are issued by the client but not received by the server), it
	server restart has occurred.		is possible for the grace period to expire before the client finds
			out that the server restart has occurred.

	Some additional time in order to allow a client to establish a new		Some additional time in order to allow a client to establish a new
	client ID and session and to effect lock reclaims may be added to the		client ID and session and to effect lock reclaims may be added to the
	lease time. Note that analogous rules apply to file system-specific		lease time. Note that analogous rules apply to file system-specific
	grace periods discussed in Section 11.7.7.		grace periods discussed in Section 11.7.7.
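
The conditions for ending the grace period can be summarized in a short predicate. The function name and parameters are hypothetical. known_clients stands for the list kept in stable storage, and extra_time for the optional additional allowance mentioned above.

```python
from typing import Set


def grace_period_may_end(now: float, restart_time: float, lease_time: float,
                         known_clients: Set[str],
                         completed_clients: Set[str],
                         extra_time: float = 0.0) -> bool:
    """Decide whether the server may stop returning NFS4ERR_GRACE."""
    # Every client that might have held locks has sent a global
    # RECLAIM_COMPLETE, so nothing remains to be reclaimed.
    if known_clients <= completed_clients:
        return True
    # Otherwise wait at least one lease period (plus any allowance) so
    # that clients have a chance to notice the restart.
    return now - restart_time >= lease_time + extra_time
```

The second branch reflects a SHOULD NOT rather than a hard requirement, so a real server might apply additional policy here.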

	If the server can reliably determine that granting a non-reclaim		If the server can reliably determine that granting a non-reclaim
	request will not conflict with reclamation of locks by other clients,		request will not conflict with reclamation of locks by other clients,
	the NFS4ERR_GRACE error does not have to be returned even within the		the NFS4ERR_GRACE error does not have to be returned even within the
	grace period, although NFS4ERR_GRACE must always be returned to		grace period, although NFS4ERR_GRACE must always be returned to

	skipping to change at page 163, line 17		skipping to change at page 163, line 38
	established, refetch the lease_time attribute and use it as the basis		established, refetch the lease_time attribute and use it as the basis
	for lease renewal for the lease associated with that server.		for lease renewal for the lease associated with that server.
	However, the server must establish, for this restart event, a grace		However, the server must establish, for this restart event, a grace
	period at least as long as the lease period for the previous server		period at least as long as the lease period for the previous server
	instantiation. This allows the client state obtained during the		instantiation. This allows the client state obtained during the
	previous server instance to be reliably re-established.		previous server instance to be reliably re-established.

	8.4.3. Network Partitions and Recovery		8.4.3. Network Partitions and Recovery

	If the duration of a network partition is greater than the lease		If the duration of a network partition is greater than the lease

	period provided by the server, the server will have not received a		period provided by the server, the server will not have received a
	lease renewal from the client. If this occurs, the server may free		lease renewal from the client. If this occurs, the server may free
	all locks held for the client, or it may allow the lock state to		all locks held for the client, or it may allow the lock state to
	remain for a considerable period, subject to the constraint that if a		remain for a considerable period, subject to the constraint that if a
	request for a conflicting lock is made, locks associated with an		request for a conflicting lock is made, locks associated with an
	expired lease do not prevent such a conflicting lock from being		expired lease do not prevent such a conflicting lock from being
	granted but MUST be revoked as necessary so as not to interfere with		granted but MUST be revoked as necessary so as not to interfere with
	such conflicting requests.		such conflicting requests.

	If the server chooses to delay freeing of lock state until there is a		If the server chooses to delay freeing of lock state until there is a
	conflict, it may either free all of the client's locks once there is a		conflict, it may either free all of the client's locks once there is a

	skipping to change at page 163, line 42		skipping to change at page 164, line 15

	When the server chooses to free all of a client's lock state, either		When the server chooses to free all of a client's lock state, either
	immediately upon lease expiration, or as a result of the first attempt		immediately upon lease expiration, or as a result of the first attempt
	to obtain a conflicting lock, the server may report the loss of		to obtain a conflicting lock, the server may report the loss of
	lock state in a number of ways.		lock state in a number of ways.

	The server may choose to invalidate the session and the associated		The server may choose to invalidate the session and the associated
	client ID. In this case, when the client is able to communicate with		client ID. In this case, when the client is able to communicate with
	the server, it will receive an NFS4ERR_BADSESSION. Upon attempting		the server, it will receive an NFS4ERR_BADSESSION. Upon attempting
	to create a new session, it would get an NFS4ERR_STALE_CLIENTID.		to create a new session, it would get an NFS4ERR_STALE_CLIENTID.

	Upon creating the new clientid and new session it would attempt to		Upon creating the new client ID and new session it would attempt to
	reclaim locks but not be allowed to do so by the server.		reclaim locks but not be allowed to do so by the server.

	Another possibility is for the server to maintain the session and		Another possibility is for the server to maintain the session and

	clientid but for all stateids held by the client to become invalid or		client ID but for all stateids held by the client to become invalid
	stale. Once the client is able to reach the server after such a		or stale. Once the client is able to reach the server after such a
	network partition, the status returned by the SEQUENCE operation will		network partition, the status returned by the SEQUENCE operation will
	indicate a loss of locking state. (The flag		indicate a loss of locking state. (The flag
	SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED will be set in		SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED will be set in
	sr_status_flags.) In addition, all I/O submitted by the client with		sr_status_flags.) In addition, all I/O submitted by the client with
	the now invalid stateids will fail with the server returning the		the now invalid stateids will fail with the server returning the
	error NFS4ERR_EXPIRED. Once the client learns of the loss of locking		error NFS4ERR_EXPIRED. Once the client learns of the loss of locking
	state, it will suitably notify the applications that held the		state, it will suitably notify the applications that held the
	invalidated locks. The client should then take action to free		invalidated locks. The client should then take action to free
	invalidated stateids, either by establishing a new client ID using a		invalidated stateids, either by establishing a new client ID using a
	new verifier or by doing a FREE_STATEID operation to release each of		new verifier or by doing a FREE_STATEID operation to release each of
	the invalidated stateids.		the invalidated stateids.

	When the server adopts a finer-grained approach to revocation of		When the server adopts a finer-grained approach to revocation of
	locks when leases have expired, only a subset of stateids will		locks when leases have expired, only a subset of stateids will
	normally become invalid during a network partition. When the client		normally become invalid during a network partition. When the client
	is able to communicate with the server after such a network		is able to communicate with the server after such a network
	partition, the status returned by the SEQUENCE operation will		partition, the status returned by the SEQUENCE operation will

	indicate a partial loss of locking state. In addition, operations,		indicate a partial loss of locking state
	including I/O submitted by the client with the now invalid stateids		(SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED). In addition, operations,
			including I/O submitted by the client, with the now invalid stateids
	will fail with the server returning the error NFS4ERR_EXPIRED. Once		will fail with the server returning the error NFS4ERR_EXPIRED. Once
	the client learns of the loss of locking state, it will use the		the client learns of the loss of locking state, it will use the
	TEST_STATEID operation on all of its stateids to determine which		TEST_STATEID operation on all of its stateids to determine which
	locks have been lost and then suitably notify the applications that		locks have been lost and then suitably notify the applications that
	held the invalidated locks. The client can then release the		held the invalidated locks. The client can then release the
	invalidated locking state and acknowledge the revocation of the		invalidated locking state and acknowledge the revocation of the
	associated locks by doing a FREE_STATEID operation on each of the		associated locks by doing a FREE_STATEID operation on each of the
	invalidated stateids.		invalidated stateids.
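
The client-side handling described in the two paragraphs above can be sketched as follows. This is a minimal illustration, not an implementation taken from the draft: the function name plan_recovery, the test_stateid callable (a stand-in for issuing TEST_STATEID), and the action tuples are hypothetical, and the numeric flag values are assumed from the protocol's XDR description and should be checked against it.

```python
# Bits in sr_status_flags. The numeric values are assumptions for this
# sketch; the authoritative values are in the protocol's XDR description.
SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED = 0x00000008
SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED = 0x00000010


def plan_recovery(sr_status_flags, stateids, test_stateid):
    """Return (lost_stateids, actions) for a SEQUENCE reply.

    test_stateid is a hypothetical callable standing in for the
    TEST_STATEID operation; it returns True if the stateid is still valid.
    """
    if sr_status_flags & SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED:
        # Every stateid is invalid; there is nothing to probe.
        lost = list(stateids)
    elif sr_status_flags & SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED:
        # Partial loss: probe each stateid to find the revoked ones.
        lost = [s for s in stateids if not test_stateid(s)]
    else:
        lost = []

    actions = []
    for s in lost:
        # Tell the application that held the lock, then release the
        # stateid, which also acknowledges the revocation.
        actions.append(("NOTIFY_APPLICATION", s))
        actions.append(("FREE_STATEID", s))
    return lost, actions
```

In the all-state-revoked case the text also allows the client to establish a new client ID with a new verifier instead of freeing each stateid; the sketch shows only the FREE_STATEID path.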

	When a network partition is combined with a server restart, there are		When a network partition is combined with a server restart, there are

	skipping to change at page 167, line 12		skipping to change at page 167, line 35
	Regardless of the level and approach to record keeping, the server		Regardless of the level and approach to record keeping, the server
	MUST implement one of the following strategies (which apply to		MUST implement one of the following strategies (which apply to
	reclaims of share reservations, record locks, and delegations):		reclaims of share reservations, record locks, and delegations):

	1. Reject all reclaims with NFS4ERR_NO_GRACE. This is extremely		1. Reject all reclaims with NFS4ERR_NO_GRACE. This is extremely
	unforgiving, but necessary if the server does not record lock		unforgiving, but necessary if the server does not record lock
	state in stable storage.		state in stable storage.

	2. Record sufficient state in stable storage such that all known		2. Record sufficient state in stable storage such that all known
	edge conditions involving server restart, including the two noted		edge conditions involving server restart, including the two noted

	in this section, are detected. Erroneously recognizing a edge		in this section, are detected. It is acceptable to erroneously
	condition and not allowing, when, with sufficient knowledge it		recognize an edge condition and not allow a reclaim, when, with
	would be grantable, acceptable. Note that at this time, it is		sufficient knowledge it would be allowed. Note it is not known
	not known if there are other edge conditions.		if there are other edge conditions.

	In the event that, after a server restart, the server determines		In the event that, after a server restart, the server determines
	that there is unrecoverable damage or corruption to the		that there is unrecoverable damage or corruption to the
	information in stable storage, then for all clients and/or locks		information in stable storage, then for all clients and/or locks
	which may be affected, the server MUST return NFS4ERR_NO_GRACE.		which may be affected, the server MUST return NFS4ERR_NO_GRACE.

	A mandate for the client's handling of the NFS4ERR_NO_GRACE error is		A mandate for the client's handling of the NFS4ERR_NO_GRACE error is
	outside the scope of this specification, since the strategies for		outside the scope of this specification, since the strategies for
	such handling are very dependent on the client's operating		such handling are very dependent on the client's operating
	environment. However, one potential approach is described below.		environment. However, one potential approach is described below.

	skipping to change at page 169, line 4		skipping to change at page 169, line 26

	When determining the time period for the server lease, the usual		When determining the time period for the server lease, the usual
	lease tradeoffs apply. Short leases are good for fast server		lease tradeoffs apply. Short leases are good for fast server
	recovery at a cost of increased operations to effect lease renewal		recovery at a cost of increased operations to effect lease renewal
	(when there are no other operations during the period to effect lease		(when there are no other operations during the period to effect lease
	renewal as a side-effect). Long leases are certainly kinder and		renewal as a side-effect). Long leases are certainly kinder and
	gentler to servers trying to handle very large numbers of clients.		gentler to servers trying to handle very large numbers of clients.
	The number of extra requests to effect lock renewal drops in inverse		The number of extra requests to effect lock renewal drops in inverse
	proportion to the lease time. The disadvantages of long leases		proportion to the lease time. The disadvantages of long leases
	include the possibility of slower recovery after certain failures.		include the possibility of slower recovery after certain failures.


	After server failure, a longer grace period may be required when some		After server failure, a longer grace period may be required when some
	clients do not promptly reclaim their locks and do a global		clients do not promptly reclaim their locks and do a global
	RECLAIM_COMPLETE. In the event of client failure, there can be a		RECLAIM_COMPLETE. In the event of client failure, there can be a
	longer period for leases to expire thus forcing conflicting requests		longer period for leases to expire thus forcing conflicting requests
	to wait.		to wait.


	Long leases are usable if the server is able to store lease state in		Long leases are practical if the server is able to store lease state
	non-volatile memory. Upon recovery, the server can reconstruct the		in non-volatile memory. Upon recovery, the server can reconstruct
	lease state from its non-volatile memory and continue operation with		the lease state from its non-volatile memory and continue operation
	its clients and therefore long leases would not be an issue.		with its clients and therefore long leases would not be an issue.

	8.7. Clocks, Propagation Delay, and Calculating Lease Expiration		8.7. Clocks, Propagation Delay, and Calculating Lease Expiration

	To avoid the need for synchronized clocks, lease times are granted by		To avoid the need for synchronized clocks, lease times are granted by
	the server as a time delta. However, there is a requirement that the		the server as a time delta. However, there is a requirement that the
	client and server clocks do not drift excessively over the duration		client and server clocks do not drift excessively over the duration
	of the lease. There is also the issue of propagation delay across		of the lease. There is also the issue of propagation delay across
	the network which could easily be several hundred milliseconds as		the network which could easily be several hundred milliseconds as
	well as the possibility that requests will be lost and need to be		well as the possibility that requests will be lost and need to be
	retransmitted.		retransmitted.

	To take propagation delay into account, the client should subtract it		To take propagation delay into account, the client should subtract it
	from lease times (e.g. if the client estimates the one-way		from lease times (e.g. if the client estimates the one-way

	propagation delay as 200 msec, then it can assume that the lease is		propagation delay as 200 milliseconds, then it can assume that the
	already 200 msec old when it gets it). In addition, it will take		lease is already 200 milliseconds old when it gets it). In addition,
	another 200 msec to get a response back to the server. So the client		it will take another 200 milliseconds to get a response back to the
	must send a lease renewal or write data back to the server 400 msec		server. So the client must send a lease renewal or write data back
	before the lease would expire.		to the server at least 400 milliseconds before the lease would expire.
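
The arithmetic above can be written out as a short sketch. The function name, parameter names, and the 90-second lease used in the usage note are illustrative assumptions; only the 200 millisecond one-way delay and the resulting 400 millisecond margin come from the text.

```python
def latest_renewal_send_ms(lease_received_at_ms, lease_ms, one_way_delay_ms):
    """Latest client-clock time (in ms) at which a renewal may be sent.

    The lease is assumed to be one_way_delay_ms old on arrival, and the
    renewal needs another one_way_delay_ms to reach the server, so the
    safety margin is twice the estimated one-way delay.
    """
    margin_ms = 2 * one_way_delay_ms
    if margin_ms >= lease_ms:
        raise ValueError("lease is shorter than the round-trip margin")
    return lease_received_at_ms + lease_ms - margin_ms
```

For example, with an assumed 90000 ms lease received at time 0 and a 200 ms one-way delay, latest_renewal_send_ms(0, 90000, 200) is 89600, that is, 400 ms before the nominal expiration. Integer milliseconds are used to avoid floating-point rounding in the comparison.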

	The server's lease period configuration should take into account the		The server's lease period configuration should take into account the
	network distance of the clients that will be accessing the server's		network distance of the clients that will be accessing the server's
	resources. It is expected that the lease period will take into		resources. It is expected that the lease period will take into
	account the network propagation delays and other network delay		account the network propagation delays and other network delay
	factors for the client population. Since the protocol does not allow		factors for the client population. Since the protocol does not allow
	for an automatic method to determine an appropriate lease period, the		for an automatic method to determine an appropriate lease period, the
	server's administrator may have to tune the lease period.		server's administrator may have to tune the lease period.

	8.8. Obsolete Locking Infrastructure From NFSv4.0		8.8. Obsolete Locking Infrastructure From NFSv4.0

	skipping to change at page 170, line 10		skipping to change at page 170, line 32

	The following NFSv4.0 operations MUST NOT be implemented in NFSv4.1.		The following NFSv4.0 operations MUST NOT be implemented in NFSv4.1.
	The server MUST return NFS4ERR_NOTSUPP if these operations are found		The server MUST return NFS4ERR_NOTSUPP if these operations are found
	in an NFSv4.1 COMPOUND.		in an NFSv4.1 COMPOUND.

	o SETCLIENTID since its function has been replaced by EXCHANGE_ID.		o SETCLIENTID since its function has been replaced by EXCHANGE_ID.

	o SETCLIENTID_CONFIRM since client ID confirmation now happens by		o SETCLIENTID_CONFIRM since client ID confirmation now happens by
	means of CREATE_SESSION.		means of CREATE_SESSION.


	o OPEN_CONFIRM because OPENs no longer require confirmation to		o OPEN_CONFIRM because state-owner-based seqids have been replaced
	establish an owner-based sequence value.		by the sequence id in the SEQUENCE operation.

	o RELEASE_LOCKOWNER because lock-owners with no associated locks do		o RELEASE_LOCKOWNER because lock-owners with no associated locks do
	not have any sequence-related state and so can be deleted by the		not have any sequence-related state and so can be deleted by the
	server at will.		server at will.

	o RENEW because every SEQUENCE operation for a session causes lease		o RENEW because every SEQUENCE operation for a session causes lease

	renewal, making a separate operation useless.		renewal, making a separate operation superfluous.
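
As an illustration of the rule above, a server-side check over the operations in a COMPOUND might look like the following sketch. The function name, the use of operation names as strings, and the returned (status, index) pair are hypothetical choices for this example; the value 10004 for NFS4ERR_NOTSUPP is assumed from the NFSv4 error code definitions.

```python
NFS4_OK = 0
NFS4ERR_NOTSUPP = 10004  # assumed numeric value; check the XDR description

# NFSv4.0 operations that MUST NOT be implemented in NFSv4.1.
OBSOLETE_IN_V41 = frozenset({
    "SETCLIENTID",
    "SETCLIENTID_CONFIRM",
    "OPEN_CONFIRM",
    "RELEASE_LOCKOWNER",
    "RENEW",
})


def check_compound_ops(op_names):
    """Return (status, index) for a list of operation names.

    index is the position of the first obsolete operation, or None when
    every operation is acceptable under this check.
    """
    for index, name in enumerate(op_names):
        if name in OBSOLETE_IN_V41:
            return NFS4ERR_NOTSUPP, index
    return NFS4_OK, None
```

A real server would fold this check into its per-operation dispatch, so that operations preceding the obsolete one are still processed in order; the sketch only identifies where the error would arise.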

	Also, there are a number of fields, present in existing operations		Also, there are a number of fields, present in existing operations
	related to locking that have no use in minor version one. They were		related to locking that have no use in minor version one. They were
	used in minor version zero to perform functions now provided in a		used in minor version zero to perform functions now provided in a
	different fashion.		different fashion.

	o Sequence ids used to sequence requests for a given state-owner and		o Sequence ids used to sequence requests for a given state-owner and
	to provide retry protection, now provided via sessions.		to provide retry protection, now provided via sessions.

	o Client IDs used to identify the client associated with a given		o Client IDs used to identify the client associated with a given

End of changes. 66 change blocks.
	140 lines changed or deleted		160 lines changed or added