Diff: draft-ietf-nfsv4-minorversion1-21.txt - draft-ietf-nfsv4-minorversion1-22.txt

	draft-ietf-nfsv4-minorversion1-21.txt		draft-ietf-nfsv4-minorversion1-22.txt

	NFSv4 S. Shepler		NFSv4 S. Shepler
	Internet-Draft M. Eisler		Internet-Draft M. Eisler
	Intended status: Standards Track D. Noveck		Intended status: Standards Track D. Noveck

	Expires: August 28, 2008 Editors		Expires: September 14, 2008 Editors
	February 25, 2008		March 13, 2008

	NFS Version 4 Minor Version 1		NFS Version 4 Minor Version 1

	draft-ietf-nfsv4-minorversion1-21.txt		draft-ietf-nfsv4-minorversion1-22.txt

	Status of this Memo		Status of this Memo

	By submitting this Internet-Draft, each author represents that any		By submitting this Internet-Draft, each author represents that any
	applicable patent or other IPR claims of which he or she is aware		applicable patent or other IPR claims of which he or she is aware
	have been or will be disclosed, and any of which he or she becomes		have been or will be disclosed, and any of which he or she becomes
	aware will be disclosed, in accordance with Section 6 of BCP 79.		aware will be disclosed, in accordance with Section 6 of BCP 79.

	Internet-Drafts are working documents of the Internet Engineering		Internet-Drafts are working documents of the Internet Engineering
	Task Force (IETF), its areas, and its working groups. Note that		Task Force (IETF), its areas, and its working groups. Note that

	skipping to change at page 1, line 35		skipping to change at page 1, line 35
	and may be updated, replaced, or obsoleted by other documents at any		and may be updated, replaced, or obsoleted by other documents at any
	time. It is inappropriate to use Internet-Drafts as reference		time. It is inappropriate to use Internet-Drafts as reference
	material or to cite them other than as "work in progress."		material or to cite them other than as "work in progress."

	The list of current Internet-Drafts can be accessed at		The list of current Internet-Drafts can be accessed at
	http://www.ietf.org/ietf/1id-abstracts.txt.		http://www.ietf.org/ietf/1id-abstracts.txt.

	The list of Internet-Draft Shadow Directories can be accessed at		The list of Internet-Draft Shadow Directories can be accessed at
	http://www.ietf.org/shadow.html.		http://www.ietf.org/shadow.html.


	This Internet-Draft will expire on August 28, 2008.		This Internet-Draft will expire on September 14, 2008.

	Copyright Notice		Copyright Notice

	Copyright (C) The IETF Trust (2008).		Copyright (C) The IETF Trust (2008).

	Abstract		Abstract

	This Internet-Draft describes NFS version 4 minor version one,		This Internet-Draft describes NFS version 4 minor version one,
	including features retained from the base protocol and protocol		including features retained from the base protocol and protocol
	extensions made subsequently. Major extensions introduced in NFS		extensions made subsequently. Major extensions introduced in NFS

	skipping to change at page 6, line 15		skipping to change at page 6, line 15
	11.7.6. The Change Attribute and File System Transitions . . 229		11.7.6. The Change Attribute and File System Transitions . . 229
	11.7.7. Lock State and File System Transitions . . . . . . . 230		11.7.7. Lock State and File System Transitions . . . . . . . 230
	11.7.8. Write Verifiers and File System Transitions . . . . 234		11.7.8. Write Verifiers and File System Transitions . . . . 234
	11.7.9. Readdir Cookies and Verifiers and File System		11.7.9. Readdir Cookies and Verifiers and File System
	Transitions . . . . . . . . . . . . . . . . . . . . 234		Transitions . . . . . . . . . . . . . . . . . . . . 234
	11.7.10. File System Data and File System Transitions . . . . 234		11.7.10. File System Data and File System Transitions . . . . 234
	11.8. Effecting File System Referrals . . . . . . . . . . . . 236		11.8. Effecting File System Referrals . . . . . . . . . . . . 236
	11.8.1. Referral Example (LOOKUP) . . . . . . . . . . . . . 236		11.8.1. Referral Example (LOOKUP) . . . . . . . . . . . . . 236
	11.8.2. Referral Example (READDIR) . . . . . . . . . . . . . 240		11.8.2. Referral Example (READDIR) . . . . . . . . . . . . . 240
	11.9. The Attribute fs_locations . . . . . . . . . . . . . . . 242		11.9. The Attribute fs_locations . . . . . . . . . . . . . . . 242

	11.10. The Attribute fs_locations_info . . . . . . . . . . . . 244		11.10. The Attribute fs_locations_info . . . . . . . . . . . . 245
	11.10.1. The fs_locations_server4 Structure . . . . . . . . . 248		11.10.1. The fs_locations_server4 Structure . . . . . . . . . 248
	11.10.2. The fs_locations_info4 Structure . . . . . . . . . . 253		11.10.2. The fs_locations_info4 Structure . . . . . . . . . . 253
	11.10.3. The fs_locations_item4 Structure . . . . . . . . . . 254		11.10.3. The fs_locations_item4 Structure . . . . . . . . . . 254
	11.11. The Attribute fs_status . . . . . . . . . . . . . . . . 256		11.11. The Attribute fs_status . . . . . . . . . . . . . . . . 256
	12. Parallel NFS (pNFS) . . . . . . . . . . . . . . . . . . . . . 260		12. Parallel NFS (pNFS) . . . . . . . . . . . . . . . . . . . . . 260
	12.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 260		12.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 260
	12.2. pNFS Definitions . . . . . . . . . . . . . . . . . . . . 262		12.2. pNFS Definitions . . . . . . . . . . . . . . . . . . . . 262
	12.2.1. Metadata . . . . . . . . . . . . . . . . . . . . . . 262		12.2.1. Metadata . . . . . . . . . . . . . . . . . . . . . . 262
	12.2.2. Metadata Server . . . . . . . . . . . . . . . . . . 262		12.2.2. Metadata Server . . . . . . . . . . . . . . . . . . 262
	12.2.3. pNFS Client . . . . . . . . . . . . . . . . . . . . 263		12.2.3. pNFS Client . . . . . . . . . . . . . . . . . . . . 263

	skipping to change at page 6, line 46		skipping to change at page 6, line 46
	12.5.1. Guarantees Provided by Layouts . . . . . . . . . . . 267		12.5.1. Guarantees Provided by Layouts . . . . . . . . . . . 267
	12.5.2. Getting a Layout . . . . . . . . . . . . . . . . . . 269		12.5.2. Getting a Layout . . . . . . . . . . . . . . . . . . 269
	12.5.3. Layout Stateid . . . . . . . . . . . . . . . . . . . 270		12.5.3. Layout Stateid . . . . . . . . . . . . . . . . . . . 270
	12.5.4. Committing a Layout . . . . . . . . . . . . . . . . 271		12.5.4. Committing a Layout . . . . . . . . . . . . . . . . 271
	12.5.5. Recalling a Layout . . . . . . . . . . . . . . . . . 274		12.5.5. Recalling a Layout . . . . . . . . . . . . . . . . . 274
	12.5.6. Revoking Layouts . . . . . . . . . . . . . . . . . . 281		12.5.6. Revoking Layouts . . . . . . . . . . . . . . . . . . 281
	12.5.7. Metadata Server Write Propagation . . . . . . . . . 281		12.5.7. Metadata Server Write Propagation . . . . . . . . . 281
	12.6. pNFS Mechanics . . . . . . . . . . . . . . . . . . . . . 281		12.6. pNFS Mechanics . . . . . . . . . . . . . . . . . . . . . 281
	12.7. Recovery . . . . . . . . . . . . . . . . . . . . . . . . 283		12.7. Recovery . . . . . . . . . . . . . . . . . . . . . . . . 283
	12.7.1. Recovery from Client Restart . . . . . . . . . . . . 283		12.7.1. Recovery from Client Restart . . . . . . . . . . . . 283

	12.7.2. Dealing with Lease Expiration on the Client . . . . 283		12.7.2. Dealing with Lease Expiration on the Client . . . . 284
	12.7.3. Dealing with Loss of Layout State on the Metadata		12.7.3. Dealing with Loss of Layout State on the Metadata

	Server . . . . . . . . . . . . . . . . . . . . . . . 284		Server . . . . . . . . . . . . . . . . . . . . . . . 285
	12.7.4. Recovery from Metadata Server Restart . . . . . . . 285		12.7.4. Recovery from Metadata Server Restart . . . . . . . 285
	12.7.5. Operations During Metadata Server Grace Period . . . 287		12.7.5. Operations During Metadata Server Grace Period . . . 287

	12.7.6. Storage Device Recovery . . . . . . . . . . . . . . 287		12.7.6. Storage Device Recovery . . . . . . . . . . . . . . 288
	12.8. Metadata and Storage Device Roles . . . . . . . . . . . 288		12.8. Metadata and Storage Device Roles . . . . . . . . . . . 288
	12.9. Security Considerations for pNFS . . . . . . . . . . . . 288		12.9. Security Considerations for pNFS . . . . . . . . . . . . 288
	13. PNFS: NFSv4.1 File Layout Type . . . . . . . . . . . . . . . 289		13. PNFS: NFSv4.1 File Layout Type . . . . . . . . . . . . . . . 289

	13.1. Client ID and Session Considerations . . . . . . . . . . 289		13.1. Client ID and Session Considerations . . . . . . . . . . 290
	13.1.1. Sessions Considerations for Data Servers . . . . . . 292		13.1.1. Sessions Considerations for Data Servers . . . . . . 292
	13.2. File Layout Definitions . . . . . . . . . . . . . . . . 292		13.2. File Layout Definitions . . . . . . . . . . . . . . . . 292
	13.3. File Layout Data Types . . . . . . . . . . . . . . . . . 293		13.3. File Layout Data Types . . . . . . . . . . . . . . . . . 293
	13.4. Interpreting the File Layout . . . . . . . . . . . . . . 297		13.4. Interpreting the File Layout . . . . . . . . . . . . . . 297
	13.4.1. Determining the Stripe Unit Number . . . . . . . . . 297		13.4.1. Determining the Stripe Unit Number . . . . . . . . . 297
	13.4.2. Interpreting the File Layout Using Sparse Packing . 297		13.4.2. Interpreting the File Layout Using Sparse Packing . 297
	13.4.3. Interpreting the File Layout Using Dense Packing . . 300		13.4.3. Interpreting the File Layout Using Dense Packing . . 300
	13.4.4. Sparse and Dense Stripe Unit Packing . . . . . . . . 302		13.4.4. Sparse and Dense Stripe Unit Packing . . . . . . . . 302
	13.5. Data Server Multipathing . . . . . . . . . . . . . . . . 304		13.5. Data Server Multipathing . . . . . . . . . . . . . . . . 304
	13.6. Operations Sent to NFSv4.1 Data Servers . . . . . . . . 305		13.6. Operations Sent to NFSv4.1 Data Servers . . . . . . . . 305

	skipping to change at page 8, line 31		skipping to change at page 8, line 31
	18.11. Operation 13: LOCKT - Test For Lock . . . . . . . . . . 406		18.11. Operation 13: LOCKT - Test For Lock . . . . . . . . . . 406
	18.12. Operation 14: LOCKU - Unlock File . . . . . . . . . . . 408		18.12. Operation 14: LOCKU - Unlock File . . . . . . . . . . . 408
	18.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . . . 409		18.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . . . 409
	18.14. Operation 16: LOOKUPP - Lookup Parent Directory . . . . 411		18.14. Operation 16: LOOKUPP - Lookup Parent Directory . . . . 411
	18.15. Operation 17: NVERIFY - Verify Difference in		18.15. Operation 17: NVERIFY - Verify Difference in
	Attributes . . . . . . . . . . . . . . . . . . . . . . . 412		Attributes . . . . . . . . . . . . . . . . . . . . . . . 412
	18.16. Operation 18: OPEN - Open a Regular File . . . . . . . . 413		18.16. Operation 18: OPEN - Open a Regular File . . . . . . . . 413
	18.17. Operation 19: OPENATTR - Open Named Attribute		18.17. Operation 19: OPENATTR - Open Named Attribute
	Directory . . . . . . . . . . . . . . . . . . . . . . . 432		Directory . . . . . . . . . . . . . . . . . . . . . . . 432
	18.18. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access . 433		18.18. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access . 433

	18.19. Operation 22: PUTFH - Set Current Filehandle . . . . . . 434		18.19. Operation 22: PUTFH - Set Current Filehandle . . . . . . 435
	18.20. Operation 23: PUTPUBFH - Set Public Filehandle . . . . . 435		18.20. Operation 23: PUTPUBFH - Set Public Filehandle . . . . . 435
	18.21. Operation 24: PUTROOTFH - Set Root Filehandle . . . . . 437		18.21. Operation 24: PUTROOTFH - Set Root Filehandle . . . . . 437
	18.22. Operation 25: READ - Read from File . . . . . . . . . . 437		18.22. Operation 25: READ - Read from File . . . . . . . . . . 437
	18.23. Operation 26: READDIR - Read Directory . . . . . . . . . 440		18.23. Operation 26: READDIR - Read Directory . . . . . . . . . 440

	18.24. Operation 27: READLINK - Read Symbolic Link . . . . . . 443		18.24. Operation 27: READLINK - Read Symbolic Link . . . . . . 444
	18.25. Operation 28: REMOVE - Remove File System Object . . . . 444		18.25. Operation 28: REMOVE - Remove File System Object . . . . 445
	18.26. Operation 29: RENAME - Rename Directory Entry . . . . . 447		18.26. Operation 29: RENAME - Rename Directory Entry . . . . . 447

	18.27. Operation 31: RESTOREFH - Restore Saved Filehandle . . . 450		18.27. Operation 31: RESTOREFH - Restore Saved Filehandle . . . 451
	18.28. Operation 32: SAVEFH - Save Current Filehandle . . . . . 451		18.28. Operation 32: SAVEFH - Save Current Filehandle . . . . . 452
	18.29. Operation 33: SECINFO - Obtain Available Security . . . 452		18.29. Operation 33: SECINFO - Obtain Available Security . . . 452
	18.30. Operation 34: SETATTR - Set Attributes . . . . . . . . . 455		18.30. Operation 34: SETATTR - Set Attributes . . . . . . . . . 455
	18.31. Operation 37: VERIFY - Verify Same Attributes . . . . . 458		18.31. Operation 37: VERIFY - Verify Same Attributes . . . . . 458
	18.32. Operation 38: WRITE - Write to File . . . . . . . . . . 459		18.32. Operation 38: WRITE - Write to File . . . . . . . . . . 459
	18.33. Operation 40: BACKCHANNEL_CTL - Backchannel control . . 464		18.33. Operation 40: BACKCHANNEL_CTL - Backchannel control . . 464
	18.34. Operation 41: BIND_CONN_TO_SESSION . . . . . . . . . . . 465		18.34. Operation 41: BIND_CONN_TO_SESSION . . . . . . . . . . . 465
	18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID . . . 468		18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID . . . 468
	18.36. Operation 43: CREATE_SESSION - Create New Session and		18.36. Operation 43: CREATE_SESSION - Create New Session and
	Confirm Client ID . . . . . . . . . . . . . . . . . . . 484		Confirm Client ID . . . . . . . . . . . . . . . . . . . 484
	18.37. Operation 44: DESTROY_SESSION - Destroy existing		18.37. Operation 44: DESTROY_SESSION - Destroy existing

	skipping to change at page 113, line 37		skipping to change at page 113, line 37
	The fs_layout_type attribute (see Section 3.3.13) applies to a file		The fs_layout_type attribute (see Section 3.3.13) applies to a file
	system and indicates what layout types are supported by the file		system and indicates what layout types are supported by the file
	system. When the client encounters a new fsid, the client SHOULD		system. When the client encounters a new fsid, the client SHOULD
	obtain the value for the fs_layout_type attribute associated with the		obtain the value for the fs_layout_type attribute associated with the
	new file system. This attribute is used by the client to determine		new file system. This attribute is used by the client to determine
	if the layout types supported by the server match any of the client's		if the layout types supported by the server match any of the client's
	supported layout types.		supported layout types.

	5.11.2. Attribute 66: layout_alignment		5.11.2. Attribute 66: layout_alignment


	When a client has layouts for a file system, the layout_alignment		When a client holds layouts on files of a file system, the
	attribute indicates the preferred alignment for I/O to files on that		layout_alignment attribute indicates the preferred alignment for I/O
	file system. Where possible, the client should send READ and WRITE		to files on that file system. Where possible, the client should send
	operations with offsets that are whole multiples of the		READ and WRITE operations with offsets that are whole multiples of
	layout_alignment attribute.		the layout_alignment attribute.

	5.11.3. Attribute 65: layout_blksize		5.11.3. Attribute 65: layout_blksize


	When a client has layouts for a file system, the layout_blksize		When a client holds layouts on files of a file system, the
	attribute indicates the preferred block size for I/O to files on that		layout_blksize attribute indicates the preferred block size for I/O
	file system. Where possible, the client should send READ operations		to files on that file system. Where possible, the client should send
	with a count argument that is a whole multiple of layout_blksize, and		READ operations with a count argument that is a whole multiple of
	WRITE operations with a data argument of size that is a whole		layout_blksize, and WRITE operations with a data argument of size
	multiple of layout_blksize.		that is a whole multiple of layout_blksize.
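
As a minimal sketch of the two preferences above (not part of either draft), the following Python helper widens a READ request so that its offset is a whole multiple of layout_alignment and its count is a whole multiple of layout_blksize. The function name align_read and its parameters are hypothetical. Widening is only shown for READ; a WRITE cannot be widened this way without first reading the surrounding data.

```python
from typing import Tuple


def align_read(offset: int, count: int, alignment: int, blksize: int) -> Tuple[int, int]:
    """Return (offset, count) widened to honor layout_alignment and layout_blksize.

    alignment and blksize stand for the values a client would have fetched
    from the layout_alignment and layout_blksize attributes (assumed > 0).
    """
    if alignment <= 0 or blksize <= 0:
        raise ValueError("alignment and blksize must be positive")
    # Round the starting offset down to a whole multiple of the alignment.
    start = offset - (offset % alignment)
    # The widened range must still cover the last requested byte.
    length = (offset + count) - start
    # Round the length up to a whole number of blocks (ceiling division).
    blocks = -(-length // blksize)
    return start, blocks * blksize
```

For instance, with an alignment of 4096 and a block size of 8192, a request for 3000 bytes at offset 5000 becomes a request for 8192 bytes at offset 4096.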

	5.11.4. Attribute 63: layout_hint		5.11.4. Attribute 63: layout_hint

	The layout_hint attribute (see Section 3.3.19) may be set on newly		The layout_hint attribute (see Section 3.3.19) may be set on newly
	created files to influence the metadata server's choice for the		created files to influence the metadata server's choice for the
	file's layout. If possible, this attribute is one of those set in		file's layout. If possible, this attribute is one of those set in
	the initial attributes within the OPEN operation. The metadata		the initial attributes within the OPEN operation. The metadata
	server may choose to ignore this attribute. The layout_hint		server may choose to ignore this attribute. The layout_hint
	attribute is a sub-set of the layout structure returned by LAYOUTGET.		attribute is a sub-set of the layout structure returned by LAYOUTGET.
	For example, instead of specifying particular devices, this would be		For example, instead of specifying particular devices, this would be

	skipping to change at page 117, line 26		skipping to change at page 117, line 26
	of the ACE4_WRITE_RETENTION_HOLD ACL permission. The enabling of		of the ACE4_WRITE_RETENTION_HOLD ACL permission. The enabling of
	administration retention holds does not prevent the enabling of		administration retention holds does not prevent the enabling of
	event-based or non-event-based retention.		event-based or non-event-based retention.

	6. Access Control Attributes		6. Access Control Attributes

	Access Control Lists (ACLs) are file attributes that specify fine		Access Control Lists (ACLs) are file attributes that specify fine
	grained access control. This chapter covers the "acl", "dacl",		grained access control. This chapter covers the "acl", "dacl",
	"sacl", "aclsupport", "mode", "mode_set_masked" file attributes, and		"sacl", "aclsupport", "mode", "mode_set_masked" file attributes, and
	their interactions. Note that file attributes may apply to any file		their interactions. Note that file attributes may apply to any file

	system objects.		system object.

	6.1. Goals		6.1. Goals

	ACLs and modes represent two well established models for specifying		ACLs and modes represent two well established models for specifying
	permissions. This chapter specifies requirements that attempt to		permissions. This chapter specifies requirements that attempt to
	meet the following goals:		meet the following goals:

	o If a server supports the mode attribute, it should provide		o If a server supports the mode attribute, it should provide
	reasonable semantics to clients that only set and retrieve the		reasonable semantics to clients that only set and retrieve the
	mode attribute.		mode attribute.

	skipping to change at page 122, line 28		skipping to change at page 122, line 28
	const ACE4_WRITE_RETENTION_HOLD = 0x00000400;		const ACE4_WRITE_RETENTION_HOLD = 0x00000400;

	const ACE4_DELETE = 0x00010000;		const ACE4_DELETE = 0x00010000;
	const ACE4_READ_ACL = 0x00020000;		const ACE4_READ_ACL = 0x00020000;
	const ACE4_WRITE_ACL = 0x00040000;		const ACE4_WRITE_ACL = 0x00040000;
	const ACE4_WRITE_OWNER = 0x00080000;		const ACE4_WRITE_OWNER = 0x00080000;
	const ACE4_SYNCHRONIZE = 0x00100000;		const ACE4_SYNCHRONIZE = 0x00100000;

	Note that some masks have coincident values, for example,		Note that some masks have coincident values, for example,
	ACE4_READ_DATA and ACE4_LIST_DIRECTORY. The mask entries		ACE4_READ_DATA and ACE4_LIST_DIRECTORY. The mask entries

	ACE4_LIST_DIRECTORY, ACE4_ADD_SUBDIRECTORY, and ACE4_TRAVERSE are		ACE4_LIST_DIRECTORY, ACE4_ADD_FILE, and ACE4_ADD_SUBDIRECTORY are
	intended to be used with directory objects, while ACE4_READ_DATA,		intended to be used with directory objects, while ACE4_READ_DATA,

	ACE4_WRITE_DATA, and ACE4_EXECUTE are intended to be used with non-		ACE4_WRITE_DATA, and ACE4_APPEND_DATA are intended to be used with
	directory objects.		non-directory objects.
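
The coincident mask values can be illustrated with a short Python sketch (not part of either draft). The six constants below do not appear in this excerpt; their values are taken from the NFSv4.1 XDR definitions and should be checked against the draft in use. The helper describe_mask and the two lookup tables are hypothetical names introduced only for this illustration.

```python
# Mask bits with coincident values: the same bit has a directory
# meaning and a non-directory meaning. Values are quoted from the
# NFSv4.1 XDR and are not shown in this excerpt.
ACE4_READ_DATA = 0x00000001
ACE4_LIST_DIRECTORY = 0x00000001
ACE4_WRITE_DATA = 0x00000002
ACE4_ADD_FILE = 0x00000002
ACE4_APPEND_DATA = 0x00000004
ACE4_ADD_SUBDIRECTORY = 0x00000004

DIRECTORY_NAMES = {
    ACE4_LIST_DIRECTORY: "ACE4_LIST_DIRECTORY",
    ACE4_ADD_FILE: "ACE4_ADD_FILE",
    ACE4_ADD_SUBDIRECTORY: "ACE4_ADD_SUBDIRECTORY",
}
FILE_NAMES = {
    ACE4_READ_DATA: "ACE4_READ_DATA",
    ACE4_WRITE_DATA: "ACE4_WRITE_DATA",
    ACE4_APPEND_DATA: "ACE4_APPEND_DATA",
}


def describe_mask(mask: int, is_directory: bool) -> list:
    """Name the low three mask bits according to the object type."""
    names = DIRECTORY_NAMES if is_directory else FILE_NAMES
    # Walk the bits in ascending order so the output is deterministic.
    return [name for bit, name in sorted(names.items()) if mask & bit]
```

The same mask value 0x5 therefore reads as ACE4_LIST_DIRECTORY plus ACE4_ADD_SUBDIRECTORY on a directory, and as ACE4_READ_DATA plus ACE4_APPEND_DATA on any other object.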

	6.2.1.3.1. Discussion of Mask Attributes		6.2.1.3.1. Discussion of Mask Attributes

	ACE4_READ_DATA		ACE4_READ_DATA

	Operation(s) affected:		Operation(s) affected:

	READ		READ

	OPEN		OPEN

	skipping to change at page 149, line 39		skipping to change at page 149, line 39
	With the exception of special stateids, to be discussed later, each		With the exception of special stateids, to be discussed later, each
	stateid represents locking objects of one of a set of types defined		stateid represents locking objects of one of a set of types defined
	by the NFSv4.1 protocol. Note that in all these cases, where we		by the NFSv4.1 protocol. Note that in all these cases, where we
	speak of guarantee, it is understood there are situations such as a		speak of guarantee, it is understood there are situations such as a
	client restart, or lock revocation, that allow the guarantee to be		client restart, or lock revocation, that allow the guarantee to be
	voided.		voided.

	o Stateids may represent opens of files.		o Stateids may represent opens of files.

	Each stateid in this case represents the open for a given		Each stateid in this case represents the open for a given

	clientid/open-owner/filehandle triple. Such tateids are subject		clientid/open-owner/filehandle triple. Such stateids are subject
	to change (with consequent bumping of the seqid) in response to		to change (with consequent bumping of the seqid) in response to
	OPENs that result in upgrade and OPEN_DOWNGRADE operations.		OPENs that result in upgrade and OPEN_DOWNGRADE operations.

	o Stateids may represent sets of byte-range locks.		o Stateids may represent sets of byte-range locks.

	All locks held on a particular file by a particular owner and all		All locks held on a particular file by a particular owner and all
	gotten under the aegis of a particular open file are associated		gotten under the aegis of a particular open file are associated
	with a single stateid with the seqid being bumped as LOCK and		with a single stateid with the seqid being bumped as LOCK and
	LOCKU operations affect that set of locks.		LOCKU operations affect that set of locks.


	skipping to change at page 154, line 24		skipping to change at page 154, line 24
	analyzed by this procedure.		analyzed by this procedure.

	If server restart has resulted in an invalid client ID or a sessionid		If server restart has resulted in an invalid client ID or a sessionid
	which is invalid, SEQUENCE will return an error and the operation		which is invalid, SEQUENCE will return an error and the operation
	that takes a stateid as an argument will never be processed.		that takes a stateid as an argument will never be processed.

	If there has been a server restart where there is a persistent		If there has been a server restart where there is a persistent
	session, and all leased state has been lost, then the session in		session, and all leased state has been lost, then the session in
	question will, although valid, be marked as dead, and any operation		question will, although valid, be marked as dead, and any operation
	not satisfied by means of the reply cache will receive the error		not satisfied by means of the reply cache will receive the error

	NFS4ERR_DEADSESSION, and thus not be processed as indicated below		NFS4ERR_DEADSESSION, and thus not be processed as indicated below.
	either.

	When a stateid is being tested, and the "other" field is all zeros or		When a stateid is being tested, and the "other" field is all zeros or
	all ones, a check that the "other" and "seqid" fields match a defined		all ones, a check that the "other" and "seqid" fields match a defined
	combination for a special stateid is done and the results determined		combination for a special stateid is done and the results determined
	as follows:		as follows:

	o If the "other" and "seqid" fields do not match a defined		o If the "other" and "seqid" fields do not match a defined
	combination associated with a special stateid, the error		combination associated with a special stateid, the error
	NFS4ERR_BAD_STATEID is returned.		NFS4ERR_BAD_STATEID is returned.
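
The special-stateid check described above can be sketched in Python as follows (not part of either draft). The three defined combinations in the table are not listed in this excerpt; they are assumed from the NFSv4.1 definition of special stateids and should be verified against the draft. The function name check_special_stateid and the result strings are hypothetical.

```python
NFS4_OTHER_SIZE = 12           # size in bytes of the "other" field (assumed)
NFS4_UINT32_MAX = 0xFFFFFFFF

ALL_ZEROS = bytes(NFS4_OTHER_SIZE)
ALL_ONES = b"\xff" * NFS4_OTHER_SIZE

# Assumed table of defined (other, seqid) combinations.
SPECIAL_STATEIDS = {
    (ALL_ZEROS, 0): "anonymous",
    (ALL_ONES, NFS4_UINT32_MAX): "read-bypass",
    (ALL_ZEROS, 1): "current",
}


def check_special_stateid(other: bytes, seqid: int) -> str:
    """Classify a stateid whose "other" field may mark it as special."""
    if other not in (ALL_ZEROS, ALL_ONES):
        # Not all zeros or all ones: the special-stateid check does not apply.
        return "not-special"
    # All zeros or all ones: the pair must match a defined combination.
    return SPECIAL_STATEIDS.get((other, seqid), "NFS4ERR_BAD_STATEID")
```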


	skipping to change at page 158, line 15		skipping to change at page 158, line 14
	SEQ4_STATUS_RECALLABLE_STATE_REVOKED notify the client of lock		SEQ4_STATUS_RECALLABLE_STATE_REVOKED notify the client of lock
	revocation events. When these bits are set, the client should use		revocation events. When these bits are set, the client should use
	TEST_STATEID to find what stateids have been revoked and use		TEST_STATEID to find what stateids have been revoked and use
	FREE_STATEID to acknowledge loss of the associated state.		FREE_STATEID to acknowledge loss of the associated state.

	o The status bit SEQ4_STATUS_LEASE_MOVE indicates that		o The status bit SEQ4_STATUS_LEASE_MOVE indicates that
	responsibility for lease renewal has been transferred to one or		responsibility for lease renewal has been transferred to one or
	more new servers.		more new servers.

	o The status bit SEQ4_STATUS_RESTART_RECLAIM_NEEDED indicates that		o The status bit SEQ4_STATUS_RESTART_RECLAIM_NEEDED indicates that

	due to server restart or restart the client must reclaim locking		due to server restart the client must reclaim locking state.
	state.

	o The status bit SEQ4_STATUS_BACKCHANNEL_FAULT indicates server has		o The status bit SEQ4_STATUS_BACKCHANNEL_FAULT indicates server has
	encountered an unrecoverable fault with the backchannel (e.g. it		encountered an unrecoverable fault with the backchannel (e.g. it
	has lost track of a sequence id for a slot in the backchannel).		has lost track of a sequence id for a slot in the backchannel).
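
A client-side reading of these status bits might look like the Python sketch below (not part of either draft). The numeric flag values are not given in this excerpt and are assumed from the NFSv4.1 XDR; the function name actions_for_status and the wording of the returned strings are hypothetical.

```python
# Assumed flag values; verify against the XDR of the draft in use.
SEQ4_STATUS_RECALLABLE_STATE_REVOKED = 0x00000040
SEQ4_STATUS_LEASE_MOVED = 0x00000080  # written SEQ4_STATUS_LEASE_MOVE in the text above
SEQ4_STATUS_RESTART_RECLAIM_NEEDED = 0x00000100
SEQ4_STATUS_BACKCHANNEL_FAULT = 0x00000400


def actions_for_status(flags: int) -> list:
    """Map SEQUENCE status bits to the client reactions described above."""
    actions = []
    if flags & SEQ4_STATUS_RECALLABLE_STATE_REVOKED:
        # Find the revoked stateids, then acknowledge the loss of state.
        actions.append("TEST_STATEID then FREE_STATEID")
    if flags & SEQ4_STATUS_LEASE_MOVED:
        actions.append("lease renewal moved to new server(s)")
    if flags & SEQ4_STATUS_RESTART_RECLAIM_NEEDED:
        actions.append("reclaim locking state")
    if flags & SEQ4_STATUS_BACKCHANNEL_FAULT:
        actions.append("backchannel has an unrecoverable fault")
    return actions
```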

	8.4. Crash Recovery		8.4. Crash Recovery

	A critical requirement in crash recovery is that both the client and		A critical requirement in crash recovery is that both the client and
	the server know when the other has failed. Additionally, it is		the server know when the other has failed. Additionally, it is
	required that a client sees a consistent view of data across server		required that a client sees a consistent view of data across server

	skipping to change at page 174, line 29		skipping to change at page 174, line 29
	write delegation and WRITE conflicts with a read delegation.		write delegation and WRITE conflicts with a read delegation.

	When a client holds a delegation, it is particularly important to		When a client holds a delegation, it is particularly important to
	make sure that the stateid sent conveys the association of operation		make sure that the stateid sent conveys the association of operation
	with the delegation, to avoid the delegation from being avoidably		with the delegation, to avoid the delegation from being avoidably
	recalled. When the delegation stateid, or a stateid open associated		recalled. When the delegation stateid, or a stateid open associated
	with that delegation, or a stateid representing byte-range locks		with that delegation, or a stateid representing byte-range locks
	derived from such an open is used, the server knows that the READ,		derived from such an open is used, the server knows that the READ,
	WRITE, or SETATTR does not conflict with the delegation, but is sent		WRITE, or SETATTR does not conflict with the delegation, but is sent
	under the aegis of the delegation. Even though it is possible for		under the aegis of the delegation. Even though it is possible for

	the server to determine from the clientid (gotten from the sessionid)		the server to determine from the clientid (via the sessionid) that
	that the client does in fact have a delegation, the server is not		the client does in fact have a delegation, the server is not obliged
	obliged to check this, so using a special stateid can result in		to check this, so using a special stateid can result in avoidable
	avoidable recall of the delegation.		recall of the delegation.

	9.2. Lock Ranges		9.2. Lock Ranges

	The protocol allows a lock-owner to request a lock with a byte range		The protocol allows a lock-owner to request a lock with a byte range
	and then either upgrade, downgrade, or unlock a sub-range of the		and then either upgrade, downgrade, or unlock a sub-range of the
	initial lock, or a range that consists of a range which overlaps,		initial lock, or a range that consists of a range which overlaps,
	fully or partially, that initial lock or a combination of a set of		fully or partially, that initial lock or a combination of a set of
	existing locks for the same lock-owner. It is expected that this		existing locks for the same lock-owner. It is expected that this
	will be an uncommon type of request. In any case, servers or server		will be an uncommon type of request. In any case, servers or server
	file systems may not be able to support sub-range lock semantics. In		file systems may not be able to support sub-range lock semantics. In

	skipping to change at page 186, line 33		skipping to change at page 186, line 33
	however, the server may extend the period in which conflicting		however, the server may extend the period in which conflicting
	requests are held off. Eventually the occurrence of a conflicting		requests are held off. Eventually the occurrence of a conflicting
	request from another client will cause revocation of the delegation.		request from another client will cause revocation of the delegation.
	A loss of the backchannel (e.g. by later network configuration		A loss of the backchannel (e.g. by later network configuration
	change) will have the same effect. A recall request will fail and		change) will have the same effect. A recall request will fail and
	revocation of the delegation will result.		revocation of the delegation will result.

	A client normally finds out about revocation of a delegation when it		A client normally finds out about revocation of a delegation when it
	uses a stateid associated with a delegation and receives one of the		uses a stateid associated with a delegation and receives one of the
	errors NFS4ERR_EXPIRED, NFS4ERR_ADMIN_REVOKED, or		errors NFS4ERR_EXPIRED, NFS4ERR_ADMIN_REVOKED, or

	MFS4ERR_DELEG_REVOKED. It also may find out about delegation		NFS4ERR_DELEG_REVOKED. It also may find out about delegation
	revocation after a client restart when it attempts to reclaim a		revocation after a client restart when it attempts to reclaim a
	delegation and receives that same error. Note that in the case of a		delegation and receives that same error. Note that in the case of a
	revoked write open delegation, there are issues because data may have		revoked write open delegation, there are issues because data may have
	been modified by the client whose delegation is revoked and		been modified by the client whose delegation is revoked and
	separately by other clients. See Section 10.5.1 for a discussion of		separately by other clients. See Section 10.5.1 for a discussion of
	such issues. Note also that when delegations are revoked,		such issues. Note also that when delegations are revoked,
	information about the revoked delegation will be written by the		information about the revoked delegation will be written by the
	server to stable storage (as described in Section 8.4.3). This is		server to stable storage (as described in Section 8.4.3). This is
	done to deal with the case in which a server restarts after revoking		done to deal with the case in which a server restarts after revoking
	a delegation but before the client holding the revoked delegation is		a delegation but before the client holding the revoked delegation is

	skipping to change at page 230, line 4		skipping to change at page 230, line 4
	each of the target file systems.		each of the target file systems.

	11.7.6. The Change Attribute and File System Transitions		11.7.6. The Change Attribute and File System Transitions

	Since the change attribute is defined as a server-specific one,		Since the change attribute is defined as a server-specific one,
	change attributes fetched from one server are normally presumed to be		change attributes fetched from one server are normally presumed to be
	invalid on another server. Such a presumption is troublesome since		invalid on another server. Such a presumption is troublesome since
	it would invalidate all cached change attributes, requiring		it would invalidate all cached change attributes, requiring
	refetching. Even more disruptive, the absence of any assured		refetching. Even more disruptive, the absence of any assured
	continuity for the change attribute means that even if the same value		continuity for the change attribute means that even if the same value

	is gotten on refetch no conclusions can be drawn as to whether the		is retrieved on refetch no conclusions can be drawn as to whether the
	object in question has changed. The identical change attribute could		object in question has changed. The identical change attribute could
	be merely an artifact of a modified file with a different change		be merely an artifact of a modified file with a different change
	attribute construction algorithm, with that new algorithm just		attribute construction algorithm, with that new algorithm just
	happening to result in an identical change value.		happening to result in an identical change value.

	When the two file systems have consistent change attribute formats,		When the two file systems have consistent change attribute formats,
	and this fact is communicated to the client by reporting as in the		and this fact is communicated to the client by reporting as in the
	same _change_ class, the client may assume a continuity of change		same _change_ class, the client may assume a continuity of change
	attribute construction and handle this situation just as it would be		attribute construction and handle this situation just as it would be
	handled without any file system transition.		handled without any file system transition.
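
The rule described in 11.7.6 can be sketched as follows. This is only an illustrative sketch, not part of either draft revision: the function name object_changed and its parameters are hypothetical, and the change attribute is modeled as a plain integer.

```python
from typing import Optional


def object_changed(cached_change: int,
                   fetched_change: int,
                   same_change_class: bool) -> Optional[bool]:
    """Compare change attributes across a file system transition.

    Returns None when no conclusion can be drawn, i.e. when the two
    file systems are not reported as being in the same change class.
    """
    if not same_change_class:
        # No assured continuity: even identical values prove nothing.
        return None
    # Same change class: compare exactly as without any transition.
    return cached_change != fetched_change
```

A caller would typically treat a None result as a reason to discard its cached data and refetch.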

	skipping to change at page 237, line 48		skipping to change at page 237, line 48
	that op but was moved between the last LOOKUP and the GETFH (since		that op but was moved between the last LOOKUP and the GETFH (since
	COMPOUND is not atomic). Even if we had the fsids for all of the		COMPOUND is not atomic). Even if we had the fsids for all of the
	intermediate directories, we could have no way of knowing that /this/		intermediate directories, we could have no way of knowing that /this/
	is/the/path was the root of a new file system, since we don't yet		is/the/path was the root of a new file system, since we don't yet
	have its fsid.		have its fsid.

	In order to get the necessary information, let us re-send the chain		In order to get the necessary information, let us re-send the chain
	of LOOKUPs with GETFHs and GETATTRs to at least get the fsids so we		of LOOKUPs with GETFHs and GETATTRs to at least get the fsids so we
	can be sure where the appropriate file system boundaries are. The		can be sure where the appropriate file system boundaries are. The
	client could choose to get fs_locations_info at the same time but in		client could choose to get fs_locations_info at the same time but in

	most cases the client will have a good guess as to where fs		most cases the client will have a good guess as to where file system
	boundaries are (because of where NFS4ERR_MOVED was gotten and where		boundaries are (because of where and where not NFS4ERR_MOVED was
	not) making fetching of fs_locations_info unnecessary.		received) making fetching of fs_locations_info unnecessary.

	OP01: PUTROOTFH --> NFS_OK		OP01: PUTROOTFH --> NFS_OK

	- Current fh is root of pseudo-fs.		- Current fh is root of pseudo-fs.

	OP02: GETATTR(fsid) --> NFS_OK		OP02: GETATTR(fsid) --> NFS_OK

	- Just for completeness. Normally, clients will know the fsid of		- Just for completeness. Normally, clients will know the fsid of
	the pseudo-fs as soon as they establish communication with a		the pseudo-fs as soon as they establish communication with a
	server.		server.

	skipping to change at page 239, line 31		skipping to change at page 239, line 31
	in fact the fsid we have for this file system might be a valid		in fact the fsid we have for this file system might be a valid
	fsid of a different file system on that new server.		fsid of a different file system on that new server.

	- In this particular case, we are pretty sure anyway that what has		- In this particular case, we are pretty sure anyway that what has
	moved is /this/is/the/path rather than /this/is/the since we have		moved is /this/is/the/path rather than /this/is/the since we have
	the fsid of the latter and it is that of the pseudo-fs, which		the fsid of the latter and it is that of the pseudo-fs, which
	presumably cannot move. However, in other examples, we might not		presumably cannot move. However, in other examples, we might not
	have this kind of information to rely on (e.g. /this/is/the might		have this kind of information to rely on (e.g. /this/is/the might
	be a non-pseudo file system separate from /this/is/the/path), so		be a non-pseudo file system separate from /this/is/the/path), so
	we need to have another reliable source of information on the		we need to have another reliable source of information on the

	boundary of the fs which is moved. If, for example, the file		boundary of the file system which is moved. If, for example, the
	system "/this/is" had moved we would have a case of migration		file system "/this/is" had moved we would have a case of migration
	rather than referral and once the boundaries of the migrated file		rather than referral and once the boundaries of the migrated file
	system were clear we could fetch fs_locations_info.		system were clear we could fetch fs_locations_info.

	- We are fetching fs_locations_info because the fact that we got an		- We are fetching fs_locations_info because the fact that we got an
	NFS4ERR_MOVED at this point means that it is most likely that this is		NFS4ERR_MOVED at this point means that it is most likely that this is
	a referral and we need the destination. Even if it is the case		a referral and we need the destination. Even if it is the case
	that "/this/is/the" is a file system which has migrated, we will		that "/this/is/the" is a file system which has migrated, we will
	still need the location information for that file system.		still need the location information for that file system.

	OP14: GETFH --> NFS4ERR_MOVED		OP14: GETFH --> NFS4ERR_MOVED

	skipping to change at page 242, line 27		skipping to change at page 242, line 27
	o READDIR (rdattr_error, fs_locations_info, mounted_on_fileid, fsid,		o READDIR (rdattr_error, fs_locations_info, mounted_on_fileid, fsid,
	size, time_modify) --> NFS_OK. The attributes will be as shown		size, time_modify) --> NFS_OK. The attributes will be as shown
	below.		below.

	The attributes for "path" will only contain		The attributes for "path" will only contain

	o rdattr_error (value: NFS_OK)		o rdattr_error (value: NFS_OK)

	o fs_locations_info		o fs_locations_info


	o mounted_on_fileid (value: unique fileid within referring fs)		o mounted_on_fileid (value: unique fileid within referring file
			system)

	o fsid (value: unique value within referring server)		o fsid (value: unique value within referring server)

	The attribute entry for "path" will not contain size or time_modify		The attribute entry for "path" will not contain size or time_modify
	because these attributes are not available within an absent file		because these attributes are not available within an absent file
	system.		system.

	11.9. The Attribute fs_locations		11.9. The Attribute fs_locations

	The fs_locations attribute is structured in the following way:		The fs_locations attribute is structured in the following way:

	skipping to change at page 266, line 19		skipping to change at page 266, line 19
	the same layout type and client ID again. This requirement is		the same layout type and client ID again. This requirement is
	feasible because the device ID is 16 bytes long, leaving sufficient		feasible because the device ID is 16 bytes long, leaving sufficient
	room to store a generation number if server's implementation requires		room to store a generation number if server's implementation requires
	most of the rest of the device ID's content to be reused. This		most of the rest of the device ID's content to be reused. This
	requirement is necessary because otherwise the race conditions		requirement is necessary because otherwise the race conditions
	between asynchronous notification of device ID addition and deletion		between asynchronous notification of device ID addition and deletion
	would be too difficult to sort out.		would be too difficult to sort out.

	Device ID to device address mappings are not leased, and can be		Device ID to device address mappings are not leased, and can be
	changed at any time. (Note that while device ID to device address		changed at any time. (Note that while device ID to device address

	mappings are likely to change after the metadata server restarts the		mappings are likely to change after the metadata server restarts, the
	server is not required to change the mappings.) A server has two		server is not required to change the mappings.) A server has two
	choices for changing mappings. It can recall all layouts referring		choices for changing mappings. It can recall all layouts referring
	to the device ID or it can use a notification mechanism.		to the device ID or it can use a notification mechanism.

	The NFSv4.1 protocol has no optimal way to recall all layouts that		The NFSv4.1 protocol has no optimal way to recall all layouts that
	referred to a particular device ID (unless the server associates a		referred to a particular device ID (unless the server associates a
	single device ID with a single fsid or a single client ID; in which		single device ID with a single fsid or a single client ID; in which
	case, CB_LAYOUTRECALL has options for recalling all layouts		case, CB_LAYOUTRECALL has options for recalling all layouts
	associated with the fsid, client ID pair or just the client ID).		associated with the fsid, client ID pair or just the client ID).


	skipping to change at page 270, line 29		skipping to change at page 270, line 29
	CB_LAYOUTRECALL request. When the client fully processes the		CB_LAYOUTRECALL request. When the client fully processes the
	response to a LAYOUTGET or LAYOUTRETURN, or fully processes the		response to a LAYOUTGET or LAYOUTRETURN, or fully processes the
	arguments of a CB_LAYOUTRECALL, it MUST use the seqid of the stateid		arguments of a CB_LAYOUTRECALL, it MUST use the seqid of the stateid
	of the reply from LAYOUTGET and LAYOUTRETURN, or the seqid of the		of the reply from LAYOUTGET and LAYOUTRETURN, or the seqid of the
	stateid in the arguments of CB_LAYOUTRECALL, on subsequent calls to		stateid in the arguments of CB_LAYOUTRECALL, on subsequent calls to
	LAYOUTGET or LAYOUTRETURN. The client and server use the "seqid" of		LAYOUTGET or LAYOUTRETURN. The client and server use the "seqid" of
	the layout stateid for the following purposes:		the layout stateid for the following purposes:

	o Permit the client to send parallel LAYOUTGET operations on the		o Permit the client to send parallel LAYOUTGET operations on the
	same file. As with parallel opens (see Section 9.10) the use of		same file. As with parallel opens (see Section 9.10) the use of

	the sequence ID allows a client to avoid serializing LAYOUTGET		the stateid's seqid allows a client to avoid serializing LAYOUTGET
	operations. If LAYOUTGETs were serialized, especially non-		operations. If LAYOUTGETs were serialized, especially non-
	overlapping LAYOUTGETs, then non-overlapping I/Os to storage		overlapping LAYOUTGETs, then non-overlapping I/Os to storage
	devices would in turn be effectively serialized with each other.		devices would in turn be effectively serialized with each other.
	In the event parallel LAYOUTGET operations are sent with a non-		In the event parallel LAYOUTGET operations are sent with a non-
	layout stateid (because the client does not yet have a layout		layout stateid (because the client does not yet have a layout
	stateid), the successful responses MUST have the same "other"		stateid), the successful responses MUST have the same "other"
	field in the LAYOUTSTATEID, and each response with a unique		field in the LAYOUTSTATEID, and each response with a unique
	"seqid", where the lowest "seqid" is one, and the highest "seqid"		"seqid", where the lowest "seqid" is one, and the highest "seqid"
	is equal to the count of parallel LAYOUTGET operations invoked on		is equal to the count of parallel LAYOUTGET operations invoked on
	the non-layout stateid.		the non-layout stateid.

	skipping to change at page 272, line 48		skipping to change at page 272, line 48
	update time_modify at LAYOUTCOMMIT. At LAYOUTCOMMIT completion, the		update time_modify at LAYOUTCOMMIT. At LAYOUTCOMMIT completion, the
	updated attributes should be visible if that file was modified since		updated attributes should be visible if that file was modified since
	the latest previous LAYOUTCOMMIT or LAYOUTGET.		the latest previous LAYOUTCOMMIT or LAYOUTGET.

	12.5.4.2. LAYOUTCOMMIT and size		12.5.4.2. LAYOUTCOMMIT and size

	The size of a file may be updated when the LAYOUTCOMMIT operation is		The size of a file may be updated when the LAYOUTCOMMIT operation is
	used by the client. One of the fields in the argument to		used by the client. One of the fields in the argument to
	LAYOUTCOMMIT is loca_last_write_offset; this field indicates the		LAYOUTCOMMIT is loca_last_write_offset; this field indicates the
	highest byte offset written but not yet committed with the		highest byte offset written but not yet committed with the

	LAYOUTCOMMIT operation. The data type of lora_last_write_offset is		LAYOUTCOMMIT operation. The data type of loca_last_write_offset is
	newoffset4 and is switched on a boolean value, no_newoffset, that		newoffset4 and is switched on a boolean value, no_newoffset, that
	indicates if a previous write occurred or not. If no_newoffset is		indicates if a previous write occurred or not. If no_newoffset is
	FALSE, an offset is not given. If the client has a layout with		FALSE, an offset is not given. If the client has a layout with
	LAYOUTIOMODE4_RW iomode on the file, with an lo_offset and lo_length		LAYOUTIOMODE4_RW iomode on the file, with an lo_offset and lo_length
	that overlaps loca_last_write_offset, then the client MAY set		that overlaps loca_last_write_offset, then the client MAY set
	no_newoffset to TRUE and provide an offset that will update the file		no_newoffset to TRUE and provide an offset that will update the file
	size. Keep in mind that offset is not the same as length, though		size. Keep in mind that offset is not the same as length, though
	they are related. For example, a loca_last_write_offset value of		they are related. For example, a loca_last_write_offset value of
	zero means that one byte was written at offset zero, and so the		zero means that one byte was written at offset zero, and so the
	length of the file is at least one byte.		length of the file is at least one byte.
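
The relationship between offset and length described above can be illustrated with a short sketch. The helper name min_size_after_commit is hypothetical and the XDR union newoffset4 is modeled as a boolean plus an integer; the sketch only shows the arithmetic, not how a metadata server actually decides the file size.

```python
from typing import Optional


def min_size_after_commit(no_newoffset: bool,
                          last_write_offset: int = 0) -> Optional[int]:
    """Smallest file length implied by loca_last_write_offset.

    When no_newoffset is FALSE no offset is given, so nothing is
    implied about the file size and None is returned.
    """
    if not no_newoffset:
        return None
    # The byte at last_write_offset was written, so the file holds
    # at least last_write_offset + 1 bytes.
    return last_write_offset + 1
```

For instance, an offset of zero implies a length of at least one byte, matching the example in the text.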

	skipping to change at page 273, line 41		skipping to change at page 273, line 41

	The results of LAYOUTCOMMIT contain a new size value in the form of a		The results of LAYOUTCOMMIT contain a new size value in the form of a
	newsize4 union data type. If the file's size is set as a result of		newsize4 union data type. If the file's size is set as a result of
	LAYOUTCOMMIT, the metadata server must reply with the new size;		LAYOUTCOMMIT, the metadata server must reply with the new size;
	otherwise the new size is not provided. If the file size is updated,		otherwise the new size is not provided. If the file size is updated,
	the metadata server SHOULD update the storage devices such that the		the metadata server SHOULD update the storage devices such that the
	new file size is reflected when LAYOUTCOMMIT processing is complete.		new file size is reflected when LAYOUTCOMMIT processing is complete.
	For example, the client should be able to READ up to the new file		For example, the client should be able to READ up to the new file
	size.		size.


	If the client wants to explicitly zero-extend or truncate a file, the		The client can extend the length of a file or truncate a file by
	SETATTR operation MUST be used; SETATTR use is not required when		sending a SETATTR operation to the metadata server with the size
	simply writing past EOF via WRITE.		attribute specified. If the size specified is larger than the
			current size of the file, the file is "zero extended", i.e., zeroes
			are implicitly added between the file's previous EOF and the new EOF.
			(In many implementations the zero extended region of the file
			consists of unallocated holes in the file.) When the client writes
			past EOF via WRITE, the SETATTR operation does not need to be used.

	12.5.4.3. LAYOUTCOMMIT and layoutupdate		12.5.4.3. LAYOUTCOMMIT and layoutupdate

	The LAYOUTCOMMIT argument contains a loca_layoutupdate field		The LAYOUTCOMMIT argument contains a loca_layoutupdate field
	(Section 18.42.1) of data type layoutupdate4 (Section 3.3.18). This		(Section 18.42.1) of data type layoutupdate4 (Section 3.3.18). This
	argument is a layout type-specific structure. The structure can be		argument is a layout type-specific structure. The structure can be
	used to pass arbitrary layout type-specific information from the		used to pass arbitrary layout type-specific information from the
	client to the metadata server at LAYOUTCOMMIT time. For example, if		client to the metadata server at LAYOUTCOMMIT time. For example, if
	using a block/volume layout, the client can indicate to the metadata		using a block/volume layout, the client can indicate to the metadata
	server which reserved or allocated blocks the client used or did not		server which reserved or allocated blocks the client used or did not

	skipping to change at page 277, line 26		skipping to change at page 277, line 35
	12.5.5.2. Sequencing of Layout Operations		12.5.5.2. Sequencing of Layout Operations

	As with other stateful operations, pNFS requires the correct		As with other stateful operations, pNFS requires the correct
	sequencing of layout operations. PNFS uses the "seqid" in the layout		sequencing of layout operations. PNFS uses the "seqid" in the layout
	stateid to provide the correct sequencing between regular operations		stateid to provide the correct sequencing between regular operations
	and callbacks. It is the server's responsibility to avoid		and callbacks. It is the server's responsibility to avoid
	inconsistencies regarding the layouts provided and the client's		inconsistencies regarding the layouts provided and the client's
	responsibility to properly serialize its layout requests and layout		responsibility to properly serialize its layout requests and layout
	returns.		returns.


	12.5.5.2.1. Recall/Return Sequencing		12.5.5.2.1. Layout Recall and Return Sequencing


	Section 2.10.5.3 describes the sessions mechanism for allowing the		One critical issue with regard to layout operations sequencing
	client to detect such situations in order to delay processing such a		concerns callbacks. The protocol must defend against races between
	CB_LAYOUTRECALL. The server MUST reference all conflicting LAYOUTGET		the reply to a LAYOUTGET or LAYOUTRETURN operation and a subsequent
	operations in the CB_SEQUENCE that precedes the CB_LAYOUTRECALL. A		CB_LAYOUTRECALL. A client MUST NOT process a CB_LAYOUTRECALL that
	zero length array of referenced operations is used by the server to		implies one or more outstanding LAYOUTGET or LAYOUTRETURN operations
	tell the client that the server does not know of any LAYOUTGET		to which the client has not yet received a reply. The client detects
	operations that conflict with the recall.		such a CB_LAYOUTRECALL by examining the "seqid" field of the recall's
			layout stateid. If the "seqid" is not one higher than what the
			client currently has recorded, and the client has at least one
			LAYOUTGET and/or LAYOUTRETURN operation outstanding, the client knows
			the server sent the CB_LAYOUTRECALL after sending a response to an
			outstanding LAYOUTGET or LAYOUTRETURN. The client MUST wait before
			processing such a CB_LAYOUTRECALL until it processes all replies for
			outstanding LAYOUTGET and LAYOUTRETURN operations for the
			corresponding file with seqid less than the seqid given by
			CB_LAYOUTRECALL (lor_stateid, see Section 20.3.)


	While referencing conflicting operations in CB_SEQUENCE conveys to		In addition to the seqid-based mechanism, Section 2.10.5.3 describes
	the client that the server is aware of races, one critical issue with		the sessions mechanism for allowing the client to detect callback
	regard to operation sequencing concerns callbacks. The protocol must		race conditions and delay processing such a CB_LAYOUTRECALL. The
	defend against races between the reply to a LAYOUTGET or LAYOUTRETURN		server MAY reference conflicting operations in the CB_SEQUENCE that
	operation and a subsequent CB_LAYOUTRECALL. A client MUST NOT		precedes the CB_LAYOUTRECALL. Because the server has already sent
	process a CB_LAYOUTRECALL that implies one or more outstanding		replies for these operations before issuing the callback, the replies
	LAYOUTGET or LAYOUTRETURN operations to which the client has not yet		may race with the CB_LAYOUTRECALL. The client MUST wait for all the
	received a reply. The client detects such a CB_LAYOUTRECALL by		referenced calls to complete and update its view of the layout state
	examining the "seqid" field of the recall's layout stateid. If the		before processing the CB_LAYOUTRECALL.
	"seqid" is not one higher than what the client currently has
	recorded, and the client has at least one LAYOUTGET and/or
	LAYOUTRETURN operation outstanding, the client knows the server sent
	the CB_LAYOUTRECALL after the server sent a response to an
	outstanding LAYOUTGET or LAYOUTRETURN.
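
The seqid check that the client applies to an incoming CB_LAYOUTRECALL can be sketched as below. This is a simplified illustration under stated assumptions: the function name and parameters are hypothetical, seqid wraparound is ignored, and the CB_SEQUENCE referenced-operations mechanism is not modeled.

```python
def must_delay_recall(recorded_seqid: int,
                      recall_seqid: int,
                      outstanding_ops: int) -> bool:
    """Decide whether a CB_LAYOUTRECALL must wait.

    recorded_seqid:  seqid of the layout stateid the client last recorded
    recall_seqid:    seqid in the recall's layout stateid
    outstanding_ops: LAYOUTGET/LAYOUTRETURN calls still awaiting a reply
    """
    # A recall whose seqid is not exactly one higher than the recorded
    # value, while replies are still outstanding, means the server
    # answered at least one of those operations before sending the recall.
    return recall_seqid != recorded_seqid + 1 and outstanding_ops > 0
```

When the function returns True, the client would process the outstanding replies first and only then act on the recall.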

	12.5.5.2.1.1. Get/Return Sequencing		12.5.5.2.1.1. Get/Return Sequencing

	The protocol allows the client to send concurrent LAYOUTGET and		The protocol allows the client to send concurrent LAYOUTGET and
	LAYOUTRETURN operations to the server. The protocol does not provide		LAYOUTRETURN operations to the server. The protocol does not provide
	any means for the server to process the requests in the same order in		any means for the server to process the requests in the same order in
	which they were created. However, through the use of the "seqid"		which they were created. However, through the use of the "seqid"
	field in the layout stateid, the client can determine the order in		field in the layout stateid, the client can determine the order in
	which parallel outstanding operations were processed by the server.		which parallel outstanding operations were processed by the server.
	Thus, when a layout retrieved by an outstanding LAYOUTGET operation		Thus, when a layout retrieved by an outstanding LAYOUTGET operation

	skipping to change at page 284, line 46		skipping to change at page 285, line 12
	the lease expires, but arrive after the lease expires. See		the lease expires, but arrive after the lease expires. See
	Section 12.7.3.		Section 12.7.3.

	12.7.3. Dealing with Loss of Layout State on the Metadata Server		12.7.3. Dealing with Loss of Layout State on the Metadata Server

	This is a description of the case where all of the following are		This is a description of the case where all of the following are
	true:		true:

	o the metadata server has not restarted		o the metadata server has not restarted


	o a pNFS client's device ID to layouts have been discarded (usually		o a pNFS client's layouts have been discarded (usually because the
	because the client's lease expired) and are invalid		client's lease expired) and are invalid

	o an I/O from the pNFS client arrives at the storage device		o an I/O from the pNFS client arrives at the storage device

	The metadata server and its storage devices MUST solve this by		The metadata server and its storage devices MUST solve this by
	fencing the client. In other words, prevent the execution of I/O		fencing the client. In other words, prevent the execution of I/O
	operations from the client to the storage devices after layout state		operations from the client to the storage devices after layout state
	loss. The details of how fencing is done are specific to the layout		loss. The details of how fencing is done are specific to the layout
	type. The solution for NFSv4.1 file-based layouts is described in		type. The solution for NFSv4.1 file-based layouts is described in
	(Section 13.11), and for other layout types in their respective		(Section 13.11), and for other layout types in their respective
	external specification documents.		external specification documents.

	12.7.4. Recovery from Metadata Server Restart		12.7.4. Recovery from Metadata Server Restart

	The pNFS client will discover that the metadata server has restarted		The pNFS client will discover that the metadata server has restarted

	(e.g. restarted) via the methods described in Section 8.4.2 and		via the methods described in Section 8.4.2 and discussed in a pNFS-
	discussed in a pNFS-specific context in Paragraph 2, of		specific context in Paragraph 2, of Section 12.7.2. The client MUST
	Section 12.7.2. The client MUST stop using layouts and delete the		stop using layouts and delete the device ID to device address
	device ID to device address mappings it previously received from the		mappings it previously received from the metadata server. Having
	metadata server. Having done that, if the client wrote data to the		done that, if the client wrote data to the storage device without
	storage device without committing the layouts via LAYOUTCOMMIT, then		committing the layouts via LAYOUTCOMMIT, then the client has
	the client has additional work to do in order to have the client,		additional work to do in order to have the client, metadata server
	metadata server and storage device(s) all synchronized on the state		and storage device(s) all synchronized on the state of the data.
	of the data.

	o If the client has data still modified and unwritten in the		o If the client has data still modified and unwritten in the
	client's memory, the client has only two choices.		client's memory, the client has only two choices.

	1. The client can obtain a layout via LAYOUTGET after the		1. The client can obtain a layout via LAYOUTGET after the
	server's grace period and write the data to the storage		server's grace period and write the data to the storage
	devices.		devices.

	2. The client can write that data through the metadata server		2. The client can write that data through the metadata server
	using the WRITE (Section 18.32) operation, and then obtain		using the WRITE (Section 18.32) operation, and then obtain

	skipping to change at page 424, line 15		skipping to change at page 424, line 15
	\| CLAIM_DELEG_PREV_FH \| granted to a previous client instance; \|		\| CLAIM_DELEG_PREV_FH \| granted to a previous client instance; \|
	\| \| used after the client restarts. The server \|		\| \| used after the client restarts. The server \|
	\| \| MAY support CLAIM_DELEGATE_PREV or \|		\| \| MAY support CLAIM_DELEGATE_PREV or \|
	\| \| CLAIM_DELEG_PREV_FH (new to NFSv4.1). If \|		\| \| CLAIM_DELEG_PREV_FH (new to NFSv4.1). If \|
	\| \| it does support either open type, \|		\| \| it does support either open type, \|
	\| \| CREATE_SESSION MUST NOT remove the \|		\| \| CREATE_SESSION MUST NOT remove the \|
	\| \| client's delegation state, and the server \|		\| \| client's delegation state, and the server \|
	\| \| MUST support the DELEGPURGE operation. \|		\| \| MUST support the DELEGPURGE operation. \|
	+----------------------+--------------------------------------------+		+----------------------+--------------------------------------------+


	For OPEN requests whose claim type is other than CLAIM_PREVIOUS (i.e.		For OPEN requests that reach the server during the grace period, the
	requests other than those devoted to reclaiming opens after a server		server returns an error of NFS4ERR_GRACE. The following claim types
	restart) that reach the server during its grace or lease expiration		are exceptions:
	period, the server returns an error of NFS4ERR_GRACE.
			o OPEN requests specifying the claim type CLAIM_PREVIOUS are devoted
			to reclaiming opens after a server reboot and are typically only
			valid during the grace period.

			o OPEN requests specifying the claim types CLAIM_DELEGATE_CUR and
			CLAIM_DELEG_CUR_FH are valid both during and after the grace
			period. Since the granting of the delegation that they are
			subordinate to assures that there is no conflict with locks to be
			reclaimed by other clients, the server need not return
			NFS4ERR_GRACE when these are received during the grace period.
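
The grace-period handling in the right-hand (-22) text can be sketched as a simple classification of claim types. The function name grace_check is hypothetical, and the sketch assumes a server that chooses to admit the delegation-based claims during grace (the text says only that it need not reject them); the returned "NFS_OK" means just that the request is not rejected on grace-period grounds.

```python
# Claim types the -22 text lists as exceptions during the grace period.
RECLAIM_CLAIMS = {"CLAIM_PREVIOUS"}
DELEGATION_CLAIMS = {"CLAIM_DELEGATE_CUR", "CLAIM_DELEG_CUR_FH"}


def grace_check(claim_type: str, in_grace: bool) -> str:
    """Return the grace-related status for an OPEN with this claim type."""
    if not in_grace:
        return "NFS_OK"
    if claim_type in RECLAIM_CLAIMS or claim_type in DELEGATION_CLAIMS:
        # Reclaims belong in the grace period; delegation-based opens
        # cannot conflict with locks other clients may reclaim.
        return "NFS_OK"
    return "NFS4ERR_GRACE"
```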

	For any OPEN request, the server may return an open delegation, which		For any OPEN request, the server may return an open delegation, which
	allows further opens and closes to be handled locally on the client		allows further opens and closes to be handled locally on the client
	as described in Section 10.4. Note that delegation is up to the		as described in Section 10.4. Note that delegation is up to the
	server to decide. The client should never assume that delegation		server to decide. The client should never assume that delegation
	will or will not be granted in a particular instance. It should		will or will not be granted in a particular instance. It should
	always be prepared for either case. A partial exception is the		always be prepared for either case. A partial exception is the
	reclaim (CLAIM_PREVIOUS) case, in which a delegation type is claimed.		reclaim (CLAIM_PREVIOUS) case, in which a delegation type is claimed.
	In this case, delegation will always be granted, although the server		In this case, delegation will always be granted, although the server
	may specify an immediate recall in the delegation structure.		may specify an immediate recall in the delegation structure.

	skipping to change at page 429, line 23		skipping to change at page 429, line 31
	use time_modify_set or time_access_set to store the verifier. The		use time_modify_set or time_access_set to store the verifier. The
	server SHOULD NOT store the verifier in the following attributes: acl		server SHOULD NOT store the verifier in the following attributes: acl
	(it is desirable for access control to be established at creation),		(it is desirable for access control to be established at creation),
	dacl (ditto), mode (ditto), owner (ditto), owner_group (ditto),		dacl (ditto), mode (ditto), owner (ditto), owner_group (ditto),
	retentevt_set (it may be desired to establish retention at creation)		retentevt_set (it may be desired to establish retention at creation)
	retention_hold (ditto), retention_set (ditto), sacl (it is desirable		retention_hold (ditto), retention_set (ditto), sacl (it is desirable
	for auditing control to be established at creation), size (on some		for auditing control to be established at creation), size (on some
	servers, size may have a limited range of values), mode_set_masked		servers, size may have a limited range of values), mode_set_masked
	(as with mode), and time_creation (a meaningful file creation should		(as with mode), and time_creation (a meaningful file creation should
	be set when the file is created). Another alternative for the server		be set when the file is created). Another alternative for the server

	is to use named attribute to store the verifier.		is to use a named attribute to store the verifier.

	Because the EXCLUSIVE4 create method does not specify initial		Because the EXCLUSIVE4 create method does not specify initial
	attributes, when processing an EXCLUSIVE4 create, the server		attributes, when processing an EXCLUSIVE4 create, the server

	o SHOULD set the owner of the file to that corresponding to the		o SHOULD set the owner of the file to that corresponding to the
	credential of request's RPC header.		credential of request's RPC header.

	o SHOULD NOT leave the file's access control to anyone but the owner		o SHOULD NOT leave the file's access control to anyone but the owner
	of the file.		of the file.


	skipping to change at page 462, line 45		skipping to change at page 462, line 45

	The definition of stable storage has been historically a point of		The definition of stable storage has been historically a point of
	contention. The following expected properties of stable storage may		contention. The following expected properties of stable storage may
	help in resolving design sends in the implementation. Stable storage		help in resolving design sends in the implementation. Stable storage
	is persistent storage that survives:		is persistent storage that survives:

	1. Repeated power failures.		1. Repeated power failures.

	2. Hardware failures (of any board, power supply, etc.).		2. Hardware failures (of any board, power supply, etc.).


	3. Repeated software crashes, including restart cycle.		3. Repeated software crashes and restarts.

	This definition does not address failure of the stable storage module		This definition does not address failure of the stable storage module
	itself.		itself.

	The verifier is defined to allow a client to detect different		The verifier is defined to allow a client to detect different
	instances of an NFSv4.1 protocol server over which cached,		instances of an NFSv4.1 protocol server over which cached,
	uncommitted data may be lost. In the most likely case, the verifier		uncommitted data may be lost. In the most likely case, the verifier
	allows the client to detect server restarts. This information is		allows the client to detect server restarts. This information is
	required so that the client can safely determine whether the server		required so that the client can safely determine whether the server
	could have lost cached data. If the server fails unexpectedly and		could have lost cached data. If the server fails unexpectedly and
	the client has uncommitted data from previous WRITE requests (done		the client has uncommitted data from previous WRITE requests (done
	with the stable argument set to UNSTABLE4 and in which the result		with the stable argument set to UNSTABLE4 and in which the result
	committed was returned as UNSTABLE4 as well) it may not have flushed		committed was returned as UNSTABLE4 as well) it may not have flushed
	cached data to stable storage. The burden of recovery is on the		cached data to stable storage. The burden of recovery is on the
	client and the client will need to retransmit the data to the server.		client and the client will need to retransmit the data to the server.

	A suggested verifier would be to use the time that the server was		A suggested verifier would be to use the time that the server was

	booted or the time the server was last started (if restarting the		last started (if restarting the server results in lost buffers).
	server without a restart results in lost buffers).

	The committed field in the results allows the client to do more		The committed field in the results allows the client to do more
	effective caching. If the server is committing all WRITE requests to		effective caching. If the server is committing all WRITE requests to
	stable storage, then it should return with committed set to		stable storage, then it should return with committed set to
	FILE_SYNC4, regardless of the value of the stable field in the		FILE_SYNC4, regardless of the value of the stable field in the
	arguments. A server that uses an NVRAM accelerator may choose to		arguments. A server that uses an NVRAM accelerator may choose to
	implement this policy. The client can use this to increase the		implement this policy. The client can use this to increase the
	effectiveness of the cache by discarding cached data that has already		effectiveness of the cache by discarding cached data that has already
	been committed on the server.		been committed on the server.
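
The verifier and committed handling described in this hunk can be sketched from the client's side. This is a minimal illustration, not text from either draft: the `WriteCache` class, its method names, and the offset-keyed cache are hypothetical, and the numeric values for UNSTABLE4 and FILE_SYNC4 are assumed from the protocol's stable_how4 enumeration, which this diff does not show.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumed stable_how4 values (not shown in this diff).
UNSTABLE4 = 0
FILE_SYNC4 = 2


@dataclass
class WriteCache:
    """Hypothetical client-side record of data not yet known to be stable."""
    verifier: Optional[bytes] = None
    pending: Dict[int, bytes] = field(default_factory=dict)  # offset -> data

    def _changed(self, verifier: bytes) -> List[int]:
        # A different verifier means a different server instance, so any
        # uncommitted data sent to the old instance may have been lost.
        lost = sorted(self.pending) if self.verifier not in (None, verifier) else []
        self.verifier = verifier
        return lost

    def on_write_reply(self, offset: int, data: bytes,
                       committed: int, verifier: bytes) -> List[int]:
        """Record a WRITE reply; return offsets that must be retransmitted."""
        resend = self._changed(verifier)
        if committed == FILE_SYNC4:
            # Already on stable storage: the cached copy can be discarded.
            self.pending.pop(offset, None)
        else:
            self.pending[offset] = data
        return resend

    def on_commit_reply(self, verifier: bytes) -> List[int]:
        """Record a COMMIT reply; return offsets that must be retransmitted."""
        resend = self._changed(verifier)
        if not resend:
            self.pending.clear()
        return resend
```

The sketch keeps lost data in the cache until a later COMMIT returns the same verifier, which reflects the statement that the burden of recovery is on the client.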


	skipping to change at page 522, line 50		skipping to change at page 522, line 50
	SEQ4_STATUS_LEASE_MOVED		SEQ4_STATUS_LEASE_MOVED
	When set indicates that responsibility for lease renewal has been		When set indicates that responsibility for lease renewal has been
	transferred to one or more new servers. This condition will		transferred to one or more new servers. This condition will
	continue until the client receives an NFS4ERR_MOVED error and the		continue until the client receives an NFS4ERR_MOVED error and the
	server receives the subsequent GETATTR for the fs_locations or		server receives the subsequent GETATTR for the fs_locations or
	fs_locations_info attribute for an access to each file system for		fs_locations_info attribute for an access to each file system for
	which a lease has been moved to a new server. See		which a lease has been moved to a new server. See
	Section 11.7.7.1.		Section 11.7.7.1.

	SEQ4_STATUS_RESTART_RECLAIM_NEEDED		SEQ4_STATUS_RESTART_RECLAIM_NEEDED

	When set indicates that due to server restart or restart the		When set indicates that due to server restart the client must
	client must reclaim locking state. Until the client sends a		reclaim locking state. Until the client sends a global
	global RECLAIM_COMPLETE (Section 18.51), every SEQUENCE operation		RECLAIM_COMPLETE (Section 18.51), every SEQUENCE operation will
	will return SEQ4_STATUS_RESTART_RECLAIM_NEEDED.		return SEQ4_STATUS_RESTART_RECLAIM_NEEDED.

	SEQ4_STATUS_BACKCHANNEL_FAULT		SEQ4_STATUS_BACKCHANNEL_FAULT
	The server has encountered an unrecoverable fault with the		The server has encountered an unrecoverable fault with the
	backchannel (e.g. it has lost track of the sequence id for a slot		backchannel (e.g. it has lost track of the sequence id for a slot
	in the backchannel). The client MUST stop sending more requests		in the backchannel). The client MUST stop sending more requests
	on the session's fore channel, wait for all outstanding requests		on the session's fore channel, wait for all outstanding requests
	to complete on the fore and back channel, and then destroy the		to complete on the fore and back channel, and then destroy the
	session.		session.

	SEQ4_STATUS_DEVID_CHANGED		SEQ4_STATUS_DEVID_CHANGED
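
As a rough illustration of how a client might act on the status flags described in this hunk, the sketch below maps three of them to the client behavior the text calls for. The bit values are an assumption taken from the published NFSv4.1 XDR and are not shown in this diff; the `actions_for` function and the action labels are hypothetical.

```python
# Assumed bit values (from the published NFSv4.1 XDR, not from this diff).
SEQ4_STATUS_LEASE_MOVED = 0x00000080
SEQ4_STATUS_RESTART_RECLAIM_NEEDED = 0x00000100
SEQ4_STATUS_BACKCHANNEL_FAULT = 0x00000400


def actions_for(status_flags: int) -> list:
    """Return hypothetical action labels for the flags set in a SEQUENCE reply."""
    actions = []
    if status_flags & SEQ4_STATUS_BACKCHANNEL_FAULT:
        # Stop sending, wait for outstanding requests, then destroy the session.
        actions.append("drain-and-destroy-session")
    if status_flags & SEQ4_STATUS_RESTART_RECLAIM_NEEDED:
        # Reclaim locking state, then send a global RECLAIM_COMPLETE.
        actions.append("reclaim-locking-state")
    if status_flags & SEQ4_STATUS_LEASE_MOVED:
        # Fetch fs_locations or fs_locations_info for each moved file system.
        actions.append("fetch-fs-locations")
    return actions
```

Because the flags are independent bits, a single SEQUENCE reply may call for more than one action.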

	skipping to change at page 524, line 25		skipping to change at page 524, line 25
	The server MUST maintain a mapping of sessionid to client ID in order		The server MUST maintain a mapping of sessionid to client ID in order
	to validate any operations that follow SEQUENCE that take a stateid		to validate any operations that follow SEQUENCE that take a stateid
	as an argument and/or result.		as an argument and/or result.

	If the client establishes a persistent session, then a SEQUENCE done		If the client establishes a persistent session, then a SEQUENCE done
	after a server restart may encounter requests performed and recorded		after a server restart may encounter requests performed and recorded
	in a persistent reply cache before the server restart. In this case,		in a persistent reply cache before the server restart. In this case,
	SEQUENCE will be processed successfully, while requests which were		SEQUENCE will be processed successfully, while requests which were
	not processed previously are rejected with NFS4ERR_DEADSESSION.		not processed previously are rejected with NFS4ERR_DEADSESSION.


	Depending on the operations within the COMPOUND successfully		Depending on which of the operations within the COMPOUND were
	performed before the server restart, these operations will also have		successfully performed before the server restart, these operations
	replies sent from the server reply cache. Note that when these		will also have replies sent from the server reply cache. Note that
	operations establish locking state it is locking state that applies		when these operations establish locking state it is locking state
	to the previous server instance and to the previous client ID, even		that applies to the previous server instance and to the previous
	though the server restart, which logically happened after these		client ID, even though the server restart, which logically happened
	operations eliminated that state. In the case of a partially		after these operations, eliminated that state. In the case of a
	executed COMPOUND, processing may reach an operation not processed		partially executed COMPOUND, processing may reach an operation not
	during the earlier server instance, making this operation a new one		processed during the earlier server instance, making this operation a
	and not performable on the existing session. In this case		new one and not performable on the existing session. In this case,
	NFS4ERR_DEADSESSION will be returned from that operation.		NFS4ERR_DEADSESSION will be returned from that operation.
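
The replay behavior for a partially executed COMPOUND on a persistent session can be sketched as follows. This is a simplified model under stated assumptions: the persistent reply cache is represented as a list holding the replies recorded for this request before the restart, and `replay_compound` is a hypothetical helper, not a function defined by the protocol.

```python
def replay_compound(ops, cached_replies):
    """Replay a retried COMPOUND against replies recorded before a restart.

    ops            -- operation names in the retried COMPOUND, in order
    cached_replies -- (op, status) pairs recorded by the earlier server instance
    """
    results = []
    for index, op in enumerate(ops):
        if index < len(cached_replies):
            # Performed before the restart: answer from the reply cache.
            results.append(cached_replies[index])
        else:
            # A new operation cannot be performed on the existing session.
            results.append((op, "NFS4ERR_DEADSESSION"))
            break
    return results
```

With an empty cache the first operation, SEQUENCE itself, is rejected, which matches the rule that requests not processed previously fail with NFS4ERR_DEADSESSION.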

	18.47. Operation 54: SET_SSV - Update SSV for a Client ID		18.47. Operation 54: SET_SSV - Update SSV for a Client ID

	18.47.1. ARGUMENT		18.47.1. ARGUMENT

	struct ssa_digest_input4 {		struct ssa_digest_input4 {
	SEQUENCE4args sdi_seqargs;		SEQUENCE4args sdi_seqargs;
	};		};


	skipping to change at page 529, line 14		skipping to change at page 529, line 14

	18.49.1. ARGUMENT		18.49.1. ARGUMENT

	union deleg_claim4 switch (open_claim_type4 dc_claim) {		union deleg_claim4 switch (open_claim_type4 dc_claim) {
	/*		/*
	* No special rights to object. Ordinary delegation		* No special rights to object. Ordinary delegation
	* request of the specified object. Object identified		* request of the specified object. Object identified
	* by filehandle.		* by filehandle.
	*/		*/
	case CLAIM_FH: /* new to v4.1 */		case CLAIM_FH: /* new to v4.1 */

			/* CURRENT_FH: object being delegated */
	void;		void;

	/*		/*
	* Right to file based on a delegation granted		* Right to file based on a delegation granted
	* to a previous boot instance of the client.		* to a previous boot instance of the client.
	* File is specified by filehandle.		* File is specified by filehandle.
	*/		*/
	case CLAIM_DELEG_PREV_FH: /* new to v4.1 */		case CLAIM_DELEG_PREV_FH: /* new to v4.1 */
	/* CURRENT_FH: object being delegated */		/* CURRENT_FH: object being delegated */
	void;		void;

	skipping to change at page 530, line 21		skipping to change at page 530, line 21
	This operation allows a client to		This operation allows a client to

	o get a delegation on all types of files except directories. The		o get a delegation on all types of files except directories. The
	server MAY support this operation. If the server does not support		server MAY support this operation. If the server does not support
	this operation, it MUST return NFS4ERR_NOTSUPP.		this operation, it MUST return NFS4ERR_NOTSUPP.

	o register a "want" for a delegation for the specified file object,		o register a "want" for a delegation for the specified file object,
	and be notified via a callback when the delegation is available.		and be notified via a callback when the delegation is available.
	The server MAY support notifications of availability via		The server MAY support notifications of availability via
	callbacks. If the server does not support registration of wants		callbacks. If the server does not support registration of wants

	it MUST NOT return an error to indicate that. When the server		it MUST NOT return an error to indicate that, and instead MUST
	indicates that it will notify the server by means of a callback,		return ond_why set to WND4_CONTENTION or WND4_RESOURCE and
	it will either provide the delegation using a CB_PUSH_DELEG		ond_server_will_push_deleg or ond_server_will_signal_avail set to
	operation, or cancel its promise by sending a CB_WANTS_CANCELLED		FALSE. When the server indicates that it will notify the client
	operation.		by means of a callback, it will either provide the delegation
			using a CB_PUSH_DELEG operation, or cancel its promise by sending
			a CB_WANTS_CANCELLED operation.

	o cancel a want for a delegation.		o cancel a want for a delegation.
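
The revised wording on registering a want implies a simple client-side check: the client can expect a later callback only when the server's reply says so. The sketch below is illustrative only. The `expects_callback` function is hypothetical, the reason codes are modeled as strings rather than their XDR values, and the pairing of each reason with one boolean field is an assumption based on the field names in the new text.

```python
def expects_callback(ond_why: str, will_notify: bool) -> bool:
    """Return True if the server has promised a later callback for this want.

    ond_why     -- "WND4_CONTENTION" or "WND4_RESOURCE" (modeled as strings)
    will_notify -- ond_server_will_push_deleg for WND4_CONTENTION, or
                   ond_server_will_signal_avail for WND4_RESOURCE
    """
    if ond_why not in ("WND4_CONTENTION", "WND4_RESOURCE"):
        return False
    # A server that does not support registration of wants reports FALSE
    # here rather than returning an error.
    return will_notify
```

A client that receives FALSE should not wait for CB_PUSH_DELEG or CB_WANTS_CANCELLED for that want.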

	The client SHOULD NOT set OPEN4_SHARE_ACCESS_READ and SHOULD NOT set		The client SHOULD NOT set OPEN4_SHARE_ACCESS_READ and SHOULD NOT set
	OPEN4_SHARE_ACCESS_WRITE in wda_want. If it does, the server MUST		OPEN4_SHARE_ACCESS_WRITE in wda_want. If it does, the server MUST
	ignore them.		ignore them.

	The meanings of the following flags in wda_want are the same as they		The meanings of the following flags in wda_want are the same as they
	are in OPEN:		are in OPEN:


	skipping to change at page 556, line 41		skipping to change at page 556, line 41
	protocol error must result. See Section 18.46.3 for a description of		protocol error must result. See Section 18.46.3 for a description of
	how slots are processed.		how slots are processed.

	If csa_cachethis is TRUE, then the server is requesting that the		If csa_cachethis is TRUE, then the server is requesting that the
	client cache the reply in the callback reply cache. The client MUST		client cache the reply in the callback reply cache. The client MUST
	cache the reply (see Section 2.10.5.1.3).		cache the reply (see Section 2.10.5.1.3).

	The csa_referring_call_lists array is the list of COMPOUND requests,		The csa_referring_call_lists array is the list of COMPOUND requests,
	identified by sessionid, slot id and sequencid. These are requests		identified by sessionid, slot id and sequencid. These are requests
	that the client previously sent to the server. These previous		that the client previously sent to the server. These previous

	requests created state that some operation(s) in the in the same		requests created state that some operation(s) in the same CB_COMPOUND
	CB_COMPOUND as the csa_referring_call_lists is identifying. A		as the csa_referring_call_lists is identifying. A sessionid is
	sessionid is included because leased state is tied to a client ID,		included because leased state is tied to a client ID, and a client ID
	and a client ID can have multiple sessions. See Section 2.10.5.3.		can have multiple sessions. See Section 2.10.5.3.

	The value of csa_sequenceid argument relative to the cached sequence		The value of csa_sequenceid argument relative to the cached sequence
	id on the slot falls into one of three cases.		id on the slot falls into one of three cases.

	o If the difference between csa_sequenceid and the client's cached		o If the difference between csa_sequenceid and the client's cached
	sequence id at the slot id is two (2) or more, or if		sequence id at the slot id is two (2) or more, or if
	csa_sequenceid is less than the cached sequence id (accounting for		csa_sequenceid is less than the cached sequence id (accounting for
	wraparound of the unsigned sequence id value), then the client		wraparound of the unsigned sequence id value), then the client
	MUST return NFS4ERR_SEQ_MISORDERED.		MUST return NFS4ERR_SEQ_MISORDERED.
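
The sequence id comparison can be sketched with unsigned 32-bit arithmetic. Only the misordered case appears in this excerpt; the other two outcomes (an equal value is a retry, a value one greater is a new request) are stated here as an assumption drawn from the slot processing rules of the protocol. The `classify` function and its return labels are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # sequence ids are unsigned 32-bit values


def classify(csa_sequenceid: int, cached_sequenceid: int) -> str:
    """Classify a callback request against the slot's cached sequence id."""
    # Modular subtraction accounts for wraparound: a value "less than" the
    # cached id shows up as a very large difference.
    delta = (csa_sequenceid - cached_sequenceid) & MASK32
    if delta == 0:
        return "retry"
    if delta == 1:
        return "new"
    return "NFS4ERR_SEQ_MISORDERED"
```

For example, a cached id of 0xFFFFFFFF followed by a csa_sequenceid of 0 is treated as the next request in sequence, not as a misordered one.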


End of changes. 43 change blocks.
	116 lines changed or deleted		135 lines changed or added