draft-pre-record-byte-range.txt | draft-ietf-nfsv4-minorversion1-22.txt | |||
---|---|---|---|---|
skipping to change at page 14, line 5 | skipping to change at page 14, line 5 | |||
is irrevocably granted a lock. At the end of a lease period the | is irrevocably granted a lock. At the end of a lease period the | |||
lock may be revoked if the lease has not been extended. The lock | lock may be revoked if the lease has not been extended. The lock | |||
must be revoked if a conflicting lock has been granted after the | must be revoked if a conflicting lock has been granted after the | |||
lease interval. | lease interval. | |||
All leases granted by a server have the same fixed interval. Note | All leases granted by a server have the same fixed interval. Note | |||
that the fixed interval was chosen to alleviate the expense a | that the fixed interval was chosen to alleviate the expense a | |||
server would have in maintaining state about variable length | server would have in maintaining state about variable length | |||
leases across server failures. | leases across server failures. | |||
Lock The term "lock" is used to refer to record (byte-range) locks, | Lock The term "lock" is used to refer to byte-range (in UNIX | |||
share reservations, delegations, or layouts unless specifically | environments, also known as record) locks, share reservations, | |||
stated otherwise. | delegations, or layouts unless specifically stated otherwise. | |||
Server The "Server" is the entity responsible for coordinating | Server The "Server" is the entity responsible for coordinating | |||
client access to a set of file systems and is identified by a | client access to a set of file systems and is identified by a | |||
Server owner. A server can span multiple network addresses. | Server owner. A server can span multiple network addresses. | |||
Server Owner The "Server Owner" identifies the server to the client. | Server Owner The "Server Owner" identifies the server to the client. | |||
The server owner consists of a major and minor identifier. When | The server owner consists of a major and minor identifier. When | |||
the client has two connections each to a peer with the same major | the client has two connections each to a peer with the same major | |||
identifier, the client assumes both peers are the same server (the | identifier, the client assumes both peers are the same server (the | |||
server namespace is the same via each connection), and assumes and | server namespace is the same via each connection), and assumes and | |||
skipping to change at page 147, line 36 | skipping to change at page 147, line 36 | |||
Where there is concern about the security of data on the network, | Where there is concern about the security of data on the network, | |||
clients should use strong security mechanisms to access the pseudo | clients should use strong security mechanisms to access the pseudo | |||
file system in order to prevent man-in-the-middle attacks. | file system in order to prevent man-in-the-middle attacks. | |||
8. State Management | 8. State Management | |||
Integrating locking into the NFS protocol necessarily causes it to be | Integrating locking into the NFS protocol necessarily causes it to be | |||
stateful. With the inclusion of such features as share reservations, | stateful. With the inclusion of such features as share reservations, | |||
file and directory delegations, recallable layouts, and support for | file and directory delegations, recallable layouts, and support for | |||
mandatory record locking, the protocol becomes substantially more | mandatory byte-range locking, the protocol becomes substantially more | |||
dependent on proper management of state than the traditional | dependent on proper management of state than the traditional | |||
combination of NFS and NLM [36]. These features include expanded | combination of NFS and NLM [36]. These features include expanded | |||
locking facilities, which provide some measure of interclient | locking facilities, which provide some measure of interclient | |||
exclusion, but the state also offers features not readily providable | exclusion, but the state also offers features not readily providable | |||
using a stateless model. There are three components to making this | using a stateless model. There are three components to making this | |||
state manageable: | state manageable: | |||
o Clear division between client and server | o Clear division between client and server | |||
o Ability to reliably detect inconsistency in state between client | o Ability to reliably detect inconsistency in state between client | |||
skipping to change at page 148, line 41 | skipping to change at page 148, line 41 | |||
For some types of locking interactions, the client will represent | For some types of locking interactions, the client will represent | |||
some number of internal locking entities called "owners", which | some number of internal locking entities called "owners", which | |||
normally correspond to processes internal to the client. For other | normally correspond to processes internal to the client. For other | |||
types of locking-related objects, such as delegations and layouts, no | types of locking-related objects, such as delegations and layouts, no | |||
such intermediate entities are provided for, and the locking-related | such intermediate entities are provided for, and the locking-related | |||
objects are considered to be transferred directly between the server | objects are considered to be transferred directly between the server | |||
and a unitary client. | and a unitary client. | |||
8.2. Stateid Definition | 8.2. Stateid Definition | |||
When the server grants a lock of any type (including opens, record | When the server grants a lock of any type (including opens, byte- | |||
locks, delegations, and layouts) it responds with a unique stateid, | range locks, delegations, and layouts) it responds with a unique | |||
that represents a set of locks (often a single lock) for the same | stateid, that represents a set of locks (often a single lock) for the | |||
file, of the same type, and sharing the same ownership | same file, of the same type, and sharing the same ownership | |||
characteristics. Thus opens of the same file by different open- | characteristics. Thus opens of the same file by different open- | |||
owners each have an identifying stateid. Similarly, each set of | owners each have an identifying stateid. Similarly, each set of | |||
record locks on a file owned by a specific lock-owner has its own | byte-range locks on a file owned by a specific lock-owner has its own | |||
identifying stateid. Delegations and layouts also have associated | identifying stateid. Delegations and layouts also have associated | |||
stateids by which they may be referenced. The stateid is used as a | stateids by which they may be referenced. The stateid is used as a | |||
shorthand reference to a lock or set of locks and given a stateid the | shorthand reference to a lock or set of locks and given a stateid the | |||
server can determine the associated state-owner or state-owners (in | server can determine the associated state-owner or state-owners (in | |||
the case of an open-owner/lock-owner pair) and the associated | the case of an open-owner/lock-owner pair) and the associated | |||
filehandle. When stateids are used, the current filehandle must be | filehandle. When stateids are used, the current filehandle must be | |||
the one associated with that stateid. | the one associated with that stateid. | |||
All stateids associated with a given client ID are associated with a | All stateids associated with a given client ID are associated with a | |||
common lease which represents the claim of those stateids and the | common lease which represents the claim of those stateids and the | |||
skipping to change at page 153, line 14 | skipping to change at page 153, line 14 | |||
the operation to which the stateid is passed will return | the operation to which the stateid is passed will return | |||
NFS4ERR_BAD_STATEID. | NFS4ERR_BAD_STATEID. | |||
8.2.4. Stateid Lifetime and Validation | 8.2.4. Stateid Lifetime and Validation | |||
Stateids must remain valid until either a client restart or a server | Stateids must remain valid until either a client restart or a server | |||
restart or until the client returns all of the locks associated with | restart or until the client returns all of the locks associated with | |||
the stateid by means of an operation such as CLOSE or DELEGRETURN. | the stateid by means of an operation such as CLOSE or DELEGRETURN. | |||
If the locks are lost due to revocation the stateid remains a valid | If the locks are lost due to revocation the stateid remains a valid | |||
designation of that revoked state until the client frees it by using | designation of that revoked state until the client frees it by using | |||
FREE_STATEID. Stateids associated with record locks are an | FREE_STATEID. Stateids associated with byte-range locks are an | |||
exception. They remain valid even if a LOCKU frees all remaining | exception. They remain valid even if a LOCKU frees all remaining | |||
locks, so long as the open file with which they are associated | locks, so long as the open file with which they are associated | |||
remains open, unless the client does a FREE_STATEID to cause the | remains open, unless the client does a FREE_STATEID to cause the | |||
stateid to be freed. | stateid to be freed. | |||
It should be noted that there are situations in which the client's | It should be noted that there are situations in which the client's | |||
locks become invalid, without the client requesting they be returned. | locks become invalid, without the client requesting they be returned. | |||
These include lease expiration and a number of forms of lock | These include lease expiration and a number of forms of lock | |||
revocation within the lease period. It is important to note that in | revocation within the lease period. It is important to note that in | |||
these situations, the stateid remains valid and the client can use it | these situations, the stateid remains valid and the client can use it | |||
skipping to change at page 154, line 5 | skipping to change at page 154, line 5 | |||
And then store in each table entry, | And then store in each table entry, | |||
o The client ID with which the stateid is associated. | o The client ID with which the stateid is associated. | |||
o The current generation number for the (at most one) valid stateid | o The current generation number for the (at most one) valid stateid | |||
sharing this index value. | sharing this index value. | |||
o The filehandle of the file on which the locks are taken. | o The filehandle of the file on which the locks are taken. | |||
o An indication of the type of stateid (open, record lock, file | o An indication of the type of stateid (open, byte-range lock, file | |||
delegation, directory delegation, layout). | delegation, directory delegation, layout). | |||
o The last "seqid" value returned corresponding to the current | o The last "seqid" value returned corresponding to the current | |||
"other" value. | "other" value. | |||
o An indication of the current status of the locks associated with | o An indication of the current status of the locks associated with | |||
this stateid. In particular, whether these have been revoked and | this stateid. In particular, whether these have been revoked and | |||
if so, for what reason. | if so, for what reason. | |||
With this information, an incoming stateid can be validated and the | With this information, an incoming stateid can be validated and the | |||
skipping to change at page 160, line 30 | skipping to change at page 160, line 30 | |||
establishes its lease before expiration occurs, requests for | establishes its lease before expiration occurs, requests for | |||
conflicting locks will not be granted. | conflicting locks will not be granted. | |||
To minimize client delay upon restart, lock requests are associated | To minimize client delay upon restart, lock requests are associated | |||
with an instance of the client by a client-supplied verifier. This | with an instance of the client by a client-supplied verifier. This | |||
verifier is part of the client_owner4 sent in the initial EXCHANGE_ID | verifier is part of the client_owner4 sent in the initial EXCHANGE_ID | |||
call made by the client. The server returns a client ID as a result | call made by the client. The server returns a client ID as a result | |||
of the EXCHANGE_ID operation. The client then confirms the use of | of the EXCHANGE_ID operation. The client then confirms the use of | |||
the client ID by establishing a session associated with that client | the client ID by establishing a session associated with that client | |||
ID (see Section 18.36.3 for a description of how this is done). All | ID (see Section 18.36.3 for a description of how this is done). All | |||
locks, including opens, record locks, delegations, and layouts | locks, including opens, byte-range locks, delegations, and layouts | |||
obtained by sessions using that client ID are associated with that | obtained by sessions using that client ID are associated with that | |||
client ID. | client ID. | |||
Since the verifier will be changed by the client upon each | Since the verifier will be changed by the client upon each | |||
initialization, the server can compare a new verifier to the verifier | initialization, the server can compare a new verifier to the verifier | |||
associated with currently held locks and determine that they do not | associated with currently held locks and determine that they do not | |||
match. This signifies the client's new instantiation and subsequent | match. This signifies the client's new instantiation and subsequent | |||
loss (upon confirmation of the new client ID) of locking state. As a | loss (upon confirmation of the new client ID) of locking state. As a | |||
result, the server is free to release all locks held which are | result, the server is free to release all locks held which are | |||
associated with the old client ID which was derived from the old | associated with the old client ID which was derived from the old | |||
skipping to change at page 163, line 38 | skipping to change at page 163, line 38 | |||
For a server to provide simple, valid handling during the grace | For a server to provide simple, valid handling during the grace | |||
period, the easiest method is to simply reject all non-reclaim | period, the easiest method is to simply reject all non-reclaim | |||
locking requests and READ and WRITE operations by returning the | locking requests and READ and WRITE operations by returning the | |||
NFS4ERR_GRACE error. However, a server may keep information about | NFS4ERR_GRACE error. However, a server may keep information about | |||
granted locks in stable storage. With this information, the server | granted locks in stable storage. With this information, the server | |||
could determine if a regular lock or READ or WRITE operation can be | could determine if a regular lock or READ or WRITE operation can be | |||
safely processed. | safely processed. | |||
For example, if the server maintained on stable storage summary | For example, if the server maintained on stable storage summary | |||
information on whether mandatory locks exist, either mandatory record | information on whether mandatory locks exist, either mandatory byte- | |||
locks, or share reservations specifying deny modes, many requests | range locks, or share reservations specifying deny modes, many | |||
could be allowed during the grace period. If it is known that no | requests could be allowed during the grace period. If it is known | |||
such share reservations exist, OPEN requests that do not specify deny | that no such share reservations exist, OPEN requests that do not | |||
modes may be safely granted. If, in addition, it is known that no | specify deny modes may be safely granted. If, in addition, it is | |||
mandatory record locks exist, either through information stored on | known that no mandatory byte-range locks exist, either through | |||
stable storage or simply because the server does not support such | information stored on stable storage or simply because the server | |||
locks, READ and WRITE requests may be safely processed during the | does not support such locks, READ and WRITE requests may be safely | |||
grace period. Another important case is where it is known that no | processed during the grace period. Another important case is where | |||
mandatory byte-range locks exist, either because the server does not | it is known that no mandatory byte-range locks exist, either because | |||
provide support for them, or because their absence is known from | the server does not provide support for them, or because their | |||
persistently recorded data. In this case, READ and WRITE operations | absence is known from persistently recorded data. In this case, READ | |||
specifying stateids derived from reclaim-type operations may be | and WRITE operations specifying stateids derived from reclaim-type | |||
validly processed during the grace period because the fact of the | operations may be validly processed during the grace period because | |||
valid reclaim ensures that no lock subsequently granted can prevent | the fact of the valid reclaim ensures that no lock subsequently | |||
the I/O. | granted can prevent the I/O. | |||
To reiterate, for a server that allows non-reclaim lock and I/O | To reiterate, for a server that allows non-reclaim lock and I/O | |||
requests to be processed during the grace period, it MUST determine | requests to be processed during the grace period, it MUST determine | |||
that no lock subsequently reclaimed will be rejected and that no lock | that no lock subsequently reclaimed will be rejected and that no lock | |||
subsequently reclaimed would have prevented any I/O operation | subsequently reclaimed would have prevented any I/O operation | |||
processed during the grace period. | processed during the grace period. | |||
Clients should be prepared for the return of NFS4ERR_GRACE errors for | Clients should be prepared for the return of NFS4ERR_GRACE errors for | |||
non-reclaim lock and I/O requests. In this case the client should | non-reclaim lock and I/O requests. In this case the client should | |||
employ a retry mechanism for the request. A delay (on the order of | employ a retry mechanism for the request. A delay (on the order of | |||
skipping to change at page 167, line 41 | skipping to change at page 167, line 41 | |||
reclaims, requires that the server record in stable storage | reclaims, requires that the server record in stable storage | |||
some minimal information. For example, a server | some minimal information. For example, a server | |||
implementation could, for each client, save in stable storage a | implementation could, for each client, save in stable storage a | |||
record containing: | record containing: | |||
o the co_ownerid field from the client_owner4 presented in the | o the co_ownerid field from the client_owner4 presented in the | |||
EXCHANGE_ID operation. | EXCHANGE_ID operation. | |||
o a boolean that indicates if the client's lease expired or if there | o a boolean that indicates if the client's lease expired or if there | |||
was administrative intervention (see Section 8.5) to revoke a | was administrative intervention (see Section 8.5) to revoke a | |||
record lock, share reservation, or delegation and there has been | byte-range lock, share reservation, or delegation and there has | |||
no acknowledgement, via FREE_STATEID, of such revocation. | been no acknowledgement, via FREE_STATEID, of such revocation. | |||
o a boolean that indicates whether the client may have locks that it | o a boolean that indicates whether the client may have locks that it | |||
believes to be reclaimable in situations in which the grace period | believes to be reclaimable in situations in which the grace period | |||
was terminated, making the server's view of lock reclaimability | was terminated, making the server's view of lock reclaimability | |||
suspect. The server will set this for any client record in stable | suspect. The server will set this for any client record in stable | |||
storage where the client has not done a suitable RECLAIM_COMPLETE | storage where the client has not done a suitable RECLAIM_COMPLETE | |||
(global or file system-specific depending on the target of the | (global or file system-specific depending on the target of the | |||
lock request) before it grants any new (i.e. not reclaimed) lock | lock request) before it grants any new (i.e. not reclaimed) lock | |||
to any client. | to any client. | |||
Assuming the above record keeping, for the first edge condition, | Assuming the above record keeping, for the first edge condition, | |||
after the server restarts, the record that client A's lease expired | after the server restarts, the record that client A's lease expired | |||
means that another client could have acquired a conflicting record | means that another client could have acquired a conflicting byte- | |||
lock, share reservation, or delegation. Hence the server must reject | range lock, share reservation, or delegation. Hence the server must | |||
a reclaim from client A with the error NFS4ERR_NO_GRACE. | reject a reclaim from client A with the error NFS4ERR_NO_GRACE. | |||
For the second edge condition, after the server restarts for a second | For the second edge condition, after the server restarts for a second | |||
time, the indication that the client had not completed its reclaims | time, the indication that the client had not completed its reclaims | |||
at the time at which the grace period ended means that the server | at the time at which the grace period ended means that the server | |||
must reject a reclaim from client A with the error NFS4ERR_NO_GRACE. | must reject a reclaim from client A with the error NFS4ERR_NO_GRACE. | |||
When either edge condition occurs, the client's attempt to reclaim | When either edge condition occurs, the client's attempt to reclaim | |||
locks will result in the error NFS4ERR_NO_GRACE. When this is | locks will result in the error NFS4ERR_NO_GRACE. When this is | |||
received, or after the client restarts with no lock state, the client | received, or after the client restarts with no lock state, the client | |||
will send a global RECLAIM_COMPLETE. When the RECLAIM_COMPLETE is | will send a global RECLAIM_COMPLETE. When the RECLAIM_COMPLETE is | |||
received, the server and client are again in agreement regarding | received, the server and client are again in agreement regarding | |||
reclaimable locks and both booleans in persistent storage can be | reclaimable locks and both booleans in persistent storage can be | |||
reset, to be set again only when there is a subsequent event that | reset, to be set again only when there is a subsequent event that | |||
causes lock reclaim operations to be questionable. | causes lock reclaim operations to be questionable. | |||
Regardless of the level and approach to record keeping, the server | Regardless of the level and approach to record keeping, the server | |||
MUST implement one of the following strategies (which apply to | MUST implement one of the following strategies (which apply to | |||
reclaims of share reservations, record locks, and delegations): | reclaims of share reservations, byte-range locks, and delegations): | |||
1. Reject all reclaims with NFS4ERR_NO_GRACE. This is extremely | 1. Reject all reclaims with NFS4ERR_NO_GRACE. This is extremely | |||
unforgiving, but necessary if the server does not record lock | unforgiving, but necessary if the server does not record lock | |||
state in stable storage. | state in stable storage. | |||
2. Record sufficient state in stable storage such that all known | 2. Record sufficient state in stable storage such that all known | |||
edge conditions involving server restart, including the two noted | edge conditions involving server restart, including the two noted | |||
in this section, are detected. It is acceptable to erroneously | in this section, are detected. It is acceptable to erroneously | |||
recognize an edge condition and not allow a reclaim, when, with | recognize an edge condition and not allow a reclaim, when, with | |||
sufficient knowledge it would be allowed. The error the server | sufficient knowledge it would be allowed. The error the server | |||
skipping to change at page 169, line 8 | skipping to change at page 169, line 8 | |||
outside the scope of this specification, since the strategies for | outside the scope of this specification, since the strategies for | |||
such handling are very dependent on the client's operating | such handling are very dependent on the client's operating | |||
environment. However, one potential approach is described below. | environment. However, one potential approach is described below. | |||
When the client receives NFS4ERR_NO_GRACE, it could examine the | When the client receives NFS4ERR_NO_GRACE, it could examine the | |||
change attribute of the objects the client is trying to reclaim state | change attribute of the objects the client is trying to reclaim state | |||
for, and use that to determine whether to re-establish the state via | for, and use that to determine whether to re-establish the state via | |||
normal OPEN or LOCK requests. This is acceptable provided the | normal OPEN or LOCK requests. This is acceptable provided the | |||
client's operating environment allows it. In other words, the client | client's operating environment allows it. In other words, the client | |||
implementor is advised to document for his users the behavior. The | implementor is advised to document for his users the behavior. The | |||
client could also inform the application that its record lock or | client could also inform the application that its byte-range lock or | |||
share reservations (whether they were delegated or not) have been | share reservations (whether they were delegated or not) have been | |||
lost, such as via a UNIX signal, a GUI pop-up window, etc. See | lost, such as via a UNIX signal, a GUI pop-up window, etc. See | |||
Section 10.5 for a discussion of what the client should do for | Section 10.5 for a discussion of what the client should do for | |||
dealing with unreclaimed delegations on client state. | dealing with unreclaimed delegations on client state. | |||
For further discussion of revocation of locks see Section 8.5. | For further discussion of revocation of locks see Section 8.5. | |||
8.5. Server Revocation of Locks | 8.5. Server Revocation of Locks | |||
At any point, the server can revoke locks held by a client and the | At any point, the server can revoke locks held by a client and the | |||
skipping to change at page 172, line 38 | skipping to change at page 172, line 38 | |||
It is assumed that manipulating a byte-range lock is rare when | It is assumed that manipulating a byte-range lock is rare when | |||
compared to READ and WRITE operations. It is also assumed that | compared to READ and WRITE operations. It is also assumed that | |||
server restarts and network partitions are relatively rare. | server restarts and network partitions are relatively rare. | |||
Therefore it is important that the READ and WRITE operations have a | Therefore it is important that the READ and WRITE operations have a | |||
lightweight mechanism to indicate if they possess a held lock. A | lightweight mechanism to indicate if they possess a held lock. A | |||
byte-range lock request contains the heavyweight information required | byte-range lock request contains the heavyweight information required | |||
to establish a lock and uniquely define the owner of the lock. | to establish a lock and uniquely define the owner of the lock. | |||
9.1.1. State-owner Definition | 9.1.1. State-owner Definition | |||
When opening a file or requesting a record lock, the client must | When opening a file or requesting a byte-range lock, the client must | |||
specify an identifier which represents the owner of the requested | specify an identifier which represents the owner of the requested | |||
lock. This identifier is in the form of a state-owner, represented | lock. This identifier is in the form of a state-owner, represented | |||
in the protocol by a state_owner4, a variable-length opaque array | in the protocol by a state_owner4, a variable-length opaque array | |||
which, when concatenated with the current client ID uniquely defines | which, when concatenated with the current client ID uniquely defines | |||
the owner of a lock managed by the client. This may be a thread id, | the owner of a lock managed by the client. This may be a thread id, | |||
process id, or other unique value. | process id, or other unique value. | |||
Owners of opens and owners of record locks are separate entities and | Owners of opens and owners of byte-range locks are separate entities | |||
remain separate even if the same opaque arrays are used to designate | and remain separate even if the same opaque arrays are used to | |||
owners of each. The protocol distinguishes between open-owners | designate owners of each. The protocol distinguishes between open- | |||
(represented by open_owner4 structures) and lock-owners (represented | owners (represented by open_owner4 structures) and lock-owners | |||
by lock_owner4 structures). | (represented by lock_owner4 structures). | |||
Each open is associated with a specific open-owner while each record | Each open is associated with a specific open-owner while each byte- | |||
lock is associated with a lock-owner and an open-owner, the latter | range lock is associated with a lock-owner and an open-owner, the | |||
being the open-owner associated with the open file under which the | latter being the open-owner associated with the open file under which | |||
LOCK operation was done. Delegations and layouts, on the other hand, | the LOCK operation was done. Delegations and layouts, on the other | |||
are not associated with a specific owner but are associated with the | hand, are not associated with a specific owner but are associated | |||
client as a whole (identified by a client ID). | with the client as a whole (identified by a client ID). | |||
9.1.2. Use of the Stateid and Locking | 9.1.2. Use of the Stateid and Locking | |||
All READ, WRITE and SETATTR operations contain a stateid. For the | All READ, WRITE and SETATTR operations contain a stateid. For the | |||
purposes of this section, SETATTR operations which change the size | purposes of this section, SETATTR operations which change the size | |||
attribute of a file are treated as if they are writing the area | attribute of a file are treated as if they are writing the area | |||
between the old and new size (i.e. the range truncated or added to | between the old and new size (i.e. the range truncated or added to | |||
the file by means of the SETATTR), even where SETATTR is not | the file by means of the SETATTR), even where SETATTR is not | |||
explicitly mentioned in the text. The stateid passed to one of these | explicitly mentioned in the text. The stateid passed to one of these | |||
operations must be one that represents an open, a set of byte-range | operations must be one that represents an open, a set of byte-range | |||
locks, or a delegation, or it may be a special stateid representing | locks, or a delegation, or it may be a special stateid representing | |||
anonymous access or the special bypass stateid. | anonymous access or the special bypass stateid. | |||
If the state-owner performs a READ or WRITE in a situation in which | If the state-owner performs a READ or WRITE in a situation in which | |||
it has established a byte-range lock or share reservation on the | it has established a byte-range lock or share reservation on the | |||
server (any OPEN constitutes a share reservation) the stateid | server (any OPEN constitutes a share reservation) the stateid | |||
(previously returned by the server) must be used to indicate what | (previously returned by the server) must be used to indicate what | |||
locks, including both record locks and share reservations, are held | locks, including both byte-range locks and share reservations, are | |||
by the state-owner. If no state is established by the client, either | held by the state-owner. If no state is established by the client, | |||
record lock or share reservation, a special stateid for anonymous | either byte-range lock or share reservation, a special stateid for | |||
state (zero as "other" and "seqid") is used. (See Section 8.2.3 for | anonymous state (zero as "other" and "seqid") is used. (See | |||
a description of 'special' stateids in general.) Regardless of whether | Section 8.2.3 for a description of 'special' stateids in general.) | |||
a stateid for anonymous state or a stateid returned by the server is | Regardless of whether a stateid for anonymous state or a stateid | |||
used, if there is a conflicting share reservation or mandatory record | returned by the server is used, if there is a conflicting share | |||
lock held on the file, the server MUST refuse to service the READ or | reservation or mandatory byte-range lock held on the file, the server | |||
WRITE operation. | MUST refuse to service the READ or WRITE operation. | |||
Share reservations are established by OPEN operations and by their | Share reservations are established by OPEN operations and by their | |||
nature are mandatory in that when the OPEN denies READ or WRITE | nature are mandatory in that when the OPEN denies READ or WRITE | |||
operations, that denial results in such operations being rejected | operations, that denial results in such operations being rejected | |||
with error NFS4ERR_LOCKED. Record locks may be implemented by the | with error NFS4ERR_LOCKED. Byte-range locks may be implemented by | |||
server as either mandatory or advisory, or the choice of mandatory or | the server as either mandatory or advisory, or the choice of | |||
advisory behavior may be determined by the server on the basis of the | mandatory or advisory behavior may be determined by the server on the | |||
file being accessed (for example, some UNIX-based servers support a | basis of the file being accessed (for example, some UNIX-based | |||
"mandatory lock bit" on the mode attribute such that if set, record | servers support a "mandatory lock bit" on the mode attribute such | |||
locks are required on the file before I/O is possible). When record | that if set, byte-range locks are required on the file before I/O is | |||
locks are advisory, they only prevent the granting of conflicting | possible). When byte-range locks are advisory, they only prevent the | |||
lock requests and have no effect on READs or WRITEs. Mandatory | granting of conflicting lock requests and have no effect on READs or | |||
record locks, however, prevent conflicting I/O operations. When they | WRITEs. Mandatory byte-range locks, however, prevent conflicting I/O | |||
are attempted, they are rejected with NFS4ERR_LOCKED. When the | operations. When they are attempted, they are rejected with | |||
client gets NFS4ERR_LOCKED on a file it knows it has the proper share | NFS4ERR_LOCKED. When the client gets NFS4ERR_LOCKED on a file it | |||
reservation for, it will need to send a LOCK request on the region of | knows it has the proper share reservation for, it will need to send a | |||
the file that includes the region the I/O was to be performed on, | LOCK request on the region of the file that includes the region the | |||
with an appropriate locktype (i.e. READ*_LT for a READ operation, | I/O was to be performed on, with an appropriate locktype (i.e. | |||
WRITE*_LT for a WRITE operation). | READ*_LT for a READ operation, WRITE*_LT for a WRITE operation). | |||
Note that for UNIX environments that support mandatory file locking, | Note that for UNIX environments that support mandatory file locking, | |||
the distinction between advisory and mandatory locking is subtle. In | the distinction between advisory and mandatory locking is subtle. In | |||
fact, advisory and mandatory record locks are exactly the same in so | fact, advisory and mandatory byte-range locks are exactly the same in | |||
far as the APIs and requirements on implementation. If the mandatory | so far as the APIs and requirements on implementation. If the | |||
lock attribute is set on the file, the server checks to see if the | mandatory lock attribute is set on the file, the server checks to see | |||
lock-owner has an appropriate shared (read) or exclusive (write) | if the lock-owner has an appropriate shared (read) or exclusive | |||
record lock on the region it wishes to read or write to. If there is | (write) byte-range lock on the region it wishes to read or write to. | |||
no appropriate lock, the server checks if there is a conflicting lock | If there is no appropriate lock, the server checks if there is a | |||
(which can be done by attempting to acquire the conflicting lock on | conflicting lock (which can be done by attempting to acquire the | |||
behalf of the lock-owner, and if successful, release the lock after | conflicting lock on behalf of the lock-owner, and if successful, | |||
the READ or WRITE is done), and if there is, the server returns | release the lock after the READ or WRITE is done), and if there is, | |||
NFS4ERR_LOCKED. | the server returns NFS4ERR_LOCKED. | |||
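The check described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the behavior of any particular server: the range_lock structure, the function names, and the IO_ERR_LOCKED constant (standing in for NFS4ERR_LOCKED) are hypothetical. The sketch also collapses the two steps in the text into a single scan for conflicting locks held by other lock-owners.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative status codes; placeholders, not the protocol's wire values. */
enum io_status { IO_OK, IO_ERR_LOCKED };

enum lock_kind { LOCK_SHARED, LOCK_EXCLUSIVE };

/* Hypothetical server-side record of one byte-range lock. */
struct range_lock {
    uint64_t owner;       /* opaque lock-owner identity (assumed) */
    uint64_t offset;      /* first byte covered */
    uint64_t length;      /* number of bytes covered, assumed > 0 */
    enum lock_kind kind;
};

/* True when the two ranges share at least one byte (overflow-safe). */
static bool ranges_overlap(uint64_t a_off, uint64_t a_len,
                           uint64_t b_off, uint64_t b_len)
{
    return (a_off >= b_off) ? (a_off - b_off < b_len)
                            : (b_off - a_off < a_len);
}

/* Decide whether a READ or WRITE may proceed on a file. */
enum io_status check_mandatory_io(const struct range_lock *locks, size_t n,
                                  uint64_t owner, uint64_t offset,
                                  uint64_t length, bool is_write,
                                  bool mandatory)
{
    if (!mandatory)
        return IO_OK;   /* advisory locks have no effect on READ or WRITE */

    for (size_t i = 0; i < n; i++) {
        const struct range_lock *l = &locks[i];
        if (l->owner == owner)
            continue;   /* the requester's own locks never conflict */
        if (!ranges_overlap(offset, length, l->offset, l->length))
            continue;
        /* A WRITE conflicts with any foreign lock on the range;
         * a READ conflicts only with a foreign exclusive lock. */
        if (is_write || l->kind == LOCK_EXCLUSIVE)
            return IO_ERR_LOCKED;
    }
    return IO_OK;
}
```

With the mandatory flag clear, the function always returns IO_OK. This mirrors the point that the advisory/mandatory distinction arises only in READ and WRITE processing, not in LOCK itself.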
For Windows environments, record locks are always mandatory, so the | For Windows environments, byte-range locks are always mandatory, so | |||
server always checks for record locks during I/O requests. | the server always checks for byte-range locks during I/O requests. | |||
Thus, the NFSv4.1 LOCK operation does not need to distinguish between | Thus, the NFSv4.1 LOCK operation does not need to distinguish between | |||
advisory and mandatory record locks. It is the NFSv4.1 server's | advisory and mandatory byte-range locks. It is the NFSv4.1 server's | |||
processing of the READ and WRITE operations that introduces the | processing of the READ and WRITE operations that introduces the | |||
distinction. | distinction. | |||
Every stateid which is validly passed to READ, WRITE or SETATTR, with | Every stateid which is validly passed to READ, WRITE or SETATTR, with | |||
the exception of special stateid values, defines an access mode for | the exception of special stateid values, defines an access mode for | |||
the file (i.e. READ, WRITE, or READ-WRITE): | the file (i.e. READ, WRITE, or READ-WRITE): | |||
o For stateids associated with opens, this is the mode defined by | o For stateids associated with opens, this is the mode defined by | |||
the original OPEN which caused the allocation of the open stateid | the original OPEN which caused the allocation of the open stateid | |||
and as modified by subsequent OPENs and OPEN_DOWNGRADEs for the | and as modified by subsequent OPENs and OPEN_DOWNGRADEs for the | |||
same open-owner/file pair. | same open-owner/file pair. | |||
o For stateids returned by record lock requests, the appropriate | o For stateids returned by byte-range lock requests, the appropriate | |||
mode is the access mode for the open stateid associated with the | mode is the access mode for the open stateid associated with the | |||
lock set represented by the stateid. | lock set represented by the stateid. | |||
o For delegation stateids the access mode is based on the type of | o For delegation stateids the access mode is based on the type of | |||
delegation. | delegation. | |||
When a READ, WRITE, or SETATTR (which specifies the size attribute) | When a READ, WRITE, or SETATTR (which specifies the size attribute) | |||
is done, the operation is subject to checking against the access mode | is done, the operation is subject to checking against the access mode | |||
to verify that the operation is appropriate given the stateid with | to verify that the operation is appropriate given the stateid with | |||
which the operation is associated. | which the operation is associated. | |||
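One way to picture the access-mode check is as a bit test. The sketch below is illustrative only: the enum names and the op_allowed helper are assumptions made for this example, and the stateid is assumed to have been resolved to its access mode already. A size-changing SETATTR is treated as a write, as stated earlier in this section.

```c
#include <stdbool.h>

/* Access-mode bits; READ-WRITE has both bits set. Names are illustrative. */
enum access_mode { ACCESS_READ = 1, ACCESS_WRITE = 2, ACCESS_BOTH = 3 };

/* Operations that are checked against the stateid's access mode. */
enum io_op { OP_READ, OP_WRITE, OP_SETATTR_SIZE };

/* True when the operation is appropriate for the given access mode. */
bool op_allowed(enum access_mode mode, enum io_op op)
{
    /* A SETATTR that changes the size attribute counts as a write. */
    unsigned needed = (op == OP_READ) ? ACCESS_READ : ACCESS_WRITE;
    return ((unsigned)mode & needed) != 0;
}
```

For example, op_allowed(ACCESS_READ, OP_SETATTR_SIZE) is false, because a stateid whose mode is READ cannot be used to truncate or extend the file.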
skipping to change at page 176, line 35 | skipping to change at page 176, line 35 | |||
ranges that happen to be adjacent into a single request since the | ranges that happen to be adjacent into a single request since the | |||
server may not support sub-range requests and for reasons related to | server may not support sub-range requests and for reasons related to | |||
the recovery of file locking state in the event of server failure. | the recovery of file locking state in the event of server failure. | |||
As discussed in Section 8.4.2, the server may employ certain | As discussed in Section 8.4.2, the server may employ certain | |||
optimizations during recovery that work effectively only when the | optimizations during recovery that work effectively only when the | |||
client's behavior during lock recovery is similar to the client's | client's behavior during lock recovery is similar to the client's | |||
locking behavior prior to server failure. | locking behavior prior to server failure. | |||
9.3. Upgrading and Downgrading Locks | 9.3. Upgrading and Downgrading Locks | |||
If a client has a write lock on a record, it can request an atomic | If a client has a write lock on a byte-range, it can request an | |||
downgrade of the lock to a read lock via the LOCK request, by setting | atomic downgrade of the lock to a read lock via the LOCK request, by | |||
the type to READ_LT. If the server supports atomic downgrade, the | setting the type to READ_LT. If the server supports atomic | |||
request will succeed. If not, it will return NFS4ERR_LOCK_NOTSUPP. | downgrade, the request will succeed. If not, it will return | |||
The client should be prepared to receive this error, and if | NFS4ERR_LOCK_NOTSUPP. The client should be prepared to receive this | |||
appropriate, report the error to the requesting application. | error, and if appropriate, report the error to the requesting | |||
application. | ||||
If a client has a read lock on a record, it can request an atomic | If a client has a read lock on a byte-range, it can request an atomic | |||
upgrade of the lock to a write lock via the LOCK request by setting | upgrade of the lock to a write lock via the LOCK request by setting | |||
the type to WRITE_LT or WRITEW_LT. If the server does not support | the type to WRITE_LT or WRITEW_LT. If the server does not support | |||
atomic upgrade, it will return NFS4ERR_LOCK_NOTSUPP. If the upgrade | atomic upgrade, it will return NFS4ERR_LOCK_NOTSUPP. If the upgrade | |||
can be achieved without an existing conflict, the request will | can be achieved without an existing conflict, the request will | |||
succeed. Otherwise, the server will return either NFS4ERR_DENIED or | succeed. Otherwise, the server will return either NFS4ERR_DENIED or | |||
NFS4ERR_DEADLOCK. The error NFS4ERR_DEADLOCK is returned if the | NFS4ERR_DEADLOCK. The error NFS4ERR_DEADLOCK is returned if the | |||
client sent the LOCK request with the type set to WRITEW_LT and the | client sent the LOCK request with the type set to WRITEW_LT and the | |||
server has detected a deadlock. The client should be prepared to | server has detected a deadlock. The client should be prepared to | |||
receive such errors and if appropriate, report the error to the | receive such errors and if appropriate, report the error to the | |||
requesting application. | requesting application. | |||
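The outcomes of an upgrade request described above can be summarized as a small decision function. This is a sketch under stated assumptions: the structure fields and the LOCK_* constants are hypothetical stand-ins for server state and for the NFS4ERR_LOCK_NOTSUPP, NFS4ERR_DENIED and NFS4ERR_DEADLOCK errors, and real servers derive these inputs from their own lock tables.

```c
#include <stdbool.h>

/* Illustrative result codes; placeholders, not the protocol's wire values. */
enum lock_status {
    LOCK_OK,
    LOCK_ERR_NOTSUPP,
    LOCK_ERR_DENIED,
    LOCK_ERR_DEADLOCK
};

/* Hypothetical inputs to the decision for one read-to-write upgrade. */
struct upgrade_request {
    bool server_supports_atomic;  /* server implements atomic upgrade */
    bool conflict_exists;         /* another owner holds a conflicting lock */
    bool blocking;                /* request type was WRITEW_LT */
    bool deadlock_detected;       /* server's deadlock detector fired */
};

enum lock_status upgrade_result(const struct upgrade_request *r)
{
    if (!r->server_supports_atomic)
        return LOCK_ERR_NOTSUPP;
    if (!r->conflict_exists)
        return LOCK_OK;
    /* DEADLOCK is reported only for the blocking lock type. */
    if (r->blocking && r->deadlock_detected)
        return LOCK_ERR_DEADLOCK;
    return LOCK_ERR_DENIED;
}
```

A client receiving any of the three error results would report it to the requesting application if appropriate.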
skipping to change at page 179, line 12 | skipping to change at page 179, line 12 | |||
lock, since the greater latency that might occur is likely to be | lock, since the greater latency that might occur is likely to be | |||
eliminated given a prompt callback, but it still needs to poll. When | eliminated given a prompt callback, but it still needs to poll. When | |||
it receives a CB_NOTIFY_LOCK it should promptly try to obtain the | it receives a CB_NOTIFY_LOCK it should promptly try to obtain the | |||
lock, but it should be aware that other clients may be polling and the | lock, but it should be aware that other clients may be polling and the | |||
server is under no obligation to reserve the lock for that particular | server is under no obligation to reserve the lock for that particular | |||
client. | client. | |||
9.7. Share Reservations | 9.7. Share Reservations | |||
A share reservation is a mechanism to control access to a file. It | A share reservation is a mechanism to control access to a file. It | |||
is a separate and independent mechanism from record locking. When a | is a separate and independent mechanism from byte-range locking. | |||
client opens a file, it sends an OPEN operation to the server | When a client opens a file, it sends an OPEN operation to the server | |||
specifying the type of access required (READ, WRITE, or BOTH) and the | specifying the type of access required (READ, WRITE, or BOTH) and the | |||
type of access to deny others (deny NONE, READ, WRITE, or BOTH). If | type of access to deny others (deny NONE, READ, WRITE, or BOTH). If | |||
the OPEN fails the client will fail the application's open request. | the OPEN fails the client will fail the application's open request. | |||
Pseudo-code definition of the semantics: | Pseudo-code definition of the semantics: | |||
if (request.access == 0) { | if (request.access == 0) { | |||
return (NFS4ERR_INVAL) | return (NFS4ERR_INVAL) | |||
} else { | } else { | |||
if ((request.access & file_state.deny)) || | if ((request.access & file_state.deny)) || | |||
skipping to change at page 180, line 14 | skipping to change at page 180, line 14 | |||
still obtain the filehandle for the regular file with the OPEN | still obtain the filehandle for the regular file with the OPEN | |||
operation so the appropriate share semantics can be applied. For | operation so the appropriate share semantics can be applied. For | |||
clients that do not have a deny mode built into their open | clients that do not have a deny mode built into their open | |||
programming interfaces, deny equal to NONE should be used. | programming interfaces, deny equal to NONE should be used. | |||
The OPEN operation with the CREATE flag also subsumes the CREATE | The OPEN operation with the CREATE flag also subsumes the CREATE | |||
operation for regular files as used in previous versions of the NFS | operation for regular files as used in previous versions of the NFS | |||
protocol. This allows a create with a share to be done atomically. | protocol. This allows a create with a share to be done atomically. | |||
The CLOSE operation removes all share reservations held by the open- | The CLOSE operation removes all share reservations held by the open- | |||
owner on that file. If record locks are held, the client SHOULD | owner on that file. If byte-range locks are held, the client SHOULD | |||
release all locks before issuing a CLOSE. The server MAY free all | release all locks before issuing a CLOSE. The server MAY free all | |||
outstanding locks on CLOSE but some servers may not support the CLOSE | outstanding locks on CLOSE but some servers may not support the CLOSE | |||
of a file that still has record locks held. The server MUST return | of a file that still has byte-range locks held. The server MUST | |||
failure, NFS4ERR_LOCKS_HELD, if any locks would exist after the | return failure, NFS4ERR_LOCKS_HELD, if any locks would exist after | |||
CLOSE. | the CLOSE. | |||
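The two permitted server behaviors on CLOSE can be illustrated as follows. The open_state structure, its fields, and the CLOSE_* constants are hypothetical names used only for this sketch; CLOSE_ERR_LOCKS_HELD stands in for NFS4ERR_LOCKS_HELD.

```c
#include <stdbool.h>

/* Illustrative result codes; placeholders, not the protocol's wire values. */
enum close_status { CLOSE_OK, CLOSE_ERR_LOCKS_HELD };

/* Hypothetical per-open state kept by a server. */
struct open_state {
    unsigned locks_held;        /* byte-range locks still held under this open */
    bool frees_locks_on_close;  /* server policy: release locks at CLOSE */
};

enum close_status do_close(struct open_state *s)
{
    if (s->locks_held > 0) {
        if (!s->frees_locks_on_close)
            return CLOSE_ERR_LOCKS_HELD;  /* locks would exist after CLOSE */
        s->locks_held = 0;                /* server chose to free them */
    }
    return CLOSE_OK;
}
```

A client that releases all of its locks before sending CLOSE never depends on which of the two policies the server implements.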
The LOOKUP operation will return a filehandle without establishing | The LOOKUP operation will return a filehandle without establishing | |||
any lock state on the server. Without a valid stateid, the server | any lock state on the server. Without a valid stateid, the server | |||
will assume the client has the least access. For example, a file | will assume the client has the least access. For example, a file | |||
opened with deny READ/WRITE using a filehandle obtained through | opened with deny READ/WRITE using a filehandle obtained through | |||
LOOKUP could only be read using the special read bypass stateid and | LOOKUP could only be read using the special read bypass stateid and | |||
could not be written at all because it would not have a valid stateid | could not be written at all because it would not have a valid stateid | |||
and the special anonymous stateid would not be allowed access. | and the special anonymous stateid would not be allowed access. | |||
9.9. Open Upgrade and Downgrade | 9.9. Open Upgrade and Downgrade | |||
skipping to change at page 186, line 33 | skipping to change at page 186, line 33 | |||
There are three situations that delegation recovery must deal with: | There are three situations that delegation recovery must deal with: | |||
o Client restart | o Client restart | |||
o Server restart | o Server restart | |||
o Network partition (full or backchannel-only) | o Network partition (full or backchannel-only) | |||
In the event the client restarts, the failure to renew the lease will | In the event the client restarts, the failure to renew the lease will | |||
result in the revocation of record locks and share reservations. | result in the revocation of byte-range locks and share reservations. | |||
Delegations, however, may be treated a bit differently. | Delegations, however, may be treated a bit differently. | |||
There will be situations in which delegations will need to be | There will be situations in which delegations will need to be | |||
reestablished after a client restarts. The reason for this is the | reestablished after a client restarts. The reason for this is the | |||
client may have file data stored locally and this data was associated | client may have file data stored locally and this data was associated | |||
with the previously held delegations. The client will need to | with the previously held delegations. The client will need to | |||
reestablish the appropriate file state on the server. | reestablish the appropriate file state on the server. | |||
To allow for this type of client recovery, the server MAY extend the | To allow for this type of client recovery, the server MAY extend the | |||
period for delegation recovery beyond the typical lease expiration | period for delegation recovery beyond the typical lease expiration | |||
skipping to change at page 187, line 18 | skipping to change at page 187, line 18 | |||
A server MAY support claim types of CLAIM_DELEGATE_PREV and | A server MAY support claim types of CLAIM_DELEGATE_PREV and | |||
CLAIM_DELEG_PREV_FH, and if it does, it MUST NOT remove delegations | CLAIM_DELEG_PREV_FH, and if it does, it MUST NOT remove delegations | |||
upon a CREATE_SESSION that confirms a client ID created by | upon a CREATE_SESSION that confirms a client ID created by | |||
EXCHANGE_ID, and instead MUST, for a period of time no less than that | EXCHANGE_ID, and instead MUST, for a period of time no less than that | |||
of the value of the lease_time attribute, maintain the client's | of the value of the lease_time attribute, maintain the client's | |||
delegations to allow time for the client to send CLAIM_DELEGATE_PREV | delegations to allow time for the client to send CLAIM_DELEGATE_PREV | |||
requests. The server that supports CLAIM_DELEGATE_PREV and/or | requests. The server that supports CLAIM_DELEGATE_PREV and/or | |||
CLAIM_DELEG_PREV_FH MUST support the DELEGPURGE operation. | CLAIM_DELEG_PREV_FH MUST support the DELEGPURGE operation. | |||
When the server restarts, delegations are reclaimed (using the OPEN | When the server restarts, delegations are reclaimed (using the OPEN | |||
operation with CLAIM_PREVIOUS) in a similar fashion to record locks | operation with CLAIM_PREVIOUS) in a similar fashion to byte-range | |||
and share reservations. However, there is a slight semantic | locks and share reservations. However, there is a slight semantic | |||
difference. In the normal case if the server decides that a | difference. In the normal case if the server decides that a | |||
delegation should not be granted, it performs the requested action | delegation should not be granted, it performs the requested action | |||
(e.g. OPEN) without granting any delegation. For reclaim, the | (e.g. OPEN) without granting any delegation. For reclaim, the | |||
server grants the delegation but a special designation is applied so | server grants the delegation but a special designation is applied so | |||
that the client treats the delegation as having been granted but | that the client treats the delegation as having been granted but | |||
recalled by the server. Because of this, the client has the duty to | recalled by the server. Because of this, the client has the duty to | |||
write all modified state to the server and then return the | write all modified state to the server and then return the | |||
delegation. This process of handling delegation reclaim reconciles | delegation. This process of handling delegation reclaim reconciles | |||
three principles of the NFSv4.1 protocol: | three principles of the NFSv4.1 protocol: | |||
skipping to change at page 188, line 41 | skipping to change at page 188, line 41 | |||
notified about the revocation. | notified about the revocation. | |||
10.3. Data Caching | 10.3. Data Caching | |||
When applications share access to a set of files, they need to be | When applications share access to a set of files, they need to be | |||
implemented so as to take account of the possibility of conflicting | implemented so as to take account of the possibility of conflicting | |||
access by another application. This is true whether the applications | access by another application. This is true whether the applications | |||
in question execute on different clients or reside on the same | in question execute on different clients or reside on the same | |||
client. | client. | |||
Share reservations and record locks are the facilities the NFSv4.1 | Share reservations and byte-range locks are the facilities the | |||
protocol provides to allow applications to coordinate access by using | NFSv4.1 protocol provides to allow applications to coordinate access | |||
mutual exclusion facilities. The NFSv4.1 protocol's data caching | by using mutual exclusion facilities. The NFSv4.1 protocol's data | |||
must be implemented such that it does not invalidate the assumptions | caching must be implemented such that it does not invalidate the | |||
that those using these facilities depend upon. | assumptions that those using these facilities depend upon. | |||
10.3.1. Data Caching and OPENs | 10.3.1. Data Caching and OPENs | |||
In order to avoid invalidating the sharing assumptions that | In order to avoid invalidating the sharing assumptions that | |||
applications rely on, NFSv4.1 clients should not provide cached data | applications rely on, NFSv4.1 clients should not provide cached data | |||
to applications or modify it on behalf of an application when it | to applications or modify it on behalf of an application when it | |||
would not be valid to obtain or modify that same data via a READ or | would not be valid to obtain or modify that same data via a READ or | |||
WRITE operation. | WRITE operation. | |||
Furthermore, in the absence of open delegation (see Section 10.4), | Furthermore, in the absence of open delegation (see Section 10.4), | |||
skipping to change at page 191, line 9 | skipping to change at page 191, line 9 | |||
The data that is written to the server as a prerequisite to the | The data that is written to the server as a prerequisite to the | |||
unlocking of a region must be written, at the server, to stable | unlocking of a region must be written, at the server, to stable | |||
storage. The client may accomplish this either with synchronous | storage. The client may accomplish this either with synchronous | |||
writes or by following asynchronous writes with a COMMIT operation. | writes or by following asynchronous writes with a COMMIT operation. | |||
This is required because retransmission of the modified data after a | This is required because retransmission of the modified data after a | |||
server restart might conflict with a lock held by another client. | server restart might conflict with a lock held by another client. | |||
A client implementation may choose to accommodate applications which | A client implementation may choose to accommodate applications which | |||
use record locking in non-standard ways (e.g. using a record lock as | use byte-range locking in non-standard ways (e.g. using a byte-range | |||
a global semaphore) by flushing to the server more data upon a LOCKU | lock as a global semaphore) by flushing to the server more data upon | |||
than is covered by the locked range. This may include modified data | a LOCKU than is covered by the locked range. This may include | |||
within files other than the one for which the unlocks are being done. | modified data within files other than the one for which the unlocks | |||
In such cases, the client must not interfere with applications whose | are being done. In such cases, the client must not interfere with | |||
READs and WRITEs are being done only within the bounds of record | applications whose READs and WRITEs are being done only within the | |||
locks which the application holds. For example, an application locks | bounds of byte-range locks which the application holds. For example, | |||
a single byte of a file and proceeds to write that single byte. A | an application locks a single byte of a file and proceeds to write | |||
client that chose to handle a LOCKU by flushing all modified data to | that single byte. A client that chose to handle a LOCKU by flushing | |||
the server could validly write that single byte in response to an | all modified data to the server could validly write that single byte | |||
unrelated unlock. However, it would not be valid to write the entire | in response to an unrelated unlock. However, it would not be valid | |||
block in which that single written byte was located since it includes | to write the entire block in which that single written byte was | |||
an area that is not locked and might be locked by another client. | located since it includes an area that is not locked and might be | |||
Client implementations can avoid this problem by dividing files with | locked by another client. Client implementations can avoid this | |||
modified data into those for which all modifications are done to | problem by dividing files with modified data into those for which all | |||
areas covered by an appropriate record lock and those for which there | modifications are done to areas covered by an appropriate byte-range | |||
are modifications not covered by a record lock. Any writes done for | lock and those for which there are modifications not covered by a | |||
the former class of files must not include areas not locked and thus | byte-range lock. Any writes done for the former class of files must | |||
not modified on the client. | not include areas not locked and thus not modified on the client. | |||
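The single-byte example above amounts to clipping a modified cache block to the locked range before writing it back. The sketch below shows that clipping. It is illustrative only: the byte_range type and the clip_to_lock name are assumptions, one dirty block is checked against one lock, and offset plus length is assumed not to overflow.

```c
#include <stdint.h>

/* A half-open byte range [offset, offset + length). Illustrative type. */
struct byte_range {
    uint64_t offset;
    uint64_t length;
};

/* Return the part of a dirty cache block that the lock covers.
 * A result with length 0 means nothing in the block may be written. */
struct byte_range clip_to_lock(struct byte_range dirty, struct byte_range lock)
{
    uint64_t start = dirty.offset > lock.offset ? dirty.offset : lock.offset;
    uint64_t dirty_end = dirty.offset + dirty.length;
    uint64_t lock_end = lock.offset + lock.length;
    uint64_t end = dirty_end < lock_end ? dirty_end : lock_end;

    struct byte_range out = { start, end > start ? end - start : 0 };
    return out;
}
```

For a 4096-byte block at offset 0 and a lock on the single byte at offset 10, the result is the range starting at offset 10 with length 1, so only the locked byte is sent to the server.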
10.3.3. Data Caching and Mandatory File Locking | 10.3.3. Data Caching and Mandatory File Locking | |||
Client side data caching needs to respect mandatory file locking when | Client side data caching needs to respect mandatory file locking when | |||
it is in effect. The presence of mandatory file locking for a given | it is in effect. The presence of mandatory file locking for a given | |||
file is indicated when the client gets back NFS4ERR_LOCKED from a | file is indicated when the client gets back NFS4ERR_LOCKED from a | |||
READ or WRITE on a file it has an appropriate share reservation for. | READ or WRITE on a file it has an appropriate share reservation for. | |||
When mandatory locking is in effect for a file, the client must check | When mandatory locking is in effect for a file, the client must check | |||
for an appropriate file lock for data being read or written. If a | for an appropriate file lock for data being read or written. If a | |||
lock exists for the range being read or written, the client may | lock exists for the range being read or written, the client may | |||
skipping to change at page 209, line 8 | skipping to change at page 209, line 8 | |||
virtual memory management systems on each client only know a page is | virtual memory management systems on each client only know a page is | |||
modified, not that a subset of the page corresponding to the | modified, not that a subset of the page corresponding to the | |||
respective lock regions has been modified. So it is not possible for | respective lock regions has been modified. So it is not possible for | |||
each client to do the right thing, which is to only write to the | each client to do the right thing, which is to only write to the | |||
server that portion of the page that is locked. For example, if | server that portion of the page that is locked. For example, if | |||
client A simply writes out the page, and then client B writes out the | client A simply writes out the page, and then client B writes out the | |||
page, client A's data is lost. | page, client A's data is lost. | |||
Moreover, if mandatory locking is enabled on the file, then we have a | Moreover, if mandatory locking is enabled on the file, then we have a | |||
different problem. When clients A and B execute the STORE | different problem. When clients A and B execute the STORE | |||
instructions, the resulting page faults require a record lock on the | instructions, the resulting page faults require a byte-range lock on | |||
entire page. Each client then tries to extend their locked range to | the entire page. Each client then tries to extend their locked range | |||
the entire page, which results in a deadlock. Communicating the | to the entire page, which results in a deadlock. Communicating the | |||
NFS4ERR_DEADLOCK error to a STORE instruction is difficult at best. | NFS4ERR_DEADLOCK error to a STORE instruction is difficult at best. | |||
If a client is locking the entire memory mapped file, there is no | If a client is locking the entire memory mapped file, there is no | |||
problem with advisory or mandatory record locking, at least until the | problem with advisory or mandatory byte-range locking, at least until | |||
client unlocks a region in the middle of the file. | the client unlocks a region in the middle of the file. | |||
Given the above issues the following are permitted: | Given the above issues the following are permitted: | |||
o Clients and servers MAY deny memory mapping a file they know there | o Clients and servers MAY deny memory mapping a file they know there | |||
are record locks for. | are byte-range locks for. | |||
o Clients and servers MAY deny a record lock on a file they know is | o Clients and servers MAY deny a byte-range lock on a file they know | |||
memory mapped. | is memory mapped. | |||
o A client MAY deny memory mapping a file that it knows requires | o A client MAY deny memory mapping a file that it knows requires | |||
mandatory locking for I/O. If mandatory locking is enabled after | mandatory locking for I/O. If mandatory locking is enabled after | |||
the file is opened and mapped, the client MAY deny the application | the file is opened and mapped, the client MAY deny the application | |||
further access to its mapped file. | further access to its mapped file. | |||
10.8. Name and Directory Caching without Directory Delegations | 10.8. Name and Directory Caching without Directory Delegations | |||
The NFSv4.1 directory delegation facility (described in Section 10.9 | The NFSv4.1 directory delegation facility (described in Section 10.9 | |||
below) is OPTIONAL for servers to implement. Even where it is | below) is OPTIONAL for servers to implement. Even where it is | |||
skipping to change at page 264, line 16 | skipping to change at page 264, line 16 | |||
pNFS takes the form of OPTIONAL operations that manage protocol | pNFS takes the form of OPTIONAL operations that manage protocol | |||
objects called 'layouts' which contain data location information. | objects called 'layouts' which contain data location information. | |||
The layout is managed in much the same way as NFSv4.1 data | The layout is managed in much the same way as NFSv4.1 data | |||
delegations. For example, the layout is leased, | delegations. For example, the layout is leased, | |||
recallable and revocable. However, layouts are distinct abstractions | recallable and revocable. However, layouts are distinct abstractions | |||
and are manipulated with new operations. When a client holds a | and are manipulated with new operations. When a client holds a | |||
layout, it is granted the ability to access the data location | layout, it is granted the ability to access the data location | |||
directly using the location information specified in the layout. | directly using the location information specified in the layout. | |||
There are interactions between layouts and other NFSv4.1 abstractions | There are interactions between layouts and other NFSv4.1 abstractions | |||
such as data delegations and record locking. Delegation issues are | such as data delegations and byte-range locking. Delegation issues | |||
discussed in Section 12.5.5. Byte-range locking issues are discussed | are discussed in Section 12.5.5. Byte-range locking issues are | |||
in Section 12.2.9 and Section 12.5.1. | discussed in Section 12.2.9 and Section 12.5.1. | |||
The NFSv4.1 pNFS feature has been structured to allow for a variety | The NFSv4.1 pNFS feature has been structured to allow for a variety | |||
of storage protocols to be defined and used. As noted in the diagram | of storage protocols to be defined and used. As noted in the diagram | |||
above, the storage protocol is the method used by the client to store | above, the storage protocol is the method used by the client to store | |||
and retrieve data directly from the storage devices. The NFSv4.1 | and retrieve data directly from the storage devices. The NFSv4.1 | |||
protocol directly defines one storage protocol, the NFSv4.1 storage | protocol directly defines one storage protocol, the NFSv4.1 storage | |||
type, and its use. | type, and its use. | |||
Examples of other storage protocols that could be used with NFSv4.1's | Examples of other storage protocols that could be used with NFSv4.1's | |||
pNFS are: | pNFS are: | |||
skipping to change at page 268, line 8 | skipping to change at page 268, line 8 | |||
and performs a WRITE to a storage device, the storage device is | and performs a WRITE to a storage device, the storage device is | |||
allowed to reject that WRITE. | allowed to reject that WRITE. | |||
The iomode does not conflict with OPEN share modes or lock requests; | The iomode does not conflict with OPEN share modes or lock requests; | |||
open mode and lock conflicts are enforced as they are without the use | open mode and lock conflicts are enforced as they are without the use | |||
of pNFS, and are logically separate from the pNFS layout level. As | of pNFS, and are logically separate from the pNFS layout level. As | |||
well, open modes and locks are the preferred method for restricting | well, open modes and locks are the preferred method for restricting | |||
user access to data files. For example, an OPEN of read, deny-write | user access to data files. For example, an OPEN of read, deny-write | |||
does not conflict with a LAYOUTGET containing an iomode of READ/WRITE | does not conflict with a LAYOUTGET containing an iomode of READ/WRITE | |||
performed by another client. Applications that depend on writing | performed by another client. Applications that depend on writing | |||
into the same file concurrently may use record locking to serialize | into the same file concurrently may use byte-range locking to | |||
their accesses. | serialize their accesses. | |||
12.2.10. Device IDs | 12.2.10. Device IDs | |||
The device ID (data type deviceid4, see Section 3.3.14) names a group | The device ID (data type deviceid4, see Section 3.3.14) names a group | |||
of storage devices. The scope of a device ID is per pair of client | of storage devices. The scope of a device ID is per pair of client | |||
ID and layout type. In practice, a significant amount of information | ID and layout type. In practice, a significant amount of information | |||
may be required to fully address a storage device. Rather than | may be required to fully address a storage device. Rather than | |||
embedding all such information in a layout, layouts embed device IDs. | embedding all such information in a layout, layouts embed device IDs. | |||
The NFSv4.1 operation GETDEVICEINFO (Section 18.40) is used to | The NFSv4.1 operation GETDEVICEINFO (Section 18.40) is used to | |||
retrieve the complete address information (including all device | retrieve the complete address information (including all device | |||
skipping to change at page 290, line 27 | skipping to change at page 290, line 27 | |||
As mentioned previously, some operations, namely WRITE and LAYOUTGET, | As mentioned previously, some operations, namely WRITE and LAYOUTGET, | |||
may be rejected during the metadata server's grace period, because to | may be rejected during the metadata server's grace period, because to | |||
provide simple, valid handling during the grace period, the easiest | provide simple, valid handling during the grace period, the easiest | |||
method is to simply reject all non-reclaim pNFS requests and WRITE | method is to simply reject all non-reclaim pNFS requests and WRITE | |||
operations by returning the NFS4ERR_GRACE error. However, depending | operations by returning the NFS4ERR_GRACE error. However, depending | |||
on the storage protocol (which is specific to the layout type) and | on the storage protocol (which is specific to the layout type) and | |||
metadata server implementation, the metadata server may be able to | metadata server implementation, the metadata server may be able to | |||
determine that a particular request is safe. For example, a metadata | determine that a particular request is safe. For example, a metadata | |||
server may save provisional allocation mappings for each file to | server may save provisional allocation mappings for each file to | |||
stable storage, as well as information about potentially conflicting | stable storage, as well as information about potentially conflicting | |||
OPEN share modes and mandatory record locks that might have been in | OPEN share modes and mandatory byte-range locks that might have been | |||
effect at the time of restart, and use this information during the | in effect at the time of restart, and use this information during the | |||
recovery grace period to determine that a WRITE request is safe. | recovery grace period to determine that a WRITE request is safe. | |||
12.7.6. Storage Device Recovery | 12.7.6. Storage Device Recovery | |||
Recovery from storage device restart is mostly dependent upon the | Recovery from storage device restart is mostly dependent upon the | |||
layout type in use. However, there are a few general techniques a | layout type in use. However, there are a few general techniques a | |||
client can use if it discovers a storage device has crashed while | client can use if it discovers a storage device has crashed while | |||
holding modified, uncommitted data that was asynchronously written. | holding modified, uncommitted data that was asynchronously written. | |||
First and foremost, it is important to realize that the client is the | First and foremost, it is important to realize that the client is the | |||
only one which has the information necessary to recover non-committed | only one which has the information necessary to recover non-committed | |||
skipping to change at page 313, line 27 | skipping to change at page 313, line 27 | |||
the data servers, even though the details of the control protocol may | the data servers, even though the details of the control protocol may | |||
avoid actual transfer of the state under certain circumstances. | avoid actual transfer of the state under certain circumstances. | |||
On the other hand, since advisory lock state is not used for checking | On the other hand, since advisory lock state is not used for checking | |||
I/O accesses at the data servers, there is no semantic reason for | I/O accesses at the data servers, there is no semantic reason for | |||
propagating advisory lock state to the data servers. Since updates | propagating advisory lock state to the data servers. Since updates | |||
to advisory locks neither confer nor remove privileges, these changes | to advisory locks neither confer nor remove privileges, these changes | |||
need not be propagated immediately, and may not need to be propagated | need not be propagated immediately, and may not need to be propagated | |||
promptly. The updates to advisory locks need only be propagated when | promptly. The updates to advisory locks need only be propagated when | |||
the data server needs to resolve a question about a stateid. In | the data server needs to resolve a question about a stateid. In | |||
fact, if record locking is not mandatory (i.e., is advisory) the | fact, if byte-range locking is not mandatory (i.e., is advisory) the | |||
clients are advised not to use the lock-based stateids for I/O at | clients are advised not to use the lock-based stateids for I/O at | |||
all. The stateids returned by open are sufficient and eliminate | all. The stateids returned by open are sufficient and eliminate | |||
overhead for this kind of state propagation. | overhead for this kind of state propagation. | |||
If a client gets back an NFS4ERR_LOCKED error from a data server, | If a client gets back an NFS4ERR_LOCKED error from a data server, | |||
this is an indication that mandatory record locking is in force. The | this is an indication that mandatory byte-range locking is in force. | |||
client recovers from this by getting a record lock that covers the | The client recovers from this by getting a byte-range lock that | |||
affected range and re-sends the I/O with the stateid of the record | covers the affected range and re-sends the I/O with the stateid of | |||
lock. | the byte-range lock. | |||
13.9.2.2. Open and Deny Mode Validation | 13.9.2.2. Open and Deny Mode Validation | |||
Open and deny mode validation MUST be performed against the open and | Open and deny mode validation MUST be performed against the open and | |||
deny mode(s) held by the data servers. When access is reduced or a | deny mode(s) held by the data servers. When access is reduced or a | |||
deny mode made more restrictive (because of CLOSE or DOWNGRADE) the | deny mode made more restrictive (because of CLOSE or DOWNGRADE) the | |||
data server MUST prevent any I/Os that would be denied if performed | data server MUST prevent any I/Os that would be denied if performed | |||
on the metadata server. When access is expanded, the data server | on the metadata server. When access is expanded, the data server | |||
MUST make sure that no requests are subsequently rejected because of | MUST make sure that no requests are subsequently rejected because of | |||
open or deny issues that no longer apply, given the previous | open or deny issues that no longer apply, given the previous | |||
skipping to change at page 392, line 7 | skipping to change at page 392, line 7 | |||
}; | }; | |||
18.2.3. DESCRIPTION | 18.2.3. DESCRIPTION | |||
The CLOSE operation releases share reservations for the regular or | The CLOSE operation releases share reservations for the regular or | |||
named attribute file as specified by the current filehandle. The | named attribute file as specified by the current filehandle. The | |||
share reservations and other state information released at the server | share reservations and other state information released at the server | |||
as a result of this CLOSE is only that associated with the supplied | as a result of this CLOSE is only that associated with the supplied | |||
stateid. State associated with other OPENs is not affected. | stateid. State associated with other OPENs is not affected. | |||
If record locks are held, the client SHOULD release all locks before | If byte-range locks are held, the client SHOULD release all locks | |||
issuing a CLOSE. The server MAY free all outstanding locks on CLOSE | before issuing a CLOSE. The server MAY free all outstanding locks on | |||
but some servers may not support the CLOSE of a file that still has | CLOSE but some servers may not support the CLOSE of a file that still | |||
record locks held. The server MUST return failure if any locks would | has byte-range locks held. The server MUST return failure if any | |||
exist after the CLOSE. | locks would exist after the CLOSE. | |||
The argument seqid MAY have any value and the server MUST ignore | The argument seqid MAY have any value and the server MUST ignore | |||
seqid. | seqid. | |||
On success, the current filehandle retains its value. | On success, the current filehandle retains its value. | |||
The server MAY require that the principal, security flavor, and, if | The server MAY require that the principal, security flavor, and, if | |||
applicable, the GSS mechanism, combination that sent the OPEN request | applicable, the GSS mechanism, combination that sent the OPEN request | |||
also be the one to CLOSE the file. This might not be possible if | also be the one to CLOSE the file. This might not be possible if | |||
credentials for the principal are no longer available. The server | credentials for the principal are no longer available. The server | |||
skipping to change at page 406, line 29 | skipping to change at page 406, line 29 | |||
case NFS4_OK: | case NFS4_OK: | |||
LOCK4resok resok4; | LOCK4resok resok4; | |||
case NFS4ERR_DENIED: | case NFS4ERR_DENIED: | |||
LOCK4denied denied; | LOCK4denied denied; | |||
default: | default: | |||
void; | void; | |||
}; | }; | |||
18.10.3. DESCRIPTION | 18.10.3. DESCRIPTION | |||
The LOCK operation requests a record lock for the byte range | The LOCK operation requests a byte-range lock for the byte range | |||
specified by the offset and length parameters. The lock type is also | specified by the offset and length parameters. The lock type is also | |||
specified to be one of the nfs_lock_type4s. If this is a reclaim | specified to be one of the nfs_lock_type4s. If this is a reclaim | |||
request, the reclaim parameter will be TRUE. | request, the reclaim parameter will be TRUE. | |||
Bytes in a file may be locked even if those bytes are not currently | Bytes in a file may be locked even if those bytes are not currently | |||
allocated to the file. To lock the file from a specific offset | allocated to the file. To lock the file from a specific offset | |||
through the end-of-file (no matter how long the file actually is) use | through the end-of-file (no matter how long the file actually is) use | |||
a length field with all bits set to 1 (one). If the length is zero, | a length field with all bits set to 1 (one). If the length is zero, | |||
or if a length which is not all bits set to one is specified, and | or if a length which is not all bits set to one is specified, and | |||
length when added to the offset exceeds the maximum 64-bit unsigned | length when added to the offset exceeds the maximum 64-bit unsigned | |||
skipping to change at page 407, line 7 | skipping to change at page 407, line 7 | |||
client specifies a range that overlaps one or more bytes beyond | client specifies a range that overlaps one or more bytes beyond | |||
offset 0xFFFFFFFF, but does not end at the maximum 64-bit offset | offset 0xFFFFFFFF, but does not end at the maximum 64-bit offset | |||
(i.e. 0xFFFFFFFFFFFFFFFF), such a 32-bit server MUST return the error | (i.e. 0xFFFFFFFFFFFFFFFF), such a 32-bit server MUST return the error | |||
NFS4ERR_BAD_RANGE. | NFS4ERR_BAD_RANGE. | |||
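The range rules for LOCK can be sketched as a small validation function. This is a non-normative illustration: the function name and the server_is_32bit flag are hypothetical, and the NFS4ERR_INVAL result for a zero or overflowing length is an assumption, since the sentence stating the result is cut off in this excerpt (it is consistent with the LOCKU description, which names NFS4ERR_INVAL and NFS4ERR_BAD_RANGE).

```python
UINT64_MAX = 0xFFFFFFFFFFFFFFFF  # a length with all bits set means "to end-of-file"
UINT32_MAX = 0xFFFFFFFF


def check_lock_range(offset, length, server_is_32bit=False):
    """Return a status name for a LOCK byte range (illustrative only)."""
    if length == 0:
        return "NFS4ERR_INVAL"  # assumed result; see the lead-in
    to_eof = length == UINT64_MAX
    if not to_eof and offset + length > UINT64_MAX:
        return "NFS4ERR_INVAL"  # assumed result; see the lead-in
    if server_is_32bit:
        # Last byte covered by the request.
        last = UINT64_MAX if to_eof else offset + length - 1
        # Reaches past 0xFFFFFFFF without ending at the maximum 64-bit offset.
        if last > UINT32_MAX and last != UINT64_MAX:
            return "NFS4ERR_BAD_RANGE"
    return "NFS4_OK"
```

A to-end-of-file request is accepted by the 32-bit case because it ends at the maximum 64-bit offset, which is the one exception the text allows.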
If the server returns NFS4ERR_DENIED, owner, offset, and length of a | If the server returns NFS4ERR_DENIED, owner, offset, and length of a | |||
conflicting lock are returned. | conflicting lock are returned. | |||
The locker argument specifies the lock-owner that is associated with | The locker argument specifies the lock-owner that is associated with | |||
the LOCK request. The locker4 structure is a switched union that | the LOCK request. The locker4 structure is a switched union that | |||
indicates whether the client has already created record locking state | indicates whether the client has already created byte-range locking | |||
associated with the current open file and lock-owner. In the case in | state associated with the current open file and lock-owner. In the | |||
which it has, the argument is just a stateid for the set of locks | case in which it has, the argument is just a stateid for the set of | |||
associated with that open file and lock-owner, together with a | locks associated with that open file and lock-owner, together with a | |||
lock_seqid value which MAY be any value and MUST be ignored by the | lock_seqid value which MAY be any value and MUST be ignored by the | |||
server. In the case where no such state has been established, or the | server. In the case where no such state has been established, or the | |||
client does not have the stateid available, the argument contains the | client does not have the stateid available, the argument contains the | |||
stateid of the open file with which this lock is to be associated, | stateid of the open file with which this lock is to be associated, | |||
together with the lock-owner with which the lock is to be associated. | together with the lock-owner with which the lock is to be associated. | |||
The open_to_lock_owner case covers the very first lock done by a | The open_to_lock_owner case covers the very first lock done by a | |||
lock-owner for a given open file and offers a method to use the | lock-owner for a given open file and offers a method to use the | |||
established state of the open_stateid to transition to the use of a | established state of the open_stateid to transition to the use of a | |||
lock stateid. | lock stateid. | |||
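The choice between the two arms of the locker argument can be sketched as follows. The class names, field names, and the dictionary used to remember lock stateids are hypothetical stand-ins chosen for this illustration; only the overall shape (an existing lock stateid versus an open stateid plus lock-owner) comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class OpenToLockOwner:
    """First lock by this lock-owner on this open file (hypothetical layout)."""
    open_stateid: str
    lock_owner: str
    lock_seqid: int = 0  # may be any value; the server ignores it


@dataclass
class ExistingLockOwner:
    """Byte-range locking state already exists (hypothetical layout)."""
    lock_stateid: str
    lock_seqid: int = 0  # may be any value; the server ignores it


def build_locker(open_stateid, lock_owner, known_lock_stateids):
    """Pick the union arm based on whether a lock stateid is already known."""
    lock_stateid = known_lock_stateids.get((open_stateid, lock_owner))
    if lock_stateid is None:
        # No state yet, or the stateid is not available to the client.
        return OpenToLockOwner(open_stateid, lock_owner)
    return ExistingLockOwner(lock_stateid)
```

A client would typically record the lock stateid returned by the first successful LOCK so that later requests for the same open file and lock-owner take the second arm.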
skipping to change at page 408, line 15 | skipping to change at page 408, line 15 | |||
includes multiple locks already granted to that lock-owner, in whole | includes multiple locks already granted to that lock-owner, in whole | |||
or in part, and the server does not support such locking operations | or in part, and the server does not support such locking operations | |||
(i.e. does not support POSIX locking semantics), the server will | (i.e. does not support POSIX locking semantics), the server will | |||
return the error NFS4ERR_LOCK_RANGE. In that case, the client may | return the error NFS4ERR_LOCK_RANGE. In that case, the client may | |||
return an error, or it may emulate the required operations, using | return an error, or it may emulate the required operations, using | |||
only LOCK for ranges that do not include any bytes already locked by | only LOCK for ranges that do not include any bytes already locked by | |||
that lock-owner and LOCKU of locks held by that lock-owner | that lock-owner and LOCKU of locks held by that lock-owner | |||
(specifying an exactly-matching range and type). Similarly, when the | (specifying an exactly-matching range and type). Similarly, when the | |||
client makes a lock request that amounts to upgrading (changing from | client makes a lock request that amounts to upgrading (changing from | |||
a read lock to a write lock) or downgrading (changing from write lock | a read lock to a write lock) or downgrading (changing from write lock | |||
to a read lock) an existing record lock, and the server does not | to a read lock) an existing byte-range lock, and the server does not | |||
support such a lock, the server will return NFS4ERR_LOCK_NOTSUPP. | support such a lock, the server will return NFS4ERR_LOCK_NOTSUPP. | |||
Such operations may not perfectly reflect the required semantics in | Such operations may not perfectly reflect the required semantics in | |||
the face of conflicting lock requests from other clients. | the face of conflicting lock requests from other clients. | |||
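When a server returns NFS4ERR_LOCK_RANGE, the emulation described above requires the client to issue LOCK only for bytes it does not already hold. A minimal sketch of that computation follows; the function name is hypothetical, lock types are ignored, and to-end-of-file lengths are not handled.

```python
def uncovered_ranges(held, offset, length):
    """Return the parts of [offset, offset + length) not covered by `held`.

    `held` is a list of (offset, length) ranges already locked by this
    lock-owner. The result is a list of (offset, length) gaps, in order.
    """
    gaps = []
    cursor = offset
    end = offset + length
    for h_off, h_len in sorted(held):
        h_end = h_off + h_len
        if h_end <= cursor or h_off >= end:
            continue  # no overlap with the part still to be covered
        if h_off > cursor:
            gaps.append((cursor, h_off - cursor))
        cursor = max(cursor, h_end)
        if cursor >= end:
            break
    if cursor < end:
        gaps.append((cursor, end - cursor))
    return gaps
```

Each returned gap can be requested with its own LOCK. As the text notes, a sequence of separate requests may not perfectly reflect the required semantics when other clients issue conflicting requests in between.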
When a client holds a write delegation, the client holding that | When a client holds a write delegation, the client holding that | |||
delegation is assured that there are no opens by other clients. | delegation is assured that there are no opens by other clients. | |||
Thus, there can be no conflicting LOCK requests from such clients. | Thus, there can be no conflicting LOCK requests from such clients. | |||
Therefore, the client may be handling locking requests locally, | Therefore, the client may be handling locking requests locally, | |||
without doing LOCK operations on the server. If it does that, it | without doing LOCK operations on the server. If it does that, it | |||
must be prepared to update the lock status on the server, by doing | must be prepared to update the lock status on the server, by doing | |||
skipping to change at page 410, line 47 | skipping to change at page 410, line 47 | |||
union LOCKU4res switch (nfsstat4 status) { | union LOCKU4res switch (nfsstat4 status) { | |||
case NFS4_OK: | case NFS4_OK: | |||
stateid4 lock_stateid; | stateid4 lock_stateid; | |||
default: | default: | |||
void; | void; | |||
}; | }; | |||
18.12.3. DESCRIPTION | 18.12.3. DESCRIPTION | |||
The LOCKU operation unlocks the record lock specified by the | The LOCKU operation unlocks the byte-range lock specified by the | |||
parameters. The client may set the locktype field to any value that | parameters. The client may set the locktype field to any value that | |||
is legal for the nfs_lock_type4 enumerated type, and the server MUST | is legal for the nfs_lock_type4 enumerated type, and the server MUST | |||
accept any legal value for locktype. Any legal value for locktype | accept any legal value for locktype. Any legal value for locktype | |||
has no effect on the success or failure of the LOCKU operation. | has no effect on the success or failure of the LOCKU operation. | |||
The ranges are specified as for LOCK. The NFS4ERR_INVAL and | The ranges are specified as for LOCK. The NFS4ERR_INVAL and | |||
NFS4ERR_BAD_RANGE errors are returned under the same circumstances as | NFS4ERR_BAD_RANGE errors are returned under the same circumstances as | |||
for LOCK. | for LOCK. | |||
The seqid parameter MAY be any value and the server MUST ignore it. | The seqid parameter MAY be any value and the server MUST ignore it. | |||
skipping to change at page 440, line 47 | skipping to change at page 440, line 47 | |||
is returned with a data length set to 0 (zero) and eof is set to | is returned with a data length set to 0 (zero) and eof is set to | |||
TRUE. The READ is subject to access permissions checking. | TRUE. The READ is subject to access permissions checking. | |||
If the client specifies a count value of 0 (zero), the READ succeeds | If the client specifies a count value of 0 (zero), the READ succeeds | |||
and returns 0 (zero) bytes of data again subject to access | and returns 0 (zero) bytes of data again subject to access | |||
permissions checking. The server may choose to return fewer bytes | permissions checking. The server may choose to return fewer bytes | |||
than specified by the client. The client needs to check for this | than specified by the client. The client needs to check for this | |||
condition and handle the condition appropriately. | condition and handle the condition appropriately. | |||
Except when special stateids are used, the stateid value for a READ | Except when special stateids are used, the stateid value for a READ | |||
request represents a value returned from a previous record lock or | request represents a value returned from a previous byte-range lock | |||
share reservation request or the stateid associated with a | or share reservation request or the stateid associated with a | |||
delegation. The stateid identifies the associated owners if any and | delegation. The stateid identifies the associated owners if any and | |||
is used by the server to verify that the associated locks are still | is used by the server to verify that the associated locks are still | |||
valid (e.g. have not been revoked). | valid (e.g. have not been revoked). | |||
If the read ended at the end-of-file (formally, in a correctly formed | If the read ended at the end-of-file (formally, in a correctly formed | |||
READ request, if offset + count is equal to the size of the file), or | READ request, if offset + count is equal to the size of the file), or | |||
the read request extends beyond the size of the file (if offset + | the read request extends beyond the size of the file (if offset + | |||
count is greater than the size of the file), eof is returned as TRUE; | count is greater than the size of the file), eof is returned as TRUE; | |||
otherwise it is FALSE. A successful READ of an empty file will | otherwise it is FALSE. A successful READ of an empty file will | |||
always return eof as TRUE. | always return eof as TRUE. | |||
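The eof rule can be written out directly. This sketch assumes the server returns every byte available in the requested range (it is allowed to return fewer); the function name is hypothetical.

```python
def read_result(file_size, offset, count):
    """Return (bytes_returned, eof) for a READ under the rule above."""
    available = max(0, file_size - offset)
    returned = min(count, available)
    # eof is TRUE when the request ends at, or extends beyond, the file size.
    eof = offset + count >= file_size
    return returned, eof
```

For an empty file the comparison is 0 + count >= 0, which holds for any count, matching the statement that a successful READ of an empty file always returns eof as TRUE.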
skipping to change at page 441, line 45 | skipping to change at page 441, line 45 | |||
what the requesting client believes to be the case. This would | what the requesting client believes to be the case. This would | |||
reduce the actual amount of data available to the client. It is | reduce the actual amount of data available to the client. It is | |||
possible that the server may back off the transfer size and reduce | possible that the server may back off the transfer size and reduce | |||
the read request return. Server resource exhaustion may also occur | the read request return. Server resource exhaustion may also occur | |||
necessitating a smaller read return. | necessitating a smaller read return. | |||
If mandatory file locking is in effect for the file, and if the | If mandatory file locking is in effect for the file, and if the | |||
region corresponding to the data to be read from the file is write locked | region corresponding to the data to be read from the file is write locked | |||
by an owner not associated with the stateid, the server will return the | by an owner not associated with the stateid, the server will return the | |||
NFS4ERR_LOCKED error. The client should try to get the appropriate | NFS4ERR_LOCKED error. The client should try to get the appropriate | |||
read record lock via the LOCK operation before re-attempting the | read byte-range lock via the LOCK operation before re-attempting the | |||
READ. When the READ completes, the client should release the record | READ. When the READ completes, the client should release the byte- | |||
lock via LOCKU. | range lock via LOCKU. | |||
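The recovery sequence for NFS4ERR_LOCKED (LOCK, retry the READ, then LOCKU) can be sketched as below. The session object and its read(), lock(), and locku() methods are hypothetical; each is assumed to return a (status, value) pair.

```python
def read_with_lock_recovery(session, open_stateid, offset, count):
    """READ; on NFS4ERR_LOCKED take a read lock, retry, then unlock.

    `session` is a hypothetical object, not a real client library API.
    """
    status, data = session.read(open_stateid, offset, count)
    if status != "NFS4ERR_LOCKED":
        return status, data
    # Mandatory locking is in force: obtain a read lock covering the range.
    status, lock_stateid = session.lock("READ_LT", open_stateid, offset, count)
    if status != "NFS4_OK":
        return status, None  # for example NFS4ERR_DENIED
    try:
        # Re-attempt the READ using the stateid of the byte-range lock.
        return session.read(lock_stateid, offset, count)
    finally:
        # Release the lock once the READ has completed, whatever its outcome.
        session.locku(lock_stateid, offset, count)
```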
If another client has a write delegation for the file being read, the | If another client has a write delegation for the file being read, the | |||
delegation must be recalled, and the operation cannot proceed until | delegation must be recalled, and the operation cannot proceed until | |||
that delegation is returned or revoked. Except where this happens | that delegation is returned or revoked. Except where this happens | |||
very quickly, one or more NFS4ERR_DELAY errors will be returned to | very quickly, one or more NFS4ERR_DELAY errors will be returned to | |||
requests made while the delegation remains outstanding. Normally, | requests made while the delegation remains outstanding. Normally, | |||
delegations will not be recalled as a result of a READ operation | delegations will not be recalled as a result of a READ operation | |||
since the recall will occur as a result of an earlier OPEN. However, | since the recall will occur as a result of an earlier OPEN. However, | |||
since it is possible for a READ to be done with a special stateid, | since it is possible for a READ to be done with a special stateid, | |||
the server needs to check for this case even though the client should | the server needs to check for this case even though the client should | |||
skipping to change at page 458, line 26 | skipping to change at page 458, line 26 | |||
the attributes that follow the bitmap in bit order. | the attributes that follow the bitmap in bit order. | |||
The stateid argument for SETATTR is used to provide file locking | The stateid argument for SETATTR is used to provide file locking | |||
context that is necessary for SETATTR requests that set the size | context that is necessary for SETATTR requests that set the size | |||
attribute. Since setting the size attribute modifies the file's | attribute. Since setting the size attribute modifies the file's | |||
data, it has the same locking requirements as a corresponding WRITE. | data, it has the same locking requirements as a corresponding WRITE. | |||
Any SETATTR that sets the size attribute is incompatible with a share | Any SETATTR that sets the size attribute is incompatible with a share | |||
reservation that specifies DENY_WRITE. The area between the old end- | reservation that specifies DENY_WRITE. The area between the old end- | |||
of-file and the new end-of-file is considered to be modified just as | of-file and the new end-of-file is considered to be modified just as | |||
would have been the case had the area in question been specified as | would have been the case had the area in question been specified as | |||
the target of WRITE, for the purpose of checking conflicts with | the target of WRITE, for the purpose of checking conflicts with byte- | |||
record locks, for those cases in which a server is implementing | range locks, for those cases in which a server is implementing | |||
mandatory record locking behavior. A valid stateid should always be | mandatory byte-range locking behavior. A valid stateid should always | |||
specified. When the file size attribute is not set, the special | be specified. When the file size attribute is not set, the special | |||
stateid consisting of all bits zero should be passed. | stateid consisting of all bits zero should be passed. | |||
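The treatment of a size change as a WRITE to the area between the old and new end-of-file can be sketched as follows. The function names are hypothetical, and the check assumes a server implementing mandatory byte-range locking where any lock held by another owner conflicts with a write to the same bytes.

```python
def size_change_range(old_size, new_size):
    """Return the (offset, length) treated as written, or None if unchanged."""
    if new_size == old_size:
        return None
    start = min(old_size, new_size)
    return start, abs(new_size - old_size)


def conflicts_with_locks(old_size, new_size, other_owner_locks):
    """True if the size change touches any (offset, length) lock in the list."""
    changed = size_change_range(old_size, new_size)
    if changed is None:
        return False
    start, length = changed
    end = start + length
    return any(off < end and start < off + ln for off, ln in other_owner_locks)
```

Truncating a 100-byte file to 40 bytes and extending a 40-byte file to 100 bytes both yield the range (40, 60), so the same locks are checked in either direction.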
On either success or failure of the operation, the server will return | On either success or failure of the operation, the server will return | |||
the attrsset bitmask to represent what (if any) attributes were | the attrsset bitmask to represent what (if any) attributes were | |||
successfully set. The attrsset in the response is a subset of the | successfully set. The attrsset in the response is a subset of the | |||
bitmap4 that is part of the obj_attributes in the argument. | bitmap4 that is part of the obj_attributes in the argument. | |||
On success, the current filehandle retains its value. | On success, the current filehandle retains its value. | |||
18.30.4. IMPLEMENTATION | 18.30.4. IMPLEMENTATION | |||
skipping to change at page 463, line 22 | skipping to change at page 463, line 22 | |||
UNSTABLE4, the server is free to commit any part of the data and the | UNSTABLE4, the server is free to commit any part of the data and the | |||
metadata to stable storage, including all or none, before returning a | metadata to stable storage, including all or none, before returning a | |||
reply to the client. There is no guarantee whether or when any | reply to the client. There is no guarantee whether or when any | |||
uncommitted data will subsequently be committed to stable storage. | uncommitted data will subsequently be committed to stable storage. | |||
The only guarantees made by the server are that it will not destroy | The only guarantees made by the server are that it will not destroy | |||
any data without changing the value of verf and that it will not | any data without changing the value of verf and that it will not | |||
commit the data and metadata at a level less than that requested by | commit the data and metadata at a level less than that requested by | |||
the client. | the client. | |||
Except when special stateids are used, the stateid value for a WRITE | Except when special stateids are used, the stateid value for a WRITE | |||
request represents a value returned from a previous record lock or | request represents a value returned from a previous byte-range lock | |||
share reservation request or the stateid associated with a | or share reservation request or the stateid associated with a | |||
delegation. The stateid identifies the associated owners if any and | delegation. The stateid identifies the associated owners if any and | |||
is used by the server to verify that the associated locks are still | is used by the server to verify that the associated locks are still | |||
valid (e.g., have not been revoked). | valid (e.g., have not been revoked). | |||
Upon successful completion, the following results are returned. The | Upon successful completion, the following results are returned. The | |||
count result is the number of bytes of data written to the file. The | count result is the number of bytes of data written to the file. The | |||
server may write fewer bytes than requested. If so, the actual | server may write fewer bytes than requested. If so, the actual | |||
number of bytes written starting at location, offset, is returned. | number of bytes written starting at location, offset, is returned. | |||
The server also returns an indication of the level of commitment of | The server also returns an indication of the level of commitment of | |||
skipping to change at page 465, line 33 | skipping to change at page 465, line 33 | |||
been committed on the server. | been committed on the server. | |||
Some implementations may return NFS4ERR_NOSPC instead of | Some implementations may return NFS4ERR_NOSPC instead of | |||
NFS4ERR_DQUOT when a user's quota is exceeded. In the case that the | NFS4ERR_DQUOT when a user's quota is exceeded. In the case that the | |||
current filehandle is of type NF4DIR, the server will return | current filehandle is of type NF4DIR, the server will return | |||
NFS4ERR_ISDIR. If the current file is a symbolic link, the error | NFS4ERR_ISDIR. If the current file is a symbolic link, the error | |||
NFS4ERR_SYMLINK will be returned. Otherwise, if the current | NFS4ERR_SYMLINK will be returned. Otherwise, if the current | |||
filehandle does not designate an ordinary file, the server will | filehandle does not designate an ordinary file, the server will | |||
return NFS4ERR_WRONG_TYPE. | return NFS4ERR_WRONG_TYPE. | |||
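The file-type checks in the paragraph above can be sketched as a small lookup. This is only an illustration: the `FileType` enum and the `write_type_error` function are hypothetical names introduced here, error codes are modelled as plain strings rather than protocol values, and only the ordering of the checks is taken from the text.

```python
from enum import Enum


class FileType(Enum):
    """Hypothetical Python stand-in for a subset of the nfs_ftype4 values."""
    NF4REG = "regular file"
    NF4DIR = "directory"
    NF4LNK = "symbolic link"
    NF4BLK = "block device"
    NF4FIFO = "fifo"


def write_type_error(ftype):
    """Return the error name a WRITE on this file type would get, or None.

    Mirrors the order in the text: directory first, then symbolic link,
    then any other object that is not an ordinary file.
    """
    if ftype is FileType.NF4REG:
        return None  # ordinary file: no type error
    if ftype is FileType.NF4DIR:
        return "NFS4ERR_ISDIR"
    if ftype is FileType.NF4LNK:
        return "NFS4ERR_SYMLINK"
    return "NFS4ERR_WRONG_TYPE"
```

For example, `write_type_error(FileType.NF4FIFO)` falls through to the final case and yields `"NFS4ERR_WRONG_TYPE"`.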
If mandatory file locking is on for the file, and the corresponding | If mandatory file locking is on for the file, and the corresponding byte- | |||
record of the data to be written to the file is read or write locked by an | range of the data to be written to the file is read or write locked by an | |||
owner that is not associated with the stateid, the server will return | owner that is not associated with the stateid, the server will return | |||
NFS4ERR_LOCKED. If so, the client must check if the owner | NFS4ERR_LOCKED. If so, the client must check if the owner | |||
corresponding to the stateid used with the WRITE operation has a | corresponding to the stateid used with the WRITE operation has a | |||
conflicting read lock that overlaps with the region that was to be | conflicting read lock that overlaps with the region that was to be | |||
written. If the stateid's owner has no conflicting read lock, then | written. If the stateid's owner has no conflicting read lock, then | |||
the client should try to get the appropriate write record lock via | the client should try to get the appropriate write byte-range lock | |||
the LOCK operation before re-attempting the WRITE. When the WRITE | via the LOCK operation before re-attempting the WRITE. When the | |||
completes, the client should release the record lock via LOCKU. | WRITE completes, the client should release the byte-range lock via | |||
LOCKU. | ||||
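The client-side recovery from NFS4ERR_LOCKED described above might look roughly like the following. This is a sketch under stated assumptions, not a real client API: `retry_locked_write` and its callable parameters `lock`, `write`, and `locku` are hypothetical stand-ins for issuing the LOCK, WRITE, and LOCKU operations, and statuses are modelled as strings.

```python
def retry_locked_write(owner_has_conflicting_read_lock, lock, write, locku):
    """Retry a WRITE that failed with NFS4ERR_LOCKED.

    lock, write and locku are hypothetical zero-argument callables that
    perform the corresponding operation and return a status string.
    """
    if owner_has_conflicting_read_lock:
        # The stateid's owner already holds a conflicting read lock;
        # this sketch does not retry and reports the failure upward.
        return "NFS4ERR_LOCKED"

    # No conflicting read lock: obtain a write byte-range lock first.
    status = lock()
    if status != "NFS4_OK":
        return status

    try:
        # Re-attempt the WRITE under the newly acquired lock.
        return write()
    finally:
        # Release the byte-range lock once the WRITE has completed.
        locku()
```

Releasing the lock in a `finally` clause is a design choice of this sketch, so that the lock is dropped even if the retried WRITE raises.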
If the stateid's owner had a conflicting read lock, then the client | If the stateid's owner had a conflicting read lock, then the client | |||
has no choice but to return an error to the application that | has no choice but to return an error to the application that | |||
attempted the WRITE. The reason is that since the stateid's owner | attempted the WRITE. The reason is that since the stateid's owner | |||
had a read lock, the server either attempted to temporarily | had a read lock, the server either attempted to temporarily | |||
effectively upgrade this read lock to a write lock, or the server has | effectively upgrade this read lock to a write lock, or the server has | |||
no upgrade capability. If the server attempted to upgrade the read | no upgrade capability. If the server attempted to upgrade the read | |||
lock and failed, it is pointless for the client to re-attempt the | lock and failed, it is pointless for the client to re-attempt the | |||
upgrade via the LOCK operation, because there might be another client | upgrade via the LOCK operation, because there might be another client | |||
also trying to upgrade. If two clients are blocked trying to upgrade | also trying to upgrade. If two clients are blocked trying to upgrade | |||
skipping to change at page 498, line 35 | skipping to change at page 498, line 35 | |||
18.38.2. RESULT | 18.38.2. RESULT | |||
struct FREE_STATEID4res { | struct FREE_STATEID4res { | |||
nfsstat4 fsr_status; | nfsstat4 fsr_status; | |||
}; | }; | |||
18.38.3. DESCRIPTION | 18.38.3. DESCRIPTION | |||
The FREE_STATEID operation is used to free a stateid which no longer | The FREE_STATEID operation is used to free a stateid which no longer | |||
has any associated locks (including opens, record locks, delegations, | has any associated locks (including opens, byte-range locks, | |||
layouts). This may be because of client unlock operations or because | delegations, layouts). This may be because of client unlock | |||
of server revocation. If there are valid locks (of any kind) | operations or because of server revocation. If there are valid locks | |||
associated with the stateid in question, the error NFS4ERR_LOCKS_HELD | (of any kind) associated with the stateid in question, the error | |||
will be returned, and the associated stateid will not be freed. | NFS4ERR_LOCKS_HELD will be returned, and the associated stateid will | |||
not be freed. | ||||
When a stateid is freed which had been associated with revoked locks, | When a stateid is freed which had been associated with revoked locks, | |||
the client, by doing the FREE_STATEID, acknowledges the loss of those | the client, by doing the FREE_STATEID, acknowledges the loss of those | |||
locks. This allows the server, once all such revoked state is | locks. This allows the server, once all such revoked state is | |||
acknowledged, to allow that client again to reclaim locks, without | acknowledged, to allow that client again to reclaim locks, without | |||
encountering the edge conditions discussed in Section 8.4.2. | encountering the edge conditions discussed in Section 8.4.2. | |||
Once a successful FREE_STATEID is done for a given stateid, any | Once a successful FREE_STATEID is done for a given stateid, any | |||
subsequent use of that stateid will result in an NFS4ERR_BAD_STATEID | subsequent use of that stateid will result in an NFS4ERR_BAD_STATEID | |||
error. | error. | |||
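The FREE_STATEID rules above can be illustrated with a toy server-side table. Everything here is an assumption made for illustration: the `StateTable` class, its methods, and the string status names are hypothetical, and errors specific to revoked state are deliberately not modelled.

```python
class StateTable:
    """Toy model: maps each known stateid to its count of valid locks."""

    def __init__(self):
        self._valid_locks = {}

    def add(self, stateid, valid_locks):
        # Register a stateid with some number of valid associated locks.
        self._valid_locks[stateid] = valid_locks

    def release_all(self, stateid):
        # Models client unlock operations or server revocation:
        # the stateid stays known but has no valid locks left.
        self._valid_locks[stateid] = 0

    def free_stateid(self, stateid):
        if stateid not in self._valid_locks:
            return "NFS4ERR_BAD_STATEID"
        if self._valid_locks[stateid] > 0:
            # Valid locks remain, so the stateid is not freed.
            return "NFS4ERR_LOCKS_HELD"
        del self._valid_locks[stateid]
        return "NFS4_OK"

    def check(self, stateid):
        # Any use of a freed (or never known) stateid is rejected.
        if stateid not in self._valid_locks:
            return "NFS4ERR_BAD_STATEID"
        return "NFS4_OK"
```

In this model, a stateid that still has valid locks survives a FREE_STATEID attempt, while one whose locks were all released or revoked is removed and rejected on later use.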
skipping to change at page 512, line 27 | skipping to change at page 512, line 27 | |||
1 overlaps two or more striping patterns. In which case, | 1 overlaps two or more striping patterns. In which case, | |||
logr_layout will contain two or more elements, and the sum of the | logr_layout will contain two or more elements, and the sum of the | |||
lo_length fields of each element MUST be at least loga_minlength | lo_length fields of each element MUST be at least loga_minlength | |||
unless the first exception also applies. | unless the first exception also applies. | |||
If this requirement cannot be met, the server MUST NOT return a | If this requirement cannot be met, the server MUST NOT return a | |||
layout and the error NFS4ERR_BADLAYOUT MUST be returned. | layout and the error NFS4ERR_BADLAYOUT MUST be returned. | |||
The loga_stateid field specifies a valid stateid. If a layout is not | The loga_stateid field specifies a valid stateid. If a layout is not | |||
currently held by the client, the loga_stateid field represents a | currently held by the client, the loga_stateid field represents a | |||
stateid reflecting the correspondingly valid open, record lock, or | stateid reflecting the correspondingly valid open, byte-range lock, | |||
delegation stateid. Once a layout is held by the client for the | or delegation stateid. Once a layout is held by the client for the | |||
file, the loga_stateid field is a stateid as returned from a previous | file, the loga_stateid field is a stateid as returned from a previous | |||
LAYOUTGET or LAYOUTRETURN operation or provided by a CB_LAYOUTRECALL | LAYOUTGET or LAYOUTRETURN operation or provided by a CB_LAYOUTRECALL | |||
operation (see Section 12.5.3). | operation (see Section 12.5.3). | |||
The loga_maxcount field specifies the maximum layout size (in bytes) | The loga_maxcount field specifies the maximum layout size (in bytes) | |||
that the client can handle. If the size of the layout structure | that the client can handle. If the size of the layout structure | |||
exceeds the size specified by maxcount, the metadata server will | exceeds the size specified by maxcount, the metadata server will | |||
return the NFS4ERR_TOOSMALL error. | return the NFS4ERR_TOOSMALL error. | |||
The returned layout is expressed as an array, logr_layout, with each | The returned layout is expressed as an array, logr_layout, with each | |||
End of changes. 52 change blocks. | ||||
180 lines changed or deleted | 183 lines changed or added | |||
This html diff was produced by rfcdiff 1.33. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ |