| draft-pre-record-byte-range.txt | draft-ietf-nfsv4-minorversion1-22.txt | |||
|---|---|---|---|---|
| skipping to change at page 14, line 5 | skipping to change at page 14, line 5 | |||
| is irrevocably granted a lock. At the end of a lease period the | is irrevocably granted a lock. At the end of a lease period the | |||
| lock may be revoked if the lease has not been extended. The lock | lock may be revoked if the lease has not been extended. The lock | |||
| must be revoked if a conflicting lock has been granted after the | must be revoked if a conflicting lock has been granted after the | |||
| lease interval. | lease interval. | |||
| All leases granted by a server have the same fixed interval. Note | All leases granted by a server have the same fixed interval. Note | |||
| that the fixed interval was chosen to alleviate the expense a | that the fixed interval was chosen to alleviate the expense a | |||
| server would have in maintaining state about variable length | server would have in maintaining state about variable length | |||
| leases across server failures. | leases across server failures. | |||
| Lock The term "lock" is used to refer to record (byte-range) locks, | Lock The term "lock" is used to refer to byte-range (in UNIX | |||
| share reservations, delegations, or layouts unless specifically | environments, also known as record) locks, share reservations, | |||
| stated otherwise. | delegations, or layouts unless specifically stated otherwise. | |||
| Server The "Server" is the entity responsible for coordinating | Server The "Server" is the entity responsible for coordinating | |||
| client access to a set of file systems and is identified by a | client access to a set of file systems and is identified by a | |||
| Server owner. A server can span multiple network addresses. | Server owner. A server can span multiple network addresses. | |||
| Server Owner The "Server Owner" identifies the server to the client. | Server Owner The "Server Owner" identifies the server to the client. | |||
| The server owner consists of a major and minor identifier. When | The server owner consists of a major and minor identifier. When | |||
| the client has two connections each to a peer with the same major | the client has two connections each to a peer with the same major | |||
| identifier, the client assumes both peers are the same server (the | identifier, the client assumes both peers are the same server (the | |||
| server namespace is the same via each connection), and assumes and | server namespace is the same via each connection), and assumes and | |||
| skipping to change at page 147, line 36 | skipping to change at page 147, line 36 | |||
| Where there is concern about the security of data on the network, | Where there is concern about the security of data on the network, | |||
| clients should use strong security mechanisms to access the pseudo | clients should use strong security mechanisms to access the pseudo | |||
| file system in order to prevent man-in-the-middle attacks. | file system in order to prevent man-in-the-middle attacks. | |||
| 8. State Management | 8. State Management | |||
| Integrating locking into the NFS protocol necessarily causes it to be | Integrating locking into the NFS protocol necessarily causes it to be | |||
| stateful. With the inclusion of such features as share reservations, | stateful. With the inclusion of such features as share reservations, | |||
| file and directory delegations, recallable layouts, and support for | file and directory delegations, recallable layouts, and support for | |||
| mandatory record locking, the protocol becomes substantially more | mandatory byte-range locking, the protocol becomes substantially more | |||
| dependent on proper management of state than the traditional | dependent on proper management of state than the traditional | |||
| combination of NFS and NLM [36]. These features include expanded | combination of NFS and NLM [36]. These features include expanded | |||
| locking facilities, which provide some measure of interclient | locking facilities, which provide some measure of interclient | |||
| exclusion, but the state also offers features not readily providable | exclusion, but the state also offers features not readily providable | |||
| using a stateless model. There are three components to making this | using a stateless model. There are three components to making this | |||
| state manageable: | state manageable: | |||
| o Clear division between client and server | o Clear division between client and server | |||
| o Ability to reliably detect inconsistency in state between client | o Ability to reliably detect inconsistency in state between client | |||
| skipping to change at page 148, line 41 | skipping to change at page 148, line 41 | |||
| For some types of locking interactions, the client will represent | For some types of locking interactions, the client will represent | |||
| some number of internal locking entities called "owners", which | some number of internal locking entities called "owners", which | |||
| normally correspond to processes internal to the client. For other | normally correspond to processes internal to the client. For other | |||
| types of locking-related objects, such as delegations and layouts, no | types of locking-related objects, such as delegations and layouts, no | |||
| such intermediate entities are provided for, and the locking-related | such intermediate entities are provided for, and the locking-related | |||
| objects are considered to be transferred directly between the server | objects are considered to be transferred directly between the server | |||
| and a unitary client. | and a unitary client. | |||
| 8.2. Stateid Definition | 8.2. Stateid Definition | |||
| When the server grants a lock of any type (including opens, record | When the server grants a lock of any type (including opens, byte- | |||
| locks, delegations, and layouts) it responds with a unique stateid | range locks, delegations, and layouts) it responds with a unique | |||
| that represents a set of locks (often a single lock) for the same | stateid that represents a set of locks (often a single lock) for the | |||
| file, of the same type, and sharing the same ownership | same file, of the same type, and sharing the same ownership | |||
| characteristics. Thus opens of the same file by different open- | characteristics. Thus opens of the same file by different open- | |||
| owners each have an identifying stateid. Similarly, each set of | owners each have an identifying stateid. Similarly, each set of | |||
| record locks on a file owned by a specific lock-owner has its own | byte-range locks on a file owned by a specific lock-owner has its own | |||
| identifying stateid. Delegations and layouts also have associated | identifying stateid. Delegations and layouts also have associated | |||
| stateids by which they may be referenced. The stateid is used as a | stateids by which they may be referenced. The stateid is used as a | |||
| shorthand reference to a lock or set of locks and given a stateid the | shorthand reference to a lock or set of locks and given a stateid the | |||
| server can determine the associated state-owner or state-owners (in | server can determine the associated state-owner or state-owners (in | |||
| the case of an open-owner/lock-owner pair) and the associated | the case of an open-owner/lock-owner pair) and the associated | |||
| filehandle. When stateids are used, the current filehandle must be | filehandle. When stateids are used, the current filehandle must be | |||
| the one associated with that stateid. | the one associated with that stateid. | |||
| All stateids associated with a given client ID are associated with a | All stateids associated with a given client ID are associated with a | |||
| common lease which represents the claim of those stateids and the | common lease which represents the claim of those stateids and the | |||
| skipping to change at page 153, line 14 | skipping to change at page 153, line 14 | |||
| the operation to which the stateid is passed will return | the operation to which the stateid is passed will return | |||
| NFS4ERR_BAD_STATEID. | NFS4ERR_BAD_STATEID. | |||
| 8.2.4. Stateid Lifetime and Validation | 8.2.4. Stateid Lifetime and Validation | |||
| Stateids must remain valid until either a client restart or a server | Stateids must remain valid until either a client restart or a server | |||
| restart or until the client returns all of the locks associated with | restart or until the client returns all of the locks associated with | |||
| the stateid by means of an operation such as CLOSE or DELEGRETURN. | the stateid by means of an operation such as CLOSE or DELEGRETURN. | |||
| If the locks are lost due to revocation the stateid remains a valid | If the locks are lost due to revocation the stateid remains a valid | |||
| designation of that revoked state until the client frees it by using | designation of that revoked state until the client frees it by using | |||
| FREE_STATEID. Stateids associated with record locks are an | FREE_STATEID. Stateids associated with byte-range locks are an | |||
| exception. They remain valid even if a LOCKU frees all remaining | exception. They remain valid even if a LOCKU frees all remaining | |||
| locks, so long as the open file with which they are associated | locks, so long as the open file with which they are associated | |||
| remains open, unless the client does a FREE_STATEID to cause the | remains open, unless the client does a FREE_STATEID to cause the | |||
| stateid to be freed. | stateid to be freed. | |||
| It should be noted that there are situations in which the client's | It should be noted that there are situations in which the client's | |||
| locks become invalid, without the client requesting they be returned. | locks become invalid, without the client requesting they be returned. | |||
| These include lease expiration and a number of forms of lock | These include lease expiration and a number of forms of lock | |||
| revocation within the lease period. It is important to note that in | revocation within the lease period. It is important to note that in | |||
| these situations, the stateid remains valid and the client can use it | these situations, the stateid remains valid and the client can use it | |||
| skipping to change at page 154, line 5 | skipping to change at page 154, line 5 | |||
| And then store in each table entry, | And then store in each table entry, | |||
| o The client ID with which the stateid is associated. | o The client ID with which the stateid is associated. | |||
| o The current generation number for the (at most one) valid stateid | o The current generation number for the (at most one) valid stateid | |||
| sharing this index value. | sharing this index value. | |||
| o The filehandle of the file on which the locks are taken. | o The filehandle of the file on which the locks are taken. | |||
| o An indication of the type of stateid (open, record lock, file | o An indication of the type of stateid (open, byte-range lock, file | |||
| delegation, directory delegation, layout). | delegation, directory delegation, layout). | |||
| o The last "seqid" value returned corresponding to the current | o The last "seqid" value returned corresponding to the current | |||
| "other" value. | "other" value. | |||
| o An indication of the current status of the locks associated with | o An indication of the current status of the locks associated with | |||
| this stateid. In particular, whether these have been revoked and | this stateid. In particular, whether these have been revoked and | |||
| if so, for what reason. | if so, for what reason. | |||
| With this information, an incoming stateid can be validated and the | With this information, an incoming stateid can be validated and the | |||
| skipping to change at page 160, line 30 | skipping to change at page 160, line 30 | |||
| establishes its lease before expiration occurs, requests for | establishes its lease before expiration occurs, requests for | |||
| conflicting locks will not be granted. | conflicting locks will not be granted. | |||
| To minimize client delay upon restart, lock requests are associated | To minimize client delay upon restart, lock requests are associated | |||
| with an instance of the client by a client-supplied verifier. This | with an instance of the client by a client-supplied verifier. This | |||
| verifier is part of the client_owner4 sent in the initial EXCHANGE_ID | verifier is part of the client_owner4 sent in the initial EXCHANGE_ID | |||
| call made by the client. The server returns a client ID as a result | call made by the client. The server returns a client ID as a result | |||
| of the EXCHANGE_ID operation. The client then confirms the use of | of the EXCHANGE_ID operation. The client then confirms the use of | |||
| the client ID by establishing a session associated with that client | the client ID by establishing a session associated with that client | |||
| ID (see Section 18.36.3 for a description of how this is done). All | ID (see Section 18.36.3 for a description of how this is done). All | |||
| locks, including opens, record locks, delegations, and layouts | locks, including opens, byte-range locks, delegations, and layouts | |||
| obtained by sessions using that client ID are associated with that | obtained by sessions using that client ID are associated with that | |||
| client ID. | client ID. | |||
| Since the verifier will be changed by the client upon each | Since the verifier will be changed by the client upon each | |||
| initialization, the server can compare a new verifier to the verifier | initialization, the server can compare a new verifier to the verifier | |||
| associated with currently held locks and determine that they do not | associated with currently held locks and determine that they do not | |||
| match. This signifies the client's new instantiation and subsequent | match. This signifies the client's new instantiation and subsequent | |||
| loss (upon confirmation of the new client ID) of locking state. As a | loss (upon confirmation of the new client ID) of locking state. As a | |||
| result, the server is free to release all locks held which are | result, the server is free to release all locks held which are | |||
| associated with the old client ID which was derived from the old | associated with the old client ID which was derived from the old | |||
| skipping to change at page 163, line 38 | skipping to change at page 163, line 38 | |||
| For a server to provide simple, valid handling during the grace | For a server to provide simple, valid handling during the grace | |||
| period, the easiest method is to simply reject all non-reclaim | period, the easiest method is to simply reject all non-reclaim | |||
| locking requests and READ and WRITE operations by returning the | locking requests and READ and WRITE operations by returning the | |||
| NFS4ERR_GRACE error. However, a server may keep information about | NFS4ERR_GRACE error. However, a server may keep information about | |||
| granted locks in stable storage. With this information, the server | granted locks in stable storage. With this information, the server | |||
| could determine if a regular lock or READ or WRITE operation can be | could determine if a regular lock or READ or WRITE operation can be | |||
| safely processed. | safely processed. | |||
| For example, if the server maintained on stable storage summary | For example, if the server maintained on stable storage summary | |||
| information on whether mandatory locks exist, either mandatory record | information on whether mandatory locks exist, either mandatory byte- | |||
| locks, or share reservations specifying deny modes, many requests | range locks, or share reservations specifying deny modes, many | |||
| could be allowed during the grace period. If it is known that no | requests could be allowed during the grace period. If it is known | |||
| such share reservations exist, OPEN requests that do not specify deny | that no such share reservations exist, OPEN requests that do not | |||
| modes may be safely granted. If, in addition, it is known that no | specify deny modes may be safely granted. If, in addition, it is | |||
| mandatory record locks exist, either through information stored on | known that no mandatory byte-range locks exist, either through | |||
| stable storage or simply because the server does not support such | information stored on stable storage or simply because the server | |||
| locks, READ and WRITE requests may be safely processed during the | does not support such locks, READ and WRITE requests may be safely | |||
| grace period. Another important case is where it is known that no | processed during the grace period. Another important case is where | |||
| mandatory byte-range locks exist, either because the server does not | it is known that no mandatory byte-range locks exist, either because | |||
| provide support for them, or because their absence is known from | the server does not provide support for them, or because their | |||
| persistently recorded data. In this case, READ and WRITE operations | absence is known from persistently recorded data. In this case, READ | |||
| specifying stateids derived from reclaim-type operations may be | and WRITE operations specifying stateids derived from reclaim-type | |||
| validly processed during the grace period because the fact of the | operations may be validly processed during the grace period because | |||
| valid reclaim ensures that no lock subsequently granted can prevent | the fact of the valid reclaim ensures that no lock subsequently | |||
| the I/O. | granted can prevent the I/O. | |||
| To reiterate, for a server that allows non-reclaim lock and I/O | To reiterate, for a server that allows non-reclaim lock and I/O | |||
| requests to be processed during the grace period, it MUST determine | requests to be processed during the grace period, it MUST determine | |||
| that no lock subsequently reclaimed will be rejected and that no lock | that no lock subsequently reclaimed will be rejected and that no lock | |||
| subsequently reclaimed would have prevented any I/O operation | subsequently reclaimed would have prevented any I/O operation | |||
| processed during the grace period. | processed during the grace period. | |||
| Clients should be prepared for the return of NFS4ERR_GRACE errors for | Clients should be prepared for the return of NFS4ERR_GRACE errors for | |||
| non-reclaim lock and I/O requests. In this case the client should | non-reclaim lock and I/O requests. In this case the client should | |||
| employ a retry mechanism for the request. A delay (on the order of | employ a retry mechanism for the request. A delay (on the order of | |||
| skipping to change at page 167, line 41 | skipping to change at page 167, line 41 | |||
| reclaims, requires that the server record in stable storage | reclaims, requires that the server record in stable storage | |||
| some minimal information. For example, a server | some minimal information. For example, a server | |||
| implementation could, for each client, save in stable storage a | implementation could, for each client, save in stable storage a | |||
| record containing: | record containing: | |||
| o the co_ownerid field from the client_owner4 presented in the | o the co_ownerid field from the client_owner4 presented in the | |||
| EXCHANGE_ID operation. | EXCHANGE_ID operation. | |||
| o a boolean that indicates if the client's lease expired or if there | o a boolean that indicates if the client's lease expired or if there | |||
| was administrative intervention (see Section 8.5) to revoke a | was administrative intervention (see Section 8.5) to revoke a | |||
| record lock, share reservation, or delegation and there has been | byte-range lock, share reservation, or delegation and there has | |||
| no acknowledgement, via FREE_STATEID, of such revocation. | been no acknowledgement, via FREE_STATEID, of such revocation. | |||
| o a boolean that indicates whether the client may have locks that it | o a boolean that indicates whether the client may have locks that it | |||
| believes to be reclaimable in situations in which the grace period | believes to be reclaimable in situations in which the grace period | |||
| was terminated, making the server's view of lock reclaimability | was terminated, making the server's view of lock reclaimability | |||
| suspect. The server will set this for any client record in stable | suspect. The server will set this for any client record in stable | |||
| storage where the client has not done a suitable RECLAIM_COMPLETE | storage where the client has not done a suitable RECLAIM_COMPLETE | |||
| (global or file system-specific depending on the target of the | (global or file system-specific depending on the target of the | |||
| lock request) before it grants any new (i.e. not reclaimed) lock | lock request) before it grants any new (i.e. not reclaimed) lock | |||
| to any client. | to any client. | |||
| Assuming the above record keeping, for the first edge condition, | Assuming the above record keeping, for the first edge condition, | |||
| after the server restarts, the record that client A's lease expired | after the server restarts, the record that client A's lease expired | |||
| means that another client could have acquired a conflicting record | means that another client could have acquired a conflicting byte- | |||
| lock, share reservation, or delegation. Hence the server must reject | range lock, share reservation, or delegation. Hence the server must | |||
| a reclaim from client A with the error NFS4ERR_NO_GRACE. | reject a reclaim from client A with the error NFS4ERR_NO_GRACE. | |||
| For the second edge condition, after the server restarts for a second | For the second edge condition, after the server restarts for a second | |||
| time, the indication that the client had not completed its reclaims | time, the indication that the client had not completed its reclaims | |||
| at the time at which the grace period ended means that the server | at the time at which the grace period ended means that the server | |||
| must reject a reclaim from client A with the error NFS4ERR_NO_GRACE. | must reject a reclaim from client A with the error NFS4ERR_NO_GRACE. | |||
| When either edge condition occurs, the client's attempt to reclaim | When either edge condition occurs, the client's attempt to reclaim | |||
| locks will result in the error NFS4ERR_NO_GRACE. When this is | locks will result in the error NFS4ERR_NO_GRACE. When this is | |||
| received, or after the client restarts with no lock state, the client | received, or after the client restarts with no lock state, the client | |||
| will send a global RECLAIM_COMPLETE. When the RECLAIM_COMPLETE is | will send a global RECLAIM_COMPLETE. When the RECLAIM_COMPLETE is | |||
| received, the server and client are again in agreement regarding | received, the server and client are again in agreement regarding | |||
| reclaimable locks and both booleans in persistent storage can be | reclaimable locks and both booleans in persistent storage can be | |||
| reset, to be set again only when there is a subsequent event that | reset, to be set again only when there is a subsequent event that | |||
| causes lock reclaim operations to be questionable. | causes lock reclaim operations to be questionable. | |||
| Regardless of the level and approach to record keeping, the server | Regardless of the level and approach to record keeping, the server | |||
| MUST implement one of the following strategies (which apply to | MUST implement one of the following strategies (which apply to | |||
| reclaims of share reservations, record locks, and delegations): | reclaims of share reservations, byte-range locks, and delegations): | |||
| 1. Reject all reclaims with NFS4ERR_NO_GRACE. This is extremely | 1. Reject all reclaims with NFS4ERR_NO_GRACE. This is extremely | |||
| unforgiving, but necessary if the server does not record lock | unforgiving, but necessary if the server does not record lock | |||
| state in stable storage. | state in stable storage. | |||
| 2. Record sufficient state in stable storage such that all known | 2. Record sufficient state in stable storage such that all known | |||
| edge conditions involving server restart, including the two noted | edge conditions involving server restart, including the two noted | |||
| in this section, are detected. It is acceptable to erroneously | in this section, are detected. It is acceptable to erroneously | |||
| recognize an edge condition and not allow a reclaim, when, with | recognize an edge condition and not allow a reclaim, when, with | |||
| sufficient knowledge it would be allowed. The error the server | sufficient knowledge it would be allowed. The error the server | |||
| skipping to change at page 169, line 8 | skipping to change at page 169, line 8 | |||
| outside the scope of this specification, since the strategies for | outside the scope of this specification, since the strategies for | |||
| such handling are very dependent on the client's operating | such handling are very dependent on the client's operating | |||
| environment. However, one potential approach is described below. | environment. However, one potential approach is described below. | |||
| When the client receives NFS4ERR_NO_GRACE, it could examine the | When the client receives NFS4ERR_NO_GRACE, it could examine the | |||
| change attribute of the objects the client is trying to reclaim state | change attribute of the objects the client is trying to reclaim state | |||
| for, and use that to determine whether to re-establish the state via | for, and use that to determine whether to re-establish the state via | |||
| normal OPEN or LOCK requests. This is acceptable provided the | normal OPEN or LOCK requests. This is acceptable provided the | |||
| client's operating environment allows it. In other words, the client | client's operating environment allows it. In other words, the client | |||
| implementor is advised to document the behavior for users. The | implementor is advised to document the behavior for users. The | |||
| client could also inform the application that its record lock or | client could also inform the application that its byte-range lock or | |||
| share reservations (whether they were delegated or not) have been | share reservations (whether they were delegated or not) have been | |||
| lost, such as via a UNIX signal, a GUI pop-up window, etc. See | lost, such as via a UNIX signal, a GUI pop-up window, etc. See | |||
| Section 10.5 for a discussion of what the client should do for | Section 10.5 for a discussion of what the client should do for | |||
| dealing with unreclaimed delegations on client state. | dealing with unreclaimed delegations on client state. | |||
| For further discussion of revocation of locks see Section 8.5. | For further discussion of revocation of locks see Section 8.5. | |||
| 8.5. Server Revocation of Locks | 8.5. Server Revocation of Locks | |||
| At any point, the server can revoke locks held by a client and the | At any point, the server can revoke locks held by a client and the | |||
| skipping to change at page 172, line 38 | skipping to change at page 172, line 38 | |||
| It is assumed that manipulating a byte-range lock is rare when | It is assumed that manipulating a byte-range lock is rare when | |||
| compared to READ and WRITE operations. It is also assumed that | compared to READ and WRITE operations. It is also assumed that | |||
| server restarts and network partitions are relatively rare. | server restarts and network partitions are relatively rare. | |||
| Therefore it is important that the READ and WRITE operations have a | Therefore it is important that the READ and WRITE operations have a | |||
| lightweight mechanism to indicate if they possess a held lock. A | lightweight mechanism to indicate if they possess a held lock. A | |||
| byte-range lock request contains the heavyweight information required | byte-range lock request contains the heavyweight information required | |||
| to establish a lock and uniquely define the owner of the lock. | to establish a lock and uniquely define the owner of the lock. | |||
| 9.1.1. State-owner Definition | 9.1.1. State-owner Definition | |||
| When opening a file or requesting a record lock, the client must | When opening a file or requesting a byte-range lock, the client must | |||
| specify an identifier which represents the owner of the requested | specify an identifier which represents the owner of the requested | |||
| lock. This identifier is in the form of a state-owner, represented | lock. This identifier is in the form of a state-owner, represented | |||
| in the protocol by a state_owner4, a variable-length opaque array | in the protocol by a state_owner4, a variable-length opaque array | |||
| which, when concatenated with the current client ID uniquely defines | which, when concatenated with the current client ID uniquely defines | |||
| the owner of a lock managed by the client. This may be a thread id, | the owner of a lock managed by the client. This may be a thread id, | |||
| process id, or other unique value. | process id, or other unique value. | |||
| Owners of opens and owners of record locks are separate entities and | Owners of opens and owners of byte-range locks are separate entities | |||
| remain separate even if the same opaque arrays are used to designate | and remain separate even if the same opaque arrays are used to | |||
| owners of each. The protocol distinguishes between open-owners | designate owners of each. The protocol distinguishes between open- | |||
| (represented by open_owner4 structures) and lock-owners (represented | owners (represented by open_owner4 structures) and lock-owners | |||
| by lock_owner4 structures). | (represented by lock_owner4 structures). | |||
| Each open is associated with a specific open-owner while each record | Each open is associated with a specific open-owner while each byte- | |||
| lock is associated with a lock-owner and an open-owner, the latter | range lock is associated with a lock-owner and an open-owner, the | |||
| being the open-owner associated with the open file under which the | latter being the open-owner associated with the open file under which | |||
| LOCK operation was done. Delegations and layouts, on the other hand, | the LOCK operation was done. Delegations and layouts, on the other | |||
| are not associated with a specific owner but are associated with the | hand, are not associated with a specific owner but are associated | |||
| client as a whole (identified by a client ID). | with the client as a whole (identified by a client ID). | |||
| 9.1.2. Use of the Stateid and Locking | 9.1.2. Use of the Stateid and Locking | |||
| All READ, WRITE and SETATTR operations contain a stateid. For the | All READ, WRITE and SETATTR operations contain a stateid. For the | |||
| purposes of this section, SETATTR operations which change the size | purposes of this section, SETATTR operations which change the size | |||
| attribute of a file are treated as if they are writing the area | attribute of a file are treated as if they are writing the area | |||
| between the old and new size (i.e. the range truncated or added to | between the old and new size (i.e. the range truncated or added to | |||
| the file by means of the SETATTR), even where SETATTR is not | the file by means of the SETATTR), even where SETATTR is not | |||
| explicitly mentioned in the text. The stateid passed to one of these | explicitly mentioned in the text. The stateid passed to one of these | |||
| operations must be one that represents an open, a set of byte-range | operations must be one that represents an open, a set of byte-range | |||
| locks, or a delegation, or it may be a special stateid representing | locks, or a delegation, or it may be a special stateid representing | |||
| anonymous access or the special bypass stateid. | anonymous access or the special bypass stateid. | |||
| If the state-owner performs a READ or WRITE in a situation in which | If the state-owner performs a READ or WRITE in a situation in which | |||
| it has established a byte-range lock or share reservation on the | it has established a byte-range lock or share reservation on the | |||
| server (any OPEN constitutes a share reservation) the stateid | server (any OPEN constitutes a share reservation) the stateid | |||
| (previously returned by the server) must be used to indicate what | (previously returned by the server) must be used to indicate what | |||
| locks, including both record locks and share reservations, are held | locks, including both byte-range locks and share reservations, are | |||
| by the state-owner. If no state is established by the client, either | held by the state-owner. If no state is established by the client, | |||
| record lock or share reservation, a special stateid for anonymous | either byte-range lock or share reservation, a special stateid for | |||
| state (zero as "other" and "seqid") is used. (See Section 8.2.3 for | anonymous state (zero as "other" and "seqid") is used. (See | |||
| a description of 'special' stateids in general.) Regardless of whether | Section 8.2.3 for a description of 'special' stateids in general.) | |||
| a stateid for anonymous state or a stateid returned by the server is | Regardless of whether a stateid for anonymous state or a stateid | |||
| used, if there is a conflicting share reservation or mandatory record | returned by the server is used, if there is a conflicting share | |||
| lock held on the file, the server MUST refuse to service the READ or | reservation or mandatory byte-range lock held on the file, the server | |||
| WRITE operation. | MUST refuse to service the READ or WRITE operation. | |||
| Share reservations are established by OPEN operations and by their | Share reservations are established by OPEN operations and by their | |||
| nature are mandatory in that when the OPEN denies READ or WRITE | nature are mandatory in that when the OPEN denies READ or WRITE | |||
| operations, that denial results in such operations being rejected | operations, that denial results in such operations being rejected | |||
| with error NFS4ERR_LOCKED. Record locks may be implemented by the | with error NFS4ERR_LOCKED. Byte-range locks may be implemented by | |||
| server as either mandatory or advisory, or the choice of mandatory or | the server as either mandatory or advisory, or the choice of | |||
| advisory behavior may be determined by the server on the basis of the | mandatory or advisory behavior may be determined by the server on the | |||
| file being accessed (for example, some UNIX-based servers support a | basis of the file being accessed (for example, some UNIX-based | |||
| "mandatory lock bit" on the mode attribute such that if set, record | servers support a "mandatory lock bit" on the mode attribute such | |||
| locks are required on the file before I/O is possible). When record | that if set, byte-range locks are required on the file before I/O is | |||
| locks are advisory, they only prevent the granting of conflicting | possible). When byte-range locks are advisory, they only prevent the | |||
| lock requests and have no effect on READs or WRITEs. Mandatory | granting of conflicting lock requests and have no effect on READs or | |||
| record locks, however, prevent conflicting I/O operations. When they | WRITEs. Mandatory byte-range locks, however, prevent conflicting I/O | |||
| are attempted, they are rejected with NFS4ERR_LOCKED. When the | operations. When they are attempted, they are rejected with | |||
| client gets NFS4ERR_LOCKED on a file it knows it has the proper share | NFS4ERR_LOCKED. When the client gets NFS4ERR_LOCKED on a file it | |||
| reservation for, it will need to send a LOCK request on the region of | knows it has the proper share reservation for, it will need to send a | |||
| the file that includes the region the I/O was to be performed on, | LOCK request on the region of the file that includes the region the | |||
| with an appropriate locktype (i.e. READ*_LT for a READ operation, | I/O was to be performed on, with an appropriate locktype (i.e. | |||
| WRITE*_LT for a WRITE operation). | READ*_LT for a READ operation, WRITE*_LT for a WRITE operation). | |||
| Note that for UNIX environments that support mandatory file locking, | Note that for UNIX environments that support mandatory file locking, | |||
| the distinction between advisory and mandatory locking is subtle. In | the distinction between advisory and mandatory locking is subtle. In | |||
| fact, advisory and mandatory record locks are exactly the same insofar | fact, advisory and mandatory byte-range locks are exactly the same | |||
| as the APIs and implementation requirements are concerned. If the mandatory | insofar as the APIs and implementation requirements are concerned. If the | |||
| lock attribute is set on the file, the server checks to see if the | mandatory lock attribute is set on the file, the server checks to see | |||
| lock-owner has an appropriate shared (read) or exclusive (write) | if the lock-owner has an appropriate shared (read) or exclusive | |||
| record lock on the region it wishes to read or write to. If there is | (write) byte-range lock on the region it wishes to read or write to. | |||
| no appropriate lock, the server checks if there is a conflicting lock | If there is no appropriate lock, the server checks if there is a | |||
| (which can be done by attempting to acquire the conflicting lock on | conflicting lock (which can be done by attempting to acquire the | |||
| behalf of the lock-owner, and if successful, release the lock after | conflicting lock on behalf of the lock-owner, and if successful, | |||
| the READ or WRITE is done), and if there is, the server returns | release the lock after the READ or WRITE is done), and if there is, | |||
| NFS4ERR_LOCKED. | the server returns NFS4ERR_LOCKED. | |||
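The two-step check described above for a file with the mandatory lock attribute set can be sketched roughly as follows. This is only an illustration under stated assumptions, not the draft's algorithm: the `ByteRangeLock` type, the `check_mandatory_io` helper, the half-open `[start, end)` ranges, and the rule that a single held lock must cover the whole I/O range are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ByteRangeLock:
    """Hypothetical in-memory record of one granted byte-range lock."""
    owner: str       # lock-owner identifier (illustrative)
    start: int       # first byte covered
    end: int         # one past the last byte covered
    exclusive: bool  # True for a write lock, False for a read lock


def overlaps(lock, start, end):
    """True if the lock intersects the half-open range [start, end)."""
    return lock.start < end and start < lock.end


def check_mandatory_io(locks, owner, start, end, is_write):
    """Return "NFS4_OK" or "NFS4ERR_LOCKED" for an I/O on [start, end)."""
    # Step 1: does the requesting lock-owner already hold an appropriate
    # lock (shared for reads, exclusive for writes) covering the range?
    for lock in locks:
        covers = lock.start <= start and end <= lock.end
        if lock.owner == owner and covers and (lock.exclusive or not is_write):
            return "NFS4_OK"
    # Step 2: otherwise, would acquiring a lock for this I/O conflict with
    # a lock held by some other owner? Reads conflict only with exclusive
    # locks; writes conflict with any overlapping lock.
    for lock in locks:
        if lock.owner != owner and overlaps(lock, start, end):
            if is_write or lock.exclusive:
                return "NFS4ERR_LOCKED"
    return "NFS4_OK"
```

With one exclusive lock held by owner "A" on bytes 0-99, a READ of that range by any other owner is refused with NFS4ERR_LOCKED in this sketch, while I/O by "A" itself proceeds.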
| For Windows environments, record locks are always mandatory, so the | For Windows environments, byte-range locks are always mandatory, so | |||
| server always checks for record locks during I/O requests. | the server always checks for byte-range locks during I/O requests. | |||
| Thus, the NFSv4.1 LOCK operation does not need to distinguish between | Thus, the NFSv4.1 LOCK operation does not need to distinguish between | |||
| advisory and mandatory record locks. It is the NFSv4.1 server's | advisory and mandatory byte-range locks. It is the NFSv4.1 server's | |||
| processing of the READ and WRITE operations that introduces the | processing of the READ and WRITE operations that introduces the | |||
| distinction. | distinction. | |||
| Every stateid which is validly passed to READ, WRITE or SETATTR, with | Every stateid which is validly passed to READ, WRITE or SETATTR, with | |||
| the exception of special stateid values, defines an access mode for | the exception of special stateid values, defines an access mode for | |||
| the file (i.e. READ, WRITE, or READ-WRITE): | the file (i.e. READ, WRITE, or READ-WRITE): | |||
| o For stateids associated with opens, this is the mode defined by | o For stateids associated with opens, this is the mode defined by | |||
| the original OPEN which caused the allocation of the open stateid | the original OPEN which caused the allocation of the open stateid | |||
| and as modified by subsequent OPENs and OPEN_DOWNGRADEs for the | and as modified by subsequent OPENs and OPEN_DOWNGRADEs for the | |||
| same open-owner/file pair. | same open-owner/file pair. | |||
| o For stateids returned by record lock requests, the appropriate | o For stateids returned by byte-range lock requests, the appropriate | |||
| mode is the access mode for the open stateid associated with the | mode is the access mode for the open stateid associated with the | |||
| lock set represented by the stateid. | lock set represented by the stateid. | |||
| o For delegation stateids the access mode is based on the type of | o For delegation stateids the access mode is based on the type of | |||
| delegation. | delegation. | |||
| When a READ, WRITE, or SETATTR (which specifies the size attribute) | When a READ, WRITE, or SETATTR (which specifies the size attribute) | |||
| is done, the operation is subject to checking against the access mode | is done, the operation is subject to checking against the access mode | |||
| to verify that the operation is appropriate given the stateid with | to verify that the operation is appropriate given the stateid with | |||
| which the operation is associated. | which the operation is associated. | |||
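The access-mode rules above might be modelled as in the sketch below. The names `access_mode` and `is_permitted`, the bit values, and the string labels are assumptions made for illustration. In particular, mapping a read delegation to READ and a write delegation to READ-WRITE is an assumed reading of "based on the type of delegation", and the error a server returns on a failed check is deliberately left out.

```python
READ, WRITE = 1, 2  # bit values chosen for this sketch only


def access_mode(stateid_kind, open_mode=0, delegation_type=None):
    """Derive the access mode that a non-special stateid defines."""
    if stateid_kind in ("open", "lock"):
        # A byte-range lock stateid inherits the mode of the open stateid
        # associated with its lock set.
        return open_mode
    if stateid_kind == "delegation":
        # Assumption: read delegation -> READ, write delegation -> READ-WRITE.
        return READ if delegation_type == "read" else READ | WRITE
    raise ValueError("special stateids do not define an access mode")


def is_permitted(op, mode):
    """Check a READ, WRITE, or size-changing SETATTR against a mode."""
    # A SETATTR that changes the size is treated as a write.
    needed = READ if op == "READ" else WRITE
    return bool(mode & needed)
```

For example, `is_permitted("WRITE", access_mode("lock", open_mode=READ))` is false here, because the lock stateid takes its mode from a read-only open.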
| skipping to change at page 176, line 35 | skipping to change at page 176, line 35 | |||
| ranges that happen to be adjacent into a single request since the | ranges that happen to be adjacent into a single request since the | |||
| server may not support sub-range requests and for reasons related to | server may not support sub-range requests and for reasons related to | |||
| the recovery of file locking state in the event of server failure. | the recovery of file locking state in the event of server failure. | |||
| As discussed in Section 8.4.2, the server may employ certain | As discussed in Section 8.4.2, the server may employ certain | |||
| optimizations during recovery that work effectively only when the | optimizations during recovery that work effectively only when the | |||
| client's behavior during lock recovery is similar to the client's | client's behavior during lock recovery is similar to the client's | |||
| locking behavior prior to server failure. | locking behavior prior to server failure. | |||
| 9.3. Upgrading and Downgrading Locks | 9.3. Upgrading and Downgrading Locks | |||
| If a client has a write lock on a record, it can request an atomic | If a client has a write lock on a byte-range, it can request an | |||
| downgrade of the lock to a read lock via the LOCK request, by setting | atomic downgrade of the lock to a read lock via the LOCK request, by | |||
| the type to READ_LT. If the server supports atomic downgrade, the | setting the type to READ_LT. If the server supports atomic | |||
| request will succeed. If not, it will return NFS4ERR_LOCK_NOTSUPP. | downgrade, the request will succeed. If not, it will return | |||
| The client should be prepared to receive this error, and if | NFS4ERR_LOCK_NOTSUPP. The client should be prepared to receive this | |||
| appropriate, report the error to the requesting application. | error, and if appropriate, report the error to the requesting | |||
| | application. | |||
| If a client has a read lock on a record, it can request an atomic | If a client has a read lock on a byte-range, it can request an atomic | |||
| upgrade of the lock to a write lock via the LOCK request by setting | upgrade of the lock to a write lock via the LOCK request by setting | |||
| the type to WRITE_LT or WRITEW_LT. If the server does not support | the type to WRITE_LT or WRITEW_LT. If the server does not support | |||
| atomic upgrade, it will return NFS4ERR_LOCK_NOTSUPP. If the upgrade | atomic upgrade, it will return NFS4ERR_LOCK_NOTSUPP. If the upgrade | |||
| can be achieved without an existing conflict, the request will | can be achieved without an existing conflict, the request will | |||
| succeed. Otherwise, the server will return either NFS4ERR_DENIED or | succeed. Otherwise, the server will return either NFS4ERR_DENIED or | |||
| NFS4ERR_DEADLOCK. The error NFS4ERR_DEADLOCK is returned if the | NFS4ERR_DEADLOCK. The error NFS4ERR_DEADLOCK is returned if the | |||
| client sent the LOCK request with the type set to WRITEW_LT and the | client sent the LOCK request with the type set to WRITEW_LT and the | |||
| server has detected a deadlock. The client should be prepared to | server has detected a deadlock. The client should be prepared to | |||
| receive such errors and if appropriate, report the error to the | receive such errors and if appropriate, report the error to the | |||
| requesting application. | requesting application. | |||
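The possible outcomes of an upgrade or downgrade request, as described in this section, can be summarised in a small decision function. This is a hedged sketch: `change_lock_type` and its parameters are hypothetical, and a real server derives "conflict" and "deadlock" from its lock tables rather than receiving them as flags.

```python
def change_lock_type(held, requested, supports_atomic,
                     conflict=False, deadlock_detected=False):
    """Return the status a server might give for an upgrade/downgrade.

    held:      "read" or "write", the lock currently held on the range
    requested: "READ_LT", "WRITE_LT" or "WRITEW_LT"
    """
    if not supports_atomic:
        # Server cannot change the lock type atomically.
        return "NFS4ERR_LOCK_NOTSUPP"
    if held == "write" and requested == "READ_LT":
        # Atomic downgrade succeeds when supported.
        return "NFS4_OK"
    if held == "read" and requested in ("WRITE_LT", "WRITEW_LT"):
        if not conflict:
            return "NFS4_OK"
        # NFS4ERR_DEADLOCK applies only to the blocking (WRITEW_LT) form.
        if requested == "WRITEW_LT" and deadlock_detected:
            return "NFS4ERR_DEADLOCK"
        return "NFS4ERR_DENIED"
    raise ValueError("not an upgrade or downgrade request")
```

A client built against this model would need to handle all three error values and, where appropriate, pass them on to the requesting application.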
| skipping to change at page 179, line 12 | skipping to change at page 179, line 12 | |||
| lock, since the greater latency that might occur is likely to be | lock, since the greater latency that might occur is likely to be | |||
| eliminated given a prompt callback, but it still needs to poll. When | eliminated given a prompt callback, but it still needs to poll. When | |||
| it receives a CB_NOTIFY_LOCK it should promptly try to obtain the | it receives a CB_NOTIFY_LOCK it should promptly try to obtain the | |||
| lock, but it should be aware that other clients may be polling and the | lock, but it should be aware that other clients may be polling and the | |||
| server is under no obligation to reserve the lock for that particular | server is under no obligation to reserve the lock for that particular | |||
| client. | client. | |||
| 9.7. Share Reservations | 9.7. Share Reservations | |||
| A share reservation is a mechanism to control access to a file. It | A share reservation is a mechanism to control access to a file. It | |||
| is a separate and independent mechanism from record locking. When a | is a separate and independent mechanism from byte-range locking. | |||
| client opens a file, it sends an OPEN operation to the server | When a client opens a file, it sends an OPEN operation to the server | |||
| specifying the type of access required (READ, WRITE, or BOTH) and the | specifying the type of access required (READ, WRITE, or BOTH) and the | |||
| type of access to deny others (deny NONE, READ, WRITE, or BOTH). If | type of access to deny others (deny NONE, READ, WRITE, or BOTH). If | |||
| the OPEN fails the client will fail the application's open request. | the OPEN fails the client will fail the application's open request. | |||
| Pseudo-code definition of the semantics: | Pseudo-code definition of the semantics: | |||
| if (request.access == 0) { | if (request.access == 0) { | |||
| return (NFS4ERR_INVAL) | return (NFS4ERR_INVAL) | |||
| } else { | } else { | |||
| if ((request.access & file_state.deny) || | if ((request.access & file_state.deny) || | |||
| skipping to change at page 180, line 14 | skipping to change at page 180, line 14 | |||
| still obtain the filehandle for the regular file with the OPEN | still obtain the filehandle for the regular file with the OPEN | |||
| operation so the appropriate share semantics can be applied. For | operation so the appropriate share semantics can be applied. For | |||
| clients that do not have a deny mode built into their open | clients that do not have a deny mode built into their open | |||
| programming interfaces, deny equal to NONE should be used. | programming interfaces, deny equal to NONE should be used. | |||
| The OPEN operation with the CREATE flag also subsumes the CREATE | The OPEN operation with the CREATE flag also subsumes the CREATE | |||
| operation for regular files as used in previous versions of the NFS | operation for regular files as used in previous versions of the NFS | |||
| protocol. This allows a create with a share to be done atomically. | protocol. This allows a create with a share to be done atomically. | |||
| The CLOSE operation removes all share reservations held by the open- | The CLOSE operation removes all share reservations held by the open- | |||
| owner on that file. If record locks are held, the client SHOULD | owner on that file. If byte-range locks are held, the client SHOULD | |||
| release all locks before issuing a CLOSE. The server MAY free all | release all locks before issuing a CLOSE. The server MAY free all | |||
| outstanding locks on CLOSE but some servers may not support the CLOSE | outstanding locks on CLOSE but some servers may not support the CLOSE | |||
| of a file that still has record locks held. The server MUST return | of a file that still has byte-range locks held. The server MUST | |||
| failure, NFS4ERR_LOCKS_HELD, if any locks would exist after the | return failure, NFS4ERR_LOCKS_HELD, if any locks would exist after | |||
| CLOSE. | the CLOSE. | |||
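The CLOSE rule above leaves the server two choices when byte-range locks are still held: free them, or fail the CLOSE. A minimal sketch of that choice follows; the `handle_close` helper and its `server_frees_locks_on_close` policy flag are hypothetical names, not part of the protocol.

```python
def handle_close(held_locks, server_frees_locks_on_close):
    """Return (status, locks remaining after the CLOSE attempt)."""
    if held_locks and not server_frees_locks_on_close:
        # Locks would survive the CLOSE, so the server must refuse it
        # and the lock state is left untouched.
        return "NFS4ERR_LOCKS_HELD", list(held_locks)
    # Either no locks were held, or this server chose to free them.
    return "NFS4_OK", []
```

Since a client cannot know which policy a given server follows, releasing every lock with LOCKU before sending CLOSE is the portable behaviour.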
| The LOOKUP operation will return a filehandle without establishing | The LOOKUP operation will return a filehandle without establishing | |||
| any lock state on the server. Without a valid stateid, the server | any lock state on the server. Without a valid stateid, the server | |||
| will assume the client has the least access. For example, a file | will assume the client has the least access. For example, a file | |||
| opened with deny READ/WRITE using a filehandle obtained through | opened with deny READ/WRITE using a filehandle obtained through | |||
| LOOKUP could only be read using the special read bypass stateid and | LOOKUP could only be read using the special read bypass stateid and | |||
| could not be written at all because it would not have a valid stateid | could not be written at all because it would not have a valid stateid | |||
| and the special anonymous stateid would not be allowed access. | and the special anonymous stateid would not be allowed access. | |||
| 9.9. Open Upgrade and Downgrade | 9.9. Open Upgrade and Downgrade | |||
| skipping to change at page 186, line 33 | skipping to change at page 186, line 33 | |||
| There are three situations that delegation recovery must deal with: | There are three situations that delegation recovery must deal with: | |||
| o Client restart | o Client restart | |||
| o Server restart | o Server restart | |||
| o Network partition (full or backchannel-only) | o Network partition (full or backchannel-only) | |||
| In the event the client restarts, the failure to renew the lease will | In the event the client restarts, the failure to renew the lease will | |||
| result in the revocation of record locks and share reservations. | result in the revocation of byte-range locks and share reservations. | |||
| Delegations, however, may be treated a bit differently. | Delegations, however, may be treated a bit differently. | |||
| There will be situations in which delegations will need to be | There will be situations in which delegations will need to be | |||
| reestablished after a client restarts. The reason for this is the | reestablished after a client restarts. The reason for this is the | |||
| client may have file data stored locally and this data was associated | client may have file data stored locally and this data was associated | |||
| with the previously held delegations. The client will need to | with the previously held delegations. The client will need to | |||
| reestablish the appropriate file state on the server. | reestablish the appropriate file state on the server. | |||
| To allow for this type of client recovery, the server MAY extend the | To allow for this type of client recovery, the server MAY extend the | |||
| period for delegation recovery beyond the typical lease expiration | period for delegation recovery beyond the typical lease expiration | |||
| skipping to change at page 187, line 18 | skipping to change at page 187, line 18 | |||
| A server MAY support claim types of CLAIM_DELEGATE_PREV and | A server MAY support claim types of CLAIM_DELEGATE_PREV and | |||
| CLAIM_DELEG_PREV_FH, and if it does, it MUST NOT remove delegations | CLAIM_DELEG_PREV_FH, and if it does, it MUST NOT remove delegations | |||
| upon a CREATE_SESSION that confirms a client ID created by | upon a CREATE_SESSION that confirms a client ID created by | |||
| EXCHANGE_ID, and instead MUST, for a period of time no less than that | EXCHANGE_ID, and instead MUST, for a period of time no less than that | |||
| of the value of the lease_time attribute, maintain the client's | of the value of the lease_time attribute, maintain the client's | |||
| delegations to allow time for the client to send CLAIM_DELEGATE_PREV | delegations to allow time for the client to send CLAIM_DELEGATE_PREV | |||
| requests. The server that supports CLAIM_DELEGATE_PREV and/or | requests. The server that supports CLAIM_DELEGATE_PREV and/or | |||
| CLAIM_DELEG_PREV_FH MUST support the DELEGPURGE operation. | CLAIM_DELEG_PREV_FH MUST support the DELEGPURGE operation. | |||
| When the server restarts, delegations are reclaimed (using the OPEN | When the server restarts, delegations are reclaimed (using the OPEN | |||
| operation with CLAIM_PREVIOUS) in a similar fashion to record locks | operation with CLAIM_PREVIOUS) in a similar fashion to byte-range | |||
| and share reservations. However, there is a slight semantic | locks and share reservations. However, there is a slight semantic | |||
| difference. In the normal case if the server decides that a | difference. In the normal case if the server decides that a | |||
| delegation should not be granted, it performs the requested action | delegation should not be granted, it performs the requested action | |||
| (e.g. OPEN) without granting any delegation. For reclaim, the | (e.g. OPEN) without granting any delegation. For reclaim, the | |||
| server grants the delegation but a special designation is applied so | server grants the delegation but a special designation is applied so | |||
| that the client treats the delegation as having been granted but | that the client treats the delegation as having been granted but | |||
| recalled by the server. Because of this, the client has the duty to | recalled by the server. Because of this, the client has the duty to | |||
| write all modified state to the server and then return the | write all modified state to the server and then return the | |||
| delegation. This process of handling delegation reclaim reconciles | delegation. This process of handling delegation reclaim reconciles | |||
| three principles of the NFSv4.1 protocol: | three principles of the NFSv4.1 protocol: | |||
| skipping to change at page 188, line 41 | skipping to change at page 188, line 41 | |||
| notified about the revocation. | notified about the revocation. | |||
| 10.3. Data Caching | 10.3. Data Caching | |||
| When applications share access to a set of files, they need to be | When applications share access to a set of files, they need to be | |||
| implemented so as to take account of the possibility of conflicting | implemented so as to take account of the possibility of conflicting | |||
| access by another application. This is true whether the applications | access by another application. This is true whether the applications | |||
| in question execute on different clients or reside on the same | in question execute on different clients or reside on the same | |||
| client. | client. | |||
| Share reservations and record locks are the facilities the NFSv4.1 | Share reservations and byte-range locks are the facilities the | |||
| protocol provides to allow applications to coordinate access by using | NFSv4.1 protocol provides to allow applications to coordinate access | |||
| mutual exclusion facilities. The NFSv4.1 protocol's data caching | by using mutual exclusion facilities. The NFSv4.1 protocol's data | |||
| must be implemented such that it does not invalidate the assumptions | caching must be implemented such that it does not invalidate the | |||
| that those using these facilities depend upon. | assumptions that those using these facilities depend upon. | |||
| 10.3.1. Data Caching and OPENs | 10.3.1. Data Caching and OPENs | |||
| In order to avoid invalidating the sharing assumptions that | In order to avoid invalidating the sharing assumptions that | |||
| applications rely on, NFSv4.1 clients should not provide cached data | applications rely on, NFSv4.1 clients should not provide cached data | |||
| to applications or modify it on behalf of an application when it | to applications or modify it on behalf of an application when it | |||
| would not be valid to obtain or modify that same data via a READ or | would not be valid to obtain or modify that same data via a READ or | |||
| WRITE operation. | WRITE operation. | |||
| Furthermore, in the absence of open delegation (see Section 10.4), | Furthermore, in the absence of open delegation (see Section 10.4), | |||
| skipping to change at page 191, line 9 | skipping to change at page 191, line 9 | |||
| The data that is written to the server as a prerequisite to the | The data that is written to the server as a prerequisite to the | |||
| unlocking of a region must be written, at the server, to stable | unlocking of a region must be written, at the server, to stable | |||
| storage. The client may accomplish this either with synchronous | storage. The client may accomplish this either with synchronous | |||
| writes or by following asynchronous writes with a COMMIT operation. | writes or by following asynchronous writes with a COMMIT operation. | |||
| This is required because retransmission of the modified data after a | This is required because retransmission of the modified data after a | |||
| server restart might conflict with a lock held by another client. | server restart might conflict with a lock held by another client. | |||
| A client implementation may choose to accommodate applications which | A client implementation may choose to accommodate applications which | |||
| use record locking in non-standard ways (e.g. using a record lock as | use byte-range locking in non-standard ways (e.g. using a byte-range | |||
| a global semaphore) by flushing to the server more data upon a LOCKU | lock as a global semaphore) by flushing to the server more data upon | |||
| than is covered by the locked range. This may include modified data | a LOCKU than is covered by the locked range. This may include | |||
| within files other than the one for which the unlocks are being done. | modified data within files other than the one for which the unlocks | |||
| In such cases, the client must not interfere with applications whose | are being done. In such cases, the client must not interfere with | |||
| READs and WRITEs are being done only within the bounds of record | applications whose READs and WRITEs are being done only within the | |||
| locks which the application holds. For example, an application locks | bounds of byte-range locks which the application holds. For example, | |||
| a single byte of a file and proceeds to write that single byte. A | an application locks a single byte of a file and proceeds to write | |||
| client that chose to handle a LOCKU by flushing all modified data to | that single byte. A client that chose to handle a LOCKU by flushing | |||
| the server could validly write that single byte in response to an | all modified data to the server could validly write that single byte | |||
| unrelated unlock. However, it would not be valid to write the entire | in response to an unrelated unlock. However, it would not be valid | |||
| block in which that single written byte was located since it includes | to write the entire block in which that single written byte was | |||
| an area that is not locked and might be locked by another client. | located since it includes an area that is not locked and might be | |||
| Client implementations can avoid this problem by dividing files with | locked by another client. Client implementations can avoid this | |||
| modified data into those for which all modifications are done to | problem by dividing files with modified data into those for which all | |||
| areas covered by an appropriate record lock and those for which there | modifications are done to areas covered by an appropriate byte-range | |||
| are modifications not covered by a record lock. Any writes done for | lock and those for which there are modifications not covered by a | |||
| the former class of files must not include areas not locked and thus | byte-range lock. Any writes done for the former class of files must | |||
| not modified on the client. | not include areas not locked and thus not modified on the client. | |||
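One way a client could avoid writing unlocked bytes when flushing a cached block is to clip the block to the byte-range locks it holds. The sketch below assumes half-open `(start, end)` ranges and a hypothetical helper name, `writable_extents`; it illustrates the constraint and is not a prescribed client design.

```python
def writable_extents(block_start, block_end, locked_ranges):
    """Clip a cached block to the parts covered by held byte-range locks.

    All ranges are half-open (start, end) byte offsets. The result is a
    sorted list of merged extents that are safe to send in WRITEs.
    """
    extents = []
    for start, end in sorted(locked_ranges):
        # Intersect the lock with the block being flushed.
        lo, hi = max(start, block_start), min(end, block_end)
        if lo >= hi:
            continue  # this lock does not touch the block
        if extents and lo <= extents[-1][1]:
            # Overlapping or adjacent to the previous extent: merge them.
            extents[-1] = (extents[-1][0], max(extents[-1][1], hi))
        else:
            extents.append((lo, hi))
    return extents
```

In the single-byte example above, a lock on byte 100 of a 4096-byte block gives `writable_extents(0, 4096, [(100, 101)])`, which evaluates to `[(100, 101)]`, so only that byte is written.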
| 10.3.3. Data Caching and Mandatory File Locking | 10.3.3. Data Caching and Mandatory File Locking | |||
| Client side data caching needs to respect mandatory file locking when | Client side data caching needs to respect mandatory file locking when | |||
| it is in effect. The presence of mandatory file locking for a given | it is in effect. The presence of mandatory file locking for a given | |||
| file is indicated when the client gets back NFS4ERR_LOCKED from a | file is indicated when the client gets back NFS4ERR_LOCKED from a | |||
| READ or WRITE on a file it has an appropriate share reservation for. | READ or WRITE on a file it has an appropriate share reservation for. | |||
| When mandatory locking is in effect for a file, the client must check | When mandatory locking is in effect for a file, the client must check | |||
| for an appropriate file lock for data being read or written. If a | for an appropriate file lock for data being read or written. If a | |||
| lock exists for the range being read or written, the client may | lock exists for the range being read or written, the client may | |||
| skipping to change at page 209, line 8 | skipping to change at page 209, line 8 | |||
| virtual memory management systems on each client only know a page is | virtual memory management systems on each client only know a page is | |||
| modified, not that a subset of the page corresponding to the | modified, not that a subset of the page corresponding to the | |||
| respective lock regions has been modified. So it is not possible for | respective lock regions has been modified. So it is not possible for | |||
| each client to do the right thing, which is to only write to the | each client to do the right thing, which is to only write to the | |||
| server that portion of the page that is locked. For example, if | server that portion of the page that is locked. For example, if | |||
| client A simply writes out the page, and then client B writes out the | client A simply writes out the page, and then client B writes out the | |||
| page, client A's data is lost. | page, client A's data is lost. | |||
| Moreover, if mandatory locking is enabled on the file, then we have a | Moreover, if mandatory locking is enabled on the file, then we have a | |||
| different problem. When clients A and B execute the STORE | different problem. When clients A and B execute the STORE | |||
| instructions, the resulting page faults require a record lock on the | instructions, the resulting page faults require a byte-range lock on | |||
| entire page. Each client then tries to extend their locked range to | the entire page. Each client then tries to extend their locked range | |||
| the entire page, which results in a deadlock. Communicating the | to the entire page, which results in a deadlock. Communicating the | |||
| NFS4ERR_DEADLOCK error to a STORE instruction is difficult at best. | NFS4ERR_DEADLOCK error to a STORE instruction is difficult at best. | |||
| If a client is locking the entire memory mapped file, there is no | If a client is locking the entire memory mapped file, there is no | |||
| problem with advisory or mandatory record locking, at least until the | problem with advisory or mandatory byte-range locking, at least until | |||
| client unlocks a region in the middle of the file. | the client unlocks a region in the middle of the file. | |||
| Given the above issues the following are permitted: | Given the above issues the following are permitted: | |||
| o Clients and servers MAY deny memory mapping a file they know there | o Clients and servers MAY deny memory mapping a file they know there | |||
| are record locks for. | are byte-range locks for. | |||
| o Clients and servers MAY deny a record lock on a file they know is | o Clients and servers MAY deny a byte-range lock on a file they know | |||
| memory mapped. | is memory mapped. | |||
| o A client MAY deny memory mapping a file that it knows requires | o A client MAY deny memory mapping a file that it knows requires | |||
| mandatory locking for I/O. If mandatory locking is enabled after | mandatory locking for I/O. If mandatory locking is enabled after | |||
| the file is opened and mapped, the client MAY deny the application | the file is opened and mapped, the client MAY deny the application | |||
| further access to its mapped file. | further access to its mapped file. | |||
| 10.8. Name and Directory Caching without Directory Delegations | 10.8. Name and Directory Caching without Directory Delegations | |||
| The NFSv4.1 directory delegation facility (described in Section 10.9 | The NFSv4.1 directory delegation facility (described in Section 10.9 | |||
| below) is OPTIONAL for servers to implement. Even where it is | below) is OPTIONAL for servers to implement. Even where it is | |||
| skipping to change at page 264, line 16 | skipping to change at page 264, line 16 | |||
| pNFS takes the form of OPTIONAL operations that manage protocol | pNFS takes the form of OPTIONAL operations that manage protocol | |||
| objects called 'layouts' which contain data location information. | objects called 'layouts' which contain data location information. | |||
| The layout is managed in a similar fashion as NFSv4.1 data | The layout is managed in a similar fashion as NFSv4.1 data | |||
| delegations are managed. For example, the layout is leased, | delegations are managed. For example, the layout is leased, | |||
| recallable and revocable. However, layouts are distinct abstractions | recallable and revocable. However, layouts are distinct abstractions | |||
| and are manipulated with new operations. When a client holds a | and are manipulated with new operations. When a client holds a | |||
| layout, it is granted the ability to access the data location | layout, it is granted the ability to access the data location | |||
| directly using the location information specified in the layout. | directly using the location information specified in the layout. | |||
| There are interactions between layouts and other NFSv4.1 abstractions | There are interactions between layouts and other NFSv4.1 abstractions | |||
| such as data delegations and record locking. Delegation issues are | such as data delegations and byte-range locking. Delegation issues | |||
| discussed in Section 12.5.5. Byte-range locking issues are discussed | are discussed in Section 12.5.5. Byte-range locking issues are | |||
| in Section 12.2.9 and Section 12.5.1. | discussed in Section 12.2.9 and Section 12.5.1. | |||
| The NFSv4.1 pNFS feature has been structured to allow for a variety | The NFSv4.1 pNFS feature has been structured to allow for a variety | |||
| of storage protocols to be defined and used. As noted in the diagram | of storage protocols to be defined and used. As noted in the diagram | |||
| above, the storage protocol is the method used by the client to store | above, the storage protocol is the method used by the client to store | |||
| and retrieve data directly from the storage devices. The NFSv4.1 | and retrieve data directly from the storage devices. The NFSv4.1 | |||
| protocol directly defines one storage protocol, the NFSv4.1 storage | protocol directly defines one storage protocol, the NFSv4.1 storage | |||
| type, and its use. | type, and its use. | |||
| Examples of other storage protocols that could be used with NFSv4.1's | Examples of other storage protocols that could be used with NFSv4.1's | |||
| pNFS are: | pNFS are: | |||
| skipping to change at page 268, line 8 | skipping to change at page 268, line 8 | |||
| and performs a WRITE to a storage device, the storage device is | and performs a WRITE to a storage device, the storage device is | |||
| allowed to reject that WRITE. | allowed to reject that WRITE. | |||
| The iomode does not conflict with OPEN share modes or lock requests; | The iomode does not conflict with OPEN share modes or lock requests; | |||
| open mode and lock conflicts are enforced as they are without the use | open mode and lock conflicts are enforced as they are without the use | |||
| of pNFS, and are logically separate from the pNFS layout level. As | of pNFS, and are logically separate from the pNFS layout level. As | |||
| well, open modes and locks are the preferred method for restricting | well, open modes and locks are the preferred method for restricting | |||
| user access to data files. For example, an OPEN of read, deny-write | user access to data files. For example, an OPEN of read, deny-write | |||
| does not conflict with a LAYOUTGET containing an iomode of READ/WRITE | does not conflict with a LAYOUTGET containing an iomode of READ/WRITE | |||
| performed by another client. Applications that depend on writing | performed by another client. Applications that depend on writing | |||
| into the same file concurrently may use record locking to serialize | into the same file concurrently may use byte-range locking to | |||
| their accesses. | serialize their accesses. | |||
| 12.2.10. Device IDs | 12.2.10. Device IDs | |||
| The device ID (data type deviceid4, see Section 3.3.14) names a group | The device ID (data type deviceid4, see Section 3.3.14) names a group | |||
| of storage devices. The scope of a device ID is per pair of client | of storage devices. The scope of a device ID is per pair of client | |||
| ID and layout type. In practice, a significant amount of information | ID and layout type. In practice, a significant amount of information | |||
| may be required to fully address a storage device. Rather than | may be required to fully address a storage device. Rather than | |||
| embedding all such information in a layout, layouts embed device IDs. | embedding all such information in a layout, layouts embed device IDs. | |||
| The NFSv4.1 operation GETDEVICEINFO (Section 18.40) is used to | The NFSv4.1 operation GETDEVICEINFO (Section 18.40) is used to | |||
| retrieve the complete address information (including all device | retrieve the complete address information (including all device | |||
| skipping to change at page 290, line 27 | skipping to change at page 290, line 27 | |||
| As mentioned previously, some operations, namely WRITE and LAYOUTGET | As mentioned previously, some operations, namely WRITE and LAYOUTGET | |||
| may be rejected during the metadata server's grace period, because to | may be rejected during the metadata server's grace period, because to | |||
| provide simple, valid handling during the grace period, the easiest | provide simple, valid handling during the grace period, the easiest | |||
| method is to simply reject all non-reclaim pNFS requests and WRITE | method is to simply reject all non-reclaim pNFS requests and WRITE | |||
| operations by returning the NFS4ERR_GRACE error. However, depending | operations by returning the NFS4ERR_GRACE error. However, depending | |||
| on the storage protocol (which is specific to the layout type) and | on the storage protocol (which is specific to the layout type) and | |||
| metadata server implementation, the metadata server may be able to | metadata server implementation, the metadata server may be able to | |||
| determine that a particular request is safe. For example, a metadata | determine that a particular request is safe. For example, a metadata | |||
| server may save provisional allocation mappings for each file to | server may save provisional allocation mappings for each file to | |||
| stable storage, as well as information about potentially conflicting | stable storage, as well as information about potentially conflicting | |||
| OPEN share modes and mandatory record locks that might have been in | OPEN share modes and mandatory byte-range locks that might have been | |||
| effect at the time of restart, and use this information during the | in effect at the time of restart, and use this information during the | |||
| recovery grace period to determine that a WRITE request is safe. | recovery grace period to determine that a WRITE request is safe. | |||
| 12.7.6. Storage Device Recovery | 12.7.6. Storage Device Recovery | |||
| Recovery from storage device restart is mostly dependent upon the | Recovery from storage device restart is mostly dependent upon the | |||
| layout type in use. However, there are a few general techniques a | layout type in use. However, there are a few general techniques a | |||
| client can use if it discovers a storage device has crashed while | client can use if it discovers a storage device has crashed while | |||
| holding modified, uncommitted data that was asynchronously written. | holding modified, uncommitted data that was asynchronously written. | |||
| First and foremost, it is important to realize that the client is the | First and foremost, it is important to realize that the client is the | |||
| only one which has the information necessary to recover non-committed | only one which has the information necessary to recover non-committed | |||
| skipping to change at page 313, line 27 | skipping to change at page 313, line 27 | |||
| the data servers, even though the details of the control protocol may | the data servers, even though the details of the control protocol may | |||
| avoid actual transfer of the state under certain circumstances. | avoid actual transfer of the state under certain circumstances. | |||
| On the other hand, since advisory lock state is not used for checking | On the other hand, since advisory lock state is not used for checking | |||
| I/O accesses at the data servers, there is no semantic reason for | I/O accesses at the data servers, there is no semantic reason for | |||
| propagating advisory lock state to the data servers. Since updates | propagating advisory lock state to the data servers. Since updates | |||
| to advisory locks neither confer nor remove privileges, these changes | to advisory locks neither confer nor remove privileges, these changes | |||
| need not be propagated immediately, and may not need to be propagated | need not be propagated immediately, and may not need to be propagated | |||
| promptly. The updates to advisory locks need only be propagated when | promptly. The updates to advisory locks need only be propagated when | |||
| the data server needs to resolve a question about a stateid. In | the data server needs to resolve a question about a stateid. In | |||
| fact, if record locking is not mandatory (i.e., is advisory) the | fact, if byte-range locking is not mandatory (i.e., is advisory) the | |||
| clients are advised not to use the lock-based stateids for I/O at | clients are advised not to use the lock-based stateids for I/O at | |||
| all. The stateids returned by open are sufficient and eliminate | all. The stateids returned by open are sufficient and eliminate | |||
| overhead for this kind of state propagation. | overhead for this kind of state propagation. | |||
| If a client gets back an NFS4ERR_LOCKED error from a data server, | If a client gets back an NFS4ERR_LOCKED error from a data server, | |||
| this is an indication that mandatory record locking is in force. The | this is an indication that mandatory byte-range locking is in force. | |||
| client recovers from this by getting a record lock that covers the | The client recovers from this by getting a byte-range lock that | |||
| affected range and re-sends the I/O with the stateid of the record | covers the affected range and re-sends the I/O with the stateid of | |||
| lock. | the byte-range lock. | |||
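The recovery sequence in the paragraph above can be sketched as follows. This is a minimal illustration, not an NFS client: `FakeDataServer`, `FakeMetadataServer`, and `write_with_recovery` are hypothetical names, stateids are plain strings, and the sketch assumes the LOCK request is always granted (a real client would also have to handle NFS4ERR_DENIED and other errors).

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_LOCKED = "NFS4ERR_LOCKED"


class FakeDataServer:
    """Stand-in data server that enforces mandatory byte-range locking."""

    def __init__(self):
        self.valid_lock_stateids = set()
        self.stored = {}

    def write(self, stateid, offset, data):
        # Reject I/O that is not covered by a known byte-range lock stateid.
        if stateid not in self.valid_lock_stateids:
            return NFS4ERR_LOCKED
        self.stored[offset] = data
        return NFS4_OK


class FakeMetadataServer:
    """Stand-in metadata server that grants locks and informs the data server."""

    def __init__(self, data_server):
        self.data_server = data_server

    def lock(self, open_stateid, offset, length):
        lock_stateid = f"lock:{open_stateid}:{offset}:{length}"
        self.data_server.valid_lock_stateids.add(lock_stateid)
        return lock_stateid


def write_with_recovery(mds, ds, open_stateid, offset, data):
    """Try the WRITE with the open stateid; on NFS4ERR_LOCKED, lock and re-send."""
    status = ds.write(open_stateid, offset, data)
    if status != NFS4ERR_LOCKED:
        return status, open_stateid
    # Obtain a byte-range lock covering the affected range, then re-send the
    # I/O using the stateid of that lock.
    lock_stateid = mds.lock(open_stateid, offset, len(data))
    return ds.write(lock_stateid, offset, data), lock_stateid
```

The first attempt uses the open stateid because, under advisory locking, that stateid is sufficient and avoids the lock state propagation overhead mentioned earlier.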
| 13.9.2.2. Open and Deny Mode Validation | 13.9.2.2. Open and Deny Mode Validation | |||
| Open and deny mode validation MUST be performed against the open and | Open and deny mode validation MUST be performed against the open and | |||
| deny mode(s) held by the data servers. When access is reduced or a | deny mode(s) held by the data servers. When access is reduced or a | |||
| deny mode made more restrictive (because of CLOSE or DOWNGRADE) the | deny mode made more restrictive (because of CLOSE or DOWNGRADE) the | |||
| data server MUST prevent any I/Os that would be denied if performed | data server MUST prevent any I/Os that would be denied if performed | |||
| on the metadata server. When access is expanded, the data server | on the metadata server. When access is expanded, the data server | |||
| MUST make sure that no requests are subsequently rejected because of | MUST make sure that no requests are subsequently rejected because of | |||
| open or deny issues that no longer apply, given the previous | open or deny issues that no longer apply, given the previous | |||
| skipping to change at page 392, line 7 | skipping to change at page 392, line 7 | |||
| }; | }; | |||
| 18.2.3. DESCRIPTION | 18.2.3. DESCRIPTION | |||
| The CLOSE operation releases share reservations for the regular or | The CLOSE operation releases share reservations for the regular or | |||
| named attribute file as specified by the current filehandle. The | named attribute file as specified by the current filehandle. The | |||
| share reservations and other state information released at the server | share reservations and other state information released at the server | |||
| as a result of this CLOSE is only that associated with the supplied | as a result of this CLOSE is only that associated with the supplied | |||
| stateid. State associated with other OPENs is not affected. | stateid. State associated with other OPENs is not affected. | |||
| If record locks are held, the client SHOULD release all locks before | If byte-range locks are held, the client SHOULD release all locks | |||
| issuing a CLOSE. The server MAY free all outstanding locks on CLOSE | before issuing a CLOSE. The server MAY free all outstanding locks on | |||
| but some servers may not support the CLOSE of a file that still has | CLOSE but some servers may not support the CLOSE of a file that still | |||
| record locks held. The server MUST return failure if any locks would | has byte-range locks held. The server MUST return failure if any | |||
| exist after the CLOSE. | locks would exist after the CLOSE. | |||
| The argument seqid MAY have any value and the server MUST ignore | The argument seqid MAY have any value and the server MUST ignore | |||
| seqid. | seqid. | |||
| On success, the current filehandle retains its value. | On success, the current filehandle retains its value. | |||
| The server MAY require that the principal, security flavor, and, if | The server MAY require that the principal, security flavor, and, if | |||
| applicable, the GSS mechanism, combination that sent the OPEN request | applicable, the GSS mechanism, combination that sent the OPEN request | |||
| also be the one to CLOSE the file. This might not be possible if | also be the one to CLOSE the file. This might not be possible if | |||
| credentials for the principal are no longer available. The server | credentials for the principal are no longer available. The server | |||
| skipping to change at page 406, line 29 | skipping to change at page 406, line 29 | |||
| case NFS4_OK: | case NFS4_OK: | |||
| LOCK4resok resok4; | LOCK4resok resok4; | |||
| case NFS4ERR_DENIED: | case NFS4ERR_DENIED: | |||
| LOCK4denied denied; | LOCK4denied denied; | |||
| default: | default: | |||
| void; | void; | |||
| }; | }; | |||
| 18.10.3. DESCRIPTION | 18.10.3. DESCRIPTION | |||
| The LOCK operation requests a record lock for the byte range | The LOCK operation requests a byte-range lock for the byte range | |||
| specified by the offset and length parameters. The lock type is also | specified by the offset and length parameters. The lock type is also | |||
| specified to be one of the nfs_lock_type4s. If this is a reclaim | specified to be one of the nfs_lock_type4s. If this is a reclaim | |||
| request, the reclaim parameter will be TRUE. | request, the reclaim parameter will be TRUE. | |||
| Bytes in a file may be locked even if those bytes are not currently | Bytes in a file may be locked even if those bytes are not currently | |||
| allocated to the file. To lock the file from a specific offset | allocated to the file. To lock the file from a specific offset | |||
| through the end-of-file (no matter how long the file actually is) use | through the end-of-file (no matter how long the file actually is) use | |||
| a length field with all bits set to 1 (one). If the length is zero, | a length field with all bits set to 1 (one). If the length is zero, | |||
| or if a length which is not all bits set to one is specified, and | or if a length which is not all bits set to one is specified, and | |||
| length when added to the offset exceeds the maximum 64-bit unsigned | length when added to the offset exceeds the maximum 64-bit unsigned | |||
| skipping to change at page 407, line 7 | skipping to change at page 407, line 7 | |||
| client specifies a range that overlaps one or more bytes beyond | client specifies a range that overlaps one or more bytes beyond | |||
| offset 0xFFFFFFFF, but does not end at the maximum 64 bit offset | offset 0xFFFFFFFF, but does not end at the maximum 64 bit offset | |||
| (i.e. 0xFFFFFFFFFFFFFFFF), such a 32-bit server MUST return the error | (i.e. 0xFFFFFFFFFFFFFFFF), such a 32-bit server MUST return the error | |||
| NFS4ERR_BAD_RANGE. | NFS4ERR_BAD_RANGE. | |||
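The range rules for LOCK can be sketched as a small validation function. This is an illustrative sketch only: `check_lock_range` and its `server_is_32bit` flag are hypothetical names, and error codes are returned as strings. The hunk above is cut off before it names the error for a zero or overflowing length; the sketch assumes NFS4ERR_INVAL, the error that the LOCKU description pairs with NFS4ERR_BAD_RANGE for the same circumstances.

```python
UINT64_MAX = 0xFFFFFFFFFFFFFFFF
UINT32_MAX = 0xFFFFFFFF


def check_lock_range(offset: int, length: int, server_is_32bit: bool = False) -> str:
    """Classify a LOCK byte range as accepted or rejected (hypothetical helper)."""
    if length == 0:
        return "NFS4ERR_INVAL"  # assumed error; see the note above

    # A length with all bits set means "from offset through end-of-file".
    to_eof = length == UINT64_MAX

    if not to_eof and offset + length > UINT64_MAX:
        return "NFS4ERR_INVAL"  # assumed error; see the note above

    if server_is_32bit:
        last_byte = UINT64_MAX if to_eof else offset + length - 1
        # A range reaching past 0xFFFFFFFF is acceptable to a 32-bit server
        # only if it ends at the maximum 64-bit offset.
        if last_byte > UINT32_MAX and last_byte != UINT64_MAX:
            return "NFS4ERR_BAD_RANGE"

    return "NFS4_OK"
```

For example, `check_lock_range(100, UINT64_MAX)` describes a lock from offset 100 through end-of-file, which passes on both 32-bit and 64-bit servers.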
| If the server returns NFS4ERR_DENIED, owner, offset, and length of a | If the server returns NFS4ERR_DENIED, owner, offset, and length of a | |||
| conflicting lock are returned. | conflicting lock are returned. | |||
| The locker argument specifies the lock-owner that is associated with | The locker argument specifies the lock-owner that is associated with | |||
| the LOCK request. The locker4 structure is a switched union that | the LOCK request. The locker4 structure is a switched union that | |||
| indicates whether the client has already created record locking state | indicates whether the client has already created byte-range locking | |||
| associated with the current open file and lock-owner. In the case in | state associated with the current open file and lock-owner. In the | |||
| which it has, the argument is just a stateid for the set of locks | case in which it has, the argument is just a stateid for the set of | |||
| associated with that open file and lock-owner, together with a | locks associated with that open file and lock-owner, together with a | |||
| lock_seqid value which MAY be any value and MUST be ignored by the | lock_seqid value which MAY be any value and MUST be ignored by the | |||
| server. In the case where no such state has been established, or the | server. In the case where no such state has been established, or the | |||
| client does not have the stateid available, the argument contains the | client does not have the stateid available, the argument contains the | |||
| stateid of the open file with which this lock is to be associated, | stateid of the open file with which this lock is to be associated, | |||
| together with the lock-owner with which the lock is to be associated. | together with the lock-owner with which the lock is to be associated. | |||
| The open_to_lock_owner case covers the very first lock done by a | The open_to_lock_owner case covers the very first lock done by a | |||
| lock-owner for a given open file and offers a method to use the | lock-owner for a given open file and offers a method to use the | |||
| established state of the open_stateid to transition to the use of a | established state of the open_stateid to transition to the use of a | |||
| lock stateid. | lock stateid. | |||
| skipping to change at page 408, line 15 | skipping to change at page 408, line 15 | |||
| includes multiple locks already granted to that lock-owner, in whole | includes multiple locks already granted to that lock-owner, in whole | |||
| or in part, and the server does not support such locking operations | or in part, and the server does not support such locking operations | |||
| (i.e. does not support POSIX locking semantics), the server will | (i.e. does not support POSIX locking semantics), the server will | |||
| return the error NFS4ERR_LOCK_RANGE. In that case, the client may | return the error NFS4ERR_LOCK_RANGE. In that case, the client may | |||
| return an error, or it may emulate the required operations, using | return an error, or it may emulate the required operations, using | |||
| only LOCK for ranges that do not include any bytes already locked by | only LOCK for ranges that do not include any bytes already locked by | |||
| that lock-owner and LOCKU of locks held by that lock-owner | that lock-owner and LOCKU of locks held by that lock-owner | |||
| (specifying an exactly-matching range and type). Similarly, when the | (specifying an exactly-matching range and type). Similarly, when the | |||
| client makes a lock request that amounts to upgrading (changing from | client makes a lock request that amounts to upgrading (changing from | |||
| a read lock to a write lock) or downgrading (changing from write lock | a read lock to a write lock) or downgrading (changing from write lock | |||
| to a read lock) an existing record lock, and the server does not | to a read lock) an existing byte-range lock, and the server does not | |||
| support such a lock, the server will return NFS4ERR_LOCK_NOTSUPP. | support such a lock, the server will return NFS4ERR_LOCK_NOTSUPP. | |||
| Such operations may not perfectly reflect the required semantics in | Such operations may not perfectly reflect the required semantics in | |||
| the face of conflicting lock requests from other clients. | the face of conflicting lock requests from other clients. | |||
| When a client holds a write delegation, the client holding that | When a client holds a write delegation, the client holding that | |||
| delegation is assured that there are no opens by other clients. | delegation is assured that there are no opens by other clients. | |||
| Thus, there can be no conflicting LOCK requests from such clients. | Thus, there can be no conflicting LOCK requests from such clients. | |||
| Therefore, the client may be handling locking requests locally, | Therefore, the client may be handling locking requests locally, | |||
| without doing LOCK operations on the server. If it does that, it | without doing LOCK operations on the server. If it does that, it | |||
| must be prepared to update the lock status on the server, by doing | must be prepared to update the lock status on the server, by doing | |||
| skipping to change at page 410, line 47 | skipping to change at page 410, line 47 | |||
| union LOCKU4res switch (nfsstat4 status) { | union LOCKU4res switch (nfsstat4 status) { | |||
| case NFS4_OK: | case NFS4_OK: | |||
| stateid4 lock_stateid; | stateid4 lock_stateid; | |||
| default: | default: | |||
| void; | void; | |||
| }; | }; | |||
| 18.12.3. DESCRIPTION | 18.12.3. DESCRIPTION | |||
| The LOCKU operation unlocks the record lock specified by the | The LOCKU operation unlocks the byte-range lock specified by the | |||
| parameters. The client may set the locktype field to any value that | parameters. The client may set the locktype field to any value that | |||
| is legal for the nfs_lock_type4 enumerated type, and the server MUST | is legal for the nfs_lock_type4 enumerated type, and the server MUST | |||
| accept any legal value for locktype. Any legal value for locktype | accept any legal value for locktype. Any legal value for locktype | |||
| has no effect on the success or failure of the LOCKU operation. | has no effect on the success or failure of the LOCKU operation. | |||
| The ranges are specified as for LOCK. The NFS4ERR_INVAL and | The ranges are specified as for LOCK. The NFS4ERR_INVAL and | |||
| NFS4ERR_BAD_RANGE errors are returned under the same circumstances as | NFS4ERR_BAD_RANGE errors are returned under the same circumstances as | |||
| for LOCK. | for LOCK. | |||
| The seqid parameter MAY be any value and the server MUST ignore it. | The seqid parameter MAY be any value and the server MUST ignore it. | |||
| skipping to change at page 440, line 47 | skipping to change at page 440, line 47 | |||
| is returned with a data length set to 0 (zero) and eof is set to | is returned with a data length set to 0 (zero) and eof is set to | |||
| TRUE. The READ is subject to access permissions checking. | TRUE. The READ is subject to access permissions checking. | |||
| If the client specifies a count value of 0 (zero), the READ succeeds | If the client specifies a count value of 0 (zero), the READ succeeds | |||
| and returns 0 (zero) bytes of data again subject to access | and returns 0 (zero) bytes of data again subject to access | |||
| permissions checking. The server may choose to return fewer bytes | permissions checking. The server may choose to return fewer bytes | |||
| than specified by the client. The client needs to check for this | than specified by the client. The client needs to check for this | |||
| condition and handle the condition appropriately. | condition and handle the condition appropriately. | |||
| Except when special stateids are used, the stateid value for a READ | Except when special stateids are used, the stateid value for a READ | |||
| request represents a value returned from a previous record lock or | request represents a value returned from a previous byte-range lock | |||
| share reservation request or the stateid associated with a | or share reservation request or the stateid associated with a | |||
| delegation. The stateid identifies the associated owners if any and | delegation. The stateid identifies the associated owners if any and | |||
| is used by the server to verify that the associated locks are still | is used by the server to verify that the associated locks are still | |||
| valid (e.g. have not been revoked). | valid (e.g. have not been revoked). | |||
| If the read ended at the end-of-file (formally, in a correctly formed | If the read ended at the end-of-file (formally, in a correctly formed | |||
| READ request, if offset + count is equal to the size of the file), or | READ request, if offset + count is equal to the size of the file), or | |||
| the read request extends beyond the size of the file (if offset + | the read request extends beyond the size of the file (if offset + | |||
| count is greater than the size of the file), eof is returned as TRUE; | count is greater than the size of the file), eof is returned as TRUE; | |||
| otherwise it is FALSE. A successful READ of an empty file will | otherwise it is FALSE. A successful READ of an empty file will | |||
| always return eof as TRUE. | always return eof as TRUE. | |||
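The eof rule above reduces to a single comparison. The sketch below assumes the file is held in memory as a bytes object and that the server returns every available byte; `read_reply` is a hypothetical helper, and a real server may return fewer bytes than requested.

```python
def read_reply(file_data: bytes, offset: int, count: int):
    """Return (data, eof) for a READ of `count` bytes starting at `offset`."""
    data = file_data[offset:offset + count]
    # eof is TRUE when the request ends at, or extends beyond, the file size.
    eof = offset + count >= len(file_data)
    return data, eof
```

An empty file has size 0, so any offset and count satisfy the comparison and eof is always TRUE, which matches the last sentence of the paragraph.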
| skipping to change at page 441, line 45 | skipping to change at page 441, line 45 | |||
| what the requesting client believes to be the case. This would | what the requesting client believes to be the case. This would | |||
| reduce the actual amount of data available to the client. It is | reduce the actual amount of data available to the client. It is | |||
| possible that the server may back off the transfer size and reduce | possible that the server may back off the transfer size and reduce | |||
| the read request return. Server resource exhaustion may also occur | the read request return. Server resource exhaustion may also occur | |||
| necessitating a smaller read return. | necessitating a smaller read return. | |||
| If mandatory file locking is in effect for the file, and if the | If mandatory file locking is in effect for the file, and if the | |||
| region corresponding to the data to be read from file is write locked | region corresponding to the data to be read from file is write locked | |||
| by an owner not associated with the stateid, the server will return the | by an owner not associated with the stateid, the server will return the | |||
| NFS4ERR_LOCKED error. The client should try to get the appropriate | NFS4ERR_LOCKED error. The client should try to get the appropriate | |||
| read record lock via the LOCK operation before re-attempting the | read byte-range lock via the LOCK operation before re-attempting the | |||
| READ. When the READ completes, the client should release the record | READ. When the READ completes, the client should release the byte- | |||
| lock via LOCKU. | range lock via LOCKU. | |||
| If another client has a write delegation for the file being read, the | If another client has a write delegation for the file being read, the | |||
| delegation must be recalled, and the operation cannot proceed until | delegation must be recalled, and the operation cannot proceed until | |||
| that delegation is returned or revoked. Except where this happens | that delegation is returned or revoked. Except where this happens | |||
| very quickly, one or more NFS4ERR_DELAY errors will be returned to | very quickly, one or more NFS4ERR_DELAY errors will be returned to | |||
| requests made while the delegation remains outstanding. Normally, | requests made while the delegation remains outstanding. Normally, | |||
| delegations will not be recalled as a result of a READ operation | delegations will not be recalled as a result of a READ operation | |||
| since the recall will occur as a result of an earlier OPEN. However, | since the recall will occur as a result of an earlier OPEN. However, | |||
| since it is possible for a READ to be done with a special stateid, | since it is possible for a READ to be done with a special stateid, | |||
| the server needs to check for this case even though the client should | the server needs to check for this case even though the client should | |||
| skipping to change at page 458, line 26 | skipping to change at page 458, line 26 | |||
| the attributes that follow the bitmap in bit order. | the attributes that follow the bitmap in bit order. | |||
| The stateid argument for SETATTR is used to provide file locking | The stateid argument for SETATTR is used to provide file locking | |||
| context that is necessary for SETATTR requests that set the size | context that is necessary for SETATTR requests that set the size | |||
| attribute. Since setting the size attribute modifies the file's | attribute. Since setting the size attribute modifies the file's | |||
| data, it has the same locking requirements as a corresponding WRITE. | data, it has the same locking requirements as a corresponding WRITE. | |||
| Any SETATTR that sets the size attribute is incompatible with a share | Any SETATTR that sets the size attribute is incompatible with a share | |||
| reservation that specifies DENY_WRITE. The area between the old end- | reservation that specifies DENY_WRITE. The area between the old end- | |||
| of-file and the new end-of-file is considered to be modified just as | of-file and the new end-of-file is considered to be modified just as | |||
| would have been the case had the area in question been specified as | would have been the case had the area in question been specified as | |||
| the target of WRITE, for the purpose of checking conflicts with | the target of WRITE, for the purpose of checking conflicts with byte- | |||
| record locks, for those cases in which a server is implementing | range locks, for those cases in which a server is implementing | |||
| mandatory record locking behavior. A valid stateid should always be | mandatory byte-range locking behavior. A valid stateid should always | |||
| specified. When the file size attribute is not set, the special | be specified. When the file size attribute is not set, the special | |||
| stateid consisting of all bits zero should be passed. | stateid consisting of all bits zero should be passed. | |||
| On either success or failure of the operation, the server will return | On either success or failure of the operation, the server will return | |||
| the attrsset bitmask to represent what (if any) attributes were | the attrsset bitmask to represent what (if any) attributes were | |||
| successfully set. The attrsset in the response is a subset of the | successfully set. The attrsset in the response is a subset of the | |||
| bitmap4 that is part of the obj_attributes in the argument. | bitmap4 that is part of the obj_attributes in the argument. | |||
| On success, the current filehandle retains its value. | On success, the current filehandle retains its value. | |||
| 18.30.4. IMPLEMENTATION | 18.30.4. IMPLEMENTATION | |||
| skipping to change at page 463, line 22 | skipping to change at page 463, line 22 | |||
| UNSTABLE4, the server is free to commit any part of the data and the | UNSTABLE4, the server is free to commit any part of the data and the | |||
| metadata to stable storage, including all or none, before returning a | metadata to stable storage, including all or none, before returning a | |||
| reply to the client. There is no guarantee whether or when any | reply to the client. There is no guarantee whether or when any | |||
| uncommitted data will subsequently be committed to stable storage. | uncommitted data will subsequently be committed to stable storage. | |||
| The only guarantees made by the server are that it will not destroy | The only guarantees made by the server are that it will not destroy | |||
| any data without changing the value of verf and that it will not | any data without changing the value of verf and that it will not | |||
| commit the data and metadata at a level less than that requested by | commit the data and metadata at a level less than that requested by | |||
| the client. | the client. | |||
| Except when special stateids are used, the stateid value for a WRITE | Except when special stateids are used, the stateid value for a WRITE | |||
| request represents a value returned from a previous record lock or | request represents a value returned from a previous byte-range lock | |||
| share reservation request or the stateid associated with a | or share reservation request or the stateid associated with a | |||
| delegation. The stateid identifies the associated owners if any and | delegation. The stateid identifies the associated owners if any and | |||
| is used by the server to verify that the associated locks are still | is used by the server to verify that the associated locks are still | |||
| valid (e.g. have not been revoked). | valid (e.g. have not been revoked). | |||
| Upon successful completion, the following results are returned. The | Upon successful completion, the following results are returned. The | |||
| count result is the number of bytes of data written to the file. The | count result is the number of bytes of data written to the file. The | |||
| server may write fewer bytes than requested. If so, the actual | server may write fewer bytes than requested. If so, the actual | |||
| number of bytes written starting at location, offset, is returned. | number of bytes written starting at location, offset, is returned. | |||
| The server also returns an indication of the level of commitment of | The server also returns an indication of the level of commitment of | |||
| skipping to change at page 465, line 33 | skipping to change at page 465, line 33 | |||
| been committed on the server. | been committed on the server. | |||
| Some implementations may return NFS4ERR_NOSPC instead of | Some implementations may return NFS4ERR_NOSPC instead of | |||
| NFS4ERR_DQUOT when a user's quota is exceeded. In the case that the | NFS4ERR_DQUOT when a user's quota is exceeded. In the case that the | |||
| current filehandle is of type NF4DIR, the server will return | current filehandle is of type NF4DIR, the server will return | |||
| NFS4ERR_ISDIR. If the current file is a symbolic link, the error | NFS4ERR_ISDIR. If the current file is a symbolic link, the error | |||
| NFS4ERR_SYMLINK will be returned. Otherwise, if the current | NFS4ERR_SYMLINK will be returned. Otherwise, if the current | |||
| filehandle does not designate an ordinary file, the server will | filehandle does not designate an ordinary file, the server will | |||
| return NFS4ERR_WRONG_TYPE. | return NFS4ERR_WRONG_TYPE. | |||
| If mandatory file locking is on for the file, and corresponding | If mandatory file locking is on for the file, and corresponding byte- | |||
| record of the data to be written to the file is read or write locked by an | range of the data to be written to the file is read or write locked by an | |||
| owner that is not associated with the stateid, the server will return | owner that is not associated with the stateid, the server will return | |||
| NFS4ERR_LOCKED. If so, the client must check if the owner | NFS4ERR_LOCKED. If so, the client must check if the owner | |||
| corresponding to the stateid used with the WRITE operation has a | corresponding to the stateid used with the WRITE operation has a | |||
| conflicting read lock that overlaps with the region that was to be | conflicting read lock that overlaps with the region that was to be | |||
| written. If the stateid's owner has no conflicting read lock, then | written. If the stateid's owner has no conflicting read lock, then | |||
| the client should try to get the appropriate write record lock via | the client should try to get the appropriate write byte-range lock | |||
| the LOCK operation before re-attempting the WRITE. When the WRITE | via the LOCK operation before re-attempting the WRITE. When the | |||
| completes, the client should release the record lock via LOCKU. | WRITE completes, the client should release the byte-range lock via | |||
| LOCKU. | ||||
| If the stateid's owner had a conflicting read lock, then the client | If the stateid's owner had a conflicting read lock, then the client | |||
| has no choice but to return an error to the application that | has no choice but to return an error to the application that | |||
| attempted the WRITE. The reason is that since the stateid's owner | attempted the WRITE. The reason is that since the stateid's owner | |||
| had a read lock, the server either attempted to temporarily | had a read lock, the server either attempted to temporarily | |||
| effectively upgrade this read lock to a write lock, or the server has | effectively upgrade this read lock to a write lock, or the server has | |||
| no upgrade capability. If the server attempted to upgrade the read | no upgrade capability. If the server attempted to upgrade the read | |||
| lock and failed, it is pointless for the client to re-attempt the | lock and failed, it is pointless for the client to re-attempt the | |||
| upgrade via the LOCK operation, because there might be another client | upgrade via the LOCK operation, because there might be another client | |||
| also trying to upgrade. If two clients are blocked trying to upgrade | also trying to upgrade. If two clients are blocked trying to upgrade | |||
| skipping to change at page 498, line 35 | skipping to change at page 498, line 35 | |||
| 18.38.2. RESULT | 18.38.2. RESULT | |||
| struct FREE_STATEID4res { | struct FREE_STATEID4res { | |||
| nfsstat4 fsr_status; | nfsstat4 fsr_status; | |||
| }; | }; | |||
| 18.38.3. DESCRIPTION | 18.38.3. DESCRIPTION | |||
| The FREE_STATEID operation is used to free a stateid which no longer | The FREE_STATEID operation is used to free a stateid which no longer | |||
| has any associated locks (including opens, record locks, delegations, | has any associated locks (including opens, byte-range locks, | |||
| layouts). This may be because of client unlock operations or because | delegations, layouts). This may be because of client unlock | |||
| of server revocation. If there are valid locks (of any kind) | operations or because of server revocation. If there are valid locks | |||
| associated with the stateid in question, the error NFS4ERR_LOCKS_HELD | (of any kind) associated with the stateid in question, the error | |||
| will be returned, and the associated stateid will not be freed. | NFS4ERR_LOCKS_HELD will be returned, and the associated stateid will | |||
| not be freed. | ||||
| When a stateid is freed which had been associated with revoked locks, | When a stateid is freed which had been associated with revoked locks, | |||
| the client, by doing the FREE_STATEID acknowledges the loss of those | the client, by doing the FREE_STATEID acknowledges the loss of those | |||
| locks. This allows the server, once all such revoked state is | locks. This allows the server, once all such revoked state is | |||
| acknowledged, to allow that client again to reclaim locks, without | acknowledged, to allow that client again to reclaim locks, without | |||
| encountering the edge conditions discussed in Section 8.4.2. | encountering the edge conditions discussed in Section 8.4.2. | |||
| Once a successful FREE_STATEID is done for a given stateid, any | Once a successful FREE_STATEID is done for a given stateid, any | |||
| subsequent use of that stateid will result in an NFS4ERR_BAD_STATEID | subsequent use of that stateid will result in an NFS4ERR_BAD_STATEID | |||
| error. | error. | |||
| skipping to change at page 512, line 27 | skipping to change at page 512, line 27 | |||
| 1 overlaps two or more striping patterns. In which case, | 1 overlaps two or more striping patterns. In which case, | |||
| logr_layout will contain two or more elements, and the sum of the | logr_layout will contain two or more elements, and the sum of the | |||
| lo_length fields of each element MUST be at least loga_minlength | lo_length fields of each element MUST be at least loga_minlength | |||
| unless the first exception also applies. | unless the first exception also applies. | |||
| If this requirement cannot be met, the server MUST NOT return a | If this requirement cannot be met, the server MUST NOT return a | |||
| layout and the error NFS4ERR_BADLAYOUT MUST be returned. | layout and the error NFS4ERR_BADLAYOUT MUST be returned. | |||
| The loga_stateid field specifies a valid stateid. If a layout is not | The loga_stateid field specifies a valid stateid. If a layout is not | |||
| currently held by the client, the loga_stateid field represents a | currently held by the client, the loga_stateid field represents a | |||
| stateid reflecting the correspondingly valid open, record lock, or | stateid reflecting the correspondingly valid open, byte-range lock, | |||
| delegation stateid. Once a layout is held by the client for the | or delegation stateid. Once a layout is held by the client for the | |||
| file, the loga_stateid field is a stateid as returned from a previous | file, the loga_stateid field is a stateid as returned from a previous | |||
| LAYOUTGET or LAYOUTRETURN operation or provided by a CB_LAYOUTRECALL | LAYOUTGET or LAYOUTRETURN operation or provided by a CB_LAYOUTRECALL | |||
| operation (see Section 12.5.3). | operation (see Section 12.5.3). | |||
| The loga_maxcount field specifies the maximum layout size (in bytes) | The loga_maxcount field specifies the maximum layout size (in bytes) | |||
| that the client can handle. If the size of the layout structure | that the client can handle. If the size of the layout structure | |||
| exceeds the size specified by maxcount, the metadata server will | exceeds the size specified by maxcount, the metadata server will | |||
| return the NFS4ERR_TOOSMALL error. | return the NFS4ERR_TOOSMALL error. | |||
| The returned layout is expressed as an array, logr_layout, with each | The returned layout is expressed as an array, logr_layout, with each | |||
| End of changes. 52 change blocks. | ||||
| 180 lines changed or deleted | 183 lines changed or added | |||
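
The SETATTR hunk (page 458) says that when the size attribute is set, the area between the old end-of-file and the new end-of-file is treated as if it had been the target of a WRITE when checking conflicts with byte-range locks. The following is a minimal sketch of that check, not taken from either draft. The names `ByteRangeLock`, `setattr_modified_range`, and `setattr_size_conflicts` are hypothetical, owners are modelled as plain strings, and the special "lock to end of file" length encoding is deliberately left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ByteRangeLock:
    """Hypothetical in-memory view of one byte-range lock held on a file."""
    owner: str    # assumed owner label; real owners are opaque lock-owner values
    offset: int   # first byte covered by the lock
    length: int   # number of bytes covered (no "to EOF" encoding in this sketch)
    write: bool   # True for a write lock, False for a read lock


def setattr_modified_range(old_size: int, new_size: int) -> Optional[Tuple[int, int]]:
    """Return (offset, length) of the area between old and new end-of-file.

    Returns None when the size does not change, since no data is modified.
    The same range applies whether the file is extended or truncated.
    """
    if new_size == old_size:
        return None
    return min(old_size, new_size), abs(new_size - old_size)


def overlaps(lock: ByteRangeLock, offset: int, length: int) -> bool:
    """True if the lock's half-open range intersects [offset, offset + length)."""
    return lock.offset < offset + length and offset < lock.offset + lock.length


def setattr_size_conflicts(locks: List[ByteRangeLock], requester: str,
                           old_size: int, new_size: int) -> List[ByteRangeLock]:
    """List locks held by other owners that a size-setting SETATTR would conflict with.

    As for WRITE, both read and write locks held by a different owner conflict.
    """
    modified = setattr_modified_range(old_size, new_size)
    if modified is None:
        return []
    offset, length = modified
    return [lk for lk in locks
            if lk.owner != requester and overlaps(lk, offset, length)]
```

Under these assumptions, a server enforcing mandatory byte-range locking would reject the size change whenever `setattr_size_conflicts` returns a non-empty list, the same outcome a WRITE to that range would produce.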