This section defines extensions to the FTP specification STD 9, RFC 959, FILE TRANSFER PROTOCOL (FTP) (October 1985). These extensions provide striped data transfer, parallel data transfer, extended data transfer, data buffer size configuration, and data channel authentication.
The following new commands are introduced in this specification: SPAS, SPOR, ERET, ESTO, SBUF, and DCAU.
A new transfer mode (extended block mode) is introduced for parallel and striped data transfers. Also, a set of extension options to RETR is added to control striped data layout and parallelism.
The following new feature names are to be included in the FTP server's response to FEAT if it implements the following sets of functionality
This extension is used to establish a vector of data socket listeners for a server with one or more stripes. This command MUST be used in conjunction with the extended block mode. The response to this command includes a list of host and port addresses the server is listening on.
Due to the nature of the extended block mode protocol, SPAS must be used in conjunction with data transfer commands which receive data (such as STOR, ESTO, or APPE) and cannot be used with commands which send data on the data channels.
spas = "SPAS" <CRLF>
spas-response = "229-Entering Striped Passive Mode" CRLF
                1*(<SP> host-port CRLF)
                "229 End" CRLF
Where the command is correctly parsed, but the server-DTP cannot process the SPAS request, it must return the same error responses as the PASV command.
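As an illustration of the reply format above, the following is a minimal sketch of how a client might extract the listener addresses from a SPAS reply. It assumes that each host-port uses the RFC 959 form h1,h2,h3,h4,p1,p2; the function name parse_spas_response and the sample addresses are hypothetical.

```python
def parse_spas_response(reply: str) -> list[tuple[str, int]]:
    """Return (host, port) pairs from a multi-line 229 SPAS reply."""
    lines = [ln for ln in reply.split("\r\n") if ln]
    if (len(lines) < 3
            or not lines[0].startswith("229-")
            or not lines[-1].startswith("229 ")):
        raise ValueError("not a well-formed 229 SPAS reply")
    listeners = []
    for line in lines[1:-1]:
        # Assumption: host-port is six comma-separated octets as in RFC 959.
        fields = [int(f) for f in line.strip().split(",")]
        if len(fields) != 6 or any(not 0 <= f <= 255 for f in fields):
            raise ValueError(f"bad host-port: {line!r}")
        host = ".".join(str(f) for f in fields[:4])
        port = fields[4] * 256 + fields[5]
        listeners.append((host, port))
    return listeners


# Hypothetical reply from a server with two stripes.
sample_reply = (
    "229-Entering Striped Passive Mode\r\n"
    " 192,168,1,10,195,80\r\n"
    " 192,168,1,11,195,81\r\n"
    "229 End\r\n"
)
listeners = parse_spas_response(sample_reply)
```

Each returned pair identifies one stripe listener to which a sending party may connect.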
This extension is to be used as a complement to the SPAS command to implement striped third-party transfers. This command MUST always be used in conjunction with the extended block mode. The argument to SPOR is a vector of host/TCP listener port pairs to which the server is to connect. This
Due to the nature of the extended block mode protocol, SPOR must be used in conjunction with data transfer commands which send data (such as RETR, ERET, LIST, or NLST) and cannot be used with commands which receive data on the data channels.
SPOR 1*(<SP> <host-port>) <CRLF>
The host-port sequence in the command structure MUST match the host-port replies to a SPAS command.
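For a third-party transfer, the listener list obtained from one server's SPAS reply is passed to the other server in a SPOR command. A minimal sketch follows, again assuming the RFC 959 host-port form; the helper names format_host_port and build_spor are hypothetical.

```python
def format_host_port(host: str, port: int) -> str:
    """Encode an IPv4 host and TCP port as h1,h2,h3,h4,p1,p2 (RFC 959 form, assumed)."""
    octets = host.split(".")
    if len(octets) != 4 or not 0 < port < 65536:
        raise ValueError("expected a dotted IPv4 address and a valid TCP port")
    return ",".join(octets + [str(port >> 8), str(port & 0xFF)])


def build_spor(listeners: list[tuple[str, int]]) -> str:
    """Build a SPOR command line; order is preserved to match the SPAS reply."""
    if not listeners:
        raise ValueError("SPOR requires at least one host-port")
    args = "".join(" " + format_host_port(h, p) for h, p in listeners)
    return "SPOR" + args + "\r\n"


command = build_spor([("192.168.1.10", 50000), ("192.168.1.11", 50001)])
```

Preserving the order of the pairs keeps the host-port sequence identical to the one returned by SPAS.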
The extended retrieve extension is used to request that a retrieve be done with some additional processing on the server. This command is an extensible way of providing server-side data reduction or other modifications to the RETR command. This command is used in place of OPTS to the RETR command to allow server-side processing to be done with a single round trip (one command sent to the server instead of two) for latency-critical applications.
ERET may be used with either the data transports defined in RFC 959, or using extended block mode as defined in this document. Using an ERET creates a new virtual file which will be sent, with its own size and byte range starting at zero. Restart markers generated while processing an ERET are relative to the beginning of this view of the file.
ERET <SP> <retrieve-mode> <SP> <filename>
retrieve-mode ::= P <SP> <offset> <SP> <size>
offset ::= 64 bit integer
size ::= 64 bit integer
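The partial retrieve mode can be illustrated with a short sketch that builds an ERET command line. It assumes that the offset and size are written as decimal digits and that the command is terminated by CRLF; neither detail is stated by the grammar above, and the function name build_eret_partial is hypothetical.

```python
UINT64_MAX = 2**64 - 1


def build_eret_partial(filename: str, offset: int, size: int) -> str:
    """Build an ERET command using retrieve-mode P (offset and size)."""
    for name, value in (("offset", offset), ("size", size)):
        if not 0 <= value <= UINT64_MAX:
            raise ValueError(f"{name} does not fit in 64 bits")
    # Assumption: integers are sent as decimal text, line ends with CRLF.
    return f"ERET P {offset} {size} {filename}\r\n"


# Request 4096 bytes starting at byte 1048576 of a hypothetical file.
command = build_eret_partial("/data/run42.dat", 1048576, 4096)
```

The data returned for such a request is treated as a new virtual file, so block offsets and restart markers start at zero rather than at the requested offset.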
The extended store extension is used to request that a store be done with some additional processing on the server. Arbitrary data processing algorithms may be added by defining additional ESTO store-modes. Similar to ERET, the ESTO command expects data sent to satisfy the request to be sent as if it were a new file, with data block offset 0 being the beginning of the new file.
The format of the ESTO command is
ESTO <SP> <store-mode> <SP> <filename>
store-mode ::= A <SP> <offset>
The store-mode defines the behavior of the extended store. There is one mode defined by this specification, but others may be added later.
This extension allows a client to set the TCP buffer size for subsequent data connections to a specified value. This replaces the server-specific commands SITE RBUFSIZE, SITE RETRBUFSIZE, SITE RBUFSZ, SITE SBUFSIZE, SITE SBUFSZ, and SITE BUFSIZE. Clients may wish to consider supporting these other commands to ensure wider compatibility.
sbuf = SBUF <SP> <buffer-size>
buffer-size ::= <number>
The buffer-size value is the TCP buffer size in bytes. The TCP window size should be set accordingly by the server.
This extension provides a method for specifying the type of authentication to be performed on FTP data channels. This extension may only be used when the control connection was authenticated using RFC 2228 Security extensions.
The format of the DCAU command is
DCAU <SP> <authentication-mode> <CRLF>
authentication-mode ::= <no-authentication> | <authenticate-with-self> | <authenticate-with-subject>
no-authentication ::= N
authenticate-with-self ::= A
authenticate-with-subject ::= S <subject-name>
subject-name ::= string
The default data channel authentication mode is A for FTP sessions which are RFC 2228 authenticated. A client that does not implement data channel authentication must explicitly send a DCAU N message to disable it.
If the security handshake fails, the server should return the error response 432 (Data channel authentication failed).
Clients indicate that they want to use extended block mode by sending the command
MODE <SP> E <CRLF>
on the control channel before a transfer command is sent.
The structure of the extended block header is
                  Extended Block Header

+----------------+-------/-----------+------/------------+
| Descriptor     |    Byte Count     |   Offset Count    |
|         8 bits |      64 bits      |      64 bits      |
+----------------+-------/-----------+------/------------+
The descriptor codes are indicated by bit flags in the descriptor byte. Six codes have been assigned, where each code number is the decimal value of the corresponding bit in the byte.
Code Meaning
128 End of data block is EOR (Legacy)
64 End of data block is EOF
32 Suspected errors in data block
16 Data block is a restart marker
8 End of data block is EOD for a parallel/striped transfer
4 Sender will close the data connection
With this encoding, more than one descriptor coded condition may exist for a particular block. As many bits as necessary may be flagged.
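The header layout and descriptor flags can be sketched with Python's struct and enum modules. This is an illustrative encoding only: the 64-bit fields are assumed to be in network (big-endian) byte order, which the text above does not state, and the names Descriptor, pack_header, and unpack_header are hypothetical.

```python
import struct
from enum import IntFlag


class Descriptor(IntFlag):
    """Descriptor bit flags; values are the decimal codes listed above."""
    EOR = 128      # end of data block is EOR (legacy)
    EOF = 64       # end of data block is EOF
    SUSPECT = 32   # suspected errors in data block
    RESTART = 16   # data block is a restart marker
    EOD = 8        # end of data for a parallel/striped transfer
    CLOSE = 4      # sender will close the data connection


# 8-bit descriptor, 64-bit byte count, 64-bit offset.
# Assumption: big-endian byte order; ">" also disables padding.
HEADER = struct.Struct(">BQQ")


def pack_header(descriptor: Descriptor, byte_count: int, offset: int) -> bytes:
    return HEADER.pack(int(descriptor), byte_count, offset)


def unpack_header(raw: bytes) -> tuple[Descriptor, int, int]:
    code, byte_count, offset = HEADER.unpack(raw)
    return Descriptor(code), byte_count, offset


# A block that ends the stream on this connection and announces its close.
raw = pack_header(Descriptor.EOD | Descriptor.CLOSE, 1024, 4096)
flags, byte_count, offset = unpack_header(raw)
```

Because the codes are independent bits, several conditions can be tested on one block, for example `Descriptor.EOD in flags`.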
Some additional protocol is added to the extended block mode data channels, to properly handle end-of-file detection in the presence of an unknown number of data streams.
+----------------+-------/--------+------/---------------+
| Descriptor     |     unused     |  EOD count expected  |
|         8 bits |    64 bits     |       64 bits        |
+----------------+-------/--------+------/---------------+
EOF Descriptor. The EOF header descriptor has the same definition as the regular data message header described above.
EOD Count Expected. This 64-bit field represents the total number of data connections that will be established with the server receiving the file. This number is used by the receiver to determine that it has received all of the data. When the number of EOD messages received equals the number represented by the "EOD Count Expected" field, the receiver has hit end of file.
Simply waiting for EOD on all open data connections is not sufficient. It is possible that the receiver reads an EOD message on all of its open data connections while an additional data connection is in flight. If the receiver were to assume it had reached end of file, it would fail to receive the data on the in-flight connection.
To handle EOF in the multi-striped server case, a 126 response has been introduced. When receiving data from a striped server, a client makes a control connection to a single host, but several hosts may create several data connections back to the client. Each host can independently decide how many data connections it will use, but only a single EOF message may be sent back to the client; therefore it must be possible to aggregate the total number of data connections used in the transfer across the stripes. The 126 response serves this purpose.
The 126 is an intermediate response to the RETR command. It has the following format:
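The end-of-file rule described above can be sketched as a small state tracker on the receiving side. The class name EofTracker and its method names are hypothetical; the sketch only models the counting, not the data connections themselves.

```python
class EofTracker:
    """Tracks EOD messages against the 'EOD count expected' from the EOF block."""

    def __init__(self) -> None:
        self.eod_seen = 0
        self.eod_expected = None  # unknown until the EOF block arrives

    def on_eod(self) -> None:
        """Call once for each EOD message read on any data connection."""
        self.eod_seen += 1

    def on_eof(self, eod_count_expected: int) -> None:
        """Call when the EOF block carrying the expected EOD count is read."""
        self.eod_expected = eod_count_expected

    @property
    def complete(self) -> bool:
        return self.eod_expected is not None and self.eod_seen == self.eod_expected


tracker = EofTracker()
tracker.on_eod()                    # both currently open connections report EOD...
tracker.on_eod()
done_before_eof = tracker.complete  # False: expected count not yet known
tracker.on_eof(3)                   # sender announces three connections in total
done_with_two = tracker.complete    # False: a third connection is still in flight
tracker.on_eod()
done_with_three = tracker.complete  # True
```

The transfer is treated as finished only once both conditions hold, which covers the case of a late connection that has not yet been accepted.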
"126" <SP> 1*(count of data connections)
Several "Count of data connections" can be in a single reply. They correspond to the stripes returned in the response to the SPAS command.
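A client can aggregate the per-stripe counts from a 126 response to learn how many EOD messages to expect in total. The sketch below assumes that the counts are separated by whitespace, which the format above does not specify; the function name parse_126 is hypothetical.

```python
def parse_126(line: str) -> list[int]:
    """Return the per-stripe data connection counts from a 126 response."""
    code, _, rest = line.strip().partition(" ")
    if code != "126":
        raise ValueError("not a 126 response")
    # Assumption: counts are decimal numbers separated by whitespace.
    counts = [int(token) for token in rest.split()]
    if not counts:
        raise ValueError("126 response carries no counts")
    return counts


# Hypothetical reply from a server with three stripes.
per_stripe = parse_126("126 4 2 3\r\n")
total_connections = sum(per_stripe)
```

The position of each count corresponds to the position of the stripe in the SPAS reply.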
Discussion of a protocol change to enable bidirectional data channels brought up the following problem:
If the client is passive and sending to a multi-stripe server, then the server creates the data connections; since the client did not issue SPAS, it cannot associate the host/port pairs on the data connections with stripes on the server (it does not even know how many stripes there are), so it cannot reliably determine which nodes to send data to. (This becomes even more complex in the third-party transfer case, because the sender may have multiple stripes of data.) The basic problem is that we need to know logical stripe numbers to know where to send the data.
extended-mark-response = "111" <SP> "Range Marker" <SP> <byte-ranges-list>
byte-ranges-list = <byte-range> [ *("," <byte-range>) ]
byte-range = <start-offset> "-" <end-offset>
start-offset ::= <number>
end-offset ::= <number>
The byte ranges in the marker are an incremental set of byte ranges which have been stored to disk by the data server. The complete restart marker is a concatenation of all byte ranges received by the client in 111 responses.
The client MAY combine adjacent ranges received over several range responses into any number of ranges when sending the REST command to the server to restart a transfer.
For example, the client, on receiving the responses:
111 Range Marker 0-29
111 Range Marker 30-89
may send, equivalently,
REST 0-29,30-89
REST 0-89
REST 30-59,0-29,60-89
to restart the transfer after those 90 bytes have been received.
The server MAY indicate that a given range of data has been received in multiple subsequent range markers. The client MUST be able to handle this. For example:
111 Range Marker 30-59
111 Range Marker 0-89
is equivalent to
111 Range Marker 30-59
111 Range Marker 0-29,60-89
Similarly, the client, if it is doing no processing of the restart markers, MAY send redundant information in a restart.
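A client that does process restart markers might accumulate and merge the ranges as sketched below. The sketch assumes that end offsets are inclusive, which is inferred from the example in which 0-29 and 30-89 combine into 0-89; the function names are hypothetical.

```python
def parse_range_marker(line: str) -> list[tuple[int, int]]:
    """Return the (start, end) byte ranges carried by one 111 response."""
    prefix = "111 Range Marker "
    if not line.startswith(prefix):
        raise ValueError("not a 111 Range Marker response")
    ranges = []
    for part in line[len(prefix):].strip().split(","):
        start, end = part.split("-")
        ranges.append((int(start), int(end)))
    return ranges


def merge_ranges(ranges: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Coalesce overlapping or adjacent ranges (inclusive ends assumed)."""
    merged: list[tuple[int, int]] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def rest_command(ranges: list[tuple[int, int]]) -> str:
    """Build a REST command from every range received so far."""
    merged = merge_ranges(ranges)
    return "REST " + ",".join(f"{s}-{e}" for s, e in merged) + "\r\n"


seen = parse_range_marker("111 Range Marker 30-59")
seen += parse_range_marker("111 Range Marker 0-89")  # repeats 30-59
command = rest_command(seen)
```

Merging before sending is optional, since the server accepts redundant ranges, but it keeps the REST argument short for long transfers.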
Should these be allowed as restart markers for stream mode?
extended-perf-response = "112-Perf Marker" CRLF
                         <SP> "Timestamp:" <SP> <timestamp> CRLF
                         <SP> "Stripe Index:" <SP> <stripe-number> CRLF
                         <SP> "Stripe Bytes Transferred:" <SP> <byte count> CRLF
                         <SP> "Total Stripe Count:" <SP> <stripe count> CRLF
                         "112 End" CRLF
timestamp = <number> [ "." <digit> ]
<timestamp> is seconds since the epoch
The performance marker can contain these or any other perf-line facts which provide useful information about the current performance.
All perf-line facts represent an instantaneous state of the transfer at the given timestamp. The meanings of the facts are
A server should send a 'start' marker for each stripe. A server should also send a final perf marker for each stripe. This is a marker with 'Stripe Bytes Transferred' set to the total transfer size for that stripe.
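Because a performance marker may carry additional perf-line facts, a client is probably best served by a parser that keeps every "name: value" line rather than a fixed set. The following is a minimal sketch under that assumption; the function name parse_perf_marker and the sample values are hypothetical.

```python
def parse_perf_marker(reply: str) -> dict[str, str]:
    """Return the perf-line facts of one 112 reply as a name -> value mapping."""
    lines = [ln for ln in reply.split("\r\n") if ln]
    if len(lines) < 2 or lines[0] != "112-Perf Marker" or lines[-1] != "112 End":
        raise ValueError("not a well-formed 112 Perf Marker reply")
    facts = {}
    for line in lines[1:-1]:
        name, sep, value = line.strip().partition(":")
        if not sep:
            raise ValueError(f"bad perf-line: {line!r}")
        facts[name] = value.strip()  # unknown facts are kept as-is
    return facts


sample_marker = (
    "112-Perf Marker\r\n"
    " Timestamp: 1000000000.5\r\n"
    " Stripe Index: 0\r\n"
    " Stripe Bytes Transferred: 1048576\r\n"
    " Total Stripe Count: 2\r\n"
    "112 End\r\n"
)
facts = parse_perf_marker(sample_marker)
stripe_bytes = int(facts["Stripe Bytes Transferred"])
```

Summing the latest 'Stripe Bytes Transferred' value of each stripe index gives the overall progress at roughly the reported timestamps.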
The options described in this section provide a means to convey striping and transfer parallelism information to the server-DTP. For the RETR command, the Client-FTP may specify a parallelism and striping mode it wishes the server-DTP to use. These options are only used by the server-DTP if the retrieve operation is done in extended block mode. These options are implemented as RFC 2389 extensions.
The format of the RETR OPTS is specified by:
retr-opts = "OPTS" <SP> "RETR" [<SP> option-list] CRLF
option-list = [ layout-opts ";" ] [ parallel-opts ";" ]
layout-opts = "StripeLayout=Partitioned"
            | "StripeLayout=Blocked;BlockSize=" <block-size>
parallel-opts = "Parallelism=" <starting-parallelism> ","
                <minimum-parallelism> ","
                <maximum-parallelism>
block-size ::= <number>
starting-parallelism ::= <number>
minimum-parallelism ::= <number>
maximum-parallelism ::= <number>
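The grammar above can be exercised with a small command builder. This is a sketch only; the function name build_retr_opts and its parameter names are hypothetical, and numbers are assumed to be written in decimal.

```python
def build_retr_opts(layout=None, block_size=None, parallelism=None) -> str:
    """Build an OPTS RETR command line.

    layout:      None, "Partitioned", or "Blocked" (Blocked needs block_size)
    parallelism: None or a (starting, minimum, maximum) tuple
    """
    options = ""
    if layout == "Partitioned":
        options += "StripeLayout=Partitioned;"
    elif layout == "Blocked":
        if block_size is None:
            raise ValueError("Blocked layout requires a block size")
        options += f"StripeLayout=Blocked;BlockSize={block_size};"
    elif layout is not None:
        raise ValueError(f"unknown stripe layout: {layout!r}")
    if parallelism is not None:
        starting, minimum, maximum = parallelism
        options += f"Parallelism={starting},{minimum},{maximum};"
    return "OPTS RETR" + (" " + options if options else "") + "\r\n"


# Blocked layout with 1 MiB blocks, starting at 4 streams (between 1 and 8).
command = build_retr_opts(layout="Blocked", block_size=1048576, parallelism=(4, 1, 8))
```

Layout options are emitted before parallelism options, matching the order of option-list.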
[1] Postel, J. and Reynolds, J., "FILE TRANSFER PROTOCOL (FTP)", STD 9, RFC 959, October 1985.
[2] Hethmon, P. and Elz, R., "Feature negotiation mechanism for the File Transfer Protocol", RFC 2389, August 1998.
[3] Horowitz, M. and Lunt, S., "FTP Security Extensions", RFC 2228, October 1997.
[4] Elz, R. and Hethmon, P., "FTP Extensions", IETF Draft, May 2001.
There are several security components in this document which are extensions to the behavior of RFC 2228. This appendix attempts to clarify how these extensions map to the OpenSSL-based implementation of the GSSAPI known as GSI (Grid Security Infrastructure).
A client implementation which communicates with a server which supports the DCAU extension should delegate a limited credential set (using the GSS_C_DELEG_FLAG and GSS_C_GLOBUS_LIMITED_DELEG_PROXY_FLAG flags to gss_init_sec_context()). If delegation is not performed, the client MUST request that DCAU be disabled by requesting DCAU N, or the server will be unable to perform the default of DCAU A as described by this document.
When DCAU mode "A" or "S" is used, a separate security context is established on each data channel. The context is established by performing the GSSAPI handshake with the active-DTP calling gss_init_sec_context() and the passive-DTP calling gss_accept_sec_context(). No delegation need be done on these data channels.
Data channel protection via the PROT command MUST always be used in conjunction with the DCAU A or DCAU S commands. If a PROT level is set, then messages will be wrapped according to RFC 2228 Appendix I using the contexts established on each data channel. Tokens transferred over the data channels when either PROT or DCAU is used are not framed in any way when using GSI. (When implementing this specification with other GSSAPI mechanisms, a 4-byte, big-endian, binary token length should precede all tokens.)
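For GSSAPI mechanisms other than GSI, the length-prefixed framing mentioned above can be sketched as follows. The function names frame_token and read_framed_token are hypothetical, and the token contents are placeholders rather than real GSSAPI output.

```python
import struct


def frame_token(token: bytes) -> bytes:
    """Prefix a token with its length as a 4-byte, big-endian integer."""
    return struct.pack(">I", len(token)) + token


def read_framed_token(buffer: bytes) -> tuple[bytes, bytes]:
    """Split one framed token off the front of buffer; return (token, rest)."""
    if len(buffer) < 4:
        raise ValueError("incomplete length prefix")
    (length,) = struct.unpack(">I", buffer[:4])
    if len(buffer) < 4 + length:
        raise ValueError("incomplete token")
    return buffer[4:4 + length], buffer[4 + length:]


# Two placeholder tokens sent back to back on one data channel.
stream = frame_token(b"first-token") + frame_token(b"second")
first, remainder = read_framed_token(stream)
second, leftover = read_framed_token(remainder)
```

With GSI the tokens are sent unframed, so a framing layer of this kind would apply only to other mechanisms.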
If the DCAU mode or the PROT mode is changed between file transfers when caching data channels in extended block mode, all open data channels must be closed. This is because the GSI implementation does not support changing levels of protection on an existing connection.