FTP (File Transfer Protocol) is an industry-standard protocol and user interface for transferring files between computer systems by means of a series of interactive commands. FTP involves a local client (software you execute to send or receive files) and a remote server (software elsewhere that responds to instructions from your client to accept or deliver files).
FTP features include:
- Use of standard TCP/IP network protocols to move files between machines.
- Support for transfers to or from non-UNIX systems as well as among computers running UNIX.
- Use of IP host addresses as well as domain names to specify transfer targets.
- Interactive login, usually with a password, to begin transfers to or from each remote machine (at LC, some transfers are preauthenticated to omit the password).
LC users with special file-transfer needs (such as for batch-oriented command files, extensive tracking of each transfer, or persistent transfers if network problems arise) may prefer to use the NFT file-transfer tool to move files among LC machines. Users whose primary interest in FTP is as an interface to LC's High Performance Storage System (archival file storage) may want to consult the Using LC Archival Storage manual for helpful comparisons. For an alternative file-transfer tool (that offers special services beyond the basic FTP interface and that transfers to or from storage), see the HSI manual and the HTAR Reference Manual. If you prefer to use FTP through a graphical user interface, run Hopper on any LC production machine and select FTP from Hopper's Connect menu.
This manual explains how to run FTP and shows a typical FTP session. Standard FTP commands, as well as server replies and error codes, are easily searched online and are therefore not in this guide. On many LC production machines, a parallel FTP client (PFTP) is now the default, and parallel transfers occur automatically when they are possible. Users who need to transfer files (to FIS) with their data encrypted can try Secure FTP, a special FTP client with very limited server support.
To run FTP on any LC machine, type
ftp remotehost
where remotehost is either the IP address or the domain name of the computer with which you want to exchange files. The machine on which you run FTP is the "client" or "local" machine, and the machine whose address or name you specify on the execute line is the "server" or "remote" machine (for purposes of describing commands and file transfers below). If run with no remotehost, FTP prompts for input (and you will need to use its interactive open command to specify a target host).
You must log in to your local machine to run FTP, and you must also log in to the specified remote machine at the start of each FTP file-transfer session (when you are prompted for your remote username and password, which might be different from the local ones). FTP expects file transfers to be done by a series of interactive commands, and it does not allow "third-party" transfers (between two remote machines). On LC production machines, Hopper serves as a graphical controller for FTP.
LC uses its hardware/software security firewall to block direct FTP connections from machines outside the llnl.gov domain to LC machines within llnl.gov. Offsite users must either log on to an llnl.gov production machine, execute FTP there, and then pull external files to that machine (with the get command), or log in to the Lab's Virtual Private Network (VPN) before beginning an FTP session. For more information, see the Access Information section of LC's Computing Web pages.
On all LC production machines, open and secure (but not necessarily on LC's other machines), a parallel FTP client (PFTP) is the default. Parallel file transfers occur automatically when they are possible.
For information on how executing the specialized secure FTP (SFTP) client differs from running standard FTP, consult the SFTP section.
Suppose you want to bundle a set of files (perhaps including directory trees) and transfer the resulting archive to another LC machine, but you lack enough disk space to run TAR locally and thus double your disk usage on the client machine before you invoke FTP for the transfer. LC's special HTAR utility creates a large archive file directly in storage without your needing to invoke FTP separately.
See the HTAR Reference Manual for details and annotated examples.
Parallel FTP service is available between each LC production machine and (both OCF and SCF) storage, as well as between pairs of LC production machines themselves. In all cases where parallel service is available, it is automatic. Multiple "Command complete" messages (one for each parallel stripe) indicate that FTP transferred a large file in parallel. (HTAR also automatically uses parallel transfers but does not execute the PFTP client.)
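The one-message-per-stripe signature described above can be checked mechanically in a saved FTP log. The following is a minimal sketch, not an LC tool: the function name is hypothetical, and the line shape it matches ("200 Command complete. Address N is ...") is an assumption taken from the sample session in this guide, so other client versions may word it differently.

```python
import re

# Assumed line shape, copied from the sample session in this guide:
#   "200 Command complete. Address 1 is <address>"
STRIPE_LINE = re.compile(r"^200 Command complete\. Address (\d+) is\b")


def count_parallel_stripes(transcript: str) -> int:
    """Return how many distinct stripe addresses an FTP log reports."""
    stripes = set()
    for line in transcript.splitlines():
        match = STRIPE_LINE.match(line.strip())
        if match:
            stripes.add(int(match.group(1)))
    return len(stripes)


if __name__ == "__main__":
    # Placeholder addresses; only the stripe numbers matter to the counter.
    sample = "\n".join([
        "ftp> put large",
        "200 Command complete (11496780, large, 0, 4, 4194304)",
        "200 Command complete. Address 1 is host-a",
        "200 Command complete. Address 2 is host-b",
        "200 Command complete. Address 3 is host-c",
        "200 Command complete. Address 4 is host-d",
        "150 Transfer starting.",
        "226 Transfer complete. (moved = 11496780).",
        "200 Command complete.",
    ])
    print(count_parallel_stripes(sample))
```

A result of 0 suggests that the transfer was not striped, which is expected for small files.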
Also, the login nodes on some LC production machines are connected to "jumbo-frame gigabit Ethernet links" for fast network traffic. The OCF machines oslic and rzslic and SCF cslic have jumbo-frame links, and are best used for data transfer to storage. Naturally, the best file-transfer rates occur when you get parallel FTP between a pair of machines that also has the fast jumbo-frame links. For example, NFT automatically routes storage transfers to a cluster's login nodes to take advantage of the jumbo-frame links.
LC has installed jumbo-frame gigabit Ethernet links on both OCF and SCF storage systems and the login nodes of most LC clusters (AIX and Linux/CHAOS). Use of these jumbo-frame network links for faster file transfers is automatic among those LC computers that have them installed. FTP, HSI, HTAR, and NFT all utilize the jumbo-frame gigabit Ethernet links automatically.
The default FTP client on all LC production machines (but not necessarily on special-purpose machines) is a locally developed version that enables you to transfer data in parallel. When parallel FTP transfers are possible, they occur automatically. The FTP server ("daemon") on your destination (target) machine determines whether or not the file transfer is automatically parallel. This is the case for all LC production clusters (AIX or Linux/CHAOS) and storage. For all files over 4 Mbytes, FTP file transfers to storage from all LC production machines (both directions) are automatically parallel (OCF and SCF). Transfers originating on those machines with jumbo-frame gigabit Ethernet links also automatically use those links for even faster data movement. FTP's parallel command now simply reports the current parallel stripe width and block size.
The PFTP client offers a number of extra commands (beyond the usual set offered by FTP) to specifically manage parallel file transfers (for example, pget and mpget perform parallel gets). On LC production machines these special PFTP commands are quite unnecessary because parallel transfers occur automatically where they are possible. At other (ASC tri-lab) sites, you may need to remember the special PFTP commands to perform parallel file transfers (especially to storage). See LC's HPSS User Guide for information on extra PFTP commands.
LC's parallel FTP client is more verbose than the standard FTP client during file transfers. Parallel FTP users may want a complete record of each verbose FTP dialog in their batch log files. The child mode execute-line option (-c) causes all interactive output to be sent during batch runs of FTP, and the echo mode (-e) option copies FTP input commands into your batch output. Thus, running FTP with the execute line
ftp -ce remotehost
will preserve all the details of a parallel session even within a batch job. For more information on these and other FTP-related options, consult the FTP man pages.
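A batch job can drive this execute line from a script by feeding FTP commands on standard input. The sketch below shows one way to do so in Python. The function names and log file handling are hypothetical, and it assumes that login needs no interactive reply (for example, because the session is preauthenticated or a .netrc file supplies the user name).

```python
import subprocess


def build_batch_invocation(remotehost, commands):
    """Return (argv, stdin_text) for a non-interactive `ftp -ce` run."""
    argv = ["ftp", "-ce", remotehost]  # -c and -e as described in this guide
    stdin_text = "\n".join(commands) + "\n"
    return argv, stdin_text


def run_batch_ftp(remotehost, commands, log_path):
    """Run the client and append its combined output to log_path."""
    argv, stdin_text = build_batch_invocation(remotehost, commands)
    with open(log_path, "a") as log:
        result = subprocess.run(
            argv,
            input=stdin_text,          # commands arrive as if typed at ftp>
            text=True,
            stdout=log,
            stderr=subprocess.STDOUT,  # keep errors in the same log
            check=False,
        )
    return result.returncode


if __name__ == "__main__":
    # Only show what would be run; nothing is executed here.
    argv, script = build_batch_invocation(
        "depserver.llnl.gov", ["cd /var/tmp/jane", "put testfile", "quit"]
    )
    print(argv)
    print(script, end="")
```

Ending the command list with quit keeps the client from waiting for further input once the transfers finish.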
FTP sessions with storage (and with some other target machines) are fully preauthenticated and do not prompt for your user name, while in other cases FTP returns a Name: prompt to which you must reply to continue. Parallel users who want to eliminate this Name: prompt from all sessions (including batch sessions) can install a file called .netrc in their (global) home directory, containing the following lines:
default login username
macdef init
binary
<empty line>
where the last line in the .netrc file is present but blank. This will put you into binary mode every time, which is fine as there is no need to use the default ASCII mode.
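Because the trailing blank line is easy to lose in an editor, some users may prefer to generate the file. The following is a minimal sketch with hypothetical function names. The owner-only permission setting is an added precaution and an assumption, not something this guide requires.

```python
import os
import tempfile
from pathlib import Path


def render_netrc(username: str) -> str:
    """Return the .netrc text described above, ending with a blank line."""
    lines = [
        f"default login {username}",
        "macdef init",
        "binary",
        "",  # the blank line that must close the file
    ]
    return "\n".join(lines) + "\n"


def write_netrc(username: str, home: Path) -> Path:
    """Write .netrc under `home` and return its path."""
    path = Path(home) / ".netrc"
    path.write_text(render_netrc(username))
    os.chmod(path, 0o600)  # assumption: owner-only access as a precaution
    return path


if __name__ == "__main__":
    # Use a scratch directory so that a real ~/.netrc is never overwritten.
    with tempfile.TemporaryDirectory() as scratch:
        print(write_netrc("jane", Path(scratch)).read_text(), end="")
```

To install the file for real, pass your home directory (for example, Path.home()) instead of the scratch directory.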
The following sample session (with annotated steps) shows a typical dialog by which a user (Jane) transfers files interactively using FTP. In this case, the local machine (on which Jane executes the FTP client) is Cab, and the remote machine that files are copied to and from is a local department server called depserver. (For an alternative approach on LC production machines, you can use Hopper as a graphical controller for FTP.)
(1) The user runs FTP (on Cab) with the remote machine's domain name as an argument.
ftp depserver.llnl.gov
Connected to depserver.llnl.gov.
220 [NOTICE TO USERS -- very long legal statement]
222 depserver.llnl.gov FTP server (Version LLNL-27...) ready.
202 Command not implemented.
(2) FTP prompts for a user ID and a password to log in to depserver (some LC machines "preauthenticate" and skip this password step).
Name (depserver.llnl.gov:jane): jane
331 Password required for jane.
Password: [does not echo]
230 User jane logged in.
Remote system type is UNIX.
Using binary mode to transfer files.
(3) At the FTP prompt, change remote directories to /var/tmp/jane (which is not shared among LC machines).
ftp> cd /var/tmp/jane
250 CWD command successful.
(4) Next, get the file nft.ps (copy it from depserver to Cab).
ftp> get nft.ps
200 PORT command successful.
150 Opening Binary data connection for nft.ps
226 Transfer complete.
1602470 bytes received in 0.579 seconds (2.64 Mbytes/s)
(5) Next, put the file testfile (copy it from Cab to depserver).
ftp> put testfile
200 PORT command successful.
150 Opening Binary data connection for testfile
226 Transfer complete.
5264 bytes sent in 0 seconds (5.14 Kbytes/s)
(6) The user then transfers a 1.1-Gbyte file called large from Cab to depserver. FTP automatically invokes four parallel stripes (each separately reported as FTP "completed" commands in the output).
ftp> put large
200 Command complete (11496780, large, 0, 4, 4194304)
200 Command complete. Address 1 is 18.104.22.168.2356
200 Command complete. Address 2 is 22.214.171.124.2357
200 Command complete. Address 3 is 126.96.36.199.2358
200 Command complete. Address 4 is 188.8.131.52.2359
150 Transfer starting.
226 Transfer complete. (moved = 11496780).
11496780 bytes sent in 0.79 seconds (16.3 Mbytes/s)
200 Command complete.
(7) When the file transfers are done and confirmed, quit FTP.
ftp> quit
221 Goodbye
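For readers who script transfers, the same cd, get, put, and quit sequence can be approximated with Python's standard ftplib module. This is a sketch under stated assumptions, not LC's parallel client: ftplib speaks plain FTP, so no parallel stripes occur, and the helper names below are hypothetical.

```python
import os
from ftplib import FTP


def connect(host, user, password):
    """Open a control connection and log in (steps 1 and 2)."""
    ftp = FTP(host)
    ftp.login(user, password)
    return ftp


def exchange_files(ftp, remote_dir, fetch_name, send_name, local_dir="."):
    """Change directory, get one file, put one file, then quit."""
    ftp.cwd(remote_dir)                                   # step 3: cd
    with open(os.path.join(local_dir, fetch_name), "wb") as out:
        ftp.retrbinary(f"RETR {fetch_name}", out.write)   # step 4: get
    with open(os.path.join(local_dir, send_name), "rb") as src:
        ftp.storbinary(f"STOR {send_name}", src)          # step 5: put
    ftp.quit()                                            # step 7: quit


# Example call (needs network access and a valid password, so it is not run):
#   ftp = connect("depserver.llnl.gov", "jane", password)
#   exchange_files(ftp, "/var/tmp/jane", "nft.ps", "testfile")
```

Both transfers use binary mode, matching the "Using binary mode to transfer files" line in the session above.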
FTP commands are described in the FTP man pages. Access them by typing man ftp at the command line prompt. Type help at the FTP prompt to see a list of available commands. Information about standard FTP commands and other FTP issues is easily found online using your favorite Web search engine.
When you enter an FTP command, you receive a corresponding reply indicating that the command was accepted, rejected, or is being processed. An FTP reply consists of a three-digit code followed by a brief description of the result (as seen in the Sample FTP Session above). Information about common FTP replies and errors is easily found online using your preferred Web search engine.
The exact text accompanying each reply code depends on the command issued.
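The first digit of a reply code gives its general class under the FTP standard (RFC 959), which is often all a script needs to know. The sketch below splits a single-line reply into its code and text and names that class. The function names are hypothetical, and multi-line replies (which use a hyphen after the code) are deliberately not handled.

```python
# Reply classes by first digit, as defined in RFC 959.
REPLY_CLASSES = {
    "1": "positive preliminary",
    "2": "positive completion",
    "3": "positive intermediate",
    "4": "transient negative completion",
    "5": "permanent negative completion",
}


def parse_reply(line):
    """Split '226 Transfer complete.' into (226, 'Transfer complete.')."""
    code, _, text = line.strip().partition(" ")
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not a single-line FTP reply: {line!r}")
    return int(code), text


def classify_reply(line):
    """Return the RFC 959 class name for a single-line reply."""
    code, _ = parse_reply(line)
    return REPLY_CLASSES.get(f"{code:03d}"[0], "unknown")


if __name__ == "__main__":
    for reply in ("331 Password required for jane.", "226 Transfer complete."):
        print(parse_reply(reply), classify_reply(reply))
```

For example, the 331 reply in the sample session is a positive intermediate reply: the server accepted the user name and is waiting for the password.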
Standard FTP clients do not encrypt the data that they send to remote hosts, which theoretically allows malicious third parties to intercept and read that data. Secure FTP (SFTP) is a modified client that does encrypt all the files that it sends for greater safety.
SFTP clients reside on all OCF and SCF production machines.
- FIS—LC's File Interchange Service (FIS, at fis.llnl.gov) is the only LC server that now accepts incoming files from SFTP clients. FIS only accepts SFTP transfers from within the LC firewall, so direct SFTP transfers from outside machines by means of VPN are not accepted.
- Others—No other LC FTP servers accept SFTP transfers. In particular, you cannot store files (at storage.llnl.gov) from any host by running SFTP.
SFTP clients present a different user dialog than do standard FTP clients on LC machines. While some differences are trivial, others require different user responses to open connections or to transfer files successfully. SFTP:
- Does not request your user name (nor present it as a default to which you can simply respond by hitting the RETURN key).
- Checks for a host key for every new host to which you try to connect and, if not found, asks if you want to continue connecting (yes/no) anyway.
- Requests your one-time password (OTP) to open every connection unless you have Kerberos or public key authentication (no default preauthentication occurs, unlike for standard FTP connections among LC machines).
- Prompts for input with sftp>.
SFTP recognizes many of the usual set of FTP control options. Type ? or help at the SFTP prompt to see the list of available commands.
Among the most useful standard FTP options that SFTP does not accept are:
- dir: In most situations, the SFTP alternative option (ls -l) lists files and their properties just as dir does for standard FTP sessions.
- delete: The SFTP alternative rm removes remote files and performs the same functions as delete during standard FTP sessions.
- ascii, binary, parallel, quote, site: SFTP provides no alternative options for these commands. It is supposed to automatically detect ASCII and BINARY files on arrival and transfer them in the appropriate mode, but you cannot force the mode if inappropriate transfers occur.
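Users who convert existing FTP command files for use with SFTP can apply the substitutions described above mechanically. The following is a small sketch with hypothetical names. It covers only the commands named in this section and assumes, without checking, that any other command is one SFTP accepts unchanged.

```python
# FTP commands named in this section and their SFTP alternatives.
# None means that SFTP offers no alternative.
SFTP_ALTERNATIVES = {
    "dir": "ls -l",
    "delete": "rm",
    "ascii": None,
    "binary": None,
    "parallel": None,
    "quote": None,
    "site": None,
}


def to_sftp(command_line):
    """Translate one FTP command line into its SFTP form."""
    parts = command_line.split()
    if not parts:
        raise ValueError("empty command")
    verb, args = parts[0], parts[1:]
    if verb not in SFTP_ALTERNATIVES:
        # Assumption: commands not listed above work the same way in SFTP.
        return " ".join(parts)
    alternative = SFTP_ALTERNATIVES[verb]
    if alternative is None:
        raise ValueError(f"SFTP has no alternative for {verb!r}")
    return " ".join([alternative, *args])


if __name__ == "__main__":
    for line in ("dir", "delete old.dat", "get nft.ps"):
        print(f"{line} -> {to_sftp(line)}")
```

Raising an error for commands such as binary, instead of silently dropping them, makes it obvious when a command file depends on behavior that SFTP cannot provide.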
Public Keys (SCF Only)
If you prefer not to use your OTP (one-time password) to authenticate every SFTP session, you can create an SSH public key with OpenSSH and install it on every pair of machines between which you transfer files with SFTP. Generating an appropriate public key, converting it to OpenSSH format if necessary, and installing it in the right directories (including those on the open FIS node) is a complex, multi-step process. If you need assistance with creating a public key, please contact the LC Hotline.
Note: On OCF, SSH public key authentication is only allowed in limited cases. One is between production clusters using port 622, and the other is when uploading to FIS.