How HTAR Works

HTAR makes an archive (or library) file in the standard POSIX 1003.1 TAR format, which allows TAR to open any HTAR archive file. Because HTAR offers more services than ordinary TAR, it needs extra internal machinery to support those services, some of which reveals itself in HTAR status messages or command responses. This section briefly explains how HTAR makes an archive file and the role that several support files play in that process.

  • Archive File (name.tar): When you run HTAR with the create-archive (-c) option, the program first opens a connection to storage (HPSS). It then uses multiple threads to transfer the local disk files that you specify, in parallel (but without using PFTP), into a TAR-format envelope file that it creates in your storage home directory unless you request another location. This archive file never exists on local disk, not even in temporary directories on the machine where HTAR runs, unless you demand it with the -E option. Instead, HTAR reads the member files piecewise into its internal buffers and moves the data directly to HPSS, where it assembles the archive. At the same time, HTAR builds a separate index file (outside the archive) and a small consistency file (deposited last inside the archive), both discussed below. HPSS is very reliable, but HTAR automatically uses a storage class of service (COS) that keeps only one copy of your stored archive file. For files of special importance only, use HTAR's -Y dualcopy option to force creation of a duplicate (invisible) backup copy. Use -K to verify your archived results. Note: The .tar suffix is not required for the archive file name, but it is a useful indication of file type.
  • Index File (name.tar.idx): To allow archives of unlimited size and to support the direct extraction of any stored archive member(s) without retrieving the whole archive to local disk, HTAR automatically builds an external index file to accompany every archive that you create. While making the archive, HTAR temporarily writes the index file to the local /tmp file system on the machine where it runs. At the end of the process, it transfers the index file (by default) to the same storage directory where the archive itself resides. Each HTAR index file contains one 512-byte record for every member file, directory entry, or symbolic link stored in the corresponding archive file, regardless of the member file's size (so even a 10,000-file archive will have an index file of only about 5 MB). Because HTAR index files are so much smaller than the archives that they support, the index file often remains on HPSS disk (to respond rapidly to queries) even when the larger archive file itself migrates to storage tape. If you use HTAR's -E option to force the archive to local disk, the index file is written to the same location as the archive file.
  • Consistency File (/usr/tmp/HTAR_CF_CHK_nnnnn_mmmmmmmmm): Because the archive and index files are separate, HTAR maintains a consistency check between them in an additional 1-block (256-byte) file always included (as a last step) at the end of each archive. This consistency file's name has the long numerical format shown above, but it begins with /var/tmp/uname. HTAR never extracts this file (unless you specifically request it), but every use of -t and -v (together with -c or -x) reports this consistency file at the end of HTAR's list of archived contents. (Verification option -K neither reports this consistency file nor counts it.)
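
The index-size figure quoted above follows from simple arithmetic: one 512-byte record per archive member, whatever the member's size. The following is a minimal sketch of that estimate in Python. The helper name estimate_index_size is hypothetical (it is not part of HTAR), and the sketch assumes that the index holds nothing besides the per-member records.

```python
RECORD_BYTES = 512  # one index record per member, as described above


def estimate_index_size(member_count: int) -> int:
    """Approximate HTAR index file size in bytes (per-member records only).

    Hypothetical helper for illustration; any extra header or trailer
    data in a real index file is not accounted for.
    """
    if member_count < 0:
        raise ValueError("member_count must be non-negative")
    return member_count * RECORD_BYTES


if __name__ == "__main__":
    size = estimate_index_size(10_000)
    print(f"{size} bytes (~{size / 1e6:.2f} MB)")  # 5120000 bytes (~5.12 MB)
```

Because the estimate depends only on the number of members, an archive of 10,000 tiny files and an archive of 10,000 very large files would have index files of about the same size.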

TAR and HTAR Compared

HTAR is specifically designed to efficiently store a set of files together in HPSS or get them back (not merely to make an archive file and leave it). The table below compares the features and effects of TAR with those of HTAR.

| Feature | TAR | HTAR |
|---|---|---|
| Can create an archive file without storing it? | Yes (the default) | With -E |
| Can create an archive file without using local disk space? | No | Yes (the default) |
| Can store an archive file while creating it? | No, needs FTP | Yes (the default) |
| Can read any TAR archive file? | Yes (the default) | Yes, if -X first |
| Can read any HTAR archive file? | Yes | Yes (the default) |
| Can extract just one file from a stored archive? | No | Yes (the default) |
| Can add file(s) to an existing archive? | Yes | No |
| Default target if no archive specified? | Yes (tape) | No, -f required |
| Treats input directories recursively? | Yes | Yes |
| Preserves original permissions on files? | No (uses UMASK) | Yes, with -p |
| Depends on HPSS availability to work? | No | Yes |
| Archive duplicated automatically in storage? | No | Only with -Y dualcopy |
| Builds and needs an external index file? | No | Yes |
| Builds and needs a consistency check file? | No | Yes |
| Overwrites existing files without warning? | Yes | Yes (-w disables) |
| Can use standard input or output? | Yes (with -f -) | Yes (with -L, -O) |
| Order of options important? | Somewhat | Somewhat |
| Table of contents (-t) reveals what? | File names only | File names and properties |
| Can create and verify CRC checksums of member files? | No | Yes |
| Can verify contents of a newly created archive as part of creation operation? | No | Yes |
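
Because an HTAR archive is an ordinary POSIX TAR file, any standard TAR reader can list its members once the archive is on local disk. The sketch below illustrates this with Python's tarfile module. It does not call HTAR or HPSS: it builds a small synthetic archive in memory, laid out as described earlier (member files first, consistency file last), and then lists the members while skipping the consistency file. The member names, the digits in the consistency-file name, and its placeholder contents are all made up for illustration.

```python
import io
import tarfile

CF_MARKER = "HTAR_CF_CHK_"  # name pattern of the consistency file


def build_demo_archive() -> bytes:
    """Build an in-memory POSIX (ustar) archive laid out like an HTAR archive."""
    buf = io.BytesIO()
    members = [
        ("data/a.txt", b"alpha\n"),
        ("data/b.txt", b"bravo\n"),
        # Made-up consistency-file name and placeholder payload;
        # a real one is generated by HTAR itself.
        ("/usr/tmp/HTAR_CF_CHK_12345_123456789", b"\0" * 256),
    ]
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, payload in members:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def list_user_members(archive: bytes) -> list[str]:
    """Return member names, skipping the trailing consistency file."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:") as tar:
        return [m.name for m in tar.getmembers() if CF_MARKER not in m.name]


if __name__ == "__main__":
    print(list_user_members(build_demo_archive()))
```

A plain TAR reader like this one must scan the archive from the start and knows nothing about the external index file, which is why extracting a single member directly from a stored archive remains an HTAR-only feature in the table above.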