HTAR Options

Action Options

One of these action options is required every time that you run HTAR.

-c

(create) opens a connection to storage, creates an archive file at the storage location (not online) and with the name specified by -f, and transfers (copies) into the archive each file specified by filelist (required whenever you use -c). If archivename already exists, HTAR overwrites it without warning. To create a local archive file instead (the way TAR does), also use -E. If filelist specifies any directories, HTAR includes them and all of their children recursively. Use -P with -c to automatically create all needed subdirectories along the archive path name.

-D

(soft delete) opens a connection to storage and reads the existing index file, creating a new temporary index file in the local file system and marking each of the specified member files as deleted in the new index file. It then replaces the existing index file with the new temporary copy.

-K

(verify) opens a connection to storage, verifies the index file for the archive that you specify with -f, then uses the index file to verify every entry in (member of) the archive file itself. The default responses from -K appear very quickly and overwrite, so you may only be able to read the last one ("HTAR successful," if it is). If the index file is missing for an archive, -K reports the error message "no such file archivename.idx." If you combine -K with -v, HTAR lists the name of each file that it finds in the specified archive in alphabetical order, one per line, along with the size of each in bytes and in blocks (excluding the consistency file), then gives a total file count.

-t

(table of contents) opens a connection to storage, then lists the files currently within the stored archive file specified by -f, along with their owner, size, permissions, and modification date (the list includes HTAR's own consistency file). Here filelist defaults to * (all files in the archive), but you can specify a more restrictive subset (usually by making filelist a filter).

-U

(undelete) undeletes the specified member files from the archive that were previously soft-deleted by -D by removing the deleted flag in their index file entries.

-x

(extract) opens a connection to storage, then transfers (extracts, copies) from the stored (remote) archive file specified by -f each internal file specified by filelist (or all files in the archive if you omit filelist). If filelist specifies any directories, HTAR extracts them and all their children recursively. If any file already exists locally, HTAR overwrites it without warning, and it creates all new files with the same owner and group IDs (and if you use -p, with the same UNIX permissions) as they had when stored in the archive. (If you lack needed permissions, extracted files get your own user and group IDs and the local UMASK permissions; if you lack write permission then -x creates no files at all.) Note that -x works directly on the remote archive file; you never retrieve the whole archive from storage just to extract a few specified files from within it.

-X

(index) opens a connection to storage, then creates an (external) index file for the existing archive file specified by -f (a stored TAR format file by default or a local TAR-format file if you also use -E). Using -X rescues an HTAR archive whose (stored) index file was lost, and it enables HTAR to manage an archive originally created by traditional TAR. The resulting external index file is stored if the corresponding archive is stored, but local if the archive is local (with -E). See the How HTAR Works section for an explanation of HTAR index files.
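The rules above (exactly one action option per run, and a filelist whenever you use -c) can be sketched as a small argument builder. This is only an illustration: build_htar_argv is a hypothetical helper, not part of HTAR, and it assembles an argument list without running anything.

```python
import shlex

# Action letters documented above; exactly one must be chosen per run.
ACTIONS = {"c", "D", "K", "t", "U", "x", "X"}


def build_htar_argv(action, archive, filelist=(), extra=()):
    """Assemble an htar argument list (hypothetical helper; runs nothing)."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action option: -{action}")
    if action == "c" and not filelist:
        raise ValueError("-c requires a filelist")
    # The archive name must appear immediately after -f.
    return ["htar", f"-{action}", *extra, "-f", archive, *filelist]


print(shlex.join(build_htar_argv("c", "abc.tar", ["dir1", "notes.txt"])))
print(shlex.join(build_htar_argv("x", "xyz/abc.tar", extra=["-p"])))
```

The printed lines are ordinary HTAR execute lines that you could type at a shell prompt; the archive and file names are placeholders.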

Archive Option

This option is required every time you run HTAR unless you only use the -? option.

-f archivename

(required option) specifies the archive file on which HTAR performs the action options -c|t|x|X|K. HTAR has no default for -f (whose argument must appear immediately after the option name). Because HTAR operates on stored archive files, archivename also locates the archive file relative to your HPSS home directory: a simple file name here (e.g., abc.tar) resides in your storage home directory, while a relative pathname (e.g., xyz/abc.tar) specifies a subdirectory of your storage home directory (i.e., /users/unn/username/xyz/abc.tar). Never use tilde (~) in archivename. HTAR's -f makes no subdirectories; you must have created them in advance.
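As a rough model of how archivename is located, the following sketch joins the -f argument to a storage home directory. The function name resolve_archive_path is hypothetical, the home directory is the placeholder used in the text, and the handling of an absolute archivename (returned unchanged) is an assumption rather than documented HTAR behavior.

```python
import posixpath


def resolve_archive_path(archivename, storage_home):
    """Return the storage path that a -f argument would refer to (sketch)."""
    if "~" in archivename:
        raise ValueError("tilde (~) is not allowed in archivename")
    # Assumption: posixpath.join leaves an absolute archivename unchanged.
    return posixpath.join(storage_home, archivename)


home = "/users/unn/username"  # placeholder storage home directory
print(resolve_archive_path("abc.tar", home))
print(resolve_archive_path("xyz/abc.tar", home))
```

The second call yields /users/unn/username/xyz/abc.tar, which only succeeds in practice if the xyz subdirectory already exists (or you create the archive with -P).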

Control Options

Control options change how HTAR behaves, but they are not required. Default values are indicated when they exist.

-?

displays a short syntax summary of the HTAR execute line and a one-line description of each option. Users running HTAR under some shells may need to protect the question mark by using the three-character string -\? to display this help message.

-B

adds block numbers to the listing (-t) output.

-d debuglevel

(default is 0) sets to an integer from 0 through 5 the level of debug output from HTAR, where 0 disables debug information for normal use and 1 to 5 enable progressively more elaborate debug output.

-E

emulates TAR by forcing the archive file to reside on the local machine (where you run HTAR) rather than in HPSS (storage), where it resides by default (-f always specifies the archive pathname, which -E interprets as local rather than remote). The HTAR index file goes into the same (local) directory as the archive. Option -P works with -E.

--exclude options

Note: The exclude family of options only applies to creation of new archive files.  These options are applied during the initial directory scan. The specification (but not the code) for these options was based on the "exclude" feature of the popular GNUTAR program.  However, there is no guarantee that the HTAR exclude feature will operate exactly the same as the GNUTAR exclude feature.

Files and directories that are excluded from the archive are not listed by default.  To enable listing of  excluded files, create an .htarrc file in your home directory with the following contents:

DisplayExcludedObjects = yes 

--exclude=pattern

causes htar to recursively avoid including files or directories whose name matches the shell wildcard pattern. Multiple --exclude options may be given.

--exclude-from=file

causes htar to read a list of shell patterns from file to be recursively excluded.

Note: a frequent error that can be hard to find is whitespace characters after a name read from the file. However, empty lines are ok.

Multiple --exclude-from options may be given.
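Because trailing whitespace in an exclude file is easy to miss, it can help to check the file before running HTAR. The sketch below is a hypothetical pre-check (find_trailing_whitespace is not an HTAR feature), and it uses Python's fnmatch module as a stand-in for shell pattern matching; HTAR's own matcher may differ in detail.

```python
import fnmatch


def find_trailing_whitespace(lines):
    """Return (line_number, pattern) pairs whose pattern ends in whitespace."""
    problems = []
    for number, line in enumerate(lines, start=1):
        pattern = line.rstrip("\n")
        # Empty lines are acceptable; only flag patterns with stray whitespace.
        if pattern and pattern != pattern.rstrip():
            problems.append((number, pattern))
    return problems


# Sample file contents; the third line has a space after "core".
patterns = ["*.o\n", "\n", "core \n", "build\n"]
print(find_trailing_whitespace(patterns))

# The stray space becomes part of the pattern, so the intended name no longer matches.
print(fnmatch.fnmatchcase("core", "core "))
```

To check a real file, pass the result of open(path).readlines() instead of the sample list.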

--exclude-vcs-ignores

Before htar archives a new directory found during the prescan, it looks to see if the directory contains any of the following files:

.cvsignore, .gitignore (R), .bzrignore (R), .hgignore (R)

If so, it reads patterns from the file and ignores objects that match any of the patterns.

It treats the files in the same way that the version control system would treat them, some recursively starting at the new directory (marked with R) and some that apply just to the new directory. Patterns in .bzrignore and .hgignore files can be either shell globbing patterns or regular expressions. .bzrignore and .hgignore files can also contain comments whose first character is ’#’.

--exclude-ignore=file

Before scanning a new directory, htar checks whether it contains file. If so, it reads exclusion patterns from file; these patterns apply only to the new directory.

--exclude-ignore-recursive=file

This is the same as exclude-ignore, except that patterns apply recursively to the new directory and to all of its subdirectories.

--exclude-vcs

Excludes files and directories used by the following version control systems: CVS, RCS, SCCS, Git, Subversion, Arch, Bazaar, Mercurial, and Darcs. This includes all of the following files and directories:

  • ‘CVS/’, and everything under it
  • ‘RCS/’, and everything under it
  • ‘SCCS/’, and everything under it
  • ‘.git/’, and everything under it
  • ‘.gitignore’
  • ‘.cvsignore’
  • ‘.svn/’, and everything under it
  • ‘.arch-ids/’, and everything under it
  • ‘{arch}/’, and everything under it
  • ‘=RELEASE-ID’
  • ‘=meta-update’
  • ‘=update’
  • ‘.bzr’
  • ‘.bzrignore’
  • ‘.bzrtags’
  • ‘.hg’
  • ‘.hgignore’
  • ‘.hgtags’
  • ‘_darcs’

--exclude-backups

causes htar to exclude backup and lock files that match the following shell globbing patterns (with the quotes removed): ".#*" "*~" "#*#"
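To see which names the three backup patterns select, you can try them with any shell-style matcher. The sketch below uses Python's fnmatch module as an approximation; is_backup_name is a hypothetical helper, and HTAR's matcher is not guaranteed to behave identically.

```python
import fnmatch

# The three patterns listed for --exclude-backups, with the quotes removed.
BACKUP_PATTERNS = (".#*", "*~", "#*#")


def is_backup_name(name):
    """Return True if name matches any of the backup/lock patterns."""
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in BACKUP_PATTERNS)


for name in (".#main.c", "notes.txt~", "#draft#", "main.c"):
    print(name, is_backup_name(name))
```

Only main.c fails to match, so it is the only one of the four sample names that would be archived under this model.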

--exclude-caches options

causes htar to exclude directories that contain a standard CACHEDIR.TAG file, in the form specified by http://www.brynosaurus.com/cachedir/spec.html

There are 3 variations of the exclude-caches option, each with slightly different semantics:

  • --exclude-caches – do not archive the contents of the directory, but archive the directory itself and the CACHEDIR.TAG file
  • --exclude-caches-under – do not archive the contents of the directory or the CACHEDIR.TAG file; archive just the directory itself
  • --exclude-caches-all – entirely omit directories containing the CACHEDIR.TAG file

--exclude-tag options

is a generalization of the ’exclude-caches’ option which allows specifying the filename to look for (instead of CACHEDIR.TAG).

  • --exclude-tag=file – do not archive the contents of the directory, but archive the directory itself and file
  • --exclude-tag-under=file – do not archive the contents of the directory or file; archive just the directory itself
  • --exclude-tag-all=file – entirely omit directories containing file

-h

(used only with -c; has no effect otherwise) for each symbolic link that it encounters, causes HTAR to replace the link with the actual contents of the linked-to file (stored under the link name, not under the file's original name). Later use of -t or -x treats the linked-to file as if it had always been present as an actual file with the link name. Without -h, HTAR records, reports, and restores every symbolic link overtly, but it does not replace the link with the linked-to contents.

-H subopt[:subopt...]

specifies a colon-delimited list of HTAR suboptions to control program execution. Possible subopt values include:

acct=id/acctname

specifies the numeric account ID or alphabetic account name to use for the current HTAR run. This option is only meaningful for HPSS-resident archives.

cix

used with the extract (-x) operation with HPSS-resident archives. If specified, precopies the index file to a temporary local file before reading the archive file. This option is normally not needed, but was added to avoid problems that were encountered with multithreaded I/O on some hardware platforms.

crc

enables generation of Cyclic Redundancy Checksums (CRCs) when copying member files into the archive and when verifying the contents of the archive (-K command line option, or -Hverify option for creates). Enabling checksums usually degrades HTAR's I/O performance and increases its CPU utilization.

exfile=path

specifies a path name to an "exceptions" file, which contains a list of failed member files and an explanation of the failure. Note: This option is currently implemented only for the GPFS/HPSS Interface (GHI).

family=id[,index_id]

specifies tape file family ID to use when creating HPSS-resident archive files, and, optionally, the family ID to use when creating the index file. This option is useful at sites which make use of the HPSS "file family" capability. Family ID 0, which is the default, uses the default pool of tapes. Contact your HPSS administrator to determine the file families that are available at your site.

nocfchk

causes HTAR to disable the verification of the index file and the consistency file. Use of this option can avoid extra tape mounts if the consistency file lives on a different tape cartridge than the specified member file(s). Currently, this option is only effective for the -D (soft delete) action.

nocrc

(the default) disables generation of CRCs when creating files and when extracting files from or verifying existing archive files.

nostage

avoids prestaging tape-resident (stored) archive files when HTAR performs -x or -X actions.

port=x

specifies the TCP port number to use when HTAR connects to the remote HPSS server. This parameter is only used in conjunction with the -Hserver parameter.

relpaths

used with the verify (-K) action. When comparing member files in the archive file with local files, forces relative local file paths to be used by removing any leading "/" from the member file path name before attempting to read it in the local file system.

rmlocal

removes local member files after HTAR has successfully written both the archive file and the index file (used with -c).

server=host

specifies the hostname or TCP/IP address of the HPSS server. The HPSS administrator defines the default server host or IP address when HTAR is built. The -Hport parameter (see above) can be used in conjunction with this option to completely specify the connection address to be used.

tss=stack_size

specifies the thread stack size to be used when HTAR creates threads to read local files during a create (-c) operation. In most cases the system default value can be used, but a very large default thread stack size (for example, on machines that are tuned for compute-type problems) can cause HTAR thread creation to fail. stack_size can be specified in bytes, kilobytes, or megabytes by appending a case-insensitive suffix (k, kb, m, or mb).

umask=octal_mask

used with the -c option. This specifies the HPSS umask value to be set during HTAR startup. This impacts the permissions that are set on the resulting archive and index files that HTAR creates in the same manner as the Unix umask command.

verify=option[,option,..]

specifies one or more verification options that should be performed following successful creation of the archive (-c), or for the verify (-K) command. Multiple options can be specified by separating them with a comma, with no whitespace. Options are processed from left to right, and, in the case of conflicting options, the last one encountered is used without comment. The options can be either individual items or the keyword "all" or a numeric value of 0, 1, or 2. Each numeric level includes all of the checks for lower-valued levels and adds additional checks. The verification options are:

all

enables all possible verification options except paranoid.

info

reads and verifies the tar-format headers that precede each member file in the archive.

crc|nocrc

enables or disables recalculation of the cyclic redundancy checksum (CRC) and verification that it matches the value that is stored in the index file. Note that this option only applies if the -Hcrc option was specified, which causes a CRC to be generated for each member file as it is copied into the archive file.

compare|nocompare

enables or disables byte-by-byte comparison of the local member files with the corresponding archive files. If -Hrelpaths is not specified, then absolute paths for member files in the archive will also be treated as absolute local paths.

paranoid|noparanoid

enables or disables (the default) extreme efforts to detect problems (such as discovering whether local files were modified during archive creation before deleting them if authorized by -Hrmlocal).

0|1|2

0 enables the "info" verification. 1 enables level 0 and "crc" (i.e., info,crc). 2 enables level 1 and "compare" (i.e., info,crc,compare). It is also possible to specify a verification option such as "all" or a numeric level such as 0, 1, or 2, and then selectively disable one or more options.
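The left-to-right, last-one-wins processing of verify options can be modeled in a few lines. This is a sketch under stated assumptions: resolve_verify is a hypothetical helper, "all" is taken to mean info, crc, and compare (everything except paranoid), and a numeric level is assumed to add its checks to whatever is already enabled rather than reset the list.

```python
# Checks enabled by each numeric level, as described above.
LEVELS = {
    "0": {"info"},
    "1": {"info", "crc"},
    "2": {"info", "crc", "compare"},
}
ALL_CHECKS = {"info", "crc", "compare"}  # "all" excludes paranoid
TOGGLES = {"crc", "compare", "paranoid"}  # options that have a "no" form


def resolve_verify(spec):
    """Resolve a comma-separated verify= value, processed left to right."""
    enabled = set()
    for option in spec.split(","):
        if option in LEVELS:
            enabled |= LEVELS[option]  # assumption: levels add, not reset
        elif option == "all":
            enabled |= ALL_CHECKS
        elif option == "info" or option in TOGGLES:
            enabled.add(option)
        elif option.startswith("no") and option[2:] in TOGGLES:
            enabled.discard(option[2:])
        else:
            raise ValueError(f"unknown verify option: {option!r}")
    return enabled


print(sorted(resolve_verify("2,nocompare")))
print(sorted(resolve_verify("all,nocrc")))
print(sorted(resolve_verify("nocrc,1")))
```

In the last call the later option wins: level 1 re-enables crc after nocrc disabled it.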


-I indexname

specifies a nondefault name for the HTAR external index file that supports the archive specified by -f.

WARNING: if you use -I to make any nondefault index name (3 cases, below) when you create (-c) an archive, then you MUST also use -I with the same argument every time you extract (-x) files from that archive (else HTAR will look for the default index, not find it, and end with an error).

There are three cases based on the first character of indexname:

. (dot)

If indexname begins with a period (dot), HTAR treats it as a suffix to append to the current archive name.

Example: -I .xnd yields an index file called archivename.xnd

/

If indexname begins with a / (slash), HTAR treats it as an absolute path name (you must create all the subdirectories ahead of time with FTP's or HSI's mkdir option).

Example: -I /users/unn/yourname/projects/text.idx uses that absolute path name in storage (HPSS) or the local file system (-E) or remote file system (-F) for the index file.

other

If indexname begins with any other character, HTAR treats it as a relative pathname (relative to the storage directory where the archive file resides, which might be different than your storage home directory).

Example: -I projects/first.index locates first.index at storagehome/projects/first.index if the archive file is in your storagehome (the default), but tries to locate first.index at storagehome/projects/projects/first.index if the archive was specified as -f projects/aname in the first place. (All such subdirectories must be created in advance or the -P command line option must be specified to create any missing intermediate subdirectories.)
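The three cases can be summarized in a short sketch. The function index_path is hypothetical, and the default .idx suffix used when -I is omitted is inferred from the archivename.idx name in the -K error message rather than stated as a rule.

```python
import posixpath


def index_path(archive_path, indexname=None):
    """Model where HTAR looks for the index file for a given -I argument."""
    if indexname is None:
        # Assumption: default index name is the archive name plus ".idx".
        return archive_path + ".idx"
    if indexname.startswith("."):
        # Case 1: a suffix appended to the archive name.
        return archive_path + indexname
    if indexname.startswith("/"):
        # Case 2: an absolute path name, used as given.
        return indexname
    # Case 3: relative to the directory that holds the archive file.
    return posixpath.join(posixpath.dirname(archive_path), indexname)


print(index_path("storagehome/abc.tar", ".xnd"))
print(index_path("storagehome/aname", "projects/first.index"))
print(index_path("storagehome/projects/aname", "projects/first.index"))
```

The last two lines reproduce the relative-path example: the same -I argument lands in projects/ or projects/projects/ depending on where the archive itself resides.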

-L inputfile

(used with -c) writes the files and directories specified by their literal names (in the inputfile, which contains file names one per line) into the archive specified by -f. Directories are treated recursively; a directory entry and its subdirectories or subfiles are all written to the archive. Normal metacharacters (tilde, asterisk, question mark) are treated literally, not expanded as filters. Replace inputfile with a hyphen (-L -) for HTAR to read the list of file names from standard input; the HTAR Limitations section shows how to use this technique.

(used with -x) retrieves the files and directories specified by their literal names. See the Retrieving Files example below for how to use -L instead of wild cards to retrieve only specified files from a stored archive.

WARNING: HTAR's -L differs from both AIX TAR's -L (which handles directories nonrecursively) and Linux TAR's -L (which changes tapes).

-m

(used only with -x; applies only to files) makes the time of extraction the last-modified time for each member file (the default preserves each file's original time of last modification). For directories, HTAR itself always preserves the original modification time for top-level directories that it copies from an archive, even if you invoke -m. However, subsequently creating subdirectories or files within a directory may cause the operating system to change the modification time on one or more directories (so that it too appears to be the time of extraction).

-M maxfiles

(default is 10,000,000 at LLNL) specifies the maximum number of member files allowed when you use -c to create an HTAR archive. Internal limits are set when HTAR is compiled at each site; at LLNL, you can increase maxfiles as high as 50,000,000.

-n timeinterval

(used only with -c; has no effect otherwise) includes in a new archive only those files (that meet your other naming criteria and) that were either created or modified between now and the start of timeinterval. Option -n is intended mostly to simplify the creation of incremental backup archives. Here timeinterval can have the form:

d

an integer that specifies days (e.g., 5 for 5 days), or

:h

a colon followed by an integer that specifies hours (e.g., :12 for 12 hours), or

d:h

a pair of integers that specify days and then hours (e.g., 1:6 for 1 day and 6 hours).
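The three timeinterval forms translate directly into a cutoff time: files created or modified after the cutoff are candidates for the new archive. The sketch below is illustrative only; parse_timeinterval and cutoff are hypothetical helpers, and a fixed timestamp stands in for "now" so the output is repeatable.

```python
from datetime import datetime, timedelta


def parse_timeinterval(text):
    """Convert a -n argument (d, :h, or d:h) into a timedelta."""
    days, _, hours = text.partition(":")
    if not days and not hours:
        raise ValueError("timeinterval needs days, hours, or both")
    return timedelta(days=int(days or 0), hours=int(hours or 0))


def cutoff(now, text):
    """Return the start of the interval that ends at 'now'."""
    return now - parse_timeinterval(text)


now = datetime(2024, 3, 10, 12, 0)  # fixed stand-in for the current time
for arg in ("5", ":12", "1:6"):
    print(arg, cutoff(now, arg))
```

For an incremental backup, you would typically choose a timeinterval slightly longer than the time since your previous archive so that no modified file falls through the gap.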

-o

(lowercase, used only with -x) (default for all nonroot users) causes the extracted files to take on the user and group ID (UID, GID) of the person running HTAR, not those of the original archive. This makes a difference for root users but not for ordinary HTAR users.

-O

(uppercase, used only with -x, mimics the Linux TAR --to-stdout option) writes the file(s) extracted from an archive (with -x) to standard output (and hence to a UNIX pipe for postprocessing, if you wish). The HTAR Limitations section shows how to use this technique. Because HTAR does not separate files in the output stream, -O is usually useful only when you extract a single file.

-p

preserves all UNIX permission fields (on extracted files) in their original modes, ignoring the present UMASK (the default changes the permissions to the local UMASK where HTAR extracts the files). Root users can also preserve the setuid, setgid, and sticky bit permissions with this option.
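The difference between the default behavior and -p comes down to whether the local UMASK is applied to the stored permission bits. The following sketch assumes the conventional UNIX interpretation (stored mode with the umask bits cleared); extracted_mode is a hypothetical helper, and it ignores the setuid, setgid, and sticky bits that only root users can preserve.

```python
def extracted_mode(stored_mode, umask, preserve=False):
    """Model the permission bits of an extracted file for a nonroot user."""
    permission_bits = stored_mode & 0o777  # drop setuid/setgid/sticky bits
    if preserve:
        # -p: keep the original modes and ignore the present UMASK.
        return permission_bits
    # Default (assumed): clear every bit that the local UMASK masks out.
    return permission_bits & ~umask & 0o777


print(oct(extracted_mode(0o664, 0o027)))
print(oct(extracted_mode(0o664, 0o027, preserve=True)))
```

With a UMASK of 027, a file stored as 664 is extracted as 640 by default but keeps 664 when you add -p.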

-P

(used only with -c, has no effect otherwise) automatically creates all intermediate subdirectories specified on the archive file's pathname if they do not already exist. HTAR's -P thus works the same as MKDIR's -p option. You can use -P with archives created in HPSS (storage, the default) or on your local machine (with -E).

-q

(quiet mode) suppresses most HTAR informational messages, such as its usual interactive progress reports as it creates an archive file.

-S bufsize

(default is 16 Mbyte) specifies the buffer size to use when HTAR reads from or writes to an HPSS archive file. Here bufsize can be a plain integer (interpreted as bytes), an integer suffixed by k, K, kb, or KB for kilobytes, or an integer suffixed by m, M, mb, or MB for megabytes (e.g., 16mb). -S is intended mostly for LC staff, not ordinary HTAR users.
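The accepted bufsize forms can be illustrated with a small parser. This is a sketch, not HTAR's own code: parse_bufsize is a hypothetical helper, kilobytes and megabytes are assumed to be multiples of 1024, and the suffix is matched case-insensitively, which is slightly more lenient than the list of suffixes given above.

```python
import re

# Assumption: k/kb mean 1024 bytes and m/mb mean 1024 * 1024 bytes.
_UNITS = {"": 1, "k": 1024, "kb": 1024, "m": 1024 ** 2, "mb": 1024 ** 2}


def parse_bufsize(text):
    """Convert a -S argument such as '16mb' or '4096' into a byte count."""
    match = re.fullmatch(r"(\d+)([A-Za-z]*)", text.strip())
    if match is None:
        raise ValueError(f"not a valid buffer size: {text!r}")
    number, suffix = match.groups()
    try:
        factor = _UNITS[suffix.lower()]
    except KeyError:
        raise ValueError(f"unknown size suffix: {suffix!r}") from None
    return int(number) * factor


for arg in ("4096", "512K", "16mb"):
    print(arg, parse_bufsize(arg))
```

Under these assumptions the default of 16 Mbyte corresponds to 16777216 bytes.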

-T maxthreads

specifies the maximum number of threads that HTAR will use to copy member files to or from the archive file (default varies from 5 to 20 threads). This value is ignored when extracting member files from an archive (-x). HTAR reports the actual number of threads used on each run if you invoke -v or -V. HTAR creates a maxthreads pool of threads and then uses buffer size (see -S), average member file size, and HPSS network transfer rates to estimate how many threads to actually deploy. Normally, the smaller the member file size, the more threads can be active when creating files. For small files, setting -T to a larger number (up to 100 has been tested) can dramatically improve the transfer rates if the operating system is able to support the load.

-U

undeletes soft-deleted member files (see -D above) by copying the existing index file to a temporary local file, removing the deleted flag in the specified index entries along the way, and then rewriting the temporary index to the same location.

-V

requests "slightly verbose" reporting of file-transfer progress (often very brief, overwritten messages to the terminal). Do not use with -v.

-v

requests "very verbose" reporting of file-transfer progress. For each member file transferred to an archive, HTAR prints A (added) and its name on one line; for each member file extracted from an archive, HTAR prints X, its name, and its size on a line, along with a summary of the whole transfer at the end. For each file added during a build index (-X) operation, HTAR prints i and its name. For each file verified during a verify operation (-K), HTAR prints v (or V if comparing archive and local file contents), its name, and a trailing / if this is a directory. For each file that is soft-deleted during a delete (-D) operation, HTAR prints d; similarly, for an undelete (-U) operation, HTAR prints u. Do not use with -V.

-w

(works only with -x, -D, -U, not with -c) lists (one by one) each member file to be extracted from the archive and prompts you for your choice of confirmatory action, where possible responses are:

y[es]

extracts the named file.

n[o]

skips the named file.

a[ll]

extracts the named file and all remaining (not yet processed) selected files too.

q[uit]

skips the named file and stops prompting. HTAR ends.
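The effect of the four responses during an extract can be modeled as follows. The sketch is a simplified illustration: select_with_prompts is a hypothetical helper that takes the responses as a list instead of prompting, treats running out of responses as quit, and rejects any response other than the four listed (how HTAR itself handles other input is not specified here).

```python
def select_with_prompts(files, responses):
    """Return the member files that would be extracted for the given replies."""
    answers = iter(responses)
    selected = []
    for index, name in enumerate(files):
        answer = next(answers, "q")[:1].lower()  # no reply left: treat as quit
        if answer == "y":
            selected.append(name)  # extract this file
        elif answer == "n":
            continue  # skip this file
        elif answer == "a":
            selected.extend(files[index:])  # this file and all remaining ones
            break
        elif answer == "q":
            break  # skip this file and stop
        else:
            raise ValueError(f"unrecognized response for {name!r}")
    return selected


members = ["a.txt", "b.txt", "c.txt", "d.txt"]
print(select_with_prompts(members, ["y", "n", "all"]))
print(select_with_prompts(members, ["yes", "quit"]))
```

In the first call, a.txt is accepted, b.txt is skipped, and the "all" reply at c.txt selects c.txt and d.txt without further prompting.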

-Y auto | [archiveCOS][:indexCOS]

specifies the HPSS class of service (COS) for each stored archive and its corresponding index file. The default is AUTO, which causes HTAR to use a site-specific COS chosen for archive suitability (at LC, the default COS for HTAR files is 160, which automatically stores a single copy of each archive, regardless of its size). You can specify a nondefault COS for the archive, the index, or both (e.g., -Y 120:110), but this is usually undesirable except when testing new HPSS features or devices (if your archive size grows to exceed that allowed by a nondefault COS, HPSS will stop the transfer and HTAR will end with an error). Use -Y dualcopy to request dual-copy storage of any mission critical archive of any size for extra safety. Using -Y overrides the HTAR_COS environment variable. NFT's DIR command with the -h option reports the COS for stored files (in output column 3), while NFT's SETCOS command offers a different way to specify the storage class of service.