Consolidating and Compressing

Consolidating Many Files

If you need to transfer a large number of files, it is good practice to bundle them into a single tar file before the transfer. Moving one large file is much more efficient than moving many small files, although be aware that putting thousands of files into a tar file can itself take a long time. The following command creates one "tar" file from the contents of a directory, using the most common options:

c    create new archive
v    verbosely list the files
f    file name for the archive

$ tar cvf tarfile.tar dirname/

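As a minimal sketch of the full round trip, the following creates a small sample directory, archives it, lists the archive's contents, and extracts it into another location. The names demo_dir, demo.tar, and restore are placeholders invented for illustration. The t (list), x (extract), and -C (change directory) options are standard tar options that go beyond the three described above.

```shell
# Work in a scratch directory so nothing existing is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Build a small sample directory (placeholder names).
mkdir demo_dir
echo "first" > demo_dir/a.txt
echo "second" > demo_dir/b.txt

# c = create, f = archive file name (v is omitted to keep the output short).
tar cf demo.tar demo_dir/

# t = list the archive contents without extracting anything.
tar tf demo.tar

# x = extract; -C selects the directory to extract into.
mkdir restore
tar xf demo.tar -C restore
```

After the transfer, only the single file demo.tar needs to be moved, and the extract step on the receiving side restores the original directory layout.
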
Compressing Large Data

You might also consider performing compression on large individual files, thereby reducing the amount of data you have to transfer. This can be effective, but it is computationally expensive. Depending on the nature of your data, performing compression could actually make your overall transfer time longer if the compression does not significantly reduce the size of the file you will transfer.
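
One way to judge whether compression will pay off is to try it on a representative sample and compare sizes before transferring everything. The sketch below assumes gzip is installed and uses two generated placeholder files (text.dat and random.dat, names invented for illustration): repetitive text, which typically compresses well, and random bytes, which stand in for data that barely compresses at all.

```shell
# Work in a scratch directory so nothing existing is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Highly repetitive text (400000 bytes): expected to compress well.
awk 'BEGIN { for (i = 0; i < 20000; i++) print "sample line of text" }' > text.dat

# Random bytes (400000 bytes): a stand-in for poorly compressible data.
head -c 400000 /dev/urandom > random.dat

for f in text.dat random.dat; do
    # -c writes the compressed stream to stdout, so the original file is kept.
    gzip -c "$f" > "$f.gz"
    before=$(wc -c < "$f")
    after=$(wc -c < "$f.gz")
    echo "$f: $before bytes -> $after bytes"
done
```

If the compressed size is close to the original, as is expected for random.dat here, compression mainly adds CPU time without shortening the transfer.
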

To compress a file, you can use gzip or one of many other compression tools. Some techniques (such as delta compression) can be both fast and effective at reducing file size, depending on the nature of the data. To archive and compress files at the same time, add the "z" option to the "tar" command (for example, tar cvfz tarfile.tar.gz dirname/), which passes the archive through gzip. Please note that some binary data, particularly data that is already compressed, may not shrink enough to make this option worthwhile.
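
To illustrate the effect of the "z" option, this sketch archives the same directory with and without compression and prints both sizes. The directory and file names (demo_dir, data.txt, demo.tar, demo.tar.gz) are placeholders invented for illustration, and the sample data is deliberately repetitive, so the size reduction shown is likely larger than what real data would give.

```shell
# Work in a scratch directory so nothing existing is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Sample directory with one repetitive text file (placeholder names).
mkdir demo_dir
awk 'BEGIN { for (i = 0; i < 5000; i++) print "sample line of text" }' > demo_dir/data.txt

# Plain archive, kept only for the size comparison.
tar cf demo.tar demo_dir/

# z = pass the archive through gzip while creating it.
tar czf demo.tar.gz demo_dir/

# Print the size in bytes of each archive.
wc -c demo.tar demo.tar.gz

# t lists the contents; z tells tar the archive is gzip-compressed.
tar tzf demo.tar.gz
```
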

 
© Cornell University | Center for Advanced Computing