Wednesday, May 25, 2011

gzip

Compressing Multiple Files

You can concatenate multiple compressed files. When you do so, gunzip (or gzip -d) extracts all files in the compressed file as a single file. For example:
gzip -c file1  > catfiles.gz
gzip -c file2 >> catfiles.gz
After creating the compressed file catfiles.gz, the command:
gunzip -c catfiles.gz
is equivalent to
cat file1 file2
If one of the files in such a .gz file is damaged or corrupt, the other files can still be recovered (if the damaged or corrupt member is removed).
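You can check this equivalence yourself on scratch data. The following is a minimal sketch, assuming a POSIX shell with gzip, gunzip, and mktemp available; the scratch directory, the file contents, and the from_gzip.txt and from_cat.txt names are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Two small sample files (contents are arbitrary).
printf 'first file\n'  > file1
printf 'second file\n' > file2

# Build a two-member file by appending one gzip stream to another.
gzip -c file1  > catfiles.gz
gzip -c file2 >> catfiles.gz

# Decompress every member and compare with plain concatenation.
gunzip -c catfiles.gz > from_gzip.txt
cat file1 file2       > from_cat.txt
cmp from_gzip.txt from_cat.txt && echo "identical"
```

If cmp prints nothing before the final message, the two outputs match byte for byte.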
You can improve the level of compression achieved by compressing all the files at once rather than compressing them individually and then concatenating the results. For example:
cat file1 file2 | gzip > catfiles.gz
yields better compression than:
gzip -c file1 file2 > catfiles.gz
You can recompress concatenated files to get better compression with a command like:
gzip -cd old.gz | gzip > new.gz
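Both points above (one stream compresses better than separate members, and recompressing recovers the difference) can be tried on sample data. This is only a sketch: it assumes the seq utility is installed, and the inputs are deliberately identical and repetitive so the gap is easy to see. Real files will show a smaller or larger gain depending on how much they have in common.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Two files with the same repetitive content (sample data, an assumption).
seq 1 2000 > file1
seq 1 2000 > file2

# old.gz holds two independent members, one per input file.
gzip -c file1 file2 > old.gz

# Recompress: decompress every member, then compress the result as one stream.
gzip -cd old.gz | gzip > new.gz

# The decompressed data is unchanged ...
old_sum=$(gzip -cd old.gz | cksum)
new_sum=$(gzip -cd new.gz | cksum)

# ... but the single-stream file should be smaller for this input.
old_size=$(wc -c < old.gz | tr -d ' ')
new_size=$(wc -c < new.gz | tr -d ' ')
echo "old.gz: $old_size bytes, new.gz: $new_size bytes"
```

Comparing the checksums before replacing old.gz with new.gz is a cheap way to confirm that nothing was lost in the round trip.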
When a compressed file contains several individual files, the uncompressed size and CRC reported by the --list option are for the last member only. To get the uncompressed size for all members, use:
gzip -cd file.gz | wc -c
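Here is a small worked example of the size check, using two made-up files of 5 and 11 bytes. Treat it as a sketch: what --list prints for a multi-member file may depend on your gzip version, so that output is shown only for inspection, and the total is taken from the decompressed stream instead.

```shell
workdir=$(mktemp -d)
cd "$workdir"

printf 'aaaa\n'       > file1   # 5 bytes
printf 'bbbbbbbbbb\n' > file2   # 11 bytes

# Two members in one .gz file.
gzip -c file1  > file.gz
gzip -c file2 >> file.gz

# Shown for inspection only; compare its "uncompressed" column with the total below.
gzip --list file.gz

# Total uncompressed size across all members: 5 + 11 bytes.
total=$(gzip -cd file.gz | wc -c | tr -d ' ')
echo "total uncompressed bytes: $total"
```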
If you wish to create a single archive file with multiple members so that members can later be extracted independently, use an archiver such as tar or zip. GNU tar supports the -z option to invoke gzip transparently. gzip is designed as a complement to tar, not as a 
One other useful option is the -r flag, which tells gzip and gunzip to recursively compress or decompress all files in a given directory and any of its subdirectories. (Even with the -r flag, gzip still compresses one file at a time.) Here are some examples:
gzip -r somedir
Zip all files in the somedir directory.
gunzip -r somedir
Unzip all files in the somedir directory.
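To see what -r actually does to a directory tree, you can try it on a scratch copy first. The sketch below assumes a POSIX shell with find and mktemp; the file names a.txt and sub/b.txt are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# A small tree with one nested subdirectory (hypothetical names).
mkdir -p somedir/sub
printf 'top level\n' > somedir/a.txt
printf 'nested\n'    > somedir/sub/b.txt

# Compress every regular file under somedir, one file at a time.
gzip -r somedir
compressed=$(find somedir -type f | sort)
echo "$compressed"

# Restore the original files.
gunzip -r somedir
restored=$(find somedir -type f | sort)
echo "$restored"
```

Each file gets its own .gz file in place; the directory itself is never turned into a single archive.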

Handling Compressed Archives
It's common to apply gzip to a tar file, which is why you see files with names like something.tar.gz on Linux systems. When you want to extract the contents of a gzipped tar file, you have several choices. The first is to use gunzip followed by tar, like this:
gunzip something.tar.gz
tar xvf something.tar

Or you could do it all in one command, like this:
gunzip -c something.tar.gz | tar xvf -
The -c flag tells gunzip to decompress the file, but instead of creating a something.tar file, it pipes the decompressed data directly to the tar command. The tar command on the right side of the pipeline looks a little strange, too: instead of a file name after the xvf, there's just a dash. The dash tells tar that the input is not an actual file on disk, but rather a stream of data from the pipeline. (Note that the gunzip input file is not deleted when you use the -c flag.)
Here's a third method of extracting the contents of a compressed tar file that's even easier. Remember the z flag with the tar command? You can use it to decompress and unbundle a tar file, like this:
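If you'd like to try the pipeline without risking a real archive, here is a self-contained sketch. It assumes a POSIX shell with tar, gzip, and gunzip; the project directory and readme.txt are hypothetical stand-ins for your own files.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Build a small gzipped tar file to practice on (hypothetical contents).
mkdir project
printf 'hello\n' > project/readme.txt
tar cf something.tar project
gzip something.tar            # replaces something.tar with something.tar.gz
rm -r project

# Decompress to stdout and let tar read the archive from stdin ("-").
gunzip -c something.tar.gz | tar xvf -

# The extracted file is back, and the compressed archive is still there.
ls something.tar.gz project/readme.txt
```

No intermediate something.tar is left behind, which matters when the uncompressed archive is large.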
tar xvzf something.tar.gz
The end result is exactly the same (the files that were in the compressed tar file are now in your current directory), but this is much easier than issuing multiple commands or writing a messy-looking gunzip-tar pipeline.
Note that this command will work on all Linux systems, but the z flag for tar is not always available on other flavors of Unix. (However, you can download and compile the source code for the GNU version of the tar command. See the note near the beginning of this section about getting the source code for the GNU utilities.)
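The z flag works in the other direction as well. The round trip below is a sketch that assumes a tar supporting z and -C (GNU tar does); the project and restored directory names are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir"

mkdir project
printf 'hello\n' > project/readme.txt

# Create the compressed archive in one step (c = create, z = gzip).
tar czf something.tar.gz project

# List the contents without extracting (t = list).
listing=$(tar tzf something.tar.gz)
echo "$listing"

# Extract into a separate directory so the result is easy to inspect.
mkdir restored
tar xzf something.tar.gz -C restored
cat restored/project/readme.txt
```

Listing with t before extracting is a useful habit, since it shows whether the archive will unpack into its own directory or scatter files into the current one.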

