RED HAT ENTERPRISE LINUX
Archiving and Compressing Files
Archive, compress, unpack, and uncompress files using tar, gzip, and bzip2
CIS126RH | RHEL System Administration 1
Mesa Community College
Archiving and compression are essential daily skills for Linux administrators. You will use them to back up configuration files before making changes, transfer directory trees between systems, and package software for deployment. Understanding the difference between archiving and compression — and knowing which tools to combine — makes these tasks fast and reliable. These skills are tested on the RHCSA exam.
Learning Objectives
- Distinguish archiving from compression — Explain the role of tar, gzip, bzip2, and xz and when to use each
- Create and inspect tar archives — Use tar to bundle files, list contents, and extract archives
- Compress and decompress files — Use gzip and bzip2 directly on individual files
- Create and extract compressed archives — Combine tar with a compression flag to produce .tar.gz and .tar.bz2 files in a single command
Archiving vs Compression
These are two separate operations that are often combined but serve different purposes.
| Operation | What it does | Tools on RHEL |
|---|---|---|
| Archiving | Combines many files and directories into one file, preserving names, permissions, ownership, and directory structure | tar |
| Compression | Reduces the size of a file by encoding repeated patterns — no files are combined, only one file in and one file out | gzip, bzip2, xz |
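To make the distinction concrete, the sketch below runs the two operations separately and then in a single step. The scratch directory and the file names (workdir, demo, a.txt, b.txt) are made up for illustration.

```shell
# Scratch area with two small sample files (names are illustrative)
workdir=$(mktemp -d)
mkdir "$workdir/demo"
printf 'alpha\n' > "$workdir/demo/a.txt"
printf 'beta\n'  > "$workdir/demo/b.txt"

# Step 1, archiving: bundle the directory into one uncompressed file
tar -cf "$workdir/two-step.tar" -C "$workdir" demo

# Step 2, compression: gzip replaces two-step.tar with two-step.tar.gz
gzip "$workdir/two-step.tar"

# Both steps at once: tar -z archives and compresses together
tar -czf "$workdir/one-step.tar.gz" -C "$workdir" demo

# Either way, the member list is the same
tar -tzf "$workdir/two-step.tar.gz" > "$workdir/two-step.list"
tar -tzf "$workdir/one-step.tar.gz" > "$workdir/one-step.list"
diff "$workdir/two-step.list" "$workdir/one-step.list" && echo "same contents"
```

The one-step form is what you will normally type; the two-step form is shown only to make clear that tar and gzip do different jobs.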
Common File Extensions
| Extension | Meaning |
|---|---|
| .tar | tar archive, no compression |
| .tar.gz or .tgz | tar archive compressed with gzip |
| .tar.bz2 | tar archive compressed with bzip2 |
| .tar.xz | tar archive compressed with xz |
| .gz | Single file compressed with gzip, no archive |
| .bz2 | Single file compressed with bzip2, no archive |
tar Fundamentals
tar — tape archive — is the standard tool for bundling files into a
single archive. The archive file is called a tarball.
| Flag | Long form | Meaning |
|---|---|---|
| -c | --create | Create a new archive |
| -x | --extract | Extract files from an archive |
| -t | --list | List the contents of an archive |
| -f | --file | Specify the archive filename (always required) |
| -v | --verbose | Print each filename as it is processed |
| -z | --gzip | Compress or decompress with gzip |
| -j | --bzip2 | Compress or decompress with bzip2 |
| -J | --xz | Compress or decompress with xz |
| -C | --directory | Change to directory before operating |
| -p | --preserve-permissions | Restore original permissions on extract |
Creating tar Archives
Use the -c flag to create a new archive. Always specify the archive filename
with -f.
# Archive a single directory
$ tar -cvf backup.tar /etc/ssh
etc/ssh/
etc/ssh/sshd_config
etc/ssh/ssh_config
# Archive multiple directories
$ tar -cvf configs.tar /etc/ssh /etc/httpd
# Archive files in the current directory
$ tar -cvf project.tar .
# Exclude a subdirectory from the archive
$ tar -cvf home.tar --exclude=/home/student/Downloads /home/student
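When you archive an absolute path such as /etc/ssh, GNU tar on RHEL strips the
leading slash from each member name and prints a warning
(tar: Removing leading `/' from member names). This is the safe behaviour: members
extract relative to the current directory or the -C target instead of overwriting
the original files. Some other tar implementations store the leading slash.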
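The effect is easy to see on a scratch copy. This sketch archives the same directory once by absolute path and once with -C, then lists the stored member names. The directory and file names (conf, sample.conf) are illustrative only.

```shell
# Scratch tree standing in for a real config directory
src=$(mktemp -d)
mkdir -p "$src/conf"
printf 'Port 22\n' > "$src/conf/sample.conf"
out=$(mktemp -d)

# Absolute path: tar strips the leading slash and warns on stderr
tar -cf "$out/abs.tar" "$src/conf" 2> "$out/warning.txt"
tar -tf "$out/abs.tar"      # names are the full path minus its leading /

# -C changes directory first, so only the relative name is stored
tar -cf "$out/rel.tar" -C "$src" conf
tar -tf "$out/rel.tar"      # conf/ and conf/sample.conf
```

Storing short relative names with -C usually makes the archive easier to restore into a different location.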
Listing Archive Contents
Use the -t flag to inspect what is inside an archive without extracting it.
This is an essential step before extracting an unknown archive.
# List the contents of an uncompressed archive
$ tar -tvf backup.tar
drwxr-xr-x root/root 0 2026-05-25 etc/ssh/
-rw-r--r-- root/root 3905 2026-05-25 etc/ssh/sshd_config
-rw-r--r-- root/root 1770 2026-05-25 etc/ssh/ssh_config
# List the contents of a gzip-compressed archive
$ tar -tzvf backup.tar.gz
# List the contents of a bzip2-compressed archive
$ tar -tjvf backup.tar.bz2
# Let tar detect the compression automatically
$ tar -tvf backup.tar.bz2
Listing an archive first shows you where files will land, warns you about archives that contain absolute paths, and confirms the archive is not corrupted before you commit to extracting it.
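As a rough illustration of the corruption check, the sketch below lists a good archive and a deliberately truncated copy and reports on tar's exit status. The verify_archive helper and all file names are hypothetical, and the truncated file only simulates one kind of damage.

```shell
# Sample archive to inspect (illustrative names)
work=$(mktemp -d)
mkdir "$work/site"
printf 'hello\n' > "$work/site/index.html"
tar -czf "$work/good.tar.gz" -C "$work" site

# Simulate a damaged download by keeping only the first 20 bytes
head -c 20 "$work/good.tar.gz" > "$work/broken.tar.gz"

# verify_archive: list quietly; a non-zero exit status means tar could not read it
verify_archive() {
    if tar -tzf "$1" > /dev/null 2>&1; then
        echo "readable"
    else
        echo "damaged"
    fi
}

verify_archive "$work/good.tar.gz"     # prints: readable
verify_archive "$work/broken.tar.gz"   # prints: damaged
```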
Extracting tar Archives
Use the -x flag to extract files from an archive.
# Extract into the current directory
$ tar -xvf backup.tar
# Extract to a specific directory with -C
$ tar -xvf backup.tar -C /tmp/restore
# Extract only specific files from an archive
$ tar -xvf backup.tar etc/ssh/sshd_config
# Extract a gzip-compressed archive to a specific directory
$ tar -xzvf backup.tar.gz -C /tmp/restore
# Preserve original permissions on extraction (the default for root; as a regular user, -p ignores the umask)
$ tar -xpvf backup.tar -C /tmp/restore
The exam frequently asks you to extract an archive to a specific directory.
The -C flag is essential — the target directory must exist before
you run the command.
gzip: Compressing Individual Files
gzip compresses a single file and replaces it with a .gz version.
The original file is removed by default.
# Compress a file — replaces messages with messages.gz
$ gzip messages
$ ls
messages.gz
# Decompress — replaces messages.gz with messages
$ gunzip messages.gz
# Keep the original file while compressing
$ gzip -k messages
$ ls
messages messages.gz
# View a compressed file without decompressing it
$ zcat messages.gz
$ zless messages.gz
# Show compression ratio and file size
$ gzip -l messages.gz
compressed uncompressed ratio name
102400 512000 80.0% messages
bzip2: Higher Compression
bzip2 produces smaller files than gzip but takes longer to compress and
decompress. It operates the same way — one file in, one .bz2 file out.
# Compress a file
$ bzip2 messages
$ ls
messages.bz2
# Decompress — bunzip2 is an alias for bzip2 -d
$ bunzip2 messages.bz2
# Keep the original file while compressing
$ bzip2 -k messages
# View a bzip2-compressed file without decompressing
$ bzcat messages.bz2
$ bzless messages.bz2
| Tool | Speed | Compression ratio | Best for |
|---|---|---|---|
| gzip | Fast | Good | Logs, quick backups, streaming |
| bzip2 | Slower | Better | Distribution archives, large text files |
| xz | Slowest | Best | Software packages, maximum space savings |
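A quick way to check these trade-offs on your own data is to compress the same file with each tool and compare sizes. The sketch below assumes gzip, bzip2, and xz are all installed and uses a generated sample log; the file names and log text are invented, and real results vary with the input.

```shell
# Build a repetitive sample "log" of 20000 lines (illustrative data)
work=$(mktemp -d)
for i in $(seq 1 20000); do
    echo "May 25 10:00:00 servera sshd[$i]: Accepted publickey for student"
done > "$work/sample.log"

# -c writes to standard output, so the original file is left in place
gzip  -c "$work/sample.log" > "$work/sample.log.gz"
bzip2 -c "$work/sample.log" > "$work/sample.log.bz2"
xz    -c "$work/sample.log" > "$work/sample.log.xz"

# Compare sizes in bytes; results depend heavily on the data
wc -c "$work"/sample.log*
```

Prefix each compression command with time if you also want to compare speed.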
xz: Maximum Compression
xz achieves the highest compression ratio of the three tools but is the
slowest to compress. Older RHEL releases used it to compress RPM package payloads.
# Compress a file
$ xz messages
$ ls
messages.xz
# Decompress
$ unxz messages.xz
# Keep the original file
$ xz -k messages
# View without decompressing
$ xzcat messages.xz
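RPM package payloads were xz-compressed through RHEL 8; RHEL 9 packages use
zstd instead. Either way, when you extract an RPM with
rpm2cpio package.rpm | cpio -idv, the payload is decompressed behind the scenes.
(rpm -qf /path/to/file only queries the RPM database and decompresses nothing.)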
The RHCSA exam objective names gzip and bzip2 explicitly. Know all three —
xz archives appear frequently in the real world and on exam systems.
Creating Compressed Archives
Adding a compression flag to tar -c creates and compresses in one step.
# Create a gzip-compressed archive (.tar.gz)
$ tar -czvf ssh-backup.tar.gz /etc/ssh
etc/ssh/
etc/ssh/sshd_config
etc/ssh/ssh_config
# Create a bzip2-compressed archive (.tar.bz2)
$ tar -cjvf ssh-backup.tar.bz2 /etc/ssh
# Create an xz-compressed archive (.tar.xz)
$ tar -cJvf ssh-backup.tar.xz /etc/ssh
# Archive multiple sources into one compressed file
$ tar -czvf configs.tar.gz /etc/ssh /etc/httpd /etc/firewalld
# Use a datestamp in the filename for rotation
$ tar -czvf backup-$(date +%F).tar.gz /etc
Use the full extension (.tar.gz not just .tgz) and include
a date stamp in backup filenames. This makes it immediately obvious what format the
file is and when it was created.
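For reference, this sketch shows what the $(date +%F) substitution expands to and how it ends up in the filename. It archives a scratch directory rather than /etc so it can be run without root; the names etc-sample and stamp are illustrative.

```shell
# date +%F prints the current date as YYYY-MM-DD, which sorts chronologically
stamp=$(date +%F)
echo "$stamp"

# Build the backup name from the stamp (scratch data stands in for /etc)
work=$(mktemp -d)
mkdir "$work/etc-sample"
printf 'PermitRootLogin no\n' > "$work/etc-sample/sshd_config"
archive="$work/backup-$stamp.tar.gz"
tar -czf "$archive" -C "$work" etc-sample

# The listing shows the dated archive next to the source directory
ls "$work"
```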
Extracting Compressed Archives
Adding a compression flag to tar -x decompresses and extracts in one step.
# Extract a gzip-compressed archive here
$ tar -xzvf ssh-backup.tar.gz
# Extract a bzip2-compressed archive to a specific directory
$ tar -xjvf ssh-backup.tar.bz2 -C /tmp/restore
# Let tar detect the compression automatically (RHEL / GNU tar)
$ tar -xvf ssh-backup.tar.gz -C /tmp/restore
# List contents before extracting — always a good habit
$ tar -tzvf ssh-backup.tar.gz
$ tar -xzvf ssh-backup.tar.gz -C /tmp/restore
tar will not create the directory specified with -C.
Create it first: mkdir -p /tmp/restore
On the exam, read the task carefully — it will specify where to extract.
Use -C /path/to/destination and confirm the directory exists first.
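The sequence can be rehearsed safely on scratch data. This sketch shows the failure when the destination is missing, then the working order: create, extract, confirm. Paths and file names here are placeholders, not the ones an exam task would give you.

```shell
# Scratch archive standing in for the one named in the task
work=$(mktemp -d)
mkdir "$work/ssh"
printf 'Port 22\n' > "$work/ssh/sshd_config"
tar -czf "$work/ssh-backup.tar.gz" -C "$work" ssh

dest="$work/restore"

# Without the directory, tar -C fails and extracts nothing
tar -xzf "$work/ssh-backup.tar.gz" -C "$dest" 2>/dev/null \
    || echo "extract failed: $dest is missing"

# Create the destination, then extract and confirm
mkdir -p "$dest"
tar -xzf "$work/ssh-backup.tar.gz" -C "$dest"
ls -R "$dest"
```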
tar Quick Reference
| Task | Command |
|---|---|
| Create an uncompressed archive | tar -cvf archive.tar /path |
| Create a gzip-compressed archive | tar -czvf archive.tar.gz /path |
| Create a bzip2-compressed archive | tar -cjvf archive.tar.bz2 /path |
| Create an xz-compressed archive | tar -cJvf archive.tar.xz /path |
| List archive contents | tar -tvf archive.tar |
| List compressed archive contents | tar -tzvf archive.tar.gz |
| Extract to current directory | tar -xvf archive.tar |
| Extract to a specific directory | tar -xvf archive.tar -C /dest |
| Extract compressed to a directory | tar -xzvf archive.tar.gz -C /dest |
| Extract one file from an archive | tar -xvf archive.tar etc/ssh/sshd_config |
The three operations are create, extract, and list. Always add f with the filename. Add v to see progress. Add z, j, or J for compression.
Working with Compressed Log Files
Linux log rotation compresses old log files automatically. These utilities let you read compressed logs without decompressing them first.
| Compressed format | View with cat | Page through with less | Search with grep |
|---|---|---|---|
| .gz | zcat | zless | zgrep |
| .bz2 | bzcat | bzless | bzgrep |
| .xz | xzcat | xzless | xzgrep |
# Search for errors across all rotated log files
$ zgrep -i error /var/log/messages-*.gz
# Page through a compressed log
$ zless /var/log/messages-20260518.gz
Practical Admin Scenarios
Back Up Config Files Before Editing
$ tar -czvf /root/ssh-before-$(date +%F).tar.gz /etc/ssh
Transfer a Directory Tree to Another Server
# -C /var/www stores paths relative to /var/www (html/...), so they extract back into place
$ tar -czvf /tmp/webapp.tar.gz -C /var/www html
$ scp /tmp/webapp.tar.gz student@serverb:/tmp/
$ ssh student@serverb 'tar -xzvf /tmp/webapp.tar.gz -C /var/www'
Restore a Specific File from a Backup
# List to find the exact path in the archive
$ tar -tzvf ssh-backup.tar.gz
# Extract just that one file
$ tar -xzvf ssh-backup.tar.gz -C /tmp etc/ssh/sshd_config
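It helps to see where a single extracted member ends up. The sketch below uses a scratch tree that mimics etc/ssh (all paths are stand-ins) and shows that the stored directory structure is recreated under the -C target.

```shell
# Scratch tree and archive that mimic the etc/ssh layout
work=$(mktemp -d)
mkdir -p "$work/src/etc/ssh" "$work/restore"
printf 'Port 22\n' > "$work/src/etc/ssh/sshd_config"
printf 'Host *\n'  > "$work/src/etc/ssh/ssh_config"
tar -czf "$work/ssh-backup.tar.gz" -C "$work/src" etc

# Extract one member; the name must match the listing exactly
tar -xzf "$work/ssh-backup.tar.gz" -C "$work/restore" etc/ssh/sshd_config

# Only that file appears, at restore/etc/ssh/sshd_config
find "$work/restore" -type f
```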
Compress a Large Log Before Archiving
$ gzip -k /var/log/messages
$ mv /var/log/messages.gz /mnt/archive/
Common Mistakes
| Mistake | What goes wrong | Correct approach |
|---|---|---|
| Forgetting -f, or not placing f last in the flag group | Without -f, tar uses its default archive (typically standard input or output) instead of your file; in a group such as -cfv, tar takes the characters after f as the archive name and usually fails with a confusing error | Put f last in the flag group, followed immediately by the archive name |
| Extracting without -C | Files land in the current directory, possibly overwriting existing files | List first, then extract with -C /destination |
| Target directory does not exist | tar exits with an error and no files are extracted | Run mkdir -p /destination first |
| Compressing an already-compressed file | Little or no space is saved and the file may even grow slightly, because compressed data has few repeated patterns left | Do not gzip a .tar.gz or .jpg; it will not help |
| Wrong compression flag for the format | tar reports "not in gzip format" or similar | Match the flag to the extension: -z for .gz, -j for .bz2, -J for .xz |
| Using gzip on a directory | gzip only compresses single files and cannot bundle a directory | Use tar -czvf to archive and compress together |
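The point about already-compressed data can be checked directly. In this sketch, random bytes stand in for compressed content such as a .jpg or .tar.gz (an approximation, not a real image), and a file of repeated text is compressed for contrast. File names are illustrative.

```shell
# 100000 bytes of random data behaves like already-compressed content
work=$(mktemp -d)
head -c 100000 /dev/urandom > "$work/random.bin"
gzip -k "$work/random.bin"

# Repetitive text, by contrast, compresses very well
for i in $(seq 1 5000); do
    echo 'repeated log line'
done > "$work/text.txt"
gzip -k "$work/text.txt"

# Compare the byte counts of each original and its .gz
wc -c "$work"/random.bin "$work"/random.bin.gz "$work"/text.txt "$work"/text.txt.gz
```

Expect random.bin.gz to be about the same size as random.bin, while text.txt.gz should be a small fraction of text.txt.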
Knowledge Check
Answer these before moving to the next slide.
- What is the difference between archiving and compression? Which tool on RHEL handles each?
- Write the command to create a gzip-compressed archive of /etc/httpd named httpd-backup.tar.gz.
- Write the command to list the contents of httpd-backup.tar.gz without extracting it.
- Write the command to extract httpd-backup.tar.gz into /tmp/restore. What must you do before running the command?
- What is the difference between gzip and bzip2? When would you prefer one over the other?
- You want to search for the word "error" in a rotated, gzip-compressed log file without decompressing it. What command do you use?
Knowledge Check — Answers
- Archiving combines multiple files into one, preserving directory structure and metadata; it is handled by tar. Compression reduces the size of a single file; it is handled by gzip, bzip2, and xz.
- tar -czvf httpd-backup.tar.gz /etc/httpd
- tar -tzvf httpd-backup.tar.gz
- tar -xzvf httpd-backup.tar.gz -C /tmp/restore
  Before running this command, create the destination directory: mkdir -p /tmp/restore
- gzip is faster but compresses less. bzip2 is slower but produces smaller files. Prefer gzip for speed-sensitive tasks like log compression and interactive backups. Prefer bzip2 when file size matters more than time, such as distributing large archives.
- zgrep -i error /var/log/messages-20260518.gz
  Or to search all rotated copies: zgrep -i error /var/log/messages*.gz
Key Takeaways
- Archiving and compression are separate operations. tar bundles files; gzip, bzip2, and xz reduce size. Combine them with tar's -z, -j, and -J flags to create compressed archives in one command.
- The three tar operations are create, extract, and list. Always add -f with the archive filename. Add -v to see progress. Add a compression flag to match the archive format.
- List before you extract. Use tar -tvf archive to confirm contents and paths before extracting. Use -C /destination to extract to a specific directory, and create that directory first with mkdir -p.
- Use the right tool for the format. gzip: fast, good compression, .gz. bzip2: slower, better compression, .bz2. xz: slowest, best compression, .xz. Use zcat, bzcat, and xzcat to read compressed files without decompressing.
Graded Lab
- Create a gzip-compressed tar archive of /etc/ssh named ssh-backup.tar.gz in /tmp
- List the contents of ssh-backup.tar.gz without extracting it and note the exact path stored for sshd_config
- Create /tmp/restore and extract ssh-backup.tar.gz into it, then confirm the files are present
- Extract only the sshd_config file from the archive into /tmp/single
- Create a bzip2-compressed archive of the same directory and compare file sizes: which compressor produced the smaller archive?
- Compress a copy of /var/log/messages with gzip -k and use zgrep to search the compressed file for the word "error"
"Archive, compress, unpack, and uncompress files using tar, gzip, and bzip2." — Create and extract compressed archives to and from specific locations.