grep and Regular Expressions
Analyzing Text on the Command Line
CIS126RH | RHEL System Administration 1
Mesa Community College
One of the most powerful skills for a Linux administrator is the ability to search and filter text efficiently.
grep combined with regular expressions lets you find exactly what you need in log files, configuration files,
and command output — without opening a text editor. These skills are tested on the RHCSA exam and used every day in production.
Learning Objectives
What is grep?
grep — Global Regular Expression Print
Reads lines of text and prints any line that matches a pattern. Works on files or standard input piped from another command.
grep [OPTIONS] PATTERN [FILE...]
0 = at least one match found
1 = no match
2 = error (bad option, file not found)
Exit codes make grep scriptable.
# Search /etc/passwd for the word 'root'
[student@rhel ~]$ grep root /etc/passwd
root:x:0:0:root:/root:/bin/bash
# Case-insensitive search
[student@rhel ~]$ grep -i ROOT /etc/passwd
root:x:0:0:root:/root:/bin/bash
# Search standard input from a pipe
[student@rhel ~]$ ps aux | grep httpd
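The exit codes listed above can drive a shell conditional. The following is a minimal sketch, not taken from the original slides: it assumes a temporary sample file standing in for /etc/passwd, and it uses the -q (--quiet) option, which suppresses output so that only the exit status matters.

```shell
# Hypothetical sample data standing in for /etc/passwd
sample=$(mktemp)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'student:x:1000:1000::/home/student:/bin/bash' > "$sample"

# -q prints nothing; the if statement branches on the exit status alone
if grep -q '^student:' "$sample"; then
    found_student=yes
else
    found_student=no
fi

# Record the exit status for a pattern that does not occur in the file
grep -q '^nobody:' "$sample" && missing_status=0 || missing_status=$?

echo "student found: $found_student"
echo "exit status for missing user: $missing_status"
rm -f "$sample"
```

The same pattern works with real files, for example to create an account only when grep reports that it does not exist yet.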
Common grep Options
| Option | Long form | Effect | Example |
|---|---|---|---|
| -i | --ignore-case | Case-insensitive match | grep -i error syslog |
| -v | --invert-match | Show non-matching lines | grep -v '^#' httpd.conf |
| -n | --line-number | Prefix each match with its line number | grep -n failed auth.log |
| -c | --count | Print count of matching lines only | grep -c FAILED secure |
| -l | --files-with-matches | List filenames that contain a match | grep -l root /etc/*.conf |
| -r | --recursive | Search all files under a directory | grep -r sshd /etc/ |
| -E | --extended-regexp | Enable extended regular expressions | grep -E 'go+gle' file |
| -o | --only-matching | Print only the matched portion | grep -oE '[0-9]+' data |
| -A n | --after-context=n | Show n lines after each match | grep -A 3 error syslog |
| -B n | --before-context=n | Show n lines before each match | grep -B 2 error syslog |
Practical Examples
Filter config files
# Show only active (non-comment) lines
[student@rhel ~]$ grep -v '^#' \
/etc/ssh/sshd_config
Count failed login attempts
[student@rhel ~]$ grep -c 'Failed password' \
/var/log/secure
Find which files reference a user
[student@rhel ~]$ grep -rl 'student' /etc/
Show context around errors
[student@rhel ~]$ grep -i -A 3 \
'error' /var/log/messages
Filter live command output
# Listening SSH and web ports (22, 80, 443 only, not 2200 or 8080)
[student@rhel ~]$ ss -tlnp | \
grep -E ':(22|80|443)[[:space:]]'
# Running services only
[student@rhel ~]$ systemctl list-units \
| grep running
Combine -i and -n to find where errors appear in large log files: grep -in error /var/log/messages
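To see what the combined -in options print, the sketch below runs them against a small, made-up log file (the file contents are an assumption for illustration, not real /var/log/messages data). Each matching line is prefixed with its line number and a colon.

```shell
# Hypothetical four-line log used only for this demonstration
log=$(mktemp)
cat > "$log" <<'EOF'
Jan 10 10:00:01 host app: started
Jan 10 10:00:02 host app: ERROR disk full
Jan 10 10:00:03 host app: retrying
Jan 10 10:00:04 host app: error timeout
EOF

# -i ignores case, -n prefixes each match with "LINE:"
numbered=$(grep -in error "$log")

# -c reports how many lines matched instead of printing them
count=$(grep -ic error "$log")

# Keep only the line-number prefixes to show where the matches are
line_numbers=$(printf '%s\n' "$numbered" | cut -d: -f1 | paste -sd' ' -)

printf '%s\n' "$numbered"
echo "matching lines: $count (at lines $line_numbers)"
rm -f "$log"
```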
Introduction to Regular Expressions
A regular expression (regex) is a pattern that describes a set of strings. grep uses them to decide which lines match.
Two standards used with grep on RHEL:
- BRE: Basic Regular Expressions (grep default)
- ERE: Extended Regular Expressions (grep -E)
In ERE, the operators +, ?, |, and ( ) are special without backslashes. In BRE, as implemented by GNU grep on RHEL, the same operators are written \+, \?, \|, and \( \).
| Metacharacter | Meaning |
|---|---|
| . | Any single character |
| * | Zero or more of preceding |
| ^ | Anchor: start of line |
| $ | Anchor: end of line |
| [ ] | Character class |
| [^ ] | Negated character class |
| \ | Escape the next metacharacter |
Anchors — ^ and $
Anchors match a position in the line, not a character. They do not consume any text.
| Pattern | Matches lines that… |
|---|---|
| ^root | Start with root |
| bash$ | End with bash |
| ^# | Start with # (comment lines) |
| ^$ | Are empty (contain no characters at all) |
| ^[^#] | Start with any character other than # (empty lines are excluded too) |
Filtering config files to show only active settings — removing comment and blank lines — is a very common exam task.
# Lines starting with 'root'
[student@rhel ~]$ grep '^root' /etc/passwd
root:x:0:0:root:/root:/bin/bash
# Lines ending with '/bin/bash'
[student@rhel ~]$ grep '/bin/bash$' /etc/passwd
# Remove comments AND blank lines (BRE)
[student@rhel ~]$ grep -v '^#\|^$' \
/etc/ssh/sshd_config
# Same with ERE (cleaner syntax)
[student@rhel ~]$ grep -vE '^(#|$)' \
/etc/ssh/sshd_config
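The comment-and-blank-line filter above can be tried safely on a throwaway file. This sketch assumes a small made-up configuration (not the real sshd_config) and also shows that the positive pattern ^[^#] selects the same lines, because an empty line has no first character to match.

```shell
# Hypothetical config file with comments, a blank line, and two settings
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Sample config
Port 22

#PermitRootLogin yes
PasswordAuthentication no
EOF

# Inverted match: drop lines that start with # or are empty
active=$(grep -vE '^(#|$)' "$conf")

# Positive match: keep lines whose first character is not #
active_positive=$(grep '^[^#]' "$conf")

printf '%s\n' "$active"
rm -f "$conf"
```

Neither pattern removes comments that are indented with spaces or tabs; a pattern such as '^[[:space:]]*(#|$)' with -vE would be one way to cover those as well.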
Character Classes [ ]
Literal ranges
| Pattern | Matches |
|---|---|
| [aeiou] | Any single vowel |
| [A-Z] | Any uppercase letter |
| [0-9] | Any digit |
| [a-zA-Z0-9] | Any alphanumeric character |
| [^0-9] | Any non-digit character |
POSIX named classes
| Class | Equivalent to |
|---|---|
| [:alpha:] | [a-zA-Z] |
| [:digit:] | [0-9] |
| [:alnum:] | letters and digits |
| [:space:] | whitespace characters |
| [:upper:] | uppercase letters only |
# Lines containing any digit
[student@rhel ~]$ grep '[0-9]' /etc/passwd
# Lowercase-only username whose UID begins with 1
[student@rhel ~]$ grep '^[a-z]*:x:1' /etc/passwd
# POSIX class — locale-safe digit match
[student@rhel ~]$ grep '[[:digit:]]' data.txt
# Lines with whitespace (tabs or spaces)
[student@rhel ~]$ grep '[[:space:]]' file.txt
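POSIX named classes such as [[:digit:]] and [[:alpha:]] behave predictably in every locale. Ranges such as [a-z] depend on the locale's collation order and can match unexpected characters on systems with non-ASCII locales. Named classes are only valid inside a bracket expression, which is why they are written with double brackets.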
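As a quick check of the named classes, the following sketch counts matching lines in a small made-up data file (the contents are an assumption for illustration). Note that each class name sits inside an outer pair of brackets.

```shell
# Hypothetical data: a plain word, a word with a space and digit,
# a line starting with a tab, and a line made only of digits
data=$(mktemp)
printf 'alpha\nbeta 2\n\tgamma\n42\n' > "$data"

# Lines containing at least one digit: "beta 2" and "42"
with_digit=$(grep -c '[[:digit:]]' "$data")

# Lines containing a space or a tab: "beta 2" and the tab-indented line
with_space=$(grep -c '[[:space:]]' "$data")

# Lines consisting only of digits, from start (^) to end ($): "42"
digits_only=$(grep -c '^[[:digit:]][[:digit:]]*$' "$data")

echo "digit: $with_digit  whitespace: $with_space  digits only: $digits_only"
rm -f "$data"
```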
Quantifiers
Quantifiers follow an element and control how many times it must match.
| BRE (default grep) | ERE (grep -E) | Meaning | ERE Example |
|---|---|---|---|
| * | * | Zero or more | go*gle → ggle, gogle, google |
| \+ | + | One or more | go+gle → gogle, google (not ggle) |
| \? | ? | Zero or one (optional) | colou?r → color or colour |
| \{n\} | {n} | Exactly n times | [0-9]{4} → any 4-digit number |
| \{n,\} | {n,} | n or more times | [a-z]{3,} → 3+ lowercase letters |
| \{n,m\} | {n,m} | Between n and m times | [0-9]{1,3} → 1 to 3 digits |
# Match lines with two or more consecutive spaces (ERE)
[student@rhel ~]$ grep -E ' {2,}' config.txt
# Match port numbers of exactly four digits in ss output
# (the trailing whitespace class stops 5-digit ports from matching)
[student@rhel ~]$ ss -tlnp | grep -E ':[0-9]{4}[[:space:]]'
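The difference between the quantifiers is easiest to see with -o, which prints only the matched text. This is a small sketch using made-up input; the words come from the table above and the second line is an assumption for illustration.

```shell
# Hypothetical two-line input
text=$(mktemp)
printf '%s\n' 'ggle gogle google' 'port 22 and port 8080' > "$text"

# + requires at least one "o", so "ggle" is skipped
one_or_more=$(grep -oE 'go+gle' "$text" | paste -sd' ' -)

# * also allows zero "o" characters, so "ggle" is included
zero_or_more=$(grep -oE 'go*gle' "$text" | paste -sd' ' -)

# {4} needs four digits in a row, so "22" does not match
four_digits=$(grep -oE '[0-9]{4}' "$text")

echo "go+gle   -> $one_or_more"
echo "go*gle   -> $zero_or_more"
echo "[0-9]{4} -> $four_digits"
rm -f "$text"
```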
The Dot . and Escaping \
Dot matches any single character
# 'c', any character, 't'
[student@rhel ~]$ grep 'c.t' words.txt
cat cut cot c3t c_t
# Lines that are exactly three characters long
[student@rhel ~]$ grep '^...$' words.txt
The pattern 192.168.1.1 also matches 192X168Y1Z1 because . is any character. Use 192\.168\.1\.1 to match the literal dots in an IP address.
Escaping makes metacharacters literal
| Pattern | Matches literally |
|---|---|
| \. | A period / dot |
| \* | An asterisk |
| \[ | A left bracket |
| \$ | A dollar sign |
# Match anything shaped like an IPv4 address, with literal dots (ERE)
[student@rhel ~]$ grep -E \
'[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' \
/var/log/secure
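The effect of escaping the dot can be measured by counting matches. The sketch below uses a made-up host list (an assumption for illustration). It also uses two options beyond the escaping shown above: -F for fixed-string matching and -w (--word-regexp), which requires the match to stand as a whole word.

```shell
# Hypothetical host list, including a line that should NOT match an IP search
hosts=$(mktemp)
printf '%s\n' \
  '192.168.1.1 gateway' \
  '192X168Y1Z1 bogus' \
  '192.168.1.10 printer' > "$hosts"

# Unescaped dots match any character, so all three lines match
loose=$(grep -c '192.168.1.1' "$hosts")

# Escaped dots are literal; the bogus line drops out, but .10 still matches
escaped=$(grep -c '192\.168\.1\.1' "$hosts")

# -F treats the whole pattern as literal text, same result as escaping
fixed=$(grep -cF '192.168.1.1' "$hosts")

# -w rejects the match inside 192.168.1.10 because a digit follows it
exact=$(grep -cwF '192.168.1.1' "$hosts")

echo "loose=$loose escaped=$escaped fixed=$fixed exact=$exact"
rm -f "$hosts"
```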
ERE: Alternation and Grouping
Alternation with |
Match one or another pattern. Use grep -E (or write \| in BRE).
# Match 'Failed' or 'Rejected' in logs
[student@rhel ~]$ grep -E \
'Failed|Rejected' /var/log/secure
# Case-insensitive with alternation
[student@rhel ~]$ grep -iE \
'error|warning|critical' \
/var/log/messages
Grouping with ( )
Group sub-expressions to apply quantifiers to a whole group.
# 'ha' repeated one or more times
[student@rhel ~]$ grep -E '(ha)+' file.txt
ha haha hahaha
# Optional protocol prefix (ERE)
[student@rhel ~]$ grep -E \
'(https?://)?example\.com' access.log
In BRE without -E, use \| for alternation and \( \) for grouping.
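The alternation and grouping patterns above can be compared side by side on sample data. This sketch assumes a small made-up log (not real /var/log/secure or access.log content) and runs the same alternation in both ERE and BRE form.

```shell
# Hypothetical log lines for illustration only
log=$(mktemp)
cat > "$log" <<'EOF'
Failed password for root
Accepted password for student
Rejected connection
visit http://example.com
visit https://example.com
visit example.com
EOF

# ERE alternation: lines containing Failed or Rejected
ere_count=$(grep -cE 'Failed|Rejected' "$log")

# The same alternation in BRE needs a backslash before the pipe
bre_count=$(grep -c 'Failed\|Rejected' "$log")

# Grouping makes the whole protocol prefix optional; -o shows what matched
urls=$(grep -oE '(https?://)?example\.com' "$log")

echo "ERE: $ere_count  BRE: $bre_count"
printf '%s\n' "$urls"
rm -f "$log"
```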
BRE vs ERE — Quick Reference
| Feature | BRE — grep (default) | ERE — grep -E |
|---|---|---|
| One or more | \+ | + |
| Zero or one | \? | ? |
| Alternation | \| | | |
| Grouping | \( \) | ( ) |
| Interval — exact | \{n\} | {n} |
| Interval — range | \{n,m\} | {n,m} |
| Zero or more | * | * |
| Any character | . | . |
| Line anchors | ^ $ | ^ $ |
| Character class | [abc] | [abc] |
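When in doubt, use grep -E. ERE offers everything BRE does with cleaner, less error-prone syntax. The two are not interchangeable, though: +, ?, |, ( ), and { } are literal characters in BRE but operators in ERE, and a backslash flips their meaning. A pattern written for one syntax may match different text under the other.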
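A short experiment makes the quick-reference table concrete: the same pattern text is read differently depending on whether -E is given. The sketch assumes GNU grep (as shipped on RHEL) and a made-up two-line file.

```shell
# Hypothetical input: one line with a literal plus sign, one with repeated "a"
sample=$(mktemp)
printf '%s\n' 'a+b' 'aab' > "$sample"

# BRE: a bare + is a literal character, so only "a+b" matches
bre_plain=$(grep 'a+b' "$sample")

# ERE: + means "one or more a", so only "aab" matches
ere_plain=$(grep -E 'a+b' "$sample")

# BRE with a backslash: \+ is the quantifier (GNU extension), matching "aab"
bre_escaped=$(grep 'a\+b' "$sample")

# ERE with a backslash: \+ is a literal plus sign, matching "a+b"
ere_escaped=$(grep -E 'a\+b' "$sample")

echo "BRE a+b  -> $bre_plain"
echo "ERE a+b  -> $ere_plain"
echo "BRE a\\+b -> $bre_escaped"
echo "ERE a\\+b -> $ere_escaped"
rm -f "$sample"
```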
grep in Pipelines
grep reads standard input when no file is given, making it ideal for filtering the output of other commands.
Filtering command output
# Firewall services in effect
[student@rhel ~]$ firewall-cmd --list-all \
| grep services
# Processes owned by the apache user
[student@rhel ~]$ ps aux | grep '^apache'
# Installed packages with 'http' in name
[student@rhel ~]$ rpm -qa | grep -i http
Chaining multiple greps
# Errors but NOT debug-level messages
[student@rhel ~]$ grep -i error /var/log/messages \
| grep -iv debug
# Top IPs with failed SSH logins
[student@rhel ~]$ grep 'Failed password' \
/var/log/secure \
| grep -oE '[0-9]{1,3}(\.[0-9]{1,3}){3}' \
| sort | uniq -c | sort -rn | head
Practice reading multi-stage pipelines aloud. The exam may test comprehension of a given pipeline, not just authoring one from scratch.
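One way to practice reading the failed-login pipeline is to run it stage by stage against a file you control. The sketch below assumes a made-up log in the style of /var/log/secure (addresses are from documentation ranges) and adds a final awk step, which is not in the pipeline above, only to trim the padding that uniq -c puts in front of each count.

```shell
# Hypothetical log entries; only three are failed logins
secure=$(mktemp)
cat > "$secure" <<'EOF'
Mar  1 10:00:01 host sshd[101]: Failed password for root from 203.0.113.5 port 4242 ssh2
Mar  1 10:00:02 host sshd[102]: Failed password for invalid user bob from 198.51.100.7 port 5151 ssh2
Mar  1 10:00:03 host sshd[103]: Accepted password for student from 192.0.2.10 port 6161 ssh2
Mar  1 10:00:04 host sshd[104]: Failed password for root from 203.0.113.5 port 4343 ssh2
EOF

# Stage 1: keep failed logins. Stage 2: extract only the IPv4 address.
# Stage 3: sort so duplicates are adjacent. Stage 4: count each address.
# Stage 5: order by count, highest first. Stage 6: keep the top entry.
top=$(grep 'Failed password' "$secure" \
  | grep -oE '[0-9]{1,3}(\.[0-9]{1,3}){3}' \
  | sort | uniq -c | sort -rn | head -n 1 \
  | awk '{print $1, $2}')

echo "most frequent source: $top"
rm -f "$secure"
```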
Real-World Scenarios
# Who has sudo rights?
grep -v '^#' /etc/sudoers \
| grep ALL
Identify privileged accounts without reading every line of the sudoers file.
# 404 errors logged during the current hour
# (assumes the default Apache timestamp, e.g. 12/Mar/2025:14:)
grep -E ' 404 ' \
/var/log/httpd/access_log \
| grep "$(date '+%d/%b/%Y:%H:')"
Isolate HTTP errors from millions of log entries in seconds.
# Active sshd settings only
grep -vE '^(#|$)' \
/etc/ssh/sshd_config
See only effective configuration — every comment and blank line stripped.
# Users with an interactive shell
grep -v '/nologin\|/false' /etc/passwd
# Filesystems at 80% or more used
df -h | grep -E '[89][0-9]%|100%'
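The disk-usage pattern can be verified without filling a real filesystem. This sketch feeds it made-up df-style output (an assumption for illustration) and uses awk, which is not part of the scenario above, to print only the mount point column.

```shell
# Hypothetical df -h output
df_sample=$(mktemp)
cat > "$df_sample" <<'EOF'
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        50G   10G   40G  20% /
/dev/sda2       100G   85G   15G  85% /var
/dev/sdb1       200G  200G     0 100% /data
/dev/sdc1        10G  790M  9.2G   8% /boot
EOF

# [89][0-9]% needs two digits, so 8% is ignored while 85% and 100% match
full=$(grep -E '[89][0-9]%|100%' "$df_sample" | awk '{print $6}' | paste -sd' ' -)

echo "filesystems at 80% or more: $full"
rm -f "$df_sample"
```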
grep Variants
| Command | Regex type | Notes |
|---|---|---|
| grep | BRE | Default; use \+, \?, \|, \( \) |
| grep -E | ERE | Preferred for complex patterns; cleaner syntax |
| egrep | ERE | Deprecated alias for grep -E; avoid in scripts |
| grep -F | Fixed string | No regex processing; fastest; safest for literal text |
| fgrep | Fixed string | Deprecated alias for grep -F |
| grep -P | PCRE | Perl-compatible; powerful but not POSIX; avoid on exam |
Use grep and grep -E on the exam. The deprecated aliases egrep and fgrep still work on RHEL 9 but should not be used in new scripts. grep -P (PCRE) is not required for RHCSA.
Use grep -F when searching for a literal string that contains regex metacharacters, for example a literal IP address like 192.168.1.1 or a dollar sign. It is faster and avoids accidental regex interpretation.
Finding the Right Approach
I know the text I want to find, no special characters:
→ grep 'pattern' file — plain grep, no flags needed
I want to find lines that do NOT match:
→ grep -v 'pattern' file — invert match
My pattern needs + ? | or ( ) without backslashes:
→ grep -E 'pattern' file — extended regex
I am searching for a literal string with dots, asterisks, or brackets:
→ grep -F 'literal.string' file — fixed string, no regex
I need to search recursively across an entire directory:
→ grep -r 'pattern' /path/ — recursive search
I want to count matches, not see them:
→ grep -c 'pattern' file — count only
Knowledge Check
Answer these before moving to the next slide:
- What option makes grep search recursively through a directory?
- Write a grep command to show only non-comment, non-blank lines from /etc/httpd/conf/httpd.conf.
- What is the difference between ^[0-9] and [^0-9]?
- Write an ERE pattern to match either eth0 or ens3 in a file.
- Why should you use grep -F when searching for the literal text 192.168.1.1?
Test every answer at the terminal before reading the answers slide. Typing the command yourself is the fastest path to remembering it on exam day.
Knowledge Check — Answers
- grep -r or grep --recursive searches all files under a directory tree.
- grep -vE '^(#|[[:space:]]*$)' /etc/httpd/conf/httpd.conf
- ^[0-9] matches lines that start with a digit. [^0-9] matches any single character that is not a digit; the ^ inside brackets negates the class.
- grep -E 'eth0|ens3' file
- Without -F, the dots in 192.168.1.1 are treated as "any character" metacharacters, so 192X168Y1Z1 would also match. grep -F treats every character literally, eliminating false positives.
Key Takeaways
1. grep pattern file is the primary text-search tool. Pipe output into it from any command. Exit codes (0 = match, 1 = no match) make it scriptable.
2. Master the essential options: -i (case), -v (invert), -n (line numbers), -c (count), -r (recursive), -E (extended regex).
3. Anchors (^ start, $ end), character classes ([0-9], [[:alpha:]]), and quantifiers (*, +, {n,m}) are the core building blocks of every useful pattern.
4. Use grep -E for alternation (|) and grouping (()) without backslashes. ERE syntax is cleaner than BRE, so default to it for complex patterns, but remember that BRE and ERE read +, ?, |, ( ), and { } differently.
Graded Lab
- Use grep -v to display only active (non-comment, non-blank) lines from /etc/ssh/sshd_config
- Use grep -E with alternation to find lines containing either Failed or Accepted in /var/log/secure
- Use grep -oE and a quantifier pattern to extract all IPv4 addresses from /var/log/secure
- Use a pipeline with grep, sort, and uniq -c to count the top five source IPs in failed login attempts
/etc/passwd · /etc/ssh/sshd_config · /var/log/secure · /var/log/messages · /proc/cpuinfo