What are Regular Expressions in Grep (regex)? Explained with Examples
Introduction
You have almost certainly had to look through a file on the command line at some point. Have you ever tried to find a particular word or phrase in one? If you managed it just by reading, then hats off to you. The grep tool is a better way to search files from the command line.
Grep returns accurate results in a short time. Regular expressions, or regex, are commonly used with grep because they greatly extend what this search tool can find.
This whole article revolves around grep and regex. Hold your horses before you ask about regular expressions, because we have separate headings that describe the term with examples. Read the complete article to get a fair idea about regular expressions in grep.
Grep Regular Expression
Grep is a simple search tool for the command line, and it becomes far more capable when combined with regular expressions. A regex is built from a set of pattern elements that you can mix and match to describe exactly what you are looking for. This matters most with large files, where reading through the text by hand takes time. Regular expressions in grep handle repetitive and alternative searches well. Grep regex is divided into three types at a broad level, listed below.
- Basic Regular Expressions (BRE)
- Extended Regular Expressions (ERE)
- Perl-Compatible Regular Expressions (PCRE)
PCRE is the most feature-rich of the three, while BRE is the most basic. Now let us move forward to a grep regex example.
Grep Regex Example
We will compare a single example with two different approaches to better understand regex. The first approach will not use regex, while the second one will use regex.
Example: There is a file named ‘wonders.txt’ with the content below.
apple: 5
banana: 10
orange: 7
grape: 3
cherry: 6
watermelon: 8
Approach 1: We have to find every line that contains the word ‘apple’ with a simple grep command. To do so, we will use the below command.
grep "apple" wonders.txt
The result will be as follows.
apple: 5
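Approach 2: Now, we have to search for all the lines that contain numbers by using a regular expression. You can try out the below command.
grep -E "[0-9]+" wonders.txt
The result will be as follows.
apple: 5
banana: 10
orange: 7
grape: 3
cherry: 6
watermelon: 8
In the second approach, you can see that we have used metacharacters to perform a more flexible search. The ‘-E’ option enables extended regular expression (ERE) mode. The bracket expression ‘[0-9]’ matches any single digit from 0 to 9. The last character, ‘+’, is a quantifier that matches one or more repetitions of the preceding item, so ‘[0-9]+’ matches a run of digits. Every line in the file contains a number, which is why all of them are returned. Note that the shorthand ‘\d’ is not supported in grep's basic or extended modes; it works only in PCRE mode (‘grep -P’ in GNU grep).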
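Because every line of the sample file contains a number, a pattern that matches any number returns the whole file. The sketch below recreates the file and adds a narrower pattern so the filtering is easier to see. The temporary directory and the variable names are only illustrative, and the two-digit pattern is an extra example that is not part of the original walkthrough.

```shell
# Recreate the sample file in a temporary directory (the location is arbitrary).
workdir=$(mktemp -d)
cat > "$workdir/wonders.txt" <<'EOF'
apple: 5
banana: 10
orange: 7
grape: 3
cherry: 6
watermelon: 8
EOF

# Approach 1: a plain literal search.
literal_hits=$(grep "apple" "$workdir/wonders.txt")

# A narrower extended regex: keep only lines with two digits in a row.
two_digit_hits=$(grep -E "[0-9]{2}" "$workdir/wonders.txt")

printf 'literal: %s\n' "$literal_hits"
printf 'two digits: %s\n' "$two_digit_hits"

# Clean up the temporary directory.
rm -r "$workdir"
```

Only the banana line has a two-digit count, so the second search returns a single line.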
How to Use Regex With Grep?
Do you know which mode grep uses by default? It is BRE, or basic regular expression mode. In this mode, fewer characters act as metacharacters: in GNU grep, ‘?’, ‘+’, ‘{’, ‘|’, ‘(’, and ‘)’ are treated as ordinary characters unless you put a backslash in front of them. If you want to use them as metacharacters directly, enable extended mode with the ‘-E’ option. Now we will have a detailed look at various cases where we can use regex.
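To see the difference between the default mode and extended mode, here is a small sketch that runs the same pattern both ways. The sample lines and variable names are made up for illustration.

```shell
# Two sample lines: one contains a literal plus sign, the other repeats "a".
sample=$(printf 'a+b\naab\n')

# Default (BRE) mode: "+" is an ordinary character, so only "a+b" matches.
bre_hits=$(printf '%s\n' "$sample" | grep 'a+b')

# Extended (ERE) mode: "+" means "one or more of the preceding item",
# so the pattern now matches "aab" and no longer matches "a+b".
ere_hits=$(printf '%s\n' "$sample" | grep -E 'a+b')

echo "BRE: $bre_hits"
echo "ERE: $ere_hits"
```

The same three characters select different lines depending on the mode, which is why it helps to decide on ‘-E’ before writing the pattern.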
Literal Matches
As the name suggests, literal matches are used to search for exact words or phrases. Matching is case-sensitive by default, so ‘good’ and ‘Good’ are different patterns. Look at this example to understand literal matches.
Today is a good day to eat kababs.
You have to search for ‘good’ in the above sentence with grep. Use the below command.
grep 'good' file.txt
The output will display the line containing the word ‘good.’
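The effect of case sensitivity can be checked with a short sketch like the one below. It counts matching lines with ‘-c’ and uses the standard ‘-i’ option for a case-insensitive search; the variable names are illustrative.

```shell
line="Today is a good day to eat kababs."

# Exact, case-sensitive match: "good" is found, "Good" is not.
# "|| true" keeps the script going when grep finds nothing (exit status 1).
lower_hit=$(echo "$line" | grep -c 'good' || true)
upper_hit=$(echo "$line" | grep -c 'Good' || true)

# -i makes the match case-insensitive.
insensitive_hit=$(echo "$line" | grep -ci 'GOOD' || true)

echo "good: $lower_hit, Good: $upper_hit, GOOD with -i: $insensitive_hit"
```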
Match Any Character
This is a very interesting case where you place a dot ‘.’ in the pattern, and the dot matches any single character in that position. Look at the below text and follow the example.
Volleyball
Football
Handball
You have to find the words that have ‘o’ as the second letter. The first letter can be anything, so a dot takes its place, and ‘^’ anchors the pattern to the start of the line. You can use the below command to create the pattern.
grep '^.o' file.txt
The result will show both Volleyball and Football but not Handball.
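A quick way to check what the dot matches is to run a couple of patterns against the three words. This is a minimal sketch; the patterns and variable names are chosen only for illustration.

```shell
words=$(printf 'Volleyball\nFootball\nHandball\n')

# "^.o": any single first character, then "o" as the second character.
second_letter_o=$(printf '%s\n' "$words" | grep '^.o')

# "H.nd": the dot stands for exactly one character between "H" and "nd".
h_any_nd=$(printf '%s\n' "$words" | grep 'H.nd')

printf '%s\n' "$second_letter_o"
printf '%s\n' "$h_any_nd"
```

The first search returns Volleyball and Football, and the second returns only Handball.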
Bracket Expressions
A bracket expression matches any single character from the set listed inside the square brackets, which makes it useful when several characters are acceptable in one position. Take the example of the below content that consists of various numbers.
123
256
868
168
118
Now you want to find the numbers that start with ‘2’ or ‘8’. Use the below command to do so.
grep '^[28]' file.txt
Here the output will be 256 and 868. Also, notice the ‘^’ metacharacter placed before the bracket, which anchors the match to the start of the line.
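The sketch below runs the same search and adds two common variations: a negated set and a range. The variations and the variable names are illustrative additions, not part of the original example.

```shell
numbers=$(printf '123\n256\n868\n168\n118\n')

# Lines whose first character is 2 or 8.
starts_2_or_8=$(printf '%s\n' "$numbers" | grep '^[28]')

# A caret as the first character INSIDE the brackets negates the set:
# lines whose first character is anything except 2 or 8.
starts_other=$(printf '%s\n' "$numbers" | grep '^[^28]')

# A range inside brackets: lines whose last character is 6, 7, or 8.
ends_6_to_8=$(printf '%s\n' "$numbers" | grep '[6-8]$')

printf 'starts with 2 or 8:\n%s\n' "$starts_2_or_8"
printf 'starts with something else:\n%s\n' "$starts_other"
printf 'ends with 6 to 8:\n%s\n' "$ends_6_to_8"
```

Note that ‘^’ has two different meanings here: outside the brackets it anchors to the start of the line, and as the first character inside the brackets it negates the set.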
Character Classes
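If you want to search for a category of characters, such as digits or letters, then character classes can be very useful. Named POSIX classes such as ‘[:digit:]’ and ‘[:alpha:]’ must always be written inside a bracket expression, for example ‘[[:digit:]]’; a range such as ‘[0-9]’ does the same job for digits. Have a look at this example below.
I ate 2 burgers and 1 pizza today.
Now we want to find the digits in the above sentence. We will use the below command, where the ‘-o’ option prints only the matched text instead of the whole line.
echo "I ate 2 burgers and 1 pizza today." | grep -oE '[0-9]'
The output will be 2 and 1, each on its own line.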
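Here is a small sketch of named POSIX character classes applied to the same sentence. It assumes a grep that supports the widely available ‘-o’ option, which prints each match on its own line; the variable names are illustrative.

```shell
sentence="I ate 2 burgers and 1 pizza today."

# [[:digit:]] matches a single digit.
digits=$(echo "$sentence" | grep -o '[[:digit:]]')

# [[:upper:]] matches a single uppercase letter.
capitals=$(echo "$sentence" | grep -o '[[:upper:]]')

# [[:punct:]] matches a single punctuation character.
punctuation=$(echo "$sentence" | grep -o '[[:punct:]]')

printf 'digits:\n%s\n' "$digits"
printf 'capitals: %s\n' "$capitals"
printf 'punctuation: %s\n' "$punctuation"
```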
Quantifiers
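If your search involves repetition, then a quantifier is the best way to express it. A quantifier applies to the item immediately before it. The common quantifiers are listed below; use the ‘-E’ option so that ‘+’, ‘?’, and the braces work as written.
- * matches zero or more repetitions
- + matches one or more repetitions
- ? matches zero or one repetition
- {n} matches exactly n repetitions
- {n,m} matches between n and m repetitions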
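Each quantifier can be tried against the same small word list to see what it accepts. The following is a minimal sketch with made-up sample words; the anchors ‘^’ and ‘$’ force the pattern to cover the whole line so that the repetition count is what decides the match.

```shell
samples=$(printf 'ct\ncat\ncaat\ncaaat\n')

# ?  zero or one "a"
zero_or_one=$(printf '%s\n' "$samples" | grep -E '^ca?t$')

# *  zero or more "a"
zero_or_more=$(printf '%s\n' "$samples" | grep -E '^ca*t$')

# +  one or more "a"
one_or_more=$(printf '%s\n' "$samples" | grep -E '^ca+t$')

# {2}  exactly two "a"
exactly_two=$(printf '%s\n' "$samples" | grep -E '^ca{2}t$')

# {2,3}  between two and three "a"
two_to_three=$(printf '%s\n' "$samples" | grep -E '^ca{2,3}t$')

printf '?     : %s\n' "$(echo $zero_or_one)"
printf '*     : %s\n' "$(echo $zero_or_more)"
printf '+     : %s\n' "$(echo $one_or_more)"
printf '{2}   : %s\n' "$(echo $exactly_two)"
printf '{2,3} : %s\n' "$(echo $two_to_three)"
```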
Alternation
Have you ever faced a situation where your search pattern needs to accept several alternatives? Alternation, written with the ‘|’ character, matches either the pattern on its left or the pattern on its right. Let us have a quick look at an example.
I have 2 cars and 3 bikes.
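We will apply alternation to the above sentence to search for digits or any word starting with the letter ‘c’.
echo "I have 2 cars and 3 bikes." | grep -oE '[0-9]+|c\w+'
In the above command, ‘|’ separates the two alternatives: ‘[0-9]+’ matches a run of digits, and ‘c\w+’ matches the letter ‘c’ followed by one or more word characters. The ‘-o’ option prints only the matched parts instead of the whole line.
The output will be 2, cars, and 3, each on its own line.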
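The sketch below shows alternation with portable bracket expressions in place of the backslash shorthands, so it does not depend on GNU-specific extensions. The second word list and the variable names are illustrative.

```shell
sentence="I have 2 cars and 3 bikes."

# Either a run of digits, or "c" followed by one or more letters.
# -o prints each match on its own line, in the order it appears.
matches=$(echo "$sentence" | grep -oE '[0-9]+|c[[:alpha:]]+')

# Alternation also works for selecting whole lines.
pets=$(printf 'cat\ndog\nbird\n' | grep -E 'cat|bird')

printf '%s\n' "$matches"
printf '%s\n' "$pets"
```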
Grouping
Grouping uses parentheses to treat part of a pattern as a single unit, so that an operator such as alternation or a quantifier applies only to that part. You can mix and match groups to build more complex patterns. Go through the below example.
I want to have apple juice and pineapple juice.
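Here you apply grouping so that the alternation covers only the fruit name, using the below command.
echo "I want to have apple juice and pineapple juice." | grep -oE '(apple|orange) juice'
The output will be ‘apple juice’ twice: once for ‘apple juice’ itself and once for the ‘apple juice’ inside ‘pineapple juice’, because the pattern can match in the middle of a word. Did you notice how the group lets ‘ juice’ follow either alternative?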
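To see what the parentheses change, the following sketch runs the pattern with and without the group against the same sentence. The variable names are illustrative.

```shell
sentence="I want to have apple juice and pineapple juice."

# With the group: ("apple" OR "orange"), then " juice".
grouped=$(echo "$sentence" | grep -oE '(apple|orange) juice')

# Without the group the alternation splits the whole pattern:
# "apple" OR "orange juice", so " juice" no longer belongs to "apple".
ungrouped=$(echo "$sentence" | grep -oE 'apple|orange juice')

printf 'grouped:\n%s\n' "$grouped"
printf 'ungrouped:\n%s\n' "$ungrouped"
```

Both runs report two matches, because ‘pineapple’ contains ‘apple’, but only the grouped pattern includes ‘ juice’ in each match.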
Special Backslash Expressions
If you are a fan of compact patterns, then special backslash expressions can shorten your regex. GNU grep supports ‘\w’, ‘\W’, ‘\s’, and ‘\S’ in its basic and extended modes, while ‘\d’ and ‘\D’ work only in PCRE mode (‘grep -P’). Some examples of special backslash expressions include:
\d- matches any digit from 0 to 9.
\D- matches any non-digit character.
\w- matches any word character: an uppercase or lowercase letter, a digit, or an underscore.
\W- matches any non-word character (anything other than a letter, digit, or underscore).
\s- matches a whitespace character such as a space or tab.
\S- matches any non-whitespace character, such as a digit or letter.
\\- matches a literal backslash.
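The sketch below tries a few of these shorthands. It assumes GNU grep, where ‘\w’ and ‘\s’ are available with ‘-E’; the ‘\d’ shorthand needs ‘-P’, which not every grep build provides, so the script checks for it first and falls back to ‘[0-9]’. The sample text and variable names are made up for illustration.

```shell
text="order_42 shipped on day 7"

# \w+ : runs of word characters (letters, digits, underscore).
words=$(echo "$text" | grep -oE '\w+')

# \s : count the lines that contain at least one whitespace character.
lines_with_space=$(echo "$text" | grep -cE '\s' || true)

# \d needs PCRE mode; fall back to a bracket expression if -P is unavailable.
if echo 1 | grep -qP '\d' 2>/dev/null; then
    digits=$(echo "$text" | grep -oP '\d+')
else
    digits=$(echo "$text" | grep -oE '[0-9]+')
fi

printf 'words:\n%s\n' "$words"
printf 'lines with whitespace: %s\n' "$lines_with_space"
printf 'digits:\n%s\n' "$digits"
```

Note that ‘order_42’ comes back as one word, because both the underscore and the digits count as word characters.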
Escaping Meta-Characters
Sometimes you need to search for a character that is normally a metacharacter; this is where escaping comes in. You just have to put a backslash before the metacharacter to search for it literally. Take a look at the below situation.
‘*’ normally means zero or more repetitions of the preceding item, but ‘\*’ tells grep to search for a literal ‘*’ in the text.
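Here is a minimal sketch comparing escaped and unescaped metacharacters. The sample lines and variable names are illustrative; ‘-F’ is the standard grep option that treats the whole pattern as a fixed string, which is an alternative to escaping each character.

```shell
lines=$(printf '2*3=6\n223=6\nfile.txt\nfileAtxt\n')

# Escaped star: matches only a literal "*" between 2 and 3.
literal_star=$(printf '%s\n' "$lines" | grep '2\*3')

# Unescaped star: "zero or more 2s, then 3", which both number lines satisfy.
quantifier_star=$(printf '%s\n' "$lines" | grep '2*3')

# Escaped dot: matches only a literal "." (an unescaped dot would also
# accept "fileAtxt").
literal_dot=$(printf '%s\n' "$lines" | grep 'file\.txt')

# -F disables regex interpretation entirely, so no escaping is needed.
fixed_string=$(printf '%s\n' "$lines" | grep -F 'file.txt')

printf 'literal star: %s\n' "$literal_star"
printf 'quantifier star:\n%s\n' "$quantifier_star"
printf 'literal dot: %s\n' "$literal_dot"
printf 'fixed string: %s\n' "$fixed_string"
```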
Conclusion
This article explained regular expressions (regex) in grep, with examples showing how they can be used. Grep is a tool for searching text in files, and it becomes more powerful when combined with regular expressions. Regular expressions consist of patterns that allow for more precise searches. A simple grep command can search for exact word matches, such as finding the word ‘apple’ in a file with the command: grep "apple" filename. By learning and using regex with grep, you can search more efficiently, which makes your command-line file exploration and analysis more effective.