Linux-Ubuntu/C3/Mastering-grep/English

From Script | Spoken-Tutorial

Latest revision as of 11:56, 13 March 2026

TITLE: Mastering Grep

Author: EduPyramids

Keywords: grep, search, pattern matching, regular expressions, extended regex, case-insensitive search, character classes, anchors, dot operator, asterisk operator, Linux, Ubuntu, Bash, text search, EduPyramids, video tutorial.


Visual Cue Narration
Slide 1

Title Slide

Welcome to this spoken tutorial on Mastering grep.
Slide 2

Learning Objectives

In this tutorial, we will learn to:

  • Match more than one pattern
  • Match a word that has different spellings
  • Use character classes
  • Use the * operator
  • Match any one character using dot
  • Match a pattern at the beginning or end of a line
In this tutorial, we will learn to:
  • Match more than one pattern.
  • Match a word that has different spellings.
  • Use character classes.
  • Use the asterisk operator.
  • Match any one character using a dot.
  • Match a pattern at the beginning or end of a line.
Slide 3

System Requirements

To record this tutorial, I am using:

Ubuntu OS version 24 point zero 4.

Slide 4

Pre-requisites

https://EduPyramids.org


To follow this tutorial,

Learners should have Ubuntu version 24 point zero 4.

And should be familiar with basic Linux terminal commands.

For the prerequisite Linux tutorials, please visit this website.

Slide 5

Code files

grepdemo.txt

grep-commands.txt

The following code files are required to practice this tutorial.

These files are provided in the Code Files link of this tutorial page.

Let us get started with grep commands.
Note: Please type the commands in the terminal rather than pasting them, because the double quotes on this page may be curly quotes, which the shell does not treat as quotation marks.

Type grep -e "electronics" -e "civil" grepdemo.txt and press Enter.

We can match multiple patterns using the hyphen e option in grep.


We will use the same example file, grep demo dot t x t.

Let us get the details of students from the Civil or Electronics stream.


Type this command and press Enter.

The output displays the records of both civil and electronics students.
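
The multiple-pattern search can be sketched as follows. The contents of grepdemo.txt are not shown in this script, so the sketch assumes a small hypothetical file named sample.txt with made-up records.

```shell
# Create a small hypothetical data file (not the real grepdemo.txt).
cat > sample.txt <<'EOF'
A101 Anita civil 6000
B202 Ravi electronics 7500
C303 Mani mechanical 8200
EOF

# Each -e adds one pattern; a line is printed if it matches any of them.
result=$(grep -e "electronics" -e "civil" sample.txt)
echo "$result"
```

In this sketch the mechanical record is left out because it matches neither pattern.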

Type grep -ie "choudhury" -ie "chowdhari" grepdemo.txt and press Enter.

Type clear and press Enter.

Now we wish to search for people whose title is Choudhury.

The issue is that the title may be spelled in different ways.

How can we handle this?

In such cases, we can perform a case-insensitive search using the hyphen i option.

We can also combine it with multiple hyphen e options.

This helps to match different spellings of the same word.

Type this command and press Enter.

The output is displayed.

However, there can be many other ways to write the name.

We could use many -e options, but a better solution is Regular Expressions.
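
As a rough illustration of combining the hyphen i and hyphen e options, the sketch below uses a hypothetical sample.txt with made-up names. It is not the tutorial's grepdemo.txt.

```shell
# Hypothetical records with the same surname in different spellings and cases.
cat > sample.txt <<'EOF'
A101 Anita Choudhury civil
B202 Ravi CHOWDHARI electronics
C303 Mani Sharma mechanical
EOF

# -i ignores letter case; each -e supplies one spelling to look for.
result=$(grep -i -e "choudhury" -e "chowdhari" sample.txt)
echo "$result"
```

Every additional spelling needs its own hyphen e option, which is why this approach does not scale well.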

Let me clear the screen.

There are several special characters used in regular expressions.
Let us look at some examples.
In the terminal, type grep -i "ch[ao][uw]dh[ua]r[yi]" grepdemo.txt

and press Enter.

Add an annotation for this.

A character class is a part of a regular expression.

To understand character classes, type this command and press Enter.

Observe the output.

This matches most spelling variations of the name Chowdhury.

However, it still does not match choudhuree with double e.
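
To see which spellings the character classes accept, here is a minimal sketch. It assumes a hypothetical sample.txt that contains only surnames, one per line.

```shell
# Hypothetical list of spellings, one per line.
cat > sample.txt <<'EOF'
Choudhury
chowdhari
Chaudhuri
choudhuree
EOF

# Each bracket pair matches exactly one character from its set,
# so the last class [yi] cannot match the "e" in "choudhuree".
result=$(grep -i "ch[ao][uw]dh[ua]r[yi]" sample.txt)
echo "$result"
```

The first three spellings are printed, and choudhuree is not, because the character after the r must be y or i.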

Type:

grep -i "m[ei]*ra*" grepdemo.txt and press Enter.

Let us match the student name Mira in the file.

Type this command.

The asterisk operator is a regular expression operator.

It matches zero or more repetitions of the preceding character or pattern.

Press Enter to see the output.

We see records of Mira with 3 different spellings here.
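
The effect of the asterisk can be sketched on a hypothetical sample.txt of made-up names. The entry Mra is included only to show the zero-repetition case.

```shell
# Hypothetical names; "Mra" has no vowel between M and r.
cat > sample.txt <<'EOF'
Mira
Meera
Mra
Mani
EOF

# [ei]* matches zero or more of e or i; a* matches zero or more a.
result=$(grep -i "m[ei]*ra*" sample.txt)
echo "$result"
```

Mira, Meera and Mra are printed. Mani is not, because no r follows the m once the optional vowels are skipped. Since the asterisk also allows zero repetitions, the pattern is broader than it may first appear.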

Type grep "M... " grepdemo.txt

and press Enter.

For example, type this command.

The dot is a special character in regular expressions.

It matches any single character, except the newline character.


Now press Enter.

It searches for four-letter words that start with M.

Each dot matches one character.

The space after the dots ensures only four-letter matches.

This avoids matching words longer than four letters.

The output shows records for students Mani and Mira.
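
A small sketch of the dot pattern follows, again on a hypothetical sample.txt whose field layout is an assumption made for illustration.

```shell
# Hypothetical records; names of different lengths start with M.
cat > sample.txt <<'EOF'
A101 Mani civil
B202 Mira electronics
C303 Meera mechanical
D404 Manish civil
EOF

# M, then any three characters, then a space.
# The search is case-sensitive, so "mechanical" does not count as an M word.
result=$(grep "M... " sample.txt)
echo "$result"
```

Only the Mani and Mira records are printed. In Meera and Manish, the fifth character is a letter rather than a space, so the pattern fails.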

At the prompt type grep "^A" grepdemo.txt

press Enter

Point to the output.

Let us use anchors.

Now, we will extract entries with roll numbers starting with A.

The roll number is the first field in the file.

Type this command

An anchor is a special symbol used in regular expressions.

It specifies where a pattern should match in a line.

Anchors do not match any character; they match only a position.


Now press Enter.

Only lines with roll numbers starting with A are shown.

The caret acts as an anchor for the beginning of the line.

Grep matches lines where the first character is A.

All other lines, starting with different characters, are ignored.
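
The caret anchor can be sketched like this, assuming a hypothetical sample.txt in which the roll number is the first field.

```shell
# Hypothetical records; the second line contains an "A" but does not start with one.
cat > sample.txt <<'EOF'
A101 Anita civil
B202 Asha electronics
A103 Ravi mechanical
EOF

# ^ anchors the match to the start of the line.
result=$(grep "^A" sample.txt)
echo "$result"
```

The B202 record is skipped even though the name Asha contains a capital A, because that A is not the first character of the line.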

Type: grep "1$" grepdemo.txt and press Enter.

Type: grep "[78]...$" grepdemo.txt

and press Enter.

Let us match a pattern at the end of each line in the file.

To match a pattern at the end of a line, we use the dollar sign.

Here is the output.

Next, let us find stipends between 7000 and 8999.

Type this command and press Enter.

Only lines that end in 7 or 8 followed by any three characters are shown.

That is, the stipends between 7000 and 8999.

grep ignores all other lines that do not match this pattern.

Here, it searches for 7 or 8 first.

Then any 3 characters following it, at the end of each line in the file grep demo dot t x t.
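
The dollar anchor can be sketched as follows. The sketch assumes a hypothetical sample.txt in which the stipend is the last field. The five-digit stipend and the stricter pattern are additions for illustration and are not part of the tutorial.

```shell
# Hypothetical records; the stipend is assumed to be the last field.
cat > sample.txt <<'EOF'
A101 Anita civil 6000
B202 Ravi electronics 7500
C303 Mani mechanical 8999
D404 Mira civil 17000
EOF

# Tutorial pattern: 7 or 8, then any three characters, at the end of the line.
# Single quotes keep the shell from interpreting the $ sign.
loose=$(grep '[78]...$' sample.txt)
echo "$loose"

# Stricter variant (an assumption, not from the tutorial): require a space
# before the number and digits after the first one.
strict=$(grep ' [78][0-9][0-9][0-9]$' sample.txt)
echo "$strict"
```

In this sketch the tutorial pattern also prints the 17000 record, because it checks only the last four characters of the line. The stricter variant prints just the 7500 and 8999 records.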

Slide 6

Summary

In this tutorial, we have learnt to:

  • Match more than one pattern
  • Match a word that has different spellings
  • Use character classes
  • Use the * operator
  • Match any one character using dot
  • Match a pattern at the beginning or end of a line.
With this we come to the end of this tutorial.

Let us summarise.

Slide 7

Assignment

As an assignment

  1. Search for students whose names contain the letters “ra” in sequence.
  2. Find entries where the stream is either “Mechanical” or “Electrical.”
  3. List all students whose roll numbers end with the digit 5.
  4. Count how many students have a stipend greater than 5000.
As an assignment, please do the following.
Slide 8

Thank you


This Spoken Tutorial is brought to you by EduPyramids Educational Services Private Limited, SINE, IIT Bombay.

Thank you.

Contributors and Content Editors

Ketkinaina, Madhurig