Sed-Stream-Editor/C2/Sed-special-characters/English

Revision as of 13:19, 15 February 2021 by Nancyvarkey (Talk | contribs)


VISUAL CUE NARRATION
Slide 1: Welcome to the spoken tutorial on Sed special characters.
Slide 2:

Learning Objectives

In this tutorial, we will learn about:
  • Print command

Also, how to use special characters such as

  • $ (dollar)
  • ^ (caret)
  • = (equal to)
  • & (ampersand)

in the Sed command.

Slide 3:

System requirements

This tutorial is recorded using Ubuntu Linux OS version 18.04
Slide 4:

Prerequisites

To follow this tutorial, you should know the basics of Linux.

If not, for relevant tutorials please visit our website.

Slide 5:

Code files – IndianBooks.txt

  • The files used in this tutorial are available in the Code Files link on this tutorial page.
  • Please download and extract them
  • Make a copy and then use them while practicing
Open the terminal

Open the terminal by pressing the Ctrl+Alt+T keys simultaneously.

Press the Enter key after every command.

> cd Downloads

> cat IndianBooks.txt

Let us see the content of the file IndianBooks.txt which I have saved in the Downloads folder.

Type cat space IndianBooks.txt

In this tutorial, I’ll be using this file for demonstration.

Print command

First we will learn about the print command.

> sed 'p' IndianBooks.txt

Highlight p

Type

sed space within single quotes p space IndianBooks.txt

'p' is the command for printing the data from the pattern buffer.

In the output, we can see that each line is displayed twice.

Recall the workflow of Sed.

By default, Sed prints the content of the pattern buffer.

We have also included the print command 'p' in the above command, so each line is printed twice.

> sed -n 'p' IndianBooks.txt

Highlight -n

Use -n to suppress the default printing of the pattern buffer.

Type the command with -n as shown.

-n (hyphen n) is a Sed command-line option.
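The effect of -n can be checked on a small input. The sketch below assumes a made-up three-line input (one, two, three) standing in for IndianBooks.txt, and compares the two forms of the command.

```shell
# Made-up three-line input; not the contents of IndianBooks.txt
input=$(printf 'one\ntwo\nthree')

# Without -n: each line is printed by p and again by the default print
with_p=$(printf '%s\n' "$input" | sed 'p')

# With -n: the default print is suppressed, so each line appears once
with_n=$(printf '%s\n' "$input" | sed -n 'p')

printf '%s\n' "$with_p"
printf '%s\n' "$with_n"
```

The first output has six lines and the second has three.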

Address range

> sed -n '5p' IndianBooks.txt

Highlight 5p

Let us see how Sed operates on a particular line.

Type the command as shown.

5p prints the fifth line of the input file IndianBooks.txt.

> sed -n '3,6 p' IndianBooks.txt

Highlight 3,6p

Type the command as shown.

Here we have specified an address range in the Sed command.

This will print from the 3rd line to the 6th line.
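A quick way to try line addresses without the tutorial file is to feed Sed numbered input. This sketch assumes the seq utility is available; seq 1 10 is only an illustrative input in which each line's text equals its line number.

```shell
# seq 1 10 writes the numbers 1 to 10, one per line (illustrative input only)
fifth=$(seq 1 10 | sed -n '5p')     # a single address: only line 5
range=$(seq 1 10 | sed -n '3,6p')   # an address range: lines 3 through 6

echo "$fifth"
echo "$range"
```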

$ command

Next we will see how to use the special character dollar ($) in the Sed command.

> sed -n '$p' IndianBooks.txt

Highlight $

When used as an address, the special character dollar ($) matches the last line of the file.

> sed -n '4, $p' IndianBooks.txt

We can also use the dollar ($) character in an address range.

Type the command as shown.

This command prints from the 4th line to the last line.
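The dollar address can be tried on numbered input in the same way. This is a minimal sketch that assumes seq is available; the six-line input is illustrative only.

```shell
# In an address, $ means the last line of the input
last=$(seq 1 6 | sed -n '$p')         # only the last line
from_four=$(seq 1 6 | sed -n '4,$p')  # from line 4 to the last line

echo "$last"
echo "$from_four"
```

The single quotes matter here: they stop the shell from treating $p as a shell variable.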

+ operator

> sed -n '3, +3p' IndianBooks.txt

Next we will see how to use plus(+) operator in the address range.

Type the command with plus(+) operator as shown.

This command prints the third line and the 3 lines that follow it, that is, lines 3 to 6.
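The plus form can be checked on numbered input. This sketch assumes GNU sed (the version shipped with Ubuntu), since the addr,+N address is a GNU extension, and uses seq only as illustrative input.

```shell
# addr,+N selects the line at addr plus the N lines after it (GNU sed extension)
plus=$(seq 1 10 | sed -n '3,+3p')

echo "$plus"
```

Four lines are printed in total: line 3 itself and the three lines after it.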

~ operator

> sed -n '1~2p' IndianBooks.txt

We can use the tilde (~) operator to specify the address range.

For example, this command prints the odd-numbered lines of the file.

That is, it starts from line number 1 and prints every second line.

> sed -n '2~2 p' IndianBooks.txt

Type this command to print only the even-numbered lines of the file.
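The first~step form can be verified on numbered input. This sketch assumes GNU sed, since the tilde address is a GNU extension, and uses seq only as illustrative input.

```shell
# first~step selects line number "first" and every "step"-th line after it
odd=$(seq 1 10 | sed -n '1~2p')    # lines 1, 3, 5, 7, 9
even=$(seq 1 10 | sed -n '2~2p')   # lines 2, 4, 6, 8, 10

echo "$odd"
echo "$even"
```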
Pattern range

> sed -n '/Gandhi/ p' IndianBooks.txt

Highlight between forward slash

Next we will see how the Sed command handles the pattern range.

Type sed space hyphen n space within single quotes forward slash Gandhi forward slash p space IndianBooks.txt

Note that the pattern has to be specified within forward slashes.

Sed operates on each line and prints only those lines that match the string Gandhi.

> sed -n '/Gandhi/!p' IndianBooks.txt

Highlight ! operator

We can reverse the above command.

We can print the lines that don’t contain the word Gandhi.

Use the negation operator (!) as shown here and see the output.
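Pattern matching and its negation can be compared side by side. The three lines below are made-up placeholders, not entries from IndianBooks.txt; only the name Gandhi is taken from the example above.

```shell
# Made-up sample lines for illustration
books=$(printf '%s\n' '1, Book A, Gandhi, 1901' '2, Book B, Nehru, 1902' '3, Book C, Gandhi, 1903')

# Lines that contain the pattern
match=$(printf '%s\n' "$books" | sed -n '/Gandhi/p')

# Lines that do NOT contain the pattern (! negates the address)
nomatch=$(printf '%s\n' "$books" | sed -n '/Gandhi/!p')

echo "$match"
echo "$nomatch"
```

Every input line lands in exactly one of the two outputs.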

Next we will look at a few other special characters used in regular expressions.

Character Description

^ Matches the beginning of the line

$ Matches the end of the line

. Matches any single character

* Matches zero or more occurrences of the preceding character

[ ] Matches any one of the characters inside the [ ]

Caret matches the beginning of the line.

Dollar matches the end of the line.

Dot matches any single character.

Asterisk matches zero or more occurrences of the preceding character.

Square brackets match any one of the characters inside them.

Let us see how these special characters work with the sed commands.
(^) start of line

> sed -n '/^1/ p' IndianBooks.txt

Type the command as shown.

This command prints all the lines that start with the character ‘1’ (one).

The caret(^) symbol matches the start of a line.

($) end of line

> sed -n '/3$/ p' IndianBooks.txt

In this command, the end of the line is denoted by the dollar ($) symbol.

This command prints all the lines that end with the character ‘3’.

(.) single character

> sed -n '/...8$/ p' IndianBooks.txt

Type the command as shown.

Dot matches any single character.

This command prints all the lines that end with any three characters followed by 8.
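The three patterns above can be tried together on a small input. The four lines below are made up for illustration and are not taken from IndianBooks.txt.

```shell
# Made-up sample lines for illustration
lines=$(printf '%s\n' '10 apples' '21 pears 1998' '13 plums 2003' 'a8')

starts=$(printf '%s\n' "$lines" | sed -n '/^1/p')    # lines beginning with 1
ends=$(printf '%s\n' "$lines" | sed -n '/3$/p')      # lines ending with 3
dots=$(printf '%s\n' "$lines" | sed -n '/...8$/p')   # any 3 characters, then 8 at end of line

echo "$starts"
echo "$ends"
echo "$dots"
```

The line a8 ends with 8 but is not printed by the last command, because each dot must match one character and only one character precedes the 8.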

(*)

>sed -n '/Ind*/ p' IndianBooks.txt

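Let us see an example for the asterisk.

The asterisk (*) matches zero or more occurrences of the preceding character.

Here the preceding character is ‘d’, so the pattern matches "In" followed by any number of ‘d’ characters. This prints the lines that contain "India", "Indira" and so on.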
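Because the asterisk applies only to the character just before it, the pattern Ind* means "In" followed by zero or more d characters. The sketch below uses a made-up word list to show which lines qualify.

```shell
# Made-up word list for illustration
words=$(printf '%s\n' 'India' 'Indira' 'Inn' 'Iodine')

# "In" followed by zero or more "d"
star=$(printf '%s\n' "$words" | sed -n '/Ind*/p')

echo "$star"
```

Inn is printed as well, since zero occurrences of d also count as a match; Iodine is not, because it does not contain "In".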

= (equal to)

Next we will see how to use the equal to command.

> sed '=' IndianBooks.txt

Highlight =

Type as shown.

The equal to command prints the line number of each line to standard output.

We can see each line number on its own line, followed by the content of that line.

> sed '1,5=' IndianBooks.txt

Type the command as shown.

This prints line numbers for the first 5 lines and prints the remaining lines without line numbers.

> sed '/Nehru/ =' IndianBooks.txt

This command prints a line number only before the lines that match the pattern Nehru.

> sed -n '/My/ =' IndianBooks.txt

How do we find the line numbers of the lines that contain a pattern?

Type the command as shown.

This prints only the line numbers of the lines that contain the pattern ‘My’.

> sed -n '$=' IndianBooks.txt

Next let us find the number of lines in a file.

Type the command as shown and see the output.
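The different uses of the equal to command can be compared on a small input. The three lines below are made up for illustration; only the pattern My is taken from the example above.

```shell
# Made-up sample lines for illustration
text=$(printf '%s\n' 'My first line' 'another line' 'My third line')

numbered=$(printf '%s\n' "$text" | sed '=')        # number before every line
hits=$(printf '%s\n' "$text" | sed -n '/My/=')     # numbers of matching lines only
count=$(printf '%s\n' "$text" | sed -n '$=')       # number of the last line = line count

echo "$numbered"
echo "$hits"
echo "$count"
```

With -n and the $ address, only the number of the last line is printed, which is the total number of lines.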

& command:

sed 's/^.*/(&)/' IndianBooks.txt

sed 's/^.*dhi/(&)/' IndianBooks.txt

Next we will see how to use the ampersand in the Sed command.

Type the command.

When the ampersand is used in the replacement string, it stands for whatever text matched the search pattern.

This command encloses every line in parentheses.

Let’s see another example.

This command puts parentheses around the matched text, which runs from the start of the line to ‘dhi’.
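Both substitutions can be tried on a single line. The sentence below is a made-up example, not a line from IndianBooks.txt.

```shell
# Made-up sample line for illustration
line='Mahatma Gandhi wrote this'

# & in the replacement stands for the whole matched text
whole=$(echo "$line" | sed 's/^.*/(&)/')      # the match is the entire line
part=$(echo "$line" | sed 's/^.*dhi/(&)/')    # the match stops at "dhi"

echo "$whole"   # (Mahatma Gandhi wrote this)
echo "$part"    # (Mahatma Gandhi) wrote this
```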

Slide:

[] regular expression:

Character class Description

[:alnum:] Alphanumeric [a-z A-Z 0-9]

[:alpha:] Alphabetic [a-z A-Z]

[:blank:] Spaces or tabs

[:digit:] Numeric digits [0-9]

[:lower:] Lower-case [a-z]

[:space:] Whitespace

[:upper:] Upper-case [A-Z]

The square bracket regular expression has some additional options, as shown here.

Let us see a few examples of how to use these in regular expressions.

>sed 's/[Tt]ruth/TRUTH/' IndianBooks.txt

Highlight the output

Type the command as shown.

In regular expression terminology, a character set is represented by square brackets.

This command matches the word "Truth" written with either a capital ‘T’ or a small ‘t’.

Then it replaces the first such match on each line with TRUTH, all in capital letters.

sed 's/19[0-9]*/****/' IndianBooks.txt

Highlight 19[0-9]

Highlight ****

Highlight the output

Next we will see how a character range is specified with a hyphen.

Type the command as shown.

Note this regular expression.

It matches the string 19 followed by zero or more characters in the range 0 to 9.

The matched text is then replaced with 4 asterisks.

In the output, we can see that all the published years are replaced with 4 asterisks.
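The character set and the character range can each be tried on one line. Both input strings below are made up for illustration.

```shell
# [Tt] matches either T or t; without the g flag only the first match is replaced
first=$(echo 'truth and Truth' | sed 's/[Tt]ruth/TRUTH/')

# 19 followed by zero or more digits; * in the replacement is a literal asterisk
year=$(echo 'Sample Title, 1950' | sed 's/19[0-9]*/****/')

echo "$first"   # TRUTH and Truth
echo "$year"    # Sample Title, ****
```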

> sed 's/[[:digit:]]/Book no &/' IndianBooks.txt

Highlight s

Highlight the output

Let us see the digit character class in the Sed command.

Here, [[:digit:]] matches any digit from 0 to 9 and is used as the search pattern.

This command adds the text ‘Book no’ before the first digit of each line, which puts it at the beginning of each line.

> cat IndianBooks.txt

Highlight year

> sed 's/[[:digit:]]*$/Published on: &/' IndianBooks.txt

Highlight [[:digit:]]*$

Let us see another example.

In IndianBooks.txt, the last entry on each line is the published year.

Let us add the words “Published on:” before the year.

Type the command as shown.

This command matches the digits at the end of each line and inserts ‘Published on:’ before them.
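The digit character class is written inside a bracket expression, as [[:digit:]]. The sketch below applies it in both substitutions to one made-up line; the entry format (number, title, author, year) is an assumption, not the actual layout of IndianBooks.txt.

```shell
# Made-up sample entry for illustration
entry='7, Sample Title, Sample Author, 1950'

# Prefix the first digit on the line
book=$(echo "$entry" | sed 's/[[:digit:]]/Book no &/')

# Prefix the run of digits at the end of the line
pub=$(echo "$entry" | sed 's/[[:digit:]]*$/Published on: &/')

echo "$book"   # Book no 7, Sample Title, Sample Author, 1950
echo "$pub"    # 7, Sample Title, Sample Author, Published on: 1950
```

In both cases & puts the matched digits back after the inserted text, so nothing from the original line is lost.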

With this we come to the end of this tutorial. Let us summarize.

Slide 8:

Summary

In this tutorial, we learnt:
  • Print command
  • special characters such as
  • $ (dollar)
  • ^ (caret)
  • = (equal to)
  • & (ampersand)

in the Sed command.

Slide 9:

Assignment

As an assignment, try the commands below and see the output.
  1. sed 's/Indi.*/****/' IndianBooks.txt
  2. sed 's/Indi[a-z]*a/****/' IndianBooks.txt
Slide 10:

(About Spoken Tutorial Project)

The video at the following link summarizes the Spoken Tutorial project.

Please download and watch it.

Slide 11:

(About Spoken Tutorial Project)

The Spoken Tutorial Project Team conducts workshops and gives certificates.

For more details, please write to us.

Slide

Forum questions:

  • Do you have questions in THIS Spoken Tutorial?
  • Please visit this site
  • Choose the minute and second where you have the question
  • Explain your question briefly
  • The Spoken Tutorial project team will ensure an answer
  • You will have to register on this website to ask questions.
Slide: Acknowledgement

Spoken Tutorial project is funded by the Ministry of Education (MoE), Govt. of India.
This is Pooja from Spoken tutorials, IIT Bombay signing off.

Thanks for joining.

Contributors and Content Editors

Nancyvarkey, Nirmala Venkat