Linux-AWK/C2/Basics-of-awk/English

Revision as of 15:29, 1 July 2013 by Ashwini (Talk | contribs)

Title of script: The Awk tool Part 1

Author: Sachin Patil

Keywords: Selection criteria, action, formatted printing, fields, Regular expressions, Variables


Visual Cue
Narration
Display Slide 1 Welcome to this spoken tutorial on the awk command.
Display Slide 2

Learning Objective

In this tutorial we will learn about:

the awk command.

Display Slide 3

System requirement

To record this tutorial, I am using

Ubuntu Linux 12.04 OS

GNU BASH v. 4.2.24(1)

Please note, GNU bash version 4 or above is recommended to practice this tutorial.

Display Slide 4

Introduction

The awk command is a very powerful text manipulation tool of Linux.

It is named after its authors, Aho, Weinberger and Kernighan.

Continue Slide


It can perform several functions.

It operates at the field level of a record.

So, it can easily access and edit the individual fields of the record.

Let us see some examples.

For demonstration purposes, we use the file awkdemo.txt.

Let us see the contents of awkdemo.txt.



awkdemo.txt This is the content of awkdemo.txt file.
Open the terminal

ctrl+alt+t


type:

Now open the terminal by pressing ctrl+alt+t
type:


"awk '/Pass/ {print}' awkdemo.txt" [enter]

Now type:

awk space, within single quotes: front slash Pass front slash, then within curly braces print; after the closing single quote, space awkdemo.txt


Here Pass is the selection criterion.


All the lines of awkdemo.txt in which Pass occurs are printed.


The action here is print.
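
The selection criterion and action described above can be tried on a small stand-in file. This is only a sketch: the file name sample_awk.txt and its rows are hypothetical, since the actual contents of awkdemo.txt are not reproduced in this script.

```shell
# Hypothetical sample data (roll number|name|stream|marks|result);
# this is NOT the real awkdemo.txt, only an assumed layout for illustration.
cat > sample_awk.txt <<'EOF'
101|Meera|civil|72|Pass
102|Ravi|electrical|35|Fail
103|Mira|mechanical|68|Pass
EOF

# /Pass/ is the selection criterion; {print} is the action.
awk '/Pass/ {print}' sample_awk.txt

# print is awk's default action, so the pattern alone behaves the same way.
awk '/Pass/' sample_awk.txt
```

Both commands print the two rows that contain Pass.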

Type

"awk '/M[ei]*ra*/ {print}' awkdemo.txt" [enter]

We can also use regular expressions in awk


Say we want to print records of students with name Mira.


We would type:


awk space '/M[ei]*ra*/{print}' space awkdemo.txt


Press Enter.


* matches zero or more occurrences of the previous character.


Thus entries with any number of occurrences of i, e and a, including none, will be listed.


For example, Meera

Mira

Meeraa
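
To see how * behaves in this pattern, here is a small sketch on a hypothetical list of names (names_sample.txt is an assumed file, not part of the tutorial). Because * allows zero occurrences, a line containing only "Mr" also matches; the + operator of extended regular expressions requires at least one occurrence.

```shell
# Hypothetical names, one per line, created only for this illustration.
printf '%s\n' Mira Meera Meeraa Mr Mohan > names_sample.txt

# * means zero or more: Mira, Meera, Meeraa and Mr all match; Mohan does not.
awk '/M[ei]*ra*/ {print}' names_sample.txt

# + means one or more: Mr no longer matches.
awk '/M[ei]+ra+/ {print}' names_sample.txt
```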



awk supports extended regular expressions (ERE).

This means we can match multiple patterns separated by a PIPE.

Type

"awk '/civil|electrical/ {print}' awkdemo.txt" [enter]

Type at the terminal:


awk space, within single quotes: front slash civil vertical bar electrical front slash, then within curly braces print; after the closing single quote, space awkdemo.txt


Press Enter.


Now entries for both civil and electrical are shown.
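
A minimal sketch of the alternation, again on hypothetical data (the file sample_awk.txt and its rows are assumptions made for illustration):

```shell
# Hypothetical sample data; the layout is assumed, not taken from awkdemo.txt.
cat > sample_awk.txt <<'EOF'
101|Meera|civil|72|Pass
102|Ravi|electrical|35|Fail
103|Mira|mechanical|68|Pass
EOF

# The vertical bar separates alternatives: a line is selected
# if it contains either civil or electrical.
awk '/civil|electrical/ {print}' sample_awk.txt
```

Here the first two rows are printed and the mechanical row is skipped.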

Display slide 7


Let's go back to the slides.


Awk has some special parameters to identify individual fields of a line.


$1(Dollar 1) would indicate the first field.


Similarly we can have $2, $3 and so on for respective fields.


$0 represents the entire line.

Switch to the terminal


Type: cat awkdemo.txt

Note that each field is separated by a PIPE in the file awkdemo.txt.


In this case PIPE is called a delimiter.


A delimiter separates fields from each other.


A delimiter can also be whitespace, which is what awk uses by default.


To specify a delimiter we have to give -F flag followed by a delimiter.



Type

"awk -F "|" '/civil|electrical/ {print $0}' awkdemo.txt" [enter]


Let's go back to the terminal.


So we can write the last command as:


awk space minus capital F space


Within double quotes PIPE space


Within single quote front slash civil PIPE electrical front slash


Within curly braces print space dollar 0. Now, outside the quotes, space awkdemo.txt


This prints the entire line since we have used $0.
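
The effect of the -F flag on field splitting can be sketched as follows. The file sample_awk.txt and its rows are hypothetical; only the PIPE-delimited layout is taken from the narration.

```shell
# Hypothetical PIPE-delimited sample data, created only for this illustration.
cat > sample_awk.txt <<'EOF'
101|Meera|civil|72|Pass
102|Ravi|electrical|35|Fail
103|Mira|mechanical|68|Pass
EOF

# With -F "|", $0 is still the whole line ...
awk -F "|" '/civil|electrical/ {print $0}' sample_awk.txt

# ... while $1 is now only the first PIPE-separated field (101 and 102 here).
awk -F "|" '/civil|electrical/ {print $1}' sample_awk.txt

# Without -F, awk splits on whitespace; these rows contain none,
# so $1 is the same as the whole line.
awk '/civil|electrical/ {print $1}' sample_awk.txt
```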

Type

"awk -F"|" '/Pass/ {print $2, $3}' awkdemo.txt" [enter]

Notice that the name and stream of each student are the second and third fields.


Say we only want to print two fields.


We will replace $0 with $2,$3 in the above command.


Let’s try.


Press Enter.

Though it gives the right result, the display is all jagged and unformatted.
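
As a rough illustration of why the output looks uneven: print separates comma-separated items with a single space, so columns do not line up when names differ in length. The data below is hypothetical, with the name and stream assumed to be fields 2 and 3 as in the narration.

```shell
# Hypothetical sample data; only the position of name ($2) and stream ($3)
# follows the tutorial, the rows themselves are made up.
cat > sample_awk.txt <<'EOF'
101|Meera|civil|72|Pass
102|Ravi|electrical|35|Fail
103|Mira|mechanical|68|Pass
EOF

# A comma between items inserts the output field separator (a space by default).
awk -F"|" '/Pass/ {print $2, $3}' sample_awk.txt

# Without the comma, the two fields are simply concatenated.
awk -F"|" '/Pass/ {print $2 $3}' sample_awk.txt
```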



"awk -F"|" '/Pass/ {printf "%4d %-25s %-15s \n", NR,$2,$3 }' awkdemo.txt" [enter]


We can provide formatted output by using the C style printf statement.


We can also number the output lines by using the built-in variable NR.


We will see more about built-in variables later.


We would write:

awk space -F"|" space '/Pass/{printf "%4d %-25s %-15s \n", NR,$2,$3 }' space awkdemo.txt


Press Enter.


Here NR stands for the number of records read so far, that is, the record number of the current line.


Record numbers are integers, hence we have used %d.


Name and stream are strings. So we have used %s.


Here 25s will reserve 25 spaces for the Name field.


15s will reserve 15 spaces for the Stream field.


The minus sign is used to left justify the output.
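
The printf formatting above can be tried on a stand-in file. This is a sketch under assumptions: sample_awk.txt and its rows are hypothetical, and the counter variable n in the second command is introduced here only to show one possible way of getting consecutive numbers, since NR counts every input line and not just the selected ones.

```shell
# Hypothetical sample data; rows 1 and 3 contain Pass.
cat > sample_awk.txt <<'EOF'
101|Meera|civil|72|Pass
102|Ravi|electrical|35|Fail
103|Mira|mechanical|68|Pass
EOF

# %4d right-justifies the number in 4 columns; %-25s and %-15s left-justify
# the strings in 25 and 15 columns. NR is the input record number, so the
# selected lines are numbered 1 and 3.
awk -F"|" '/Pass/ {printf "%4d %-25s %-15s \n", NR, $2, $3}' sample_awk.txt

# Assumed variant: a user-defined counter n (not used in the tutorial) numbers
# only the selected lines, giving 1 and 2.
awk -F"|" '/Pass/ {printf "%4d %-25s %-15s \n", ++n, $2, $3}' sample_awk.txt
```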



Display Slide 11

Acknowledgement Slide


Watch the video available at the link shown below

It summarises the Spoken Tutorial project

If you do not have good bandwidth, you can download and watch it

Display Slide 12

Spoken Tutorial Workshops


The Spoken Tutorial Project Team

Conducts workshops using spoken tutorials

Gives certificates to those who pass an online test

For more details, please write to

contact@spoken-tutorial.org

Display Slide 13

Acknowledgement


Spoken Tutorial Project is a part of the Talk to a Teacher project

It is supported by the National Mission on Education through ICT, MHRD, Government of India

More information on this Mission is available at: http://spoken-tutorial.org/NMEICT-Intro

No Last Slide for tutorials created at IITB

Display the previous slide only and narrate this line.

The script has been contributed by Sachin Patil.

This is Ashwini Patil from IIT Bombay signing off. Thank you for joining.

Contributors and Content Editors

Ashwini, Nancyvarkey