From Script | Spoken-Tutorial
Time Narration
00:01 Welcome to the spoken tutorial on the awk command.
00:05 In this tutorial we will learn the awk command.
00:09 We will do this through some examples.
00:12 To record this tutorial, I am using:

Ubuntu Linux 12.04 OS and GNU Bash v. 4.2.24

00:23 Please note, GNU Bash version 4 or above is recommended to practice this tutorial.
00:29 Let us start with an introduction to awk.
00:33 The awk command is a very powerful text manipulation tool.
00:38 It is named after its authors, Aho, Weinberger and Kernighan.
00:44 It can perform several functions.
00:46 It operates at the field level of a record.
00:51 So, it can easily access and edit the individual fields of the record.
00:56 Let us see some examples.
00:59 For demonstration purposes, we use the awkdemo.txt file.
01:04 Let us see the contents of the awkdemo.txt file.
01:09 Now open the terminal window by pressing the Ctrl, Alt and T keys simultaneously on your keyboard.
01:17 Let us see how to print using awk command.
01:22 Type: awk space (within single quotes) '/Pass/{print}' (after the quotes) space awkdemo.txt
01:38 Press Enter.
01:40 Here, Pass is the selection criterion.
01:44 All the lines of awkdemo.txt where Pass occurs are printed.
01:49 The action here is print.
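The pattern and action just described can be tried with the minimal sketch below. It assumes a hypothetical file awkdemo_sample.txt with invented records, because the contents of awkdemo.txt are not reproduced in this script.

```shell
# Hypothetical sample data; the real awkdemo.txt is not reproduced here.
cat > awkdemo_sample.txt <<'EOF'
1|Mira Kapoor|civil|72|Pass
2|Rohan Das|electrical|38|Fail
3|Sunil Verma|electronics|81|Pass
EOF

# /Pass/ is the selection criterion; {print} is the action run on each selected line.
passed=$(awk '/Pass/ {print}' awkdemo_sample.txt)
printf '%s\n' "$passed"
```

With this sample data, only the first and third records are printed, since the second record does not contain Pass.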
01:52 We can also use regular expressions in awk.
01:56 Say, we want to print records of students with name "Mira."
02:01 We would type:

awk space '/M[ei]*ra*/{print}' space awkdemo.txt

02:27 Press Enter.
02:29 "*" will match zero or more occurrences of the previous character or bracket expression.
02:33 Thus, entries with any number of occurrences of i, e and a at those positions will be listed.
02:40 For example,
02:42 Mira with M I R A
02:45 Meera with M double E R A
02:47 and Meeraa with M double E R double A
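To check which spellings the pattern accepts, here is a small sketch that feeds a few names to awk directly. The names Mr and Maria are hypothetical additions, included only to show the boundary cases of the pattern.

```shell
# One name per line; Mr and Maria are hypothetical test inputs.
matches=$(printf '%s\n' Mira Meera Meeraa Mr Maria | awk '/M[ei]*ra*/ {print}')
printf '%s\n' "$matches"
```

Mira, Meera and Meeraa are printed. Mr is printed too, because "*" also allows zero occurrences of e, i and a. Maria is not printed, because the letter after M is neither e, i nor r.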
02:52 awk supports extended regular expressions (ERE),
02:58 which means we can match alternative patterns separated by a PIPE.
03:03 Let me clear the prompt.
03:05 Type: awk space (within single quotes) '/civil|electrical/{print}' (after the quotes) space awkdemo.txt
03:23 Press Enter.
03:26 Now entries for both "civil" and "electrical" are given.
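A minimal sketch of matching alternatives with PIPE follows. It again assumes the hypothetical file awkdemo_sample.txt with invented records, not the real awkdemo.txt.

```shell
# Hypothetical sample data; the real awkdemo.txt is not reproduced here.
cat > awkdemo_sample.txt <<'EOF'
1|Mira Kapoor|civil|72|Pass
2|Rohan Das|electrical|38|Fail
3|Sunil Verma|electronics|81|Pass
EOF

# The PIPE inside the pattern means "civil OR electrical".
streams=$(awk '/civil|electrical/ {print}' awkdemo_sample.txt)
printf '%s\n' "$streams"
```

The electronics record is not listed, since it contains neither civil nor electrical.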
03:31 Let us go back to our slides.
03:34 Parameters: awk has some special parameters to identify individual fields of a line.
03:41 $1(Dollar 1) would indicate the first field.
03:45 Similarly we can have $2, $3 and so on for respective fields.
03:53 $0 represents the entire line.
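The field parameters can be tried on a single line of text, as in the sketch below. The input line is a hypothetical example. When no delimiter is specified, awk splits the line on white space.

```shell
# A hypothetical record with three white-space separated fields.
line='101 Mira civil'

first=$(printf '%s\n' "$line" | awk '{print $1}')   # first field
third=$(printf '%s\n' "$line" | awk '{print $3}')   # third field
whole=$(printf '%s\n' "$line" | awk '{print $0}')   # entire line

printf '%s\n' "$first" "$third" "$whole"
```

Here $1 gives 101, $3 gives civil and $0 gives the complete line.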
03:56 Let us come back to our terminal.
03:59 Note that each field is separated by a PIPE in the file awkdemo.txt.
04:05 In this case PIPE is called a delimiter.
04:09 A delimiter separates fields from each other.
04:13 A delimiter can also be a single white space.
04:16 To specify a delimiter, we have to give the minus capital F flag followed by a delimiter.
04:24 Let us see. Type: awk space minus capital F space (within double quotes) "|" space (within single quotes) '/civil|electrical/{print $0}' (after the quotes) space awkdemo.txt
04:51 Press Enter.
04:53 This prints the entire line since we have used $0.
04:58 Notice that the names and streams of students are the second and third fields.
05:04 Say we only want to print two fields.
05:08 We will replace $0 with $2 and $3 in the above command.
05:15 Press Enter.
05:18 Only two fields are shown.
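The delimiter flag and the field selection can be sketched together as follows. The file awkdemo_sample.txt and its records are hypothetical, and the comma between $2 and $3 is an assumption about how the two fields are listed in the print action.

```shell
# Hypothetical sample data; the real awkdemo.txt is not reproduced here.
cat > awkdemo_sample.txt <<'EOF'
1|Mira Kapoor|civil|72|Pass
2|Rohan Das|electrical|38|Fail
3|Sunil Verma|electronics|81|Pass
EOF

# -F "|" makes PIPE the delimiter, so $2 is the name and $3 is the stream.
# The comma in the print action puts a single space between the two fields.
fields=$(awk -F "|" '/civil|electrical/ {print $2, $3}' awkdemo_sample.txt)
printf '%s\n' "$fields"
```

Each field is followed directly by the next one, so the columns do not line up when the names differ in length.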
05:21 Though it gives the right result, the display is all jagged and unformatted.
05:26 We can provide formatted output by using the C-style printf statement.
05:32 We can also provide a serial number by using a built-in variable NR.
05:40 We will see more about built-in variables later.
05:44 Now type: awk space minus capital F (within double quotes) "|" (after the double quotes) space (within single quotes) '/Pass/{printf "%4d %-25s %-15s \n", NR, $2, $3}' (after the single quote) space awkdemo.txt
06:33 Press Enter. We see the difference.
06:37 Here, NR stands for number of records. It holds the number of the record being processed.
06:41 Record numbers are integers, hence we have written %d.
06:45 Name and Stream are strings. So we have used %s.
06:50 Here 25s will reserve 25 spaces for Name field.
06:55 15s will reserve 15 spaces for Stream field.
07:01 The minus sign is used to left justify the output.
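The formatted output can be reproduced with the sketch below. It assumes the hypothetical file awkdemo_sample.txt with invented records, not the real awkdemo.txt.

```shell
# Hypothetical sample data; the real awkdemo.txt is not reproduced here.
cat > awkdemo_sample.txt <<'EOF'
1|Mira Kapoor|civil|72|Pass
2|Rohan Das|electrical|38|Fail
3|Sunil Verma|electronics|81|Pass
EOF

# %4d    : record number, right justified in 4 positions
# %-25s  : name, left justified in 25 positions
# %-15s  : stream, left justified in 15 positions
table=$(awk -F "|" '/Pass/ {printf "%4d %-25s %-15s \n", NR, $2, $3}' awkdemo_sample.txt)
printf '%s\n' "$table"
```

Note that NR counts every record read from the file, not only the records that match. With this sample data the two printed lines are therefore numbered 1 and 3.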
07:05 This brings us to the end of this tutorial.
07:08 Let us move back to our slides.
07:10 Let us summarize. In this tutorial we learnt to print using awk,
07:16 to use regular expressions in awk, to list the entries for a particular stream,
07:21 to list only the second and the third fields
07:24 and to display a formatted output.
07:28 As an assignment, display roll no., stream and marks of Ankit Saraf.
07:34 Watch the video available at the link shown below.
07:37 It summarizes the Spoken Tutorial project.
07:40 If you do not have good bandwidth, you can download and watch it.
07:45 The Spoken Tutorial Project Team: Conducts workshops using spoken tutorials.
07:48 Gives certificates to those who pass an online test.
07:52 For more details, please write to
07:58 Spoken Tutorial Project is a part of the Talk to a Teacher project.
08:01 It is supported by the National Mission on Education through ICT,MHRD,Government of India.
08:07 More information on this Mission is available at: [1]
08:12 This is Ashwini Patil from IIT Bombay, signing off. Thank you for joining.

Contributors and Content Editors

Nancyvarkey, PoojaMoolya, Pratik kamble, Ranjana, Sandhya.np14