Difference between revisions of "Linux-AWK/C2/Built-in-Variables-in-awk/English-timed"
Latest revision as of 11:30, 10 July 2019
00:01 | Welcome to the spoken tutorial on awk built-in variables and awk script. |
00:07 | In this tutorial, we will learn about Built-in variables and awk script. |
00:14 | We will do this through some examples. |
00:17 | To record this tutorial, I am using:
Ubuntu Linux 16.04 Operating System and gedit text editor 3.20.1 |
00:30 | The files used in this tutorial are available in the Code Files link on this tutorial page.
Please download and use them. |
00:40 | To practice this tutorial, you should have gone through the earlier awk tutorials on this website. |
00:47 | If not, then please go through the corresponding tutorials on this website. |
00:52 | First, let us see some of the built-in variables in awk. |
00:57 | Capital RS specifies the record separator in an input file. By default, it is newline. |
01:07 | Capital FS specifies the field separator in an input file. |
01:13 | By default, the value of FS is a single space, which makes awk split fields on runs of whitespace. |
01:18 | Capital ORS defines the output record separator.
By default, it is newline. |
01:27 | Capital OFS defines the output field separator.
By default, it is a single space. |
01:36 | Let us understand the meaning of each of these. |
01:40 | Let us have a look at the awkdemo file now. |
01:44 | When we are processing this awkdemo file with 'awk' command, this becomes our input file. |
01:51 | Observe that all the records are separated from each other by a newline character. |
01:58 | Newline is the default value of the record separator variable RS.
So, there is no need to do anything else. |
02:08 | Notice that all the fields are separated by the pipe symbol.
How can we inform awk about it? Let us see. |
02:18 | By default, any number of spaces or tabs separate the fields. |
02:24 | We can reset this with the help of hyphen capital F option as learnt in our earlier tutorials. |
02:33 | Or else, we can reset this in the BEGIN section with the use of FS variable. |
02:40 | Let us do this through an example.
Suppose I want to find out the names of students who are getting a stipend of more than Rs.5000. |
02:51 | Open the terminal by pressing CTRL, ALT and T keys. |
02:57 | Go to the folder in which you downloaded and extracted the Code Files, using the cd command. |
03:04 | Type the command as shown here. |
03:08 | Here, in the BEGIN section, we have assigned the value of FS as a pipe symbol.
Similarly, we can modify RS variable. |
03:19 | Press Enter to execute the command. |
03:23 | The output shows the list of students who are receiving more than Rs.5000 as a stipend. |
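The command itself is not captured in this transcript. A minimal sketch of what setting FS in the BEGIN section could look like is given below. The sample rows, the column order (roll no, name, stream, marks, result, stipend) and the temporary file are assumptions made for illustration; they are not the contents of the actual awkdemo file.

```shell
# Hypothetical stand-in for awkdemo.txt: six pipe-separated fields per record.
# Assumed column order: roll no | name | stream | marks | result | stipend
data=$(mktemp)
cat > "$data" <<'EOF'
A001|Arun|Physics|72|Pass|6000
A002|Meena|Maths|81|Pass|9000
A003|Ravi|Chemistry|35|Fail|4000
EOF

# BEGIN runs before any input is read, so FS is set before the first record is split.
# $6 is the assumed stipend column and $2 the assumed name column.
out=$(awk 'BEGIN { FS = "|" } $6 > 5000 { print $2, $6 }' "$data")
printf '%s\n' "$out"
rm -f "$data"
```

With these made-up rows, only the two students whose assumed stipend exceeds 5000 are printed, with the name and stipend separated by a single space.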
03:30 | Here, the name field and the stipend field are separated by a blank space. |
03:36 | Also, all the records are separated by a newline character. |
03:42 | Suppose we want colon as the output field separator
and double newline as output record separator. |
03:52 | How can we do this? Let us see. |
03:55 | In the terminal, press the up arrow key to get the previously executed command. |
04:01 | Modify the command as shown here
and then press Enter. |
04:08 | We get the output in the desired format. |
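The modified command is not shown in this transcript. One way it might look is sketched below, assuming the same hypothetical pipe-separated sample rows and column order (roll no, name, stream, marks, result, stipend) rather than the real awkdemo file.

```shell
# Hypothetical sample data; the column order is an assumption.
data=$(mktemp)
cat > "$data" <<'EOF'
A001|Arun|Physics|72|Pass|6000
A002|Meena|Maths|81|Pass|9000
A003|Ravi|Chemistry|35|Fail|4000
EOF

# OFS is placed between the comma-separated items of print.
# ORS is written after every print, so "\n\n" leaves a blank line between records.
out=$(awk 'BEGIN { FS = "|"; OFS = ":"; ORS = "\n\n" } $6 > 5000 { print $2, $6 }' "$data")
printf '%s\n' "$out"
rm -f "$data"
```

Note that OFS only takes effect where print receives its items separated by commas; items joined by plain spaces are concatenated without any separator.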
04:12 | Now, suppose our new input file is sample.txt. |
04:18 | Observe that the field separator here is newline and record separator is double newline. |
04:27 | How can we extract the roll no. and name information from this file? |
04:32 | Yes, you have guessed correctly. We have to modify both the FS and RS variables. |
04:39 | Pause this tutorial and do this as an assignment. |
04:43 | Next, let us see other built-in variables. |
04:47 | Capital NR gives the Number of Records processed by awk. |
04:53 | Capital NF gives the Number of Fields in the current record. |
04:59 | Let us see one example on this.
Suppose, we want to find incomplete lines in the file. |
05:07 | Here, an incomplete line means one that has fewer than the normal 6 fields. |
05:13 | Switch to the terminal. Let me clear the terminal using Ctrl and L keys. |
05:20 | Type the command as shown. |
05:24 | As the fields are separated by pipe symbol, set the FS value to pipe symbol in the BEGIN section. |
05:33 | Next we have written NF not equal to 6. |
05:37 | This checks whether the number of fields in the current line is not equal to 6. |
05:43 | If true, the print section prints the record number NR, along with the entire line, denoted by $0.
Press Enter. |
05:55 | In the output, we can see that record number 16 is the incomplete record.
It has only 5 fields instead of 6. |
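As a rough illustration of the NF check described above, the sketch below uses three made-up records, the second of which has only 5 fields. The data is hypothetical, so the record number reported here differs from the one in the tutorial's output.

```shell
# Hypothetical sample data: the second record is missing its last field.
data=$(mktemp)
cat > "$data" <<'EOF'
A001|Arun|Physics|72|Pass|6000
A002|Meena|Maths|81|Pass
A003|Ravi|Chemistry|35|Fail|4000
EOF

# NF is recomputed for every record; the action runs only when it is not 6.
# NR is the running record number and $0 is the whole record.
out=$(awk 'BEGIN { FS = "|" } NF != 6 { print NR, $0 }' "$data")
printf '%s\n' "$out"
rm -f "$data"
```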
06:05 | Let us see one more example.
How can we print the first and last field for each student regardless of how many fields there are? |
06:16 | Type the command as shown here on the terminal. |
06:21 | Here we have used hyphen capital F option instead of setting FS variable.
Press Enter. |
06:30 | We get only the first and the last fields for each record in the file. |
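A possible form of this command is sketched below with hypothetical data in which the records have different numbers of fields. Since NF holds the field count of the current record, $NF always refers to its last field.

```shell
# Hypothetical sample data with a varying number of fields per record.
data=$(mktemp)
cat > "$data" <<'EOF'
A001|Arun|Physics|72|Pass|6000
A002|Meena|Maths|81|Pass
A003|Ravi|Chemistry|35|Fail|4000
EOF

# -F sets the field separator from the command line instead of a BEGIN section.
# $1 is the first field; $NF is the last field, however many fields there are.
out=$(awk -F'|' '{ print $1, $NF }' "$data")
printf '%s\n' "$out"
rm -f "$data"
```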
06:36 | Let’s try something else now. |
06:39 | Suppose the student records are distributed across two files, demo1.txt and demo2.txt. |
06:48 | We want to print the first 3 lines from each of these two files.
We can do this using NR variable. |
06:57 | Here are the contents of the two files. |
07:02 | Now, to display the first 3 lines from each file, type the following command on the terminal. |
07:11 | Press Enter. |
07:13 | The output shows only the first 3 records of demo1.txt file. |
07:20 | How can we print the same for the second file also? |
07:24 | The solution is to use FNR instead of NR.
FNR is the current record number in the current file. |
07:34 | FNR is incremented each time a new record is read. |
07:39 | It is re-initialized to zero each time a new input file is started. |
07:46 | But NR is the number of input records awk has processed since the start of the program's execution. |
07:55 | It does not reset to zero with a new file. |
07:59 | Switch to the terminal.
Press the up arrow key to get the previously executed command. |
08:06 | Modify the previous command as follows.
Type FNR instead of NR. |
08:14 | In the Print section, next to NR, type FNR. Press Enter. |
08:21 | See, we get the correct output now.
FNR is set to zero with new file but NR keeps on increasing. |
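The difference between NR and FNR can be sketched as follows. The two four-line files are made up for illustration and only borrow the names demo1.txt and demo2.txt; they do not reproduce the tutorial's files.

```shell
# Two hypothetical input files of four lines each, created in a temporary folder.
dir=$(mktemp -d)
printf 'a1\na2\na3\na4\n' > "$dir/demo1.txt"
printf 'b1\nb2\nb3\nb4\n' > "$dir/demo2.txt"

# NR keeps counting across files, so this matches only the first 3 records overall.
nr_out=$(awk 'NR <= 3 { print NR, $0 }' "$dir/demo1.txt" "$dir/demo2.txt")

# FNR restarts for every input file, so the first 3 records of each file match.
fnr_out=$(awk 'FNR <= 3 { print NR, FNR, $0 }' "$dir/demo1.txt" "$dir/demo2.txt")

printf '%s\n---\n%s\n' "$nr_out" "$fnr_out"
rm -rf "$dir"
```

In the second output, the first record of demo2.txt is reported with NR 5 and FNR 1: the per-file counter has restarted while the overall counter has continued.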
08:31 | Let us now look at some other built-in variables.
FILENAME variable gives the name of the file being read. |
08:40 | ARGC specifies the number of arguments provided at the command line. |
08:46 | ARGV represents an array that stores the command line arguments. |
08:52 | ENVIRON specifies the array of the shell environment variables and corresponding values. |
09:00 | As ARGV and ENVIRON use arrays in awk, we will look at those in subsequent tutorials. |
09:09 | Let us have a look at the variable FILENAME now.
How can we print the name of the current file being processed? |
09:18 | Switch to the terminal and type the command as shown. |
09:23 | Here we have used space as a string concatenation operator.
Press Enter to execute the command. |
09:32 | The output shows the input filename multiple times. |
09:37 | This is because this command prints the filename once for each row in the awkdemo.txt file.
How can we print this only once? |
09:48 | Clear the terminal.
Press the up arrow key to get the previously executed command. |
09:55 | Modify the previous command as shown here.
Press Enter. |
10:02 | Now, we get the filename only once. |
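The exact commands are not captured in this transcript. The sketch below shows one way to get both behaviours, using a hypothetical file named students.txt and the pattern FNR == 1 to print the name once per input file. The tutorial's own modification may differ, for example by printing in the END section instead.

```shell
# Hypothetical input file created in a temporary folder.
dir=$(mktemp -d)
cat > "$dir/students.txt" <<'EOF'
A001|Arun|Physics|72|Pass|6000
A002|Meena|Maths|81|Pass|9000
A003|Ravi|Chemistry|35|Fail|4000
EOF

# Placing two values side by side concatenates them; no operator symbol is needed.
# Without a pattern, the action runs for every record, so the name repeats.
every=$(cd "$dir" && awk '{ print "File: " FILENAME }' students.txt)

# FNR == 1 is true only for the first record of each file, so the name prints once.
once=$(cd "$dir" && awk 'FNR == 1 { print "File: " FILENAME }' students.txt)

printf '%s\n---\n%s\n' "$every" "$once"
rm -rf "$dir"
```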
10:06 | There are some other built-in variables in awk.
Please browse the internet to know more on them. |
10:14 | Suppose we want to find the students who have passed and have a stipend of more than Rs.8000, |
10:22 | use comma as the output field separator, and print “The data is shown for file” and the name of the file in the footer section.
How can we do this? |
10:36 | In the terminal, type the following command.
Press Enter. |
10:43 | We can see that only one student has passed and gets stipend more than Rs.8000.
And, the record number is 2. |
10:53 | We can also see the name of the file in the footer, as desired. |
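A combined command along these lines might look like the sketch below. The file name students.txt, the sample rows and the assumption that field 5 holds the result and field 6 the stipend are all hypothetical; with this made-up data the matching record also happens to be number 2.

```shell
# Hypothetical sample data.
# Assumed column order: roll no | name | stream | marks | result | stipend
dir=$(mktemp -d)
cat > "$dir/students.txt" <<'EOF'
A001|Arun|Physics|72|Pass|6000
A002|Meena|Maths|81|Pass|9000
A003|Ravi|Chemistry|35|Fail|4000
EOF

# BEGIN sets the separators, the pattern selects the records,
# and END prints the footer after all input has been read.
out=$(cd "$dir" && awk '
BEGIN { FS = "|"; OFS = "," }
$5 == "Pass" && $6 > 8000 { print NR, $2, $6 }
END { print "The data is shown for file " FILENAME }
' students.txt)
printf '%s\n' "$out"
rm -rf "$dir"
```

The footer uses concatenation rather than a comma, so that OFS is not inserted between the message and the file name.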
10:58 | We can use awk for more and more complex tasks. |
11:03 | In that case, it becomes more difficult to write the commands every time on the terminal. |
11:09 | We can instead write the awk program in a separate file. |
11:14 | By convention, that file is given the dot awk extension, although awk itself does not require it. |
11:19 | While executing, we can just specify this awk program filename with the awk command. |
11:26 | For doing so, we need to use hyphen small f option.
Let us see an example. |
11:35 | I have already written an awk program and saved it as prog1 dot awk. |
11:42 | This code is also available in the Code Files link. |
11:46 | Switch to the terminal.
See what we have written inside the single quotes of the last executed command. |
11:55 | Content of prog1.awk file is exactly the same. |
12:00 | The only difference is that in the awk file, we have not written inside the single quotes. |
12:07 | To execute the file, type the following on the terminal-
awk space hyphen small f space prog1.awk space awkdemo.txt and press Enter. |
12:24 | We are getting exactly the same output as we have seen before. |
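As an illustration of the hyphen small f option, the sketch below writes a program to a hypothetical file passed.awk and runs it on made-up data in students.txt. The actual contents of prog1.awk are not reproduced in this transcript, so the program shown here is only an assumed example.

```shell
# Hypothetical data file and program file, created in a temporary folder.
dir=$(mktemp -d)
cat > "$dir/students.txt" <<'EOF'
A001|Arun|Physics|72|Pass|6000
A002|Meena|Maths|81|Pass|9000
A003|Ravi|Chemistry|35|Fail|4000
EOF

# The program file holds what would otherwise go inside the single quotes.
cat > "$dir/passed.awk" <<'EOF'
BEGIN { FS = "|"; OFS = "," }
$5 == "Pass" && $6 > 8000 { print NR, $2, $6 }
END { print "The data is shown for file " FILENAME }
EOF

# -f tells awk to read the program text from the named file.
out=$(cd "$dir" && awk -f passed.awk students.txt)
printf '%s\n' "$out"
rm -rf "$dir"
```

Keeping the program in a file also removes the need for shell quoting, which makes longer programs easier to read and reuse.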
12:29 | So, this way you can write awk programs and use them multiple times. |
12:35 | This brings us to the end of this tutorial.
Let us summarize. |
12:40 | In this tutorial, we learnt about:
Built-in variables and awk script, using various examples. |
12:48 | As an assignment-
write an awk script to print the last field of the 5th line in awkdemo.txt file. |
12:58 | Open the system file /etc/passwd on the terminal. |
13:05 | Identify all the separators therein. |
13:09 | Now write a script to process the file from the 20th line onwards. |
13:15 | That too, only for the lines that contain more than 6 fields. |
13:20 | You should print the line number, entire line and count of fields in that particular line. |
13:28 | The video at the following link summarises the Spoken Tutorial project.
Please download and watch it. |
13:36 | The Spoken Tutorial Project team conducts workshops using spoken tutorials and gives certificates.
For more details, please write to us. |
13:47 | Please post your timed queries in this Forum. |
13:51 | Spoken Tutorial Project is funded by NMEICT, MHRD, Government of India.
More information on this mission is available at this link. |
14:03 | The script has been contributed by Antara. And this is Praveen from IIT Bombay, signing off.
Thanks for joining. |