Difference between revisions of "Linux-AWK/C2/Built-in-Variables-in-awk/English-timed"

From Script | Spoken-Tutorial

Latest revision as of 11:30, 10 July 2019

Time
Narration
00:01 Welcome to the spoken tutorial on awk built-in variables and awk script.
00:07 In this tutorial, we will learn about built-in variables and awk scripts.
00:14 We will do this through some examples.
00:17 To record this tutorial, I am using:

Ubuntu Linux 16.04 Operating System and gedit text editor 3.20.1

00:30 The files used in this tutorial are available in the Code Files link on this tutorial page.

Please download and use them.

00:40 To practice this tutorial, you should have gone through the earlier awk tutorials on this website.
00:47 If not, then please go through the corresponding tutorials on this website.
00:52 First, let us see some of the built-in variables in awk.
00:57 Capital RS specifies the record separator in an input file. By default, it is newline.
01:07 Capital FS specifies the field separator in an input file.
01:13 By default, the value of FS is a single space, which makes awk split fields on any run of spaces or tabs.
01:18 Capital ORS defines the output record separator.

By default, it is newline.

01:27 Capital OFS defines the output field separator.

By default, it is a single space.

01:36 Let us understand the meaning of each of these.
01:40 Let us have a look at the awkdemo file now.
01:44 When we process this awkdemo file with the awk command, it becomes our input file.
01:51 Observe that all the records are separated from each other by a newline character.
01:58 Newline is the default value of the record separator variable RS.

So, there is no need to do anything else.

02:08 Notice that all the fields are separated by the pipe symbol.

How can we inform awk about it?

Let us see.

02:18 By default, any number of spaces or tabs separate the fields.
02:24 We can reset this with the help of hyphen capital F option as learnt in our earlier tutorials.
02:33 Or else, we can reset this in the BEGIN section with the use of FS variable.
02:40 Let us do this through an example.

Suppose I want to find the names of the students who get a stipend of more than Rs. 5000.

02:51 Open the terminal by pressing CTRL, ALT and T keys.
02:57 Go to the folder in which you downloaded and extracted the Code Files using cd command.
03:04 Type the command as shown here.
03:08 Here, in the BEGIN section, we have assigned the value of FS as a pipe symbol.

Similarly, we can modify RS variable.

03:19 Press Enter to execute the command.
03:23 The output shows the list of students who are receiving more than Rs.5000 as a stipend.
03:30 Here, the name field and the stipend field are separated by a blank space.
03:36 Also, all the records are separated by a newline character.
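
The transcript does not reproduce the command typed on screen. A minimal sketch of what such a command could look like is given here. It assumes a hypothetical three-line stand-in for awkdemo.txt with the layout roll|name|stream|marks|result|stipend; the real file's columns may differ.

```shell
# Hypothetical stand-in for awkdemo.txt (assumed layout:
# roll|name|stream|marks|result|stipend); not the tutorial's actual data.
tmp=$(mktemp -d)
printf '%s\n' \
  'A001|Arjun Singh|Electrical|72|Pass|6000' \
  'A002|Kavita Shah|Civil|38|Fail|4500' \
  'B003|Meera Nair|Computer|91|Pass|9000' > "$tmp/awkdemo.txt"

# FS is set in BEGIN, before the first record is read, so every
# record is split on the pipe symbol. $2 is the name, $6 the stipend.
out=$(awk 'BEGIN { FS = "|" } $6 > 5000 { print $2, $6 }' "$tmp/awkdemo.txt")
echo "$out"

rm -rf "$tmp"
```

The comma in the print statement is what places the output field separator (a single space by default) between the name and the stipend.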
03:42 Suppose we want colon as the output field separator

and double newline as output record separator.

03:52 How can we do this? Let us see.
03:55 In the terminal, press the up arrow key to get the previously executed command.
04:01 Modify the command as shown here

and then press Enter.

04:08 We get the output in the desired format.
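
The modified command is not reproduced in the transcript. One way to get a colon between fields and a blank line between records is sketched here, again on a hypothetical stand-in for awkdemo.txt (assumed layout roll|name|stream|marks|result|stipend).

```shell
# Hypothetical stand-in for awkdemo.txt; not the tutorial's actual data.
tmp=$(mktemp -d)
printf '%s\n' \
  'A001|Arjun Singh|Electrical|72|Pass|6000' \
  'A002|Kavita Shah|Civil|38|Fail|4500' \
  'B003|Meera Nair|Computer|91|Pass|9000' > "$tmp/awkdemo.txt"

# OFS is written between the comma-separated print arguments,
# ORS is written after each print statement.
out=$(awk 'BEGIN { FS = "|"; OFS = ":"; ORS = "\n\n" }
           $6 > 5000 { print $2, $6 }' "$tmp/awkdemo.txt")
echo "$out"

rm -rf "$tmp"
```

Note that OFS only takes effect where print arguments are separated by commas; printing $0 unchanged would still show the original pipe symbols.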
04:12 Now, suppose our new input file is sample.txt.
04:18 Observe that the field separator here is newline and record separator is double newline.
04:27 How can we extract the roll no. and name information from this file?
04:32 Yes, you have guessed correctly. We have to modify both the FS and RS variables.
04:39 Pause this tutorial and do this as an assignment.
04:43 Next, let us see other built-in variables.
04:47 Capital NR gives the Number of Records processed by awk.
04:53 Capital NF gives the Number of Fields in the current record.
04:59 Let us see one example on this.

Suppose, we want to find incomplete lines in the file.

05:07 Here, an incomplete line means one that has fewer than the normal 6 fields.
05:13 Switch to the terminal. Let me clear the terminal using Ctrl and L keys.
05:20 Type the command as shown.
05:24 As the fields are separated by pipe symbol, set the FS value to pipe symbol in the BEGIN section.
05:33 Next we have written NF not equal to 6.
05:37 This checks whether the number of fields in the current line is not equal to 6.
05:43 If true, the print statement prints the record number NR along with the entire line, denoted by $0.

Press Enter.

05:55 In the output, we can see that record number 16 is the incomplete record.

It has only 5 fields instead of 6.
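
The check described above can be sketched as follows. The data is a hypothetical three-line file in which the second record is missing its last field; in the tutorial's own file the incomplete record is number 16.

```shell
# Hypothetical data: the second record has only 5 fields.
tmp=$(mktemp -d)
printf '%s\n' \
  'A001|Arjun Singh|Electrical|72|Pass|6000' \
  'A002|Kavita Shah|Civil|38|Fail' \
  'B003|Meera Nair|Computer|91|Pass|9000' > "$tmp/awkdemo.txt"

# NF is the field count of the current record, NR its record number.
out=$(awk 'BEGIN { FS = "|" } NF != 6 { print NR, $0 }' "$tmp/awkdemo.txt")
echo "$out"

rm -rf "$tmp"
```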

06:05 Let us see one more example.

How can we print the first and last field for each student regardless of how many fields there are?

06:16 Type the command as shown here on the terminal.
06:21 Here we have used hyphen capital F option instead of setting FS variable.

Press Enter.

06:30 We get only the first and the last fields for each record in the file.
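
A sketch of such a command, assuming hypothetical pipe-separated data with a varying number of fields. Because NF holds the number of fields, $NF always refers to the last field of the current record.

```shell
# Hypothetical data with 6, 5 and 6 fields respectively.
tmp=$(mktemp -d)
printf '%s\n' \
  'A001|Arjun Singh|Electrical|72|Pass|6000' \
  'A002|Kavita Shah|Civil|38|Fail' \
  'B003|Meera Nair|Computer|91|Pass|9000' > "$tmp/awkdemo.txt"

# -F sets the field separator from the command line instead of BEGIN.
out=$(awk -F'|' '{ print $1, $NF }' "$tmp/awkdemo.txt")
echo "$out"

rm -rf "$tmp"
```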
06:36 Let’s try something else now.
06:39 Suppose the student records are distributed across two files, demo1.txt and demo2.txt.
06:48 We want to print the first 3 lines from each of these two files.

We can do this using NR variable.

06:57 Here are the contents of the two files.
07:02 Now, to display the first 3 lines from each file, type the following command on the terminal.
07:11 Press Enter.
07:13 The output shows only the first 3 records of demo1.txt file.
07:20 How can we print the same for the second file also?
07:24 The solution is to use FNR instead of NR.

FNR is the current record number in the current file.

07:34 FNR is incremented each time a new record is read.
07:39 It is re-initialized to zero each time a new input file is started.
07:46 But NR is the number of input records awk has processed since the start of the program's execution.
07:55 It does not reset to zero with a new file.
07:59 Switch to the terminal.

Press the up arrow key to get the previously executed command.

08:06 Modify the previous command as follows.

Type FNR instead of NR.

08:14 In the Print section, next to NR, type FNR. Press Enter.
08:21 See, we get the correct output now.

FNR is reset to zero with each new file, but NR keeps on increasing.
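
The difference between the two variables can be sketched with two small files. The file contents here are hypothetical four-line stand-ins for demo1.txt and demo2.txt.

```shell
# Hypothetical stand-ins for demo1.txt and demo2.txt.
tmp=$(mktemp -d)
printf '%s\n' 'A001|Arjun' 'A002|Kavita' 'A003|Ravi' 'A004|Sunil' > "$tmp/demo1.txt"
printf '%s\n' 'B001|Meera' 'B002|Farid' 'B003|Latha' 'B004|Omkar' > "$tmp/demo2.txt"

# NR counts across both files, so only demo1.txt satisfies NR <= 3.
with_nr=$(awk 'NR <= 3 { print NR, $0 }' "$tmp/demo1.txt" "$tmp/demo2.txt")

# FNR restarts for each file, so the first 3 records of both files match.
with_fnr=$(awk 'FNR <= 3 { print NR, FNR, $0 }' "$tmp/demo1.txt" "$tmp/demo2.txt")

echo "$with_nr"
echo "---"
echo "$with_fnr"

rm -rf "$tmp"
```

In the second output, the NR column jumps from 3 to 5 because the fourth record of the first file is counted but not printed.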

08:31 Let us now look at some other built-in variables.

FILENAME variable gives the name of the file being read.

08:40 ARGC specifies the number of arguments provided at the command line.
08:46 ARGV represents an array that stores the command line arguments.
08:52 ENVIRON specifies the array of the shell environment variables and corresponding values.
09:00 As ARGV and ENVIRON are arrays, we will look at them in subsequent tutorials.
09:09 Let us have a look at the variable FILENAME now.

How can we print the name of the current file being processed?

09:18 Switch to the terminal and type the command as shown.
09:23 Here the strings are joined by writing them next to each other, separated by a space; awk has no explicit string concatenation operator.

Press Enter to execute the command.

09:32 The output shows the input filename multiple times.
09:37 This is because the command prints the filename once for each row in the awkdemo.txt file.

How can we print this only once?

09:48 Clear the terminal.

Press the up arrow key to get the previously executed command.

09:55 Modify the previous command as shown here.

Press Enter.

10:02 Now we get the filename only once.
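
Neither command is reproduced in the transcript. The sketch below shows one possible pair, using a hypothetical three-line stand-in for awkdemo.txt. Restricting the action to FNR == 1 is an assumption made here; the tutorial may have used a different approach to print the name once.

```shell
# Hypothetical stand-in for awkdemo.txt; not the tutorial's actual data.
tmp=$(mktemp -d)
printf '%s\n' \
  'A001|Arjun Singh|Electrical|72|Pass|6000' \
  'A002|Kavita Shah|Civil|38|Fail|4500' \
  'B003|Meera Nair|Computer|91|Pass|9000' > "$tmp/awkdemo.txt"

# The action runs for every record, so the name appears once per row.
every_row=$(cd "$tmp" && awk '{ print "File name is " FILENAME }' awkdemo.txt)

# Limiting the action to the first record of the file prints it once.
once=$(cd "$tmp" && awk 'FNR == 1 { print "File name is " FILENAME }' awkdemo.txt)

echo "$every_row"
echo "---"
echo "$once"

rm -rf "$tmp"
```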
10:06 There are some other built-in variables in awk.

Please browse the internet to learn more about them.

10:14 Suppose we want to find the students who have passed and have a stipend of more than Rs. 8000.
10:22 We also want to use a comma as the output field separator and print “The data is shown for file” followed by the name of the file in the footer section.

How can we do this?

10:36 In the terminal, type the following command.

Press Enter.

10:43 We can see that only one student has passed and gets a stipend of more than Rs. 8000.

And, the record number is 2.

10:53 We can also see the name of the file in the footer, as desired.
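
A sketch of a command that meets these requirements. The data, the column positions (result in field 5, stipend in field 6) and the value "Pass" are assumptions made for illustration.

```shell
# Hypothetical stand-in for awkdemo.txt (assumed layout:
# roll|name|stream|marks|result|stipend).
tmp=$(mktemp -d)
printf '%s\n' \
  'A001|Arjun Singh|Electrical|72|Pass|6000' \
  'B003|Meera Nair|Computer|91|Pass|9000' \
  'A002|Kavita Shah|Civil|38|Fail|8500' > "$tmp/awkdemo.txt"

# BEGIN sets the separators, the middle rule filters and prints,
# and END prints the footer after the last record has been read.
out=$(cd "$tmp" && awk 'BEGIN { FS = "|"; OFS = "," }
  $5 == "Pass" && $6 > 8000 { print NR, $2, $6 }
  END { print "The data is shown for file " FILENAME }' awkdemo.txt)
echo "$out"

rm -rf "$tmp"
```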
10:58 We can use awk for more and more complex tasks.
11:03 In that case, it becomes more difficult to write the commands every time on the terminal.
11:09 We can instead write the awk program in a separate file.
11:14 By convention, that file is given the dot awk extension, although awk itself does not require it.
11:19 While executing, we can just specify this awk program filename with the awk command.
11:26 For doing so, we need to use hyphen small f option.

Let us see an example.

11:35 I have already written an awk program and saved it as prog1 dot awk.
11:42 This code is also available in the Code Files link.
11:46 Switch to the terminal.

See what we wrote inside the single quotes of the last executed command.

11:55 The content of the prog1.awk file is exactly the same.
12:00 The only difference is that, in the awk file, the program is not enclosed in single quotes.
12:07 To execute the file, type the following on the terminal-

awk space hyphen small f space prog1.awk space awkdemo.txt and press Enter.

12:24 We are getting exactly the same output as we have seen before.
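
The idea can be sketched as follows. The contents of prog1.awk shown here are an assumption (the transcript does not list them), and the data file is a hypothetical stand-in for awkdemo.txt.

```shell
# Hypothetical stand-in for awkdemo.txt (assumed layout:
# roll|name|stream|marks|result|stipend).
tmp=$(mktemp -d)
printf '%s\n' \
  'A001|Arjun Singh|Electrical|72|Pass|6000' \
  'B003|Meera Nair|Computer|91|Pass|9000' \
  'A002|Kavita Shah|Civil|38|Fail|8500' > "$tmp/awkdemo.txt"

# The program text goes into the file as-is, without surrounding quotes.
cat > "$tmp/prog1.awk" <<'EOF'
BEGIN { FS = "|"; OFS = "," }
$5 == "Pass" && $6 > 8000 { print NR, $2, $6 }
END { print "The data is shown for file " FILENAME }
EOF

# -f tells awk to read the program from the named file.
out=$(cd "$tmp" && awk -f prog1.awk awkdemo.txt)
echo "$out"

rm -rf "$tmp"
```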
12:29 So, this way you can write awk programs and use them multiple times.
12:35 This brings us to the end of this tutorial.

Let us summarize.

12:40 In this tutorial, we learnt about built-in variables and awk scripts, using various examples.

12:48 As an assignment-

write an awk script to print the last field of the 5th line in awkdemo.txt file.

12:58 Open the system file /etc/passwd on the terminal.
13:05 Identify all the separators therein.
13:09 Now write a script to process the file from the 20th line onwards.
13:15 That too, only for the lines that contain more than 6 fields.
13:20 You should print the line number, entire line and count of fields in that particular line.
13:28 The video at the following link summarises the Spoken Tutorial project.

Please download and watch it.

13:36 The Spoken Tutorial Project team conducts workshops using spoken tutorials and gives certificates.

For more details, please write to us.

13:47 Please post your timed queries in this Forum.
13:51 Spoken Tutorial Project is funded by NMEICT, MHRD, Government of India.

More information on this mission is available at this link.

14:03 The script has been contributed by Antara. And this is Praveen from IIT Bombay, signing off.

Thanks for joining.

Contributors and Content Editors

PoojaMoolya, Sandhya.np14