Linux-AWK/C2/Variables-and-Operators-in-awk/English

From Script | Spoken-Tutorial
Revision as of 16:08, 13 November 2017 by Pravin1389 (Talk | contribs)


Title of script: Variables and Operators in Awk

Author: Antara Roy Choudhury

Keywords: Linux, awk commands, Variables in awk, operators in awk, String Concatenation in awk, Regular expressions in awk, BEGIN and END statement in awk, Spoken Tutorial, Video Tutorial.


Visual Cue
Narration
Slide 1: Introduction Hello and welcome to this spoken tutorial on variables and operators in awk.
Slide 2: Learning Objectives In this tutorial we will learn about-


  • User defined variables
  • Operators
  • BEGIN and END statements

We will do this through some examples.

Slide 3: System requirement To record this tutorial, I am using Ubuntu Linux 16.04
Slide 4: Prerequisite To practice this tutorial, you should have gone through the earlier Linux tutorials on this website.


You should be familiar with basic operators used in general programming languages like C or C++.


If not, then please go through the corresponding tutorials on our website.

Slide 5: awk awk combines the power of a filter and a programming language.


So, it supports variables, constants, operators, etc.

Slide 6: Variable Let’s see what a variable in awk is.


A variable is an identifier that references a value.


Awk supports both user-defined variables and built-in variables.


We will learn about user-defined variables in this tutorial.


For user-defined variables, variable declaration is not required.


Variables do not have to be initialized explicitly.


Awk automatically initializes them to zero or null string.
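As a minimal sketch of this behaviour, the command below uses two variables that are never assigned. The names n and s are chosen only for illustration; the echo command simply supplies one line of input so that the action runs once.

```shell
# Neither n nor s is assigned anywhere before it is used.
out=$(echo "any input" | awk '{
    print n + 0          # n in a numeric context: prints 0
    print "[" s "]"      # s in a string context: prints []
}')
echo "$out"
```

The same unassigned variable acts as zero in arithmetic and as an empty string when it is printed as text.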

Slide 7: Variable


A variable name must begin with a letter or an underscore and can continue with letters, digits and underscores.


Variables are case-sensitive.


So, Salary with capital “S” and salary with small “s” are two different variables.

Let us look at some examples now.
Open the terminal by pressing CTRL + ALT and T keys Open the terminal by pressing CTRL + ALT and T keys.
Type


awk '{x=1;X="A";a="awk";b="tutorial"[enter]

print x [enter]

print X [enter]

print a [enter]

print b [enter]

print a b

print x b

print x+X}' [enter]


On the terminal, type-


awk space opening single quote opening curly brace small x equal to 1 semicolon capital X equal to within double quotes capital A semicolon small a equal to within double quotes awk semicolon small b equal to within double quotes tutorial.


Press Enter.

type print x Press Enter

print capital X Press Enter

print a Press Enter

print b Press Enter

print a space b Press Enter

print x space b Press Enter

print small x plus capital X closing curly brace closing single quote and press Enter

a [enter] Since we have not given a filename, awk would need some input from stdin.


Hence, we can type any letter, say a, and then press Enter.

Hover your mouse over x=1; X="A" and a="awk" This example shows a couple of things.


Variables can be initialized with a number.


They can also be initialized with a single character or a string.


If the value is a character or a string, it must be written within double quotes.

Hover your mouse over the values of small x and capital X We can see the values of the variables.


Observe that small x and capital X are treated as different variables.


This proves that variables are case sensitive.

Hover your mouse over the output of “print a b” Also, it shows how two strings can be concatenated.


Here variables small a and small b are concatenated.


So, strings are concatenated simply by writing them next to each other, separated by a space.

Hover your mouse over the output of “print x b”


Similarly, when we concatenate small x, which is a number and string b, x is auto-converted into string.


And the concatenated output becomes 1tutorial.

Continue to hover your mouse over the output of “print x b” Why does the auto-conversion to string happen?


That's because awk finds x and b written next to each other with only a space between them, which means string concatenation.

Hover your mouse over the output of print x+X Now, look at the output of small x plus capital X.


Here, we have the arithmetic operator plus.


So, X, whose value "A" does not look like a number, is auto-converted to numeric zero.


And the addition output becomes numeric 1.
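The conversions described above can be tried in one short command. This is a sketch that reuses the variables from the example; the last print statement, with the string "25", is an extra illustration that is not part of the recorded demonstration.

```shell
out=$(echo "a" | awk '{
    x = 1; X = "A"; b = "tutorial"
    print x b          # number used as a string: 1tutorial
    print x + X        # "A" is not numeric, so it counts as 0: 1
    print "25" + 5     # a numeric-looking string is used as a number: 30
}')
echo "$out"
```

In each case the operator decides the conversion: placing values side by side gives a string, while plus gives a number.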

Until now, we have seen a couple of operators.


Let’s look at what other operators we can use.

Slide 8: Operators A variety of operators can be used in expressions.


Please pause the video here and take a look at all the operators mentioned here.


I assume you are familiar with these basic operators.


If not, then kindly visit our website for tutorials on operators in C and C++ series.

Same slide but highlight the relevant line item I am not going to discuss the working of all these operators in detail.


The only exception is the string matching operator, which may be new to you.


Let us understand this with an example.

A file named awkdemo.txt has been provided in the Code files link.


Please download it on your computer.

Switch to terminal


Switch to the terminal


Let us end the previous process by pressing Ctrl and D keys.


Let me clear the terminal.

cd /<saved folder> Now go to the folder in which you saved the awkdemo.txt file using the cd command.
Let us have a look at this file now.
Show the file opened Let's say we want to find the students who have passed but have marks less than 80.


In this case, we need to compare two different fields.


For such situations, we can use awk's relational operators.


These operators can compare both strings and numbers.

awk -F "|" '$5=="Pass" && $4<80 {print ++x,$2,$4,$5}' awkdemo.txt

[enter]

So, on the terminal type


awk space hyphen capital F within double quotes vertical bar space within single quotes dollar 5 equal to equal to within double quotes Pass space ampersand ampersand space dollar 4 less than 80 space within curly braces print space plus plus x comma dollar 2 comma dollar 4 comma dollar 5 space awkdemo.txt and press Enter.

Show the output of the previous command


This command shows a number of things.


One, we compare the fifth field with a string.


Second, we compare the fourth field with a number.


Third, we see that we can join two or more comparisons using the double ampersand (logical AND) operator.
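If awkdemo.txt is not at hand, the same command can be tried on a few made-up records. The four records below are hypothetical; they only assume the layout that the command relies on, with the name in field 2, marks in field 4 and the result in field 5.

```shell
# Hypothetical records: roll number|name|department|marks|result|stipend
out=$(printf '%s\n' \
    'S01|Mira Sen|Computers|78|Pass|8000' \
    'S02|Arun Rao|Electrical|85|Pass|6000' \
    'S03|Lata Iyer|computers|35|Fail|0' \
    'S04|Dev Nair|Mechanical|62|Pass|4000' |
  awk -F "|" '$5=="Pass" && $4<80 {print ++x,$2,$4,$5}')
echo "$out"
```

Only the two records that satisfy both conditions are printed, and plus plus x numbers them as 1 and 2.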


Instead of specific numbers or strings, we can also compare a field against a regular expression.


As we have seen in the slide, we have the tilde and the exclamation tilde operators for this purpose.

Show the awkdemo.txt file and highlight Computers with different case. Now suppose, we want to find computer science students who have passed.


Since computers can have both a small and capital C, we would use a regular expression.

Type:


awk -F "|" '$5=="Pass" && $3~/[cC]omputers/ {print ++x,$2,$3,$5}' awkdemo.txt

We would type


awk space hyphen capital F within double quotes pipe symbol space within single quotes dollar 5 equal to equal to within double quotes Pass space ampersand ampersand space dollar 3 tilde slash within square brackets small c capital C followed by omputers slash space within curly braces print space plus plus x comma dollar 2 comma dollar 3 comma dollar 5 space awkdemo.txt and press Enter.

If we want to negate the comparison, we can do so using the exclamation tilde operator.
Type:


awk -F "|" '$5=="Pass" && $3!~/[cC]omputers/ {print ++x,$2,$3,$5}' awkdemo.txt

Say now we want a list of all non-computer students who passed.


Use the up arrow to get the previous command.


Next to dollar 3 add exclamation symbol and press Enter.
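The two matching operators can be compared side by side on a few made-up records. The records and the shell variable names below are hypothetical and only assume the department in field 3 and the result in field 5.

```shell
# Hypothetical records: roll number|name|department|marks|result|stipend
records=$(printf '%s\n' \
    'S01|Mira Sen|Computers|78|Pass|8000' \
    'S02|Arun Rao|Electrical|85|Pass|6000' \
    'S03|Lata Iyer|computers|35|Fail|0' \
    'S05|Ravi Das|computers|91|Pass|9000')

# Tilde: field 3 matches the regular expression.
matched=$(printf '%s\n' "$records" |
  awk -F "|" '$5=="Pass" && $3~/[cC]omputers/ {print $2}')

# Exclamation tilde: field 3 does not match the regular expression.
others=$(printf '%s\n' "$records" |
  awk -F "|" '$5=="Pass" && $3!~/[cC]omputers/ {print $2}')

echo "Match: $matched"
echo "No match: $others"
```

Both spellings of the department are caught by the first command, and the failed student is excluded from both lists by the first condition.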

Next, let's count the number of blank lines in the same file.
Show the awkdemo.txt file and highlight blank lines Open the file and check how many blank lines there are.


So, it has 3 blank lines.

Type

awk '/^$/ {x=x+1; print x}' awkdemo.txt [enter]


Now to count the number of empty lines using awk, type


awk space within single quote within front slash caret symbol dollar space within curly braces x equal to x plus 1 semicolon space print x space awkdemo.txt


Press Enter.

Highlight the output We get 3 as our final answer.
Highlight the caret sign (^) >> then the dollar($) sign The caret sign signifies start of a line while dollar signifies the end of a line.
Highlight caret-dollar (^$). Hence an empty line would be matched by the regular expression caret-dollar.
Highlight x Note, we have not initialized the value of x.


Awk has automatically initialized x to zero.

Highlight the output of the command executed This command gives us the running count of blank lines.


This is because every time a blank line is found, x would be incremented and then printed.
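The running count is easy to see on a small made-up input. In this sketch, printf supplies six lines, three of which are blank.

```shell
out=$(printf 'first\n\nsecond\n\n\nthird\n' | awk '/^$/ {x=x+1; print x}')
echo "$out"
```

One number is printed for each blank line as it is found, so the last number printed is the total.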

Slide 9: BEGIN and END sections In our last command, we have seen running count of blank lines.


But say we only want to print the total number of blank lines.


Then we need to print x only once, after the entire file has been traversed.


We may also want to give a heading saying what the output means.


For such requirements awk provides the BEGIN and the END sections.

Slide 10: BEGIN and END sections The BEGIN section contains procedures for pre-processing.


This section is executed before the main input loop is executed.


The END section can contain procedures for

post-processing.


This section is executed after the main input loop has terminated.


The BEGIN and END procedures are optional.

Type


awk 'BEGIN{print "The number of empty lines in awkdemo are"} [press enter]

/^$/ {x=x+1} [press enter]

end {print x}' awkdemo.txt[press enter]


Let's learn how to do this.


In the terminal type


awk space opening single quote BEGIN in caps within curly braces print space within double quotes The number of empty lines in awkdemo are


press Enter.


within front slash caret symbol dollar symbol space within curly brace x equal to x plus 1


press Enter.


end space within curly braces print space x close single quote space awkdemo.txt and press Enter.

Highlight the output See, we did not get the desired output!


We should get the output as 3, because we have 3 blank lines in the file.

Point to end in lowercase


What do you think happened?


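Actually, we should have written end in upper case, as END.


Since awk is case-sensitive, it treated lowercase end as an ordinary variable used as a pattern.


This variable was never assigned a value, so the pattern never matched and x was never printed.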

So, let us modify the command.
awk 'BEGIN{print "The number of empty lines in awkdemo are"} [press enter]

/^$/ {x=x+1} [press enter]

END {print x}' awkdemo.txt [press enter]

Press up arrow key to get the previous executed command on the terminal.


Now modify lower case end to upper case END.


And press Enter.

Highlight the output Now the total number of empty lines is displayed in the output.
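The difference between the two spellings can be checked on a small made-up input with three blank lines. This is a sketch; the heading text and the shell variable names are chosen only for illustration.

```shell
# Lowercase end is an ordinary, unassigned variable, so this pattern never matches.
lowercase=$(printf 'first\n\nsecond\n\n\nthird\n' |
  awk '/^$/ {x=x+1} end {print x}')

# Uppercase END runs once, after all the input has been read.
uppercase=$(printf 'first\n\nsecond\n\n\nthird\n' |
  awk 'BEGIN {print "Blank lines:"} /^$/ {x=x+1} END {print x}')

echo "lowercase end: [$lowercase]"
echo "uppercase END: $uppercase"
```

The first command prints nothing at all, while the second prints the heading followed by the total.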
Next, let's find the average stipend of all the students listed in the awkdemo.txt file.
Type

awk -F"|" 'BEGIN {printf "%-3s %-25s %10s\n", "Sl","Student","Stipend"} [enter]

{x=x+1; total = total+$6; printf "%-3d %-25s %10d\n",x,$2,$6} [enter]

END{avg=total/x; print "The average stipend is " avg}' awkdemo.txt [Enter]

To get that, type the command as shown in the terminal


And press Enter.


And we get the desired output.
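One point to keep in mind is that an action without a pattern runs for every line, including blank lines. The sketch below uses hypothetical records with one blank line and compares the average with and without a guard that skips blank lines. Whether blank lines should be skipped is an assumption about what we want, not something the command above does.

```shell
# Hypothetical records with one blank line; the stipend is in field 6.
records=$(printf '%s\n' \
    'S01|Mira Sen|Computers|78|Pass|8000' \
    '' \
    'S02|Arun Rao|Electrical|85|Pass|6000' \
    'S04|Dev Nair|Mechanical|62|Pass|4000')

# Every line is counted, so the blank line lowers the average.
all_lines=$(printf '%s\n' "$records" |
  awk -F"|" '{x=x+1; total=total+$6}
             END {avg=total/x; print "The average stipend is " avg}')

# Only lines that are not empty are counted.
data_only=$(printf '%s\n' "$records" |
  awk -F"|" '$0 !~ /^$/ {x=x+1; total=total+$6}
             END {avg=total/x; print "The average stipend is " avg}')

echo "$all_lines"
echo "$data_only"
```

With these records the total is 18000 in both cases, but it is divided by 4 in the first command and by 3 in the second.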

Slide 11: Summary This brings us to the end of this tutorial. Let us summarize.


In this tutorial we learnt about

  • User defined variables in awk
  • Operators
  • BEGIN and END statements


Slide 12: Assignment


As an assignment, print every line where the value of the last field is more than 5000


and the student belongs to the Electrical department.


Print the average marks of all the students with the heading “Average marks” in the output.

Slide 13: About Spoken Tutorial project The video at the following link summarises the Spoken Tutorial project.


Please download and watch it.

Slide 14: Spoken Tutorial workshops The Spoken Tutorial Project team conducts workshops using spoken tutorials


and gives certificates on passing online tests.


For more details, please write to us.

Slide 15: Forum for specific questions: Do you have questions in THIS Spoken Tutorial?


Please visit this site.


Choose the minute and second where you have the question.


Explain your question briefly.


Someone from our team will answer it.

Slide 16: Forum for specific questions: The Spoken Tutorial forum is for specific questions on this tutorial.

Please do not post unrelated and general questions on it.


This will help reduce the clutter.


With less clutter, we can use these discussions as instructional material.

Slide 17: Acknowledgement Spoken Tutorial Project is funded by NMEICT, MHRD, Government of India.


More information on this mission is available at

this link.

Slide 18: Thank You The script has been contributed by Antara and this is Praveen from IIT Bombay signing off.


Thank you for joining.

Contributors and Content Editors

Nancyvarkey, Pravin1389