Linux-AWK/C2/MultiDimensional-Array-in-awk/English-timed

Revision as of 20:53, 10 July 2019 by Sandhya.np14 (Talk | contribs)

Time Narration
00:01 Hello and welcome to the spoken tutorial on creating multidimensional arrays in awk.
00:07 In this tutorial, we will learn to:

create a multidimensional array in awk and scan a multidimensional array.

00:18 We will do this through some examples.
00:21 To record this tutorial, I am using:

Ubuntu Linux 16.04 Operating System and gedit text editor 3.20.1

00:33 You can use any text editor of your choice.
00:37 To practice this tutorial, you should have gone through the previous awk tutorials on arrays on this website.
00:45 You should have some basic knowledge of any programming language like C or C++.
00:52 If not, then please go through the corresponding tutorials on our website.
00:58 The files used in this tutorial are available in the Code Files link on this tutorial page.

Please download and extract them.

01:08 What is a multidimensional array in awk?
01:12 We know that in single dimensional arrays, an array element is identified by a single index.
01:19 For example, array week is identified by a single index, day.
01:26 However, in a multidimensional array, an element is identified by a sequence of multiple indices.
01:34 For example, a two dimensional array element is identified by a sequence of 2 indices.
01:42 Here, multiple indices are concatenated into a single string, with a separator between them.
01:50 The separator is the value of the built-in variable SUBSEP.
01:55 The combined string is used as a single index for a simple one dimensional array.
02:01 For example, suppose we write multi within square brackets 4 comma 6 equal to value in double quotes, that is, multi[4,6] = "value".
02:11 Here, multi is the name of multi-dimensional array.

Then, the numbers 4 and 6 are converted to strings.

02:21 Suppose, the value of SUBSEP is hash symbol (#).
02:26 Then, those numbers are concatenated with a hash (#) symbol between them.
02:32 So, the array element multi within square brackets within double quotes 4 hash 6, that is, multi["4#6"], is set to value within double quotes.
02:43 The default value of SUBSEP is the string within double quotes backslash 034, that is, "\034".
02:50 It is actually a nonprinting character.

It usually does not appear in input data.
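The index concatenation described above can be tried directly in the terminal. This is only a minimal sketch: it sets SUBSEP to the hash symbol purely for illustration, as the narration supposes, and reuses the array name multi from the example.

```shell
awk 'BEGIN {
    SUBSEP = "#"              # assumed separator, as supposed in the narration
    multi[4,6] = "value"      # stored under the single string index "4#6"
    print multi["4#6"]        # prints: value
    for (k in multi) print k  # prints: 4#6
}'
```

The loop shows that awk holds only one index, the combined string, for this element.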

02:58 Let us try to declare a two dimensional array as shown in the slide.
03:03 Row 1 contains two elements A and B.
03:08 Row 2 has two elements C and D.
03:12 Open the terminal by pressing Ctrl, Alt and T keys.
03:17 Go to the folder in which you have downloaded and extracted the Code Files using 'cd' command.
03:24 Now, define the array as follows. Type the command carefully as shown here.

Then press Enter.

03:35 We get a command prompt back without any error.

So, the array is defined.

03:41 We do not get any output because we have not given anything to print in the code.
03:47 Let us add the print statement.
03:50 Press the up arrow key to get the previously executed command in the terminal.
03:56 Just before the closing curly bracket, type: semicolon space print space a within square brackets 2 comma 2.

Press Enter to execute the command.

04:13 Notice, we get the output as capital D.
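The exact command typed in the video is not reproduced in this script. A minimal sketch of such a command, assuming the array is named a (as in the print statement) and holds the values A, B, C and D described for the slide, could look like this:

```shell
# Define a 2 by 2 array in the BEGIN section, then print the element at row 2, column 2.
awk 'BEGIN { a[1,1] = "A"; a[1,2] = "B"; a[2,1] = "C"; a[2,2] = "D"; print a[2,2] }'
# prints: D
```

Without the print statement, the same command defines the array and returns to the prompt with no output.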
04:18 How do we test whether a particular index sequence exists in a given multidimensional array?
04:25 We can use the in operator.
04:28 We have already seen it with single-dimensional arrays earlier in this series.
04:34 We have to write the entire sequence of indices within parentheses and separated by commas.
04:42 Let us see this in an example.
04:45 I have already written a script named test_multi.awk.
04:51 The same is available in the Code Files link of this tutorial page.
04:56 I have defined a 2 by 2 array as seen in our previous discussion.
05:02 Then I have written two 'if' conditions.
05:06 The first if condition checks whether the element at the index one comma one, is present or not.
05:13 We have to write the index for multidimensional array within parentheses.
05:18 If the condition is true, it will print one comma one is present.
05:23 Else, it will print one comma one is absent.
05:28 Similarly, we will check for the presence of the element at index three comma one.

Let us execute the file.

05:36 Switch to the terminal and type: awk space hyphen small f space test underscore multi dot awk and press Enter.
05:49 The output says: one comma one is present and three comma one is absent.
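The contents of test_multi.awk are available only in the Code Files, so the following is a sketch of what such a test might look like, not the original file. The array name a and the exact wording of the messages are assumptions. It is written inline here; the same BEGIN block could be saved in a file and run with the hyphen f option.

```shell
awk 'BEGIN {
    a[1,1] = "A"; a[1,2] = "B"
    a[2,1] = "C"; a[2,2] = "D"

    # The whole index sequence goes inside parentheses, before the in operator.
    if ((1,1) in a) { print "(1,1) is present" } else { print "(1,1) is absent" }
    if ((3,1) in a) { print "(3,1) is present" } else { print "(3,1) is absent" }
}'
# prints:
# (1,1) is present
# (3,1) is absent
```

Testing with the in operator does not create the element, whereas referring to a[3,1] directly would create an empty one.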
05:55 Let us take one more example.

Say, we want to create the transpose of a matrix.

06:02 The transpose of a given matrix is formed by interchanging the rows and columns of a matrix.

How can we do this?

06:11 I have created a two-dimensional array matrix in the file 2D-array.txt.
06:19 I have written a script named transpose.awk.
06:24 First, look at the action section of this awk script.
06:29 Here, we are calculating the maximum number of fields in a row.

We store the calculated value in the variable max_nf.

06:40 As we know, NR is the number of records awk has read so far, that is, the number of the current record.

The value of NR is stored in the variable max_nr.

06:50 Awk will process the input file from the first record to the last record.
06:56 When awk is processing the first record, max_nr will be equal to 1.
07:03 While processing the second record, max_nr will be 2, and it continues this way.
07:11 When awk is processing the last record, max_nr will store the total number of records.
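The way max_nr and max_nf change from record to record can be watched with a small sketch. The three input lines here are made up for illustration and are not the contents of 2D-array.txt. Only the variable names max_nf and max_nr come from the narration.

```shell
printf 'a b\nc d e\nf\n' | awk '{
    if (NF > max_nf) max_nf = NF   # remember the largest field count seen so far
    max_nr = NR                    # NR is the number of the current record
    print "record", NR, "max_nr =", max_nr, "max_nf =", max_nf
}'
# prints:
# record 1 max_nr = 1 max_nf = 2
# record 2 max_nr = 2 max_nf = 3
# record 3 max_nr = 3 max_nf = 3
```

After the last record, max_nr holds the total number of records and max_nf holds the widest row.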
07:19 Now, we should read the data from the input file and store it in a two dimensional array.
07:26 Inside the 'for' loop, we have an iterator variable x.
07:31 x will traverse from one to NF and x will be incremented by 1 after each iteration.
07:39 For every value of x, $x (dollar x) represents the value at field x.
07:46 That value will be stored in array matrix at index NR comma x.
07:53 For example, matrix of 1 comma 1 stores the value of field 1 of record 1 from the input file.
08:02 So, after awk has processed the entire input file with this code, the matrix array will be completely formed.
08:10 It will hold the entire data of the input file in a two dimensional array format.
08:16 Now, let us look inside the END section.
08:20 We have written a nested for loop to print the transpose of the matrix.
08:25 I assume your familiarity with basic C programming.

So, I am not explaining this portion of code in detail.

08:34 Pause the video here to look at the code in detail and understand on your own.
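For reference, here is one way the logic described above could be written. It is a sketch under stated assumptions and not the transpose.awk file from the Code Files: the two input rows are made up in place of 2D-array.txt, and the loop variables row and col are hypothetical names. The names matrix, max_nf, max_nr and x follow the narration.

```shell
printf '1 2 3\n4 5 6\n' | awk '
{
    if (NF > max_nf) max_nf = NF      # widest row seen so far
    max_nr = NR                       # number of rows read so far
    for (x = 1; x <= NF; x++)
        matrix[NR, x] = $x            # store field x of the current record
}
END {
    # Print column by column, so that rows and columns are interchanged.
    for (col = 1; col <= max_nf; col++)
        for (row = 1; row <= max_nr; row++)
            printf "%s%s", matrix[row, col], (row < max_nr ? " " : "\n")
}'
# prints:
# 1 4
# 2 5
# 3 6
```

The outer loop runs over columns and the inner loop over rows, which is what swaps the two dimensions in the output.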
08:40 Now, we will learn how to scan a multidimensional array.
08:45 Awk does not have a multi-dimensional array in the truest sense.
08:50 So, there cannot be any special 'for' statement to scan the multidimensional array.
08:56 However, you can still scan an array in a multidimensional way.
09:00 You can combine the 'for' statement with the split function for this.
09:05 Let us see what the split function is.

The split function is used to chop up or split a string into pieces

09:14 and place the various pieces into an array.
09:18 The syntax is as follows. First argument contains the string to be chopped.
09:25 The second argument specifies the name of the array in which split() will put the chopped pieces.
09:33 The third argument mentions the separator that will be used to chop the string up.
09:39 The first piece is stored in arr[1],
09:43 the second piece in arr[2] and so forth.
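A short sketch of split in action is given here. The string and the colon separator are made up for illustration. The array name arr follows the narration, and n is a hypothetical variable that receives the return value of split(), which is the number of pieces.

```shell
awk 'BEGIN {
    n = split("10:20:30", arr, ":")   # chop the string at every colon
    print n                           # prints: 3
    print arr[1]                      # prints: 10
    print arr[2]                      # prints: 20
    print arr[3]                      # prints: 30
}'
```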
09:48 Suppose, we want to recover the original sequence of indices from an already created array.

How can we do this?

09:56 I have written a script named multi_scan.awk.
10:02 The entire code is written inside the BEGIN section.
10:06 First we have created an array named a and assigned these values to it.
10:12 Then we have the for loop with an iterator.
10:16 In each iteration, the iterator will be set to one of the index values:

say, 1 comma 1, then 1 comma 2 and so on.

10:27 The split() function breaks the iterator into pieces separated by SUBSEP.
10:34 The pieces will be stored in the array arr.
10:38 So, arr[1] and arr[2] will contain the first index and second index respectively.

Let us execute this file.

10:48 Switch to the terminal and type- awk space hyphen small f space multi underscore scan dot awk

and press Enter.

11:01 See the output; the original sequence of indices is recovered.
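The scan described above can be sketched as follows. This is not the multi_scan.awk file from the Code Files: the array values are assumed to be the same A, B, C and D used earlier, and the output layout is a guess. The names a, iterator and arr follow the narration. Since awk does not guarantee the order in which a for loop visits the indices, the output is piped through sort here to make it predictable.

```shell
awk 'BEGIN {
    a[1,1] = "A"; a[1,2] = "B"; a[2,1] = "C"; a[2,2] = "D"
    for (iterator in a) {
        split(iterator, arr, SUBSEP)        # break the combined index at SUBSEP
        print arr[1], arr[2], a[iterator]   # first index, second index, value
    }
}' | sort
# prints:
# 1 1 A
# 1 2 B
# 2 1 C
# 2 2 D
```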
11:07 Let us summarize. In this tutorial, we learnt to create a multidimensional array in awk and

scan a multidimensional array.

11:18 As an assignment,

write an awk script to rotate a two dimensional array by 90 degrees and print the rotated matrix.

11:28 The video at the following link summarises the Spoken Tutorial project.

Please download and watch it.

11:36 The Spoken Tutorial Project team conducts workshops using spoken tutorials.

And, gives certificates on passing online tests.

11:45 For more details, please write to us.
11:49 Please post your timed queries in this forum.
11:53 Spoken Tutorial Project is funded by NMEICT, MHRD, Government of India.

More information on this mission is available at this link.

12:05 The script has been contributed by Antara. And this is Praveen from IIT Bombay, signing off.

Thank you for joining.

Contributors and Content Editors

PoojaMoolya, Sandhya.np14