Difference between revisions of "R/C2/Operations-on-Matrices-and-Data-Frames/English"

 

Revision as of 23:30, 13 March 2019

Title of script: Operations on Matrices and Data Frames

Author: Shaik Sameer (IIIT Vadodara) and Sudhakar Kumar (IIT Bombay)

Keywords: R, RStudio, matrices, data frames, adding row, adding column, video tutorial


Visual Cue Narration
Show slide

Opening slide

Welcome to the spoken tutorial on Operations on Matrices and Data Frames.
Show slide

Learning Objectives

In this tutorial, we will learn how to:
  • Perform operations on matrices
  • Add rows or columns to a data frame
Show slide

Pre-requisites

To understand this tutorial, you should know
  • Data frames and Matrices in R
  • R script in RStudio
  • How to set working directory in RStudio

If not, please locate the relevant tutorials on R on this website.

Show slide

System Specifications

This tutorial is recorded on
  • Ubuntu Linux OS version 16.04
  • R version 3.4.4
  • RStudio version 1.1.456

Install R version 3.2.0 or higher.

Show slide

Download Files

For this tutorial, we will use the data frame CaptaincyData.csv and the script file myMatrix.R.

Please download these files from the Code files link of this tutorial.

[Computer screen]

Highlight CaptaincyData.csv and myMatrix.R in the folder MatrixOps

I have downloaded and moved these files to MatrixOps folder in myProject folder on the Desktop.

I have also set MatrixOps folder as my Working directory.

Let us switch to RStudio.
Click myMatrix.R in RStudio

Point to myMatrix.R in RStudio

Open the script myMatrix.R in RStudio.

For this, click on the script myMatrix.R.

Script myMatrix.R opens in RStudio.

Highlight matrixA in the Source window

Recall that we had created a matrix named matrixA.

This matrix was extracted as a subset from the captaincy data frame.

We will use matrixA here also.

Highlight the Source button

Run this script myMatrix.R to load the values in the Workspace.

Highlight captaincy in the Source window

The captaincy data frame opens in the Source window.

I will resize the Source window.

Highlight matrixA in the Environment window

Now let us learn how to find the inverse of matrixA.

Click on the script myMatrix.R
[RStudio]

solve(matrixA)

In the Source window, type solve, within parentheses matrixA.

Press Enter.

Press Enter at the end of every command.

Press Ctrl + S >> Press Ctrl+Enter keys.

Save the script and run this line.

Highlight the output in the Console window

The inverse of matrixA is shown in the Console window.
For more information on calculating inverse of a matrix in R, please refer to the Additional materials section on this website.
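
As a minimal sketch of the step above, the following checks an inverse returned by solve(). The matrix m is hypothetical; its values are assumed for illustration and are not those of matrixA. Multiplying a matrix by its inverse should give the identity matrix, apart from small rounding errors.

```r
# Hypothetical invertible 3 x 3 matrix; NOT the tutorial's matrixA.
m <- matrix(c(2, 0, 1,
              1, 3, 0,
              0, 1, 4), nrow = 3, byrow = TRUE)

# solve() with a single matrix argument returns the inverse of that matrix.
mInverse <- solve(m)

# The product should be the identity matrix, apart from floating-point error.
product <- m %*% mInverse
print(round(product, 10))

# all.equal() compares with a small numerical tolerance.
isIdentity <- isTRUE(all.equal(product, diag(3)))
print(isIdentity)
```

Note that solve() stops with an error if the matrix is singular, so this check only applies to invertible matrices.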
[RStudio]

Highlight matrixA in the Environment window

Now let us calculate the sum of all the elements in matrixA.

First, we shall use nested for loops for calculating this sum.

Also, we shall estimate the time taken to calculate the sum in this way.

[RStudio]

# Calculating sum using for loop

Let us add a comment Calculating sum using for loop in the script.

In the Source window, type # hash space Calculating sum using for loop. Press Enter.

[RStudio]

startTime <- Sys.time()

To calculate the time taken, we record the present time.

In the Source window, type startTime then press Alt and -(hyphen) keys simultaneously.

Next, type Sys dot time followed by parentheses.

Highlight Sys.time in the Source window

Sys.time() is used to find the absolute date-time value in the current time zone.
[RStudio]

totalSum <- 0

Let us initialise a variable totalSum.

In the Source window, type totalSum then press Alt and -(hyphen) keys simultaneously.

Then type 0.

This variable will store the sum of elements in matrixA.

[RStudio]

for(i in 1:3) {

}

We will create two for loops to iterate through the rows and columns of matrixA.

In the Source window, type for in parentheses i space in space 1 colon 3 space

Now type opening curly bracket. RStudio automatically adds a closing curly bracket.

[RStudio]

for(j in 1:3) {

totalSum <- totalSum + matrixA[i,j]

}

Press Enter just after the opening curly bracket and type the following command.
[RStudio]

print(totalSum)

Press Enter after the closing curly bracket of outer loop for(i in 1:3).

In the Source window, type print totalSum in parentheses.

[RStudio]

endTime <- Sys.time()

endTime - startTime

Now, we record the current time again.

Type endTime then press Alt and -(hyphen) keys simultaneously.

Now type Sys dot time parentheses. Press Enter.

Type endTime space minus sign space startTime to know the total time taken and save the script.

I am resizing the Source window.

[RStudio]

Highlight the sum in Console window

Run the block of code from the comment line Calculating sum using for loop to the end.

The totalSum is evaluated to be 237.

[RStudio]

Highlight the time taken in Console window

The time taken to calculate the sum of elements in matrixA using for loop is approximately 8 milliseconds.

However, it may vary from system to system.
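
Put together, the steps above form one block. The sketch below assumes a hypothetical matrix m in place of matrixA. It also uses nrow() and ncol() instead of the fixed 1:3 ranges, a substitution made here so that the loops work for a matrix of any size.

```r
# Hypothetical 3 x 3 matrix standing in for matrixA (values are assumed).
m <- matrix(1:9, nrow = 3)

startTime <- Sys.time()

totalSum <- 0
# The outer loop walks over the rows, the inner loop over the columns.
for (i in seq_len(nrow(m))) {
  for (j in seq_len(ncol(m))) {
    totalSum <- totalSum + m[i, j]
  }
}
print(totalSum)

endTime <- Sys.time()
# The difference is a "difftime" object; its value varies between runs and systems.
elapsed <- endTime - startTime
print(elapsed)
```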

Highlight matrixA in the Source window

Now, let us calculate the sum of all elements in matrixA using the sum function.

We shall estimate the time taken to calculate the sum in this way also.
[RStudio]

# Calculating sum using inbuilt function

In the script myMatrix.R, add a comment Calculating sum using inbuilt function.

In the Source window, type # hash space Calculating sum using inbuilt function.

[RStudio]

startTime <- Sys.time()

sum(matrixA)

endTime <- Sys.time()

endTime - startTime

Now, type the following commands.
[RStudio]

Highlight the time taken in the Console window

Save the script and run the block of code from the comment line Calculating sum using inbuilt function to the end.

The time taken to calculate the sum of elements using the sum function is 1.6 milliseconds.

In contrast, it took 8 milliseconds to calculate the same sum using the for loop.

Cursor on the interface

So, it is advisable to use the inbuilt functions of R.
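
A hedged sketch of this comparison follows. It assumes a larger hypothetical matrix named big, which is not part of the tutorial, so that the difference is easier to measure. It also uses system.time(), a base R function, as an alternative to subtracting two Sys.time() values. The timings depend on the machine, so none are quoted here.

```r
# Hypothetical 300 x 300 matrix holding the numbers 1 to 90000 (assumed data).
big <- matrix(as.numeric(1:90000), nrow = 300)

# Sum with nested for loops, wrapped in a function so that it can be timed.
loopSum <- function(m) {
  total <- 0
  for (i in seq_len(nrow(m))) {
    for (j in seq_len(ncol(m))) {
      total <- total + m[i, j]
    }
  }
  total
}

# system.time() reports user, system and elapsed time in seconds.
loopTime <- system.time(loopResult <- loopSum(big))
builtinTime <- system.time(builtinResult <- sum(big))

# Both approaches must give the same answer.
print(loopResult == builtinResult)
print(loopTime["elapsed"])
print(builtinTime["elapsed"])
```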
Cursor on the interface

Let us learn how to calculate the sum of each row and the sum of each column.
[RStudio]

rowSums(matrixA)

colSums(matrixA)

In the Source window, type rowSums matrixA in parentheses.

Next, type colSums matrixA in parentheses.

Save the script and run these two lines to see the corresponding sums on the Console.

I am resizing the Console window to see the output properly.
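
To illustrate what these two functions return, here is a small sketch on a hypothetical 2 by 3 matrix m, whose values are assumed and are not those of matrixA.

```r
# Hypothetical 2 x 3 matrix (assumed values), filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2)

rowTotals <- rowSums(m)  # one value per row: 9 and 12
colTotals <- colSums(m)  # one value per column: 3, 7 and 11

print(rowTotals)
print(colTotals)

# The row sums and the column sums both add up to the sum of all elements.
print(sum(rowTotals) == sum(m))
```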
Now let us learn how to add rows and columns to an existing data frame.
Click on the captaincy data frame.
Highlight captaincy data frame in the Source window

Let us add a new row to the captaincy data frame.
Click on the script myMatrix.R
[RStudio]

captaincy <- rbind(captaincy, data.frame(names="Kohli", Y = 2016, played = 30, won = 20, lost = 9, victory = 20/30))

In the Source window, type the following command. Press Enter after the comma for better visibility.

I am resizing the Source window.

Highlight captaincy in the Source window

We have used the rbind() function with the following arguments:
  • name of the data frame to which we want to add the new row. Here, it is captaincy.
  • the row to be added, passed as a data frame created by data.frame().

Highlight data.frame in the Source window

To the data.frame() function, we provide the values according to the columns of the actual data frame:
  • names = "Kohli"
  • Y = 2016
  • played = 30
  • won = 20
  • lost = 9
  • victory = 20/30
[RStudio]

View(captaincy)

Highlight new row in the Source window

In the Source window, type View within parentheses captaincy

Save the script and run the last two lines of code.

One new row with the details of Kohli is added in the captaincy data frame.
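
The same pattern can be tried without the CSV file. The sketch below assumes a hypothetical data frame named squad that reuses the column names shown above. The captain names and numbers in it are made up for illustration.

```r
# Hypothetical two-row data frame with the same column names as captaincy.
# The names and numbers here are made up for illustration.
squad <- data.frame(names = c("CaptainA", "CaptainB"),
                    Y = c(2010, 2012),
                    played = c(40, 25),
                    won = c(22, 10),
                    lost = c(15, 12),
                    victory = c(22 / 40, 10 / 25),
                    stringsAsFactors = FALSE)

# rbind() matches columns by name, so the new row must use the same names.
newRow <- data.frame(names = "CaptainC", Y = 2016, played = 30,
                     won = 20, lost = 9, victory = 20 / 30,
                     stringsAsFactors = FALSE)
squad <- rbind(squad, newRow)

print(nrow(squad))
print(squad)
```

If the new row uses a column name that does not exist in the data frame, rbind() stops with an error, so the names need to agree exactly.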

[RStudio]

defeat <- captaincy$lost / captaincy$played

Now let us create a new column named defeat from the captaincy data frame.

Click on the script myMatrix.R.

In the Source window, type the following command. Press Enter.

captaincy <- cbind(captaincy, defeat)

Now we add defeat as a new column in the captaincy data frame.

In the Source window, type the following command. Press Enter.

Highlight cbind in the Source window

We have used the cbind() function with the following two arguments:
  • name of the data frame to which we want to add the new column. Here, it is captaincy.
  • name of the column to be added. Here, it is defeat.

View(captaincy)

Now type View, captaincy in parentheses.

Save the script and run the block of code starting from the declaration of defeat to the end.

One column of defeat is added in the captaincy data frame.
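
As a small sketch of the column step, the code below assumes a hypothetical data frame named squad with made-up values. It shows that the new column takes the name of the variable passed to cbind().

```r
# Hypothetical data frame (assumed values), not the tutorial's CaptaincyData.csv.
squad <- data.frame(names = c("CaptainA", "CaptainB"),
                    played = c(40, 25),
                    lost = c(15, 12))

# Element-wise division gives one defeat ratio per row.
defeat <- squad$lost / squad$played

# cbind() names the new column after the variable that was passed in.
squad <- cbind(squad, defeat)
print(colnames(squad))
print(squad$defeat)
```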

Let us summarize what we have learnt.
Show slide

Summary

In this tutorial, we have learned how to:
  • Perform operations on matrices
  • Add rows or columns to a data frame
Show slide

Assignment

We now suggest an assignment.
  • Consider 2 vectors c(9,10,11,12) and c(13,14,15,16).
  • Create a 4 by 2 matrix from these two vectors.
  • Add another vector c(17,18,19,20) as a column to the previous matrix.

For solutions, please refer to the Additional materials section on this website.

Show slide

About the Spoken Tutorial Project

The video at the following link summarises the Spoken Tutorial project.

Please download and watch it.

Show slide

Spoken Tutorial Workshops

We conduct workshops using Spoken Tutorials and give certificates.

Please contact us.

Show Slide

Forum to answer questions

Please post your timed queries in this forum.
Show Slide

Forum to answer questions

Please post your general queries in this forum.
Show Slide

Textbook Companion

The FOSSEE team coordinates the TBC project.

For more details, please visit these sites.

Show Slide

Acknowledgement

The Spoken Tutorial project is funded by NMEICT, MHRD, Govt. of India.
Show Slide

Thank You

The script for this tutorial was contributed by Shaik Sameer (FOSSEE Fellow 2018).

This is Sudhakar Kumar from IIT Bombay signing off. Thanks for watching.

Contributors and Content Editors

Nancyvarkey, Sudhakarst