Difference between revisions of "R/C2/Indexing-and-Slicing-Data-Frames/English"
Revision as of 23:39, 19 February 2019
Title of script: Indexing and Slicing Data Frames
Author: Shaik Sameer (IIIT Vadodara) and Sudhakar Kumar (IIT Bombay)
Keywords: R, RStudio, data frames, indexing, slicing, working directory, video tutorial.
Visual Cue | Narration |
Show slide
Opening slide |
Welcome to the spoken tutorial on Indexing and Slicing Data Frames. |
Show slide
Learning Objectives
|
In this tutorial, we will learn how to:
|
Show slide
Pre-requisites |
To understand this tutorial, you should have knowledge about
If not, please locate the relevant tutorials on R on this website. |
Show slide
System Specifications |
This tutorial is recorded on
Install R version 3.2.0 or higher. |
Show slide
Download Files |
For this tutorial, we will use the data frame CaptaincyData.csv and a script file mydataframe.R.
|
[Computer screen]
Highlight CaptaincyData.csv and mydataframe.R in the folder myProject |
I have downloaded and moved these files to the folder myProject on my Desktop.
|
Let us switch to RStudio. | |
Point to script mydataframe.R.
Click mydataframe.R in RStudio
|
Let us open the script mydataframe.R in RStudio.
For this, click on the script mydataframe.R.
|
[RStudio]
captaincy <- read.csv("CaptaincyData.csv")
View(captaincy) |
Here, we have declared a variable captaincy to read CaptaincyData.csv and store its contents.
|
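As a rough illustration of what read.csv does, the sketch below writes a tiny made-up CSV to a temporary file and reads it back. The file contents and the variable csv_path are assumptions for illustration only; they are not the real CaptaincyData.csv. View() is left out because it needs the RStudio interface.

```r
# Hypothetical stand-in for CaptaincyData.csv: the values are made up.
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("names,played,won",
    "A,25,8",
    "B,40,20"),
  csv_path
)

# read.csv() returns a data frame; by default the first line of the file
# supplies the column names.
captaincy <- read.csv(csv_path)
print(captaincy)
```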
Highlight <- symbol in the Source window | Remember, you may also use the equal to sign in place of the less than symbol followed by a hyphen.
|
[RStudio]
Alt + - testvar <- 2 |
In the Console window, type testvar and then press
Alt and -(hyphen) keys simultaneously. Then type 2 and press Enter. |
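The two assignment forms mentioned above can be compared in a short sketch. The name othervar is hypothetical and is introduced only for this comparison.

```r
testvar <- 2    # assignment with the arrow operator, as typed in the Console
othervar = 2    # at the top level, the equal to sign assigns in the same way

# Both variables now hold the same value.
print(identical(testvar, othervar))
```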
Highlight mydataframe.R in the Source window | Let us get back to the Source window.
Run the script mydataframe.R by clicking on the Source button. |
[RStudio]
Highlight third row of captaincy |
Now let us extract the contents of the third row of the captaincy data frame. |
Click on script mydataframe.R | Click on the script mydataframe.R |
[RStudio]
|
In the Source window, type captaincy, then within square brackets 3 comma.
Remember, one of the most important features of RStudio is intelligent auto-completion of function names, packages, and R objects. |
Highlight comma in the Source window | We use a comma within square brackets when we wish to extract a row. |
Press Ctrl + S >> Press Ctrl+Enter keys. | Save the script and execute the current line by pressing Ctrl+Enter keys. |
Highlight the output in the Console window | The third row of the captaincy data frame is seen in the Console window. |
Highlight comma in captaincy[3,] | Now, let us run the same command without a comma. |
[RStudio]
captaincy[3] |
In the Source window, type captaincy then within square brackets 3. |
Press Ctrl + S >> Press Ctrl+Enter keys. | Save the script and run this line only, as shown earlier. |
Highlight the output in the Console window | The contents of the third column of the data frame are displayed in the Console window. |
Point to the command. | So, to extract a column, we shouldn’t use a comma within the square brackets.
When we extract data using row number or column number, it is known as numeric indexing. |
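The effect of the comma can be sketched on a small stand-in data frame. The values below are made up and the column order is an assumption; the real CaptaincyData.csv may differ.

```r
# Hypothetical stand-in for the captaincy data frame (made-up values).
captaincy <- data.frame(
  names   = c("A", "B", "C", "D"),
  played  = c(25, 40, 30, 10),
  won     = c(8, 20, 6, 5),
  lost    = c(6, 12, 14, 3),
  victory = c(0.32, 0.50, 0.20, 0.50),
  stringsAsFactors = FALSE
)

third_row <- captaincy[3, ]  # with a comma: row 3, all columns
third_col <- captaincy[3]    # without a comma: column 3, still a data frame

print(third_row)
print(third_col)
```

In this sketch, third_row has one row and every column, while third_col has every row and a single column.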
Click on the captaincy data frame | In the Source window, click on the captaincy data frame. |
[RStudio]
Highlight second and third rows of captaincy |
Let us now extract the contents of second and third rows of the captaincy data frame. |
[RStudio]
|
To retrieve more than one row, we use a numeric index vector. Click on the script mydataframe.R.
In the Source window, type the following command and press Enter. |
Highlight the c() function | The c function is used to combine the row numbers 2 and 3 into a single index vector. |
Press Ctrl + S >> Press Ctrl+Enter keys. | Save the script and execute the current line. |
Highlight the output in the Console window | The second and third rows of captaincy data frame are seen in the Console window. |
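A minimal sketch of indexing with a numeric vector follows, again on a made-up stand-in for the captaincy data frame. The variable name two_rows is hypothetical.

```r
# Hypothetical stand-in for the captaincy data frame (made-up values).
captaincy <- data.frame(
  names  = c("A", "B", "C", "D"),
  played = c(25, 40, 30, 10),
  won    = c(8, 20, 6, 5),
  stringsAsFactors = FALSE
)

# c(2, 3) builds the index vector; the trailing comma keeps all columns.
two_rows <- captaincy[c(2, 3), ]
print(two_rows)
```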
Click on the captaincy data frame | In the Source window, click on the captaincy data frame. |
Cursor on the interface. | Now, we’ll find who has played 25 matches from the played column of captaincy data frame.
Extracting this type of information is known as logical indexing. |
Click on the script mydataframe.R | Click on the script mydataframe.R |
[RStudio]
|
In the Source window, type captaincy
Within square brackets captaincy dollar sign played equal to equal to 25 comma. Press Enter. |
Highlight dollar sign
|
Remember, dollar sign allows you to extract elements by name.
Please note that there is no space between the two equal to signs. |
Press Ctrl + S >> Press Ctrl+Enter keys. | Save the script and execute the current line. |
Highlight the output in the Console window | The details of captain Dravid are shown in the Console window. |
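Logical indexing can be sketched in two steps: build a TRUE/FALSE vector first, then use it to pick rows. The data are made up, and the names played_25 and result are hypothetical.

```r
# Hypothetical stand-in for the captaincy data frame (made-up values).
captaincy <- data.frame(
  names  = c("A", "B", "C", "D"),
  played = c(25, 40, 30, 10),
  won    = c(8, 20, 6, 5),
  stringsAsFactors = FALSE
)

# The comparison yields one TRUE or FALSE per row.
played_25 <- captaincy$played == 25
print(played_25)

# Rows where the vector is TRUE are kept; the comma keeps all columns.
result <- captaincy[played_25, ]
print(result)
```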
Click on the captaincy data frame in Source window. | In the Source window, click on the captaincy data frame. |
Cursor on the interface. | Now let us learn how to get the values of any particular attribute for all the players.
We will fetch the names of all the captains. |
[RStudio]
Highlight first column of captaincy
captaincy[1] |
For this, we need to know the values in the first column.
Click on the script mydataframe.R. In the Source window, type captaincy and within square brackets 1. |
Highlight 1 in square brackets | Please note that I have not used a comma inside the square brackets. |
Press Ctrl + S >> Press Ctrl+Enter keys. | Save the script and execute the current line. |
Highlight the output in the Console window | The names of the captains are seen in the Console window. |
Usually we use column names instead of column numbers. | |
[RStudio]
captaincy["names"] |
To know the names of the captains, type captaincy.
Within square brackets inside double quotes names |
Press Ctrl + S >> Press Ctrl+Enter keys. | Save the script and execute the current line. |
Point to the names in the Console window. | Names of the captains are shown in the Console window.
Extracting data by column names is known as name indexing. |
Highlight the broom icon in the Console window | Clear the Console window by clicking on the broom icon. |
In the Source window, click on the captaincy data frame | In the Source window, click on the captaincy data frame. |
Cursor in the Source window. | Now let us view the names of the captains along with the number of matches they have won. |
Click on the script mydataframe.R | Click on the script mydataframe.R |
[RStudio]
captaincy[c("names", "won")] |
In the Source window, type the following command and press Enter. |
Highlight c() | Here, the c function is used to combine the two column names, names and won. |
Highlight names and won | Observe that names and won have been written within double quotes. |
Press Ctrl + S >> Press Ctrl+Enter keys. | Save the script and execute this line. |
Point to the names of the captains in the Console window. | The names of captains and the number of matches won are seen in the Console window. |
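Name indexing with one and with several column names can be sketched as below. The data frame is a made-up stand-in that assumes names is the first column; captain_names and names_won are hypothetical variable names.

```r
# Hypothetical stand-in for the captaincy data frame (made-up values).
captaincy <- data.frame(
  names  = c("A", "B", "C", "D"),
  played = c(25, 40, 30, 10),
  won    = c(8, 20, 6, 5),
  stringsAsFactors = FALSE
)

# One column by name; here this matches numeric indexing with captaincy[1].
captain_names <- captaincy["names"]
print(identical(captain_names, captaincy[1]))

# Several columns by name: the names are quoted and combined with c().
names_won <- captaincy[c("names", "won")]
print(names_won)
```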
Point to captaincy data frame | Now let us extract a subset from captaincy data frame.
We will create a subset of captains, who have won more than 30% of their matches. This is called slicing a data frame. |
Click on the captaincy data frame in Source window. | In the Source window, click on the captaincy data frame. |
[RStudio]
|
Please note that there is a column named victory in the captaincy data frame.
For the required subset of captains, the value of victory should be greater than 0.3. |
[RStudio]
|
For this subset, we shall only show the names of the captains, the number of matches played, and the number of matches won.
|
Click on the script mydataframe.R | Click on the script mydataframe.R |
Drag the source window to resize the Source window | I will resize the Source window. |
[RStudio]
subData <- subset(captaincy, victory > 0.3, select = c("names", "played", "won")) |
In the Source window, type the following command
Now, press Enter after the comma that follows 0 point 3. |
Press Enter. | You can press Enter after a comma for better visibility. |
[RStudio]
|
The select parameter is used to select the required columns, names, played, and won.
|
Highlight two lines.
Press Ctrl + S >> Press Ctrl+Enter keys. |
Save the script and run these two lines. |
Drag the Console window. | I am resizing the Console window to see the output properly. |
Highlight the output in the Console window | The subData is shown in the Console window. |
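The subset command can be tried on a small stand-in data frame. The values are made up, with victory computed here as won divided by played; that definition is an assumption about the real data.

```r
# Hypothetical stand-in for the captaincy data frame (made-up values).
captaincy <- data.frame(
  names   = c("A", "B", "C", "D"),
  played  = c(25, 40, 30, 10),
  won     = c(8, 20, 6, 5),
  lost    = c(6, 12, 14, 3),
  victory = c(0.32, 0.50, 0.20, 0.50),
  stringsAsFactors = FALSE
)

# The second argument filters rows; select picks the columns to keep.
subData <- subset(captaincy,
                  victory > 0.3,
                  select = c("names", "played", "won"))
print(subData)
```

With these made-up values, captain C is dropped because the victory value 0.20 is not greater than 0.3.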
Click on the captaincy data frame in Source window. | In the Source window, click on the captaincy data frame. |
[RStudio]
|
Finally, let us learn how to extract a particular entry from some column of a data frame.
|
Click on the script mydataframe.R. | Click on the script mydataframe.R. |
[RStudio]
|
In the Source window, type captaincy, then within double square brackets 4, and then within single square brackets 3. |
Press Ctrl + S >> Press Ctrl+Enter keys. | Save the script and execute the current line. |
Highlight the output in the Console window | The expected value 14 is seen in the Console window. |
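The double-bracket form can be read as two steps: first pull out a column as a plain vector, then pick one element from it. In the sketch below the data are made up, and the fourth column is assumed to be lost, with 14 in its third position to mirror the value mentioned above.

```r
# Hypothetical stand-in for the captaincy data frame (made-up values).
captaincy <- data.frame(
  names   = c("A", "B", "C", "D"),
  played  = c(25, 40, 30, 10),
  won     = c(8, 20, 6, 5),
  lost    = c(6, 12, 14, 3),
  victory = c(0.32, 0.50, 0.20, 0.50),
  stringsAsFactors = FALSE
)

fourth_col <- captaincy[[4]]  # double brackets return the column as a vector
entry <- fourth_col[3]        # single brackets pick its third element
print(entry)

# The same entry can also be reached with row and column indexes.
print(captaincy[3, 4])
```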
For more information on indexing and slicing data frames, please refer to the Additional materials section on this website. | |
Let us summarize what we have learnt. | |
Show slide
Summary |
In this tutorial, we have learnt how to:
|
Show slide
Assignment |
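We now suggest an assignment.
Create a subset from captaincy data frame with the captains who have played more than (>) 20 matches and lost less than (<) 14 matches.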
|
Show slide
About the Spoken Tutorial Project |
The video at the following link summarises the Spoken Tutorial project.
Please download and watch it. |
Show slide
Spoken Tutorial Workshops |
We conduct workshops using Spoken Tutorials and give Certificates.
Please contact us. |
Show Slide
Forum to answer questions |
Please post your timed queries in this forum. |
Show Slide
Forum to answer questions |
Please post your general queries in this forum. |
Show Slide
Textbook Companion |
The FOSSEE team coordinates the TBC project.
For more details, please visit these sites. |
Show Slide
Acknowledgement |
The Spoken Tutorial project is funded by NMEICT, MHRD, Govt. of India |
Show Slide
Thank You |
The script for this tutorial was contributed by Shaik Sameer (FOSSEE Fellow 2018).
|