R/C2/Plotting-Bar-Charts-and-Scatter-Plot/English
Title of the script: Plotting Bar Charts and Scatter Plots
Author: Tushar Bajaj (TISS Mumbai) and Sudhakar Kumar (IIT Bombay)
Keywords: R, RStudio, graphs, bar chart, labels, scatter plot, correlation, video tutorial, spoken tutorial
Visual Cue | Narration |
Show slide
Opening slide |
Welcome to this tutorial on Plotting bar charts and scatter plots. |
Show slide
Learning Objectives |
In this tutorial, we will learn how to:
|
Show slide
Pre-requisites |
To understand this tutorial, you should know,
If not, please locate the relevant tutorials on R on this website. |
Show slide
System Specifications |
This tutorial is recorded on Ubuntu Linux OS version 16.04
Install R version 3.2.0 or higher. |
Show slide
Download Files |
For this tutorial, we will use the files moviesData.csv and barPlots.R.
Please download these files from the Code files link of this tutorial. |
[Computer screen]
Highlight moviesData.csv and barPlots.R in the folder Plots |
I have downloaded and moved these files to Plots folder.
This folder is located in myProject folder on my Desktop. I have also set Plots folder as my Working Directory. |
Let us switch to RStudio. | |
Highlight barPlots.R in the Files window of RStudio | Open the script barPlots.R in RStudio. |
Highlight the Source button | Run this script by clicking on Source button. |
Highlight movies in the Source window | movies data frame opens in the Source window. |
Highlight dim(movies) in the Console window | It has 600 observations of 31 variables. |
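As a minimal sketch of how these dimensions can be checked in the Console, the following uses a small made-up data frame. The name toyMovies and its values are hypothetical and are not taken from moviesData.csv.

```r
# Hypothetical stand-in for the movies data frame; the values are made up.
toyMovies <- data.frame(
  title = c("Movie A", "Movie B", "Movie C"),
  imdb_rating = c(6.1, 7.4, 8.0),
  audience_score = c(55, 72, 88)
)

# dim() returns the number of rows (observations)
# followed by the number of columns (variables).
dim(toyMovies)

# nrow() and ncol() return the same two numbers separately.
nrow(toyMovies)
ncol(toyMovies)
```

Running dim(movies) on the tutorial's data frame works in the same way.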
Highlight the scroll bar in the Source window | In the Source window, scroll from left to right. This will enable us to see the remaining objects of movies data frame. |
Highlight imdb_rating in the Source window | Now, we will learn how to draw a bar chart of the object named imdb underscore rating in movies. |
Show slide
Bar Chart |
|
Let us switch to RStudio. | |
Highlight movies in the Source window | For the sake of simplicity, we are considering only the first 20 observations of movies to draw a bar chart. |
Highlight barPlots.R in the Source window | Click on the script barPlots.R |
[RStudio] moviesSub <- movies[1:20,] |
In the Source window, type the following command.
Save the script and run the current line by pressing Ctrl + Enter keys simultaneously. |
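The row selection used in this command can be tried on a small example. This is only a sketch: toyMovies, toySub and toyHead are hypothetical names, and the ratings are made-up values.

```r
# Hypothetical data frame with 30 rows; the ratings are made up.
toyMovies <- data.frame(
  title = paste("Movie", 1:30),
  imdb_rating = seq(4, 9.8, length.out = 30)
)

# Rows 1 to 20; leaving the slot after the comma empty keeps every column.
toySub <- toyMovies[1:20, ]
nrow(toySub)

# head() with n = 20 is another way to take the first 20 rows.
toyHead <- head(toyMovies, 20)
nrow(toyHead)
```

Both forms keep all the columns and only reduce the number of rows.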
Let me resize the Source window. | |
Highlight moviesSub in the Environment window | moviesSub with 20 observations is loaded in the Environment.
Now, we draw a bar chart of imdb_rating for these movies. |
[RStudio]
barplot(moviesSub$imdb_rating, ylab="IMDB Rating", xlab = "Movies", col="blue", ylim=c(0,10), main="Movies' IMDB Rating") |
In the Source window, type the following command. |
Highlight barplot in the Source window | Here, we have used the following arguments:
|
Highlight Run button in the Source window | Run the current line. |
Highlight the plot in the Plots window | The bar chart is displayed with Movies on X-axis and their imdb_rating on Y-axis. |
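The arguments of the barplot function can be explored on a small vector as well. This is a sketch with made-up ratings; the names ratings and mids are hypothetical.

```r
# Hypothetical ratings for five movies; the values are made up.
ratings <- c(6.1, 7.4, 8.0, 5.2, 6.8)

# barplot() draws one bar per value and invisibly returns
# the x-coordinates of the bar midpoints.
mids <- barplot(ratings,
                ylab = "IMDB Rating",          # label of the Y-axis
                xlab = "Movies",               # label of the X-axis
                col = "blue",                  # fill colour of the bars
                ylim = c(0, 10),               # range of the Y-axis
                main = "Movies' IMDB Rating")  # title of the plot

# One midpoint is returned for each bar.
length(mids)
```

The returned midpoints are useful when something, such as text, has to be placed above or below each bar.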
Highlight Files and Plots window | In the Plots window, click on Zoom to maximize the plot. |
Highlight the first bar in the plot | This particular movie has an IMDB rating of approximately 6. |
Highlight the third bar in the plot | Similarly, this particular movie has an IMDB rating of approximately 8.
However, we do not know the name of the movies. |
So, we will add more arguments in barplot function to show the names of movies on X-axis. | |
Close this plot. | |
[RStudio]
barplot(moviesSub$imdb_rating, ylab="IMDB Rating", col="blue", ylim=c(0,10), main="Movies' IMDB Rating", names.arg=moviesSub$title) |
In the Source window, type the following command. |
Highlight names.arg in the Source window | Here, we have used the argument names.arg and set it to title.
Remember, title column in moviesSub contains the names of movies. |
Highlight Run button in the Source window | Run the current line. |
Highlight Files and Plots window | In the Plots window, click on Zoom to maximize the plot. |
Highlight X-axis of the plot | Now, the names of movies are displayed on the X-axis.
But not for all movies. This is because the names are too long to fit along the axis. So, we will make these names perpendicular to the X-axis. |
Close this plot. | |
[RStudio]
barplot(moviesSub$imdb_rating, ylab="IMDB Rating", col="blue", ylim=c(0,10), main="Movies' IMDB Rating", names.arg=moviesSub$title, las = 2) |
In the Source window, type the following command. |
Highlight las in the Source window |
Here, we have used las argument.
las equal to 2 produces labels which are at right angles to the axis. |
Highlight Run button in the Source window | Run the current line. |
Highlight Files and Plots window | In the Plots window, click on Zoom to maximize the plot. |
Highlight the plot in the Plots window | Now the names for all the movies are displayed on X-axis.
For example, Filly Brown has an IMDB rating of approximately 6. |
Highlight the plot in the Plots window | However, longer names are being truncated.
We can add more arguments to barplot function for adjusting labels. For more information, please refer to the Additional Material section on this website. |
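One possible way to reduce the truncation, shown here only as a sketch, is to shrink the label text with cex.names and to widen the bottom margin with par before plotting. The titles, the ratings and the margin size are made-up values chosen for illustration, and the margin may need a different value on a small Plots window.

```r
# Hypothetical titles and ratings; the values are made up.
titles <- c("A Very Long Movie Title Indeed",
            "Short One",
            "Another Fairly Long Title")
ratings <- c(6.1, 7.4, 8.0)

# The first element of mar is the bottom margin, measured in lines of text.
# par() returns the previous settings, so they can be restored afterwards.
oldPar <- par(mar = c(10, 4, 4, 2) + 0.1)

barplot(ratings,
        names.arg = titles,  # one label per bar
        las = 2,             # labels perpendicular to the axis
        cex.names = 0.8,     # shrink the label text
        col = "blue",
        ylim = c(0, 10),
        ylab = "IMDB Rating",
        main = "Movies' IMDB Rating")

# Restore the earlier margin settings.
par(oldPar)
```

Restoring the old settings ensures that the wider margin does not affect the plots drawn later.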
Close this plot. | |
Highlight movies in the Source window | In the Source window, click on movies. |
Highlight imdb_rating and audience_score in the Source window | Let us analyze the relation between imdb underscore rating and audience underscore score.
For this, we will draw a scatter plot with these two objects by using plot function. Remember, we have already learnt how to plot a single object. |
Show Slide
Scatter Plot |
|
Let us switch to RStudio. | |
Highlight barPlots.R in the Source window | In the Source window, click on the script barPlots.R |
[RStudio]
plot(x = movies$imdb_rating, y = movies$audience_score, main = "IMDB Rating vs Audience Score", xlab = "IMDB Rating", ylab = "Audience Score", xlim = c(0,10), ylim = c(0,100), col = "blue") |
In the Source window, type the following command. |
Highlight plot function in the Source window | Here, we have kept imdb underscore rating on the X-axis and audience underscore score on the Y-axis. |
Highlight xlim in the Source window | As imdb underscore rating of any movie varies between 0 and 10, we have set the range of values on X-axis from 0 to 10. |
Highlight ylim in the Source window | Similarly, we have set the range of values on Y-axis from 0 to 100. |
Highlight Run button in the Source window | Save the script and run the current line. |
Highlight Files and Plots window | In the Plots window, click on Zoom to maximize the plot. |
Highlight the plot in the Plots window | We can observe that movies with a higher imdb underscore rating tend to have a higher audience underscore score. |
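The same plot call can be tried on a few made-up points to see how xlim and ylim frame the data. This is a sketch; the vectors imdb and audience are hypothetical and do not come from moviesData.csv.

```r
# Hypothetical ratings and scores for five movies; the values are made up.
imdb <- c(5.2, 6.1, 6.8, 7.4, 8.0)
audience <- c(40, 55, 63, 72, 88)

# range() gives the smallest and largest value of each vector.
# Both ranges lie inside the axis limits chosen below.
range(imdb)
range(audience)

plot(x = imdb, y = audience,
     main = "IMDB Rating vs Audience Score",
     xlab = "IMDB Rating", ylab = "Audience Score",
     xlim = c(0, 10),    # full scale of the rating
     ylim = c(0, 100),   # full scale of the score
     col = "blue")
```

Setting the limits to the full scale of each variable keeps the axes the same regardless of which movies are plotted.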
Close this plot. | |
Now we will learn how to calculate the correlation coefficient between imdb underscore rating and audience underscore score.
For this, we use cor function. | |
[RStudio]
cor(movies$imdb_rating, movies$audience_score) |
In the Source window, type the following command. |
Highlight Run button in the Source window | Save the script and run the current line. |
Highlight the output in the Console window | The correlation coefficient between imdb underscore rating and audience underscore score is evaluated as 0.865. |
Highlight the output in the Console window | The value of correlation coefficient is always between -1 and +1.
A positive value indicates that the variables are positively related. |
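To illustrate what the cor function computes by default, the sketch below compares it with the Pearson formula written out by hand. The vectors x and y hold made-up values, and the names r and rManual are hypothetical.

```r
# Hypothetical ratings and scores; the values are made up.
x <- c(5.2, 6.1, 6.8, 7.4, 8.0)
y <- c(40, 55, 63, 72, 88)

# cor() computes the Pearson correlation coefficient by default.
r <- cor(x, y)

# The same coefficient from its definition: the sum of the products of the
# deviations, divided by the square root of the product of the sums of squares.
dx <- x - mean(x)
dy <- y - mean(y)
rManual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# The two values agree up to rounding error.
all.equal(r, rManual)

# Reversing the direction of one variable flips the sign of the coefficient.
cor(x, -y)
```

Since y increases with x in this example, the coefficient is positive, and it becomes negative when y is replaced by -y.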
Let us summarize what we have learnt. | |
Show slide
Summary |
In this tutorial, we have learnt how to:
|
Show slide
Assignment |
We now suggest an assignment.
|
Show slide
About the Spoken Tutorial Project |
The video at the following link summarises the Spoken Tutorial project.
Please download and watch it. |
Show slide
Spoken Tutorial Workshops |
We conduct workshops using Spoken Tutorials and give certificates.
Please contact us. |
Show Slide
Forum to answer questions |
Please post your timed queries in this forum. |
Show Slide
Forum to answer questions |
Please post your general queries in this forum. |
Show Slide
Textbook Companion |
The FOSSEE team coordinates the TBC project.
For more details, please visit these sites. |
Show Slide
Acknowledgment |
The Spoken Tutorial project is funded by NMEICT, MHRD, Govt. of India |
Show Slide
Thank You |
The script for this tutorial was contributed by Tushar Bajaj (TISS Mumbai).
This is Sudhakar Kumar from IIT Bombay signing off. Thanks for watching. |