R/C2/Introduction-to-ggplot2/English
Title of the script: Introduction to ggplot2
Author: Varshit Dubey (CoE Pune) and Sudhakar Kumar (IIT Bombay)
Keywords: R, RStudio, graphics, plot, ggplot2, ggplot, video tutorial, spoken tutorial
Visual Cue | Narration |
Show slide
Opening Slide |
Welcome to this tutorial on Introduction to ggplot2. |
Show slide Learning Objective |
In this tutorial, we will learn about
|
Show slide Pre-requisites |
To understand this tutorial, you should know,
If not, please locate the relevant tutorials on R on this website. |
Show slide
System Specifications |
This tutorial is recorded on
Install R version 3.2.0 or higher. |
Show slide
Download Files |
For this tutorial, we will use
Please download these files from the Code files link of this tutorial. |
[Computer screen]
Highlight moviesData.csv and ggPlots.R in the folder ggPlots |
I have downloaded these files and moved them to the ggPlots folder.
This folder is located in the myProject folder on my Desktop. I have also set the ggPlots folder as my Working Directory. |
Now let us learn about visualization. | |
Show slide
Need for Data Visualization |
|
Show slide Data visualization in R |
There are 2 methods of data visualization in R:
|
Let us switch to RStudio. | |
Highlight ggPlots.R in the Files window of RStudio | Open the script ggPlots.R in RStudio. |
Highlight the Source button.
Click on Source button. |
Let us run this script by clicking on the Source button. |
Highlight movies in the Environment window | The movies data frame is loaded into the workspace.
This data frame will be used later in this tutorial. |
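The contents of ggPlots.R are not shown in this script. As a minimal sketch, assuming the script reads moviesData.csv from the working directory with read.csv (an assumption, not confirmed by the tutorial), the loading step could look like this:

```r
# Assumption: ggPlots.R loads the data roughly like this.
csv_file <- "moviesData.csv"   # file name taken from the tutorial

if (file.exists(csv_file)) {
  movies <- read.csv(csv_file)
  dim(movies)                  # number of rows and columns
} else {
  message(csv_file, " not found in ", getwd())
}
```

If the message appears, check that the ggPlots folder is set as the Working Directory.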
[RStudio]
x <- seq(-pi, pi, 0.1)
y <- sin(x)
plot(x, y) |
First, we will plot a sine curve by taking equally spaced samples.
In the Source window, type the following commands. |
Highlight seq in the Source window | Here, we have used the seq function to generate a sequence.
This sequence is from minus pi to plus pi with an interval of zero point one. |
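To see what seq produces here, the following short sketch inspects the sequence. Only base R functions are used; the choice of what to inspect is ours, not part of the tutorial script.

```r
# Build the same sequence used in the tutorial.
x <- seq(-pi, pi, 0.1)

length(x)      # number of samples
head(x, 3)     # first three values, starting at -pi
diff(x)[1:3]   # spacing between neighbouring values
max(x)         # last sample; it lies just below pi
```

Because the interval from minus pi to plus pi is not an exact multiple of zero point one, the sequence stops at the last value that does not exceed plus pi.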
Highlight plot in the Source window | In the plot command, the first argument is x and the second argument is y, which holds sine of x. |
Highlight run button in the Source window | Save the script and run the last three lines of code by pressing Ctrl + Enter keys simultaneously. |
Highlight the plot in the Plots window | A plot of sine curve appears in the Plots window. |
Highlight Plots window | In the Plots window, click on the Zoom button to maximize the plot. |
Highlight the plot | Now we will add some more layers in this plot. |
Click on X button to close. | Click on Close (X) button to close this plot. |
[RStudio]
plot(x, y, main="Plotting a Sine Curve", ylab="sin(x)") |
In the Source window, type the following command.
Here, we have added main and ylab arguments to the plot function. |
Highlight Run button in the Source window | Run the current line. |
Highlight Plots window | In the Plots window, click on Zoom button to maximize the plot. |
Highlight Y-axis of the plot | The title of the plot and label of Y-axis have been added to the plot. |
Click on X button to close. | Close this plot. |
[RStudio]
plot(x, y, main="Plotting sine curve", ylab="sin(x)", type="l", col="blue") |
Now we will learn how to change the type of plot.
In the Source window, type the following command. |
Highlight type in the plot function | Here, we have used the type argument and set it to l.
It means that the type of plot we need is lines. |
Highlight col in the plot function | Setting col equal to blue changes the color of the plot to blue. |
Highlight Run button in the Source window | Run the current line. |
Highlight the plot | The type and color of the plot have been changed. |
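The type argument accepts other values besides l. As a small sketch, the code below draws the same sine data with four common values, namely p for points, l for lines, b for both and h for vertical lines. The 2 by 2 layout and the variable name plot_type are our own choices for illustration.

```r
x <- seq(-pi, pi, 0.1)
y <- sin(x)

# Draw the same data with four values of 'type' in a 2 x 2 grid.
old_par <- par(mfrow = c(2, 2))
for (plot_type in c("p", "l", "b", "h")) {
  plot(x, y, type = plot_type, col = "blue",
       main = paste("type =", plot_type), ylab = "sin(x)")
}
par(old_par)   # restore the previous layout
```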
Cursor on the interface. | Now, we will plot one more graph on the same plot.
Let us plot cosine of x along with sine of x on the same plot. |
[RStudio]
plot(x, sin(x), main="Plotting Sine and Cosine graphs on the same plot", ylab=" ", type="l", col="blue")
lines(x, cos(x), col="red") |
In the Source window, type the following commands. |
Highlight plot in the Source window | This command plots sine of x using the plot function. |
Highlight lines in the Source window | Next, we use lines function to plot cosine of x. |
Highlight lines in the Source window | After the first line is plotted, the lines function is used.
It takes an additional vector cos of x as an input to draw the second line in the plot. |
Highlight Run button in the Source window | Run the last two lines of code by pressing Ctrl+Enter keys simultaneously. |
Highlight the plot | The two graphs appear in the same plot window.
Here we can add a legend to the plot to differentiate between the multiple graphs. For this, we will use legend function. |
[RStudio]
legend("topleft",
c("sin(x)", "cos(x)"),
fill=c("blue","red")) |
In the Source window, type the following command. |
I will resize the Source window. | |
Highlight topleft in the Source window | The first argument specifies the position of the legend in our plot.
We have set the position to the keyword topleft. |
Highlight c("sin(x)", "cos(x)") in the Source window | The second argument gives the labels to be shown in the legend.
Since we have plotted sine and cosine functions, we will pass these two names as a vector. |
Highlight fill in the Source window | Next, we have used the fill argument to specify the graphs by their colors.
Recall that, sine function is plotted in blue and cosine function in red. |
I will resize the Plots window. | |
Highlight Run button in the Source window | Run the last three lines of code by pressing Ctrl+Enter keys simultaneously. |
Highlight Files and Plots window | In the Plots window, click on Zoom button to maximize the plot. |
Highlight the plot | The two plots with their names appear in the same graph. |
Click on X button in the Plot window. | Close the plot. |
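The base graphics steps above can be put together in one short script. This sketch differs from the tutorial in one respect: the legend uses the col and lty arguments to show line samples instead of the filled boxes produced by fill. That variation is our own and is not part of ggPlots.R.

```r
x <- seq(-pi, pi, 0.1)

# First curve: sine of x drawn as a blue line.
plot(x, sin(x), type = "l", col = "blue", ylab = "",
     main = "Sine and cosine on the same plot")

# Second curve: cosine of x added to the existing plot in red.
lines(x, cos(x), col = "red")

# Variation: line samples in the legend instead of filled boxes.
legend("topleft", legend = c("sin(x)", "cos(x)"),
       col = c("blue", "red"), lty = 1)
```

Line samples are often easier to match to line plots, whereas fill suits bar charts and other filled shapes.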
So far, we have discussed the basic graphics in R language.
Now, we will learn about the grammar of graphics by using ggplot2 package. | |
Show slide Introduction to ggplot2 package |
* ggplot2 package was created by Hadley Wickham in 2005.
|
Let us switch to RStudio. | |
I will resize the Plots window. | |
Cursor on the interface. | To use any package in R, we need to install it once and then load it in each session.
As I have already installed the ggplot2 package, I will load it directly. |
[RStudio]
install.packages("ggplot2") |
If you have not installed the package, please install it using the install dot packages function.
|
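A common way to avoid reinstalling the package on every run is to check for it first. The following is a sketch of that pattern using requireNamespace from base R; it is not part of the tutorial script.

```r
# Install ggplot2 only when it is not already available.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}

# Load the package for the current session.
library(ggplot2)
```

When run outside RStudio, install.packages may need a CRAN mirror to be selected first.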
Click at the top of the script ggPlots.R | To load this package, we will add a call to the library function at the top of the script. |
[RStudio]
library(ggplot2) |
In the Source window, scroll up to the top of the script.
Now, at the top of the script, type library and ggplot2 in parentheses. Save the script and run this line. |
[RStudio]
Point to the line having legend function |
Now, in the Source window, click on the next line after the legend function. |
Highlight movies in the Environment window | We will use movies data frame for exploring ggplot2 package. |
[RStudio]
View(movies) |
Let us view the objects available in movies data frame.
In the Source window, type View and movies in parentheses. |
Highlight Run button in the Source window | Run the current line. |
Highlight movies in the Source window | movies data frame opens in the Source window. |
Highlight movies in the Source window | Now, we will create a simple scatter plot with two different objects of movies.
Remember, a scatter plot is a graph in which the values of two variables are plotted along the axes. |
Highlight the scroll bar in the Source window | In the Source window, scroll from left to right to see the remaining objects of movies data frame. |
Highlight critics_score and audience_score in the Source window | Suppose, we want to visualize the correlation between critics_score and audience_score. |
Highlight ggPlots.R in the Source window | In the Source window, click on the script ggPlots.R |
[RStudio]
ggplot(data = movies, mapping = aes(x=critics_score, y = audience_score)) + geom_point() |
In the Source window, type the following command. |
Highlight ggplot in the Source window | ggplot function takes three basic arguments:
|
Highlight data in the Source window
Highlight mapping in the Source window Highlight aes in the Source window |
In the ggplot function, we have used the following arguments:
We have set data equal to movies.
We have set mapping equal to aes, with x as critics_score and y as audience_score.
We will learn more about aesthetic mappings later in this series. |
Highlight geom_point in the Source window | * Geom underscore point is used to draw points defined by X and Y coordinates. |
Highlight Run button in the Source window | Run the current line. |
Highlight Plots window | Scatter plot appears in the Plots window. |
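To illustrate how the plot is built in layers, the sketch below separates the ggplot call from the geom_point layer. The data frame toy_movies and its values are made up for illustration; with the tutorial files, the movies data frame would be used instead.

```r
library(ggplot2)

# Hypothetical stand-in for the movies data frame (values are made up).
toy_movies <- data.frame(
  critics_score  = c(20, 35, 50, 65, 80, 95),
  audience_score = c(30, 40, 45, 70, 75, 90)
)

# ggplot() alone sets up the data and aesthetics; no points are drawn yet.
base_plot <- ggplot(data = toy_movies,
                    mapping = aes(x = critics_score, y = audience_score))

# Adding geom_point() contributes the layer that draws the points.
scatter <- base_plot + geom_point()
print(scatter)
```

Storing the plot in a variable such as scatter makes it possible to add further layers later without retyping the whole command.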
Highlight Plots window | In the Plots window, click on the Zoom button to maximize the plot. |
Highlight the plot | We can see that there is a positive correlation between critics_score and audience_score.
Now we will learn how to save a plot generated by ggplot function. |
Click on X button to close the plot. | Close this plot. |
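The positive correlation seen in the scatter plot can also be checked numerically with the cor function from base R. This sketch uses a small made-up data frame named toy_movies as a stand-in; with the tutorial files, the two columns of movies would be passed instead.

```r
# Hypothetical stand-in for the movies data frame (values are made up).
toy_movies <- data.frame(
  critics_score  = c(20, 35, 50, 65, 80, 95),
  audience_score = c(30, 40, 45, 70, 75, 90)
)

# Pearson correlation coefficient; a value near +1 indicates
# a strong positive linear relationship.
r <- cor(toy_movies$critics_score, toy_movies$audience_score)
r
```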
For saving the plots, there is a function named ggsave in ggplot2 package. | |
[RStudio]
?ggsave |
To know the syntax of ggsave function, we will access the Help section in RStudio.
|
I will resize the Help window. | |
Highlight Help in RStudio
Highlight plot in Help |
The first argument in this function is the filename.
Next, there is the argument named plot, which specifies the plot to be saved. By default, it saves the last plot displayed. |
Highlight Plots window | Click on the Plots window. |
Highlight plot in the Plots window | Let us save our scatter plot with the name scatter underscore plot in png format. |
[RStudio]
ggsave("scatter_plot.png") |
In the Source window, type the following command. |
Highlight Run button in the Source window | Save the script and run the current line. |
Highlight Files window | Click on the Files tab. |
Highlight scatter_plot.png in the Files window | The plot has been saved in our current working directory. |
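Instead of relying on the last plot displayed, the plot and its size can be passed to ggsave explicitly. In this sketch, toy_movies is a made-up stand-in for movies, and the output path in a temporary folder and the 6 by 4 inch size are our own choices for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the movies data frame (values are made up).
toy_movies <- data.frame(
  critics_score  = c(20, 35, 50, 65, 80, 95),
  audience_score = c(30, 40, 45, 70, 75, 90)
)

scatter <- ggplot(toy_movies,
                  aes(x = critics_score, y = audience_score)) +
  geom_point()

# Save to a temporary folder so that no existing file is overwritten.
out_file <- file.path(tempdir(), "scatter_plot.png")

# Pass the plot object and the size explicitly.
ggsave(filename = out_file, plot = scatter,
       width = 6, height = 4, units = "in", dpi = 300)

file.exists(out_file)   # check that the file was written
```

The file format is chosen from the extension of the filename, so changing it to .pdf would save a PDF instead.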
Let us summarize what we have learnt. | |
Show slide Summary |
In this tutorial, we have learnt about,
|
Show slide Assignment |
We now suggest an assignment.
|
Show slide
About the Spoken Tutorial Project |
The video at the following link summarises the Spoken Tutorial project.
Please download and watch it. |
Show slide
Spoken Tutorial Workshops |
We conduct workshops using Spoken Tutorials and give certificates.
Please contact us. |
Show Slide
Forum to answer questions |
Please post your timed queries in this forum. |
Show Slide
Forum to answer questions |
Please post your general queries in this forum. |
Show Slide
Textbook Companion |
The FOSSEE team coordinates the TBC project.
For more details, please visit these sites. |
Show Slide
Acknowledgment |
The Spoken Tutorial project is funded by NMEICT, MHRD, Govt. of India |
Show Slide
Thank You |
The script for this tutorial was contributed by Varshit Dubey (CoE Pune).
|