Difference between revisions of "R/C2/Aesthetic-Mapping-in-ggplot2/English"

Revision as of 07:32, 26 July 2019

Title of the script: Aesthetic Mapping in ggplot2

Author: Varshit Dubey (CoE Pune) and Sudhakar Kumar (IIT Bombay)

Keywords: R, RStudio, ggplot, aesthetic, mapping, video tutorial, spoken tutorial

Visual Cue | Narration
Show slide

Opening Slide

Welcome to this tutorial on Aesthetic Mapping in ggplot2.
Show slide

Learning Objective

In this tutorial, we will learn:
  • What an aesthetic is
  • How to create plots using aesthetics
  • How to tune parameters in aesthetics

Show slide

Pre-requisites

To understand this tutorial, you should know,
  • Basics of statistics
  • Basics of ggplot2 package
  • Data frames

If not, please locate the relevant tutorials on R on this website.

Show slide

System Specifications

This tutorial is recorded on
  • Ubuntu Linux OS version 16.04
  • R version 3.4.4
  • RStudio version 1.1.463

Install R version 3.2.0 or higher.

Show slide

Download Files

For this tutorial, we will use
  • A data frame moviesData.csv, and
  • A script file aesPlots.R.

Please download these files from the Code files link of this tutorial.

[Computer screen]

Highlight moviesData.csv and aesPlots.R in the folder aesPlots

Point to aesPlots folder.

I have downloaded and moved these files to aesPlots folder.

This folder is located in myProject folder on my Desktop.

I have set aesPlots folder as my Working Directory.

Now let us see what an aesthetic is.
Show slide

What is an Aesthetic?

  • An aesthetic is a visual property of the objects in a plot.
  • It includes lines, points, symbols, colors, and position.
  • It is used to add customization to our plots.
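
As a rough illustration of the idea on this slide, the sketch below builds an aesthetic mapping on its own and then attaches it to a plot. It uses the mpg data set that ships with ggplot2 instead of the tutorial's movies data, so the column names displ, hwy, and class are stand-ins chosen only for this example.

```r
library(ggplot2)

# aes() only records which column feeds which visual property;
# nothing is drawn until the mapping is given to ggplot() and a geom.
m <- aes(x = displ, y = hwy, color = class)

# ggplot2 stores the spelling "color" under the name "colour".
print(names(m))

# Attaching the mapping to a data set and a geom gives a plot object.
# Typing p at the console (or calling print(p)) would draw it.
p <- ggplot(data = mpg, mapping = m) +
  geom_point()
```

Here x and y are position aesthetics and color is a color aesthetic, matching the kinds of visual properties listed above.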
Let us switch to RStudio.
Highlight aesPlots.R in the Files window of RStudio | Open the script aesPlots.R in RStudio.
Highlight ggplot function in the Source window | Here, we are plotting a scatter plot of critics_score against audience_score for the movies.
Highlight the Source button | Run this script by clicking on the Source button.
Highlight Plots window | The scatter plot appears in the Plots window.
Highlight movies in the Source window | The movies data frame opens in the Source window.
Highlight the plot in the Plots window | In this scatter plot, each point refers to a particular movie.

Suppose we want to color these points according to the genre of the movies.

Highlight the scroll bar in the Source window | In the Source window, scroll from left to right.
Highlight genre in the Source window | As we can see, there are different genres like
  • Drama
  • Comedy
  • Horror
  • Documentary, etc.

So, we will assign a unique color to each genre.

Highlight the script aesPlots.R in the Source window | Click on the script aesPlots.R.
[RStudio]

ggplot(data = movies,
       mapping = aes(x = critics_score,
                     y = audience_score,
                     color = genre)) +
  geom_point()

In the Source window, type the following commands.
Highlight aes in the Source window | Inside aes, we have added the color argument and set it to genre.
Highlight run button in the Source window | Save the script and run the current line by pressing the Ctrl+Enter keys simultaneously.
Highlight Plots window | The modified scatter plot appears in the Plots window.
Click on Zoom button to maximize the plot.

Highlight Plots window

In the Plots window, click on the Zoom button to maximize the plot.
Highlight the plot | We can see that each point is assigned a color according to its genre.

On the right side of the plot, the legend shows which color corresponds to which genre.
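
The difference between mapping a color and simply setting one can be sketched as follows. This is a minimal example, again assuming the mpg data set bundled with ggplot2 as a stand-in for movies; the color name "steelblue" is an arbitrary choice.

```r
library(ggplot2)

# Mapped color: inside aes(), so each value of `class` gets its own color
# and ggplot2 adds a legend automatically.
p_mapped <- ggplot(data = mpg,
                   mapping = aes(x = displ, y = hwy, color = class)) +
  geom_point()

# Fixed color: outside aes(), so every point has the same color and
# no legend is needed.
p_fixed <- ggplot(data = mpg,
                  mapping = aes(x = displ, y = hwy)) +
  geom_point(color = "steelblue")

# layer_data() returns the values ggplot2 computed for a layer,
# including the color assigned to each point.
n_mapped <- length(unique(layer_data(p_mapped)$colour))
n_fixed  <- length(unique(layer_data(p_fixed)$colour))
print(c(mapped = n_mapped, fixed = n_fixed))
```

The mapped plot uses as many colors as there are distinct values in the mapped column, while the fixed plot uses exactly one.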

Click on X button to close the plot. | Close this plot.
Highlight ggplot in the Source window | Now, we will learn how to draw a bar chart using the ggplot function.
Highlight movies in the Source window | In the Source window, click on movies.
Highlight the scroll bar in the Source window | In the Source window, scroll from left to right.
Highlight mpaa_rating in the Source window | Let us inspect the object named mpaa underscore rating in movies.
Highlight the script aesPlots.R in the Source window | Click on the script aesPlots.R.
[RStudio]

str(movies$mpaa_rating)
levels(movies$mpaa_rating)

In the Source window, type the following commands.
Highlight run button in the Source window | Run the last two lines of code.
Highlight output in the Console window | mpaa_rating is a factor.

It has 6 levels:

  • G
  • NC-17
  • PG
  • PG-13
  • R, and
  • Unrated.
Highlight output in the Console window | So, our bar chart will have 6 different bars.


Each bar will represent the number of movies in each level.
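
The link between factor levels and bar heights can be checked without drawing anything. The sketch below is only an illustration: it writes a tiny made-up CSV (toy values, not the real moviesData.csv) to a temporary file and reads it back. How aesPlots.R actually loads the data is not shown in this script, so the use of read.csv here is an assumption. Note that since R 4.0.0, read.csv keeps text columns as character by default, whereas the R 3.4.4 used in this recording converted them to factors.

```r
# Write a tiny stand-in CSV to a temporary file (toy values for illustration).
toy <- data.frame(title = c("A", "B", "C", "D", "E"),
                  mpaa_rating = c("R", "PG", "R", "G", "PG-13"))
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)

# Ask for factors explicitly so that levels() behaves as in the narration
# on both old and new versions of R.
movies_toy <- read.csv(csv_path, stringsAsFactors = TRUE)

str(movies_toy$mpaa_rating)
print(levels(movies_toy$mpaa_rating))

# One bar per level; the bar height is the number of rows at that level.
counts <- table(movies_toy$mpaa_rating)
print(counts)
```

If levels() returns NULL on a newer R installation, the column was most likely read as character and can be converted with factor().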

[RStudio]

ggplot(data = movies,
       mapping = aes(x = mpaa_rating)) +
  geom_bar()

In the Source window, type the following command.
Highlight aes in the Source window | Here, we have mapped mpaa underscore rating to the X-axis.
Highlight geom_bar in the Source window | Next, we have used geom underscore bar as we are plotting a bar chart.

Similarly, we can use

  • geom_line to draw a line chart
  • geom_boxplot to draw a box plot
Highlight run button in the Source window | Run the current line.
Highlight Plots window | The bar chart appears in the Plots window.
Highlight plot in the Plots window | Now, we will learn how to add labels to this bar chart.
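
To see what geom_bar computes, the bar heights can be compared with a plain row count. This is a small sketch using the mpg data set from ggplot2 as a stand-in for movies; the column class plays the role of mpaa_rating.

```r
library(ggplot2)

# Only x is mapped; geom_bar() counts the rows in each category itself.
p <- ggplot(data = mpg, mapping = aes(x = class)) +
  geom_bar()

# Heights that ggplot2 computed for the bars.
bar_heights <- layer_data(p)$count

# The same counts obtained directly from the data.
row_counts <- as.vector(table(mpg$class))

print(bar_heights)
print(row_counts)
```

Both vectors contain one entry per category, which is why no y aesthetic is needed for a bar chart of counts.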
Point to geom_bar().

Type Space plus sign >> press Enter.

In the Source window, after geom_bar(), type space plus sign and press Enter.
[RStudio]

  labs(y = "Rating count",
       title = "Count of mpaa_rating")

Now type the following command.

Highlight labs in the Source window | Here, we have used the labs function to add a label and a title to the bar chart.
Highlight run button in the Source window | Run the current line.
Highlight Plots window | The modified bar chart appears in the Plots window.
Highlight fifth bar in the Plots window | We can see that most of the movies have been rated as R in mpaa_rating.
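
Besides y and title, labs accepts other label names as well. The sketch below is an illustrative variation on the mpg data set from ggplot2 (not the tutorial's movies data); all label texts are invented for this example.

```r
library(ggplot2)

# labs() is a function added to the plot with +, like a geom.
# Each argument names the label it sets.
p <- ggplot(data = mpg, mapping = aes(x = class)) +
  geom_bar() +
  labs(x = "Vehicle class",
       y = "Number of cars",
       title = "Count of cars by class",
       subtitle = "Example using the mpg data set",
       caption = "Illustrative labels only")
```

Any label that is not set keeps its default, which is usually the name of the mapped column.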

Suppose, in this bar chart, we want to view the distribution of movies by genre.

[RStudio]

ggplot(data = movies,
       mapping = aes(x = mpaa_rating, fill = genre)) +
  geom_bar() +
  labs(y = "Rating count",
       title = "Count of mpaa_rating by genre")

In the Source window, type the following command.
Highlight fill in the Source window | Inside aes, we have added the fill argument and set it to genre.
Highlight run button in the Source window | Run the current line.
Highlight Plots window | The modified bar chart appears in the Plots window.
Highlight Plots window | In the Plots window, click on the Zoom button to maximize the plot.
Highlight fifth bar in the Plots window | There are seven different colors in each bar.


Beside the plot, the legend gives the meaning of each color.
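
How the fill aesthetic splits each bar can be inspected in the same way as before. This is a hedged sketch on the mpg data set from ggplot2, where class stands in for mpaa_rating and drv for genre.

```r
library(ggplot2)

# fill is mapped inside aes(), so each bar is split into stacked
# segments, one per value of `drv` that occurs in that bar.
p <- ggplot(data = mpg, mapping = aes(x = class, fill = drv)) +
  geom_bar()

segments <- layer_data(p)

# Number of distinct fill colors used across the whole chart.
n_fill_colours <- length(unique(segments$fill))
print(n_fill_colours)

# One row per class/drv combination that is present in the data.
print(nrow(segments))
```

In this stand-in data, a bar only contains segments for the fill values that actually occur in its category, so the number of colors can differ from bar to bar.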

Click on X button to close the plot. | Close the plot.
Highlight movies in the Source window | In the Source window, click on the movies data frame.
Highlight runtime in the Source window | Now we will plot a histogram for the object named runtime in movies.

Recall that we have already learned how to plot a histogram using the hist function.

Now we will use the ggplot2 package to plot a histogram.

Highlight the script aesPlots.R in the Source window. | Click on the script aesPlots.R.
[RStudio]

ggplot(data = movies,
       mapping = aes(x = runtime)) +
  geom_histogram() +
  labs(x = "Runtime of movies",
       title = "Distribution of movies' runtime")

In the Source window, type the following command.
Highlight run button in the Source window | Save the script and run the current line.
Highlight output in the Console window | There are some warning messages, which we will ignore for now.
Highlight Plots window | The histogram appears in the Plots window.
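
The exact messages are not shown in this script. One likely candidate is the note geom_histogram prints when no bin setting is given, since it then falls back to 30 bins; a warning about removed rows can also appear if the column has missing values. The sketch below, on the mpg data set from ggplot2 as a stand-in, sets the bin width explicitly, which avoids the first message; the width of 2 is an arbitrary choice for this example.

```r
library(ggplot2)

# Choosing binwidth (or bins) explicitly replaces the 30-bin default.
p <- ggplot(data = mpg, mapping = aes(x = hwy)) +
  geom_histogram(binwidth = 2) +
  labs(x = "Highway miles per gallon",
       title = "Distribution of hwy")

# Each row of the layer data is one bin; the counts add up to the
# number of rows that have a usable value.
bins <- layer_data(p)
print(nrow(bins))
print(sum(bins$count))
```

Trying a few different widths is usually worthwhile, because the shape of a histogram can change noticeably with the bin size.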
Let us summarize what we have learnt.
Show slide

Summary

In this tutorial, we have learnt:
  • What an aesthetic is
  • How to create plots using aesthetics
  • How to tune parameters in aesthetics
Show slide

Assignment

We now suggest an assignment.
  • Using built-in data set mtcars, draw a bar chart from the object cyl.
  • Add suitable labels to this bar chart.
Show slide

About the Spoken Tutorial Project

The video at the following link summarises the Spoken Tutorial project.

Please download and watch it.

Show slide

Spoken Tutorial Workshops

We conduct workshops using Spoken Tutorials and give certificates.


Please contact us.

Show Slide

Forum to answer questions

Please post your timed queries in this forum.
Show Slide

Forum to answer questions

Please post your general queries in this forum.
Show Slide

Textbook Companion

The FOSSEE team coordinates the TBC project.

For more details, please visit these sites.

Show Slide

Acknowledgment

The Spoken Tutorial project is funded by NMEICT, MHRD, Govt. of India.
Show Slide

Thank You

The script for this tutorial was contributed by Varshit Dubey (CoE Pune).

This is Sudhakar Kumar from IIT Bombay signing off. Thanks for watching.

Contributors and Content Editors

Madhurig, Nancyvarkey, Sudhakarst