Latest revision as of 07:36, 6 May 2019
Title of script: Plotting Histograms and Pie Chart
Author: Tushar Bajaj (TISS Mumbai) and Sudhakar Kumar (IIT Bombay)
Keywords: R, RStudio, graphs, histogram, frequency, pie chart, video tutorial
Visual Cue | Narration |
Show slide
Opening slide |
Welcome to the spoken tutorial on Plotting Histograms and Pie Chart. |
Show slide
Learning Objectives |
In this tutorial, we will learn how to:
|
Show slide
Pre-requisites |
To understand this tutorial, you should know,
If not, please locate the relevant tutorials on R on this website. |
Show slide
System Specifications |
This tutorial is recorded on
Install R version 3.2.0 or higher. |
Show slide
Download Files |
For this tutorial, we will use
* A data frame moviesData.csv
* A script file myPlots.R.
Please download these files from the Code files link of this tutorial. |
[Computer screen]
Highlight moviesData.csv and myPlots.R in the folder Plots |
I have downloaded and moved these files to the Plots folder.
This folder is located in the myProject folder on my Desktop. I have also set this folder as my Working Directory. |
| Let us switch to RStudio. |
Highlight myPlots.R in the Files window of RStudio | Open the script myPlots.R in RStudio. |
Highlight the Source button | Run this script by clicking on the Source button. |
Highlight movies in the Source window | The movies data frame opens in the Source window. |
Highlight dim(movies) in the Console window | This data frame has 600 rows and 31 columns.
It means this data frame has 600 observations of 31 variables. |
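A minimal sketch of how a CSV file becomes a data frame whose size dim reports. The contents of myPlots.R are not shown in this script, so the read.csv call is an assumption, and the three-row file written below is a hypothetical stand-in for moviesData.csv.

```r
# Hypothetical miniature stand-in for moviesData.csv
# (the real file has 600 rows and 31 columns).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("title,runtime,genre",
             "Movie A,95,Drama",
             "Movie B,120,Comedy",
             "Movie C,88,Drama"),
           csv_path)

# Assumption: the script reads the file with read.csv().
movies <- read.csv(csv_path)

# dim() returns the number of rows (observations) and columns (variables).
print(dim(movies))
print(names(movies))
```

With the real file, the same call to dim would report 600 and 31, as stated in the narration.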
https://spoken-tutorial.org | To know more about this data frame, please refer to the Additional Material section on this website. |
Highlight the scroll bar in the Source window | In the Source window, scroll from left to right to see the remaining objects of the movies data frame. |
Highlight runtime in the Source window | Now we will learn how to plot a histogram of the object named runtime in movies. |
Show slide
Histogram |
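A histogram is
* A visual representation of the distribution of a dataset.
* Used to plot the frequency of score occurrences in a continuous dataset. |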
| Let us switch to RStudio. |
Highlight myPlots.R in the Source window | Click on the script myPlots.R |
[RStudio]
hist(movies$runtime) |
In the Source window, type hist, within parentheses movies dollar sign runtime. |
Highlight Run button in the Source window | Save the script and run the current line by pressing the Run button. |
Highlight Plots window | The histogram appears in the Plots window. |
Highlight Zoom in the Plots window | Click on Zoom to maximize this plot. |
Highlight the plot in the Plots window | In the histogram, there are 9 bins.
The height of a bin represents the number of observations lying in that interval. |
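The bins and their heights can also be inspected as numbers. The sketch below uses a made-up runtime vector (not the tutorial's data) and plot = FALSE, so that hist returns the computed bin boundaries and counts without drawing anything.

```r
# Made-up run times in minutes; a stand-in for movies$runtime.
runtime <- c(65, 80, 92, 95, 101, 104, 110, 118, 125, 133, 150, 172, 267)

# plot = FALSE asks hist() to return the histogram object without drawing it.
h <- hist(runtime, plot = FALSE)

print(h$breaks)  # bin boundaries
print(h$counts)  # number of observations in each bin (the bar heights)

# Every observation falls in exactly one bin.
print(sum(h$counts) == length(runtime))
```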
Highlight the plot in the Plots window | Now, we will learn how to add labels to this histogram.
Also, we will change the color of bins in this histogram. |
Highlight hist in the Source window | For this, we will add more arguments to the hist function. |
| Close the histogram. |
[RStudio]
hist(movies$runtime, main = "Distribution of movies' length", xlab = "Run time of movies", xlim = c(0,300), col = "blue") |
In the Source window, type the following command. |
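Highlight hist in the Source window | Here, we have used the following arguments:
* main for adding a title to the histogram
* xlab for adding a label to the x axis
* xlim to set the range of values on the x axis
* col to set the color of the bins |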
Highlight Run button in the Source window | Run the current line. |
Highlight Files and Plots window | In the Files and Plots window, click on Zoom to maximize the plot. |
Highlight labels and title of the histogram | The labels and the title of the histogram have been changed. |
Highlight plot in the Plots window | We can observe that most of the movies have a runtime of around 75 to 125 minutes. |
Highlight plot in the Plots window | Now we will modify the number of breaks in the histogram.
We can make the groups finer or coarser by modifying the number of breaks. |
| Close this plot. |
Highlight hist in the Source window | Let us add the breaks argument to the hist function and set it to 4. |
[RStudio]
hist(movies$runtime, main = "Distribution of movies' length", xlab = "Run time of movies", xlim = c(0,300), col = "blue", breaks = 4) |
In the Source window, type the following command. |
Highlight Run button in the Source window | Save the script and run the current line. |
Highlight Files and Plots window | In the Files and Plots window, click on Zoom to maximize the plot. |
Highlight plot in the Plots window | Now, there are five bins in the histogram.
Remember we had set breaks to be 4. |
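The number of bins can differ from the value given because hist treats a single number passed to breaks as a suggestion only; the actual boundaries come from pretty, which picks round values. A sketch with a made-up vector (not the tutorial's data), which also shows that an explicit vector of break points is used exactly as given:

```r
# Made-up run times in minutes; a stand-in for movies$runtime.
runtime <- c(65, 80, 92, 95, 101, 104, 110, 118, 125, 133, 150, 172, 267)

# A single number is only a suggestion for the number of bins.
suggested <- hist(runtime, breaks = 4, plot = FALSE)
print(suggested$breaks)
print(length(suggested$counts))  # number of bins actually used

# hist() derives these boundaries from pretty() applied to the data range.
print(pretty(range(runtime), n = 4, min.n = 1))

# A vector of break points is honored exactly: five points give four bins.
exact <- hist(runtime, breaks = seq(0, 300, by = 75), plot = FALSE)
print(exact$breaks)
print(exact$counts)
```

Passing a vector is the option to reach for when a specific number of bins is required; the vector must cover the whole range of the data.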
| Close the plot. |
Highlight movies in the Source window | In the Source window, click on movies. |
| I am scrolling from right to left. |
Highlight genre in the Source window | Now, we will learn how to create a pie chart from the object genre in the movies data frame. |
Show slide
Pie chart |
A pie chart is
* Divided into wedge-like sectors, illustrating proportion.
The total value of the pie is always 100 percent. |
| Let us switch to RStudio. |
Highlight genre in the Source window | First, we will make a table of the number of different genres.
For this, we use the table function. |
Highlight myPlots.R in the Source window | Click on the script myPlots.R |
[RStudio]
genreCount <- table(movies$genre)
View(genreCount) |
In the Source window, type the following commands. |
Highlight Run button in the Source window. | Run the last two lines. |
Highlight genreCount in the Source window | We can see that there are 65 movies in Action & Adventure and 87 movies in Comedy. |
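A small sketch of what table computes, using a made-up genre vector in place of movies$genre. It prints the counts to the console instead of calling View, which opens a viewer in RStudio.

```r
# Made-up genres; a stand-in for movies$genre.
genre <- c("Drama", "Comedy", "Drama", "Action & Adventure", "Drama", "Comedy")

# table() counts how many times each distinct value occurs.
genreCount <- table(genre)
print(genreCount)

# A single count can be picked out by its name.
print(genreCount["Drama"])
```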
Highlight genreCount in the Source window | Now, we draw a pie chart with genreCount. |
Highlight myPlots.R in the Source window | Click on the script myPlots.R |
[RStudio]
pie(genreCount) |
In the Source window, type pie, within parentheses genreCount. |
Highlight Run button in the Source window | Run the current line. |
Highlight Plots window | The pie chart appears in the Plots window. |
Highlight Zoom in the Plots window | Click on Zoom to maximize this pie chart. |
Highlight plot in the Plots window | We can see that Drama has the largest share of movies. |
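The shares that the pie chart shows as wedges can also be computed directly. The sketch below uses a made-up genre vector (not the tutorial's data); prop.table converts the counts to proportions, and which.max picks the largest category.

```r
# Made-up genres; a stand-in for movies$genre.
genre <- c("Drama", "Comedy", "Drama", "Action & Adventure", "Drama", "Comedy")
genreCount <- table(genre)

# Convert the counts to percentages; together they make up the whole pie.
share <- round(100 * prop.table(genreCount), 1)
print(share)

# Name of the genre with the largest count.
largest <- names(which.max(genreCount))
print(largest)
```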
Highlight plot in the Plots window | Now, we will change the color of the border in this pie chart. |
Highlight pie in the Source window | For this, we will add more arguments to the pie function. |
| Close this plot. |
[RStudio]
pie(genreCount, main = "Proportion of movies' genre", border = "blue", col = "orange") |
In the Source window, type the following command. |
Highlight Run button in the Source window | Save the script and run the current line. |
Highlight the plot in the Plots window | The modified pie chart appears in the Plots window. |
Highlight Zoom in the Plots window. | Click on Zoom to maximize the plot. |
Highlight the plot in the Plots window. | Now, we will learn how to save this pie chart as an image on our computer. |
| Close this plot. |
Highlight Files and Plots window | I am resizing the Files and Plots window. |
Highlight Export in the Plots window
Highlight Save as Image option |
In the Plots window, click on Export.
From the drop-down menu, select Save as Image. |
Highlight Save Plot as Image window | A window named Save Plot as Image appears. |
Highlight Image format option
Highlight JPEG option |
You can select the format in which you want to save your image.
I am saving it in JPEG format. |
Highlight Directory option
Highlight path of Directory |
Below the Image format option, you can select the directory where you want to save your image.
By default, RStudio will save the image in the directory where the script has been placed. I will save the image in this folder. |
Highlight File name option | Below the directory, in the field File name, you can write the name of the image to be saved.
I am saving the image with the name pieChart with capital C. |
Highlight Width and Height field
Highlight Maintain aspect ratio option |
In the top right corner of this window, you can modify the dimensions of this image.
I am setting the width to 650. Please make sure that Maintain aspect ratio is checked. |
Highlight Update Preview option | Below Maintain aspect ratio, click on the Update Preview button. |
Highlight the image | The image with larger dimensions is shown. |
Highlight Save option | Finally, click on Save. |
| I am resizing the Files and Plots window. |
Highlight Files tab | Now, click on the Files tab. |
Highlight pieChart.jpeg in the Files tab | The image pieChart.jpeg has been saved in the Plots folder.
Click on this image to see the saved pie chart. |
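The same image can also be written from a script rather than through the Export menu. This is an alternative to the steps shown in the tutorial, sketched with base R's jpeg device and a made-up genre table. The output location (a temporary folder) and the height of 480 pixels are assumptions; the tutorial only sets the width to 650.

```r
# Made-up genre counts; a stand-in for genreCount from the tutorial.
genreCount <- table(c("Drama", "Comedy", "Drama",
                      "Action & Adventure", "Drama", "Comedy"))

# Assumed output location; the tutorial saves into the Plots folder instead.
out_file <- file.path(tempdir(), "pieChart.jpeg")

# The JPEG device is not available in every R build, so check first.
if (capabilities("jpeg")) {
  jpeg(filename = out_file, width = 650, height = 480)  # height is an assumption
  pie(genreCount,
      main = "Proportion of movies' genre",
      border = "blue",
      col = "orange")
  dev.off()  # close the device so that the file is written
  print(file.exists(out_file))
}
```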
| Let us summarize what we have learnt. |
Show slide
Summary |
In this tutorial, we have learnt how to:
* Plot histograms
* Plot pie charts
* Save plots |
Show slide
Assignment |
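We now suggest an assignment.
* Read the file moviesData.csv. Create a histogram of the object named imdb_num_votes in this file.
* Create a pie chart of the object mpaa_rating.
* Save both the plots. |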
Show slide
About the Spoken Tutorial Project |
The video at the following link summarises the Spoken Tutorial project.
Please download and watch it. |
Show slide
Spoken Tutorial Workshops |
We conduct workshops using Spoken Tutorials and give certificates.
Please contact us. |
Show Slide
Forum to answer questions |
Please post your timed queries in this forum. |
Show Slide
Forum to answer questions |
Please post your general queries in this forum. |
Show Slide
Textbook Companion |
The FOSSEE team coordinates the TBC project.
For more details, please visit these sites. |
Show Slide
Acknowledgements |
The Spoken Tutorial project is funded by NMEICT, MHRD, Govt. of India |
Show Slide
Thank You |
The script for this tutorial was contributed by Tushar Bajaj (TISS Mumbai).
This is Sudhakar Kumar from IIT Bombay signing off. Thanks for watching. |