R/C2/Plotting-Histograms-and-Pie-Chart/English

Revision as of 07:36, 6 May 2019 by Nancyvarkey (Talk | contribs)


Title of script: Plotting Histograms and Pie Chart

Author: Tushar Bajaj (TISS Mumbai) and Sudhakar Kumar (IIT Bombay)

Keywords: R, RStudio, graphs, histogram, frequency, pie chart, video tutorial

Visual Cue Narration
Show slide

Opening slide

Welcome to the spoken tutorial on Plotting Histograms and Pie Chart.
Show slide

Learning Objectives

In this tutorial, we will learn how to:
  • Plot histograms
  • Plot pie chart
  • Save plots
Show slide

Pre-requisites

To understand this tutorial, you should know,
  • Data frames in R
  • Basics of Statistics

If not, please locate the relevant tutorials on R on this website.

Show slide

System Specifications

This tutorial is recorded on
  • Ubuntu Linux OS version 16.04
  • R version 3.4.4
  • RStudio version 1.1.456

Install R version 3.2.0 or higher.

Show slide

Download Files

For this tutorial, we will use
  • A data frame moviesData.csv
  • A script file myPlots.R.

Please download these files from the Code files link of this tutorial.

[Computer screen]

Highlight moviesData.csv and myPlots.R in the folder Plots

I have downloaded and moved these files to the Plots folder.

This folder is located in the myProject folder on my Desktop.

I have also set this folder as my Working Directory.

Let us switch to RStudio.
Highlight myPlots.R in the Files window of RStudio Open the script myPlots.R in RStudio.
Highlight the Source button Run this script by clicking on the Source button.
Highlight movies in the Source window The movies data frame opens in the Source window.
Highlight dim(movies) in the Console window This data frame has 600 rows and 31 columns.

It means this data frame has 600 observations of 31 variables.
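
The dimension check above can be sketched as follows. The contents of myPlots.R are not shown in this script, so the use of read.csv here is an assumption, and the three-row stand-in file is made up so that the sketch runs on its own.

```r
# Stand-in data: three made-up rows, NOT the real moviesData.csv.
sample_movies <- data.frame(
  title   = c("Movie A", "Movie B", "Movie C"),
  genre   = c("Drama", "Comedy", "Drama"),
  runtime = c(95, 110, 128)
)

# Write the stand-in data to a temporary CSV file and read it back,
# the way one might read moviesData.csv from the working directory.
csv_path <- tempfile(fileext = ".csv")
write.csv(sample_movies, csv_path, row.names = FALSE)
movies_small <- read.csv(csv_path)

# dim() returns the number of rows followed by the number of columns.
print(dim(movies_small))
```

For the real movies data frame, dim(movies) would report 600 rows and 31 columns, as stated in the narration.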

https://spoken-tutorial.org To know more about this data frame, please refer to the Additional Material section on this website.
Highlight the scroll bar in the Source window In the Source window, scroll from left to right to see the remaining objects of the movies data frame.
Highlight runtime in the Source window Now we will learn how to plot a histogram of the object named runtime in movies.
Show slide

Histogram

A histogram is
  • A visual representation of the distribution of a dataset.
  • Used to plot the frequency of score occurrences in a continuous dataset
Let us switch to RStudio.
Highlight myPlots.R in the Source window Click on the script myPlots.R
[RStudio]

hist(movies$runtime)

In the Source window, type hist, within parentheses movies dollar sign runtime.
Highlight Run button in the Source window Save the script and run the current line by pressing the Run button.
Highlight Plots window The histogram appears in the Plots window.
Highlight Zoom in the Plots window Click on Zoom to maximize this plot.
Highlight the plot in the Plots window In the histogram there are 9 bins.

The height of a bin represents the number of observations lying in that interval.
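
The bin boundaries and bin heights can also be inspected as numbers. The following is a minimal sketch; the vector runtime_demo is made-up data standing in for movies$runtime.

```r
# Made-up runtimes in minutes; the real values come from movies$runtime.
runtime_demo <- c(62, 88, 91, 95, 99, 102, 104, 107,
                  110, 113, 118, 124, 131, 150, 202)

# plot = FALSE returns the histogram object without drawing it.
h <- hist(runtime_demo, plot = FALSE)

print(h$breaks)  # boundaries of the bins
print(h$counts)  # number of observations in each bin
```

The counts always add up to the number of observations, and there is one more boundary than there are bins.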

Highlight the plot in the Plots window Now, we will learn how to add labels to this histogram.

Also, we will change the color of bins in this histogram.

Highlight hist in the Source window For this, we will add more arguments to the hist function.
Close the histogram.
[RStudio]

hist(movies$runtime,
     main = "Distribution of movies' length",
     xlab = "Run time of movies",
     xlim = c(0,300),
     col = "blue")

In the Source window, type the following command.
Highlight hist in the Source window Here, we have used the following arguments:
  • main for adding a title to the histogram
  • xlab for adding a label to the x axis
  • xlim to set the range of values on the x axis
  • col to set the color of the bins
Highlight Run button in the Source window Run the current line.
Highlight Files and Plots window In the Files and Plots window, click on Zoom to maximize the plot.
Highlight labels and title of the histogram The labels and the title of the histogram have been changed.
Highlight plot in the Plots window We can observe that most of the movies have a runtime of around 75 to 125 minutes.
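
An observation like this, read off the plot, can be checked with a short calculation. The sketch below uses the made-up vector runtime_demo, not the real movies$runtime, so the share it prints says nothing about the actual data.

```r
# Made-up runtimes in minutes; the real values come from movies$runtime.
runtime_demo <- c(62, 88, 91, 95, 99, 102, 104, 107,
                  110, 113, 118, 124, 131, 150, 202)

# TRUE for every runtime between 75 and 125 minutes, inclusive.
in_range <- runtime_demo >= 75 & runtime_demo <= 125

# The mean of a logical vector is the share of TRUE values.
# na.rm = TRUE skips missing runtimes, if there are any.
share <- mean(in_range, na.rm = TRUE)
print(share)
```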
Highlight plot in the Plots window Now we will modify the number of breaks in the histogram.

We can make the groups finer or coarser by modifying the number of breaks.

Close this plot.
Highlight hist in the Source window Let us add the breaks argument to the hist function and set it to 4.
[RStudio]

hist(movies$runtime,
     main = "Distribution of movies' length",
     xlab = "Run time of movies",
     xlim = c(0,300),
     col = "blue",
     breaks = 4)

In the Source window, type the following command.
Highlight Run button in the Source window Save the script and run the current line.
Highlight Files and Plots window In the Files and Plots window, click on Zoom to maximize the plot.
Highlight plot in the Plots window Now, there are five bins in the histogram.

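Remember we had set breaks to 4.

When breaks is a single number, R treats it only as a suggestion. It chooses convenient, rounded break points, so the number of bins can differ from the number we asked for.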

Close the plot.
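
The behaviour of the breaks argument can be sketched on made-up data. In this sketch, runtime_demo is a hypothetical stand-in for movies$runtime. A single number is passed on to pretty(), which picks rounded break points, while a vector of break points is used exactly as given.

```r
# Made-up runtimes in minutes; the real values come from movies$runtime.
runtime_demo <- c(62, 88, 91, 95, 99, 102, 104, 107,
                  110, 113, 118, 124, 131, 150, 202)

# A single number is only a suggestion for the number of bins.
h4 <- hist(runtime_demo, breaks = 4, plot = FALSE)
print(h4$breaks)          # the break points R actually chose
print(length(h4$counts))  # the resulting number of bins

# A vector of break points gives exact control over the bins.
# These break points must cover every value in the data.
h_exact <- hist(runtime_demo, breaks = seq(0, 300, by = 50), plot = FALSE)
print(length(h_exact$counts))
```

Passing a vector such as seq(0, 300, by = 50) is one way to get a fixed bin width when the suggested number does not give the grouping you want.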
Highlight movies in the Source window In the Source window, click on movies.
I am scrolling from right to left.
Highlight genre in the Source window Now, we will learn how to create a pie chart from the object genre in the movies data frame.
Show slide

Pie chart

A pie chart is
  • A circular chart.
  • Divided into wedge-like sectors, illustrating proportion.

The total value of the pie is always 100 percent.

Let us switch to RStudio.
Highlight genre in the Source window First, we will make a table of the number of different genres.

For this, we use the table function.

Highlight myPlots.R in the Source window Click on the script myPlots.R
[RStudio]

genreCount <- table(movies$genre)

View(genreCount)

In the Source window, type the following command.
Highlight Run button in the Source window. Run the last two lines.
Highlight genreCount in the Source window We can see that there are 65 movies in Action & Adventure and 87 movies in Comedy.
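
The way table counts categories can be sketched on a small made-up vector. The names genre_demo and genre_count_demo are hypothetical and stand in for movies$genre and genreCount.

```r
# Made-up genres; the real values come from movies$genre.
genre_demo <- c("Drama", "Comedy", "Drama",
                "Action & Adventure", "Drama", "Comedy")

# table() counts how often each distinct value occurs.
genre_count_demo <- table(genre_demo)
print(genre_count_demo)

# Sort the counts from the most frequent genre to the least frequent.
print(sort(genre_count_demo, decreasing = TRUE))

# Look up the count for a single genre by its name.
print(genre_count_demo["Drama"])
```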
Highlight genreCount in the Source window Now, we draw a pie chart with genreCount.
Highlight myPlots.R in the Source window Click on the script myPlots.R
[RStudio]

pie(genreCount)

In the Source window, type pie within parentheses genreCount.
Highlight Run button in the Source window Run the current line.
Highlight Plots window The pie chart appears in the Plots window.
Highlight Zoom in the Plots window Click on Zoom to maximize this pie chart.
Highlight plot in the Plots window We can see that Drama has the largest share of movies.
Highlight plot in the Plots window Now, we will change the color of the border in this pie chart.
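
The share of each genre can also be computed and shown on the slices. This is a minimal sketch on made-up counts. The names genre_count_demo, genre_share and slice_labels are hypothetical and are not part of myPlots.R.

```r
# Made-up genres; the real counts come from table(movies$genre).
genre_count_demo <- table(c("Drama", "Drama", "Drama",
                            "Comedy", "Comedy", "Action & Adventure"))

# prop.table() converts counts into proportions that add up to 1.
genre_share <- prop.table(genre_count_demo)

# Build one label per slice: the genre name and its rounded percentage.
slice_labels <- paste0(names(genre_count_demo), " (",
                       as.vector(round(100 * genre_share)), "%)")

# The labels argument replaces the default slice names.
pie(genre_count_demo, labels = slice_labels)
```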
Highlight pie in the Source window For this, we will add more arguments to the pie function.
Close this plot.
[RStudio]

pie(genreCount,
    main = "Proportion of movies' genre",
    border = "blue",
    col = "orange")

In the Source window, type the following command.
Highlight Run button in the Source window Save the script and run the current line.
Highlight the plot in the Plots window The modified pie chart appears in the Plots window.
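
A single value for col is reused for every slice, so all slices get the same fill color. If a different color per slice is preferred, one possible variation is sketched below on made-up counts. The use of rainbow and the name slice_colors are assumptions and are not part of this tutorial's script.

```r
# Made-up genres; the real counts come from table(movies$genre).
genre_count_demo <- table(c("Drama", "Drama", "Drama",
                            "Comedy", "Comedy", "Action & Adventure"))

# rainbow(n) returns n distinct colors, one for each slice.
slice_colors <- rainbow(length(genre_count_demo))

pie(genre_count_demo,
    main = "Proportion of movies' genre",
    border = "blue",
    col = slice_colors)
```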
Highlight Zoom in the Plots window. Click on Zoom to maximize the plot.
Highlight the plot in the Plots window. Now, we will learn how to save this pie chart as an image on our computer.
Close this plot.
Highlight Files and Plots window I am resizing the Files and Plots window.
Highlight Export in the Plots window

Highlight Save as Image option

In the Plots window, click on Export.

From the drop-down menu, select Save as Image.

Highlight Save Plot as Image window A window named Save Plot as Image appears.
Highlight Image format option

Highlight JPEG option

You can select the format in which you want to save your image.

I am saving it in JPEG format.

Highlight Directory option

Highlight path of Directory

Below the Image format option, you can select the directory where you want to save your image.

By default, RStudio will save the image in the directory where the script has been placed.

I will save the image in this folder.

Highlight File name option Below the directory, in the field File name, you can write the name of the image to be saved.

I am saving the image with the name pieChart with capital C.

Highlight Width and Height field

Highlight Maintain aspect ratio option

In the top right corner of this window, you can modify the dimensions of this image.

I am setting the width to 650.

Please make sure that Maintain aspect ratio is checked.

Highlight Update Preview option Below Maintain aspect ratio, click on Update Preview button.
Highlight the image The image with larger dimensions is shown.
Highlight Save option Finally, click on Save.
I am resizing the Files and Plots window.
Highlight Files tab Now, click on the Files tab.
Highlight pieChart.jpeg in the Files tab The image pieChart.jpeg has been saved in the Plots folder.


Click on this image to see the saved pie chart.
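
A plot can also be saved from code instead of the Export menu. The following is a minimal sketch that uses the jpeg device from base R. The temporary output folder, the height of 650 and the made-up counts are assumptions chosen only so that the sketch runs on its own.

```r
# Made-up genres; the real counts come from table(movies$genre).
genre_count_demo <- table(c("Drama", "Drama", "Drama",
                            "Comedy", "Comedy", "Action & Adventure"))

# Save into a temporary folder here; use a path in your own
# working directory to keep the image.
out_file <- file.path(tempdir(), "pieChart.jpeg")

# Open a JPEG file as the drawing device (size in pixels).
jpeg(filename = out_file, width = 650, height = 650)

# Everything drawn now goes into the file, not the Plots window.
pie(genre_count_demo, main = "Proportion of movies' genre")

# Close the device so that the file is written to disk.
dev.off()

print(file.exists(out_file))
```

The call to dev.off() is needed to finish writing the image; until the device is closed, the file may be incomplete.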

Let us summarize what we have learnt.
Show slide

Summary

In this tutorial, we have learnt how to:
  • Plot histograms
  • Plot pie chart
  • Save plots
Show slide

Assignment

We now suggest an assignment.
  • Read the file moviesData.csv. Create a histogram of the object named imdb_num_votes in this file.
  • Create a pie chart of the object mpaa_rating.
  • Save both the plots.
Show slide

About the Spoken Tutorial Project

The video at the following link summarises the Spoken Tutorial project.

Please download and watch it.

Show slide

Spoken Tutorial Workshops

We conduct workshops using Spoken Tutorials and give certificates.

Please contact us.

Show Slide

Forum to answer questions

Please post your timed queries in this forum.
Show Slide

Forum to answer questions

Please post your general queries in this forum.
Show Slide

Textbook Companion

The FOSSEE team coordinates the TBC project.

For more details, please visit these sites.

Show Slide

Acknowledgements

The Spoken Tutorial project is funded by NMEICT, MHRD, Govt. of India
Show Slide

Thank You

The script for this tutorial was contributed by Tushar Bajaj (TISS Mumbai).

This is Sudhakar Kumar from IIT Bombay signing off. Thanks for watching.

Contributors and Content Editors

Nancyvarkey, Sudhakarst