R/C2/Plotting-Bar-Charts-and-Scatter-Plot/English-timed

Time Narration
00:01 Welcome to this tutorial on Plotting bar charts and scatter plot.
00:08 In this tutorial, we will learn how to:
00:12 Plot bar charts
00:14 Plot scatter plot
00:18 Find the correlation coefficient between two objects.
00:22 To understand this tutorial, you should know,
00:27 Data frames in R
00:29 Basics of Statistics
00:33 If not, please locate the relevant R tutorials on this website.
00:40 This tutorial is recorded on
00:43 Ubuntu Linux OS version 16.04
00:48 R version 3.4.4
00:51 RStudio version 1.1.463
00:57 Install R version 3.2.0 or higher.
01:02 For this tutorial, we will use
01:06 A data frame moviesData.csv
01:10 A script file barPlots.R.
01:15 Please download these files from the Code files link of this tutorial.
01:21 I have downloaded and moved these files to Plots folder.
01:28 This folder is located in myProject folder on my Desktop.
01:33 I have also set Plots folder as my Working Directory.
01:40 Let us switch to RStudio.
01:42 Open the script barPlots.R in RStudio.
01:49 Run this script by clicking on Source button.
01:53 movies data frame opens in the Source window.
01:58 It has 600 observations of 31 variables.
02:04 In the Source window, scroll from left to right.
02:10 This will enable us to see the remaining objects of movies data frame.
02:17 Now, we will learn how to draw a bar chart of the object named imdb underscore rating in movies.
02:27 A bar chart represents data as rectangular bars, with the length of each bar proportional to the value of the variable.
02:37 R uses the function barplot to create bar charts.
02:42 Let us switch to RStudio.
02:45 For the sake of simplicity, we are considering only the first 20 observations of movies to draw a bar chart.
02:54 Click on the script barPlots.R
02:58 In the Source window, type the following command.
03:02 Save the script and run the current line by pressing Ctrl + Enter keys simultaneously.
03:11 Let me resize the Source window.
03:14 moviesSub with 20 observations is loaded in the Environment.
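The command itself is not included in this transcript. A minimal sketch of how the first 20 observations might be selected is shown here. The small movies data frame built at the top is a made-up stand-in so that the example runs on its own; in the tutorial, movies is loaded from moviesData.csv by the script.

```r
# Made-up stand-in for the movies data frame (hypothetical values).
movies <- data.frame(
  title = paste("Movie", 1:30),
  imdb_rating = rep(c(5.5, 6.0, 7.2, 8.1, 6.8), times = 6),
  stringsAsFactors = FALSE
)

# Keep rows 1 to 20 and all columns.
moviesSub <- movies[1:20, ]

# Check the size of the subset: number of rows, then number of columns.
dim(moviesSub)
```

Leaving the column position empty after the comma keeps every column of the data frame.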
03:21 Now, we draw a bar chart of imdb_rating for these movies.
03:28 In the Source window, type the following command.
03:34 Here, we have used the following arguments:
03:39 moviesSub dollar sign imdb underscore rating is the data for plotting
03:46 ylab and xlab for adding labels to the respective axes.
03:53 col to set the color of the bars
03:57 ylim to set the range of values on Y-axis
04:02 main for adding a title to the bar chart.
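The arguments listed above could be combined as in the following sketch. The axis labels, color and title are assumptions, since the transcript does not show the exact values, and moviesSub is rebuilt here from made-up ratings so that the example is self-contained.

```r
# Made-up stand-in for moviesSub (hypothetical values).
moviesSub <- data.frame(
  title = paste("Movie", 1:20),
  imdb_rating = rep(c(5.5, 6.0, 7.2, 8.1, 6.8), times = 4),
  stringsAsFactors = FALSE
)

# One bar per movie; bar height equals the IMDB rating.
barplot(moviesSub$imdb_rating,
        ylab = "IMDB Rating",            # label for the Y-axis (assumed text)
        xlab = "Movies",                 # label for the X-axis (assumed text)
        col = "blue",                    # fill color of the bars (assumed)
        ylim = c(0, 10),                 # IMDB ratings lie between 0 and 10
        main = "Movies' IMDB Ratings")   # plot title (assumed text)
```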
04:07 Run the current line.
04:09 The bar chart is displayed with Movies on X-axis and their imdb_rating on Y-axis.
04:18 In the Plots window, click on Zoom to maximize the plot.
04:23 This particular movie has an IMDB rating of approximately 6.
04:31 Similarly, this particular movie has an IMDB rating of approximately 8.
04:39 However, we do not know the names of the movies.
04:44 So, we will add more arguments to the barplot function to show the names of the movies on the X-axis.
04:52 Close this plot.
04:55 In the Source window, type the following command.
05:00 Here, we have used the argument names.arg and set it to title.
05:06 Remember, title column in moviesSub contains the names of movies.
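A sketch of how names.arg might be used follows. The data values and label texts are made up for illustration; only names.arg and the title column come from the narration.

```r
# Made-up stand-in for moviesSub (hypothetical values).
moviesSub <- data.frame(
  title = paste("Movie", 1:20),
  imdb_rating = rep(c(5.5, 6.0, 7.2, 8.1, 6.8), times = 4),
  stringsAsFactors = FALSE
)

# names.arg supplies one label per bar, taken from the title column.
barplot(moviesSub$imdb_rating,
        ylab = "IMDB Rating",
        xlab = "Movies",
        col = "blue",
        ylim = c(0, 10),
        main = "Movies' IMDB Ratings",
        names.arg = moviesSub$title)
```

The vector given to names.arg should have the same length as the vector of bar heights.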
05:13 Run the current line.
05:16 In the Plots window, click on Zoom to maximize the plot.
05:22 Now, the names of movies are displayed on the X-axis.
05:27 But not for all movies.
05:30 This is because the names are too long to fit along the axis.
05:36 So, we will make these names perpendicular to the X-axis.
05:42 Close this plot.
05:44 In the Source window, type the following command.
05:49 Here, we have used the las argument.
05:53 las equal to 2 produces labels which are at right angles to the axis.
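A sketch of the same bar chart with las set to 2 follows. The titles and ratings are made up, and the X-axis label is left out here as a design choice so that it does not overlap the rotated names.

```r
# Made-up stand-in for moviesSub (hypothetical values).
moviesSub <- data.frame(
  title = paste("Movie", 1:20),
  imdb_rating = rep(c(5.5, 6.0, 7.2, 8.1, 6.8), times = 4),
  stringsAsFactors = FALSE
)

# las = 2 draws the axis labels perpendicular to their axis,
# so the movie names run vertically below the bars.
barplot(moviesSub$imdb_rating,
        ylab = "IMDB Rating",
        col = "blue",
        ylim = c(0, 10),
        main = "Movies' IMDB Ratings",
        names.arg = moviesSub$title,
        las = 2)
```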
06:01 Run the current line.
06:03 In the Plots window, click on Zoom to maximize the plot.
06:10 Now the names for all the movies are displayed on X-axis.
06:15 For example, Filly Brown has an IMDB rating of approximately 6.
06:23 However, longer names are being truncated.
06:28 We can add more arguments to barplot function for adjusting labels.
06:34 For more information, please refer to the Additional Material section on this website.
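One possible adjustment for truncated labels is sketched below. It is an assumption, not necessarily the approach described in the Additional Material: the bottom margin is widened with par and the label text is shrunk with the cex.names argument of barplot. The long titles are made up.

```r
# Made-up stand-in for moviesSub with deliberately long titles.
moviesSub <- data.frame(
  title = paste("A Rather Long Movie Title", 1:20),
  imdb_rating = rep(c(5.5, 6.0, 7.2, 8.1, 6.8), times = 4),
  stringsAsFactors = FALSE
)

# Widen the bottom margin (order: bottom, left, top, right; in text lines)
# and remember the previous settings.
old_par <- par(mar = c(12, 4, 4, 2))

barplot(moviesSub$imdb_rating,
        ylab = "IMDB Rating",
        col = "blue",
        ylim = c(0, 10),
        main = "Movies' IMDB Ratings",
        names.arg = moviesSub$title,
        las = 2,
        cex.names = 0.7)   # draw the bar labels at 70% of the default size

# Restore the previous margins for later plots.
par(old_par)
```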
06:42 Close this plot.
06:44 In the Source window, click on movies.
06:47 Let us analyze the relation between imdb underscore rating
06:54 and audience underscore score.
06:58 For this, we will draw a scatter plot of these two objects using the plot function.
07:05 Remember, we have already learnt how to plot a single object.
07:11 A scatter plot is a graph in which the values of two variables are plotted along two axes.
07:18 The pattern of the resulting points reveals the correlation.
07:24 Let us switch to RStudio.
07:27 In the Source window, click on the script barPlots.R
07:32 In the Source window, type the following command.
07:39 Here, we have kept imdb underscore rating on the X-axis and audience underscore score on the Y-axis.
07:50 As imdb underscore rating of any movie varies between 0 and 10,
07:56 we have set the range of values on X-axis from 0 to 10.
08:02 Similarly, we have set the range of values on Y-axis from 0 to 100.
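The scatter plot command described above might look like the following sketch. The axis labels and title are assumptions, and the small movies data frame holds made-up values so that the example runs without moviesData.csv.

```r
# Made-up stand-in for the movies data frame (hypothetical values).
movies <- data.frame(
  imdb_rating = c(5.5, 6.0, 7.2, 8.1, 6.8, 4.9, 7.7),
  audience_score = c(48, 55, 74, 86, 70, 40, 81)
)

# imdb_rating on the X-axis, audience_score on the Y-axis.
plot(x = movies$imdb_rating,
     y = movies$audience_score,
     xlab = "IMDB Rating",        # assumed label text
     ylab = "Audience Score",     # assumed label text
     xlim = c(0, 10),             # ratings range from 0 to 10
     ylim = c(0, 100),            # scores range from 0 to 100
     main = "IMDB Rating vs Audience Score")
```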
08:08 Save the script and run the current line.
08:13 In the Plots window, click on Zoom to maximize the plot.
08:18 We can observe that movies with a higher imdb underscore rating tend to have a higher audience underscore score.
08:28 Close this plot.
08:31 Now we will learn how to calculate the correlation coefficient between imdb underscore rating and audience underscore score.
08:42 For this, we use the cor function.
08:46 In the Source window, type the following command.
08:50 Save the script and run the current line.
08:55 The correlation coefficient between imdb underscore rating and audience underscore score is evaluated as 0.865.
09:08 The value of correlation coefficient is always between -1 and +1.
09:15 A positive value indicates that the variables are positively related.
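A short sketch of the cor function follows, using made-up values rather than the tutorial's data, so the result will differ from the 0.865 reported above. The last two lines illustrate the limits of the coefficient with perfectly related variables.

```r
# Made-up stand-in for the movies data frame (hypothetical values).
movies <- data.frame(
  imdb_rating = c(5.5, 6.0, 7.2, 8.1, 6.8),
  audience_score = c(48, 55, 74, 86, 70)
)

# Pearson correlation coefficient (the default method of cor).
r <- cor(movies$imdb_rating, movies$audience_score)
print(r)

# Limits of the coefficient: a perfect increasing linear relation gives +1,
# a perfect decreasing linear relation gives -1.
r_up <- cor(movies$imdb_rating, 2 * movies$imdb_rating)
r_down <- cor(movies$imdb_rating, -movies$imdb_rating)
```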
09:21 Let us summarize what we have learnt.
09:25 In this tutorial, we have learnt how to:
09:29 Plot bar charts
09:31 Plot scatter plot
09:34 Find the correlation coefficient between two objects
09:39 We now suggest an assignment.
09:43 Read the file moviesData.csv.
09:48 Create a bar chart of critics underscore score for the first 10 movies.
09:55 Create a scatter plot of imdb underscore rating and imdb underscore num underscore votes to see their relation.
10:08 Save both the plots.
10:11 The video at the following link summarises the Spoken Tutorial project.
10:19 We conduct workshops using Spoken Tutorials and give certificates.
10:24 Please contact us.
10:27 Please post your timed queries in this forum.
10:31 Please post your general queries in this forum.
10:35 The FOSSEE team coordinates the TBC project.
10:40 For more details, please visit these sites.
10:43 The Spoken Tutorial project is funded by NMEICT, MHRD, Govt. of India
10:50 The script for this tutorial was contributed by Tushar Bajaj (TISS Mumbai).
10:58 This is Sudhakar Kumar from IIT Bombay signing off. Thanks for watching.

Contributors and Content Editors

Sakinashaikh