R/C2/Plotting-Bar-Charts-and-Scatter-Plot/English-timed
| Time | Narration |
| 00:01 | Welcome to this tutorial on Plotting bar charts and scatter plot. |
| 00:08 | In this tutorial, we will learn how to: |
| 00:12 | Plot bar charts |
| 00:14 | Plot a scatter plot |
| 00:18 | Find the correlation coefficient between two objects. |
| 00:22 | To understand this tutorial, you should know, |
| 00:27 | Data frames in R |
| 00:29 | Basics of Statistics |
| 00:33 | If not, please locate the relevant tutorials on R on this website. |
| 00:40 | This tutorial is recorded on |
| 00:43 | Ubuntu Linux OS version 16.04 |
| 00:48 | R version 3.4.4 |
| 00:51 | RStudio version 1.1.463 |
| 00:57 | Install R version 3.2.0 or higher. |
| 01:02 | For this tutorial, we will use |
| 01:06 | A data file moviesData.csv |
| 01:10 | A script file barPlots.R. |
| 01:15 | Please download these files from the Code files link of this tutorial. |
| 01:21 | I have downloaded and moved these files to Plots folder. |
| 01:28 | This folder is located in myProject folder on my Desktop. |
| 01:33 | I have also set Plots folder as my Working Directory. |
| 01:40 | Let us switch to RStudio. |
| 01:42 | Open the script barPlots.R in RStudio. |
| 01:49 | Run this script by clicking on the Source button. |
| 01:53 | The movies data frame opens in the Source window. |
| 01:58 | It has 600 observations of 31 variables. |
| 02:04 | In the Source window, scroll from left to right. |
| 02:10 | This will enable us to see the remaining objects of the movies data frame. |
| 02:17 | Now, we will learn how to draw a bar chart of the object named imdb underscore rating in movies. |
| 02:27 | A bar chart represents data as rectangular bars, with the length of each bar proportional to the value of the variable. |
| 02:37 | R uses the function barplot to create bar charts. |
| 02:42 | Let us switch to RStudio. |
| 02:45 | For the sake of simplicity, we are considering only the first 20 observations of movies to draw a bar chart. |
| 02:54 | Click on the script barPlots.R |
| 02:58 | In the Source window, type the following command. |
| 03:02 | Save the script and run the current line by pressing Ctrl + Enter keys simultaneously. |
| 03:11 | Let me resize the Source window. |
| 03:14 | moviesSub with 20 observations is loaded in the Environment. |
| 03:21 | Now, we draw a bar chart of imdb_rating for these movies. |
| 03:28 | In the Source window, type the following command. |
| 03:34 | Here, we have used the following arguments: |
| 03:39 | moviesSub dollar sign imdb underscore rating is the data for plotting |
| 03:46 | ylab and xlab for adding labels to the respective axes. |
| 03:57 | col to set the color of the bars |
| 03:57 | ylim to set the range of values on Y-axis |
| 04:02 | main for adding a title to the bar chart. |
| 04:07 | Run the current line. |
| 04:09 | The bar chart is displayed with Movies on X-axis and their imdb_rating on Y-axis. |
| 04:18 | In the Plots window, click on Zoom to maximize the plot. |
| 04:23 | This particular movie has an IMDB rating of approximately 6. |
| 04:31 | Similarly, this particular movie has an IMDB rating of approximately 8. |
| 04:39 | However, we do not know the name of the movies. |
| 04:44 | So, we will add more arguments to the barplot function to show the names of the movies on the X-axis. |
| 04:52 | Close this plot. |
| 04:55 | In the Source window, type the following command. |
| 05:00 | Here, we have used the argument names.arg and set it to title. |
| 05:06 | Remember, title column in moviesSub contains the names of movies. |
| 05:13 | Run the current line. |
| 05:16 | In the Plots window, click on Zoom to maximize the plot. |
| 05:22 | Now, the names of movies are displayed on the X-axis. |
| 05:27 | But not for all movies. |
| 05:30 | This is because the names are too long to fit side by side. |
| 05:36 | So, we will make these names perpendicular to the X-axis. |
| 05:42 | Close this plot. |
| 05:44 | In the Source window, type the following command. |
| 05:49 | Here, we have used the las argument. |
| 05:53 | las equal to 2 produces labels which are at right angles to the axis. |
| 06:01 | Run the current line. |
| 06:03 | In the Plots window, click on Zoom to maximize the plot. |
| 06:10 | Now the names of all the movies are displayed on the X-axis. |
| 06:15 | For example, Filly Brown has an IMDB rating of approximately 6. |
| 06:23 | However, longer names are being truncated. |
| 06:28 | We can add more arguments to the barplot function to adjust the labels. |
| 06:34 | For more information, please refer to the Additional Material section on this website. |
| 06:42 | Close this plot. |
| 06:44 | In the Source window, click on movies. |
| 06:47 | Let us analyze the relation between imdb underscore rating |
| 06:54 | and audience underscore score. |
| 06:58 | For this, we will draw a scatter plot of these two objects using the plot function. |
| 07:05 | Remember, we have already learnt how to plot a single object. |
| 07:11 | A scatter plot is a graph in which the values of two variables are plotted along two axes. |
| 07:18 | The pattern of the resulting points reveals any correlation between the two variables. |
| 07:24 | Let us switch to RStudio. |
| 07:27 | In the Source window, click on the script barPlots.R |
| 07:32 | In the Source window, type the following command. |
| 07:39 | Here, we have kept imdb underscore rating on the X-axis and audience underscore score on the Y-axis. |
| 07:50 | As imdb underscore rating of any movie varies between 0 and 10, |
| 07:56 | we have set the range of values on X-axis from 0 to 10. |
| 08:02 | Similarly, we have set the range of values on Y-axis from 0 to 100. |
| 08:08 | Save the script and run the current line. |
| 08:13 | In the Plots window, click on Zoom to maximize the plot. |
| 08:18 | We can observe that movies with a higher imdb underscore rating tend to have a higher audience underscore score. |
| 08:28 | Close this plot. |
| 08:31 | Now we will learn how to calculate the correlation coefficient between imdb underscore rating and audience underscore score. |
| 08:42 | For this, we use the cor function. |
| 08:46 | In the Source window, type the following command. |
| 08:50 | Save the script and run the current line. |
| 08:55 | The correlation coefficient between imdb underscore rating and audience underscore score is evaluated as 0.865. |
| 09:08 | The value of correlation coefficient is always between -1 and +1. |
| 09:15 | A positive value indicates that the variables are positively related, and a value close to +1, such as 0.865, indicates a strong positive linear relationship. |
| 09:21 | Let us summarize what we have learnt. |
| 09:25 | In this tutorial, we have learnt how to: |
| 09:29 | Plot bar charts |
| 09:31 | Plot a scatter plot |
| 09:34 | Find the correlation coefficient between two objects |
| 09:39 | We now suggest an assignment. |
| 09:43 | Read the file moviesData.csv. |
| 09:48 | Create a bar chart of critics underscore score for the first 10 movies. |
| 09:55 | Create a scatter plot of imdb underscore rating and imdb underscore num underscore votes to see their relation. |
| 10:08 | Save both the plots. |
| 10:11 | The video at the following link summarises the Spoken Tutorial project. |
| 10:19 | We conduct workshops using Spoken Tutorials and give certificates. |
| 10:24 | Please contact us. |
| 10:27 | Please post your timed queries in this forum. |
| 10:31 | Please post your general queries in this forum. |
| 10:35 | The FOSSEE team coordinates the TBC project. |
| 10:40 | For more details, please visit these sites. |
| 10:43 | The Spoken Tutorial project is funded by NMEICT, MHRD, Govt. of India |
| 10:50 | The script for this tutorial was contributed by Tushar Bajaj (TISS Mumbai). |
| 10:58 | This is Sudhakar Kumar from IIT Bombay signing off. Thanks for watching. |
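
The narration above refers to commands that are typed on screen but do not appear in this transcript. The following is a minimal sketch of what such commands might look like, not the tutorial's actual script. The column names title, imdb_rating and audience_score and the name moviesSub come from the narration. The small stand-in data frame, its values, the axis labels, the color and the subset size of 5 are assumptions made so that the example runs without moviesData.csv.

```r
# Stand-in for moviesData.csv: hypothetical titles and values, for illustration only.
# With the real file you would instead read it, e.g. movies <- read.csv("moviesData.csv")
movies <- data.frame(
  title          = c("Movie A", "Movie B", "Movie C", "Movie D", "Movie E", "Movie F"),
  imdb_rating    = c(5.5, 7.3, 7.6, 7.2, 5.1, 7.8),
  audience_score = c(73, 81, 91, 76, 27, 86),
  stringsAsFactors = FALSE
)

# Keep only the first few observations (the tutorial uses the first 20; 5 here).
moviesSub <- movies[1:5, ]

# Bar chart of imdb_rating for the subset.
# names.arg puts the movie titles under the bars; las = 2 draws the
# axis labels at right angles to the axis so that they do not overlap.
barplot(moviesSub$imdb_rating,
        names.arg = moviesSub$title,
        las  = 2,
        ylab = "IMDB rating",
        col  = "blue",
        ylim = c(0, 10),
        main = "IMDB ratings of the first 5 movies")

# Scatter plot with imdb_rating on the X-axis and audience_score on the Y-axis.
# xlim and ylim fix the axis ranges to 0-10 and 0-100, as in the narration.
plot(x = movies$imdb_rating,
     y = movies$audience_score,
     xlab = "IMDB rating",
     ylab = "Audience score",
     xlim = c(0, 10),
     ylim = c(0, 100),
     main = "Audience score vs IMDB rating")

# Correlation coefficient (Pearson by default) between the two columns.
r <- cor(movies$imdb_rating, movies$audience_score)
print(r)
```

With the real data, the subset would be the first 20 rows and the value printed by cor would be the 0.865 quoted in the narration. The stand-in values give a different number.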