Difference between revisions of "R"
Sudhakarst (Talk | contribs) |
Nancyvarkey (Talk | contribs) |
Revision as of 17:38, 14 October 2019
R (http://www.r-project.org/) is open-source software, a well-organized and sophisticated package that facilitates data analysis, modeling, inferential testing and forecasting. It is user-friendly and lets users create new functions to solve statistical problems. It runs on a variety of UNIX platforms (and similar systems such as Linux), Windows and macOS.
R is one of the most widely used open-source languages for analytics and data science. At Microsoft, R is used by data scientists who apply machine learning to data from Bing, Azure, Office, and the Sales, Marketing, and Finance departments. Twitter has been using R for measuring user experience. In addition, the cross-platform compatibility of R and its capacity to handle large and complex data sets make it an ideal tool for academicians to analyze data in their labs.
R can be used for simple calculations, matrix calculations, differential equations, optimisation, statistical analysis, plotting graphs, etc. Also, it is useful to anybody who wishes to undertake extensive statistical computations and data visualization.
The Spoken Tutorial (ST) series for R was initially created by Prof. Kannan Moudgalya, IIT Bombay. Later, the domain expert for this series was Prof. Radhendushka Srivastava, Maths Dept., IIT Bombay. Content for this series was contributed by FOSSEE Fellows 2018 Shaik Sameer and Varshit Dubey, and the tutorials were recorded by Sudhakar Kumar, M.Tech student, IIT Bombay. Overall coordination for the series was done by Smita Wangikar from the FOSSEE project, IIT Bombay. Madhuri Ganapati and Vidhya Iyer from the Spoken Tutorial project, IIT Bombay, were the reviewers from the ST side.
Note: Each numbered topic corresponds to a single spoken tutorial. Each bulleted point corresponds to a command or topic that must be covered in the given spoken tutorial.
1. Installing R and RStudio on Linux
   - Install R on Linux
   - Use the command-line interface of R
   - Show the value of the exponential function in R
   - Install wget utility
   - Install gdebi utility
   - Install RStudio on Linux
   - Launch RStudio on Linux
   - Run a plot in RStudio
   - View packages in RStudio
   - Install packages
2. Introduction to basics of R
   - Version of R and RStudio used
   - Operating systems on which these run
   - Quick intro to R and RStudio
   - Resizing the font and window size
   - Using +, -, ^, sqrt
   - Using exp, log, sin
   - Different ways of invoking log
   - Vectors using seq and length
   - Using pi
   - Plotting a sine function
   - Defining more points to get a smooth plot
   - Plotting with points and as line
   - Introduction to help
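The commands named in this tutorial can be tried together in a short session. The following is a minimal sketch, not the tutorial's actual script; the variable names (x_coarse, x_fine, y_fine) and the chosen numbers are assumptions made for illustration.

```r
# Basic arithmetic and built-in mathematical functions
2 + 3                 # addition
7 - 4                 # subtraction
2^10                  # exponentiation
sqrt(16)              # square root

exp(1)                # e raised to the power 1
log(100)              # natural logarithm (base e) by default
log(100, base = 10)   # the same function, invoked with an explicit base
log10(100)            # shorthand for the base-10 logarithm

# Vectors built with seq, measured with length
x_coarse <- seq(0, 2 * pi, by = pi / 2)        # few points: a jagged curve
x_fine   <- seq(0, 2 * pi, length.out = 200)   # more points: a smoother curve
length(x_coarse)
length(x_fine)

# Sine values for the finer grid
y_fine <- sin(x_fine)

# Plot the same data as points, then as a line
plot(x_fine, y_fine)
plot(x_fine, y_fine, type = "l")
```

Typing ?log or help("log") at the console opens the help page for a function.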
3. Introduction to data frames in R
   - Storing captaincy information in vectors
   - Constructing a data frame using vectors
   - Plotting one vector of a data frame vs. another one
   - Adding a vector to a data frame
   - Saving a data frame into a csv file
   - Preventing the writing of row numbers into the csv file
   - Changing the contents of a csv file through a text editor
   - Loading a csv file into a data frame
   - Accessing the data sets that come with R
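The data-frame workflow listed above can be sketched as follows. The captaincy values, the column names and the file name are hypothetical, invented only for this illustration; the tutorial's own data may differ.

```r
# Hypothetical captaincy records (made-up values, for illustration only)
captain <- c("Captain A", "Captain B", "Captain C")
played  <- c(40, 25, 60)
won     <- c(22, 10, 33)

# Construct a data frame from the vectors
captaincy <- data.frame(captain, played, won)

# Add a derived vector as a new column
captaincy$ratio <- captaincy$won / captaincy$played

# Plot one column of the data frame against another
plot(captaincy$played, captaincy$won)

# Save to a csv file; row.names = FALSE keeps row numbers out of the file
csv_path <- file.path(tempdir(), "captaincy.csv")
write.csv(captaincy, csv_path, row.names = FALSE)

# Load the csv file back into a data frame
reloaded <- read.csv(csv_path)
print(reloaded)

# mtcars is one of the data sets that ship with R
head(mtcars, 3)
```

Calling data() with no arguments lists the bundled data sets.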
4. Introduction to RStudio
   - Features of RStudio
   - A look at the windows in RStudio interface:
     - Source and Console windows
     - Workspace window
     - Plots and Files window
   - Example to plot a simple data set
   - Introduction to packages in R
   - How to find the list of packages installed in R
   - Installation of R packages in RStudio
   - Loading and using R packages
5. Introduction to R script
   - What is an R script
   - Features of R script
   - How to create and save an R script from the user interface (UI) of RStudio
   - Shortcut keys to create an R script
   - How to use auto-completion of commands
   - How to run an entire script
   - How to run a block of a script
   - How to add comments
   - How to comment an existing line
   - How to load one script into another script
6. Working Directories in RStudio
   - What is working directory in R
   - How to know current working directory
   - How to use getwd function
   - How to set a working directory from the user interface of RStudio
   - How to set a working directory from the Console window of RStudio
   - How to use setwd function
   - How to read and store a csv file in R
   - How to use read.csv function
   - How to view a stored csv file in R
   - How to use View function
7. Indexing and Slicing Data Frames
   - Shortcut key for assignment operator (<-)
   - How to perform numeric indexing
   - How to extract a row or column from a data frame
   - How to retrieve multiple rows from a data frame
   - How to combine objects to form a vector
   - How to perform logical indexing on a data frame
   - How to perform name indexing on a data frame
   - How to slice a data frame using subset function
   - How to select required columns (by name) from a data frame
   - How to retrieve data using double square brackets
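The indexing styles named in this tutorial can be compared side by side. This is a small sketch on a hypothetical data frame; the names and marks are made up and are not taken from the tutorial.

```r
# Small hypothetical data frame (values are made up)
scores <- data.frame(
  name    = c("Asha", "Ravi", "Meena", "John"),
  maths   = c(82, 45, 91, 67),
  physics = c(75, 58, 88, 49)
)

# Numeric indexing: [row, column]
second_row    <- scores[2, ]          # one row, returned as a data frame
second_column <- scores[, 2]          # one column, returned as a vector
rows_1_and_3  <- scores[c(1, 3), ]    # c() combines the row numbers

# Logical indexing: keep rows where the condition is TRUE
above_60 <- scores[scores$maths > 60, ]

# Name indexing: select columns by name
name_physics <- scores[, c("name", "physics")]

# subset() filters rows and selects columns in one call
high_maths <- subset(scores, maths > 60, select = c(name, maths))

# Double square brackets extract a single column as a vector
maths_vector <- scores[["maths"]]

print(above_60)
print(high_maths)
```

Single brackets with a row or a set of columns return a data frame, while double brackets return the underlying vector.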
8. Creating Matrices using Data Frames
   - Data required in a matrix format
   - Convert a data frame into a matrix
   - Create a matrix with known data
   - Add two matrices
   - Subtract two matrices
   - Multiply two matrices element wise
   - Perform true matrix multiplication
   - Calculate the transpose of a matrix
   - Calculate the determinant of a matrix
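The matrix operations listed here map onto a handful of base R calls. A minimal sketch, assuming two small 2x2 matrices A and B with made-up entries:

```r
# Convert a (hypothetical) all-numeric data frame into a matrix
df <- data.frame(a = c(1, 2), b = c(3, 4))
m_from_df <- as.matrix(df)

# Create matrices from known data; matrix() fills column by column
A <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2)
B <- matrix(c(5, 6, 7, 8), nrow = 2, ncol = 2)

sum_AB  <- A + B      # element-wise addition
diff_AB <- A - B      # element-wise subtraction
elem_AB <- A * B      # element-wise multiplication
prod_AB <- A %*% B    # true matrix multiplication

A_t   <- t(A)         # transpose
det_A <- det(A)       # determinant

print(prod_AB)
print(det_A)
```

Note that * and %*% give different results: the first multiplies matching entries, the second applies the row-by-column rule.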
9. Operations on Matrices and Data Frames
   - How to find the inverse of a matrix
   - How to calculate the sum of elements in a matrix using for loop
   - How to calculate the sum of elements in a matrix using sum function
   - How to calculate the time elapsed in an operation
   - How to find out the sum of rows of a matrix
   - How to find out the sum of columns of a matrix
   - How to add a new column or row to an existing data-frame
   - How to use cbind and rbind function
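The operations in this tutorial can be sketched as below. The matrix and data frame are hypothetical examples, and solve() and system.time() are used here as the usual base R choices for inversion and timing; the tutorial itself may present them differently.

```r
# A small invertible matrix (made-up values)
A <- matrix(c(2, 1, 1, 3), nrow = 2)

# Inverse of a matrix
A_inv <- solve(A)

# Sum of all elements with an explicit for loop
total <- 0
for (i in seq_len(nrow(A))) {
  for (j in seq_len(ncol(A))) {
    total <- total + A[i, j]
  }
}

# The same sum with the built-in function
total_builtin <- sum(A)

# Time elapsed in an operation
elapsed <- system.time(sum(A))

# Row-wise and column-wise sums
row_totals <- rowSums(A)
col_totals <- colSums(A)

# Add a column (cbind) or a row (rbind) to a data frame
df     <- data.frame(x = 1:2, y = c("a", "b"))
df_col <- cbind(df, z = c(TRUE, FALSE))
df_row <- rbind(df, data.frame(x = 3L, y = "c"))

print(df_col)
print(df_row)
```

On a matrix this small the timing is close to zero; the difference between the loop and sum() only becomes visible on large matrices.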
10. Merging and Importing Data
    - Use of built-in functions in R for exploring a data frame
    - Access help in RStudio
    - Advantages of merging data frames
    - Merge two data frames
    - Import data from the command line
    - Import xml file and txt file in R
    - Import data from the user interface of RStudio
11. Data Types and Factors
    - What is an object in R
    - Types of R - objects
    - What is an atomic vector in R
    - Types of atomic vectors
    - How to find types of vectors
    - Factors in R
    - Levels of a factor in R
    - Identification of categorical variables
    - How to change the type of a vector
    - How to change the values of levels
12. Lists and its Operations
    - Lists in R
    - Atomic vectors in R
    - Difference between atomic vectors and lists in R
    - How to create a list
    - How to assign names to the elements of a list
    - Named list in R
    - How to access elements of a list by its index
    - How to access an element of a list by its name
    - How to access an element of an element of a list
    - Combine two different lists
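The difference between atomic vectors and lists, and the ways of reaching into a list, can be illustrated briefly. All object names and values below are hypothetical and chosen only for this sketch.

```r
# An atomic vector holds a single type; mixed input is coerced
mixed <- c(1, "a", TRUE)
class(mixed)                      # all elements become character strings

# A list keeps each element's own type
course <- list("R", 21, TRUE)
names(course) <- c("title", "tutorials", "open_source")

# A named list can also be created in one step, and may contain other lists
details <- list(
  levels = c("basic", "intermediate"),
  info   = list(language = "English")
)

first_element <- course[[1]]              # access by index
n_tutorials   <- course$tutorials         # access by name
language      <- details$info$language    # element of an element
second_level  <- details[["levels"]][2]

# Combine two lists into one
combined <- c(course, details)
length(combined)
```

Using single brackets, as in course[1], would return a one-element list instead of the element itself.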
13. Plotting Histograms and Pie Chart
    - How to find the dimensions of a data frame
    - Define a histogram
    - Plot a histogram in R
    - Add labels to the histogram
    - Add color to the bins of a histogram
    - Change the number of breaks in the histogram
    - Define a pie chart
    - Plotting a pie chart in R
    - Add a label to the pie chart
    - Saving the plot as an image
14. Plotting Bar Charts and Scatter Plot
    - What is a bar chart
    - Draw a bar chart
    - Use the barplot function
    - Add labels to the bar chart
    - Adjust the labels of the bar chart
    - What is a scatter plot
    - Draw a scatter plot
    - Use plot function with two objects
    - Find the correlation coefficient
    - Range of correlation coefficient
15. Introduction to ggplot2
    - Define visualization
    - About grammar of graphics - ggplot2
    - Use of the plot function
    - Add labels to a plot
    - Change the color and type of plot
    - Plot two graphs in the same plot
    - Add a legend to the plot
    - About ggplot2 package
    - Draw a scatter plot using ggplot function
    - Save plots using ggsave function
16. Aesthetic Mapping in ggplot2
    - Define aesthetic
    - Need for aesthetic in plotting
    - Draw a scatter plot
    - Customize a scatter plot
    - View the structure of an object
    - View the levels of a categorical variable
    - Draw a bar chart using ggplot
    - Add labels to a plot in ggplot
    - Use the fill argument in aesthetic mapping
    - Draw a histogram using ggplot
17. Data Manipulation using dplyr Package
    - What is data visualization
    - Need for data manipulation
    - What is dplyr package
    - Functions in dplyr package
    - Install dplyr package
    - Use filter function
    - Use filter function with a logical operator
    - Use match operator
    - Use arrange function for ascending order
    - Use arrange function for descending order
18. More functions in the dplyr Package
    - Functions in the dplyr package
    - Select multiple variables in a data frame
    - Remove variables from a data frame
    - Use of select function
    - Use of starts_with function
    - Change the name of a variable
    - Use of rename function
    - Create a new variable from existing variables
    - Use of mutate function
    - Property of mutate function
19. Pipe Operator
    - About summarise function in dplyr package
    - About group_by function in dplyr package
    - Difference between summarise and group_by functions
    - Use summarise and group_by functions together
    - About pipe operator
    - Examples of pipe operator
    - Benefits of using pipe operator
    - Use ggplot2 and dplyr package together using pipe
    - Plot boxplot
    - Use count in summarise function
20. Conditional Statements
    - About conditional statements
    - Syntax of if, else and else if statements
    - Use if, else and else if statements
    - Use ifelse function
    - Arguments of ifelse function
    - Add a new column in an existing data frame
    - Read and store a csv file
    - View a data frame
    - Count true values in a column
    - Use sum function
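The conditional constructs in this tutorial can be sketched as follows. The marks, thresholds and column names are assumptions made for illustration, and the data frame is built inline rather than read from a csv file.

```r
# if / else if / else on a single value
marks <- 72
if (marks >= 75) {
  grade <- "distinction"
} else if (marks >= 40) {
  grade <- "pass"
} else {
  grade <- "fail"
}
print(grade)

# Hypothetical data frame of students (made-up values)
students <- data.frame(
  name  = c("A", "B", "C", "D"),
  marks = c(35, 72, 90, 40)
)

# ifelse(test, yes, no) works element by element on a whole column,
# which makes it convenient for adding a new column
students$result <- ifelse(students$marks >= 40, "pass", "fail")

# TRUE counts as 1 and FALSE as 0, so sum() counts the true values
n_pass <- sum(students$marks >= 40)

print(students)
print(n_pass)
```

A plain if statement expects a single TRUE or FALSE, which is why ifelse is the usual choice when the condition covers every row of a column.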
21. Functions in R
    - About functions
    - About built-in functions and user-defined functions
    - Need for user-defined functions
    - Syntax of a function
    - Parts of a function
    - Create a user-defined function with arguments
    - Create a user-defined function without arguments
    - About readline function
    - Scope of variables
    - Use the return function
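User-defined functions, return values and variable scope can be shown in a few lines. This is a minimal sketch; the function names (area_rectangle, greet, bump) are hypothetical and not taken from the tutorial.

```r
# A user-defined function with arguments
area_rectangle <- function(len, width) {
  area <- len * width    # 'area' exists only inside the function
  return(area)           # return() hands the value back to the caller
}

# A user-defined function without arguments
greet <- function() {
  return("Hello from R")
}

# Scope of variables: assignment inside a function creates a local variable
counter <- 10
bump <- function() {
  counter <- counter + 1   # reads the global value, writes a local copy
  return(counter)
}

rect_area   <- area_rectangle(3, 4)
bumped      <- bump()
after_bump  <- counter     # the global variable is unchanged

print(rect_area)
print(greet())
print(bumped)
print(after_bump)

# readline() prompts the user for a line of text at the console.
# It is only meaningful in an interactive session, so it is guarded here.
if (interactive()) {
  user_name <- readline(prompt = "Enter your name: ")
  print(user_name)
}
```

The guard around readline is a choice made for this sketch, so that the same code also runs unattended as a script.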