Difference between revisions of "Machine-Learning-using-R - old 2022/C3/Quadratic-Discriminant-Analysis-in-R/English"

Revision as of 13:18, 6 March 2023

Title of the script: Quadratic Discriminant Analysis in R

Author: Tanmay Srinath

Keywords: R, RStudio, machine learning, QDA, quadratic discriminant analysis, LDA, heteroscedastic gaussian data, MASS library, video tutorial.


Visual Cue Narration
Show slide

Opening Slide

Welcome to this spoken tutorial on Quadratic Discriminant Analysis in R.
Show slide

Learning Objectives

In this tutorial, we will learn about:
  • Quadratic Discriminant Analysis or QDA.
  • Differences between linear discriminant analysis and quadratic discriminant analysis.
  • When to use quadratic discriminant analysis.
  • Implementation of quadratic discriminant analysis in R.
Show slide

System Specifications

This tutorial is recorded using:
  • Ubuntu Linux OS version 20.04
  • R version 4.1.2
  • RStudio version 1.4.1717.

It is recommended to install R version 4.1.0 or higher.

Show slide

Prerequisites

https://spoken-tutorial.org

To follow this tutorial, the learner should know:
  • Basic programming in R.
  • Machine Learning in R.

If not, please access the relevant tutorials on this website.

Show slide

Quadratic Discriminant Analysis

Quadratic discriminant analysis:
  • It is the discriminant analysis that is performed on heteroscedastic Gaussian data.
  • It is used when the covariance structures of the classes are different.


Show Slide

Differences between LDA and QDA

Differences between LDA and QDA.
  • LDA assumes that each class has the same covariance matrix.
  • On the other hand, QDA assumes that each class has a different covariance matrix.
  • LDA constructs a linear boundary, while QDA constructs a quadratic boundary, of which an ellipse is one possible shape.
  • When the covariance matrices of different classes are the same, QDA reduces to LDA.
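
The points above can be made concrete with the standard Gaussian discriminant score. This is a textbook formulation added for illustration, not something stated in the tutorial; the symbols are assumptions of this sketch: \mu_k is the mean of class k, \Sigma_k its covariance matrix and \pi_k its prior probability.

```latex
% QDA score of class k; an observation x is assigned to the class with the largest score
\delta_k(x) = -\tfrac{1}{2}\log\lvert\Sigma_k\rvert
              - \tfrac{1}{2}(x-\mu_k)^{\top}\Sigma_k^{-1}(x-\mu_k)
              + \log\pi_k

% If every class shares one covariance matrix \Sigma, the terms
% -\tfrac{1}{2}\log\lvert\Sigma\rvert and -\tfrac{1}{2}x^{\top}\Sigma^{-1}x
% are the same for all classes and can be dropped, leaving the LDA score:
\delta_k(x) = x^{\top}\Sigma^{-1}\mu_k
              - \tfrac{1}{2}\mu_k^{\top}\Sigma^{-1}\mu_k
              + \log\pi_k
```

The first score is quadratic in x, so the boundary between two classes is a quadratic curve. The second is linear in x, which is why QDA reduces to LDA when the covariance matrices are equal.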


Show Slides

When to use QDA

QDA is primarily used when the data in each class is multivariate Gaussian and the covariance matrices of the classes differ.
Only Narration

Let us see how we can do it in RStudio.
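
One informal way to judge whether QDA is appropriate is to compare the covariance matrices of the classes. The sketch below is an illustration only and is not part of the QDA.R script. It uses base R on the iris dataset, and the names petal and class_cov are introduced here.

```r
# Per-class covariance matrices of the two petal measurements in iris.
data(iris)

# Keep only the two predictors used later in this tutorial.
petal <- iris[, c("Petal.Length", "Petal.Width")]

# Split the rows by species and compute one covariance matrix per species.
class_cov <- lapply(split(petal, iris$Species), cov)

# Print the three 2 x 2 matrices to compare them by eye.
class_cov
```

If the matrices look clearly different across species, the equal-covariance assumption of LDA is doubtful and QDA is a reasonable choice.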
Show slide

Download Files

We will use the script file QDA.R.

Please download this file from the Code files link of this tutorial.

Make a copy and then use it for practising.

[Computer screen]

Highlight


QDA.R and the QDA folder.

I have downloaded and moved these files to the QDA folder.

This folder is located in the MLProject folder on my Desktop.

I have also set the QDA folder as my Working Directory.

Cursor in QDA folder.

Let us switch to RStudio.
Double-click QDA.R in RStudio.

Point to QDA.R in RStudio.

Let us open the script QDA.R in RStudio.

Script QDA.R opens in RStudio.

Highlight

library(MASS)

data(iris)


Highlight MASS


Click in the Environment tab to load the iris dataset.

The MASS library contains the qda() function.


Run these commands to import the library and the dataset.


Click in the Environment tab to load the iris dataset.

Cursor on iris dataset.

Now let us split our data into training and testing.
[RStudio]

set.seed(1)

trn_ind <- sample(1:nrow(iris), size = 0.7 * nrow(iris), replace = FALSE)

train <- iris[trn_ind, ]

test <- iris[-trn_ind, ]

In the Source window, type these commands.
Highlight

set.seed(1)


Highlight

trn_ind=sample(1:nrow(iris),

size=0.7*nrow(iris),replace=FALSE)

train <- iris[trn_ind, ]

test <- iris[-c(trn_ind), ]

We set a seed for reproducible results.

We sample 70% of the data from iris for training and 30% for testing.
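
The split can be sanity-checked before any model is fitted. The following is a minimal sketch that repeats the split from the script and then inspects it. The checks themselves are additions for illustration and are not part of QDA.R.

```r
# Recreate the 70/30 split used in this tutorial.
data(iris)
set.seed(1)
trn_ind <- sample(1:nrow(iris), size = 0.7 * nrow(iris), replace = FALSE)
train <- iris[trn_ind, ]
test  <- iris[-trn_ind, ]

# Number of rows in each set.
nrow(train)
nrow(test)

# Species counts in each set. A simple random sample is not
# guaranteed to keep the three species perfectly balanced.
table(train$Species)
table(test$Species)

# Count of rows that appear in both sets; this should be zero.
length(intersect(rownames(train), rownames(test)))
```

Because the sample is not stratified, the species counts may be slightly uneven. This affects the prior probabilities that the model reports.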

Select the commands and click the Run button.

Click the test set and train set.

Select the commands and run them.

The datasets are shown in the Environment tab.


Click the test set and train set to load them in the Source window.

Point to iris dataset.

Now we will perform QDA on the iris dataset.
[RStudio]

model <- qda(Species~Petal.Length+

Petal.Width, data=train)

model


Highlight

model <-

qda(Species~Petal.Length+Petal.Width, data=train)


Click the Save and Run buttons.

In the Source window type these commands.

This is the command that we use to create the model.

It models species as a function of petal length and petal width.


Save and run the commands.


The output is shown in the console.

Drag boundary to see the console window.

Drag boundary to see the console window clearly.
Highlight output in console


Highlight Prior probabilities of group


Highlight Group means

These are the parameters of our model.

The prior probabilities indicate the composition of the training data.

The group means indicate the mean values of the predictor variables for each species.
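
The same parameters can also be read directly from the fitted object. This is a small sketch, assuming the split and model from this tutorial. The prior and means components belong to the object returned by qda() in MASS, while the comparison with the class proportions is an added illustration.

```r
library(MASS)

# Recreate the training set and the model from this tutorial.
data(iris)
set.seed(1)
trn_ind <- sample(1:nrow(iris), size = 0.7 * nrow(iris), replace = FALSE)
train <- iris[trn_ind, ]
model <- qda(Species ~ Petal.Length + Petal.Width, data = train)

# Prior probabilities stored in the model.
model$prior

# By default they equal the share of each species in the training data.
table(train$Species) / nrow(train)

# Group means: one row per species, one column per predictor.
model$means
```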

Drag boundary to see the Source window.

Drag boundary to see the Source window clearly.
Cursor in the Source window.

Let us now use our model to make predictions on test data.
[RStudio]

predicted <- predict(model, test)

names(predicted)


Highlight

predicted <- predict(model, test)


Highlight

names(predicted)


Click on the Save and Run buttons.

In the Source window type these commands.

This command is used to predict the species from the test data.

This command gives us the contents of the predicted variable.


Save and run the commands.

Highlight output in console

Highlight class


Highlight posterior

This shows us that our predicted variable has two components.

This is the predicted class.


This gives the posterior probability of an observation belonging to each class.
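
To see what the two components look like, the first few predictions can be printed. This is a sketch under the same split and model as above. The use of head(), round() and rowSums() is an addition for illustration.

```r
library(MASS)

# Recreate the split, the model and the predictions from this tutorial.
data(iris)
set.seed(1)
trn_ind <- sample(1:nrow(iris), size = 0.7 * nrow(iris), replace = FALSE)
train <- iris[trn_ind, ]
test  <- iris[-trn_ind, ]
model <- qda(Species ~ Petal.Length + Petal.Width, data = train)
predicted <- predict(model, test)

# Predicted species for the first few test observations.
head(predicted$class)

# Posterior probabilities: one row per observation, one column per species.
head(round(predicted$posterior, 3))

# Each row of posterior probabilities should sum to 1.
range(rowSums(predicted$posterior))
```

The predicted class of an observation is the species with the largest posterior probability in its row.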

Cursor in the Source window.

Let us now compute the accuracy of our model.
[RStudio]

table(test$Species,predicted$class)

Highlight


output in console

In the Source window type this command.


Save and run the command.

It will tabulate the original species against the predicted species.


Our model has no erroneous predictions.

This shows that QDA has successfully separated the 3 species of the iris dataset.
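
The table can be turned into a single accuracy figure. This is a minimal sketch assuming the same split and model as above. The names conf_mat and accuracy are introduced here and do not appear in QDA.R.

```r
library(MASS)

# Recreate the split, the model and the predictions from this tutorial.
data(iris)
set.seed(1)
trn_ind <- sample(1:nrow(iris), size = 0.7 * nrow(iris), replace = FALSE)
train <- iris[trn_ind, ]
test  <- iris[-trn_ind, ]
model <- qda(Species ~ Petal.Length + Petal.Width, data = train)
predicted <- predict(model, test)

# Confusion matrix: rows are the actual species, columns the predicted species.
conf_mat <- table(Actual = test$Species, Predicted = predicted$class)
conf_mat

# Accuracy is the share of observations on the diagonal of the matrix.
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
accuracy
```

The result depends on the random split, so a different seed may give a different accuracy.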

Only Narration.

With this we come to the end of this tutorial.

Let us summarise.

Show Slide

Summary

In this tutorial we have learnt about:
  • Quadratic Discriminant Analysis or QDA.
  • Differences between linear discriminant analysis and quadratic discriminant analysis.
  • When to use quadratic discriminant analysis.
  • Implementation of quadratic discriminant analysis in R.


Show Slide

Assignment

Here is an assignment for you.
  • Apply QDA on the Wine dataset.
  • This dataset can be found in the HDclassif package.
  • Install the package and import the dataset using the data() command.
  • Measure the accuracy of the model.
Show slide

About the Spoken Tutorial Project

The video at the following link summarises the Spoken Tutorial project.

Please download and watch it.

Show slide

Spoken Tutorial Workshops

We conduct workshops using Spoken Tutorials and give certificates.


For more details, please contact us.

Show Slide

Spoken Tutorial Forum to answer questions

Do you have questions in THIS Spoken Tutorial?

Choose the minute and second where you have the question.

Explain your question briefly.

Someone from the FOSSEE team will answer them.

Please visit this site.

Please post your timed queries in this forum.
Show Slide

Forum to answer questions

Do you have any general/technical questions?

Please visit the forum given in the link.

Show Slide

Textbook Companion

The FOSSEE team coordinates the coding of solved examples of popular books and case study projects.

We give certificates to those who do this.

For more details, please visit these sites.

Show Slide

Acknowledgment

The Spoken Tutorial and FOSSEE projects are funded by the Ministry of Education, Govt. of India.
Show Slide

Thank You

This tutorial is contributed by Tanmay Srinath and Madhuri Ganapathi from IIT Bombay.

Thank you for watching.

Contributors and Content Editors

Madhurig, Nancyvarkey