Difference between revisions of "Python-for-Machine-Learning/C3/Support-Vector-Machine/English"

 
Line 1: Line 1:
  
<div style="margin-left:1.27cm;margin-right:0cm;"></div>
 
 
{| border="1"
 
{| border="1"
 
|-
 
|-
Line 6: Line 5:
 
|| '''Narration'''
 
|| '''Narration'''
 
|-
 
|-
|- style="border:0.5pt solid #000000;padding-top:0cm;padding-bottom:0cm;padding-left:0.191cm;padding-right:0.191cm;"
+
 
|| <div style="color:#000000;">Show slide:</div>
+
|| Show slide:
 
'''Welcome'''
 
'''Welcome'''
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | Welcome to the Spoken Tutorial on '''Support Vector Machine.'''
+
|| Welcome to the Spoken Tutorial on '''Support Vector Machine.'''
 
|-
 
|-
| style="border-top:none;border-bottom:0.75pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Show Slide:
+
|| Show Slide:
  
 
'''Learning Objectives'''
 
'''Learning Objectives'''
| style="border-top:none;border-bottom:0.75pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | In this tutorial, we will learn about
+
|| In this tutorial, we will learn about
* <div style="margin-left:1.27cm;margin-right:0cm;">'''Support Vector Machine (SVM)'''</div>
+
* '''Support Vector Machine (SVM)'''
* <div style="margin-left:1.27cm;margin-right:0cm;">'''Linear SVM '''and</div>
+
* '''Linear SVM '''and
* <div style="margin-left:1.27cm;margin-right:0cm;">'''Non Linear SVM '''</div>
+
* '''Non Linear SVM '''
  
|- style="border-top:0.75pt solid #000000;border-bottom:0.5pt solid #000000;border-left:0.75pt solid #000000;border-right:0.75pt solid #000000;padding:0.106cm;"
+
|-  
 
|| Show Slide:
 
|| Show Slide:
  
 
'''System Requirements'''
 
'''System Requirements'''
 
|| To record this tutorial, I am using  
 
|| To record this tutorial, I am using  
* <div style="margin-left:1.27cm;margin-right:0cm;">'''Ubuntu Linux OS version 24.04'''</div>
+
* '''Ubuntu Linux OS version 24.04'''
* <div style="margin-left:1.27cm;margin-right:0cm;">'''Jupyter Notebook IDE'''</div>
+
* '''Jupyter Notebook IDE'''
 
|-
 
|-
| style="border-top:0.5pt solid #000000;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | <div style="color:#000000;">Show Slide:</div>
+
|| Show Slide:
  
<div style="color:#000000;">'''Prerequisite'''</div>
+
'''Prerequisite'''
  
| style="border-top:0.5pt solid #000000;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | To follow this tutorial,
+
|| To follow this tutorial,
* <div style="margin-left:1.27cm;margin-right:0cm;"><span style="color:#000000;">The learner must have basic knowledge of </span><span style="color:#000000;">'''Python.'''</span></div>
+
* The learner must have basic knowledge of '''Python.'''
* <div style="margin-left:1.27cm;margin-right:0cm;"><span style="color:#000000;">For prerequisite </span><span style="color:#000000;">'''Python'''</span><span style="color:#000000;"> tutorials, please visit this website.</span></div>
+
* For prerequisite '''Python''' tutorials, please visit this website.
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Show Slide:
+
|| Show Slide:
  
 
'''Code files'''
 
'''Code files'''
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" |
+
||
* <div style="margin-left:1.27cm;margin-right:0cm;"><span style="color:#000000;">The files used in this tutorial are provided in the </span><span style="color:#000000;">'''Code files '''</span><span style="color:#000000;">link.</span></div>
+
* The files used in this tutorial are provided in the '''Code files '''link.
* <div style="color:#252525;margin-left:1.27cm;margin-right:0cm;">Please download and extract the files.</div>
+
* Please download and extract the files.
* <div style="color:#252525;margin-left:1.27cm;margin-right:0cm;">Make a copy and then use them while practicing.</div>
+
* Make a copy and then use them while practicing.
  
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Show Slide
+
|| Show Slide
  
 
'''SVM'''
 
'''SVM'''
| style="border:0.6pt solid #000000;padding:0.106cm;" |  
+
||  
* <div style="margin-left:1.27cm;margin-right:0cm;">'''SVM''' is a '''supervised learning algorithm''' used for classification and regression.</div>
+
* '''SVM''' is a '''supervised learning algorithm''' used for classification and regression.
* <div style="margin-left:1.27cm;margin-right:0cm;">It finds the best boundary, called a '''hyperplane''', to separate classes.</div>
+
* It finds the best boundary, called a '''hyperplane''', to separate classes.
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Show Slide  
+
|| Show Slide  
  
 
'''Hyperplane and Margin'''
 
'''Hyperplane and Margin'''
Line 58: Line 57:
  
 
Narration
 
Narration
| style="border:0.6pt solid #000000;padding:0.106cm;" |
+
||
* <div style="margin-left:1.27cm;margin-right:0cm;">The best '''hyperplane''' is the one that leaves the largest gap between classes. </div>
+
* The best '''hyperplane''' is the one that leaves the largest gap between classes.  
* <div style="margin-left:1.27cm;margin-right:0cm;">This gap is called the '''margin''', and a larger margin reduces errors.</div>
+
* This gap is called the '''margin''', and a larger margin reduces errors.
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Narration
+
|| Narration
| style="border:0.6pt solid #000000;padding:0.106cm;" | Next we will see about Linear SVM and Non-Linear SVM.
+
|| Next, we will look at Linear SVM and Non-Linear SVM.
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Show Slide
+
|| Show Slide
  
 
'''Linear SVM'''
 
'''Linear SVM'''
| style="border:0.6pt solid #000000;padding:0.106cm;" |
+
||
* <div style="margin-left:1.27cm;margin-right:0cm;">If a straight line hyperplane can separate the data, we use '''Linear SVM'''.</div>
+
* If a straight line hyperplane can separate the data, we use '''Linear SVM'''.
* <div style="margin-left:1.27cm;margin-right:0cm;">'''Linear SVM''' aims to find the hyperplane that maximizes the margin.</div>
+
* '''Linear SVM''' aims to find the hyperplane that maximizes the margin.
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Show Slide
+
|| Show Slide
  
 
'''Non-Linear SVM'''
 
'''Non-Linear SVM'''
| style="border:0.6pt solid #000000;padding:0.106cm;" |  
+
||  
* <div style="margin-left:1.27cm;margin-right:0cm;">When data is not linearly separable, we use '''Non Linear SVM.'''</div>
+
* When data is not linearly separable, we use '''Non Linear SVM.'''
* <div style="margin-left:1.27cm;margin-right:0cm;">Non Linear SVM uses the '''kernel trick''' to transform the data.</div>
+
* Non Linear SVM uses the '''kernel trick''' to transform the data.
* <div style="margin-left:1.27cm;margin-right:0cm;">'''Kernels '''help find decision boundaries for data that isn’t linearly separable.</div>
+
* '''Kernels '''help find decision boundaries for data that isn’t linearly separable.
  
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Hover over the files
+
|| Hover over the files
| style="border:0.6pt solid #000000;padding:0.106cm;" | I have created required files for the demonstration of '''SVM'''.  
+
|| I have created the required files for the demonstration of '''SVM'''. 
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Open the file housingcalifornia.csv and point to the fields as per narration.
+
|| Open the file californiahousing.csv and point to the fields as per narration.
| style="border:0.6pt solid #000000;padding:0.106cm;" | To implement the '''SVM model, '''we use the '''californiahousing dot csv '''dataset.
+
|| To implement the '''SVM model, '''we use the '''californiahousing dot csv '''dataset.
  
 
The columns in the dataset help to classify whether a house price is High or Low.
 
The columns in the dataset help to classify whether a house price is High or Low.
 
|-
 
|-
| style="border-top:none;border-bottom:0.75pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Point to the '''SVM.ipynb'''  
+
|| Point to the '''SVM.ipynb'''  
| style="border-top:0.6pt solid #000000;border-bottom:0.75pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | '''SVM dot ipynb''' is the python notebook file for this demonstration.
+
|| '''SVM dot ipynb''' is the python notebook file for this demonstration.
|- style="border-top:0.75pt solid #000000;border-bottom:0.5pt solid #000000;border-left:0.75pt solid #000000;border-right:0.75pt solid #000000;padding:0.106cm;"
+
|-  
 
|| Press '''Ctrl''', '''Alt''' and '''T''' keys
 
|| Press '''Ctrl''', '''Alt''' and '''T''' keys
  
Line 101: Line 100:
 
Activate the machine learning environment as shown
 
Activate the machine learning environment as shown
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Go to the '''Downloads '''folder
+
|| Go to the '''Downloads '''folder
  
 
Type '''cd Downloads'''
 
Type '''cd Downloads'''
Line 108: Line 107:
  
 
Press '''Enter '''
 
Press '''Enter '''
| style="border-top:0.6pt solid #000000;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | I have saved my code file in the '''Downloads''' folder.  
+
|| I have saved my code file in the '''Downloads''' folder.  
  
Please navigate to the directory of your respective code file location.
+
Please navigate to the directory of your respective '''code file''' location.
  
 
Then type, '''jupyter space notebook '''and press''' Enter.'''
 
Then type, '''jupyter space notebook '''and press''' Enter.'''
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Show Jupyter Notebook Home page:
+
|| Show Jupyter Notebook Home page:
  
 
Click on''' SVM.ipynb'''
 
Click on''' SVM.ipynb'''
| style="border-top:0.6pt solid #000000;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | We can see the '''Jupyter Notebook''' '''Home page''' has opened in the web browser.
+
|| We can see the '''Jupyter Notebook''' '''Home page''' has opened in the web browser.
  
 
Click the '''SVM dot ipynb''' file to open it.
 
Click the '''SVM dot ipynb''' file to open it.
  
<div style="color:#000000;">Note that each cell will have the output displayed in this file.</div>
+
Note that each cell will have the output displayed in this file.
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight''' '''the lines
+
|| Highlight''' '''the lines
  
 
'''import pandas as pd '''
 
'''import pandas as pd '''
Line 130: Line 129:
  
 
Press''' Shift '''and''' Enter'''
 
Press''' Shift '''and''' Enter'''
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | We start by importing the required libraries for '''SVM classification.'''
+
|| We start by importing the required libraries for '''SVM classification.'''
  
 
Now, we will implement a '''Linear SVM''' model.
 
Now, we will implement a '''Linear SVM''' model.
Line 136: Line 135:
 
Make sure to Press''' Shift '''and''' Enter''' to execute the code in each cell.
 
Make sure to Press''' Shift '''and''' Enter''' to execute the code in each cell.
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight''' '''the lines:
+
|| Highlight''' '''the lines:
  
 
'''housing_df = pd.read_csv('californiahousing.csv')'''
 
'''housing_df = pd.read_csv('californiahousing.csv')'''
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | First, we '''load the dataset''' from a CSV file.
+
|| First, we '''load the dataset''' from a CSV file.
  
 
|-
 
|-
| style="border-top:none;border-bottom:0.75pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the lines:
+
|| Highlight the lines:
  
 
'''housing_df.head() '''
 
'''housing_df.head() '''
| style="border-top:none;border-bottom:0.75pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | Next, we display the first few rows using the '''head function'''.
+
|| Next, we display the first few rows using the '''head function'''.
  
|- style="border-top:0.75pt solid #000000;border-bottom:0.5pt solid #000000;border-left:0.75pt solid #000000;border-right:0.75pt solid #000000;padding:0.106cm;"
+
|-  
 
|| Highlight the lines:
 
|| Highlight the lines:
  
 
'''housing_df.shape '''
 
'''housing_df.shape '''
 
|| Then, we check the '''dataset’s shape''' to see the number of rows and columns.
 
|| Then, we check the '''dataset’s shape''' to see the number of rows and columns.
|- style="border-top:0.75pt solid #000000;border-bottom:0.5pt solid #000000;border-left:0.75pt solid #000000;border-right:0.75pt solid #000000;padding:0.106cm;"
+
|-  
 
|| Highlight the lines:
 
|| Highlight the lines:
  
 
'''selected_features = ["MedInc", "HouseAge", "AveRooms", "AveBedrms", "Housing Price"] '''
 
'''selected_features = ["MedInc", "HouseAge", "AveRooms", "AveBedrms", "Housing Price"] '''
 
|| Now, let’s visualize relationships between features using a '''pair plot'''.
 
|| Now, let’s visualize relationships between features using a '''pair plot'''.
|- style="border-top:0.75pt solid #000000;border-bottom:0.5pt solid #000000;border-left:0.75pt solid #000000;border-right:0.75pt solid #000000;padding:0.106cm;"
+
|-  
 
|| Show the output
 
|| Show the output
 
|| Here is the output displaying feature relationships in the dataset.
 
|| Here is the output displaying feature relationships in the dataset.
|- style="border-top:0.5pt solid #000000;border-bottom:0.5pt solid #000000;border-left:0.75pt solid #000000;border-right:0.75pt solid #000000;padding:0.106cm;"
+
|-
 
|| Highlight the lines:
 
|| Highlight the lines:
  
 
|| Since our data has categories, we use '''Label Encoding''' to convert them.
 
|| Since our data has categories, we use '''Label Encoding''' to convert them.
 
|-
 
|-
| style="border-top:0.5pt solid #000000;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the lines:
+
|| Highlight the lines:
| style="border-top:0.5pt solid #000000;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | Next, we separate the '''features''' and '''target '''variable for model training.
+
|| Next, we separate the '''features''' and '''target '''variable for model training.
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the lines:
+
|| Highlight the lines:
  
 
'''X'''
 
'''X'''
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | Then we print the '''feature set X.'''
+
|| Then we print the '''feature set X.'''
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the lines:
+
|| Highlight the lines:
  
 
'''y'''
 
'''y'''
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | Similarly, we print '''target variable y.'''
+
|| Similarly, we print '''target variable y.'''
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the lines:
+
|| Highlight the lines:
  
 
'''X_train, X_test, y_train, y_test ='''
 
'''X_train, X_test, y_train, y_test ='''
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | Now, we split the data into '''training''' and '''testing''' '''sets.'''  
+
|| Now, we split the data into '''training''' and '''testing''' '''sets.'''  
 
|-
 
|-
| style="border-top:none;border-bottom:0.75pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the lines:
+
|| Highlight the lines:
  
 
'''scaler = MinMaxScaler()'''
 
'''scaler = MinMaxScaler()'''
 
'''X_train_scaled ='''
 
'''X_train_scaled ='''
| style="border-top:none;border-bottom:0.75pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | Following this, we apply '''Min Max Scaler''' to keep the data within a fixed range.
+
|| Following this, we apply '''Min Max Scaler''' to keep the data within a fixed range.
|- style="border-top:0.75pt solid #000000;border-bottom:0.5pt solid #000000;border-left:0.75pt solid #000000;border-right:0.75pt solid #000000;padding:0.106cm;"
+
|-  
 
|| Highlight the lines:
 
|| Highlight the lines:
 
|| Now, we train a '''Linear SVM''' model using the training data.
 
|| Now, we train a '''Linear SVM''' model using the training data.
Line 194: Line 193:
 
To set up a '''Linear SVM''', we use the''' Linear''' '''kernel'''.
 
To set up a '''Linear SVM''', we use the''' Linear''' '''kernel'''.
  
|- style="border-top:0.5pt solid #000000;border-bottom:0.5pt solid #000000;border-left:0.75pt solid #000000;border-right:0.75pt solid #000000;padding:0.106cm;"
+
|-
 
|| Highlight the lines:  
 
|| Highlight the lines:  
  
Line 201: Line 200:
  
 
|-
 
|-
| style="border-top:0.5pt solid #000000;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the lines:
+
|| Highlight the lines:
| style="border-top:0.5pt solid #000000;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | Now, we check the training '''accuracy''' to evaluate model learning.
+
|| Now, we check the training '''accuracy''' to evaluate model learning.
  
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the lines:
+
|| Highlight the lines:
  
 
'''y_pred_linear ='''
 
'''y_pred_linear ='''
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | Next, we predict target values for the test data.
+
|| Next, we predict target values for the test data.
  
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the lines:
+
|| Highlight the lines:
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | Then, we compare the actual target values with the predicted values.
+
|| Then, we compare the actual target values with the predicted values.
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the lines:
+
|| Highlight the lines:
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | We now calculate and display the '''accuracy''' of '''the Linear SVM model.'''
+
|| We now calculate and display the '''accuracy''' of '''the Linear SVM model.'''
  
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the output:
+
|| Highlight the output:
  
 
'''Accuracy: 0.840'''
 
'''Accuracy: 0.840'''
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | We see the accuracy is '''0.84''', indicating strong model performance.
+
|| We see the accuracy is '''0.84''', indicating strong model performance.
  
 
|-
 
|-
| style="border-top:none;border-bottom:0.75pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the lines:
+
|| Highlight the lines:
| style="border-top:none;border-bottom:0.75pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | Now, we generate a classification report to evaluate model performance.
+
|| Now, we generate a classification report to evaluate model performance.
|- style="border-top:0.75pt solid #000000;border-bottom:0.5pt solid #000000;border-left:0.75pt solid #000000;border-right:0.75pt solid #000000;padding:0.106cm;"
+
|-  
 
|| Highlight the lines:
 
|| Highlight the lines:
  
Line 233: Line 232:
 
|| Next, we plot a '''learning curve''' to see how accuracy changes with training size.
 
|| Next, we plot a '''learning curve''' to see how accuracy changes with training size.
 
|-
 
|-
| style="border-top:0.5pt solid #000000;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Show the output  
+
|| Show the output  
  
 
Hover over training accuracy line and validation accuracy line in the plot.
 
Hover over training accuracy line and validation accuracy line in the plot.
| style="border-top:0.5pt solid #000000;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | <div style="color:#000000;">The plot shows how '''accuracy changes with different training sizes.'''</div>
+
|| The plot shows how '''accuracy changes with different training sizes.'''
  
<div style="color:#000000;">The blue and red lines show '''training''' and '''validation accuracy''' respectively.</div>
+
The blue and red lines show '''training''' and '''validation accuracy''' respectively.
  
<div style="color:#000000;">The learning curve helps to analyze model performance before further tuning.</div>
+
The learning curve helps to analyze model performance before further tuning.
 
|-
 
|-
| style="border-top:0.5pt solid #000000;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Narration
+
|| Narration
| style="border-top:0.5pt solid #000000;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | Let’s move to '''Non Linear SVM'''.
+
|| Let’s move to '''Non Linear SVM'''.
 
|-
 
|-
| style="border-top:0.5pt solid #000000;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the lines:
+
|| Highlight the lines:
  
 
'''svc_rbf = SVC(kernel='rbf', C=10,'''
 
'''svc_rbf = SVC(kernel='rbf', C=10,'''
  
| style="border-top:0.5pt solid #000000;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | To set up a '''Non Linear SVM''', we use the''' Radial Basis Function kernel'''.
+
|| To set up a '''Non Linear SVM''', we use the''' Radial Basis Function kernel'''.
  
 
We set the '''regularization parameter C''' to 10 for better separation.
 
We set the '''regularization parameter C''' to 10 for better separation.
Line 255: Line 254:
 
We also use '''class weighting''' to handle '''class imbalance'''.
 
We also use '''class weighting''' to handle '''class imbalance'''.
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the lines:
+
|| Highlight the lines:
  
 
'''y_train_pred_rbf = svc_rbf.predict(X_train_scaled) '''
 
'''y_train_pred_rbf = svc_rbf.predict(X_train_scaled) '''
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | Now, we predict the training labels using the trained '''Non Linear SVM''' model.
+
|| Now, we predict the training labels using the trained '''Non Linear SVM''' model.
 
|-
 
|-
| style="border-top:none;border-bottom:0.75pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the lines:
+
|| Highlight the lines:
| style="border-top:none;border-bottom:0.75pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | Next, we calculate and display the '''training accuracy.'''
+
|| Next, we calculate and display the '''training accuracy.'''
  
|- style="border-top:0.75pt solid #000000;border-bottom:0.5pt solid #000000;border-left:0.75pt solid #000000;border-right:0.75pt solid #000000;padding:0.106cm;"
+
|-  
 
|| Highlight the lines:
 
|| Highlight the lines:
  
Line 269: Line 268:
 
|| Now, we generate predictions on the test data.
 
|| Now, we generate predictions on the test data.
 
|-
 
|-
| style="border-top:0.5pt solid #000000;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the lines:
+
|| Highlight the lines:
  
| style="border-top:0.5pt solid #000000;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | Then we compare actual values with predicted values using a Dataframe.
+
|| Then we compare actual values with predicted values using a '''DataFrame'''.
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the lines:
+
|| Highlight the lines:
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | We now check the model’s final '''accuracy'''.
+
|| We now check the model’s final '''accuracy'''.
  
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the output
+
|| Highlight the output
  
 
'''Accuracy: 0.840'''
 
'''Accuracy: 0.840'''
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | With an accuracy of '''84 percent''', the model performs well.
+
|| With an accuracy of '''84 percent''', the model performs well.
 
|-
 
|-
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the lines:
+
|| Highlight the lines:
| style="border-top:none;border-bottom:0.6pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | Now, let's analyze it further with a '''classification report'''.
+
|| Now, let's analyze it further with a '''classification report'''.
 
|-
 
|-
| style="border-top:none;border-bottom:0.75pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Highlight the lines:
+
|| Highlight the lines:
  
 
'''pca = PCA(n_components=2) '''
 
'''pca = PCA(n_components=2) '''
 
'''X_train_pca = pca.fit_transform(X_train_scaled)'''
 
'''X_train_pca = pca.fit_transform(X_train_scaled)'''
 
'''X_test_pca = pca.transform(X_test_scaled)'''
 
'''X_test_pca = pca.transform(X_test_scaled)'''
| style="border-top:none;border-bottom:0.75pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | After evaluating the model, let's visualize how '''SVM''' separates the classes.
+
|| After evaluating the model, let's visualize how '''SVM''' separates the classes.
  
 
We now plot the '''support vectors''', which help define the '''decision boundary'''.
 
We now plot the '''support vectors''', which help define the '''decision boundary'''.
|- style="border-top:0.75pt solid #000000;border-bottom:0.5pt solid #000000;border-left:0.75pt solid #000000;border-right:0.75pt solid #000000;padding:0.106cm;"
+
|-  
 
|| Show the output
 
|| Show the output
 
|| This plot shows an '''SVM model''' trained with an '''RBF kernel'''.
 
|| This plot shows an '''SVM model''' trained with an '''RBF kernel'''.
Line 300: Line 299:
  
 
Thus, this is a '''2D visualization''' of an originally 9D dataset.
 
Thus, this is a '''2D visualization''' of an originally 9D dataset.
|- style="border-top:0.5pt solid #000000;border-bottom:0.5pt solid #000000;border-left:0.75pt solid #000000;border-right:0.75pt solid #000000;padding:0.106cm;"
+
|-
 
|| Show Slide:
 
|| Show Slide:
  
Line 306: Line 305:
 
|| This brings us to the end of the tutorial. Let us summarize.
 
|| This brings us to the end of the tutorial. Let us summarize.
 
|-
 
|-
| style="border-top:0.5pt solid #000000;border-bottom:0.75pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Show Slide:
+
| | Show Slide:
 
| style="border-top:0.5pt solid #000000;border-bottom:0.75pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | In Linear SVM code,
 
| style="border-top:0.5pt solid #000000;border-bottom:0.75pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | In Linear SVM code,
* <div style="margin-left:1.27cm;margin-right:0cm;">Change the '''value of C to''' '''5''' as shown here</div>
+
* Change the '''value of C to''' '''5''' as shown here
* <div style="margin-left:1.27cm;margin-right:0cm;">Observe the change in '''accuracy'''.</div>
+
* Observe the change in '''accuracy'''.
  
|- style="border-top:0.75pt solid #000000;border-bottom:0.5pt solid #000000;border-left:0.75pt solid #000000;border-right:0.75pt solid #000000;padding:0.106cm;"
+
|-  
 
|| Show Slide:
 
|| Show Slide:
  
Line 319: Line 318:
 
|| After completing the assignment, the output should match the expected result.
 
|| After completing the assignment, the output should match the expected result.
 
|-
 
|-
| style="border-top:none;border-bottom:0.75pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Show Slide:  
+
|| Show Slide:  
  
 
'''FOSSEE Forum'''
 
'''FOSSEE Forum'''
| style="border-top:none;border-bottom:0.75pt solid #000000;border-left:none;border-right:0.6pt solid #000000;padding:0.106cm;" | For any general or technical questions on '''Python for Machine Learning''', visit the '''FOSSEE forum''' and post your question
+
|| For any general or technical questions on '''Python for Machine Learning''', visit the '''FOSSEE forum''' and post your question
 
|-
 
|-
| style="border-top:0.5pt solid #000000;border-bottom:0.6pt solid #000000;border-left:0.6pt solid #000000;border-right:0.6pt solid #000000;padding:0.106cm;" | Show Slide:
+
|| Show Slide:
  
 
'''Thank you'''
 
'''Thank you'''
|| This is '''Harini Theiveegan''', a FOSSEE Summer Fellow 2025, IIT Bombay signing off
  
 
Thanks for joining.
 
|-
 
|}
 
|}

Latest revision as of 21:43, 10 July 2025

Visual Cue Narration
Show slide:

Welcome

Welcome to the Spoken Tutorial on Support Vector Machine.
Show Slide:

Learning Objectives

In this tutorial, we will learn about
  • Support Vector Machine (SVM)
  • Linear SVM and
  • Non Linear SVM
Show Slide:

System Requirements

To record this tutorial, I am using
  • Ubuntu Linux OS version 24.04
  • Jupyter Notebook IDE
Show Slide:

Prerequisite

To follow this tutorial,
  • The learner must have basic knowledge of Python.
  • For prerequisite Python tutorials, please visit this website.
Show Slide:

Code files

  • The files used in this tutorial are provided in the Code files link.
  • Please download and extract the files.
  • Make a copy and then use them while practicing.
Show Slide

SVM

  • SVM is a supervised learning algorithm used for classification and regression.
  • It finds the best boundary, called a hyperplane, to separate classes.
Show Slide

Hyperplane and Margin

Show margin.png

Narration

  • The best hyperplane is the one that leaves the largest gap between classes.
  • This gap is called the margin, and a larger margin reduces errors.
Narration: Next, we will look at Linear SVM and Non-Linear SVM.
Show Slide

Linear SVM

  • If a straight line hyperplane can separate the data, we use Linear SVM.
  • Linear SVM aims to find the hyperplane that maximizes the margin.
Show Slide

Non-Linear SVM

  • When data is not linearly separable, we use Non Linear SVM.
  • Non Linear SVM uses the kernel trick to transform the data.
  • Kernels help find decision boundaries for data that isn’t linearly separable.
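The difference between a linear and a non-linear kernel can be illustrated with a minimal sketch. It assumes scikit-learn is installed and uses a synthetic two-ring dataset from make_circles, which is not part of this tutorial's code files. The names linear_acc and rbf_acc are chosen only for illustration.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic data (an assumption for illustration): two concentric rings,
# which no straight line can separate.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Same data, two kernels: a linear boundary versus an RBF kernel.
linear_acc = SVC(kernel="linear").fit(X_train, y_train).score(X_test, y_test)
rbf_acc = SVC(kernel="rbf").fit(X_train, y_train).score(X_test, y_test)

print(f"Linear kernel accuracy: {linear_acc:.3f}")
print(f"RBF kernel accuracy:    {rbf_acc:.3f}")
```

On data like this, the RBF kernel should score clearly higher, because the kernel lets the model draw a curved boundary around the inner ring.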
Hover over the files

I have created the required files for the demonstration of SVM.
Open the file californiahousing.csv and point to the fields as per narration.

To implement the SVM model, we use the californiahousing dot csv dataset.

The columns in the dataset help to classify whether a house price is High or Low.

Point to the SVM.ipynb

SVM dot ipynb is the Python notebook file for this demonstration.
Press Ctrl, Alt and T keys

Type conda activate ml

Press Enter

Let us open the Linux terminal. Press Ctrl, Alt and T keys together.

Activate the machine learning environment as shown

Go to the Downloads folder

Type cd Downloads

Type jupyter notebook

Press Enter

I have saved my code file in the Downloads folder.

Please navigate to the directory of your respective code file location.

Then type, jupyter space notebook and press Enter.

Show Jupyter Notebook Home page:

Click on SVM.ipynb

We can see the Jupyter Notebook Home page has opened in the web browser.

Click the SVM dot ipynb file to open it.

Note that each cell will have the output displayed in this file.

Highlight the lines

import pandas as pd
import seaborn as sns
from sklearn.decomposition import PCA

Press Shift and Enter

We start by importing the required libraries for SVM classification.

Now, we will implement a Linear SVM model.

Make sure to press Shift and Enter to execute the code in each cell.

Highlight the lines:

housing_df = pd.read_csv('californiahousing.csv')

First, we load the dataset from a CSV file.
Highlight the lines:

housing_df.head()

Next, we display the first few rows using the head function.
Highlight the lines:

housing_df.shape

Then, we check the dataset’s shape to see the number of rows and columns.
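The three steps above (load, preview, check shape) can be tried without the code files using the sketch below. It assumes pandas is installed. The four rows are made up and only reuse column names that appear in this tutorial. The real californiahousing.csv has more rows and columns.

```python
import io

import pandas as pd

# Hypothetical stand-in for californiahousing.csv; these rows are invented.
csv_text = """MedInc,HouseAge,AveRooms,AveBedrms,Housing Price
8.3,41,6.9,1.0,High
2.1,52,4.2,1.1,Low
5.6,21,6.2,0.9,High
1.9,36,4.9,1.2,Low
"""

# read_csv accepts a file path or any file-like object.
housing_df = pd.read_csv(io.StringIO(csv_text))

print(housing_df.head())   # first rows (up to five by default)
print(housing_df.shape)    # (number of rows, number of columns)
```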
Highlight the lines:

selected_features = ["MedInc", "HouseAge", "AveRooms", "AveBedrms", "Housing Price"]

Now, let’s visualize relationships between features using a pair plot.
Show the output

Here is the output displaying feature relationships in the dataset.
Highlight the lines:

Since our data has categories, we use Label Encoding to convert them to numbers.
Highlight the lines:

Next, we separate the features and target variable for model training.
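Label encoding and the feature/target split might look like the sketch below. The notebook's actual lines are not shown in this script, so the small DataFrame, the name encoder, and the use of drop are assumptions for illustration.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Made-up rows; only the column names come from the tutorial.
housing_df = pd.DataFrame({
    "MedInc": [8.3, 2.1, 5.6, 1.9],
    "HouseAge": [41, 52, 21, 36],
    "Housing Price": ["High", "Low", "High", "Low"],
})

# LabelEncoder sorts the class names, so "High" becomes 0 and "Low" becomes 1.
encoder = LabelEncoder()
housing_df["Housing Price"] = encoder.fit_transform(housing_df["Housing Price"])

# Features are every column except the target; the target is the encoded label.
X = housing_df.drop(columns=["Housing Price"])
y = housing_df["Housing Price"]

print(encoder.classes_)
print(X.shape, y.tolist())
```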
Highlight the lines:

X

Then we print the feature set X.
Highlight the lines:

y

Similarly, we print target variable y.
Highlight the lines:

X_train, X_test, y_train, y_test =

Now, we split the data into training and testing sets.
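The split is truncated in the script above, so the following is only a sketch of a typical call. The 80/20 ratio, the random_state value, and the stratify argument are assumptions, not the notebook's confirmed settings.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 20 samples, 2 features, two balanced classes.
X = np.arange(40).reshape(20, 2)
y = np.array([0, 1] * 10)

# Hold out 20% for testing; stratify keeps the class ratio equal in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```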
Highlight the lines:

scaler = MinMaxScaler()
X_train_scaled =

Following this, we apply Min Max Scaler to keep the data within a fixed range.
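A small sketch of Min Max scaling follows, using toy numbers rather than the housing data. It assumes the usual pattern of fitting the scaler on the training set only and reusing it on the test set.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy values chosen so the result is easy to check by hand.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_test = np.array([[2.5, 40.0]])

scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn min and max from training data
X_test_scaled = scaler.transform(X_test)        # reuse the training min and max

print(X_train_scaled)
print(X_test_scaled)
```

Training values land in the range 0 to 1. A test value larger than the training maximum, such as 40 here, maps to a number above 1.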
Highlight the lines:

Now, we train a Linear SVM model using the training data.

To set up a Linear SVM, we use the Linear kernel.
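Training a Linear SVM could be sketched as below. The name svc_linear follows the script, while the synthetic make_blobs data is an assumption standing in for the scaled housing features.

```python
from sklearn.datasets import make_blobs
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic two-class data standing in for the scaled housing features.
X_train, y_train = make_blobs(n_samples=100, centers=2, random_state=0)
X_train_scaled = MinMaxScaler().fit_transform(X_train)

# kernel="linear" asks SVC for a straight-line (hyperplane) decision boundary.
svc_linear = SVC(kernel="linear")
svc_linear.fit(X_train_scaled, y_train)

# With a linear kernel, the hyperplane's weights and bias can be inspected.
print(svc_linear.coef_, svc_linear.intercept_)

y_train_pred_linear = svc_linear.predict(X_train_scaled)
```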

Highlight the lines:

y_train_pred_linear = svc_linear.predict(X_train_scaled)

Once trained, we make predictions on the training data.
Highlight the lines:

Now, we check the training accuracy to evaluate model learning.
Highlight the lines:

y_pred_linear =

Next, we predict target values for the test data.
Highlight the lines:

Then, we compare the actual target values with the predicted values.
Highlight the lines:

We now calculate and display the accuracy of the Linear SVM model.
Highlight the output:

Accuracy: 0.840

We see the accuracy is 0.84, indicating strong model performance.
Highlight the lines:

Now, we generate a classification report to evaluate model performance.
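Accuracy and the classification report can be computed as in this sketch. The label lists are hand-written for illustration and are not the notebook's results. The target_names assume that label encoding mapped High to 0 and Low to 1.

```python
from sklearn.metrics import accuracy_score, classification_report

# Hand-written labels for illustration only (0 = High, 1 = Low is assumed).
y_test = [0, 0, 1, 1, 1, 0, 1, 0, 1, 1]
y_pred_linear = [0, 0, 1, 1, 0, 0, 1, 1, 1, 1]

# Accuracy is the fraction of predictions that match the actual labels.
accuracy = accuracy_score(y_test, y_pred_linear)
print(f"Accuracy: {accuracy:.3f}")

# The report adds precision, recall and F1-score for each class.
print(classification_report(y_test, y_pred_linear, target_names=["High", "Low"]))
```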
Highlight the lines:

train_sizes, train_scores, test_scores = learning_curve

Next, we plot a learning curve to see how accuracy changes with training size.
Show the output

Hover over training accuracy line and validation accuracy line in the plot.

The plot shows how accuracy changes with different training sizes.

The blue and red lines show training and validation accuracy respectively.

The learning curve helps to analyze model performance before further tuning.
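The learning curve values can be computed as sketched below, without the plotting step. The synthetic dataset, the five training sizes, and cv=5 are assumptions, since the notebook's arguments are truncated in this script.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import learning_curve
from sklearn.svm import SVC

# Synthetic data standing in for the scaled housing features.
X, y = make_classification(n_samples=300, n_features=5, random_state=0)

# Train on 20% to 100% of the available training folds, with 5-fold validation.
train_sizes, train_scores, test_scores = learning_curve(
    SVC(kernel="linear"), X, y,
    train_sizes=np.linspace(0.2, 1.0, 5), cv=5, scoring="accuracy",
)

# Average over the folds to get one value per training size.
train_mean = train_scores.mean(axis=1)
test_mean = test_scores.mean(axis=1)
for n, tr, te in zip(train_sizes, train_mean, test_mean):
    print(f"{int(n):4d} samples  train={tr:.3f}  validation={te:.3f}")
```

Plotting train_mean and test_mean against train_sizes gives the two lines described in the narration.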

Narration: Let’s move on to Non-Linear SVM.
Highlight the lines:

svc_rbf = SVC(kernel='rbf', C=10,

To set up a Non Linear SVM, we use the Radial Basis Function kernel.

We set the regularization parameter C to 10 for better separation.

We also use class weighting to handle class imbalance.
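A Non Linear SVM setup might look like the sketch below. The script truncates the SVC call after C=10, so class_weight="balanced" is an assumption about how class weighting is applied, and the imbalanced synthetic data is also only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic, imbalanced data: roughly 80% of samples in class 0.
X, y = make_classification(
    n_samples=300, n_features=5, weights=[0.8, 0.2], random_state=0
)

# RBF kernel with C=10; "balanced" (assumed) gives rarer classes a larger weight.
svc_rbf = SVC(kernel="rbf", C=10, class_weight="balanced")
svc_rbf.fit(X, y)

# class_weight_ holds the multiplier applied to C for each class.
print(svc_rbf.class_weight_)
```

A larger C penalizes misclassified training points more heavily, which tightens the fit to the training data.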

Highlight the lines:

y_train_pred_rbf = svc_rbf.predict(X_train_scaled)

Now, we predict the training labels using the trained Non Linear SVM model.
Highlight the lines:

Next, we calculate and display the training accuracy.
Highlight the lines:

y_pred_rbf = svc_rbf.predict(X_test_scaled)

Now, we generate predictions on the test data.
Highlight the lines:

Then we compare actual values with predicted values using a DataFrame.
Highlight the lines:

We now check the model’s final accuracy.
Highlight the output

Accuracy: 0.840

With an accuracy of 84 percent, the model performs well.
Highlight the lines:

Now, let's analyze it further with a classification report.
Highlight the lines:

pca = PCA(n_components=2)
X_train_pca = pca.fit_transform(X_train_scaled)
X_test_pca = pca.transform(X_test_scaled)

After evaluating the model, let's visualize how SVM separates the classes.

We now plot the support vectors, which help define the decision boundary.

Show the output

This plot shows an SVM model trained with an RBF kernel.

Each point represents a data sample from the dataset.

Red and blue colors indicate two different target classes.

Black X marks represent the model's support vectors.

Support vectors are the key points defining the decision boundary.

Thus, this is a 2D visualization of an originally 9D dataset.
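One way to obtain 2D support vectors for such a plot is sketched below. It assumes a second SVC is fitted on the PCA-reduced data, which may differ from what the notebook does. The synthetic 9-feature dataset and the name svc_2d are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic 9-feature data standing in for the scaled housing features.
X, y = make_classification(
    n_samples=200, n_features=9, n_informative=4, random_state=0
)
X_scaled = MinMaxScaler().fit_transform(X)

# Project the 9 features down to 2 principal components for plotting.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

# Fit an RBF SVM on the 2D projection (an assumption) and read its support vectors.
svc_2d = SVC(kernel="rbf", C=10).fit(X_pca, y)
support_vectors = svc_2d.support_vectors_

print(X_pca.shape, support_vectors.shape)
```

Each row of support_vectors is a 2D point that can be drawn as an X mark on top of the scatter plot of X_pca.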

Show Slide:

Summary

This brings us to the end of the tutorial. Let us summarize.
Show Slide:

In Linear SVM code,
  • Change the value of C to 5 as shown here
  • Observe the change in accuracy.
Show Slide:

Assignment Solution

Show Linear.PNG image file

After completing the assignment, the output should match the expected result.
Show Slide:

FOSSEE Forum

For any general or technical questions on Python for Machine Learning, visit the FOSSEE forum and post your question.
Show Slide:

Thank you

This is Harini Theiveegan, a FOSSEE Summer Fellow 2025, IIT Bombay, signing off.

Thanks for joining.

Contributors and Content Editors

Madhurig, Nirmala Venkat