Python-for-Machine-Learning/C3/Support-Vector-Machine/English
| Visual Cue | Narration |
| Show slide:
Welcome |
Welcome to the Spoken Tutorial on Support Vector Machine. |
| Show Slide:
Learning Objectives |
In this tutorial, we will learn about
|
| Show Slide:
System Requirements |
To record this tutorial, I am using
|
| Show Slide:
Prerequisite |
To follow this tutorial,
|
| Show Slide:
Code files |
|
| Show Slide
SVM |
|
| Show Slide
Hyperplane and Margin
Show margin.png |
|
| Narration | Next we will see about Linear SVM and Non-Linear SVM. |
| Show Slide
Linear SVM |
|
| Show Slide
Non-Linear SVM |
|
| Hover over the files | I have created required files for the demonstration of SVM. |
| Open the file californiahousing.csv and point to the fields as per narration. | To implement the SVM model, we use the californiahousing dot csv dataset.
The columns in the dataset help to classify whether a house price is High or Low. |
| Point to the SVM.ipynb | SVM dot ipynb is the Python notebook file for this demonstration. |
| Press Ctrl,Alt and T keys
Type conda activate ml Press Enter |
Let us open the Linux terminal. Press Ctrl, Alt and T keys together.
Activate the machine learning environment as shown. |
| Go to the Downloads folder
Type cd Downloads Type jupyter notebook Press Enter |
I have saved my code file in the Downloads folder.
Please navigate to the directory of your respective code file location. Then type, jupyter space notebook and press Enter. |
| Show Jupyter Notebook Home page:
Click on SVM.ipynb |
We can see the Jupyter Notebook Home page has opened in the web browser.
Click the SVM dot ipynb file to open it. Note that each cell will have the output displayed in this file. |
| Highlight the lines
import pandas as pd import seaborn as sns from sklearn.decomposition import PCA Press Shift and Enter |
We start by importing the required libraries for SVM classification.
Now, we will implement a Linear SVM model. Make sure to Press Shift and Enter to execute the code in each cell. |
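The import cell in the script shows only three lines; the later cells also rely on a few more scikit-learn utilities. A hedged sketch of the full import set this tutorial's cells appear to need:

```python
# Sketch of the imports the notebook's later cells rely on (inferred
# from the steps in this script, not shown verbatim in the cell).
import pandas as pd
import seaborn as sns
from sklearn.decomposition import PCA
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
from sklearn.model_selection import train_test_split, learning_curve
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report
```

Grouping all imports in the first cell keeps each later cell runnable on its own after the notebook is restarted.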
| Highlight the lines:
housing_df = pd.read_csv('californiahousing.csv') |
First, we load the dataset from a CSV file. |
| Highlight the lines:
housing_df.head() |
Next, we display the first few rows using the head function. |
| Highlight the lines:
housing_df.shape |
Then, we check the dataset’s shape to see the number of rows and columns. |
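The loading, head and shape steps can be sketched together. Since the real californiahousing dot csv file is not available here, a tiny hypothetical DataFrame with the same assumed column names stands in for it:

```python
import pandas as pd

# Hypothetical miniature stand-in for californiahousing.csv
# (column names assumed from the feature list used later).
housing_df = pd.DataFrame({
    "MedInc":        [8.3, 2.1, 5.6, 3.3],
    "HouseAge":      [41, 25, 17, 52],
    "AveRooms":      [6.9, 4.2, 5.8, 5.1],
    "AveBedrms":     [1.0, 1.1, 0.9, 1.0],
    "Housing Price": ["High", "Low", "High", "Low"],
})
print(housing_df.head())   # first few rows
print(housing_df.shape)    # (number of rows, number of columns)
```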
| Highlight the lines:
selected_features = ["MedInc", "HouseAge", "AveRooms", "AveBedrms", "Housing Price"] |
Now, we select the key features and visualize their relationships using a pair plot. |
| Show the output | Here is the output displaying feature relationships in the dataset. |
| Highlight the lines: | Since our data has categories, we use Label Encoding to convert them. |
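The encoding cell is not shown in full; a minimal sketch of Label Encoding on the assumed target column:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.DataFrame({"Housing Price": ["High", "Low", "High", "Low"]})
le = LabelEncoder()
# Classes are assigned integers in sorted order: "High" -> 0, "Low" -> 1.
df["Housing Price"] = le.fit_transform(df["Housing Price"])
print(list(df["Housing Price"]))  # [0, 1, 0, 1]
```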
| Highlight the lines: | Next, we separate the features and target variable for model training. |
| Highlight the lines:
X |
Then we print the feature set X. |
| Highlight the lines:
y |
Similarly, we print the target variable y. |
| Highlight the lines:
X_train, X_test, y_train, y_test = |
Now, we split the data into training and testing sets. |
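Only the left-hand side of the split assignment is visible in the cell; a sketch of the full call on toy data, assuming test_size and random_state values the tutorial does not show:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)   # toy feature matrix
y = np.array([0, 1] * 5)           # toy target labels
# test_size=0.2 and random_state=42 are assumptions, not shown in the cell.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
print(X_train.shape, X_test.shape)
```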
| Highlight the lines:
scaler = MinMaxScaler() X_train_scaled = |
Following this, we apply Min Max Scaler to keep the data within a fixed range. |
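A sketch of the scaling step on toy data. The key detail is that the scaler is fitted on the training set only, then reused on the test set:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X_train = np.array([[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]])
X_test = np.array([[2.5, 500.0]])

scaler = MinMaxScaler()                        # maps each column to [0, 1]
X_train_scaled = scaler.fit_transform(X_train)  # learn min/max on train data
X_test_scaled = scaler.transform(X_test)        # reuse the training min/max
print(X_train_scaled.min(), X_train_scaled.max())  # 0.0 1.0
```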
| Highlight the lines: | Now, we train a Linear SVM model using the training data.
To set up a Linear SVM, we use the Linear kernel. |
| Highlight the lines:
y_train_pred_linear = svc_linear.predict(X_train_scaled) |
Once trained, we make predictions on the training data. |
| Highlight the lines: | Now, we check the training accuracy to evaluate model learning. |
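The fit, predict and training-accuracy cells above can be sketched together. A toy make_blobs dataset stands in for the scaled housing features:

```python
from sklearn.datasets import make_blobs
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC

# Toy, roughly linearly separable data in place of the housing features.
X_train_scaled, y_train = make_blobs(n_samples=40, centers=2,
                                     random_state=0)
svc_linear = SVC(kernel='linear')        # linear kernel, as in the tutorial
svc_linear.fit(X_train_scaled, y_train)
y_train_pred_linear = svc_linear.predict(X_train_scaled)
print("train accuracy:", accuracy_score(y_train, y_train_pred_linear))
```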
| Highlight the lines:
y_pred_linear = |
Next, we predict target values for the test data. |
| Highlight the lines: | Then, we compare the actual target values with the predicted values. |
| Highlight the lines: | We now calculate and display the accuracy of the Linear SVM model. |
| Highlight the output:
Accuracy: 0.840 |
We see the accuracy is 0.84, indicating strong model performance. |
| Highlight the lines: | Now, we generate a classification report to evaluate model performance. |
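A minimal sketch of the classification report on hypothetical true and predicted labels, showing what the notebook's output contains:

```python
from sklearn.metrics import classification_report

# Hypothetical labels: 0 = Low price, 1 = High price.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
# Precision, recall and F1-score per class, plus overall accuracy.
report = classification_report(y_true, y_pred)
print(report)
```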
| Highlight the lines:
train_sizes, train_scores, test_scores = learning_curve |
Next, we plot a learning curve to see how accuracy changes with training size. |
| Show the output
Hover over training accuracy line and validation accuracy line in the plot. |
The plot shows how accuracy changes with different training sizes.
The blue and red lines show training and validation accuracy respectively. The learning curve helps to analyze model performance before further tuning. |
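The learning-curve cell is only partly visible; a runnable sketch on synthetic data, assuming a 3-fold cross-validation and four training-size ticks (values not shown in the script):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import learning_curve
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)
# cv=3 and four size ticks are assumptions for this sketch.
train_sizes, train_scores, test_scores = learning_curve(
    SVC(kernel='linear'), X, y, cv=3,
    train_sizes=np.linspace(0.2, 1.0, 4))
# Mean accuracy at each training-set size: these means are what the
# blue (training) and red (validation) lines plot.
print(train_sizes)
print(train_scores.mean(axis=1), test_scores.mean(axis=1))
```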
| Narration | Let’s move to Non Linear SVM. |
| Highlight the lines:
svc_rbf = SVC(kernel='rbf', C=10, |
To set up a Non Linear SVM, we use the Radial Basis Function kernel.
We set the regularization parameter C to 10 for better separation. We also use class weighting to handle class imbalance. |
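Only the start of the constructor call is visible; a sketch of the full line, assuming class_weight='balanced' is the class weighting the narration refers to:

```python
from sklearn.svm import SVC

# kernel='rbf' and C=10 are shown in the cell;
# class_weight='balanced' is an assumption matching the narration.
svc_rbf = SVC(kernel='rbf', C=10, class_weight='balanced')
print(svc_rbf.kernel, svc_rbf.C, svc_rbf.class_weight)
```

With class_weight='balanced', the penalty C is scaled inversely to each class's frequency, so the minority class is not ignored.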
| Highlight the lines:
y_train_pred_rbf = svc_rbf.predict(X_train_scaled) |
Now, we predict the training labels using the trained Non Linear SVM model. |
| Highlight the lines: | Next, we calculate and display the training accuracy. |
| Highlight the lines:
y_pred_rbf = svc_rbf.predict(X_test_scaled) |
Now, we generate predictions on the test data. |
| Highlight the lines: | Then we compare actual values with predicted values using a DataFrame. |
| Highlight the lines: | We now check the model’s final accuracy. |
| Highlight the output
Accuracy: 0.840 |
With an accuracy of 84 percent, the model performs well. |
| Highlight the lines: | Now, let's analyze it further with a classification report. |
| Highlight the lines:
pca = PCA(n_components=2) X_train_pca = pca.fit_transform(X_train_scaled) X_test_pca = pca.transform(X_test_scaled) |
After evaluating the model, let's visualize how SVM separates the classes.
We now plot the support vectors, which help define the decision boundary. |
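The projection and plotting steps can be sketched on synthetic 9-feature data. The sketch reduces the features to two principal components and retrains an RBF SVM in that 2D space so its support vectors can be drawn:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.svm import SVC

# Synthetic 9-feature data in place of the housing features.
X, y = make_classification(n_samples=100, n_features=9, random_state=0)

pca = PCA(n_components=2)       # project the 9-D features down to 2-D
X_pca = pca.fit_transform(X)

svc = SVC(kernel='rbf', C=10).fit(X_pca, y)
# svc.support_vectors_ holds the points that define the decision
# boundary -- the black X marks in the tutorial's plot.
print(X_pca.shape, svc.support_vectors_.shape)
```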
| Show the output | This plot shows an SVM model trained with an RBF kernel.
Each point represents a data sample from the dataset. Red and blue colors indicate two different target classes. Black X marks represent the model's support vectors. Support vectors are the key points defining the decision boundary. Thus, this is a 2D visualization of an originally 9D dataset. |
| Show Slide:
Summary |
This brings us to the end of the tutorial. Let us summarize. |
| Show Slide:
Assignment | In Linear SVM code,
|
| Show Slide:
Assignment Solution Show Linear.PNG image file |
After completing the assignment, the output should match the expected result. |
| Show Slide:
FOSSEE Forum |
For any general or technical questions on Python for Machine Learning, visit the FOSSEE forum and post your question. |
| Show Slide:
Thank you |
This is Harini Theiveegan, a FOSSEE Summer Fellow 2025, IIT Bombay, signing off.
Thanks for joining. |