Python-for-Machine-Learning/C2/Logistic-Regression-MultiClass-Classification/English
{| border="1" | {| border="1" | ||
|- | |- | ||
|| '''Visual Cue''' | || '''Visual Cue''' | ||
|| '''Narration''' | || '''Narration''' | ||
| − | + | ||
| − | |- | + | |- |
| − | || | + | || Show slide: |
'''Welcome'''
|| Welcome to the Spoken Tutorial on '''Logistic Regression - Multiclass Classification'''.
|-
|| Show slide:
'''Learning Objectives'''
|| In this tutorial, we will learn about
* Multiclass classification using '''Logistic Regression'''
|-
|| Show slide:
'''System Requirements'''
|| To record this tutorial, I am using
* '''Ubuntu Linux''' OS version '''24.04'''
* '''Jupyter Notebook''' IDE
|-
|| Show slide:
'''Prerequisite'''
|| To follow this tutorial,
* The learner must have basic knowledge of '''Python'''.
* For prerequisite '''Python''' tutorials, please visit this website.
|-
|| Show slide:
'''Code files'''
||
* The files used in this tutorial are provided in the '''Code files''' link.
* Please download and extract the files.
* Make a copy and then use them while practicing.
|-
|| Show slide:
'''Iris flower classification'''
|| To implement the '''multiclass classification model''', we will
* Use the '''iris''' dataset to classify the '''iris''' flower.
* To know more about the '''iris''' dataset, please watch the earlier tutorials.
|-
|| Point to the '''LR_Multiclass.ipynb''' file
|| '''LR_Multiclass dot ipynb''' is the IPython Notebook file created for this demonstration.
|-
|| Press '''Ctrl+Alt+T''' keys
Type '''conda activate ml'''
Press '''Enter'''
Highlight: '''(ml)'''
|| Let us open the Linux terminal by pressing '''Ctrl''', '''Alt''' and '''T''' keys together.
Activate the machine learning environment as shown.
|-
|| Type '''cd Downloads'''
Type '''jupyter notebook'''
Press '''Enter'''
|| I have saved my code file in the '''Downloads''' folder.
Please navigate to the folder where your code file is saved.
Then type '''jupyter space notebook''' and press '''Enter'''.
|-
|| Show Jupyter Notebook Home page:
Double click on the '''LR_Multiclass.ipynb''' file
|| We can see that the Jupyter Notebook Home page has opened in the web browser.
Click on the '''LR underscore Multiclass dot ipynb''' file to open it.
Note that each cell in this file already has its output displayed.
Let us see the implementation of '''multiclass logistic regression'''.
|-
|| Highlight '''import pandas as pd'''
|| These are the necessary libraries to be imported for '''multiclass classification'''.
|-
|| Only narration
Highlight:
'''iris = load_iris()'''
'''iris.data[:5]'''
|| We first load the '''Iris''' dataset using the '''load underscore iris''' method.
The dataset is stored in the variable '''iris'''.
Then we display the first five rows of the feature data.
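Below is a minimal sketch of this step, assuming '''scikit-learn''' is installed; the '''print''' call is only for illustration:
<pre>
# Load the built-in Iris dataset from scikit-learn
from sklearn.datasets import load_iris

iris = load_iris()

# Display the first five rows of the feature matrix
print(iris.data[:5])
</pre>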
|-
|| Highlight '''Data Preprocessing'''
|| Now, let us prepare the data for training.
|-
|| Highlight
'''X = iris.data'''
|| We create a variable '''X''' and assign all feature columns to it.
|-
|| Highlight '''Y = iris.target'''
|| Next, we assign the target column to the variable '''Y'''.
|-
|| Highlight '''df = pd.DataFrame(X, columns=iris.feature_names)'''
'''df['target'] = Y'''
|| To analyze the data better, we create a DataFrame '''df''' using '''pd dot DataFrame'''.
We then add the target labels to '''df''' as a new column.
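A minimal sketch of this preprocessing step, assuming the '''iris''' object from the earlier cell:
<pre>
import pandas as pd

# Feature matrix and target vector
X = iris.data
Y = iris.target

# Build a DataFrame for easier analysis and attach the labels as a column
df = pd.DataFrame(X, columns=iris.feature_names)
df['target'] = Y
print(df.head())
</pre>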
|-
|| Highlight '''corr_matrix = df[iris.feature_names].corr()'''
|| We compute '''correlation''' values between the features of the '''Iris''' dataset using '''df dot corr'''.
Now, we visualize this correlation using a '''heatmap'''.
The '''heatmap''' shows how features relate to one another.
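A sketch of the correlation heatmap, assuming '''seaborn''' and '''matplotlib''' are the plotting libraries; the colormap is illustrative:
<pre>
import seaborn as sns
import matplotlib.pyplot as plt

# Pairwise correlation between the four feature columns
corr_matrix = df[iris.feature_names].corr()

# Annotated heatmap of the correlation matrix
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm')
plt.show()
</pre>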
|-
|| Highlight '''Train and Test Split'''
|| Next, we split the data into training and testing sets.
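A minimal sketch of the split; the '''test_size''' and '''random_state''' values here are assumptions for illustration:
<pre>
from sklearn.model_selection import train_test_split

# Hold out a portion of the samples for testing (illustrative values)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.3, random_state=42)
</pre>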
|-
|| Highlight '''Model Instantiation of Multiclass Classification and Model training'''
|| Let us now build a multiclass classification model.
|-
|| Highlight '''mlr = LogisticRegression(multi_class='multinomial', solver='lbfgs', max_iter=1000)'''
'''mlr.fit(X_train, Y_train)'''
|| We create an instance of '''LogisticRegression''' from the '''sklearn''' library.
We set '''multi underscore class''' to '''multinomial''' and '''solver''' to '''lbfgs'''.
We also set '''max underscore iter''' to '''1000''' to ensure convergence.
Now we train the model using the '''fit''' method on the training data.
Ignore the warning in the output cell, if any.
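A runnable sketch of this cell, assuming the split from the previous step:
<pre>
from sklearn.linear_model import LogisticRegression

# Multinomial (softmax) logistic regression; the lbfgs solver supports
# the multinomial loss, and max_iter=1000 gives it room to converge
mlr = LogisticRegression(multi_class='multinomial', solver='lbfgs',
                         max_iter=1000)
mlr.fit(X_train, Y_train)
</pre>
Recent '''scikit-learn''' versions may emit a deprecation warning for the '''multi_class''' parameter; this is likely the warning referred to above.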
|-
|| Highlight
'''Y_train_pred = mlr.predict(X_train)'''
|| Now, we calculate and print the '''training accuracy'''.
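A sketch of the accuracy computation, assuming the trained model '''mlr''':
<pre>
from sklearn.metrics import accuracy_score

# Fraction of training samples whose predicted label matches the true label
Y_train_pred = mlr.predict(X_train)
train_accuracy = accuracy_score(Y_train, Y_train_pred)
print(f"Training Accuracy: {train_accuracy:.3f}")
</pre>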
|-
|| Highlight '''Training Accuracy: 0.981'''
|| The '''training accuracy''' is approximately '''0.981''', which is quite good.
|-
|| Highlight
'''Train Log Loss: 0.1308'''
|| Next, we calculate the '''cross-entropy loss''' for the training data.
A loss of '''0.1308''' shows the model is making accurate predictions.
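A sketch of the loss computation; note that '''log_loss''' needs predicted class probabilities rather than hard labels:
<pre>
from sklearn.metrics import log_loss

# Cross-entropy (log) loss over the predicted class probabilities
Y_train_pred_proba = mlr.predict_proba(X_train)
train_loss = log_loss(Y_train, Y_train_pred_proba)
print(f"Train Log Loss: {train_loss:.4f}")
</pre>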
|-
|| Highlight
'''plt.figure(figsize=(8, 6))'''
'''for i in range(Y_train_pred_proba.shape[1]):  # Iterate over each class'''
'''fpr, tpr, _ = roc_curve(Y_train == i, Y_train_pred_proba[:, i])  # One-vs-rest for each class'''
'''roc_auc = auc(fpr, tpr)'''
|| Let us now plot the '''ROC curve''' and calculate the '''ROC-AUC score'''.
The '''ROC curve''' shows '''TPR''' versus '''FPR''' at various threshold values.
'''TPR''' stands for '''True Positive Rate''', that is, '''recall'''.
It is the fraction of actual positives correctly identified.
'''FPR''' stands for '''False Positive Rate'''.
It is the fraction of actual negatives wrongly classified as positives.
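A runnable sketch of the one-vs-rest ROC plot; the labels and figure size are illustrative:
<pre>
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

plt.figure(figsize=(8, 6))
for i in range(Y_train_pred_proba.shape[1]):  # one curve per class
    # Treat class i as positive and the other two classes as negative
    fpr, tpr, _ = roc_curve(Y_train == i, Y_train_pred_proba[:, i])
    roc_auc = auc(fpr, tpr)
    plt.plot(fpr, tpr, label=f"Class {i} (AUC = {roc_auc:.2f})")

plt.plot([0, 1], [0, 1], linestyle='--')  # chance-level reference line
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.legend()
plt.show()
</pre>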
|-
|| Show output plot
|| The '''ROC curve''' shows near-perfect classification.
The curves stay close to the '''top-left corner'''.
All three classes achieve an '''AUC''' of '''1.00'''.
This indicates the model effectively distinguishes the classes.
|-
|| Highlight: '''Predictions for Test Data'''
|| Further, we predict labels for '''X underscore test'''.
|-
|| Highlight '''test_data = X_test[15].reshape(1, -1)'''
'''predicted_class = mlr.predict(test_data)'''
|| We test the model on a single sample, similar to binary classification.
We compare the predicted class with the actual test class.
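A sketch of this single-sample check; index '''15''' follows the code shown above:
<pre>
# A single sample must be reshaped to a 2D array of shape (1, n_features)
test_data = X_test[15].reshape(1, -1)

predicted_class = mlr.predict(test_data)[0]
actual_class = Y_test[15]
print(f"Predicted class: {predicted_class}, Actual class: {actual_class}")
</pre>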
|-
|| Highlight: '''Predicted class: 0, Actual class: 0'''
|| The predicted value is 0, which is '''Setosa'''.
The actual value is also 0; hence, the prediction is correct.
|-
|| Highlight '''Y_pred = mlr.predict(X_test)'''
|| '''Y underscore pred''' stores the predicted labels for all test samples.
|-
|| Highlight: '''print("Multiclass classification - Actual vs Predicted:")'''
|| We compare the actual class labels with the predicted labels.
|-
|| Highlight: '''Multiclass Logistic Regression - Actual vs Predicted:'''
|| The output shows both the actual and predicted label arrays.
|-
|| Highlight: '''test_accuracy = accuracy_score(Y_test, Y_pred)'''
'''print(f"Test Accuracy: {test_accuracy:.3f}")'''
|| Now we calculate the '''test accuracy'''.
|-
|| Highlight: '''Test Accuracy: 0.978'''
|| We get an accuracy of approximately '''0.978''', which is pretty good.
|-
|| Highlight
'''# Predict probabilities for test set'''
'''Y_test_pred_proba = mlr.predict_proba(X_test)'''
|| We also compute the '''ROC-AUC score''' and '''cross-entropy loss''' for the test data.
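A sketch of the test-set metrics; '''roc_auc_score''' with '''multi_class''' set to '''ovr''' averages the one-vs-rest AUC over the three classes:
<pre>
from sklearn.metrics import roc_auc_score, log_loss

# Predict probabilities for test set
Y_test_pred_proba = mlr.predict_proba(X_test)

test_roc_auc = roc_auc_score(Y_test, Y_test_pred_proba, multi_class='ovr')
test_loss = log_loss(Y_test, Y_test_pred_proba)
print(f"Test ROC-AUC Score (OvR): {test_roc_auc:.4f}")
print(f"Test Log Loss: {test_loss:.4f}")
</pre>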
|-
|| Highlight
'''Test ROC-AUC Score (OvR): 0.9968'''
'''Test Log Loss: 0.1616'''
|| A '''ROC-AUC score''' of '''0.9968''' indicates excellent performance.
A '''cross-entropy loss''' of '''0.1616''' shows the predictions are accurate.
|-
|| Highlight
'''conf_matrix = confusion_matrix(Y_test, Y_pred)'''
|| Let us visualize the '''confusion matrix''' of the model.
It shows how well the model classifies each class.
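A sketch of the confusion-matrix plot, assuming '''Y_pred''' from the earlier cell; the '''seaborn''' heatmap styling is an assumption:
<pre>
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

# Rows are actual classes, columns are predicted classes
conf_matrix = confusion_matrix(Y_test, Y_pred)

sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues')
plt.xlabel("Predicted class")
plt.ylabel("Actual class")
plt.show()
</pre>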
|-
|| Show output plot
|| This matrix has three classes: 0, 1, and 2.
The diagonal values represent correct predictions.
One sample from class 1 was incorrectly predicted as class 2.
The absence of other misclassified values indicates that the model performs well.
A '''strong diagonal pattern''' suggests '''high classification accuracy'''.
|-
|| Only narration
|| We have now successfully classified the different Iris flower classes.
This brings us to the end of the tutorial. Let us summarize.
|-
|| Show slide:
'''Summary'''
|| In this tutorial, we have learnt about
* Multiclass classification using '''Logistic Regression'''
|-
|| Show slide:
'''Assignment'''
|| As an assignment, please do the following:
* Generate a '''classification report''' for the test data.
* This shows '''precision''', '''recall''', '''f1-score''', and '''support''' for each class.
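A sketch for this assignment, assuming the trained model and the test split from this tutorial; '''classification_report''' is the standard '''sklearn''' helper for these per-class metrics:
<pre>
from sklearn.metrics import classification_report

# Per-class precision, recall, f1-score, and support for the test set
Y_pred = mlr.predict(X_test)
print(classification_report(Y_test, Y_pred))
</pre>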
|-
|| Show slide:
|| After completing the assignment, the output should match the expected result.
|-
|| Show slide:
'''FOSSEE Forum'''
|| For any general or technical questions on '''Python for Machine Learning''', visit the '''FOSSEE forum''' and post your question.
|-
|| Show slide:
'''Thank You'''
|| This is '''Anvita Thadavoose Manjummel''', a FOSSEE Summer Fellow 2025, IIT Bombay, signing off.
Thanks for joining.
|}