Python for Machine Learning


Machine Learning is a transformative field of artificial intelligence that enables systems to learn from data, identify patterns, and make decisions with minimal human intervention.

Python's simplicity, flexibility, and extensive ecosystem make it the language of choice for machine learning, enabling developers and data scientists to build intelligent systems with ease, speed, and accuracy. Python offers a complete toolbox for every stage of the machine learning pipeline. Beyond model development, Python empowers you to visualize data insights with libraries like Matplotlib and Seaborn, assess algorithm performance through confusion matrices and ROC curves, and apply deep learning techniques using powerful frameworks such as TensorFlow and PyTorch.

In this Python for Machine Learning series, we will explore core concepts, practical techniques, and essential libraries that form the foundation of modern machine learning workflows. Whether you're a beginner stepping into AI or an experienced developer looking to deepen your skills, this series will guide you through hands-on examples and best practices to unlock the full potential of Python in machine learning.

The contributors who helped to create the outline, transcribe, code, and record the tutorials are Anvita Thadavoose Manjummel and Harini Theiveegan under the guidance of Dr. T. Subbulakshmi and Dr. R. Bhargavi, Professor, School of Computer Science and Engineering, Vellore Institute of Technology Chennai. The Spoken Tutorial effort for Python for Machine Learning is being contributed by Ms. Nirmala Venkat and Ms. Madhuri Ganapathi from the Spoken Tutorial project, Indian Institute of Technology Bombay.


Basic Level

1. Setup Python environment for Machine Learning

  • Installing Miniconda on Ubuntu OS
  • Creating a conda environment for Machine Learning
  • Activating the conda environment for Machine Learning
  • Downloading the MLpackage.txt file
  • Installing all the libraries listed in the txt file on Ubuntu OS
  • Installing Jupyter Notebook in the Machine Learning environment
  • Installing conda kernels in the Machine Learning environment
  • About Jupyter Notebook and its basics
  • Importing the Wine.csv dataset and displaying the first five rows
  • Deactivating the conda environment
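
As a quick sanity check of the environment, the sketch below can be run in a Jupyter Notebook cell inside the activated conda environment: it imports the core libraries and loads Wine.csv with pandas to display the first five rows. The file path is an assumption; point it to wherever the dataset was downloaded.

  # Quick check that the Machine Learning environment works (run in a Jupyter cell)
  import numpy as np
  import pandas as pd
  import sklearn

  print("NumPy:", np.__version__)
  print("pandas:", pd.__version__)
  print("scikit-learn:", sklearn.__version__)

  # Import the Wine.csv dataset and display the first five rows
  # (assumes Wine.csv sits in the current working directory)
  df = pd.read_csv("Wine.csv")
  print(df.head())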

2. K Nearest Neighbor Classification

  • Introduction to Nearest Neighbors and K Nearest Neighbor
  • Introduction to KNN classification
  • Explanation about Iris dataset
  • KNN working example using one of the iris features
  • Importing the necessary libraries
  • Loading the Iris dataset
  • Basic Data Exploration and Analysis
  • Train and Test Split of dataset
  • Choosing the K value using elbow method
  • KNN classification model building
  • Model prediction and outcome
  • Evaluation metrics using classification report
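
A minimal sketch of the KNN classification workflow outlined above, using the built-in Iris dataset from scikit-learn. Picking the k with the lowest test error is a simplification of reading the elbow from a plot, and the 80/20 split is an illustrative assumption.

  import numpy as np
  from sklearn.datasets import load_iris
  from sklearn.model_selection import train_test_split
  from sklearn.neighbors import KNeighborsClassifier
  from sklearn.metrics import classification_report

  # Load the Iris dataset and split into train and test sets
  X, y = load_iris(return_X_y=True)
  X_train, X_test, y_train, y_test = train_test_split(
      X, y, test_size=0.2, random_state=42)

  # Elbow method: track the test error rate for a range of k values
  errors = []
  for k in range(1, 21):
      knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
      errors.append(np.mean(knn.predict(X_test) != y_test))
  best_k = int(np.argmin(errors)) + 1   # k with the lowest error rate

  # Build the final KNN model and evaluate with a classification report
  model = KNeighborsClassifier(n_neighbors=best_k).fit(X_train, y_train)
  print(classification_report(y_test, model.predict(X_test)))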

3. K Nearest Neighbor Regression

  • Introduction to K Nearest Neighbor Regression
  • Various distance metrics used in KNN
  • Importing the necessary libraries
  • Loading the iris dataset
  • Standard scaling of the dataset
  • Train and Test Split of dataset
  • Choosing the K value using elbow method
  • KNN regression model building
  • Model prediction and outcome
  • Evaluation using MSE and Adjusted R-squared score
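
A minimal sketch of the KNN regression steps above. Predicting petal width from the other three Iris features is an assumption made here for illustration, since the outline does not name the target column.

  from sklearn.datasets import load_iris
  from sklearn.model_selection import train_test_split
  from sklearn.preprocessing import StandardScaler
  from sklearn.neighbors import KNeighborsRegressor
  from sklearn.metrics import mean_squared_error, r2_score

  # Predict petal width (4th column) from the other three Iris features
  X_full, _ = load_iris(return_X_y=True)
  X, y = X_full[:, :3], X_full[:, 3]

  X_train, X_test, y_train, y_test = train_test_split(
      X, y, test_size=0.2, random_state=42)

  # Standard scaling, fitted on the training set only
  scaler = StandardScaler().fit(X_train)
  X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

  # KNN regression model (k = 5 is an illustrative choice)
  reg = KNeighborsRegressor(n_neighbors=5).fit(X_train, y_train)
  y_pred = reg.predict(X_test)

  # Evaluation with MSE and Adjusted R-squared
  mse = mean_squared_error(y_test, y_pred)
  r2 = r2_score(y_test, y_pred)
  n, p = X_test.shape
  adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
  print(f"MSE: {mse:.3f}  Adjusted R^2: {adj_r2:.3f}")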

4. Linear Regression

  • About Linear Regression
  • About Simple Linear Regression
  • About Multiple Linear Regression
  • About Evaluation Metrics
  • Splitting the data into training and testing sets
  • Implementing Simple Linear Regression model from scikit-learn
  • Importing required Libraries
  • Loading the dataset
  • Evaluating the model’s accuracy
  • Implementing Multiple Linear Regression model from scikit-learn
  • Evaluating the model's accuracy
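
A minimal sketch of simple and multiple linear regression with scikit-learn. The outline does not name the tutorial's dataset, so the built-in Diabetes dataset is used here as an assumption, with BMI alone for the simple model and all features for the multiple model.

  from sklearn.datasets import load_diabetes
  from sklearn.model_selection import train_test_split
  from sklearn.linear_model import LinearRegression
  from sklearn.metrics import mean_squared_error, r2_score

  # Load the dataset (Diabetes: 10 numeric features, continuous target)
  X, y = load_diabetes(return_X_y=True)
  X_train, X_test, y_train, y_test = train_test_split(
      X, y, test_size=0.2, random_state=42)

  # Simple Linear Regression: one feature (BMI is column index 2)
  simple = LinearRegression().fit(X_train[:, [2]], y_train)
  y_pred_simple = simple.predict(X_test[:, [2]])
  print("Simple   R^2:", r2_score(y_test, y_pred_simple),
        "MSE:", mean_squared_error(y_test, y_pred_simple))

  # Multiple Linear Regression: all features
  multiple = LinearRegression().fit(X_train, y_train)
  y_pred_multi = multiple.predict(X_test)
  print("Multiple R^2:", r2_score(y_test, y_pred_multi),
        "MSE:", mean_squared_error(y_test, y_pred_multi))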

5. Logistic Regression Binary Classification

  • Introduction to Logistic Regression
  • Introduction to Binary classification
  • Introduction to Multiclass classification
  • About Purchase prediction
  • Implementing Binary classification
  • Model Instantiation of Binary Classification and Model training
  • Prediction for Train Data - Verification for Binary Classification
  • Predictions for Test Data for Binary Classification
  • Calculating the ROC-AUC score on the training data
  • Calculating the cross entropy loss for the training data
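
A minimal sketch of binary classification for purchase prediction. The column names Age, EstimatedSalary, and Purchased and the synthetic data are hypothetical stand-ins for the tutorial's dataset.

  import numpy as np
  import pandas as pd
  from sklearn.model_selection import train_test_split
  from sklearn.preprocessing import StandardScaler
  from sklearn.linear_model import LogisticRegression
  from sklearn.metrics import accuracy_score, roc_auc_score, log_loss

  # Hypothetical purchase-prediction data: Age and EstimatedSalary -> Purchased
  rng = np.random.default_rng(42)
  df = pd.DataFrame({
      "Age": rng.integers(18, 60, 400),
      "EstimatedSalary": rng.integers(15000, 150000, 400),
  })
  df["Purchased"] = ((df["Age"] > 40) | (df["EstimatedSalary"] > 90000)).astype(int)

  X = df[["Age", "EstimatedSalary"]].values
  y = df["Purchased"].values
  X_train, X_test, y_train, y_test = train_test_split(
      X, y, test_size=0.25, random_state=42)
  scaler = StandardScaler().fit(X_train)
  X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

  # Model instantiation and training
  clf = LogisticRegression().fit(X_train, y_train)

  # Verification on train data, predictions on test data
  print("Train accuracy:", accuracy_score(y_train, clf.predict(X_train)))
  print("Test accuracy :", accuracy_score(y_test, clf.predict(X_test)))

  # ROC-AUC score and cross entropy (log) loss on the training data
  train_proba = clf.predict_proba(X_train)
  print("Train ROC-AUC :", roc_auc_score(y_train, train_proba[:, 1]))
  print("Train log loss:", log_loss(y_train, train_proba))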

6. Logistic Regression Multiclass Classification

  • Implementing Multiclass classification
  • Model Instantiation of Multiclass Classification and Model training
  • Visualizing feature correlations using a heatmap
  • Splitting the data into training and testing sets
  • Building a multiclass classification model
  • Prediction for Train Data - Verification for Multiclass Classification
  • Predictions for Test Data for Multiclass Classification
  • Comparing the predicted classes with the actual test classes
  • Visualizing the confusion matrix of the model
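
A minimal sketch of the multiclass workflow above, with the Iris dataset used as an assumption; the heatmap and confusion-matrix plots require matplotlib and seaborn.

  import matplotlib.pyplot as plt
  import seaborn as sns
  from sklearn.datasets import load_iris
  from sklearn.model_selection import train_test_split
  from sklearn.linear_model import LogisticRegression
  from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

  # Load Iris as a DataFrame and visualize feature correlations as a heatmap
  data = load_iris(as_frame=True)
  sns.heatmap(data.data.corr(), annot=True, cmap="coolwarm")
  plt.show()

  # Split the data and build a multiclass logistic regression model
  X_train, X_test, y_train, y_test = train_test_split(
      data.data, data.target, test_size=0.2, random_state=42)
  clf = LogisticRegression(max_iter=200).fit(X_train, y_train)

  # Compare predicted vs. actual test classes and plot the confusion matrix
  y_pred = clf.predict(X_test)
  cm = confusion_matrix(y_test, y_pred)
  ConfusionMatrixDisplay(cm, display_labels=data.target_names).plot()
  plt.show()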

Intermediate Level

1. Decision Tree

  • Introduction to Decision Tree
  • Describing the dataset
  • Importing required Libraries
  • Loading the dataset
  • Encoding Categorical Features
  • Splitting the dataset into Training and Testing sets
  • Training Decision Tree Classifier
  • Evaluating the model's accuracy
  • Plotting Confusion matrix
  • Visualizing Decision Tree
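
A minimal sketch of the Decision Tree steps. The outline does not name the tutorial's dataset, so a small hypothetical table with categorical columns (Outlook, Windy, Play) is used and encoded with LabelEncoder.

  import pandas as pd
  import matplotlib.pyplot as plt
  from sklearn.preprocessing import LabelEncoder
  from sklearn.model_selection import train_test_split
  from sklearn.tree import DecisionTreeClassifier, plot_tree
  from sklearn.metrics import accuracy_score, confusion_matrix

  # Hypothetical dataset with categorical features (stand-in for the tutorial's data)
  df = pd.DataFrame({
      "Outlook": ["Sunny", "Rainy", "Overcast", "Sunny", "Rainy", "Overcast",
                  "Sunny", "Rainy", "Overcast", "Sunny", "Rainy", "Overcast"],
      "Windy":   ["Yes", "No", "Yes", "No", "Yes", "No",
                  "Yes", "No", "No", "Yes", "No", "Yes"],
      "Play":    ["No", "Yes", "Yes", "Yes", "No", "Yes",
                  "No", "Yes", "Yes", "Yes", "Yes", "No"],
  })

  # Encode every categorical column as integers
  encoded = df.apply(lambda col: LabelEncoder().fit_transform(col))
  X, y = encoded[["Outlook", "Windy"]], encoded["Play"]

  # Split, train, and evaluate the Decision Tree classifier
  X_train, X_test, y_train, y_test = train_test_split(
      X, y, test_size=0.25, random_state=42)
  clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
  y_pred = clf.predict(X_test)
  print("Accuracy:", accuracy_score(y_test, y_pred))
  print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))

  # Visualize the trained tree
  plot_tree(clf, feature_names=["Outlook", "Windy"], class_names=["No", "Yes"], filled=True)
  plt.show()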

2. Artificial Neural Networks

  • Introduction to Artificial Neural Networks
  • Introduction to Multi-Layer Perceptron
  • About ANN Architecture
  • Explanation of Neuron Structure
  • Importing necessary libraries
  • Loading Breast Cancer dataset
  • Basic Data Exploration and Analysis
  • Train and Test split of dataset
  • MLP Classification model building
  • Model prediction and outcome
  • Evaluation of model’s performance
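
A minimal sketch of MLP classification on the Breast Cancer dataset; the hidden-layer sizes and iteration count are illustrative assumptions.

  from sklearn.datasets import load_breast_cancer
  from sklearn.model_selection import train_test_split
  from sklearn.preprocessing import StandardScaler
  from sklearn.neural_network import MLPClassifier
  from sklearn.metrics import classification_report

  # Load the Breast Cancer dataset and split into train and test sets
  X, y = load_breast_cancer(return_X_y=True)
  X_train, X_test, y_train, y_test = train_test_split(
      X, y, test_size=0.2, random_state=42)

  # Scale the features (MLPs train much better on standardized inputs)
  scaler = StandardScaler().fit(X_train)
  X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

  # Multi-Layer Perceptron with two hidden layers (sizes are illustrative)
  mlp = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=42)
  mlp.fit(X_train, y_train)

  # Evaluate the model's performance
  print(classification_report(y_test, mlp.predict(X_test)))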

3. Support Vector Machine

  • About Support Vector Machine
  • Introduction to Linear SVM
  • Introduction to Non-Linear SVM
  • Explanation of the California Housing dataset
  • Importing necessary libraries
  • Loading the dataset
  • Label Encoding
  • Train and Test Split of dataset
  • Linear SVM classification model building
  • Model prediction and outcome
  • Evaluation for Linear SVM classification
  • Non-Linear (RBF) SVM classification model building
  • Model prediction and outcome
  • Evaluation for Non-Linear (RBF) SVM classification
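
A minimal sketch of linear and RBF SVM classification. The scikit-learn copy of California Housing has a continuous target and no categorical columns, so the target is binarized at its median here as an assumption (the tutorial's CSV and its label-encoding step may differ), and the data is subsampled so the SVMs train quickly.

  import numpy as np
  from sklearn.datasets import fetch_california_housing
  from sklearn.model_selection import train_test_split
  from sklearn.preprocessing import StandardScaler
  from sklearn.svm import SVC
  from sklearn.metrics import accuracy_score

  # Load California Housing and turn it into a binary classification task
  X, y = fetch_california_housing(return_X_y=True)
  y = (y > np.median(y)).astype(int)      # 1 = above-median house value

  # Subsample so the SVMs train quickly in this sketch
  rng = np.random.default_rng(42)
  idx = rng.choice(len(X), size=3000, replace=False)
  X, y = X[idx], y[idx]

  X_train, X_test, y_train, y_test = train_test_split(
      X, y, test_size=0.2, random_state=42)
  scaler = StandardScaler().fit(X_train)
  X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

  # Linear SVM classification
  linear_svm = SVC(kernel="linear").fit(X_train, y_train)
  print("Linear SVM accuracy:", accuracy_score(y_test, linear_svm.predict(X_test)))

  # Non-linear (RBF kernel) SVM classification
  rbf_svm = SVC(kernel="rbf").fit(X_train, y_train)
  print("RBF SVM accuracy   :", accuracy_score(y_test, rbf_svm.predict(X_test)))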

4. K Means Clustering

  • Introduction to K-means Clustering
  • Working of K-means Clustering
  • Description of the Silhouette Score
  • Description of the customers dataset
  • Importing required Libraries
  • Loading the dataset
  • Data Exploration
  • Finding optimal number of clusters
  • Instantiating K-means Clustering model
  • Clustering the data
  • Visualizing the Clusters for the Data
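
A minimal sketch of K-means clustering. Synthetic 2-D data from make_blobs stands in for the tutorial's customers dataset, and the silhouette score is used to choose the number of clusters.

  import matplotlib.pyplot as plt
  from sklearn.datasets import make_blobs
  from sklearn.cluster import KMeans
  from sklearn.metrics import silhouette_score

  # Synthetic 2-D data standing in for the customers dataset
  X, _ = make_blobs(n_samples=300, centers=4, cluster_std=1.0, random_state=42)

  # Find the number of clusters with the best silhouette score
  scores = {}
  for k in range(2, 9):
      labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
      scores[k] = silhouette_score(X, labels)
  best_k = max(scores, key=scores.get)
  print("Best k by silhouette score:", best_k)

  # Cluster the data with the chosen k and visualize the clusters
  kmeans = KMeans(n_clusters=best_k, n_init=10, random_state=42).fit(X)
  plt.scatter(X[:, 0], X[:, 1], c=kmeans.labels_, cmap="viridis", s=20)
  plt.scatter(*kmeans.cluster_centers_.T, c="red", marker="x", s=100)
  plt.title("K-means clusters")
  plt.show()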

5. Random Forest

  • Introduction to Ensemble Learning
  • Introduction to Random Forest
  • Importing Libraries
  • Loading the dataset
  • Data Preprocessing
  • Train and Test Split
  • Model Instantiation of Random Forest and Model training
  • Prediction for Train Data - Verification for Random Forest
  • Predictions for Test Data for Random Forest
  • MSE and Adjusted R-squared score for Random Forest
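
A minimal sketch of Random Forest regression evaluated with MSE and Adjusted R-squared; the California Housing dataset is used here as an assumption, since the outline does not name the tutorial's dataset.

  from sklearn.datasets import fetch_california_housing
  from sklearn.model_selection import train_test_split
  from sklearn.ensemble import RandomForestRegressor
  from sklearn.metrics import mean_squared_error, r2_score

  # Load the data and split into train and test sets
  X, y = fetch_california_housing(return_X_y=True)
  X_train, X_test, y_train, y_test = train_test_split(
      X, y, test_size=0.2, random_state=42)

  # Model instantiation and training (100 trees is the scikit-learn default)
  rf = RandomForestRegressor(n_estimators=100, random_state=42).fit(X_train, y_train)

  # Verification on train data, then predictions on test data
  print("Train R^2:", r2_score(y_train, rf.predict(X_train)))
  y_pred = rf.predict(X_test)

  # MSE and Adjusted R-squared on the test set
  mse = mean_squared_error(y_test, y_pred)
  r2 = r2_score(y_test, y_pred)
  n, p = X_test.shape
  adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
  print(f"Test MSE: {mse:.3f}  Adjusted R^2: {adj_r2:.3f}")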
