Being-Creative-with-AI/C4/Building-a-Basic-RAG-system/English


Title of the script: Building a basic RAG System.

Author: EduPyramids

Keywords: RAG, Retrieval Augmented Generation, Python, LangChain, ChromaDB, Hugging Face Embeddings, Vector Database, AI, Similarity Search, Ubuntu, Linux, Pandas, Question Answering, Text Retrieval, Local RAG, Free AI Tools, Machine Learning, EduPyramids, video tutorial.

Visual Cue Narration
Slide 1

Title Slide

Welcome to this Spoken Tutorial on Building a basic RAG System.
Slide 2

Learning Objectives

In this tutorial, we will learn how to:
  • Build a simple RAG system.
  • Retrieve and display answers from a dataset.
Slide 3

Disclaimer Slide

As AI tools constantly evolve, if you are unable to locate any icon or encounter difficulty at any step, you may use any conversational AI Chatbot for guidance.

Slide 4

System Requirements

To record this tutorial, I am using:
  • Ubuntu 24.04 LTS

Learners will also need a working internet connection.

Slide 5

Prerequisites

https://EduPyramids.org

To follow this tutorial, learners should have:
  • Python installed on their system.
  • A basic understanding of using the terminal.

For the prerequisites of this tutorial, visit the website shown on your screen.

Slide 6

Code files

The following code file is required to practice this tutorial:
  • rag-command.txt

This file is provided in the Code Files link of this tutorial page.

Please download and extract the file.


In this tutorial, we will use a completely free setup.

No API key or billing is required.

Let us get started.
Press Ctrl, Alt and T keys together to open the terminal.
Type: cd rag_project and press Enter.

This command moves into the project folder.

Type: source venv/bin/activate and press Enter. This command activates the virtual environment.

You should now see (venv) in the terminal.
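
The (venv) prefix can also be confirmed from inside Python. The following is a minimal sketch, assuming the environment was created with Python's venv module. The function name in_virtual_environment is chosen here for illustration and is not part of the tutorial's code files.

```python
import sys


def in_virtual_environment():
    """Return True when Python is running inside a venv-style environment."""
    # Inside a venv, sys.prefix points to the venv folder,
    # while sys.base_prefix still points to the base Python installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtual_environment())
```

If this prints False after activation, the environment was probably not activated in the current terminal.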

Now let us create a dataset file.

Type: nano sample_data.csv and press Enter.

The GNU nano editor opens.

Pause the tutorial and enter this data.

Question,Answer
What is the return policy?,Items can be returned within 24 hours.
Are vegetables returnable?,Perishable items cannot be returned.
When will I get my refund?,Refunds take 3 to 5 business days.

Press Ctrl + O, then Enter to save.

Press Ctrl + X to exit.
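
As an alternative to typing the data in nano, the same file could be written and checked with Python's built-in csv module. This is only a sketch. The helper names write_sample_csv and load_documents are assumptions made for illustration, and a temporary folder is used so that an existing sample_data.csv is not overwritten.

```python
import csv
import os
import tempfile

# The three question-answer rows used in this tutorial.
ROWS = [
    ("What is the return policy?", "Items can be returned within 24 hours."),
    ("Are vegetables returnable?", "Perishable items cannot be returned."),
    ("When will I get my refund?", "Refunds take 3 to 5 business days."),
]


def write_sample_csv(path):
    """Write the header and the three question-answer rows to path."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Question", "Answer"])
        writer.writerows(ROWS)


def load_documents(path):
    """Read the CSV and join each question with its answer using a space."""
    with open(path, newline="", encoding="utf-8") as f:
        return [row["Question"] + " " + row["Answer"] for row in csv.DictReader(f)]


# Write to a temporary folder (illustrative choice) and read the rows back.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "sample_data.csv")
write_sample_csv(csv_path)
documents = load_documents(csv_path)
for doc in documents:
    print(doc)
```

Each printed line is one question followed by its answer, which is the combined text that the Python script in the next step stores for retrieval.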

Now let us create a Python file.

Type: nano basic_rag_demo.py and press Enter.

Type:

import pandas as pd
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings

# Load dataset
data = pd.read_csv("sample_data.csv")

# Combine Question and Answer
documents = (data["Question"] + " " + data["Answer"]).tolist()

# Create embeddings (free model)
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

# Store in vector database
db = Chroma.from_texts(documents, embeddings)

# Take user query
query = input("Enter your question: ")

# Retrieve similar result
results = db.similarity_search(query, k=1)
context = results[0].page_content

# Display context
print("\nRetrieved Context:")
print(context)

# Extract answer
if "?" in context:
    answer = context.split("?")[-1].strip()
else:
    answer = context

print("\nAnswer:")
print(answer)


Pause the tutorial and type this code carefully.

This script creates embeddings and retrieves similar answers from data.

Embeddings help find similar meanings, not just exact keyword matches.

Chroma acts as a lightweight vector database for storing embeddings.

The system retrieves the most relevant match from the stored data.

Press Ctrl + O, then Enter to save.

Press Ctrl + X to exit.
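
The idea of retrieving the most relevant match can be illustrated without any model. The sketch below uses small hand-made vectors as stand-ins for embeddings. These numbers are invented for illustration and are not produced by all-MiniLM-L6-v2, which generates much longer vectors. Cosine similarity is used here as one common measure. The distance measure that Chroma actually applies depends on its configuration.

```python
import math

# Hypothetical 3-number "embeddings" for the stored texts.
# The values are made up for this illustration only.
STORED = {
    "What is the return policy? Items can be returned within 24 hours.": [0.9, 0.1, 0.2],
    "Are vegetables returnable? Perishable items cannot be returned.": [0.5, 0.9, 0.0],
    "When will I get my refund? Refunds take 3 to 5 business days.": [0.2, 0.0, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query_vector):
    """Return the stored text whose vector is most similar to the query vector."""
    return max(STORED, key=lambda text: cosine_similarity(query_vector, STORED[text]))


# Made-up vector standing in for the query "Can I return vegetables?"
query_vector = [0.6, 0.8, 0.1]
print(best_match(query_vector))
```

With these invented numbers, the vegetables entry scores highest even though the query and the stored text use different words, which is the behaviour that real embeddings aim to provide.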

Now run the program.

Type: python basic_rag_demo.py and press Enter.

Type: Can I return vegetables?
Highlight the relevant context. The system retrieves relevant context.
Highlight the displayed answer. Then it displays the answer.
Highlight the output. Notice that the system finds the closest matching data.

The system then extracts the answer from the retrieved context.
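
The extraction step can be tried on its own. The sketch below wraps the same split logic from the script in a function. The name extract_answer is an assumption made for illustration.

```python
def extract_answer(context):
    """Return the text after the last question mark, or the whole context."""
    if "?" in context:
        # Everything after the final "?" is treated as the answer.
        return context.split("?")[-1].strip()
    return context


context = "Are vegetables returnable? Perishable items cannot be returned."
print(extract_answer(context))
```

This prints only the answer part, Perishable items cannot be returned. Note that this simple approach assumes the answer itself contains no question mark. If it did, only the text after the last question mark would be kept.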

You have successfully built a simple RAG system.

This system is completely free to run and needs no API key.

With this, we come to the end of this tutorial.
Slide 7

Summary

In this tutorial, we learnt how to:

  • Build a simple RAG system.
  • Retrieve and display answers from a dataset.
Slide 8

Acknowledgement

Domain Inputs: Bhavani Shankar R and Saisudha Sugavanam

Script Writer: Ketki Naina

Admin Reviewer: Arthi Varadarajan

Quality Reviewer: Sakina Sidhwa

Novice Reviewer: Misbah Samir

AI Narration: Debosmita Mukherjee

Screen recording:

Video Editor: Arvind Pillai

Web Developer: Ankita Singhal

Thank you for joining.
Slide 9

Acknowledgement

This Spoken Tutorial is brought to you by EduPyramids Educational Services Private Limited at SINE, IIT Bombay.

Contributors and Content Editors

Ketkinaina