Ketkinaina: Created page with "'''Title of the Script: Understanding the RAG Pipeline.''' '''Author: EdyPyramids Team.''' '''Keywords: RAG, Retrieval Augmented Generation, Artificial Intelligence, Embeddi..."

2026-05-19T20:58:31Z

'''Title of the Script: Understanding the RAG Pipeline.'''

'''Author: EduPyramids Team.'''

'''Keywords: RAG, Retrieval Augmented Generation, Artificial Intelligence, Embeddings, Vector Database, Vector Search, AI Chatbot, Document Chunking, AI Pipeline, Query Processing, Information Retrieval, Generative AI, EduPyramids, video tutorial.'''

{|border=1
|-
|| '''Visual Cue'''
|| '''Narration'''
|-
|| '''Slide 1'''

'''Title Slide'''
|| Welcome to this '''Spoken''' '''Tutorial''' on '''Understanding the RAG Pipeline.'''
|-
|| '''Slide 2'''

'''Learning Objectives'''
|| In this tutorial, we will learn:
* The stages of a '''RAG pipeline.'''
* How a user query is processed.
* How retrieval improves the generated response.
|-
|| '''Slide 3'''

'''Disclaimer Slide'''

As '''AI''' tools constantly evolve, if you are unable to locate any icon or encounter difficulty at any step, you may use any conversational '''AI''' '''Chatbot''' for guidance.
|| As '''AI''' tools constantly evolve, if you are unable to locate any icon or encounter difficulty at any step, you may use any conversational '''AI''' '''Chatbot''' for guidance.
|-
|| '''Slide 4'''

'''System Requirements'''
|| To record this tutorial, I am using:
* '''Ubuntu 24.04 LTS''', and
* '''Firefox''' version '''148.0.2'''.

Learners will also need a working internet connection.
|-
|| '''Slide 5'''

'''Prerequisites'''

[https://edupyramids.org/ https://EduPyramids.org]
|| To follow this tutorial,
* Learners should know basic computer and internet usage.
* Basic '''AI''' knowledge is required.

For the prerequisites of this tutorial, visit the website shown on your screen.
|-
||
|| Let us get started.
|-
|| '''RAG pipeline diagram with Retrieval + Generation blocks'''
|| '''RAG''' stands for '''Retrieval Augmented Generation.'''
|-
|| '''Flowchart showing sequence of stages'''
|| It is a sequence of steps used to answer a question.
|-
|| '''AI system connected to external documents/database'''
|| It uses external data to generate accurate answers.
|-
|| '''AI brain with “Stored Knowledge” and “Retrieved Knowledge” labels'''
|| The system does not rely only on stored knowledge.
|-
|| '''Search and retrieval animation from documents'''
|| It retrieves relevant information to generate answers.
|-
|| '''“Step-by-step process” illustration'''
|| Let us understand this step by step.
|-
|| '''Grocery shopping chatbot interface'''
|| Consider a grocery app '''chatbot'''.
|-
|| '''User typing a question into chatbot'''
|| Ask this question: “Can I return vegetables after 2 days?”
|-
|| '''Human brain vs computer comparison graphic'''
|| Now, unlike humans, the system cannot understand words directly.
|-
|| '''Flowchart showing processing stages'''
|| So it follows a series of steps to find the answer.
|-
|| '''Text converting into vectors/numbers animation'''
|| First, the system converts the question into a machine-understandable form.
|-
|| '''Highlight the term “Embedding”'''
|| This process is called '''embedding'''.
|-
|| '''Binary numbers or vectors beside words'''
|| Computers work with numbers, not words.
|-
|| '''Words converting into numeric arrays'''
|| So every word is represented as a list of numbers.
|-
|| '''Example vector representation of “vegetables”'''
|| For example,

The word ‘vegetables’ may be converted into a list of numbers like:

[0.21, 0.45, 0.78, …]
|-
|| '''Side-by-side vectors for vegetables and fruits'''
|| Similarly,

The word ‘fruits’ will have a different set of numbers.
|-
|| '''Highlight closeness between vectors'''
|| But the values will be close to ‘vegetables’ because both are related.
|-
|| '''Vector comparison illustration'''
|| Now, instead of comparing words,

The system compares these numbers.
|-
|| '''Similar vectors connected visually'''
|| If two sentences have similar '''vectors''', their meanings are similar.
|-
|| '''Question matched with stored documents'''
|| This lets the system compare the question with stored documents.
|-
|| '''Similarity measurement graphic'''
|| It helps measure sentence similarity.
|-
|| '''Company documents or policy PDF shown'''
|| Next, the system searches company documents such as the return policy.
|-
|| '''Large document splitting into blocks'''
|| These documents are often divided into smaller parts called '''chunks'''.
|-
|| '''Three text boxes labeled Chunk 1, Chunk 2, Chunk 3'''
|| A return policy document may be split into '''chunks''' like this:
|-
|| '''Three text boxes labeled Chunk 1, Chunk 2, Chunk 3'''
|| '''Chunk 1:''' Return rules for fruits and vegetables

'''Chunk 2:''' Return rules for packaged items

'''Chunk 3:''' Refund process and timelines
|-
|| '''Chunks converting into embeddings'''
|| Now, each '''chunk''' is converted into numbers using the same '''embedding''' process.
|-
|| '''Vector database illustration with stored vectors'''
|| These number representations are stored in a '''database''' called a '''vector database'''.
|-
|| '''Search icon over vector database'''
|| A '''vector database''' stores data in the form of numbers.

It can quickly find similar meanings.
|-
|| '''User query converting into vectors'''
|| When you ask:

‘Can I return vegetables after 2 days?’

Your question is also converted into numbers.
|-
|| '''Query vector compared with stored chunk vectors'''
|| Then, the system compares your question with '''chunks''' in the '''vector database'''.
|-
|| '''Best matching chunk highlighted'''
|| It finds the most similar '''chunk'''.

For example:

‘Fresh vegetables can be returned within 24 hours only.’
|-
|| '''Retrieved chunk sent to AI model'''
|| This relevant '''chunk''' is then used to generate the final answer.
|-
|| '''Multiple retrieved policy lines displayed'''
|| The system compares the question with '''chunks''' and retrieves relevant '''chunks'''.
|-
|| '''Highlight retrieved policy statements'''
|| The system may retrieve statements such as:
* “Perishable items cannot be returned.”
* “Items can be returned within 24 hours.”
|-
|| '''Good retrieval vs bad retrieval comparison'''
|| Note that the quality of the final answer depends on the retrieved information.
|-
|| '''Relevant chunk highlighted among many results'''
|| The system selects the most relevant information from the retrieved results.
|-
|| '''Question and retrieved chunk combined visually'''
|| This information is combined with the original question.
|-
|| '''Two blocks labeled Question and Policy Line'''
|| The system now has two things: your question and the relevant policy line.
|-
|| '''Context box formed from question + retrieved text'''
|| Together, these form the '''context'''.

The '''AI''' uses the background information to generate an answer.
|-
|| '''Policy statement highlighted'''
|| For example, from the return policy, it may find this line:

‘Fresh vegetables can be returned within 24 hours only.’
|-
|| '''Original question shown beside retrieved statement'''
|| Now, this information is combined with your original question:
|-
|| '''Question displayed again prominently'''
|| ‘Can I return vegetables after 2 days?’
|-
|| '''Combined input entering AI model'''
|| So the system now has both your question and the relevant policy information.
|-
|| '''Label “Context” shown clearly'''
|| This combined input is called '''context'''.
|-
|| '''Definition text animation'''

'''Context → AI → Response flowchart'''
|| In simple terms, '''context''' is the background information given to the system.
|-
|| '''Final chatbot answer displayed on screen'''
|| Using this '''context''', the system can now generate a correct response, like this:

‘No, vegetables cannot be returned after 2 days, as the policy allows returns only within 24 hours.’
|-
|| '''AI model reading retrieved chunk and question'''
|| The '''AI model''' reads the question and the retrieved information.
|-
|| '''Generated answer appearing'''
|| It then generates an answer based on this information.
|-
|| '''Alternative generated answer example'''
|| For example:

“Vegetables are perishable items and cannot be returned after delivery.”
|-
|| '''Incorrect guessing crossed out, retrieved answer ticked'''
|| The system reduces guesswork and bases its answer on retrieved information.
|-
|| '''Relevant answer highlighted'''
|| This leads to a more relevant response.

Without retrieval, the answer may be generic or incorrect.
|-
|| '''Three-step pipeline animation: Question → Retrieval → Answer'''
|| The process is as follows:
* A question is asked.
* Relevant information is retrieved.
* An answer is generated.
|-
|| '''Wrong retrieval leading to incorrect answer illustration'''
|| Wrong or unrelated retrieval may lead to an incorrect answer.
|-
||
|| With this, we come to the end of this tutorial.
|-
|| '''Slide 6'''

'''Summary'''

'''In this tutorial, we learnt:'''
* '''The stages of a RAG pipeline.'''
* '''How a user query is processed.'''
* '''How retrieval improves the generated response.'''
|| In this tutorial, we learnt:
* The stages of a RAG pipeline.
* How a user query is processed.
* How retrieval improves the generated response.
|-
|| '''Slide 7'''

'''Assignment'''

'''Create a simple grocery return policy with at least three rules.'''

'''Ask the question: “Can I return vegetables after 2 days?”'''

'''Then ask the same question again using your policy as context.'''

'''Compare both responses and observe how retrieval improves the final answer.'''

'''Also identify:'''
* '''the user question,'''
* '''the retrieved information, and'''
* '''the final answer.'''
|| We encourage you to do this assignment.
|-
|| '''Slide 8'''

'''Acknowledgement'''

'''Domain Inputs: Bhavani Shankar R and Saisudha Sugavanam'''

'''Script Writer: Ketki Naina'''

'''Admin Reviewer: Arthi Varadarajan'''

'''Quality Reviewer: Sakina Sidhwa'''

'''Novice Reviewer: Misbah Samir'''

'''AI Narration: Debosmita Mukherjee'''

'''AI Graphics: Arvind Pillai'''

'''Video Editor: Arvind Pillai'''

'''Web Developer: Ankita Singhal'''
|| Thank you for joining.
|-
|| '''Slide 9'''

'''Acknowledgement'''

'''This Spoken Tutorial is brought to you by EduPyramids Educational Services Private Limited at SINE, IIT Bombay.'''
||
|-
|}
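
The stages narrated above (embedding, chunk storage, retrieval, and building the context) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the method used by any particular chatbot: the embedding here is a simple word-count vector standing in for a real embedding model, a plain Python list stands in for the vector database, and the names <code>embed</code>, <code>build_index</code>, <code>retrieve</code> and <code>build_context</code> are hypothetical. The chunk texts are adapted from the chunk descriptions and the example policy line in the script. The final generation step by an AI model is not shown.

```python
import math
import re
from collections import Counter

# Illustrative chunks based on the return policy example in the script.
CHUNKS = [
    "Return rules for fruits and vegetables. "
    "Fresh vegetables can be returned within 24 hours only.",
    "Return rules for packaged items.",
    "Refund process and timelines.",
]


def tokenize(text):
    """Lowercase the text and split it into word and number tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text, vocabulary):
    """Toy embedding (assumption): one word count per vocabulary entry.

    A real system would call a trained embedding model here.
    """
    counts = Counter(tokenize(text))
    return [float(counts[word]) for word in vocabulary]


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(chunks):
    """Embed every chunk; the returned list plays the role of the vector database."""
    vocabulary = sorted({word for chunk in chunks for word in tokenize(chunk)})
    vectors = [embed(chunk, vocabulary) for chunk in chunks]
    return vocabulary, vectors


def retrieve(question, chunks, vocabulary, vectors):
    """Embed the question and return the most similar chunk with its score."""
    query_vector = embed(question, vocabulary)
    scores = [cosine_similarity(query_vector, vector) for vector in vectors]
    best = max(range(len(chunks)), key=scores.__getitem__)
    return chunks[best], scores[best]


def build_context(question, chunk):
    """Combine the retrieved chunk with the original question."""
    return f"Policy: {chunk}\nQuestion: {question}"


if __name__ == "__main__":
    question = "Can I return vegetables after 2 days?"
    vocabulary, vectors = build_index(CHUNKS)
    chunk, score = retrieve(question, CHUNKS, vocabulary, vectors)
    print(build_context(question, chunk))
```

In this sketch, the question shares the words "can", "return" and "vegetables" with the first chunk, so that chunk receives the highest similarity score and is placed in the context. Word counts only capture exact word overlap; a trained embedding model would also place related words such as "fruits" and "vegetables" close together, as described in the narration.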
