Python-for-Automation/C3/Spelling-and-Grammar-Checker/English

From Script | Spoken-Tutorial
Revision as of 14:23, 30 August 2024 by Nirmala Venkat (Talk | contribs)



Visual Cue Narration
Show slide:

Welcome

Welcome to the Spoken Tutorial on Spelling and Grammar Checker
Show slide:

Learning Objectives

In this tutorial, we will learn to
  • Create functions to identify spelling and grammar errors
  • Implement text processing techniques in Python
  • Create a GUI application with a text area and buttons
Show slide:

System Requirements

Ubuntu Linux OS 22.04

Python 3.12.3

To record this tutorial, I am using
  • Ubuntu Linux OS version 22.04
  • Python 3.12.3
Show slide:

Prerequisite

https://spoken-tutorial.org

To follow this tutorial:
  • You must have basic knowledge of using Linux Terminal and Python
  • For pre-requisite Linux and Python Tutorials, please visit this website
  • Python libraries required for automation must be installed
Show slide:

Code files

  • The files used in this tutorial are provided in the Code files link.
  • Please download and extract the files.
  • Make a copy and then use them while practicing.
Show Slide:

Spelling and Grammar Checker - GUI

In this tutorial we will build the GUI for a Spelling and Grammar checker as shown here.
Slide:

Spelling and Grammar Checker

  • The checker will refer to an inbuilt dictionary to identify wrong spellings
  • It will use language_tool_python to recognize incorrect grammar and suggest corrections
Show Slide:

Spelling and Grammar Checker - Libraries

The following Python libraries are required for this tutorial:
  • language_tool_python provides grammar and style checking capabilities.
  • NLTK is a library for text processing and natural language understanding tasks.
Show Slide:

Spelling and Grammar Checker - Libraries

  • re module helps to manipulate strings based on patterns using regular expressions.
  • tkinter is the standard GUI toolkit for Python.
  • docx is for reading and writing Microsoft Word files.
  • Fitz (PyMuPDF) is for reading PDF files
Slide: Install java and tkinter Note that Java and tkinter must be installed for this tutorial.

If not, please install using the below commands.

Point to the file in downloads folder I have created the source code checker.py for this demonstration.
Open checker.py file Now, let us go through the code in the text editor.
Highlight:

import re
import tkinter as tk
from tkinter import scrolledtext, filedialog
from nltk.corpus import words
import nltk
import language_tool_python
import docx
import fitz

First we need to import the necessary modules.

This helps to create scrollable text areas and file dialogs.

It also provides access to the NLTK word list, which serves as the spelling dictionary, and to the grammar checking library.

Highlight:

try:

nltk.download('words')

First, we download the 'words' corpus from NLTK.

This corpus has a large collection of English words used for spell-checking.

Highlight:

except Exception as e:

print(f"NLTK Download Error: {e}")

Here, the exception block catches any errors that may occur in the try block.
Highlight:

try:

tool = language_tool_python.LanguageTool('en-US')

Next, we initialize an instance of LanguageTool.

It checks grammar and style according to US English.

Highlight:

except language_tool_python.ServerException as e:
    print(f"Language Tool Initialization Error: {e}")

This block handles exceptions raised by the LanguageTool during its initialization.
Highlight:

def checkSpelling(text):

checkSpelling function checks the spelling in the given text.
Highlight:

wordList = set(words.words())

Here the list of words from the NLTK corpus is converted into a set.

A set will allow for faster look-up.

Highlight:

misspelled = []

Initialize a list which will store all the misspelled words in the text.
Highlight:

for word in re.findall(r'\b\w+\b', text):

Then we run a for loop to iterate through each word in the text.

re.findall method finds all words in the text using regular expressions.
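
As a quick illustration of what this pattern extracts, the sketch below runs the same regular expression on a sample sentence. The helper name extract_words and the sample string are assumptions made only for this example; they are not part of checker.py.

```python
import re

def extract_words(text):
    # \b marks a word boundary and \w+ matches one or more letters,
    # digits or underscores, so spaces and punctuation are skipped.
    return re.findall(r'\b\w+\b', text)

sample = "Hello wrold. What is you favorite lanugage?"
tokens = extract_words(sample)
print(tokens)

# An apostrophe is not matched by \w, so a contraction is split in two.
print(extract_words("don't"))
```

Because contractions are split at the apostrophe, fragments such as a lone "t" may be reported as misspelled when they are not in the word list.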

Highlight:

if word.lower() not in wordList:

misspelled.append(word)

return misspelled

Each word is converted to lowercase and checked against the wordList.

If the word is not found in the wordList, it is added to the misspelled list.
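
The lookup logic described above can be tried out without downloading the NLTK corpus. The following is a minimal sketch that assumes a tiny hand-written word set, SAMPLE_WORDS, in place of set(words.words()); the function name check_spelling and its word_list parameter are also illustrative.

```python
import re

# Hypothetical stand-in for set(words.words()); the real corpus is far larger.
SAMPLE_WORDS = {"hello", "world", "what", "is", "you", "your", "favorite", "language"}

def check_spelling(text, word_list=SAMPLE_WORDS):
    misspelled = []
    for word in re.findall(r'\b\w+\b', text):
        # Lowercase before the lookup so that "Hello" matches "hello".
        if word.lower() not in word_list:
            misspelled.append(word)
    return misspelled

first = check_spelling("Hello wrold.")
second = check_spelling("What is you favorite lanugage")
print(first)
print(second)
```

Note that "you" passes the spelling check because it is a valid word; the mistake in that sentence is grammatical, which is why a separate grammar check is needed.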

Highlight:

def checkGrammar(text):

Next, the checkGrammar function checks the grammar in the text.
Highlight:

matches = tool.check(text)
return matches

check is a method of the LanguageTool that checks the text for grammatical errors.

We store the result of the grammar check in a variable named matches.

Highlight:

def checkText():

checkText function combines the grammar and spelling checks and displays results.
Highlight:

text = textArea.get("1.0", tk.END)

get method retrieves all the text from the text area.
Highlight:

“1.0”

Here, 1 point 0 is a textual representation of a position in the text widget.

It points to the beginning of the first line.

Highlight:

spellingErrors = checkSpelling(text)

grammarErrors = checkGrammar(text)

We call the checkSpelling function and the checkGrammar function.
Highlight:

result = "Spelling Errors:\n"

if spellingErrors:

result += ", ".join(spellingErrors)

else: result += "None"

If there are any spelling errors we append them to a string named result.
Highlight:

result += "\n\nGrammar Errors:\n"
if grammarErrors:
    for match in grammarErrors:
        result += f"{match.context} -> {match.message}\n"
else:
    result += "None"

Then, we iterate through any grammar errors and append them to result.

If no errors are found, the word None is appended to result.
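
To see the shape of the final result string without running LanguageTool, the formatting step can be sketched on its own. Here FakeMatch is a hypothetical stand-in for the match objects returned by tool.check, modelling only the context and message attributes used in checker.py, and build_result is an illustrative name.

```python
from collections import namedtuple

# Hypothetical stand-in for a LanguageTool match; only the two
# attributes that checker.py reads are modelled here.
FakeMatch = namedtuple("FakeMatch", ["context", "message"])

def build_result(spelling_errors, grammar_errors):
    result = "Spelling Errors:\n"
    result += ", ".join(spelling_errors) if spelling_errors else "None"
    result += "\n\nGrammar Errors:\n"
    if grammar_errors:
        for match in grammar_errors:
            result += f"{match.context} -> {match.message}\n"
    else:
        result += "None"
    return result

clean = build_result([], [])
report = build_result(["lanugage"], [FakeMatch("What is you favorite", "Example message")])
print(clean)
print(report)
```

The message text used here is invented for the demonstration; the real messages come from LanguageTool.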

Highlight:

resultArea.config(state=tk.NORMAL)

Next, we add commands to configure the resultArea.

In the config method set the state to tk dot NORMAL to make the result area editable.

This will allow the program to print the error results in the result area.

Highlight:

resultArea.delete("1.0", tk.END)

delete method clears text from the specified start position to the end position in the result area.
Highlight:

resultArea.insert(tk.END, result)

insert method inserts the result string containing the errors into the result area.
Highlight:

resultArea.config(state=tk.DISABLED)

Again, in the config method set the state to tk dot DISABLED to disable editing.
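
The enable, clear, insert, disable sequence can be wrapped in one helper. The sketch below is an assumption-laden illustration: FakeTextWidget is a hypothetical stand-in that imitates only the Text methods used here, so the example runs without a display, and show_result is an illustrative name. The string constants have the same values as tk.NORMAL, tk.DISABLED and tk.END.

```python
# Same string values as tk.NORMAL, tk.DISABLED and tk.END.
NORMAL, DISABLED, END = "normal", "disabled", "end"

class FakeTextWidget:
    """Hypothetical stand-in for the few Text methods used below."""

    def __init__(self):
        self.state = DISABLED
        self.content = ""

    def config(self, state):
        self.state = state

    def delete(self, start, end):
        # Simplified: clears everything, and only when editing is enabled.
        if self.state == NORMAL:
            self.content = ""

    def insert(self, index, text):
        # A disabled Text widget ignores insertions, including those from code.
        if self.state == NORMAL:
            self.content += text

def show_result(widget, result):
    widget.config(state=NORMAL)     # allow changes
    widget.delete("1.0", END)       # remove the previous result
    widget.insert(END, result)      # write the new result
    widget.config(state=DISABLED)   # make the area read-only again

area = FakeTextWidget()
area.insert(END, "ignored")         # has no effect while disabled
show_result(area, "Spelling Errors:\nNone")
print(area.state, repr(area.content))
```

With a real widget, the same show_result function could be called as show_result(resultArea, result), since it relies only on the config, delete and insert methods.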
Highlight:

def uploadFile():

uploadFile function opens a file dialog box to upload a text, Word, or PDF file.

It reads the file content, and inserts it into the text area.

Highlight:

file_path = filedialog.askopenfilename(filetypes=[("Text files", "*.txt"), ("Word documents", "*.docx"), ("PDF files", "*.pdf")])

askopenfilename opens a file dialog window that allows the user to select a file.
Highlight:

filetypes=[("Text files", "*.txt"), ("Word documents", "*.docx"), ("PDF files", "*.pdf")]

filetypes restricts the selectable files to text, Word, or PDF files.
Highlight:

if file_path:

This checks if a file is selected.
Highlight:

if file_path.endswith(".txt"):

with open(file_path, 'r', encoding='utf-8') as file:

text = file.read()

First we check if the selected file is a text file.

If yes, then we open the file in read mode with UTF-8 encoding.

Then we read the entire content of the file into the variable text.

Highlight:

elif file_path.endswith(".docx"):
    doc = docx.Document(file_path)
    text = '\n'.join([para.text for para in doc.paragraphs])

If the selected file is not a text file, we check if it is a Word document.

We open the Word document using the docx library.

We then join the text of all its paragraphs into the variable text.

Highlight:

elif file_path.endswith(".pdf"):
    pdf_document = fitz.open(file_path)
    text = ""
    for page_num in range(pdf_document.page_count):
        page = pdf_document.load_page(page_num)
        text += page.get_text()

If the selected file is not a text or Word file, we check if it is a PDF file.

Open the PDF file using the fitz library.

Content from each page of the PDF is appended to the variable text.
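
The three branches can also be collected into a single helper that returns the file content. This is a sketch rather than the code in checker.py: the name read_file_text is illustrative, the extension is compared in lowercase (an assumption added here, since endswith is case-sensitive), and docx and fitz are imported only inside the branch that needs them.

```python
import os
import tempfile

def read_file_text(file_path):
    # Lowercase the extension so that "PARAGRAPH.TXT" is accepted too.
    ext = os.path.splitext(file_path)[1].lower()
    if ext == ".txt":
        with open(file_path, "r", encoding="utf-8") as file:
            return file.read()
    if ext == ".docx":
        import docx  # python-docx, needed only for Word documents
        doc = docx.Document(file_path)
        return "\n".join(para.text for para in doc.paragraphs)
    if ext == ".pdf":
        import fitz  # PyMuPDF, needed only for PDF files
        pdf_document = fitz.open(file_path)
        text = ""
        for page_num in range(pdf_document.page_count):
            text += pdf_document.load_page(page_num).get_text()
        pdf_document.close()
        return text
    raise ValueError(f"Unsupported file type: {ext!r}")

# Try the text branch on a temporary file.
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "PARAGRAPH.TXT")
    with open(sample_path, "w", encoding="utf-8") as sample_file:
        sample_file.write("Hello world.")
    loaded = read_file_text(sample_path)
print(loaded)
```

Raising ValueError for other extensions is one possible design choice; it avoids the situation where the variable text is never assigned because none of the branches matched.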

Highlight:

textArea.delete("1.0", tk.END)

textArea.insert(tk.END, text)

Finally, we delete the current content of the text area.

Then we insert the content of the file into the text area.

Highlight:

root = tk.Tk()

root.title("Spelling and Grammar Checker")

Now, we initialize the main Tkinter window and call it root.

root dot title sets the title of the window.

Highlight:

textArea = scrolledtext.ScrolledText(root, wrap=tk.WORD, width=60, height=15)

This creates a scrollable text area for user input.
Highlight:

textArea.pack(pady=10)

We can also add padding to the text area using the pack method.
Highlight:

uploadButton = tk.Button(root, text="Upload", command=uploadFile)

uploadButton.pack(pady=5)

Next, we create an upload button that invokes the uploadFile function.
Highlight:

command=uploadFile

We can use the command attribute to set the function to be called.
Highlight:

checkButton = tk.Button(root, text="Check", command=checkText)

checkButton.pack(pady=5)

We also need a check button which will trigger the checkText function.
Highlight:

resultArea = scrolledtext.ScrolledText(root, wrap=tk.WORD, width=60, height=15, state=tk.DISABLED)

resultArea.pack(pady=10)

Finally, we create a result area using ScrolledText to display the results.
Highlight:

state=tk.DISABLED

state is set as tk dot DISABLED which disables the editing of the result area.
Highlight:

root.mainloop()

This starts the Tkinter event loop and keeps the window open for user interaction.
Running the code:

Open terminal -

Press Ctrl + Alt + T

Save the code as checker.py in the Downloads folder.

Open the terminal by pressing Control, Alt and T keys simultaneously.

Type in terminal:

source Automation/bin/activate

We will open the virtual environment we created for the Automation series.

Type source space Automation forward slash bin forward slash activate.

Then press enter.

Type in terminal:

> cd Downloads

> python3 checker.py

In the terminal, type cd Downloads and press enter.
Next type python3 checker.py and press enter.
Show tkinter window:

Tkinter window opens up on running the code


(Adjust tkinter window in frame)
(pause here) Wait for a moment as it may take a while for the Tkinter window to appear.
The tkinter window will pop up once the code is executed.
As you can see, there is a text area, an Upload button, a Check button and a result area.
Let us test out our Spelling and Grammar checker application.
Type in tkinter window:

Hello wrold.

Click on check button

Highlight:

We are going to type Hello wrold and click on the Check button.
In the result area, we can see that the wrong spelling of world has been detected.
Let us correct the spelling of world.
Now click on the check button.
We can see there are no errors.
Type in tkinter window:

What is you favorite lanugage

Click on check button

Highlight:

Now let us type what is you favorite lanugage and click on Check.
This time we can see grammar and spelling errors.
In the result area, corrections for the errors are also suggested.
Correct text in tkinter window:

Hello world. Python is my favorite language

Click on check button

Highlight:

Content in result area

Let us fix the errors we see.
The correct sentence would be What is your favorite language. (correct this first)
Click on Check. (then correct this)
Now there are no errors, so the result area displays None.
Click on upload button
  • Select file type
  • Select file
  • Click Open


Click on the Upload button and a file dialog box will appear on your screen.
Here we can select the type of file we want to upload.
Let us select this paragraph.txt file and click on Open.
Highlight:

Content in text area

Click Check button

Highlight:

Content in result area

We see that the content of the paragraph.txt file has appeared in the text area.
Click on the Check button.
Please wait as it might take some time to process the spell check.
Some spelling and grammar errors are displayed in the result area.
Highlight: We can now edit the text, based on the errors and suggestions pointed out in the result area.
Close tkinter window We can terminate this program by closing the tkinter window.
Type in terminal:

deactivate

In the terminal, type deactivate. This will allow you to exit the virtual environment.
Show slide:

Summary

This brings us to the end of the tutorial. Let us summarize.
In this tutorial, we have learnt to
  • Create functions to identify spelling and grammar errors
  • Implement text processing techniques in Python
  • Create a GUI application with a text area and buttons
Show slide:

Assignment

As an assignment, do the following:
  • Modify the checker.py and print the total number of errors in the result area.
  • Hint: Use the len function to calculate the number of errors.
Show slide:

About the Spoken Tutorial Project

The video at the following link summarizes the Spoken Tutorial Project. Please download and watch it.
Show Slide:

Spoken Tutorial Workshops

The Spoken Tutorial Project team conducts workshops and gives certificates.

For more details, please write to us.

Show Slide: Answers for THIS Spoken Tutorial Please post your timed queries in this forum.
Show Slide:

FOSSEE Forum

For any general or technical questions on Python for Automation, visit the FOSSEE forum and post your question.
Show slide:

Acknowledgement

The Spoken Tutorial Project was established by the Ministry of Education, Government of India.
Show slide:

Thank You

This is Jasmine Tresa Jose, a FOSSEE Semester Long Intern 2024, IIT Bombay signing off.

Thanks for joining.

Contributors and Content Editors

Madhurig, Nirmala Venkat