Python-for-Automation/C2/File-Backup-and-Compression/English
Visual Cue | Narration |
Show slide
Title Slide |
Welcome to the Spoken Tutorial on “File Backup and Compression” |
Show slide:
Learning Objectives |
In this tutorial, we will learn to
|
Show slide:
System Requirements |
To record this tutorial, I am using
|
Show slide:
Prerequisite |
To follow this tutorial:
|
Show slide:
Code files |
|
Show the folder Sample | I have created a folder Sample which contains some files for demonstration. |
Point to the file in Downloads folder | file_backup.py contains the code for the backup and compression of directories.
run_backup.py is used to call the functions defined in file_backup.py. Ensure that both source files are saved in the same location. schedule.txt contains the commands required to schedule the backup process every day. |
Open file_backup.py file | Now, let us go through the file_backup.py file in the text editor. |
Highlight: | Import the necessary modules for File backup and compression |
Highlight: | First, we define a function that performs the backup operation.
We pass the source and backup directories and compression format as parameters. |
Highlight:
compress=False |
compress is a boolean value that determines if the backup needs to be compressed. |
Highlight:
if not os.path.exists(backup_dir): |
First we check if the backup directory exists using the exists function. |
Highlight:
os.makedirs(backup_dir) |
If it does not exist, we create the backup directory using the makedirs function. |
Highlight: | Then we check if there is an existing backup directory named current_backup.
This existing current_backup directory is removed using rmtree function. This prevents multiple copies of the backup directory from being created. |
Highlight: | We create a path for the new backup directory as current_backup using the join function. |
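The directory preparation described in the rows above can be sketched as follows. This is a minimal illustration, not the code in file_backup.py: the helper name prepare_backup_dir and the demo variables are hypothetical, and the demonstration runs in a temporary directory so that no real files are touched.

```python
import os
import shutil
import tempfile


def prepare_backup_dir(backup_dir):
    """Create backup_dir if needed and return a clean current_backup path.

    Hypothetical helper name, used only for this illustration.
    """
    # Create the backup directory (and any missing parents) if it is absent.
    if not os.path.exists(backup_dir):
        os.makedirs(backup_dir)

    # Build the path of the new backup directory.
    current_backup_dir = os.path.join(backup_dir, "current_backup")

    # Remove a leftover backup so that only one copy exists at a time.
    if os.path.exists(current_backup_dir):
        shutil.rmtree(current_backup_dir)

    return current_backup_dir


# Demonstration in a throwaway temporary directory.
demo_backup_dir = os.path.join(tempfile.mkdtemp(), "backups")
first_path = prepare_backup_dir(demo_backup_dir)
os.makedirs(first_path)  # simulate a backup left over from an earlier run
second_path = prepare_backup_dir(demo_backup_dir)
print(second_path, os.path.exists(second_path))
```

After the second call, the old current_backup directory has been removed, so the path is free for the new backup.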
Highlight:
for root, dirs, files in os.walk(source_dir): |
Next, we use the walk method to recursively traverse through the source directory.
We then try to recreate the same directory structure in the backup folder. |
Highlight:
root |
Here, root is the current directory being traversed. |
Highlight:
relative_path = os.path.relpath(root, source_dir) |
relpath function computes the relative path of the root to the source folder.
This helps in maintaining the same structure as the source directory. |
Highlight: | Then the join function is used to construct the path for the destination directory. |
Highlight:
if not os.path.exists(dest_dir): os.makedirs(dest_dir) |
We now check if the destination directory exists.
If it does not, then the destination directory is created. |
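To see how walk, relpath and join work together, here is a small sketch that only recreates the directory structure. The function name mirror_directories and the demo folder names are assumptions made for this example; the tutorial's own function also copies the files.

```python
import os
import tempfile


def mirror_directories(source_dir, backup_root):
    """Recreate the directory tree of source_dir under backup_root."""
    for root, dirs, files in os.walk(source_dir):
        # Path of the directory being visited, relative to the source folder.
        relative_path = os.path.relpath(root, source_dir)
        # normpath removes the "." that relpath returns for the top level.
        dest_dir = os.path.normpath(os.path.join(backup_root, relative_path))
        if not os.path.exists(dest_dir):
            os.makedirs(dest_dir)


# Demonstration with a nested folder inside a temporary directory.
demo_root = tempfile.mkdtemp()
demo_source = os.path.join(demo_root, "Sample")
os.makedirs(os.path.join(demo_source, "reports", "2024"))
demo_backup = os.path.join(demo_root, "current_backup")

mirror_directories(demo_source, demo_backup)
print(os.path.isdir(os.path.join(demo_backup, "reports", "2024")))
```

Because every destination path is built from the relative path, the backup keeps the same nesting as the source directory.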
Highlight: | This for loop constructs the path to the source file in the source directory.
It will also do the same for the destination files in the backup directory. |
Highlight:
if not os.path.exists(dest_file) |
First we check if the source file already exists in the backup directory.
If it does not exist, then the file is copied. |
Highlight: | Next, the getmtime function is used to check whether the file has been modified since the last backup.
The file gets copied only if it has been modified. This helps to minimize the processing when we perform the backup every day. |
Highlight:
shutil.copy2(source_file, dest_file) |
copy2 function is used to create a copy of the source to the destination directory. |
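The copy decision described above can be sketched like this. It is an illustration under assumptions: the helper name copy_if_changed and the demo file names are hypothetical, and the comparison of modification times is one plausible way to write the check.

```python
import os
import shutil
import tempfile


def copy_if_changed(source_file, dest_file):
    """Copy source_file if the backup is missing or older. Return True if copied."""
    if (not os.path.exists(dest_file)
            or os.path.getmtime(source_file) > os.path.getmtime(dest_file)):
        # copy2 copies the contents and also preserves the timestamps.
        shutil.copy2(source_file, dest_file)
        return True
    return False


# Demonstration in a temporary directory.
demo_dir = tempfile.mkdtemp()
src = os.path.join(demo_dir, "Doc3.txt")
dst = os.path.join(demo_dir, "Doc3_backup.txt")
with open(src, "w") as handle:
    handle.write("first version")
os.utime(src, (1_000_000_000, 1_000_000_000))  # fixed modification time

first_copy = copy_if_changed(src, dst)    # backup does not exist yet
second_copy = copy_if_changed(src, dst)   # nothing has changed
os.utime(src, (1_000_000_100, 1_000_000_100))  # pretend the file was edited
third_copy = copy_if_changed(src, dst)    # source is now newer
print(first_copy, second_copy, third_copy)
```

Since copy2 keeps the modification time of the source, an unchanged file is not copied again on the next run.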
Highlight: | If compress is set to True, we generate a compressed backup. |
Highlight: | We call the compress_backup function to compress current_backup folder. |
Highlight:
shutil.rmtree(current_backup_dir) |
After compression we use rmtree method to remove the current_backup directory. |
Highlight: | Finally, we print a success message indicating where the backup is stored. |
Highlight: | Next, we define a function to compress the backup folder |
Highlight: | In the function, we check which compression format has been passed.
If the format is zip, we use make_archive method to create a zip archive. |
Highlight: | If the format is tar dot gz or tar dot bz2, we use tarfile dot open function. |
This function creates tar archives with gzip or bzip2 compression. | |
The add method then adds the contents of the source directory to the tar archive. | |
Highlight:
os.path.basename(source_dir) |
os dot path dot basename returns the last component of the source directory path. This sets the name of the top-level folder stored inside the tar archive. |
Highlight: | If an unsupported compression format is provided, we raise a ValueError. |
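A compression function along the lines described above might look like the following sketch. The parameter names, the archive naming and the demo files are assumptions for illustration; the code in file_backup.py may differ in its details.

```python
import os
import shutil
import tarfile
import tempfile


def compress_backup(source_dir, compression_format="zip"):
    """Archive source_dir next to itself and return the archive path."""
    if compression_format == "zip":
        # make_archive appends the ".zip" extension to the base name itself.
        return shutil.make_archive(source_dir, "zip", source_dir)
    if compression_format in ("tar.gz", "tar.bz2"):
        mode = "w:gz" if compression_format == "tar.gz" else "w:bz2"
        archive_path = f"{source_dir}.{compression_format}"
        with tarfile.open(archive_path, mode) as tar:
            # arcname is the top-level folder name stored inside the archive.
            tar.add(source_dir, arcname=os.path.basename(source_dir))
        return archive_path
    raise ValueError(f"Unsupported compression format: {compression_format}")


# Demonstration with one small file in a temporary directory.
demo_root = tempfile.mkdtemp()
demo_backup = os.path.join(demo_root, "current_backup")
os.makedirs(demo_backup)
with open(os.path.join(demo_backup, "Doc4.txt"), "w") as handle:
    handle.write("sample text")

zip_path = compress_backup(demo_backup, "zip")
tar_path = compress_backup(demo_backup, "tar.gz")
with tarfile.open(tar_path) as tar:
    tar_names = tar.getnames()
print(zip_path, tar_names)
```

Passing any other format, for example rar, raises a ValueError instead of silently producing no archive.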
Save file_backup.py | Save the code as file_backup.py in the Downloads folder. |
Open file:
run_backup.py |
Open Downloads folder and open the run_backup.py file.
Now, let us go through this code. |
Highlight:
import file_backup |
We need to import the file_backup python code as a module into the run_backup file. |
In Downloads point to sample Files | Go to the Downloads folder.
This is our sample directory that we will use for testing. |
Highlight:
source_directory = ‘’ backup_directory_path = ‘’ |
source_directory holds the path where the Sample folder is saved.
backup_directory_path holds the path where the backup has to be saved. Please change the paths according to the location of your directories. |
Highlight: | Finally, we call the perform_backup function and pass all the parameters. |
Set compress=False | Let us set the compress parameter to False and the format to zip. |
Running the code:
Press Ctrl + Alt + T |
Save the code file run_backup.py |
Open the terminal by pressing Control, Alt and T keys | Open the terminal by pressing Control, Alt and T keys simultaneously. |
Type in terminal:
source Automation/bin/activate |
We will activate the virtual environment we created for the Automation series.
Type source space Automation forward slash bin forward slash activate. Then press Enter. |
Type in terminal:
> cd Downloads > python3 run_backup.py Highlight: |
In the terminal, type cd Downloads and press Enter.
Next type python3 run_backup.py and press Enter. We see a message that backup is done successfully and it is stored in this location. Let us see the output in this directory. |
Switch to Downloads folder
|
Go to the Downloads folder, and we can see a directory named current_backup.
Open the current_backup directory and you will find all the files from our source directory. So we have successfully completed the file backup. |
Next we will check the working of file compression. | |
Go to Downloads folder. | Go to Downloads folder and in the Sample directory you will see a document named Doc3.odt
Let us modify this file to see whether the file compression is working properly. Open Doc3.odt and add some images or text. Save the file. |
|
Go to Downloads, right click on the Sample directory and select properties.
Here we can see the size of this directory. After compression this size should reduce. |
In run_backup.py type:
compress=True Save run_backup.py |
Go to run_backup.py file and set compress to True.
This will compress the backup directory. Save the file. |
In terminal type:
python3 run_backup.py |
Switch to the terminal and type python3 run_backup.py again and press Enter.
We see a message that indicates successful backup and compression. |
|
This time the current_backup directory is compressed in zip format.
Again, right click on the compressed current_backup.zip and select properties. We can see that the size of the compressed folder is smaller than that of the Sample folder. |
Open current_backup | This is how the Python backup and compression code works. |
Switch to terminal | Switch back to the terminal.
Next we will learn how to schedule this program to run automatically at a specified time. |
Type EDITOR=nano crontab -e | We will use crontab, which holds the list of commands that the Linux scheduler cron runs at the specified times.
To open the crontab file for editing, in the terminal, type EDITOR=nano space crontab space hyphen e. We will type our commands here. |
Open schedule.txt
Copy commands from schedule.txt and paste in crontab |
Go to the Downloads folder and open schedule.txt
This file contains the necessary commands required to run a scheduler. Copy these commands. Switch back to the terminal and paste the commands at the end of the crontab editor. |
Type:
30 11 * * * |
Here, the first number represents the minutes and the second number represents the hours.
I will set these to 30 and 11, that is 11:30, because that is when I want the backup to run. |
Highlight:
* * * |
The three asterisks stand for the day of the month, the month and the day of the week. This indicates that the backup will be scheduled every day at the specified time. |
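As a small aid for reading the schedule, the five time fields of a crontab entry can be named with a few lines of Python. The helper parse_cron_time_fields is hypothetical and only labels the fields; it does not schedule anything.

```python
def parse_cron_time_fields(schedule):
    """Map the five time fields of a crontab entry to their names."""
    names = ["minute", "hour", "day_of_month", "month", "day_of_week"]
    fields = schedule.split()
    if len(fields) != 5:
        raise ValueError("expected exactly five time fields")
    return dict(zip(names, fields))


# The schedule for a daily run at 11:30.
fields = parse_cron_time_fields("30 11 * * *")
for name, value in fields.items():
    print(f"{name}: {value}")
```

An asterisk in a field means every value, so this schedule matches 11:30 on every day of every month.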
Highlight
/usr/bin/python3 |
This is the path to the python3 interpreter. |
Highlight
/home/jasmine/Downloads/run_backup.py |
Next, we add the path to the run_backup.py file. |
Highlight
>> /home/jasmine/Downloads/bkp_logfile.log |
We can append the output of our code to a log file saved in the Downloads folder.
Please change the path according to your system. |
Highlight
2>&1 Save crontab: Ctrl+X Y Enter |
This redirects the standard error to the same place as the standard output, so error messages are also written to the log file.
Now, press Ctrl plus X and then press Y followed by Enter to save and exit. Now we have scheduled the run_backup.py file to be executed at 11:30 every day. |
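The effect of the two redirections in the cron entry can be reproduced in Python for experimentation. This sketch is only an analogy, not part of the tutorial's code: opening the log in append mode plays the role of the double greater-than sign, and stderr=subprocess.STDOUT plays the role of 2>&1. The child command and the temporary log path are made up for the demonstration.

```python
import os
import subprocess
import sys
import tempfile

# A temporary log file stands in for bkp_logfile.log.
log_path = os.path.join(tempfile.mkdtemp(), "bkp_logfile.log")

# A tiny child program that writes to both standard output and standard error.
child_code = "import sys; print('backup done'); print('a warning', file=sys.stderr)"

# Mode "a" appends to the log; stderr=subprocess.STDOUT merges errors into it.
with open(log_path, "a") as log_file:
    subprocess.run(
        [sys.executable, "-c", child_code],
        stdout=log_file,
        stderr=subprocess.STDOUT,
        check=True,
    )

with open(log_path) as log_file:
    log_text = log_file.read()
print(log_text)
```

Both the normal message and the warning end up in the same log file, which is what we expect to find in the log after the cron job has run.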
The current date and time as of the creation of this video is 28th September, 11:38.
We will check if the scheduler has completed the backup the next day. | |
Now the date is 29th September 11:35 a.m. | |
Open current_backup
Directory Point to Doc4.txt |
Let us go to the Downloads folder, extract the current_backup directory and open it.
We can see that all the files have been backed up. |
Narration | We can also check if there were any output messages or errors after the backup. |
Point to bkp_logfile.log
Open bkp_logfile.log Highlight output in bkp_logfile.log |
In the Downloads folder, we see that a bkp_logfile.log has been created.
Here, the output or error messages of the program will be stored after the cron job is completed. |
Narration | In this way, we can back up and compress a directory at a predetermined time and date. |
Type in terminal:
deactivate |
In the terminal, type deactivate. This will allow you to exit the virtual environment. |
Show slide:
Summary |
This brings us to the end of the tutorial. Let us summarize.
In this tutorial, we have learnt to
|
Show slide:
Assignment |
As an assignment, do the following:
|
Show slide:
About the Spoken Tutorial Project |
The video at the following link summarizes the Spoken Tutorial Project. Please download and watch it. |
Show Slide:
Spoken Tutorial Workshops |
The Spoken Tutorial Project team conducts workshops and gives certificates.
For more details, please write to us. |
Show Slide:
Answers for THIS Spoken Tutorial |
Please post your timed queries in this forum. |
Show Slide:
FOSSEE Forum |
For any general or technical questions on Python for Automation, visit the FOSSEE forum and post your question. |
Show slide:
Acknowledgement |
The Spoken Tutorial Project was established by the Ministry of Education, Government of India. |
Show slide:
Thank You |
This is Jasmine Tresa Jose, a FOSSEE Summer Fellow 2024, IIT Bombay signing off.
Thanks for joining. |