Biopython/C2/Introduction-to-Biopython/English-timed

Time
Narration
00:01 Welcome to this tutorial on Introduction to Biopython.
00:05 In this tutorial, we will learn about: * important features of Biopython
00:10 * Information regarding download and installation on Linux Operating System
00:15 * And translation of a DNA sequence to a protein sequence using Biopython tools.
00:22 To follow this tutorial, you should be familiar with:
00:25 undergraduate Biochemistry or Bioinformatics
00:29 and basic Python programming.
00:31 Refer to the Python tutorials at the given link.
00:35 To record this tutorial, I am using: * Ubuntu OS version 12.04
00:41 * Python version 2.7.3
00:44 * IPython version 0.12.1 and
00:48 * Biopython version 1.58.
00:51 Biopython is a collection of modules for computational biology.
00:57 It can perform the tasks required for bioinformatics, from the most basic to the advanced.
01:03 Biopython tools are used for:
01:05 * Parsing, that is, extracting information from various file formats such as FASTA, GenBank etc.
01:14 * Downloading data from database websites such as NCBI, ExPASy etc.
01:22 * Running bioinformatics algorithms such as BLAST.
01:26 It has tools for performing common operations on sequences.
01:31 For example, obtaining complements, transcription, translation etc.
01:38 It also has code for dealing with alignments
01:40 and code to split up tasks into separate processes.
01:46 Information regarding download:
01:48 The Biopython package is not part of the standard Python distribution; it needs to be downloaded independently.
01:54 For details, refer to the following link.
01:59 Installation on Linux system:
02:02 Install the Python, IPython and Biopython packages using Synaptic Package Manager.
02:08 Prerequisite software will be installed automatically.
02:13 Additional packages must be installed for graphic outputs and plots.
02:18 Open the terminal by pressing Ctrl, Alt and T keys simultaneously.
02:24 I have already installed Python, IPython and Biopython on my system.
02:30 Start the IPython interpreter by typing "ipython" and pressing Enter.
02:35 The IPython prompt appears on the screen.
02:38 To check the installation of Biopython, at the prompt, type: "import Bio" and press Enter.
02:48 If you don't get any error message, it means Biopython is installed.
02:54 Here, let me remind you, Python language is case sensitive.
02:59 Take precaution while typing keywords, variables or functions.
03:04 For instance, in the above line “i” in import is lowercase and “B” is uppercase in Bio.
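The installation check described above can also be run as a small script. This is only a sketch: it assumes Python 3 (the recording uses Python 2.7.3), and the helper name biopython_installed is made up for illustration.

```python
import importlib.util


def biopython_installed():
    """Return True if a package named Bio can be found on the import path."""
    return importlib.util.find_spec("Bio") is not None


if biopython_installed():
    # Note the case: lowercase "import", capital "B" in "Bio".
    import Bio
    print("Biopython version:", getattr(Bio, "__version__", "unknown"))
else:
    print("Biopython is not installed")
```

If the second message is printed, install Biopython before continuing with the steps that follow.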
03:12 In this tutorial, we will make use of Biopython modules to translate a DNA sequence.
03:19 It involves the following steps.
03:22 First, create a sequence object for the coding DNA strand.
03:27 Next, transcribe the coding DNA strand to mRNA.
03:32 Finally, translate the mRNA to a protein sequence.
03:37 We will be using the coding DNA strand shown on this slide, as an example.
03:42 It codes for a small protein sequence.
03:46 The first step is to create a sequence object for the above coding DNA strand.
03:52 Let us go back to the terminal.
03:55 For creating a sequence object, import Seq from the Seq module of the Bio package.
04:02 The Seq module provides methods to store and process sequence objects.
04:08 At the prompt, type: from Bio.Seq import Seq. Press Enter.
04:17 Next, specify the alphabet of the strand explicitly, when creating your sequence object.
04:24 That is, specify whether the letters in the sequence code for nucleotides or amino acids.
04:32 To do so, we will use the IUPAC module from the Alphabet package.
04:38 At the prompt, type: from Bio.Alphabet import IUPAC. Press Enter.
04:48 Note that we have used import and from statements to load "Seq" and "IUPAC" modules.
04:56 Store the sequence object in a variable called cdna.
05:01 At the prompt, type: cdna = Seq, followed by a pair of parentheses.
05:08 Inside the parentheses, enclose the sequence within double quotes, as in normal strings.
05:13 We know our sequence is a DNA fragment. So, type the unambiguous DNA alphabet object, IUPAC.unambiguous_dna, as the second argument.
05:21 For the output, type: cdna. Press Enter.
05:26 The output shows the DNA sequence as a sequence object.
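The steps for creating the sequence object can be sketched as follows. The DNA string is a hypothetical example, not the strand shown on the slide, and make_cdna is an illustrative helper name. The Bio.Alphabet package used in the recording (Biopython 1.58) was removed in Biopython 1.78, so the sketch passes an alphabet only where one is still available.

```python
try:
    from Bio.Seq import Seq
except ImportError:
    Seq = None  # Biopython is not installed

# Hypothetical coding strand, NOT the one shown on the tutorial's slide.
CODING_DNA = "ATGGCCATTGTAATGGGCCGCTGA"


def make_cdna(sequence):
    """Build a Seq object, adding an alphabet only on older Biopython."""
    try:
        # Available in Biopython 1.58; removed from Biopython 1.78 onwards.
        from Bio.Alphabet import IUPAC
    except ImportError:
        return Seq(sequence)
    return Seq(sequence, IUPAC.unambiguous_dna)


if Seq is not None:
    cdna = make_cdna(CODING_DNA)
    print(repr(cdna))
```

On recent Biopython versions the Seq object carries no alphabet, so it is up to you to keep track of whether a sequence is DNA, RNA or protein.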
05:30 Let’s transcribe the coding DNA strand into the corresponding mRNA.
05:35 We will use the Seq module's built-in “transcribe” method.
05:39 Type the following code:
05:41 Store the output in a variable mrna.
05:45 At the prompt, type: mrna = cdna.transcribe() and press Enter.
05:55 For the output, type: mrna. Press Enter.
06:01 Observe the output. The transcribe method replaces the thymine in the DNA sequence with uracil.
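To see what this replacement amounts to, here is a plain-Python sketch that works on ordinary strings. It is an illustration of the idea only, not Biopython's own implementation, and the sequence is a hypothetical example.

```python
def transcribe(coding_dna):
    """Return the mRNA for a coding strand by swapping thymine for uracil."""
    return coding_dna.replace("T", "U").replace("t", "u")


coding_dna = "ATGGCCATTGTAATGGGCCGCTGA"  # hypothetical coding strand
mrna = transcribe(coding_dna)
print(mrna)  # AUGGCCAUUGUAAUGGGCCGCUGA
```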
06:09 Next, to translate this mRNA to corresponding protein sequence, use the translate method.
06:16 Type the following code: protein = mrna.translate() and press Enter.
06:27 The translate method translates an RNA or DNA sequence. If no genetic code is specified, it uses the standard genetic code.
06:36 The output shows an amino acid sequence.
06:40 The output also shows information regarding the presence of stop codons in the translated sequence.
06:47 Observe the asterisk at the end of the protein sequence. It indicates the stop codon.
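The idea behind translation with the standard genetic code can be sketched in plain Python. This is an illustration only, not how Biopython implements translate; the names BASES, AMINO_ACIDS and CODON_TABLE are made up here, and the mRNA is a hypothetical example.

```python
# Standard genetic code, listed with bases in the order U, C, A, G
# for the first, second and third codon positions. "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(mrna):
    """Translate an mRNA string codon by codon, ignoring a trailing partial codon."""
    usable = len(mrna) - len(mrna) % 3
    return "".join(CODON_TABLE[mrna[i:i + 3]] for i in range(0, usable, 3))


print(translate("AUGGCCAUUGUAAUGGGCCGCUGA"))  # MAIVMGR*
```

The final codon, UGA, is a stop codon, which is why the output ends with an asterisk.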
06:53 In the above code, we have used a coding DNA strand for transcription.
06:59 In Biopython, the transcribe method works only on the coding DNA strand.
07:04 However, in real biological systems, the process of transcription starts with a template strand.
07:11 If you are starting with a template strand, convert it to the coding strand by using the reverse_complement method, as shown on the terminal.
07:20 Follow the rest of the code as shown above, for the coding strand.
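What the conversion from template strand to coding strand does can be sketched in plain Python as below. This is an illustration only, not Biopython's implementation; it handles only uppercase A, C, G and T, and the template strand is a hypothetical example.

```python
# Map each base to its pairing partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Complement every base, then reverse the strand."""
    return dna.translate(COMPLEMENT)[::-1]


template = "TCAGCGGCCCATTACAATGGCCAT"  # hypothetical template strand
coding = reverse_complement(template)
print(coding)  # ATGGCCATTGTAATGGGCCGCTGA
```

Applying the function twice returns the original strand, which is a quick way to check the result.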
07:24 Using methods in Biopython we have translated a DNA sequence to a protein sequence.
07:31 A DNA sequence of any size can be translated to a protein sequence using this code.
07:37 Let's summarize. In this tutorial, we have learnt:
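Put together, the three steps might look like the sketch below. The function name dna_to_protein and the DNA string are assumptions made for illustration; the sequence is not the one from the slide. The alphabet argument is left out so that the code also runs on recent Biopython versions.

```python
try:
    from Bio.Seq import Seq
except ImportError:
    Seq = None  # Biopython is not installed


def dna_to_protein(coding_dna):
    """Return (mRNA, protein) strings for a coding DNA strand."""
    cdna = Seq(coding_dna)       # step 1: sequence object
    mrna = cdna.transcribe()     # step 2: coding DNA to mRNA
    protein = mrna.translate()   # step 3: mRNA to protein, standard genetic code
    return str(mrna), str(protein)


if Seq is not None:
    # Hypothetical coding strand ending in a stop codon.
    mrna, protein = dna_to_protein("ATGGCCATTGTAATGGGCCGCTGA")
    print(mrna)     # AUGGCCAUUGUAAUGGGCCGCUGA
    print(protein)  # MAIVMGR*
```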
07:41 Important features of Biopython
07:43 Information regarding download and installation on Linux OS
07:48 Create a sequence object for the given DNA strand.
07:52 Transcription of the DNA sequence to mRNA.
07:56 Translation of mRNA to protein sequence.
08:00 Now, for the assignment-
08:02 Translate the given DNA sequence into protein sequence.
08:06 Observe the output.
08:08 The protein sequence has an internal stop codon.
08:11 As it happens in nature, translate the DNA until the first in-frame stop codon.
08:17 Your completed assignment should have the following code.
08:20 Notice that we have used the to_stop argument in the translate() method. Notice the output.
08:27 The stop codon itself is not translated.
08:31 The stop symbol is not included at the end of your protein sequence.
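The effect of the to_stop argument can be sketched as follows. The DNA string is a hypothetical sequence with an internal stop codon; it is not the sequence given for the assignment.

```python
try:
    from Bio.Seq import Seq
except ImportError:
    Seq = None  # Biopython is not installed

# Hypothetical coding strand with an internal in-frame stop codon (TGA).
DNA_WITH_INTERNAL_STOP = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

if Seq is not None:
    cdna = Seq(DNA_WITH_INTERNAL_STOP)
    full = str(cdna.translate())                # stop codons appear as "*"
    to_first_stop = str(cdna.translate(to_stop=True))
    print(full)           # MAIVMGR*KGAR*
    print(to_first_stop)  # MAIVMGR
```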
08:36 This video summarizes the Spoken Tutorial project.
08:39 If you do not have good bandwidth, you can download and watch it.
08:43 The Spoken Tutorial Project team conducts workshops and gives certificates for those who pass an online test.
08:50 For more details, please write to us.
08:53 Spoken Tutorial Project is funded by NMEICT, MHRD, Government of India.
08:59 More information on this mission is available at this link.
09:03 This is Snehalatha from IIT Bombay, signing off. Thank you for joining.

Contributors and Content Editors

PoojaMoolya, Sandhya.np14