Biopython/C2/Introduction-to-Biopython/English
|
|
---|---|
Slide Number 1
Title Slide |
Welcome to this tutorial on Introduction to Biopython |
Slide Number 2
Learning Objectives |
In this tutorial, we will learn about
|
Slide Number 3
Pre-requisites |
To follow this tutorial, you should be familiar with:
* Undergraduate Biochemistry or Bioinformatics
* Basic Python programming
Refer to the Python tutorials at the given link. |
Slide Number 4
System Requirement |
To record this tutorial, I am using:
* Ubuntu OS version 12.04
* Python version 2.7.3
* IPython version 0.12.1
* Biopython version 1.58 |
Slide Number 5
About Biopython |
Biopython is a collection of modules for computational biology.
It can perform most basic to advanced tasks required for bioinformatics. |
Slide Number 6
Biopython functionality |
Biopython tools are used for:
1. Parsing, that is, extracting information from various file formats such as FASTA, GenBank etc.
2. Downloading data from database websites such as NCBI, ExPASy etc.
3. Running bioinformatics algorithms such as BLAST. |
Slide Number 7
Biopython functionality |
4. It has tools for performing common operations on sequences, for example, to obtain complements, transcription, translation etc.
5. It has code for dealing with alignments.
6. It has code to split up tasks into separate processes. |
Slide Number 8
Download |
Information regarding download.
The Biopython package is not part of the standard Python distribution. It needs to be downloaded independently. For details, refer to the following link. |
Slide Number 9
Installation for Ubuntu/Linux systems |
Installation on Linux systems.
Install the Python, IPython and Biopython packages using Synaptic Package Manager. Prerequisite software will be installed automatically. Additional packages must be installed for graphic outputs and plots. Open the terminal by pressing the Ctrl, Alt and T keys simultaneously. |
Cursor on the terminal | I have already installed Python, IPython and Biopython on my system.
Start the IPython interpreter by typing ipython and pressing Enter. The IPython prompt appears on the screen. |
Open the terminal and check installation of Biopython | To check the installation of Biopython,
at the prompt type: import Bio. Press Enter. If you don't get any error message, it means Biopython is installed. Here, let me remind you that the Python language is case sensitive. Take care while typing keywords, variables or functions. For instance, in the above line, “i” in import is lowercase and “B” in Bio is uppercase. |
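The same check can also be scripted, so that a missing package gives a readable message instead of an error traceback. This is only a minimal sketch; the variable name `biopython_available` is illustrative and not part of Biopython.

```python
# Try to import the Bio package; an ImportError means Biopython is not installed.
try:
    import Bio
    biopython_available = True
except ImportError:
    biopython_available = False

if biopython_available:
    # Bio.__version__ holds the version string of the installed Biopython.
    print("Biopython is installed, version " + Bio.__version__)
else:
    print("Biopython is not installed")
```

Note that the package name is written as `Bio`, with an uppercase "B", exactly as in the interactive check.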
Cursor on the terminal. | In this tutorial we will make use of Biopython modules to translate a DNA sequence. |
Slide Number 10
DNA Translation |
It involves the following steps.
First, create a sequence object for the coding DNA strand. Next, transcribe the coding DNA strand to mRNA. Finally, translate the mRNA to a protein sequence. |
Slide Number 11
Sequence Object |
We will use the coding DNA strand shown on this slide as an example.
It codes for a small protein sequence. The first step is to create a sequence object for the above coding DNA strand. Let us go back to the terminal. |
Open the terminal
Type: >>> from Bio.Seq import Seq |
To create a sequence object, import the Seq class from the Bio.Seq module.
The Seq class provides methods to store and process sequences. At the prompt, type: from Bio dot Seq import Seq. Press Enter. |
Cursor on the terminal. | Next, specify the alphabet of the strand explicitly when creating your sequence object.
That is, specify whether the letters in the sequence code for nucleotides or amino acids. |
>>> from Bio.Alphabet import IUPAC | To do so, we will use the IUPAC module from the Alphabet package.
At the prompt, type: from Bio dot Alphabet import IUPAC. Press Enter. Note that we have used the from and import statements to load Seq and IUPAC. |
Type >>> cdna = Seq("ATGTTACACTCCCGATGA", IUPAC.unambiguous_dna)
Press Enter. Type >>> cdna and press Enter. Output: Seq('ATGTTACACTCCCGATGA', IUPACUnambiguousDNA()) |
Store the sequence object in a variable called cdna.
At the prompt, type: cdna equal to Seq. As with normal strings, enclose the sequence within double quotes, inside the parentheses. We know our sequence is a DNA fragment, so type the unambiguous DNA alphabet object as the second argument. To see the output, type cdna and press Enter. The output shows the DNA sequence as a sequence object. |
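The alphabet argument tells Biopython what kind of letters the string holds. For `IUPAC.unambiguous_dna` these are the four letters A, C, G and T. A plain-Python check that a string fits this alphabet before it is wrapped in `Seq` might look like the sketch below. The helper `is_unambiguous_dna` is hypothetical and not part of Biopython. Note also that the `Bio.Alphabet` module was removed in Biopython 1.78; on those releases the object is created from the string alone, as `Seq("ATGTTACACTCCCGATGA")`.

```python
# The four letters of the IUPAC unambiguous DNA alphabet.
UNAMBIGUOUS_DNA = set("ACGT")


def is_unambiguous_dna(sequence):
    """Return True if every letter of `sequence` is A, C, G or T."""
    return set(sequence.upper()) <= UNAMBIGUOUS_DNA


print(is_unambiguous_dna("ATGTTACACTCCCGATGA"))  # True
print(is_unambiguous_dna("AUGUUACACUCCCGAUGA"))  # False, U is an RNA letter
```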
Cursor on the terminal | Let’s transcribe the coding strand into the corresponding mRNA.
We will use the Seq object's built-in “transcribe” method. |
Type
>>> mrna = cdna.transcribe() Press Enter. Type mrna and press Enter. >>> mrna Seq('AUGUUACACUCCCGAUGA', IUPACUnambiguousRNA()) |
Type the following code.
Store the output in a variable called mrna. At the prompt, type: mrna equal to cdna dot transcribe, open and close parentheses. Press Enter. To see the output, type mrna and press Enter. |
Highlight the output | Observe the output. The transcribe method replaces Thymine in the DNA sequence with Uracil. |
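What transcription of a coding strand amounts to can be sketched in plain Python as a letter replacement. This is an illustration of the idea only, not Biopython's own implementation, and the function name is chosen here for the example.

```python
def transcribe(coding_dna):
    """Return the mRNA for a coding DNA strand by replacing T with U."""
    return coding_dna.replace("T", "U").replace("t", "u")


print(transcribe("ATGTTACACTCCCGATGA"))  # AUGUUACACUCCCGAUGA
```

The result matches the mRNA sequence shown in the tutorial output.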
Cursor on the terminal | Next, to translate this mRNA to the corresponding protein sequence, use the translate method. |
Type
>>> protein = mrna.translate() Press Enter. |
Type the following code:
protein equal to mrna dot translate, open and close parentheses. Press Enter. The translate method translates an RNA or DNA sequence using the standard genetic code, unless another genetic code is specified. |
Cursor on the terminal.
Output: protein Seq('MLHSR*', HasStopCodon(IUPACProtein(), '*')) |
The output shows an amino acid sequence.
The output also shows information regarding the presence of stop codons in the translated sequence. Observe the asterisk at the end of the protein sequence. It indicates the stop codon. |
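To see how the standard genetic code maps each codon to an amino acid, the translation step can be sketched in plain Python. This is a simplified illustration and not Biopython's implementation; the names `CODON_TABLE` and `translate` are chosen for this example, and any incomplete codon at the end of the sequence is simply ignored.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order:
# the first codon is UUU (F), the second UUC (F), and so on up to GGG (G).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an mRNA string codon by codon; '*' marks a stop codon."""
    usable_length = len(mrna) - len(mrna) % 3  # drop a trailing partial codon
    codons = [mrna[i:i + 3] for i in range(0, usable_length, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


print(translate("AUGUUACACUCCCGAUGA"))  # MLHSR*
```

Here AUG gives M, UUA gives L, CAC gives H, UCC gives S, CGA gives R, and the stop codon UGA gives the asterisk.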
Cursor on the terminal. | In the above code, we have used a coding DNA strand for transcription.
In Biopython, the transcribe method works only on the coding DNA strand. However, in real biological systems, the process of transcription starts with a template strand. |
Type
>>> cdna = template_dna.reverse_complement() |
If you are starting with a template strand,
convert it to the coding strand by using the reverse complement method, as shown on the terminal. |
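The reverse complement step can be sketched in plain Python as follows: read the template strand backwards and swap each base for its partner. This is an illustration only, not Biopython's implementation. The template sequence below is not from the tutorial; it was derived here as the reverse complement of the coding strand used earlier.

```python
# Base-pairing partners for unambiguous DNA.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(dna):
    """Return the reverse complement of an unambiguous, uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


template_dna = "TCATCGGGAGTGTAACAT"  # assumed template for the tutorial's coding strand
print(reverse_complement(template_dna))  # ATGTTACACTCCCGATGA
```

Applying the function twice returns the original strand, which is a quick way to check it.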
Cursor on the terminal | Follow the rest of the code as shown above for the coding strand. |
Cursor on the terminal | Using methods in Biopython we have translated a DNA sequence to a protein sequence. |
Cursor on the terminal | A DNA sequence of any size can be translated to a protein using this code. |
Slide Number 12
Summary |
Let's summarize.
In this tutorial we have learnt
|
Slide Number 13
Summary |
|
Slide Number 14
Assignment |
Now for the assignment,
|
Cursor on the terminal. | Your completed assignment should have the following code.
Notice that we have used the 'to underscore stop' argument in the translate method. Notice the output: the stop codon itself is not translated, so the stop symbol is not included at the end of your protein sequence. |
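The effect of `to_stop` can be sketched with the coding strand used earlier in this tutorial (not the assignment sequence, which is not reproduced here). The helper `dna_to_protein` is hypothetical. The sketch assumes Biopython is installed and creates `Seq` without an alphabet argument, which should work on both old and new releases; if the import fails, the demonstration is skipped.

```python
try:
    from Bio.Seq import Seq
except ImportError:
    Seq = None  # Biopython is not installed, so the demo below is skipped


def dna_to_protein(coding_dna, to_stop=False):
    """Transcribe and translate a coding DNA string using Biopython."""
    mrna = Seq(coding_dna).transcribe()
    # With to_stop=True, translation ends at the first stop codon
    # and the '*' symbol is left out of the result.
    return str(mrna.translate(to_stop=to_stop))


if Seq is not None:
    print(dna_to_protein("ATGTTACACTCCCGATGA"))                # MLHSR*
    print(dna_to_protein("ATGTTACACTCCCGATGA", to_stop=True))  # MLHSR
```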
Slide Number 15
Acknowledgement |
This video summarizes the Spoken Tutorial project.
If you do not have good bandwidth, you can download and watch it. |
Slide Number 16 | The Spoken Tutorial Project Team conducts workshops and gives certificates for those who pass an online test.
For more details, please write to us. |
Slide Number 17 | Spoken Tutorial Project is funded by NMEICT, MHRD, Government of India.
More information on this Mission is available at this link. |
This is Snehalatha from IIT Bombay signing off. Thank you for joining. |