Difference between revisions of "Biopython/C2/Introduction-to-Biopython/English"

Latest revision as of 10:46, 26 June 2015

Visual Cue
Narration
Slide Number 1

Title Slide

Welcome to this tutorial on Introduction to Biopython
Slide Number 2

Learning Objectives

In this tutorial, we will learn about
  • Important features of Biopython.
  • Information regarding download and installation on Linux Operating System.
  • And translation of a DNA sequence to a protein sequence using Biopython tools.
Slide Number 3

Pre-requisites

To follow this tutorial you should be familiar with,
  • Undergraduate Biochemistry or Bioinformatics
  • And basic Python programming

Refer to the Python tutorials at the given link.

Slide Number 4

System Requirement

To record this tutorial I am using
  • Ubuntu OS version 12.04
  • Python version 2.7.3
  • Ipython version 0.12.1
  • Biopython version 1.58
Slide Number 5.

About Biopython

Biopython is a collection of modules for computational biology.

It can perform most basic to advanced tasks required for bioinformatics.

Slide number 6

Biopython functionality

Biopython tools are used for:

1. Parsing, that is, extracting information from various file formats such as FASTA, GenBank, etc.

2. Downloading data from database websites such as NCBI, ExPASy, etc.

3. Running bioinformatics algorithms such as BLAST.

Slide Number 7

Biopython functionality

4. It has tools for performing common operations on sequences.

For example, obtaining complements, transcription, translation, etc.

5. Code for dealing with alignments.

6. And code to split up tasks into separate processes.

Slide Number 8

Download

Information regarding download.

The Biopython package is not part of the Python distribution.

It needs to be downloaded independently.

For details, refer to the following link:

http://biopython.org/wiki/Download

Slide Number 9

Installation for Ubuntu/Linux systems

Installation on Linux system.
  • Install Python, Ipython and Biopython packages using Synaptic Package Manager.
  • Prerequisite software will be installed automatically.
  • Additional packages must be installed for graphic outputs and plots.
  • Open the terminal by pressing Ctrl, Alt and T keys simultaneously.
Cursor on the terminal

I have already installed Python, IPython and Biopython on my system.

Start the IPython interpreter by typing ipython and pressing Enter.

The IPython prompt appears on screen.

Open the terminal and check installation of Biopython

To check the installation of Biopython, at the prompt type: import Bio

Press Enter.

If you don't get any error message, it means Biopython is installed.
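The same check can also be written as a short script. This is only a minimal sketch: the variable name status is illustrative, and Bio.__version__ is the version string exposed by the Bio package.

```python
# Try to import Biopython and report the result instead of stopping on an error.
try:
    import Bio  # note the uppercase "B"; Python names are case sensitive
    status = "Biopython " + Bio.__version__ + " is installed"
except ImportError:
    status = "Biopython is not installed"

print(status)
```

If the second message is printed, install Biopython first and then repeat the check.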


Here, let me remind you that the Python language is case sensitive.

Take care while typing keywords, variables or functions.

For instance, in the above line “i” in import is lowercase.

And “B” is uppercase in Bio.

Cursor on the terminal.

In this tutorial, we will make use of Biopython modules to translate a DNA sequence.
Slide Number 10

DNA Translation

It involves the following steps.
  1. First, create a sequence object for the coding DNA strand.
  2. Next, transcribe the coding DNA strand to mRNA.
  3. Finally, translate the mRNA to a protein sequence.
Slide Number 11

Sequence Object

We will be using the coding DNA strand shown on this slide, as an example.

It codes for a small protein sequence.

The first step is to create a sequence object for the above coding DNA strand.

Let us go back to the terminal.

Open the terminal

Type:

>>> from Bio.Seq import Seq

To create a sequence object, import the Seq class from the Bio.Seq module.

The Seq class provides methods to store and process sequences.

At the prompt, type: from Bio dot Seq import Seq

Press Enter.

Cursor on the terminal.

Next, specify the alphabet of the strand explicitly when creating your sequence object.

That is, specify whether the letters in the sequence code for nucleotides or amino acids.
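To see why the alphabet has to be stated, the plain-Python sketch below checks which letter set a sequence string fits. The helper fits_alphabet and the two letter sets are assumptions written for this illustration; they are not part of Biopython.

```python
# Letter sets written out by hand for illustration only.
UNAMBIGUOUS_DNA = set("ACGT")
PROTEIN = set("ACDEFGHIKLMNPQRSTVWY")

def fits_alphabet(sequence, letters):
    """Return True if every character of sequence belongs to letters."""
    return len(sequence) > 0 and set(sequence.upper()) <= letters

print(fits_alphabet("ATGTTACACTCCCGATGA", UNAMBIGUOUS_DNA))  # True
print(fits_alphabet("MLHSR", UNAMBIGUOUS_DNA))               # False
print(fits_alphabet("ATGTTACACTCCCGATGA", PROTEIN))          # True
```

The last line shows that a DNA string also passes the protein check, because A, C, G and T are valid amino acid letters too. The letters alone cannot tell the two apart, which is why the alphabet is given explicitly.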

>>> from Bio.Alphabet import IUPAC

To do so, we will use the IUPAC module from the Alphabet package.

At the prompt, type:

from Bio dot Alphabet import IUPAC

Press Enter.

Note that we have used import and from statements to load the Seq class and the IUPAC module.

Type

>>> cdna = Seq("ATGTTACACTCCCGATGA", IUPAC.unambiguous_dna)

Press Enter

>>> cdna

Press Enter

Output:

Seq('ATGTTACACTCCCGATGA', IUPACUnambiguousDNA())

Store the sequence object in a variable called cdna.

At the prompt, type: cdna equal to Seq

As in normal strings, enclose the sequence within double quotes and parentheses.

We know our sequence is a DNA fragment.

So, type the unambiguous DNA alphabet object as an argument.

For the output, type: cdna and press Enter.

The output shows the DNA sequence as a sequence object.

Cursor on the terminal

Let’s transcribe the coding DNA strand into the corresponding mRNA.

We will use the Seq object's built-in “transcribe” method.

Type

>>> mrna = cdna.transcribe()

press enter

Type

mrna press enter

>>> mrna

Seq('AUGUUACACUCCCGAUGA', IUPACUnambiguousRNA())

Type the following code and store the output in a variable called mrna.

At the prompt, type:

mrna equal to cdna dot transcribe open and close parentheses

press Enter.


For the output, type mrna

press Enter.

Highlight the output

Observe the output.

The transcribe method replaces the thymine in the DNA sequence with uracil.
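The replacement described here can be pictured with an ordinary Python string. The function below is a plain-Python sketch of the idea, not Biopython's own code, and the names transcribe, cdna_text and mrna_text are chosen only for this illustration.

```python
def transcribe(coding_dna):
    """Return the mRNA text for a coding DNA strand by swapping T for U."""
    return coding_dna.upper().replace("T", "U")

cdna_text = "ATGTTACACTCCCGATGA"
mrna_text = transcribe(cdna_text)
print(mrna_text)  # AUGUUACACUCCCGAUGA
```

The printed string matches the sequence shown in the output on the terminal.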

Cursor on the terminal

Next, to translate this mRNA to the corresponding protein sequence, use the translate method.

Type

>>> protein = mrna.translate()

press enter

Cursor on the terminal.

Output:

protein

Seq('MLHSR*', HasStopCodon(IUPACProtein(), '*'))

Type the following code:

protein equal to mrna dot translate open and close parentheses

press Enter.


The translate method translates an RNA or DNA sequence using the standard genetic code, unless another genetic code is specified.

Cursor on the terminal.

Output:

protein

Seq('MLHSR*', HasStopCodon(IUPACProtein(), '*'))

The output shows an amino acid sequence.

The output also shows information regarding the presence of stop codons in the translated sequence.

Observe the asterisk at the end of the protein sequence.

It indicates the stop codon.
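To show where the asterisk comes from, here is a plain-Python sketch of codon-by-codon translation with the standard genetic code. It is an illustration only and not how Biopython implements translate; the names BASES, AMINO_ACIDS, CODON_TABLE and translate are assumptions made for this example.

```python
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {}
index = 0
for first in BASES:
    for second in BASES:
        for third in BASES:
            CODON_TABLE[first + second + third] = AMINO_ACIDS[index]
            index += 1

def translate(sequence):
    """Translate an RNA or DNA string codon by codon; '*' marks a stop codon."""
    dna = sequence.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

protein_text = translate("AUGUUACACUCCCGAUGA")
print(protein_text)  # MLHSR*
```

The last codon, UGA, is a stop codon, so the final character of the result is the asterisk.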

Cursor on the terminal.

In the above code, we have used a coding DNA strand for transcription.

In Biopython, the transcribe method works only on the coding DNA strand.

However, in real biological systems, the process of transcription starts with a template strand.

Type:

coding_dna = template_dna.reverse_complement()

If you are starting with a template strand, convert it to the coding strand by using the reverse complement method, as shown on the terminal.
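What the reverse complement does can be sketched in plain Python. This is an illustration, not Biopython's implementation. The template string below is an assumption: it was derived by hand as the reverse complement of the coding strand used in this tutorial, and the names COMPLEMENT, reverse_complement, template_text and coding_text are likewise only illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(dna):
    """Read the strand backwards and swap each base for its partner."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))

template_text = "TCATCGGGAGTGTAACAT"  # assumed template strand, written 5' to 3'
coding_text = reverse_complement(template_text)
print(coding_text)  # ATGTTACACTCCCGATGA
```

Applying the same function to the coding strand gives back the template strand.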
Cursor on the terminal

Follow the rest of the code as shown above, for the coding strand.

Using methods in Biopython, we have translated a DNA sequence to a protein sequence.

A DNA sequence of any size can be translated to a protein using this code.
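The steps typed at the prompt can be collected into one short script, sketched below. It assumes Biopython is installed. The Bio.Alphabet module was removed in Biopython 1.78, so the sketch creates the Seq object without an alphabet when that import fails. The name sequence_text is illustrative; the other variable names follow the narration.

```python
from Bio.Seq import Seq

sequence_text = "ATGTTACACTCCCGATGA"

try:
    # Older releases, such as the Biopython 1.58 used for this recording
    from Bio.Alphabet import IUPAC
    cdna = Seq(sequence_text, IUPAC.unambiguous_dna)
except ImportError:
    # Biopython 1.78 and later no longer provide Bio.Alphabet
    cdna = Seq(sequence_text)

mrna = cdna.transcribe()    # coding DNA strand to mRNA
protein = mrna.translate()  # mRNA to protein, standard genetic code

print(str(mrna))     # AUGUUACACUCCCGAUGA
print(str(protein))  # MLHSR*
```

On newer Biopython versions, typing cdna at the prompt shows the sequence without the alphabet part seen in the output above.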
Slide Number 12

Summary

Let's summarize.

In this tutorial we have learnt

  • Important features of Biopython.
  • Information regarding download and installation on Linux OS.
  • Creating a sequence object for the given DNA strand.
Slide Number 13

Summary

  • Transcription of the DNA sequence to mRNA.
  • Translation of mRNA to protein sequence.
Slide Number 14

Assignment

Now for the assignment.
  • Translate the given DNA sequence into a protein sequence:
  • 'ATGGCCCTATAGTGTCTAAGCTAG'
  • Observe the output.
  • The protein sequence has an internal stop codon.
  • As happens in nature, translate the DNA till the first in-frame stop codon.
Cursor on the terminal.

Your completed assignment should have the following code.

Notice that we have used the 'to underscore stop' argument in the translate method.


Notice the output.

The stop codon itself is not translated.

The stop symbol is not included at the end of your protein sequence.
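One possible way to write the assignment is sketched below; it is not necessarily the exact code shown in the video. It assumes Biopython is installed and leaves out the alphabet argument so that it also runs on newer Biopython versions. The names assignment_dna, full_protein and short_protein are illustrative.

```python
from Bio.Seq import Seq

assignment_dna = Seq("ATGGCCCTATAGTGTCTAAGCTAG")

# Without to_stop, every stop codon appears as '*' in the result.
full_protein = assignment_dna.translate()

# With to_stop=True, translation ends at the first in-frame stop codon.
short_protein = assignment_dna.translate(to_stop=True)

print(str(full_protein))   # MAL*CLS*
print(str(short_protein))  # MAL
```

Comparing the two results shows the internal stop codon after the third amino acid and the effect of the to_stop argument.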

Slide Number 15

Acknowledgement

This video summarizes the Spoken Tutorial project.

If you do not have good bandwidth, you can download and watch it.

Slide Number 16

The Spoken Tutorial Project Team conducts workshops and gives certificates to those who pass an online test.

For more details, please write to us.

Slide Number 17

Spoken Tutorial Project is funded by NMEICT, MHRD, Government of India.

More information on this Mission is available at this link.

This is Snehalatha from IIT Bombay signing off. Thank you for joining.

Contributors and Content Editors

Nancyvarkey, Snehalathak