Difference between revisions of "Biopython/C2/Introduction-to-Biopython/English"
Snehalathak (Talk | contribs) (Created page with " {| style="border-spacing:0;" ! <center>Visual Cue</center> ! <center>Narration</center> |- | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;...") |
Snehalathak (Talk | contribs) |
||
(One intermediate revision by one other user not shown) | |||
Line 25: | Line 25: | ||
| style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| To follow this tutorial you should be familiar with, | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| To follow this tutorial you should be familiar with, | ||
− | + | * Undergraduate Biochemistry or Bioinformatics | |
− | + | * And basic''' Python''' programming | |
− | + | ||
Refer to the '''Python''' tutorials at the given link. | Refer to the '''Python''' tutorials at the given link. | ||
Line 37: | Line 36: | ||
| style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| To record this tutorial I am using | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| To record this tutorial I am using | ||
− | '''Ubuntu''' OS version 12.04 | + | *'''Ubuntu''' OS version 12.04 |
− | + | *'''Python''' version 2.7.3 | |
− | '''Python''' version 2.7.3 | + | *'''Ipython''' version 0.12.1 |
− | + | *'''Biopython''' version 1.58 | |
− | '''Ipython''' version 0.12.1 | + | |
− | + | ||
− | '''Biopython''' 1.58 | + | |
|- | |- | ||
Line 96: | Line 92: | ||
'''Installation for Ubuntu/Linux systems''' | '''Installation for Ubuntu/Linux systems''' | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Installation on ''' | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Installation on '''Linux''' system. |
− | Install '''Python | + | * Install '''Python, Ipython''' and '''Biopython''' packages using '''Synaptic Package Manager'''. |
− | Prerequisite software will be installed automatically. | + | * Prerequisite software will be installed automatically. |
− | Additional packages must be installed for graphic outputs and plots. | + | * Additional packages must be installed for graphic outputs and plots. |
− | Open the terminal by pressing Ctrl, Alt and T keys simultaneously. | + | * Open the terminal by pressing '''Ctrl, Alt''' and '''T''' keys simultaneously. |
|- | |- | ||
| style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Cursor on the terminal | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Cursor on the terminal | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| I have already installed '''Python | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| I have already installed '''Python, Ipython''' and '''Biopython''' on my system. |
− | Start '''Ipython''' | + | Start '''Ipython''' interpreter by typing '''ipython''' and press '''Enter'''. |
'''IPython''' prompt appears on screen. | '''IPython''' prompt appears on screen. | ||
Line 116: | Line 112: | ||
|- | |- | ||
| style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Open the terminal and check installation of biopython | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Open the terminal and check installation of biopython | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| To check the installation of '''Biopython''', | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| To check the installation of '''Biopython''', at the prompt type: '''import Bio''' |
− | + | Press '''Enter'''. | |
− | ''' | + | If you don't get any error message, it means '''Biopython''' is installed. |
− | |||
Here let me remind you, '''Python''' language is case sensitive. | Here let me remind you, '''Python''' language is case sensitive. | ||
Line 128: | Line 123: | ||
Take precaution while typing keywords, variables or functions. | Take precaution while typing keywords, variables or functions. | ||
− | For instance in the above line “i” in '''import''' is | + | For instance, in the above line “i” in '''import''' is lowercase. |
And “B” is uppercase in '''Bio'''. | And “B” is uppercase in '''Bio'''. | ||
Line 134: | Line 129: | ||
|- | |- | ||
| style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Cursor on the terminal. | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Cursor on the terminal. | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| In this tutorial we will make use of '''Biopython''' modules to translate a DNA sequence. | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| In this tutorial, we will make use of '''Biopython''' modules to translate a '''DNA sequence'''. |
|- | |- | ||
Line 142: | Line 137: | ||
| style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| It involves the following steps. | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| It involves the following steps. | ||
− | First create a '''sequence object''' for coding DNA strand. | + | #First create a '''sequence object''' for coding '''DNA''' strand. |
− | + | #Next '''transcription''' of coding '''DNA''' strand to '''mRNA'''. | |
− | Next '''transcription''' of coding DNA strand to mRNA. | + | #Finally''' translation''' of '''mRNA''' to a '''protein''' sequence. |
− | + | ||
− | Finally''' translation''' of mRNA to a protein sequence. | + | |
|- | |- | ||
Line 153: | Line 146: | ||
'''Sequence Object''' | '''Sequence Object''' | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| We will | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| We will be using the coding '''DNA''' strand shown on this slide, as an example. |
− | It codes for a small protein sequence. | + | It codes for a small '''protein''' sequence. |
− | The first step is to create a '''sequence object''' for the above coding DNA strand. | + | The first step is to create a '''sequence object''' for the above coding '''DNA''' strand. |
− | Let us go back to the terminal. | + | Let us go back to the '''terminal'''. |
|- | |- | ||
Line 168: | Line 161: | ||
>>> from Bio.Seq import Seq | >>> from Bio.Seq import Seq | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| For creating a sequence object import the '''Seq''' module from '''Bio '''package. | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| For creating a sequence object, import the '''Seq''' module from '''Bio '''package. |
The''' Seq '''module provides methods to store and process sequence objects. | The''' Seq '''module provides methods to store and process sequence objects. | ||
− | At the prompt type | + | At the prompt, type '''from Bio dot Seq import Seq ''' |
− | ''' | + | press '''Enter'''. |
− | + | ||
− | + | ||
|- | |- | ||
| style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Cursor on the terminal. | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Cursor on the terminal. | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Next specify the alphabets in the strand explicitly when creating your '''sequence object'''. | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Next, specify the alphabets in the strand explicitly, when creating your '''sequence object'''. |
− | That is to specify whether the sequence of alphabets code for nucleotides or amino acids. | + | That is to specify whether the sequence of alphabets code for '''nucleotides''' or '''amino acids'''. |
|- | |- | ||
Line 188: | Line 179: | ||
| style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| To do so we will use '''IUPAC '''module from '''Alphabet '''package. | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| To do so we will use '''IUPAC '''module from '''Alphabet '''package. | ||
− | At the prompt type | + | At the prompt, type: |
'''from Bio dot Alphabet import IUPAC''' | '''from Bio dot Alphabet import IUPAC''' | ||
− | Press | + | Press '''Enter'''. |
− | Note that we have used import and from statements to load Seq and '''IUPAC''' modules. | + | Note that, we have used import and from statements to load Seq and '''IUPAC''' modules. |
|- | |- | ||
Line 210: | Line 201: | ||
| style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Store the sequence object in a variable called '''cdna'''. | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Store the sequence object in a variable called '''cdna'''. | ||
− | At the prompt | + | At the prompt, type: '''cdna equal to Seq''' as in normal strings. |
− | + | Enclose the sequence within double quotes and parentheses. | |
− | |||
− | We know our sequence is a DNA fragment. | + | We know our sequence is a '''DNA''' fragment. |
− | So type '''unambiguous DNA alphabet object''' as an argument. | + | So, type: '''unambiguous DNA alphabet object''' as an argument. |
− | |||
− | '''cdna''' | + | For the output type: '''cdna'''; press '''Enter''' |
− | + | ||
− | press | + | |
The output shows the DNA sequence as a sequence object. | The output shows the DNA sequence as a sequence object. | ||
Line 230: | Line 217: | ||
|- | |- | ||
| style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Cursor on the terminal | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Cursor on the terminal | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Let’s transcribe the coding strand into the corresponding''' mRNA.''' | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Let’s transcribe the coding '''DNA''' strand into the corresponding''' mRNA.''' |
− | We will use the Seq module's built in '''“transcribe”''' method. | + | We will use the Seq module's built-in '''“transcribe”''' method. |
|- | |- | ||
Line 251: | Line 238: | ||
Store the output in a variable '''mrna'''. | Store the output in a variable '''mrna'''. | ||
+ | |||
At the prompt type, | At the prompt type, | ||
Line 256: | Line 244: | ||
'''mrna equal to cdna dot transcribe open and close parentheses''' | '''mrna equal to cdna dot transcribe open and close parentheses''' | ||
− | press | + | press '''Enter'''. |
+ | |||
For the output, type''' mrna''' | For the output, type''' mrna''' | ||
− | press | + | press '''Enter'''. |
|- | |- | ||
| style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Highlight the output | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Highlight the output | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Observe the output | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Observe the output. |
+ | |||
+ | |||
+ | The '''transcribe''' method replaces the '''Thymine''' in the '''DNA''' sequence with '''Uracil'''. | ||
|- | |- | ||
| style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Cursor on the terminal | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Cursor on the terminal | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Next to translate this '''mRNA''' to corresponding protein sequence, use the '''translate''' method. | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Next, to translate this '''mRNA''' to corresponding '''protein''' sequence, use the '''translate''' method. |
|- | |- | ||
Line 285: | Line 277: | ||
Seq('MLHSR*', HasStopCodon(IUPACProtein(), '*')) | Seq('MLHSR*', HasStopCodon(IUPACProtein(), '*')) | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Type the following code | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Type the following code: |
'''protein equal to mrna dot translate open and close parentheses''' | '''protein equal to mrna dot translate open and close parentheses''' | ||
− | press | + | press '''Enter'''. |
+ | |||
− | The translate method translates RNA or DNA sequence using the standard genetic code if unspecified. | + | The translate method translates '''RNA''' or '''DNA''' sequence using the standard genetic code, if unspecified. |
|- | |- | ||
Line 301: | Line 294: | ||
Seq('MLHSR*', HasStopCodon(IUPACProtein(), '*')) | Seq('MLHSR*', HasStopCodon(IUPACProtein(), '*')) | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| The output shows an amino acid sequence. | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| The output shows an '''amino acid''' sequence. |
The output also shows information regarding the presence of '''stop codons''' in the | The output also shows information regarding the presence of '''stop codons''' in the | ||
Line 307: | Line 300: | ||
translated sequence. | translated sequence. | ||
− | Observe the | + | |
+ | Observe the asterisk at the end of the '''protein''' sequence. | ||
It indicates the '''stop codon'''. | It indicates the '''stop codon'''. | ||
Line 314: | Line 308: | ||
| style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Cursor on the terminal. | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Cursor on the terminal. | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| In the above code we have used a coding DNA strand for ''' | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| In the above code, we have used a coding '''DNA''' strand for '''transcription'''. |
+ | |||
+ | In '''Biopython''', the '''transcribe''' method works only on the coding '''DNA''' strand. | ||
− | |||
However in real biological systems the process of '''transcription''' starts with a '''template strand'''. | However in real biological systems the process of '''transcription''' starts with a '''template strand'''. | ||
Line 324: | Line 319: | ||
coding_dna = template_dna.reverse_complement() | coding_dna = template_dna.reverse_complement() | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| If you are starting with a''' template strand''' | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| If you are starting with a''' template strand''', |
− | + | * convert it to coding strand | |
− | + | * by using '''reverse complement method''', as shown on the '''terminal'''. | |
|- | |- | ||
| style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Cursor on the terminal | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Cursor on the terminal | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Follow the rest of the code as shown above for the coding strand. | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Follow the rest of the code as shown above, for the coding strand. |
|- | |- | ||
| style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Cursor on the terminal | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Cursor on the terminal | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Using methods in '''Biopython''' we have translated a DNA sequence to a protein sequence. | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Using methods in '''Biopython''' we have translated a '''DNA''' sequence to a '''protein''' sequence. |
|- | |- | ||
| style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Cursor on the terminal | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:none;padding:0.097cm;"| Cursor on the terminal | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| DNA sequence of any size can be translated to a protein using this code. | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| '''DNA''' sequence of any size can be translated to a '''protein''' using this code. |
|- | |- | ||
Line 344: | Line 339: | ||
'''Summary''' | '''Summary''' | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Let's summarize. |
In this tutorial we have learnt | In this tutorial we have learnt | ||
* Important features of '''Biopython'''. | * Important features of '''Biopython'''. | ||
− | * Information regarding download and installation on Linux OS. | + | * Information regarding download and installation on '''Linux OS'''. |
− | * Create a sequence object for the given DNA strand. | + | * Create a sequence object for the given '''DNA''' strand. |
|- | |- | ||
Line 357: | Line 352: | ||
'''Summary''' | '''Summary''' | ||
| style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| | ||
− | * '''Transcription''' of the DNA sequence to mRNA. | + | * '''Transcription''' of the '''DNA''' sequence to '''mRNA'''. |
− | * '''Translation''' of mRNA to protein sequence. | + | * '''Translation''' of '''mRNA''' to '''protein''' sequence. |
|- | |- | ||
Line 364: | Line 359: | ||
Assignment | Assignment | ||
− | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Now for the assignment | + | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Now for the assignment. |
− | * Translate the given DNA sequence into protein sequence. | + | |
+ | * Translate the given '''DNA''' sequence into '''protein''' sequence. | ||
+ | * ''''ATGGCCCTATAGTGTCTAAGCTAG'''' | ||
* Observe the output. | * Observe the output. | ||
− | * The protein sequence has an internal '''stop codon'''. | + | * The '''protein''' sequence has an internal '''stop codon'''. |
− | * As it happens in nature, translate the DNA till first in frame '''stop codon'''. | + | * As it happens in nature, translate the '''DNA''' till first in frame '''stop codon'''. |
|- | |- | ||
Line 374: | Line 371: | ||
| style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Your completed assignment should have the following code. | | style="background-color:#ffffff;border-top:none;border-bottom:1pt solid #000000;border-left:1pt solid #000000;border-right:1pt solid #000000;padding:0.097cm;"| Your completed assignment should have the following code. | ||
− | Notice that we have used ''''to underscore stop'''' argument in the '''translate method'''. | + | Notice that, we have used ''''to underscore stop'''' argument in the '''translate method'''. |
+ | |||
− | Notice the output | + | Notice the output. |
The '''stop codon''' itself is not translated. | The '''stop codon''' itself is not translated. | ||
− | The stop symbol is not included at the end of your protein sequence. | + | The stop symbol is not included at the end of your '''protein''' sequence. |
|- | |- |
Latest revision as of 10:46, 26 June 2015
Visual Cue | Narration |
---|---|
Slide Number 1
Title Slide |
Welcome to this tutorial on Introduction to Biopython |
Slide Number 2
Learning Objectives |
In this tutorial, we will learn about
|
Slide Number 3
Pre-requisites |
To follow this tutorial you should be familiar with:
* Undergraduate Biochemistry or Bioinformatics
* Basic Python programming
Refer to the Python tutorials at the given link. |
Slide Number 4
System Requirement |
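To record this tutorial I am using:
* Ubuntu OS version 12.04
* Python version 2.7.3
* IPython version 0.12.1
* Biopython version 1.58 |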
Slide Number 5.
About Biopython |
Biopython is a collection of modules for computational biology.
It can perform tasks required for bioinformatics, ranging from the most basic to the advanced. |
Slide number 6
Biopython functionality |
Biopython tools are used for:
1. Parsing, that is, extracting information from various file formats such as FASTA, GenBank etc.
2. Downloading data from database websites such as NCBI, ExPASy etc.
3. Running bioinformatics algorithms such as BLAST. |
Slide Number 7
Biopython functionality |
4. It has tools for performing common operations on sequences, for example to obtain complements, transcription, translation etc.
5. Code for dealing with alignments.
6. And code to split up tasks into separate processes. |
Slide Number 8
Download |
Information regarding download.
The Biopython package is not part of the Python distribution. It needs to be downloaded independently. For details, refer to the following link. |
Slide Number 9
Installation for Ubuntu/Linux systems |
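Installation on a Linux system:
* Install the Python, IPython and Biopython packages using Synaptic Package Manager.
* Prerequisite software will be installed automatically.
* Additional packages must be installed for graphic outputs and plots.
* Open the terminal by pressing the Ctrl, Alt and T keys simultaneously. |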
Cursor on the terminal | I have already installed Python, Ipython and Biopython on my system.
Start the IPython interpreter by typing ipython and pressing Enter. The IPython prompt appears on screen. |
Open the terminal and check installation of biopython | To check the installation of Biopython, at the prompt type: import Bio
Press Enter. If you don't get any error message, it means Biopython is installed.
Here let me remind you, the Python language is case sensitive.
Take care while typing keywords, variables or functions. For instance, in the above line “i” in import is lowercase. And “B” is uppercase in Bio. |
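The installation check above can also be scripted. The sketch below is only an illustration and not part of the recorded session: it relies on the standard library's importlib.util.find_spec, and the helper name is_installed is made up for this example.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate a top-level module with exactly this name."""
    return importlib.util.find_spec(module_name) is not None


# Module names are case sensitive, so "Bio" and "bio" are looked up as different names.
for name in ("Bio", "bio"):
    print(name, "found" if is_installed(name) else "not found")
```

If Bio is reported as not found, typing import Bio at the prompt would raise an error, which is the symptom the narration asks you to watch for.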
Cursor on the terminal. | In this tutorial, we will make use of Biopython modules to translate a DNA sequence. |
Slide Number 10
DNA Translation |
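It involves the following steps.
1. First, create a sequence object for the coding DNA strand.
2. Next, transcription of the coding DNA strand to mRNA.
3. Finally, translation of the mRNA to a protein sequence. |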
Slide Number 11
Sequence Object |
We will be using the coding DNA strand shown on this slide, as an example.
It codes for a small protein sequence. The first step is to create a sequence object for the above coding DNA strand. Let us go back to the terminal. |
Open the terminal
Type: >>> from Bio.Seq import Seq |
To create a sequence object, import the Seq class from the Bio.Seq module.
The Seq module provides methods to store and process sequence objects. At the prompt, type: from Bio dot Seq import Seq. Press Enter. |
Cursor on the terminal. | Next, specify the alphabet of the strand explicitly when creating your sequence object.
That is, specify whether the letters of the sequence code for nucleotides or amino acids. |
>>> from Bio.Alphabet import IUPAC | To do so we will use IUPAC module from Alphabet package.
At the prompt, type: from Bio dot Alphabet import IUPAC. Press Enter. Note that we have used the from and import statements to load Seq and IUPAC. |
Type >>> cdna = Seq("ATGTTACACTCCCGATGA", IUPAC.unambiguous_dna)
Press Enter. Type cdna, press Enter. Output: Seq('ATGTTACACTCCCGATGA', IUPACUnambiguousDNA()) |
Store the sequence object in a variable called cdna.
At the prompt, type: cdna equal to Seq. As in normal strings, enclose the sequence within double quotes, inside the parentheses.
We know our sequence is a DNA fragment.
So, type the unambiguous DNA alphabet object as the second argument.
For the output, type cdna and press Enter.
The output shows the DNA sequence as a sequence object. |
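The unambiguous DNA alphabet contains only the four letters G, A, T and C. As a plain-Python illustration of what that restriction means (this is not how Biopython implements its alphabets, and the names below are invented for the sketch):

```python
# The four letters allowed in an unambiguous DNA sequence.
UNAMBIGUOUS_DNA = set("GATC")


def is_unambiguous_dna(sequence):
    """Return True if every letter of the sequence is one of G, A, T or C."""
    return set(sequence.upper()) <= UNAMBIGUOUS_DNA


print(is_unambiguous_dna("ATGTTACACTCCCGATGA"))  # the coding strand used in this tutorial
print(is_unambiguous_dna("AUGUUACAC"))           # contains U, so it is not DNA
```

Note that the Bio.Alphabet module used in this tutorial belongs to older Biopython releases such as 1.58. From Biopython 1.78 onward it is no longer available, and the sequence object is created from the string alone, as in Seq("ATGTTACACTCCCGATGA").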
Cursor on the terminal | Let’s transcribe the coding DNA strand into the corresponding mRNA.
We will use the Seq object's built-in transcribe method. |
Type
>>> mrna = cdna.transcribe() press Enter. Type mrna, press Enter. >>> mrna Seq('AUGUUACACUCCCGAUGA', IUPACUnambiguousRNA()) |
Type the following code and store the output in a variable mrna.
At the prompt, type: mrna equal to cdna dot transcribe open and close parentheses. Press Enter.
For the output, type mrna and press Enter. |
Highlight the output | Observe the output.
The transcribe method replaces the Thymine in the DNA sequence with Uracil. |
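For a coding strand, transcription amounts to swapping every T for a U. A minimal plain-Python sketch of that idea follows; it works on ordinary strings rather than Biopython sequence objects, and the function name simply mirrors the method used above.

```python
def transcribe(coding_dna):
    """Return the mRNA for a coding DNA strand by replacing every T with U."""
    return coding_dna.upper().replace("T", "U")


cdna = "ATGTTACACTCCCGATGA"
mrna = transcribe(cdna)
print(mrna)  # AUGUUACACUCCCGAUGA
```

The printed string matches the sequence shown inside the Seq object on the terminal.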
Cursor on the terminal | Next, to translate this mRNA to the corresponding protein sequence, use the translate method. |
Type
>>> protein = mrna.translate() press Enter. |
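Type the following code:
protein equal to mrna dot translate open and close parentheses. Press Enter.
The translate method translates an RNA or DNA sequence using the standard genetic code, unless another genetic code is specified. |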
Cursor on the terminal.
Type protein, press Enter. Output: Seq('MLHSR*', HasStopCodon(IUPACProtein(), '*')) |
The output shows an amino acid sequence.
The output also shows information regarding the presence of stop codons in the translated sequence.
Observe the asterisk at the end of the protein sequence. It indicates the stop codon. |
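To see where the asterisk comes from, translation with the standard genetic code can be sketched in plain Python. This is an illustration only, not Biopython's implementation: the table is built from the usual 64-letter listing of the standard code in U, C, A, G order, and all names are chosen for this example.

```python
BASES = "UCAG"
# Standard genetic code, one letter per codon, ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(mrna):
    """Translate an mRNA string codon by codon; stop codons are written as '*'."""
    usable = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[mrna[pos:pos + 3]] for pos in range(0, usable, 3))


print(translate("AUGUUACACUCCCGAUGA"))  # MLHSR*
```

The last codon, UGA, is a stop codon, which is why the protein string ends with the asterisk.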
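Cursor on the terminal. | In the above code, we have used a coding DNA strand for transcription.
In Biopython, the transcribe method works only on the coding DNA strand.
However, in real biological systems, the process of transcription starts with a template strand. |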
Type:
coding_dna = template_dna.reverse_complement() |
If you are starting with a template strand, convert it to the coding strand by using the reverse complement method, as shown on the terminal. |
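The reverse complement step can be illustrated without Biopython as follows. This is a sketch on plain strings, limited to the four unambiguous DNA letters; the dictionary and function names are chosen for this example.

```python
# Watson-Crick pairing for the four unambiguous DNA letters.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Read the strand backwards and swap each base for its pairing partner."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


template_dna = "TCATCGGGAGTGTAACAT"
coding_dna = reverse_complement(template_dna)
print(coding_dna)  # ATGTTACACTCCCGATGA
```

Here the template strand was chosen so that its reverse complement is the coding strand used earlier in this tutorial. Applying the function twice returns the strand you started with.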
Cursor on the terminal | Follow the rest of the code as shown above, for the coding strand. |
Cursor on the terminal | Using methods in Biopython we have translated a DNA sequence to a protein sequence. |
Cursor on the terminal | A DNA sequence of any size can be translated to a protein sequence using this code. |
Slide Number 12
Summary |
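Let's summarize.
In this tutorial we have learnt:
* Important features of Biopython.
* Information regarding download and installation on Linux OS.
* How to create a sequence object for the given DNA strand. |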
Slide Number 13
Summary |
* Transcription of the DNA sequence to mRNA.
* Translation of mRNA to protein sequence. |
Slide Number 14
Assignment |
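Now for the assignment.
* Translate the given DNA sequence into a protein sequence.
* 'ATGGCCCTATAGTGTCTAAGCTAG'
* Observe the output.
* The protein sequence has an internal stop codon.
* As it happens in nature, translate the DNA till the first in-frame stop codon. |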
Cursor on the terminal. | Your completed assignment should have the following code.
Notice that we have used the 'to underscore stop' argument in the translate method.
Notice the output.
The stop codon itself is not translated. The stop symbol is not included at the end of your protein sequence. |
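The effect of the to_stop argument can be sketched in plain Python. This is an illustration of the behaviour described above, not Biopython's own code: the function name translate_dna and the table-building lines are assumptions made for this example, while to_stop mirrors the argument named in the narration.

```python
BASES = "TCAG"
# Standard genetic code, one letter per codon, ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate_dna(coding_dna, to_stop=False):
    """Translate a coding DNA string; with to_stop=True, end before the first in-frame stop."""
    protein = []
    usable = len(coding_dna) - len(coding_dna) % 3  # ignore a trailing partial codon
    for pos in range(0, usable, 3):
        amino_acid = CODON_TABLE[coding_dna[pos:pos + 3]]
        if to_stop and amino_acid == "*":
            break  # the stop codon itself is not added to the protein
        protein.append(amino_acid)
    return "".join(protein)


sequence = "ATGGCCCTATAGTGTCTAAGCTAG"
print(translate_dna(sequence))                # MAL*CLS*
print(translate_dna(sequence, to_stop=True))  # MAL
```

The first call shows the internal stop codon mentioned in the assignment. The second call ends the protein just before it, with no stop symbol at the end.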
Slide Number 15
Acknowledgement |
This video summarizes the Spoken Tutorial project.
If you do not have good bandwidth, you can download and watch it. |
Slide Number 16 | The Spoken Tutorial Project Team conducts workshops and gives certificates to those who pass an online test.
For more details, please write to us. |
Slide number 17 | Spoken Tutorial Project is funded by NMEICT, MHRD, Government of India.
More information on this Mission is available at this link. |
This is Snehalatha from IIT Bombay signing off. Thank you for joining. |