Difference between revisions of "Biopython/C2/Introduction-to-Biopython/English-timed"
PoojaMoolya (Talk | contribs) (Created page with " {| Border=1 ! <center>Time</center> ! <center>Narration</center> |- | 00:01 |Welcome to this tutorial on '''Introduction to Biopython''' |- | 00:05 |In this tutorial, we w...") |
Sandhya.np14 (Talk | contribs) |
Revision as of 17:51, 1 August 2016
| Time | Narration |
|---|---|
| 00:01 | Welcome to this tutorial on Introduction to Biopython. |
| 00:05 | In this tutorial, we will learn about: * Important features of Biopython |
| 00:10 | * Information regarding download and installation on Linux Operating System |
| 00:15 | * And, translation of a DNA sequence to a protein sequence using Biopython tools. |
| 00:22 | To follow this tutorial, you should be familiar with: |
| 00:25 | * Undergraduate Biochemistry or Bioinformatics |
| 00:29 | * And basic Python programming. |
| 00:31 | Refer to the Python tutorials at the given link. |
| 00:35 | To record this tutorial, I am using: * Ubuntu OS version 12.04 |
| 00:41 | * Python version 2.7.3 |
| 00:44 | * IPython version 0.12.1 and |
| 00:48 | * Biopython version 1.58. |
| 00:51 | Biopython is a collection of modules for computational biology. |
| 00:57 | It can perform most basic to advanced tasks required for bioinformatics. |
| 01:03 | Biopython tools are used for: |
| 01:05 | * Parsing, that is, extracting information from various file formats such as FASTA, GenBank etc. |
| 01:14 | * Downloading data from database websites such as NCBI, ExPASy etc. |
| 01:22 | * Running bioinformatics algorithms such as BLAST. |
| 01:26 | It has tools for performing common operations on sequences. |
| 01:31 | For example, to obtain complements, transcription, translation etc. |
| 01:38 | Code for dealing with alignments |
| 01:40 | and code to split up tasks into separate processes. |
| 01:46 | Information regarding download: |
| 01:48 | The Biopython package is not part of the Python distribution; it needs to be downloaded independently. |
| 01:54 | For details, refer to the following link. |
| 01:59 | Installation on Linux system: |
| 02:02 | Install the Python, IPython and Biopython packages using the Synaptic Package Manager. |
| 02:08 | Prerequisite software will be installed automatically. |
| 02:13 | Additional packages must be installed for graphic outputs and plots. |
| 02:18 | Open the terminal by pressing Ctrl, Alt and T keys simultaneously. |
| 02:24 | I have already installed Python, IPython and Biopython on my system. |
| 02:30 | Start the IPython interpreter by typing ipython and pressing Enter. |
| 02:35 | The IPython prompt appears on the screen. |
| 02:38 | To check the installation of Biopython, at the prompt, type: import Bio and press Enter. |
| 02:48 | If you don't get any error message, it means Biopython is installed. |
| 02:54 | Here, let me remind you, Python language is case sensitive. |
| 02:59 | Take care while typing keywords, variables or functions. |
| 03:04 | For instance, in the above line, the “i” in import is lowercase and the “B” in Bio is uppercase. |
| 03:12 | In this tutorial, we will make use of Biopython modules to translate a DNA sequence. |
| 03:19 | It involves the following steps. |
| 03:22 | First, create a sequence object for coding DNA strand. |
| 03:27 | Next, transcription of coding DNA strand to mRNA. |
| 03:32 | Finally, translation of mRNA to a protein sequence. |
| 03:37 | We will be using the coding DNA strand shown on this slide, as an example. |
| 03:42 | It codes for a small protein sequence. |
| 03:46 | The first step is to create a sequence object for the above coding DNA strand. |
| 03:52 | Let us go back to the terminal. |
| 03:55 | For creating a sequence object, import the Seq class from the Seq module of the Bio package. |
| 04:02 | The Seq module provides methods to store and process sequence objects. |
| 04:08 | At the prompt, type: from Bio dot Seq import Seq and press Enter. |
| 04:17 | Next, specify the alphabet of the strand explicitly when creating your sequence object. |
| 04:24 | That is, specify whether the letters in the sequence code for nucleotides or amino acids. |
| 04:32 | To do so, we will use the IUPAC module from the Alphabet package. |
| 04:38 | At the prompt, type: from Bio dot Alphabet import IUPAC. Press Enter. |
| 04:48 | Note that we have used the from and import statements to load Seq and the IUPAC module. |
| 04:56 | Store the sequence object in a variable called cdna. |
| 05:01 | At the prompt, type: cdna equal to Seq. |
| 05:08 | As in normal strings, enclose the sequence within double quotes, and place it within parentheses. |
| 05:13 | We know our sequence is a DNA fragment. So, type the unambiguous DNA alphabet object as the second argument. |
| 05:21 | For the output, type: cdna; press Enter. |
| 05:26 | The output shows the DNA sequence as a sequence object. |
| 05:30 | Let’s transcribe the coding DNA strand into the corresponding mRNA. |
| 05:35 | We will use the Seq module's built-in “transcribe” method. |
| 05:39 | Type the following code: |
| 05:41 | Store the output in a variable mrna. |
| 05:45 | At the prompt, type: mrna equal to cdna dot transcribe open and close parentheses, press Enter. |
| 05:55 | For the output, type: mrna. Press Enter. |
| 06:01 | Observe the output. |
| 06:02 | The transcribe method replaces the thymine in the DNA sequence with uracil. |
| 06:09 | Next, to translate this mRNA to the corresponding protein sequence, use the translate method. |
| 06:16 | Type the following code: protein equal to mrna dot translate open and close parentheses. Press Enter. |
| 06:27 | The translate method translates an RNA or DNA sequence using the standard genetic code, unless another genetic code is specified. |
| 06:36 | The output shows an amino acid sequence. |
| 06:40 | The output also shows information regarding the presence of stop codons in the translated sequence. |
| 06:47 | Observe the asterisk at the end of the protein sequence. It indicates the stop codon. |
| 06:53 | In the above code, we have used a coding DNA strand for transcription. |
| 06:59 | In Biopython, the transcribe method works only on the coding DNA strand. |
| 07:04 | However, in real biological systems, the process of transcription starts with a template strand. |
| 07:11 | If you are starting with a template strand, convert it to the coding strand by using the reverse complement method, as shown on the terminal. |
| 07:20 | Follow the rest of the code as shown above, for the coding strand. |
| 07:24 | Using methods in Biopython, we have translated a DNA sequence to a protein sequence. |
| 07:31 | A DNA sequence of any size can be translated to a protein sequence using this code. |
| 07:37 | Let's summarize. |
| 07:38 | In this tutorial, we have learnt: |
| 07:41 | * Important features of Biopython. |
| 07:43 | * Information regarding download and installation on Linux OS. |
| 07:48 | * Creating a sequence object for the given DNA strand. |
| 07:52 | * Transcription of the DNA sequence to mRNA. |
| 07:56 | * Translation of mRNA to protein sequence. |
| 08:00 | Now for the assignment: |
| 08:02 | Translate the given DNA sequence into protein sequence. |
| 08:06 | Observe the output. |
| 08:08 | The protein sequence has an internal stop codon. |
| 08:11 | As it happens in nature, translate the DNA up to the first in-frame stop codon. |
| 08:17 | Your completed assignment should have the following code. |
| 08:20 | Notice that we have used the 'to underscore stop' argument in the translate method. Notice the output. |
| 08:27 | The stop codon itself is not translated. |
| 08:31 | The stop symbol is not included at the end of your protein sequence. |
| 08:36 | This video summarizes the Spoken Tutorial project. |
| 08:39 | If you do not have good bandwidth, you can download and watch it. |
| 08:43 | The Spoken Tutorial Project team conducts workshops and gives certificates to those who pass an online test. |
| 08:50 | For more details, please write to us. |
| 08:53 | Spoken Tutorial Project is funded by NMEICT, MHRD, Government of India. |
| 08:59 | More information on this mission is available at this link. |
| 09:03 | This is Snehalatha from IIT Bombay, signing off. Thank you for joining. |
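
The steps narrated from 05:30 to 08:31 (transcription, translation, conversion of a template strand, and stopping at the first in-frame stop codon) are spoken rather than shown. The following is a minimal plain-Python sketch of what those steps compute. It is not Biopython's implementation and does not need Biopython installed. The function names mirror the methods named in the narration, the `to_stop` parameter mirrors the "to underscore stop" argument, and the DNA sequence is a hypothetical one chosen for illustration, not the one on the tutorial's slide.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Coding DNA strand -> mRNA: thymine (T) is replaced with uracil (U)."""
    return coding_dna.upper().replace("T", "U")


def reverse_complement(dna):
    """Template strand -> coding strand (and vice versa)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(dna.upper()))


def translate(sequence, to_stop=False):
    """Translate mRNA or DNA codon by codon; '*' marks a stop codon.

    With to_stop=True, translation ends before the first in-frame stop
    codon, so no '*' appears in the result. Only the unambiguous letters
    A, C, G, T/U are handled in this sketch.
    """
    dna = sequence.upper().replace("U", "T")
    protein = []
    for start in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[start:start + 3]]
        if amino_acid == "*" and to_stop:
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical coding strand with an internal stop codon (illustration only).
cdna = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

mrna = transcribe(cdna)
print(mrna)                           # AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG
print(translate(mrna))                # MAIVMGR*KGAR*
print(translate(mrna, to_stop=True))  # MAIVMGR

# Starting from a template strand: recover the coding strand first.
template = reverse_complement(cdna)
print(reverse_complement(template) == cdna)  # True
```

With Biopython installed, the narration's own calls on a `Seq` object (`transcribe()`, `translate()`, the reverse complement method, and `translate` with the `to_stop` argument) play the same roles. The tutorial was recorded with Biopython 1.58; the `Bio.Alphabet` module it imports was removed in Biopython 1.78, so on current versions the alphabet argument is omitted when creating the sequence object.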