Difference between revisions of "Biopython/C2/Introduction-to-Biopython/English-timed"

(Created page with " {| Border=1 ! <center>Time</center> ! <center>Narration</center> |- | 00:01 |Welcome to this tutorial on '''Introduction to Biopython''' |- | 00:05 |In this tutorial, we w...")
 
Line 6: Line 6:
 
|-
 
|-
 
|  00:01
 
|  00:01
|Welcome to this tutorial on '''Introduction to Biopython'''
+
|Welcome to this tutorial on '''Introduction to Biopython'''.
  
 
|-
 
|-
 
| 00:05
 
| 00:05
|In this tutorial, we will learn about important features of '''Biopython'''.
+
|In this tutorial, we will learn about: *important features of '''Biopython'''
  
 
|-
 
|-
 
| 00:10
 
| 00:10
|Information regarding download and installation on Linux Operating System.
+
|* Information regarding download and installation on Linux Operating System
  
 
|-
 
|-
 
| 00:15
 
| 00:15
|And '''translation''' of a DNA sequence to a protein sequence using '''Biopython''' tools.
+
|* And, '''translation''' of a DNA sequence to a protein sequence using '''Biopython''' tools.
  
 
|-
 
|-
 
| 00:22
 
| 00:22
|To follow this tutorial you should be familiar with,
+
|To follow this tutorial, you should be familiar with-
  
 
|-
 
|-
 
| 00:25
 
| 00:25
|Undergraduate Biochemistry or Bioinformatics
+
|* Undergraduate Biochemistry or Bioinformatics
  
 
|-
 
|-
 
| 00:29
 
| 00:29
|And basic''' Python''' programming  
+
|* And basic''' Python''' programming.
  
 
|-
 
|-
Line 38: Line 38:
 
|-
 
|-
 
| 00:35
 
| 00:35
|To record this tutorial I am using '''Ubuntu''' OS version 12.04
+
|To record this tutorial, I am using: * '''Ubuntu OS''' version 12.04
  
 
|-
 
|-
 
| 00:41
 
| 00:41
|'''Python''' version 2.7.3
+
|* '''Python''' version 2.7.3
  
 
|-
 
|-
 
| 00:44
 
| 00:44
|'''Ipython''' version 0.12.1 and
+
|* '''Ipython''' version 0.12.1 and
  
 
|-
 
|-
 
| 00:48
 
| 00:48
|'''Biopython''' version 1.58
+
|* '''Biopython''' version 1.58.
 
|-
 
|-
 
|  00:51
 
|  00:51
Line 65: Line 65:
 
|-
 
|-
 
|  01:05
 
|  01:05
| '''Parsing''' that is extracting information from various file formats such as '''FASTA''', '''Genbank''' etc.
+
|* '''Parsing''', that is extracting information from various file formats such as '''FASTA''', '''Genbank''' etc.
  
 
|-
 
|-
 
|  01:14
 
|  01:14
| Download data from database websites such as '''NCBI''', '''ExPASY''' etc
+
|* Download data from database websites such as '''NCBI''', '''ExPASY''' etc.
  
 
|-
 
|-
 
|  01:22
 
|  01:22
| Run '''Bioinformatic''' algorithms such as '''BLAST'''
+
|* '''Run''' '''Bioinformatic algorithm'''s such as '''BLAST'''.
  
 
|-
 
|-
Line 81: Line 81:
 
|-
 
|-
 
| 01:31
 
| 01:31
| For example to obtain '''complements''', '''transcription''',''' translation''' etc.  
+
| For example- to obtain '''complements''', '''transcription''',''' translation''' etc.  
  
 
|-
 
|-
 
| 01:38
 
| 01:38
| Code for dealing with alignments.
+
| Code for dealing with alignments  
  
 
|-
 
|-
 
|  01:40
 
|  01:40
| And code to split up tasks into separate processes.  
+
| and code to split up tasks into separate processes.  
  
 
|-
 
|-
 
| 01:46
 
| 01:46
|Information regarding download.
+
|Information regarding download:
  
 
|-
 
|-
Line 101: Line 101:
 
|-
 
|-
 
| 01:54
 
| 01:54
| For details refer the following link
+
| For details, refer the following link.
  
 
|-
 
|-
 
| 01:59
 
| 01:59
|Installation on '''Linux''' system.
+
|Installation on '''Linux''' system:
  
 
|-
 
|-
Line 117: Line 117:
 
|-
 
|-
 
| 02:13
 
| 02:13
| Additional packages must be installed for graphic outputs and plots.  
+
| Additional packages must be installed for '''graphic output'''s and '''plot'''s.  
  
 
|-
 
|-
 
| 02:18
 
| 02:18
| Open the terminal by pressing '''Ctrl, Alt''' and '''T''' keys simultaneously.
+
| Open the '''terminal''' by pressing '''Ctrl, Alt''' and '''T''' keys simultaneously.
  
 
|-
 
|-
Line 137: Line 137:
 
|-
 
|-
 
| 02:38
 
| 02:38
|To check the installation of '''Biopython''', at the prompt type: '''import Bio'''  press '''Enter'''.
+
|To check the installation of '''Biopython'''- at the prompt, type: '''import Bio'''  press '''Enter'''.
  
 
|-
 
|-
Line 145: Line 145:
 
|-
 
|-
 
| 02:54
 
| 02:54
| Here let me remind you, '''Python''' language is case sensitive.
+
| Here, let me remind you, '''Python''' language is case sensitive.
  
 
|-
 
|-
 
| 02:59
 
| 02:59
| Take precaution while typing keywords, variables or functions.
+
| Take precaution while typing keywords, variables or '''function'''s.
  
 
|-
 
|-
Line 165: Line 165:
 
|-
 
|-
 
| 03:22
 
| 03:22
|First create a '''sequence object''' for coding '''DNA''' strand.
+
|First, create a '''sequence object''' for coding '''DNA''' strand.
  
 
|-
 
|-
 
| 03:27
 
| 03:27
|Next '''transcription''' of coding '''DNA''' strand to '''mRNA'''.
+
|Next, '''transcription''' of coding '''DNA''' strand to '''mRNA'''.
  
 
|-
 
|-
 
| 03:32
 
| 03:32
| Finally''' translation''' of '''mRNA''' to a '''protein''' sequence.
+
| Finally,''' translation''' of '''mRNA''' to a '''protein''' sequence.
  
 
|-
 
|-
Line 201: Line 201:
 
|-
 
|-
 
| 04:08
 
| 04:08
| At the prompt, type '''from Bio dot Seq import Seq ''' press '''Enter'''.
+
| At the prompt, type: '''from Bio dot Seq import Seq ''' press '''Enter'''.
  
 
|-
 
|-
Line 213: Line 213:
 
|-
 
|-
 
| 04:32
 
| 04:32
|To do so we will use '''IUPAC '''module from '''Alphabet '''package.  
+
|To do so, we will use '''IUPAC '''module from '''Alphabet '''package.  
  
 
|-
 
|-
 
| 04:38
 
| 04:38
| At the prompt, type:'''from Bio dot Alphabet import IUPAC'''.  Press '''Enter'''.
+
| At the prompt, type: '''from Bio dot Alphabet import IUPAC'''.  Press '''Enter'''.
  
 
|-
 
|-
 
| 04:48
 
| 04:48
| Note that, we have used import and from statements to load Seq and '''IUPAC''' modules.
+
| Note that, we have used '''import''' and '''from''' statements to '''load''' "Seq" and '''IUPAC''' modules.
  
 
|-
 
|-
Line 229: Line 229:
 
|-
 
|-
 
| 05:01
 
| 05:01
| At the prompt, type: '''cdna equal to Seq''' as in normal strings.
+
| At the prompt, type: '''cdna equal to Seq''' as in normal '''string'''s.
  
 
|-
 
|-
Line 237: Line 237:
 
|-
 
|-
 
|  05:13
 
|  05:13
| We know our sequence is a '''DNA''' fragment. So, type: '''unambiguous DNA alphabet object''' as an argument.
+
| We know our sequence is a '''DNA''' fragment. So, type: '''unambiguous DNA alphabet object''' as an '''argument'''.
  
 
|-
 
|-
 
| 05:21
 
| 05:21
| For the output type: '''cdna''';  press '''Enter'''
+
| For the output, type: '''cdna''';  press '''Enter'''.
  
 
|-
 
|-
 
| 05:26
 
| 05:26
| The output shows the DNA sequence as a sequence object.
+
| The output shows the '''DNA sequence''' as a sequence object.
  
 
|-
 
|-
Line 253: Line 253:
 
|-
 
|-
 
| 05:35  
 
| 05:35  
| We will use the Seq module's built-in '''“transcribe”''' method.
+
| We will use the '''Seq''' module's built-in '''“transcribe”''' method.
  
 
|-
 
|-
Line 265: Line 265:
 
|-
 
|-
 
|  05:45
 
|  05:45
| At the prompt type,'''mrna equal to cdna dot transcribe open and close parentheses''' press '''Enter'''.
+
| At the prompt, type: '''mrna equal to cdna dot transcribe open and close parentheses''', press '''Enter'''.
  
 
|-
 
|-
 
|  05:55
 
|  05:55
| For the output, type''' mrna.''' Press '''Enter'''.
+
| For the output, type: ''' mrna.''' Press '''Enter'''.
  
 
|-
 
|-
Line 281: Line 281:
 
|-
 
|-
 
| 06:09
 
| 06:09
|Next, to translate this '''mRNA''' to corresponding '''protein''' sequence, use the '''translate''' method.
+
|Next, to '''translate''' this '''mRNA''' to corresponding '''protein''' sequence, use the '''translate''' method.
  
 
|-
 
|-
Line 289: Line 289:
 
|-
 
|-
 
|  06:27
 
|  06:27
| The translate method translates '''RNA''' or '''DNA''' sequence using the standard genetic code, if unspecified.
+
| The '''translate''' method translates '''RNA''' or '''DNA''' sequence using the standard genetic code, if unspecified.
  
 
|-
 
|-
Line 301: Line 301:
 
|-
 
|-
 
| 06:47
 
| 06:47
| Observe the asterix at the end of the '''protein''' sequence. It indicates the '''stop codon'''.  
+
| Observe the asterisk at the end of the '''protein''' sequence. It indicates the '''stop codon'''.  
  
 
|-
 
|-
Line 313: Line 313:
 
|-
 
|-
 
| 07:04
 
| 07:04
|However in real biological systems the process of '''transcription''' starts with a '''template strand'''.
+
|However, in real biological systems, the process of '''transcription''' starts with a '''template strand'''.
  
 
|-
 
|-
 
| 07:11
 
| 07:11
|If you are starting with a''' template strand''', convert it to coding strand by using '''reverse complement method''', as shown on the '''terminal'''.
+
|If you are starting with a''' template strand''', convert it to coding strand by using '''reverse complement method''', as shown on the terminal.
  
 
|-
 
|-
Line 337: Line 337:
 
|-
 
|-
 
|  07:38
 
|  07:38
|In this tutorial we have learnt
+
|In this tutorial, we have learnt:
  
 
|-
 
|-
 
| 07:41
 
| 07:41
|Important features of '''Biopython'''.
+
|* Important features of '''Biopython'''.
  
 
|-
 
|-
 
| 07:43
 
| 07:43
|Information regarding download and installation on '''Linux OS'''.
+
|* Information regarding download and installation on '''Linux OS'''.
  
 
|-
 
|-
 
| 07:48
 
| 07:48
|Create a sequence object for the given '''DNA''' strand.
+
|* Create a sequence object for the given '''DNA''' strand.
  
 
|-
 
|-
 
| 07:52
 
| 07:52
|'''Transcription''' of the '''DNA''' sequence to '''mRNA'''.
+
|* '''Transcription''' of the '''DNA''' sequence to '''mRNA'''.
  
 
|-
 
|-
 
| 07:56
 
| 07:56
|'''Translation''' of '''mRNA''' to '''protein''' sequence.
+
|* '''Translation''' of '''mRNA''' to '''protein''' sequence.
  
 
|-
 
|-
 
| 08:00
 
| 08:00
|Now for the assignment.
+
|Now for the assignment-
  
 
|-
 
|-
Line 385: Line 385:
 
|-
 
|-
 
| 08:20
 
| 08:20
|Notice that, we have used ''''to underscore stop'''' argument in the '''translate method.''' Notice the output.
+
|Notice that we have used ''''to underscore stop'''' argument in the '''translate method.''' Notice the output.
 
|-
 
|-
 
| 08:27
 
| 08:27
Line 404: Line 404:
 
|-
 
|-
 
|08:43   
 
|08:43   
|The Spoken Tutorial Project Team conducts workshops and gives certificates for those who pass an online test.  
+
|The Spoken Tutorial Project team conducts workshops and gives certificates for those who pass an online test.  
  
 
|-
 
|-
Line 416: Line 416:
 
|-
 
|-
 
| 08:59  
 
| 08:59  
|More information on this Mission is available at this link.  
+
|More information on this mission is available at this link.  
  
 
|-
 
|-
 
| 09:03
 
| 09:03
|This is Snehalatha from IIT Bombay signing off. Thank you for joining.  
+
|This is Snehalatha from '''IIT Bombay''', signing off. Thank you for joining.  
  
 
|}
 
|}

Revision as of 17:51, 1 August 2016

Time
Narration
00:01 Welcome to this tutorial on Introduction to Biopython.
00:05 In this tutorial, we will learn about: * Important features of Biopython
00:10 * Information regarding download and installation on Linux Operating System
00:15 * And, translation of a DNA sequence to a protein sequence using Biopython tools.
00:22 To follow this tutorial, you should be familiar with-
00:25 * Undergraduate Biochemistry or Bioinformatics
00:29 * And basic Python programming.
00:31 Refer to the Python tutorials at the given link.
00:35 To record this tutorial, I am using: * Ubuntu OS version 12.04
00:41 * Python version 2.7.3
00:44 * IPython version 0.12.1 and
00:48 * Biopython version 1.58.
00:51 Biopython is a collection of modules for computational biology.
00:57 It can perform most basic to advanced tasks required for bioinformatics.
01:03 Biopython tools are used for:
01:05 * Parsing, that is, extracting information from various file formats such as FASTA, GenBank etc.
01:14 * Downloading data from database websites such as NCBI, ExPASy etc.
01:22 * Running bioinformatics algorithms such as BLAST.
01:26 It has tools for performing common operations on sequences.
01:31 For example, to obtain complements, transcription, translation etc.
01:38 Code for dealing with alignments
01:40 and code to split up tasks into separate processes.
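
As an illustration of the parsing feature mentioned above, here is a minimal sketch using the Bio.SeqIO module. The two FASTA records are hypothetical and are read from an in-memory string rather than a real file. The import is guarded so that the snippet does nothing if Biopython is missing.

```python
from io import StringIO

try:
    from Bio import SeqIO
except ImportError:  # Biopython is not installed
    SeqIO = None

# Hypothetical two-record FASTA text, used here instead of a real file.
fasta_text = ">seq1 example record\nATGGCCATTG\n>seq2 another record\nTTGACA\n"

records = []
if SeqIO is not None:
    # SeqIO.parse yields one record object per FASTA entry.
    for record in SeqIO.parse(StringIO(fasta_text), "fasta"):
        records.append((record.id, str(record.seq)))
        print(record.id, len(record.seq))
```

Each record carries an identifier (the first word of the header line) and the sequence itself.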
01:46 Information regarding download:
01:48 The Biopython package is not part of the Python distribution; it needs to be downloaded separately.
01:54 For details, refer to the following link.
01:59 Installation on Linux system:
02:02 Install the Python, IPython and Biopython packages using Synaptic Package Manager.
02:08 Prerequisite software will be installed automatically.
02:13 Additional packages must be installed for graphic outputs and plots.
02:18 Open the terminal by pressing Ctrl, Alt and T keys simultaneously.
02:24 I have already installed Python, IPython and Biopython on my system.
02:30 Start the IPython interpreter by typing ipython and pressing Enter.
02:35 The IPython prompt appears on the screen.
02:38 To check the installation of Biopython, at the prompt, type: import Bio and press Enter.
02:48 If you don't get any error message, it means Biopython is installed.
02:54 Here, let me remind you that the Python language is case sensitive.
02:59 Take care while typing keywords, variables or functions.
03:04 For instance, in the above line “i” in import is lowercase and “B” is uppercase in Bio.
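
The installation check described above can also be written as a small script. This is only a sketch: it reports the installed version if the import succeeds and prints a message otherwise. The variable name biopython_version is our own choice.

```python
try:
    import Bio  # lowercase "import", uppercase "B" in "Bio"
    biopython_version = Bio.__version__
    print("Biopython is installed, version", biopython_version)
except ImportError:
    # Raised when the Bio package cannot be found.
    biopython_version = None
    print("Biopython is not installed")
```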
03:12 In this tutorial, we will make use of Biopython modules to translate a DNA sequence.
03:19 It involves the following steps.
03:22 First, create a sequence object for coding DNA strand.
03:27 Next, transcription of coding DNA strand to mRNA.
03:32 Finally, translation of mRNA to a protein sequence.
03:37 We will be using the coding DNA strand shown on this slide, as an example.
03:42 It codes for a small protein sequence.
03:46 The first step is to create a sequence object for the above coding DNA strand.
03:52 Let us go back to the terminal.
03:55 For creating a sequence object, import the Seq module from Bio package.
04:02 The Seq module provides methods to store and process sequence objects.
04:08 At the prompt, type: from Bio dot Seq import Seq press Enter.
04:17 Next, specify the alphabet of the strand explicitly when creating your sequence object.
04:24 That is, specify whether the letters in the sequence code for nucleotides or amino acids.
04:32 To do so, we will use the IUPAC module from the Alphabet package.
04:38 At the prompt, type: from Bio dot Alphabet import IUPAC. Press Enter.
04:48 Note that we have used import and from statements to load the Seq and IUPAC modules.
04:56 Store the sequence object in a variable called cdna.
05:01 At the prompt, type: cdna equal to Seq as in normal strings.
05:08 Enclose the sequence within double quotes and parentheses.
05:13 We know our sequence is a DNA fragment. So, type: unambiguous DNA alphabet object as an argument.
05:21 For the output, type: cdna; press Enter.
05:26 The output shows the DNA sequence as a sequence object.
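
The steps narrated above can be sketched as follows. The sequence below is a hypothetical coding strand, not the one shown on the slide. One caveat: the Bio.Alphabet package used in this tutorial (Biopython 1.58) is no longer shipped with newer Biopython releases, so the sketch falls back to creating the Seq object without an alphabet when that import fails.

```python
try:
    from Bio.Seq import Seq
except ImportError:  # Biopython is not installed
    Seq = None

try:
    from Bio.Alphabet import IUPAC  # present in older releases such as 1.58
except ImportError:
    IUPAC = None  # newer releases no longer provide Bio.Alphabet

# Hypothetical coding DNA strand used only for illustration.
coding_dna = "ATGGCCATTGTAATGGGCCGCTGA"

cdna = None
if Seq is not None:
    if IUPAC is not None:
        # Older style: pass the unambiguous DNA alphabet as an argument.
        cdna = Seq(coding_dna, IUPAC.unambiguous_dna)
    else:
        # Newer style: the alphabet argument is simply omitted.
        cdna = Seq(coding_dna)
    print(repr(cdna))
```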
05:30 Let’s transcribe the coding DNA strand into the corresponding mRNA.
05:35 We will use the Seq module's built-in “transcribe” method.
05:39 Type the following code:
05:41 Store the output in a variable mrna.
05:45 At the prompt, type: mrna equal to cdna dot transcribe open and close parentheses, press Enter.
05:55 For the output, type: mrna. Press Enter.
06:01 Observe the output.
06:02 The transcribe method replaces the Thymine in the DNA sequence by Uracil.
06:09 Next, to translate this mRNA to the corresponding protein sequence, use the translate method.
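
A minimal sketch of the transcription step, again with a hypothetical coding strand. For comparison, it also builds the same result with a plain string replacement of T by U, which is what transcription of a coding strand amounts to.

```python
try:
    from Bio.Seq import Seq
except ImportError:  # Biopython is not installed
    Seq = None

# Hypothetical coding DNA strand used only for illustration.
coding_dna = "ATGGCCATTGTAATGGGCCGCTGA"

# Plain-string equivalent: every T becomes U.
expected_mrna = coding_dna.replace("T", "U")

if Seq is not None:
    cdna = Seq(coding_dna)
    mrna = cdna.transcribe()
    print(mrna)
    print(str(mrna) == expected_mrna)
```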
06:16 Type the following code: protein equal to mrna dot translate open and close parentheses. Press Enter.
06:27 The translate method translates an RNA or DNA sequence using the standard genetic code, unless another genetic code is specified.
06:36 The output shows an amino acid sequence.
06:40 The output also shows information regarding the presence of stop codons in the translated sequence.
06:47 Observe the asterisk at the end of the protein sequence. It indicates the stop codon.
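
The translation step can be sketched in the same way. The mRNA below is hypothetical. The second call passes a table name to show how a non-standard genetic code could be selected. Here the name "Vertebrate Mitochondrial" is used, in which UGA codes for tryptophan instead of a stop.

```python
try:
    from Bio.Seq import Seq
except ImportError:  # Biopython is not installed
    Seq = None

if Seq is not None:
    # Hypothetical mRNA used only for illustration.
    mrna = Seq("AUGGCCAUUGUAAUGGGCCGCUGA")

    # Standard genetic code is used when no table is given.
    protein = mrna.translate()
    print(protein)  # the trailing "*" marks the stop codon

    # Same mRNA translated with a different genetic code.
    mito_protein = mrna.translate(table="Vertebrate Mitochondrial")
    print(mito_protein)
```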
06:53 In the above code, we have used a coding DNA strand for transcription.
06:59 In Biopython, the transcribe method works only on a coding DNA strand.
07:04 However, in real biological systems, the process of transcription starts with a template strand.
07:11 If you are starting with a template strand, convert it to the coding strand by using the reverse complement method, as shown on the terminal.
07:20 Follow the rest of the code as shown above, for the coding strand.
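
Starting from a template strand, the conversion could look like the sketch below. The template sequence is hypothetical (it is not the one typed on the terminal in the video) and is written in the 5' to 3' direction.

```python
try:
    from Bio.Seq import Seq
except ImportError:  # Biopython is not installed
    Seq = None

if Seq is not None:
    # Hypothetical template strand used only for illustration.
    template = Seq("TCAGCGGCCCATTACAATGGCCAT")

    # The reverse complement of the template is the coding strand.
    cdna = template.reverse_complement()
    print(cdna)

    # From here on, the steps are the same as for a coding strand.
    mrna = cdna.transcribe()
    protein = mrna.translate()
    print(protein)
```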
07:24 Using methods in Biopython, we have translated a DNA sequence to a protein sequence.
07:31 DNA sequence of any size can be translated to a protein sequence using this code.
07:37 Let's summarize.
07:38 In this tutorial, we have learnt:
07:41 * Important features of Biopython.
07:43 * Information regarding download and installation on Linux OS.
07:48 * Create a sequence object for the given DNA strand.
07:52 * Transcription of the DNA sequence to mRNA.
07:56 * Translation of mRNA to protein sequence.
08:00 Now for the assignment-
08:02 Translate the given DNA sequence into a protein sequence.
08:06 Observe the output.
08:08 The protein sequence has an internal stop codon.
08:11 As it happens in nature, translate the DNA up to the first in-frame stop codon.
08:17 Your completed assignment should have the following code.
08:20 Notice that we have used 'to underscore stop' argument in the translate method. Notice the output.
08:27 The stop codon itself is not translated.
08:31 The stop symbol is not included at the end of your protein sequence.
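
The effect of the to_stop argument can be sketched with a hypothetical DNA sequence that contains an internal stop codon. This is not the assignment sequence from the slide, only an illustration of the behaviour described above.

```python
try:
    from Bio.Seq import Seq
except ImportError:  # Biopython is not installed
    Seq = None

if Seq is not None:
    # Hypothetical coding DNA with an internal, in-frame stop codon (TGA).
    cdna = Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")

    # Default behaviour: translate everything, marking stops with "*".
    full_protein = cdna.translate()
    print(full_protein)

    # to_stop=True: stop at the first in-frame stop codon, without the "*".
    protein = cdna.translate(to_stop=True)
    print(protein)
```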
08:36 This video summarizes the Spoken Tutorial project.
08:39 If you do not have good bandwidth, you can download and watch it.
08:43 The Spoken Tutorial Project team conducts workshops and gives certificates to those who pass an online test.
08:50 For more details, please write to us.
08:53 Spoken Tutorial Project is funded by NMEICT, MHRD, Government of India.
08:59 More information on this mission is available at this link.
09:03 This is Snehalatha from IIT Bombay, signing off. Thank you for joining.

Contributors and Content Editors

PoojaMoolya, Sandhya.np14