Difference between revisions of "Biopython/C2/Manipulating-Sequences/English-timed"
PoojaMoolya (Talk | contribs) (Created page with " {| Border=1 ! <center>Time</center> ! <center>Narration</center> |- | 00:01 | Welcome to this tutorial on '''Manipulating Sequences.''' |- | 00:06 | In this tutorial, we wi...") |
Sandhya.np14 (Talk | contribs) |
Revision as of 23:46, 1 August 2016
|
|
---|---|
00:01 | Welcome to this tutorial on Manipulating Sequences. |
00:06 | In this tutorial, we will use Biopython tools to: * Generate a random DNA sequence |
00:13 | * Slice a DNA sequence at specified locations |
00:17 | * Join two sequences together to form a new sequence, that is, to concatenate |
00:22 | * Find the length of the sequence |
00:26 | * Count the number of individual bases or part of the string |
00:31 | * Find a particular base or part of the string. |
00:35 | * Convert a sequence object to a mutable sequence object. |
00:40 | To follow this tutorial, you should be familiar with undergraduate Biochemistry or Bioinformatics |
00:47 | and basic Python programming. |
00:51 | If not, refer to the Python tutorials at the given link. |
00:56 | To record this tutorial, I am using: * Ubuntu OS version 14.10 |
01:03 | * Python version 2.7.8 |
01:07 | * IPython interpreter version 2.3.0 |
01:12 | * Biopython version 1.64. |
01:16 | Let me open the terminal and start the IPython interpreter. |
01:21 | Press Ctrl, Alt and t keys simultaneously. |
01:26 | At the prompt, type: ipython and press Enter. |
01:31 | The IPython prompt appears on the screen. |
01:35 | Using Biopython we can generate a sequence object for a random DNA sequence of any specified length. |
01:44 | Let us now generate a sequence object for a DNA sequence of 20 bases. |
01:50 | At the prompt, type: import random. Press Enter. |
01:56 | Next, import the Seq class from the Seq module of the Bio package. |
02:01 | Seq is often pronounced as "seek". |
02:06 | At the prompt, type: from Bio dot Seq import Seq. Press Enter. |
02:15 | We will use the Bio.Alphabet module to specify the alphabet of the DNA sequence. |
02:22 | Type: from Bio dot Alphabet import generic underscore dna. Press Enter. |
02:32 | Type the following command to create a sequence object for the random DNA sequence. |
02:38 | Store the sequence in a variable, dna1. |
02:42 | Please note: in this command, use two single quotes instead of a double quote. Press Enter. |
02:50 | For the output, type: dna1. Press Enter. |
02:55 | The output shows the sequence object for the random DNA sequence. |
03:00 | If you want a new sequence, press the up-arrow key to get the same command as above. Press Enter. |
03:11 | For the output, type the variable name, dna1. Press Enter. |
03:17 | The output shows a new DNA sequence, which is different from the first one. |
03:23 | About Sequence Objects |
03:25 | The sequence objects usually act like normal Python strings. |
03:30 | So follow the same conventions as you do for Python strings. |
03:35 | In Python, we count the characters in a string starting from 0 instead of 1. |
03:41 | The first character in the sequence is at position zero. |
03:45 | Back to the terminal. |
03:47 | Often you may need to work with only a part of the sequence. |
03:52 | Now let's see how to extract parts of the string and store them as sequence objects. |
03:58 | For example, we will slice the DNA sequence at two positions. |
04:04 | First between bases 6 and 7. |
04:08 | This will extract a fragment from the beginning of the sequence to the 6th base in the sequence. |
04:15 | The second slice will be between bases 11 and 12. |
04:20 | The second fragment will be from the 12th base to the end of the sequence. |
04:26 | Type the following command at the prompt to extract the first fragment. |
04:31 | string1 equal to dna1 within brackets 0 colon 6. |
04:39 | string1 is the variable to store the first fragment. |
04:43 | The rest of the command follows as in normal Python. |
04:47 | Enclosed in these brackets are the start and stop positions separated by a colon. |
04:53 | The positions are inclusive of the start, but exclusive of the stop position. Press Enter. |
05:01 | To view the output, type: string1. Press Enter. |
05:04 | The output shows the first fragment as a sequence object. |
05:10 | To extract the second string from the sequence, press the up-arrow key and edit the command as follows: |
05:17 | Change the name of the variable to string2, and positions to 11 and 20. |
05:24 | For the output, type: string2. Press Enter. |
05:30 | Now we have the 2nd fragment also as a sequence object. |
05:34 | Let us concatenate, that is, add the two strings together to form a new fragment: |
05:42 | Store the new sequence in a variable dna2. |
05:46 | Type: dna2 equal to string1 plus string2. Press Enter. |
05:53 | Please note: we cannot add sequences with incompatible alphabets. |
05:59 | That is, we cannot concatenate a DNA sequence and a protein sequence to form a new sequence. |
06:07 | The two sequences must have the same alphabet attribute. |
06:12 | To view the output, type: dna2. Press Enter. |
06:17 | The output shows a new sequence which is a combination of string1 and string2. |
06:23 | To find the length of the new sequence, we will use the len function. |
06:29 | Type: len, within parentheses, dna2. Press Enter. |
06:34 | The output shows that the sequence is 15 bases long. |
06:39 | We can also count the number of individual bases present in the sequence. |
06:44 | To do so, we will use the count function. |
06:47 | For example, to count the number of adenines present in the sequence, type the following command: dna2 dot count, within parentheses, within double quotes, the letter A. |
07:02 | Press Enter. |
07:04 | The output shows the number of adenines present in the sequence dna2. |
07:10 | To find a particular base or part of the string, we will use the find function. |
07:16 | Type: dna2 dot find, within parentheses, within double quotes, GC. Press Enter. |
07:26 | The output indicates the position of the first occurrence of GC in the string. |
07:32 | Normally a sequence object cannot be edited. |
07:35 | To edit a sequence, we have to convert it to a mutable sequence object. |
07:41 | To do so, type: dna3 equal to dna2 dot tomutable, open and close parentheses. Press Enter. |
07:52 | For the output, type: dna3. Press Enter. |
07:55 | Now the sequence object can be edited. |
07:59 | Let us replace a base from the sequence. |
08:01 | For example, to replace the base at position 5 with adenine, type: dna3 within brackets 5 equal to, within double quotes, the letter A. Press Enter. |
08:19 | For the output, type: dna3. Press Enter. |
08:24 | Observe the output: the cytosine at position 5 is replaced with adenine. |
08:31 | To replace a part of the string, type the following command. |
08:35 | dna3 within brackets 6 colon 10 equal to, within double quotes, ATGC. Press Enter. |
08:45 | For the output, type: dna3. Press Enter. |
08:52 | The output shows that the 4 bases from positions 6 to 9 are replaced with the new bases ATGC. |
09:01 | Once you have edited your sequence object, convert it back to the “read only” form. |
09:07 | Type the following: dna4 equal to dna3 dot toseq, open and close parentheses. Press Enter. |
09:19 | For the output, type: dna4. Press Enter. |
09:25 | Let's summarize. |
09:27 | In this tutorial, we have learnt to: generate a random DNA sequence. |
09:32 | Slice a DNA sequence at specified locations. |
09:36 | Join two sequences together to form a new sequence, that is, to concatenate. |
09:43 | We have also learnt how to use len, count and find functions. |
09:49 | Convert a sequence object to a mutable sequence object and replace a base or part of the string. |
09:57 | For the assignment, generate a random DNA sequence of 30 bases. |
10:02 | Using Biopython tools, calculate the GC percentage and molecular weight of the sequence. |
10:09 | Your completed assignment will be as follows. |
10:13 | The output shows the GC content as a percentage. |
10:18 | The output shows the molecular weight of the DNA sequence. |
10:23 | This video summarizes the Spoken Tutorial project. |
10:26 | If you do not have good bandwidth, you can download and watch it. |
10:30 | We conduct workshops and give certificates. |
10:32 | Please contact us. |
10:35 | Spoken-Tutorial project is supported by the National Mission on Education through ICT, MHRD, Government of India. |
10:43 | This is Snehalatha from IIT Bombay signing off. Thank you for joining. |
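
The narration describes the commands typed between 01:44 and 09:19 but does not show them, so the following is only a sketch of what that session might look like, not the recorded commands. It makes several assumptions. The random sequence is built with `random.choice` and `''.join`, which is a guess at the command containing the "two single quotes". The `generic_dna` alphabet argument is left out, because the `Bio.Alphabet` module is absent from newer Biopython releases. `MutableSeq` and `Seq` are constructed directly instead of calling `tomutable()` and `toseq()` as named in the narration. If Biopython is not installed, plain strings and a list of characters stand in for the sequence objects, so that the slicing and editing rules can still be tried out.

```python
import random

try:
    from Bio.Seq import Seq, MutableSeq
    HAVE_BIOPYTHON = True
except ImportError:
    # Stand-in (assumption): plain strings support the same slicing,
    # +, len, count and find operations used below.
    Seq = str
    HAVE_BIOPYTHON = False

# 01:44 Generate a random 20-base sequence. '' is two single quotes,
# that is, an empty string used as the separator for join.
dna1 = Seq(''.join(random.choice('AGTC') for _ in range(20)))

# 04:26 Slice: counting starts at 0, start is inclusive, stop is exclusive.
string1 = dna1[0:6]     # bases 1 to 6
string2 = dna1[11:20]   # bases 12 to 20

# 05:34 Concatenate the two fragments and measure the result.
dna2 = string1 + string2
length = len(dna2)      # 6 + 9 = 15 bases

# 06:39 Count one base, then locate the first occurrence of a motif.
adenines = dna2.count("A")
gc_position = dna2.find("GC")   # -1 when "GC" does not occur

# 07:32 Edit through a mutable copy.
if HAVE_BIOPYTHON:
    dna3 = MutableSeq(str(dna2))
else:
    dna3 = list(dna2)           # list of characters as a mutable stand-in
dna3[5] = "A"                   # replace the base at position 5
dna3[6:10] = "ATGC"             # replace the bases at positions 6 to 9

# 09:01 Convert back to the read-only form.
edited = str(dna3) if HAVE_BIOPYTHON else "".join(dna3)
dna4 = Seq(edited)

print(repr(dna1))
print(repr(dna2))
print(length)
print(adenines)
print(gc_position)
print(repr(dna4))
```

Because the sequence is random, the printed bases, the adenine count and the position of GC differ from run to run; only the lengths (20 for dna1, 15 for dna2 and dna4) stay the same.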