Biopython/C2/Manipulating-Sequences/English
|
|
---|---|
Slide Number 1
Title Slide |
Welcome to this tutorial on Manipulating Sequences. |
Slide Number 2
Learning Objectives |
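In this tutorial, we will use Biopython tools to:
1. Generate a random DNA sequence. 2. Slice a DNA sequence at specified locations. 3. Join two sequences together to form a new sequence, that is, to concatenate. |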
Slide Number 3
Learning Objectives (Biopython Functions) |
4. Find the length of the sequence.
|
Slide Number 4
Pre-requisites |
To follow this tutorial you should be familiar with,
If not: Refer to the Python tutorials at the given link. |
Slide Number 5
System Requirement |
To record this tutorial I am using,
Ubuntu OS version 14.10, Python version 2.7.8, IPython interpreter version 2.3.0, Biopython version 1.64. |
Press ctrl, alt and t simultaneously.
At the prompt, type ipython. |
Let me open the terminal and start the IPython interpreter.
|
Generating random DNA sequence in python
|
Using Biopython we can generate a sequence object for a random DNA sequence of any specified length.
|
At the prompt, type,
|
Next import Seq module from Bio package.
|
Type,
from Bio.Seq import Seq
|
At the prompt type,
(from Bio.Seq import Seq )
|
Cursor on the terminal. | We will use Bio.Alphabet module to specify the alphabets in the DNA sequence. |
Type,
>>>from Bio.Alphabet import generic_dna |
Type,
|
Type,
dna1 = Seq("".join(random.choice('AGTC') for _ in range(30)), generic_dna)
|
Type the following command to create a sequence object for the random DNA sequence;
|
type
>>> dna1 Press enter |
For the output, type dna1.
Press enter |
Highlight output. | The output shows the sequence object for the random DNA sequence. |
Cursor on the terminal.
Press up arrow key. Press enter. Type >>> dna1. Press enter |
If you want a new sequence, press up arrow key to get the same command as above.
For the output, type the variable name, dna1. Press enter |
Cursor on the terminal. | The output shows a new DNA sequence, which is different from the first one. |
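The commands above can be collected into one short script. This is a minimal sketch, not the exact session from the recording: the helper name random_dna is hypothetical, the generic_dna alphabet argument is left out because the Bio.Alphabet module was removed in Biopython 1.78 and later, and the script assumes a plain string is an acceptable stand-in when Biopython is not installed.

```python
import random

try:
    from Bio.Seq import Seq
except ImportError:
    Seq = str  # assumption: plain string stand-in when Biopython is absent


def random_dna(length=30):
    """Return a sequence object holding `length` randomly chosen bases."""
    bases = "".join(random.choice("AGTC") for _ in range(length))
    return Seq(bases)


dna1 = random_dna()
print(dna1)          # random, so the output changes from run to run
print(random_dna())  # calling the helper again draws a fresh sequence
```

Calling random_dna with another number, for example random_dna(12), gives a sequence of that length.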
Slide Number 6 and 7
Sequence Objects |
About Sequence Objects
|
Cursor on the terminal.
|
Back to the terminal.
|
Highlight the second string from 18th -30th character. | The second slice will be between bases 11 and 12.
|
At the prompt type,
|
Type the following command at the prompt to extract the first fragment.
|
Type,
string1 press enter |
To view the output
Type string1, press enter.
|
Type
|
To extract the second string from the sequence,
|
Type string2
press enter. |
For the output
Type string2 press enter.
|
Type
dna2 = string1 + string2
|
Let us concatenate, that is, add the two strings together to form a new fragment:
(dna2 = string1 + string2)
Note that we cannot add sequences with incompatible alphabets.
|
Type
dna2 Press enter |
To view the output, type dna2.
Press enter
|
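Slicing and concatenation can be sketched as follows. The 20-base sequence and the slice positions are assumptions chosen for illustration, not the random output or the exact commands from the recording. The script falls back to a plain string if Biopython is not installed, since slicing and the + operator behave the same way on both.

```python
try:
    from Bio.Seq import Seq
except ImportError:
    Seq = str  # assumption: plain string stand-in when Biopython is absent

# Illustrative 20-base sequence (not the tutorial's random output).
dna1 = Seq("ATGCCGTAACGTTAGCCTGA")

# Slice indices count from 0 and exclude the end index.
string1 = dna1[0:6]    # bases 1 to 6
string2 = dna1[11:20]  # bases 12 to 20

# The + operator joins the two fragments into one new sequence.
dna2 = string1 + string2
print(string1)
print(string2)
print(dna2)
```

With these positions the first fragment has 6 bases and the second has 9, so the joined sequence has 15.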
Cursor on the terminal | To find the length of the new sequence, we will use the len function. |
Type
len(dna2) press Enter
|
Type
len within parenthesis dna2.
The output shows that the sequence is 15 bases long. |
Type,
dna2.count("A") Press enter
|
We can also count the number of individual bases present in the sequence.
For example, we can count the number of adenines present in the sequence.
|
Type
dna1.find("AT") Press enter |
To find a particular base or part of the string, we will use the find function.
|
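The three functions can be tried together on a known sequence, where the answers are easy to check by hand. The sequence below is an assumption chosen for illustration. The script falls back to a plain string if Biopython is not installed, since len, count and find behave the same way on both.

```python
try:
    from Bio.Seq import Seq
except ImportError:
    Seq = str  # assumption: plain string stand-in when Biopython is absent

dna2 = Seq("ATGCCGTTAGCCTGA")  # illustrative 15-base sequence

print(len(dna2))         # 15, the number of bases
print(dna2.count("A"))   # 3, the number of times A occurs
print(dna2.find("AT"))   # 0, the index where the first "AT" starts
print(dna2.find("TAG"))  # 7, indices count from 0
print(dna2.find("GGG"))  # -1, returned when the pattern is absent
```

Note that find reports a 0-based index, so a match at the very start of the sequence gives 0, not 1.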
Cursor on the terminal. | Normally a sequence object cannot be edited.
|
Type,
dna3=dna2.tomutable() press enter |
To do so, type,
dna3 equal to dna2 dot tomutable open and close parenthesis.
|
Type
dna3 press enter |
For the output, type
dna3 press enter |
Type
>>> dna3[5] = "A" Press enter |
Now the sequence object can be edited.
|
Type,
dna3[6:10] = "ATGC" press enter
|
To replace a part of the string,
type the following command.
|
Type,
dna4 = dna3.toseq()
dna4 Press enter |
Once you have edited your sequence object, convert it back to the “read only” form.
Press enter
|
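The reason for the conversion steps can be illustrated with built-in Python types, which follow the same pattern: a read-only value rejects in-place edits, so an editable copy is made, changed, and converted back. This sketch is an analogy only. It uses a str and a list as stand-ins and does not call tomutable or toseq, and the sequence is an assumption chosen for illustration.

```python
dna2 = "ATGCCGTTAGCCTGA"  # illustrative read-only sequence

try:
    dna2[5] = "A"  # in-place edits are rejected on read-only data
except TypeError as error:
    print("cannot edit: %s" % error)

dna3 = list(dna2)     # editable copy (stand-in for the mutable form)
dna3[5] = "A"         # replace a single base
dna3[6:10] = "ATGC"   # replace a stretch of four bases
dna4 = "".join(dna3)  # back to a read-only string

print(dna2)  # the original is unchanged
print(dna4)  # the edited copy
```

The original value stays untouched throughout, which is the practical benefit of keeping the read-only and editable forms separate.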
Slide Number 8
Summary |
Let's summarize,
In this tutorial, we have learnt to: 1. Generate a random DNA sequence. 2. Slice a DNA sequence at specified locations. 3. Join two sequences together to form a new sequence, that is, to concatenate.
|
Slide Number 9
Summary |
We have also learnt to:
4. Use the len, count and find functions. 5. Convert a sequence object to a mutable sequence object. 6. Replace a base or part of the string. |
Slide Number 10
Assignment
|
For the assignment
|
GC content of the DNA sequence.
Type,
from Bio.SeqUtils import GC Press enter Type GC(dna) Press enter
|
The output shows the GC content as a percentage. |
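To make the percentage concrete, the calculation can be written out by hand. This is a minimal sketch of the idea, not Biopython's own implementation: the function name gc_percent is hypothetical, it counts only the letters G and C, and the example sequence is an assumption.

```python
def gc_percent(sequence):
    """Return the share of G and C bases in `sequence` as a percentage."""
    text = str(sequence).upper()
    if not text:
        return 0.0  # avoid dividing by zero on an empty sequence
    gc_count = text.count("G") + text.count("C")
    return 100.0 * gc_count / len(text)


print(gc_percent("ATGC"))             # 2 of 4 bases are G or C
print(gc_percent("ATGCCGTTAGCCTGA"))  # 8 of 15 bases are G or C
```

Because the function converts its argument with str, it accepts a plain string or a sequence object.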
Type,
from Bio.SeqUtils import molecular_weight press enter
molecular_weight(dna)
|
The output shows the molecular weight of the DNA sequence. |
Slide Number 11
Acknowledgement |
This video summarizes the Spoken Tutorial project.
If you do not have good bandwidth, you can download and watch it. |
Slide Number 12 | We conduct workshops and give certificates.
Please contact us. |
Slide number 13 | Spoken-Tutorial project is supported by the National Mission on Education through ICT, MHRD, Government of India |
Slide number 13 | This is Snehalatha from IIT Bombay signing off. Thank you for joining. |