Biopython/C2/Manipulating-Sequences/Khasi

From Script | Spoken-Tutorial
Revision as of 17:44, 2 February 2018 by Hezekiah2016 (Talk | contribs)

Time Narration
00:01 Welcome to this tutorial on Manipulating Sequences.

(Ngi pdiang sngewbha ia phi sha kane ka tutorial jong ka Manipulating Sequences.)


00:06 In this tutorial, we will use Biopython tools to: Generate a random DNA sequence

(Ha kane ka tutorial, ngin pyndonkam ki tools ka Biopython: Ban pynmih ia u random DNA sequence)

00:13 Slice a DNA sequence at specified locations

(Phiah ia u DNA sequence ha ki jaka ba lah bynshet )

00:22 Join two sequences together to form a new sequence, that is, concatenate them

(Pyndait lang ia artylli ki sequences ban pynlong kawei ka sequence ba thymmai)

00:22 Find the length of the sequence

(Wad ia ka jingjrong jong ka sequence)

00:26 Count the number of individual bases or part of the string

(Niew ia ki number jong ki bases lane ki dkhot jong u string)

00:31 Find a particular base or part of the string.

(Wad ia u base uba bniah lane bynta jong u string)

00:35 Convert a sequence object to a mutable sequence object.

(Pynkylla ia ka sequence object sha ka mutable sequence object)

00:40 To follow this tutorial, you should be familiar with undergraduate Biochemistry or Bioinformatics

(Ban sngewthuh ia kane ka tutorial, phi dei ban long kiba shemphang ha ka undergraduate Biochemistry lane Bioinformatics)

00:47 and basic Python programming.

(Bad ka basic Python programming)

00:51 If not, refer to the Python tutorials at the given link.

(lem kumta, pyndonkam da ka nuksa Python ha ka link ba bud )

00:56 To record this tutorial, I am using: Ubuntu OS version 14.10

(Ban record ia kane ka nuksa, nga pyndonkam da ka Ubuntu OS version 14.10)

01:03 Python version 2.7.8

(Python version 2.7.8)

01:07 Ipython interpreter version 2.3.0

(Ipython interpreter version 2.3.0)

01:12 Biopython version 1.64.

(Biopython version 1.64.)

01:16 Let me open the terminal and start ipython interpreter.

(To ngin ia plie iaka terminal bad sdang ia ka ipython interpreter)

01:21 Press Ctrl, Alt and t keys simultaneously.

(Nion sah ia u Ctrl, Alt and t key)

01:26 At the prompt, type: "ipython" and press Enter.

(Ha ka prompt, type: "ipython" bad sa nion Enter)

01:31 Ipython prompt appears on the screen.

(Ka Ipython prompt kan sa mih ha ka screen)

01:35 Using Biopython, we can generate a sequence object for a random DNA sequence of any specified length.

(Da kaba pyndonkam ia ka Biopython, ngi lah ban pynmih ia ka sequence object na ka bynta ka random DNA sequence ha kano kano ka jingjrong)

01:44 Let us now generate a sequence object for a DNA sequence of 20 bases.

(To ngin ia pynmih ia ka sequence object na ka bynta ka DNA sequence kaba 20 bases)

01:50 At the prompt, type: "import random", press Enter.

(Ha ka prompt, type: "import random", bad nion Enter.)

01:56 Next, import the Seq class from the Bio.Seq module of the Bio package.

(Nangta, import Seq module na Bio package.)

02:01 Often Seq is pronounced as seek.

(Barabor Seq ngi shait ong seek.)

02:06 At the prompt, type: from Bio dot Seq import Seq, with "from" in lowercase. Press Enter.

(Ha ka prompt, type: from Bio dot Seq import Seq. Nion Enter.)

02:15 We will use Bio.Alphabet module to specify the alphabets in the DNA sequence.

(Ngin pyndonkam da ka Bio.Alphabet module ban pyntikna ia ki alphabets ha ka DNA sequence)

02:22 Type: from Bio dot Alphabet import generic underscore dna. Press Enter.

(Type: from Bio dot Alphabet import generic underscore dna. Nion Enter.)

02:32 Type the following command to create a sequence object for the random DNA sequence.

(Type kumne harum command ban shna ia ka sequence object na ka bynta ka DNA sequence. )

02:38 Store the sequence in a variable dna1.

(Pynlang ia ka sequence ha ka variable dna1)

02:42 Please note: in this command, use two single quotes instead of a double quote. Press Enter.

(Sngewbha peit: ha kane ka command, pyndonkam ar tylli ki single quotes ha ka jaka ka double quote. Nion Enter)

02:50 For the output, type: dna1. Press Enter.

(Na ka bynta ka output, type: dna1. Nion Enter)

02:55 The output shows the sequence object for the random DNA sequence.

(Ka output ka pyni ia ka sequence object na ka bynta ka random DNA sequence. )
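The transcript does not reproduce the command itself, so the following is only a minimal sketch of what it may look like, assuming `random.choice` over the four bases joined with `''.join` (the "two single quotes" mentioned above). In Biopython 1.64, as used here, `generic_dna` would be passed as a second argument to `Seq`; the `Bio.Alphabet` module was removed in Biopython 1.78, so the sketch leaves it out. The fallback to `str` is an assumption added only so the sketch runs where Biopython is not installed.

```python
import random

try:
    from Bio.Seq import Seq  # Biopython's sequence class
except ImportError:
    Seq = str  # assumption: plain-string stand-in when Biopython is missing

# '' is two single quotes (an empty string), not one double quote.
# With Biopython 1.64 the call would be Seq(..., generic_dna).
dna1 = Seq(''.join(random.choice('ACGT') for _ in range(20)))

print(repr(dna1))  # a different 20-base sequence on every run
print(len(dna1))
```

Re-running the assignment line gives a new random sequence each time, which is what pressing the up-arrow key and Enter does in the session.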

03:00 If you want a new sequence, press up-arrow key to get the same command as above. Press Enter.

(Lada phi kwah ka sequence ba thymmai, nion u up-arrow key ban ioh ia ka juh ka command kum haneng. Nion Enter.)

03:11 For the output, type the variable name dna1. Press Enter.

(Na ka bynta ka output, type ia ka kyrteng ka variable dna1. Nion Enter)

03:17 The output shows a new DNA sequence which is different from the first one.

(Ka output ka pyni iaka DNA sequence ba thymmai kaba pher na kaba nyngkong.)

03:23 About Sequence Objects:

(Shaphang ka Sequence Objects)

03:25 The sequence objects usually act like normal Python strings.

(Ka sequence objects ha ka jingshisha ka long beit Python strings.)

03:30 So, follow the normal conventions as you do for Python strings.

(Te, bud beit ia ka rukom leh kumba leh ia ka Python strings.)

03:35 In Python, we count the characters in the string starting from 0, instead of 1.

(Ha ka Python, ngi niew ia ki characters ha ka string kaba sdang na 0, ha jaka u 1.)

03:41 The first character in the sequence is position zero.

(U character ba nyngkong ha ka sequence u dei u zero.)

03:45 Back to the terminal.

(Phai biang sha ka terminal.)

03:47 Often you may need to work with only a part of the sequence.

(Teng teng ngi hap ban trei tang shi bynta jong ka sequence)

03:52 Now, let's see how to extract parts of the string and store them as sequence objects.

(Mynta, ngin ia peit kumno ban sei ia ki dkhot jong ka string bad buh ia ki kum ki sequence objects.)

03:58 For example, we will slice the DNA sequence at two positions.

(Nuksa, ngin slice ia ka DNA sequence ha ki ar bynta.)

04:04 First, between bases 6 and 7.

(Nyngkong, hapdeng ki bases 6 bad 7.)

04:08 This will extract a fragment from the beginning of the sequence to the 6th base in the sequence.

(Kane kan sei ia ka fragment na kaba sdang jong ka sequence sha ka base ba 6 ha ka sequence.)

04:15 The second slice will be between bases 11 and 12.

(Ka slice ba ar kan dei hapdeng ka bases 11 bad 12.)

04:20 The second fragment will be from the 12th base to the end of the sequence.

(Ka fragment ba ar kan dei na ka base ba 12 shaduh ba kut jong ka sequence.)

04:26 Type the following command, at the prompt, to extract the first fragment.

(Type ia ka command ba harum, ha ka prompt, ban sei ia ka fragment ba nyngkong.)

04:31 string1 equal to dna1 within brackets 0 colon 6.

(string1 equal to dna1 within brackets 0 colon 6.)

04:39 string1 is the variable to store the first fragment.

(string1 ka dei ka variable ban lum ia ka fragment ba nyngkong.)

04:43 The rest of the command follows as in normal Python.

(Ki command ba bud kin long beit kum ha ka Python.)

04:47 Enclosed in these brackets are the start and the stop positions separated by a colon.

(Hapoh kine ki brackets dei ki jaka ba sdang bad ba kut ba shah phiah da u colon.)

04:53 The positions are inclusive of the start but exclusive of the stop position. Press Enter.

(Kine ki jaka (positions) ki lum lang ha kaba sdang hynrei kim shah lum shuh ha kaba kut . Nion Enter.)

05:01 To view the output, type: "string1", press Enter.

(Ban peit ia ka output, type: "string1", nion Enter.)

05:04 The output shows the first fragment as the sequence object.

(Ka output ka pyni ia ka fragment ba nyngkong kum ka sequence object.)

05:10 To extract the second string from the sequence, press up-arrow key and edit the command as follows:

(Ban sei ia ka string ba ar na ka sequence, nion ia u up-arrow key bad edit ia ka command kumne harum: )

05:17 Change the name of the variable to string2 and positions to 11 and 20.

(Pynkylla ia ka ka kyrteng jong u variable sha ka string2 bad jaka sha u 11 bad 20.)

05:24 For the output, type: "string2". Press Enter.

(Na ka bynta ka output, type: "string2". Nion Enter.)

05:30 Now we have the 2nd fragment also as a sequence object.

(Mynta ngi ioh ia ka fragment ba ar ruh kum ka sequence object.)
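The two slices can be sketched as follows. The 20-base sequence is a hypothetical fixed example (the tutorial uses a random one), chosen so the output is reproducible; the fallback to `str` is an assumption so the sketch runs without Biopython.

```python
try:
    from Bio.Seq import Seq
except ImportError:
    Seq = str  # assumption: plain-string stand-in when Biopython is missing

dna1 = Seq("ATGGCCTAAGCTTGACCATG")  # hypothetical 20-base sequence

# Start is inclusive, stop is exclusive, and counting starts at 0.
string1 = dna1[0:6]    # 1st to 6th base (indices 0..5)
string2 = dna1[11:20]  # 12th to 20th base (indices 11..19)

print(string1)  # ATGGCC
print(string2)  # TTGACCATG
```

With Biopython installed, both fragments are themselves sequence objects rather than plain strings.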

05:34 Let us concatenate, that is, add the two strings together to form a new fragment.

(To ngin ia lum lang , kata, ban pyndait lang ia baroh ar ki strings ban ioh u fragment ba thymmai.)

05:42 Store the new sequence in a variable dna2.

(Buh ia ka sequence ba thymmai ha ka variable dna2.)

05:46 Type: dna2 equal to string1 plus string2. Press Enter.

(Type: dna2 equal to string1 plus string2. Nion Enter.)

05:53 Please note: we cannot add sequences with incompatible alphabets.

(Sngewbha tip ba : ngim lah ban iasnoh ia ki sequence bad ki alphabets ki bym iahap.)

05:59 That is, we cannot concatenate a DNA sequence and a protein sequence to form a new sequence.

(Kata ka mut, ngim lah ban lum ia ka DNA sequence bad ka protein sequence ban ioh ia ka sequence ba thymmai.)

06:07 The two sequences must have the same alphabet attribute.

(Ki ar tylli ki sequences ki dei ban don kajuh ka alphabet attribute.)

06:12 To view the output, type: "dna2". Press Enter.

(Ban peit ia ka output, type: "dna2". Nion Enter.)

06:17 The output shows a new sequence which is a combination of string1 and string2.

(Ka output ka pyni ia ka sequence ba thymmai kaba dei ka jinglum kyllum jong u string1 bad string2.)

06:23 To find the length of the new sequence, we will use len function.

(Ban ioh ia ka jingjrong jong ka sequence ba thymmai, ngin pyndonkam da ka len function.)

06:29 Type: "len" within parenthesis "dna2". Press Enter.

(Type: "len" hapoh parenthesis"dna2". Nion Enter.)

06:34 Output shows the sequence as 15 bases long.

(Ka output ka pyni ia ka sequence kaba jrong 15 bases.)
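A minimal sketch of the concatenation and length check, using two hypothetical fragments of 6 and 9 bases. The alphabet compatibility rule described above applies to Biopython versions that still carry alphabets, such as 1.64; the sketch omits alphabets, and the fallback to `str` is an assumption so it runs without Biopython.

```python
try:
    from Bio.Seq import Seq
except ImportError:
    Seq = str  # assumption: plain-string stand-in when Biopython is missing

string1 = Seq("ATGGCC")     # hypothetical first fragment (6 bases)
string2 = Seq("TTGACCATG")  # hypothetical second fragment (9 bases)

dna2 = string1 + string2    # concatenate the two fragments

print(dna2)       # ATGGCCTTGACCATG
print(len(dna2))  # 15
```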

06:39 We can also count the number of individual bases present in the sequence.

(Ngi lah ruh ban niew ia ka jingdon jong ki bases bapher bapher ha ka sequence.)

06:44 To do so, we will use count() function.

(Ban leh ia kane, ngin pyndonkam da ka count() function.)

06:47 For example, to count the number of adenines present in the sequence, type the following command: dna2 dot count within parentheses within double quotes alphabet A.

(Nuksa – ban niew ia ka jingdon jong u adenine ba don ha ka sequence, ngi type ia ka command ka ba harum: dna2 dot count hapoh parenthesis hapoh double quotes alphabet A.)

07:02 Press Enter.

(Nion Enter.)

07:04 The output shows the number of adenines present in the sequence dna2.

(Ka output ka pyni ia ka jingdon jong u adenine ha ka sequence dna2.)

07:10 To find a particular base or part of the string, we will use find() function.

(Ban wad ia u jait base ne u dkhot jong u string, ngin pyndonkam da u find() function.)

07:16 Type: dna2 dot find within parenthesis within double quotes "GC". Press Enter.

(Type: dna2 dot find hapoh parenthesis hapoh double quotes "GC". Nion Enter.)

07:26 The output indicates the position of the first instance of the appearance of GC in the string.

(Ka output ka pyni ia ka jaka ba nyngkong kaba mih ka GC ha u string.)
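Counting and searching can be sketched on a hypothetical 15-base sequence as below. Both methods behave like their Python string counterparts: `count` returns the number of non-overlapping matches, and `find` returns the zero-based index of the first match, or -1 when there is none. The fallback to `str` is an assumption so the sketch runs without Biopython.

```python
try:
    from Bio.Seq import Seq
except ImportError:
    Seq = str  # assumption: plain-string stand-in when Biopython is missing

dna2 = Seq("ATGGCCTTGACCATG")  # hypothetical 15-base sequence

print(dna2.count("A"))    # 3  (A occurs at indices 0, 9 and 12)
print(dna2.find("GC"))    # 3  (first "GC" starts at index 3)
print(dna2.find("AAAA"))  # -1 (not present)
```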

07:32 Normally a sequence object cannot be edited.

(Ha ka jingshisha ia ka sequence object um ju lah ban pynkylla.)

07:35 To edit a sequence, we have to convert it to the mutable sequence object.

(Ban pynkylla ia ka sequence, ngi hap ban pynkylla sha ka mutable sequence object.)

07:41 To do so, type: dna3 equal to dna2 dot tomutable (written as one word) open and close parentheses. Press Enter.

(Ban leh ia kane, type: dna3 equal to dna2 dot to mutable plie bad khang ia u parenthesis. Nion Enter.)

07:52 For the output, type: dna3. Press Enter.

(Na ka bynta ka output, type: dna3 Nion Enter.)

07:55 Now the sequence object can be edited.

(Mynta ngin sa lah ban pynkylla ia ka sequence object.)

07:59 Let us replace a base from the sequence.

(To ngin ia weng/bujli ia u base na ka sequence.)

08:01 For example, to replace the base at position 5 with adenine, type: dna3 within brackets 5 equal to within double quotes alphabet A. Press Enter.

(Nuksa – ban weng/bujli ia u base uba don ha ka jaka ba 5 sha ka adenine, type: dna3 within brackets 5 equal to within double quotes alphabet A. Nion Enter.)

08:19 For the output, type: dna3. Press Enter.

(Na ka bynta ka output, type: dna3. Nion Enter.)

08:24 Observe the output. The cytosine at position 5 is replaced with adenine.

(Ha khmih ia ka output. Ka cytosine ha ka jaka ba 5 la shah bujli da u adenine.)

08:31 To replace a part of the string, type the following command.

(Ban weng i bynta jong u string, type ia ka command ba harum.)

08:35 dna3 within brackets 6 colon 10 equal to within double quotes ATGC. Press Enter.

(dna3 within brackets 6 colon 10 equal to within double quotes ATGC. Nion Enter.)

08:45 For the output, type: dna3. Press Enter.

(Na ka bynta ka output, type: dna3. Nion Enter.)

08:52 The output shows the 4 bases from the position 6 to 9 are replaced with new bases ATGC.

(Ka output ka pyni ia ki 4 bases na ka jaka ba 6 haduh 9 kiba shah bujli da ki bases ATGC ba thymmai.)

09:01 Once you have edited your sequence object, convert it back to the “read only” form.

(Marsien dep pynkylla ia ka sequence object, pynkylla biang ia ka sha ka “read only” form.)

09:07 Type the following: dna4 equal to dna3 dot toseq (written as one word) open and close parentheses. Press Enter.

(Type kumne harum: dna4 equal to dna3 dot toseq open and close parenthesis. Nion Enter.)

09:19 For the output, type: dna4. Press Enter.

(Na ka bynta ka output, type: dna4. Nion Enter.)
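The editing steps can be sketched as below on a hypothetical 15-base sequence. The tutorial, recorded with Biopython 1.64, calls `dna2.tomutable()` and `dna3.toseq()`; those methods appear to have been deprecated in later Biopython releases, so this sketch assumes the more portable route of constructing `MutableSeq` and `Seq` directly from text. The `str` and `list` fallbacks are assumptions so the sketch runs without Biopython.

```python
try:
    from Bio.Seq import Seq, MutableSeq
except ImportError:
    Seq, MutableSeq = str, list  # assumption: stand-ins when Biopython is missing

dna2 = Seq("ATGGCCTTGACCATG")  # hypothetical 15-base sequence

# Tutorial form on Biopython 1.64: dna3 = dna2.tomutable()
dna3 = MutableSeq(str(dna2))

dna3[5] = "A"        # replace the base at index 5 (C becomes A)
dna3[6:10] = "ATGC"  # replace the four bases at indices 6..9

# Tutorial form on Biopython 1.64: dna4 = dna3.toseq()
dna4 = Seq("".join(dna3))  # back to a read-only sequence

print(dna4)  # ATGGCAATGCCCATG
```

Keeping the edited result as a read-only `Seq` again guards against accidental changes later in the session.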

09:25 Let's summarize.

(To ngin ia khmih ia kiba ngi lah kdew haneng )

09:27 In this tutorial, we have learnt to: Generate a random DNA sequence

(Ha kane ka jingbatai (tutorial), ngi lah nang ban: Generate a random DNA sequence)

09:32 Slice a DNA sequence at specified locations

(Pynpra ia ka DNA sequence ha ki jaka ba lah buh.)

09:36 Join two sequences together to form a new sequence, that is, to concatenate.

(Pyndait lang ar tylli ki sequence ban ioh ia ka sequence ba thymmai, kata ka mut, ban lum lang .)

09:43 We have also learnt how to: use the len, count and find functions

(Ngi lah nang ruh kumno ban: use len, count bad find functions.)


09:49 convert a sequence object to a mutable sequence object and replace a base or part of the string.

(Ban pynkylla ia ka sequence object sha ka mutable sequence object bad ban bujli ia u base lane bynta jong u string.)

09:57 For the assignment, generate a random DNA sequence of 30 bases.

(Na ka bynta ka assignment, wad ia uno uno u DNA sequence uba 30 bases.)

10:02 Using Biopython tools, calculate the GC percentage and molecular weight of the sequence.

(Da kaba pyndonkam da u Biopython tools, khein ia ka GC percentage bad molecular weight jong ka sequence.)

10:09 Your completed assignment will be as follows.

(Ka assignment ba lah dep jong phi kan long kumne harum.)

10:13 The output shows the GC content as percentage.

(Ka output kan pyni ia ka GC content ha ka percentage.)

10:18 The output shows the molecular weight of the DNA sequence.

(Ka output kan pyni ia ka molecular weight jong ka DNA sequence.)
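The transcript does not show the assignment's commands, so the following is one possible sketch rather than the tutorial's own solution. It assumes a hand-written helper, `gc_percent` (a hypothetical name), because the GC helpers in `Bio.SeqUtils` have changed names between Biopython versions. `molecular_weight` from `Bio.SeqUtils` is used when Biopython is available and skipped otherwise.

```python
import random

def gc_percent(seq):
    """Return the percentage of G and C bases in seq (hypothetical helper)."""
    text = str(seq).upper()
    if not text:
        return 0.0
    return 100.0 * (text.count("G") + text.count("C")) / len(text)

# Random 30-base DNA sequence, kept as plain text for this sketch.
dna = ''.join(random.choice('ACGT') for _ in range(30))
print(dna)
print("GC percentage:", gc_percent(dna))

try:
    from Bio.SeqUtils import molecular_weight
    # seq_type is given explicitly because the input is a plain string.
    print("Molecular weight:", molecular_weight(dna, seq_type="DNA"))
except ImportError:
    print("Biopython not installed; molecular weight skipped")
```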

10:23 This video summarizes the Spoken Tutorial project.

(Kane ka video ka lum kyllum ia ka Spoken Tutorial project.)

10:26 If you do not have good bandwidth, you can download and watch it.

(Lada phim don ka bandwidth ba biang, phi lah ban download bad peit ia ka.)

10:30 We conduct workshops and give certificates.

(Ngi ju pynlong ia ki workshops bad sam ruh ia ki certificates.)

10:32 Please contact us.

(Sngewbha pyntip ia ngi)

10:35 Spoken-Tutorial project is supported by the National Mission on Education through ICT, MHRD, Government of India.

(Ia ka Spoken-Tutorial project la kyrshan da ka National Mission on Education through ICT, MHRD, Government of India.)

10:43 This is Snehalatha from IIT Bombay, signing off. Thank you for joining.

(Nga dei ka Snehalatha na IIT Bombay, signing off. Khublei shibun )

Contributors and Content Editors

Hezekiah2016, PoojaMoolya