Difference between revisions of "Biopython/C2/Blast/Khasi"

Revision as of 11:31, 29 May 2018

Time
Narration
00:01 Welcome to this tutorial on BLAST using Biopython tools.
00:06 In this tutorial, we will learn to run BLAST for a query sequence using Biopython tools
00:13 and to parse the BLAST output for further analysis.
00:17 To follow this tutorial, you should be familiar with undergraduate Biochemistry or Bioinformatics
00:24 and basic Python programming.
00:27 Refer to the Python tutorials at the given link.
00:31 To record this tutorial, I am using: Ubuntu Operating System version 14.10
00:37 Python version 2.7.8
00:41 IPython interpreter version 2.3.0
00:46 Biopython version 1.64 and a working Internet connection.
00:52 BLAST is the acronym for Basic Local Alignment Search Tool.
00:57 It is an algorithm for comparing sequence information.
01:02 The program compares nucleotide or protein sequences to sequences in databases and calculates the statistical significance of matches.
01:14 There are two different ways to run BLAST:
01:17 local BLAST on your machine, or BLAST over the Internet through the NCBI servers.
01:24 Running BLAST in Biopython has two steps.
01:28 First, run BLAST for your query sequence and get some output.
01:33 Second, parse the BLAST output for further analysis.
01:38 We will open the terminal and run BLAST for a nucleotide sequence.
01:43 Open the terminal by pressing the Ctrl, Alt and T keys simultaneously.
01:48 At the prompt, type: ipython and press Enter.
01:52 In this tutorial, I will demonstrate how to run BLAST over the Internet using the NCBI BLAST service.
02:01 Type the following at the prompt: from Bio.Blast import NCBIWWW. Press Enter.
02:14 Next, to run BLAST over the Internet, type the following at the prompt: result = NCBIWWW.qblast("blastn", "nt", "186429")

02:20 We will use the qblast function in the NCBIWWW module.
02:25 The qblast function takes three arguments:
02:29 The first argument is the BLAST program to use for the search.
02:33 The second specifies the database to search against.
02:38 The third argument is your query sequence.
02:43 The input for the query sequence can be a GI number or a FASTA file. It can also be a sequence record object.
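
The three-argument call described above can be wrapped in a small helper. This is a minimal sketch, assuming a Biopython release that still ships Bio.Blast.NCBIWWW; the helper name run_remote_blast and its default values are illustrative and do not come from the tutorial.

```python
def run_remote_blast(query, program="blastn", database="nt"):
    """Send `query` to the NCBI BLAST service and return a handle to the XML results.

    Needs Biopython and a working Internet connection; the search can take minutes.
    """
    # Imported inside the function so that defining it does not require Biopython.
    from Bio.Blast import NCBIWWW

    # Positional arguments, in order: BLAST program, database, query sequence.
    return NCBIWWW.qblast(program, database, query)
```

Calling run_remote_blast("186429") would reproduce the call typed in the narration, with the GI number passed as a string.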

02:53 For this demonstration, I am using the GI number of a nucleotide sequence.
02:58 The GI number is for a nucleotide sequence of insulin.
03:03 The qblast function also takes a number of other optional arguments.
03:09 These arguments are analogous to the different parameters you can set on the BLAST web page.
03:15 The qblast function will return the BLAST results in XML format.

03:20 Back to the terminal.
03:22 We have to use the appropriate BLAST program,
03:25 depending on whether our query sequence is a nucleotide or protein sequence.
03:30 Since our query is a nucleotide sequence, we will use the blastn program; "nt" refers to the nucleotide database.
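
The choice of program and database can be written down as a small lookup. This is only a sketch: the nucleotide pair is the one named here, the protein pair is the one used in the assignment at the end of this tutorial, and the names BLAST_SETTINGS and choose_blast_settings are illustrative.

```python
# Program/database pairs named in this tutorial; the NCBI BLAST site lists others.
BLAST_SETTINGS = {
    "nucleotide": ("blastn", "nt"),
    "protein": ("blastp", "nr"),
}


def choose_blast_settings(query_type):
    """Return the (program, database) pair for a 'nucleotide' or 'protein' query."""
    try:
        return BLAST_SETTINGS[query_type]
    except KeyError:
        raise ValueError("query_type must be 'nucleotide' or 'protein'") from None
```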

03:39 Details about this are available at the NCBI BLAST webpage.
03:45 The BLAST output is stored in the variable result, in XML form.
03:51 Press Enter.
03:53 Depending on the speed of your Internet connection, it may take a few minutes to complete the BLAST search.
03:59 It is important to save the XML file on the disk before processing further.
04:05 Type the following lines to save the XML file.
04:09 These lines of code will save the search result as blast.xml in the home folder.
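
The lines typed at this step are not reproduced in the script. A minimal sketch of what saving the result could look like is given below, assuming the handle returned by qblast yields text and that a relative file name is resolved against the folder where IPython was started (the home folder in the recording). The helper name save_blast_xml is illustrative.

```python
def save_blast_xml(result_handle, path="blast.xml"):
    """Write everything readable from `result_handle` to `path`, then close the handle."""
    with open(path, "w") as out_file:
        # The handle can be read only once, so its content is stored on disk here.
        out_file.write(result_handle.read())
    result_handle.close()
    return path
```

Saving first means the XML can be parsed as often as needed without repeating the search.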

04:18 Navigate to your home folder and locate the file.
04:21 Click on the file and check its contents.
04:30 Use the code shown in this text file if you want to use a FASTA file as a query.
04:36 Here is the code if you want to use a sequence record object from a FASTA file as a query.
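
The text files shown on screen are not part of this script. The two variants might look roughly like the sketch below, assuming a FASTA file holding a single nucleotide sequence; the function names and the fasta_path parameter are hypothetical.

```python
def blast_fasta_file(fasta_path, program="blastn", database="nt"):
    """Use the raw text of a FASTA file as the query."""
    from Bio.Blast import NCBIWWW  # needs Biopython and an Internet connection

    with open(fasta_path) as fasta_file:
        fasta_text = fasta_file.read()
    return NCBIWWW.qblast(program, database, fasta_text)


def blast_seq_record(fasta_path, program="blastn", database="nt"):
    """Use a sequence record object read from a FASTA file as the query."""
    from Bio import SeqIO
    from Bio.Blast import NCBIWWW

    # SeqIO.read expects the file to contain exactly one record.
    record = SeqIO.read(fasta_path, "fasta")
    # Passing the record in FASTA format keeps its identifier in the query.
    return NCBIWWW.qblast(program, database, record.format("fasta"))
```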

04:42 Back to the terminal.
04:44 The next step is to parse the file to extract data.
04:48 The first step in parsing is to open the XML file for input.
04:53 Type the following at the prompt. Press Enter.
04:57 Next, import the module NCBIXML from the "Bio.Blast" package.
05:05 Press Enter.
05:07 Type the following lines to parse the BLAST output.
05:11 A BLAST record contains all the information you want to extract from the BLAST output.
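
The open, import and parse steps can be sketched together as follows. This assumes the search was saved as blast.xml, that it covers a single query, and that the installed Biopython still provides Bio.Blast.NCBIXML; the helper name read_blast_record is illustrative.

```python
def read_blast_record(xml_path="blast.xml"):
    """Open a saved BLAST XML file and return its single BLAST record."""
    from Bio.Blast import NCBIXML  # imported here so the definition does not need Biopython

    with open(xml_path) as result_handle:
        # NCBIXML.read expects results for exactly one query;
        # NCBIXML.parse would iterate over the records of a multi-query file.
        return NCBIXML.read(result_handle)
```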

05:18 Let us print some information about all the hits in our BLAST report that pass a particular significance threshold.
05:27 Type the following code.
05:30 For a match to be significant, the expect value (E-value) should be less than 0.01.
05:37 For each HSP, that is, high-scoring pair, we get the title, length, HSP score, gaps and expect value.
05:49 We will also print strings containing the query, the aligned database sequence and a string specifying the match and mismatch positions.
06:02 Press the Enter key twice to get the output.
06:05 Observe the output.
06:09 For each alignment, we have the length, score, gaps, E-value and strings.
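
The code typed for this step is not included in the script. A sketch of the filtering and printing described above is given below, assuming a record object with the alignments, hsps, expect, score, gaps, query, match and sbjct attributes that Biopython's BLAST records provide. The function names and the 75-character truncation are illustrative choices.

```python
E_VALUE_THRESHOLD = 0.01  # matches with a smaller expect value are treated as significant


def significant_hsps(blast_record, threshold=E_VALUE_THRESHOLD):
    """Yield (alignment, hsp) pairs whose expect value is below `threshold`."""
    for alignment in blast_record.alignments:
        for hsp in alignment.hsps:
            if hsp.expect < threshold:
                yield alignment, hsp


def print_hits(blast_record, threshold=E_VALUE_THRESHOLD):
    """Print a short report for every significant high-scoring pair."""
    for alignment, hsp in significant_hsps(blast_record, threshold):
        print("sequence:", alignment.title)
        print("length:", alignment.length)
        print("score:", hsp.score)
        print("gaps:", hsp.gaps)
        print("e-value:", hsp.expect)
        # Query, match line and database (subject) sequence, truncated for display.
        print(hsp.query[:75])
        print(hsp.match[:75])
        print(hsp.sbjct[:75])
```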

06:16 You can extract the required information using other functions available in the Bio.Blast package.
06:24 We have come to the end of this tutorial.
06:26 Let's summarize. In this tutorial, we have learnt to run BLAST for a query nucleotide sequence using its GI number
06:36 and to parse the BLAST output using the Bio.Blast.Record module.
06:43 For the assignment, run a BLAST search for a protein sequence of your choice.
06:50 Save the output file and parse the data contained in the file.
06:55 Your completed assignment should have the following lines of code, as shown in this file.
07:01 Observe the code. Since our query is a protein sequence, we have used the blastp program and "nr", that is, the non-redundant protein database, for the BLAST search.

07:16 The video at the following link summarizes the Spoken Tutorial project.
07:20 Please download and watch it.
07:22 The Spoken Tutorial Project team conducts workshops and gives certificates to those who pass an online test.
07:30 For more details, please write to us.
07:33 The Spoken Tutorial Project is funded by NMEICT, MHRD, Government of India.
07:40 More information on this mission is available at the link shown.
07:45 This is Snehalatha from IIT Bombay signing off. Thank you for joining.

Contributors and Content Editors

Hezekiah2016