Biopython/C2/Parsing-Data/Khasi

From Script | Spoken-Tutorial
Time
Narration
00:01 Hello everyone. Welcome to this tutorial on Parsing Data.
00:06 In this tutorial, we will learn how to download FASTA and GenBank files from the NCBI database website.
00:14 And to parse the data files using functions in the Sequence Input/Output module.
00:19 To follow this tutorial, you should be familiar with undergraduate biochemistry or bioinformatics
00:26 and basic Python programming.
00:30 Refer to the Python tutorials at the given link.
00:34 To record this tutorial, I am using: Ubuntu OS version 14.10
00:40 Python version 2.7.8
00:44 IPython interpreter version 2.3.0
00:48 Biopython version 1.64 and Mozilla Firefox browser 35.0.
00:56 Scientific data in biology is generally stored in text files such as FASTA, GenBank, EMBL, Swiss-Prot and so on.
01:07 The data files can be downloaded from the database websites.
01:12 Open the website link given below in any web browser.
01:17 A web page opens.
01:19 Let us download FASTA and GenBank files for the human insulin gene.
01:25 In the search box, type: "human insulin", click on the Search button.
01:31 The web page shows many files for the human insulin gene.
01:35 For demonstration, I will choose 4 files with the name "Homo sapiens Insulin mRNA".
01:43 I will choose files with fewer than 500 base pairs.
01:48 Click on the check-box to select the file to download.
01:56 Bring the cursor to the "Send to" option, located at the top right corner of the page.
02:02 Click on the small selection button with a down arrow, next to the "Send to" button.
02:09 Under the heading "Choose destination", click on the File option.
02:13 You can save this file in any of the formats listed in the format drop-down list box.
02:21 Choose FASTA from the given options.
02:25 Then click on the Create file option.
02:29 A dialog box appears on the screen.
02:32 Select Open with, click on OK.
02:36 The file opens in a text editor.
02:39 The file shows 4 records, since we selected 4 files to download.
02:46 The first line in each record is an identifier line.
02:50 It starts with a "greater than" (>) symbol.
02:53 This is followed by the sequence.
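The record layout just described (an identifier line beginning with ">", followed by the sequence lines) can be illustrated with a minimal hand-written sketch. This is not how Biopython reads FASTA files; the function name `split_fasta` and the sample records are hypothetical and exist only to make the layout concrete.

```python
def split_fasta(text):
    """Return a list of (identifier_line, sequence) pairs from FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new identifier line closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines are joined into one string per record.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up sample data, not the sequences downloaded in the tutorial.
sample = """>demo1 hypothetical record one
ATGGCC
CTGTGG
>demo2 hypothetical record two
CTCCTG
"""

pairs = split_fasta(sample)
for header, sequence in pairs:
    print(header, len(sequence))
```

A file with 4 records would yield 4 such pairs, one per ">" line.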
02:56 Save the file in your home folder as "sequence.fasta".
03:01 Close the text editor.
03:03 Follow the same steps as above to download the files in GenBank format
03:08 for the files selected earlier.
03:12 Choose GenBank as the file format.
03:16 Create the file. Open it with the text editor.
03:21 Notice that the sequence file in GenBank format has more features than the FASTA file.
03:27 Save the file as "sequence.gb" in the home folder. Close the text editor.
03:34 For demonstration, we need a FASTA file with a single record.
03:39 For this, clear the previous selections by clicking again on the check-boxes.
03:48 Now, select the file "Human insulin gene complete cds".
03:54 Click on the check-box.
03:57 And follow the same steps as shown earlier to save the file in the home folder.
04:01 Save the file as "insulin.fasta".
04:08 Biological data stored in these files can be extracted and modified using Biopython libraries.
04:16 Close the text editor.
04:19 Extracting data from data files is called parsing.
04:23 Almost all file formats can be parsed using functions available in the SeqIO module.
04:30 The most commonly used functions of the SeqIO module are: parse, read, write and convert.
04:38 Open the terminal by pressing the Ctrl, Alt and T keys simultaneously.
04:44 Start IPython by typing "ipython" at the prompt. Press Enter.
04:51 Next, import the "SeqIO" module from the Bio package.
04:56 At the prompt, type: from Bio import SeqIO. Press Enter.
05:04 We will start with the most important function, "parse".
05:07 For demonstration, I will use the FASTA file with multiple records that we downloaded earlier from the database.
05:17 For simple FASTA parsing, type the following at the prompt.
05:22 Here, we have used the parse function to read the contents of the sequence.fasta file.
05:30 For the output, print the record id, the sequence present in the record and also the length of the sequence.
05:41 Note also that the parse function is used to read sequence data as Sequence record objects.
05:48 It is generally used with a for loop.
05:52 It accepts two arguments; the first is the name of the file to read the data from.
05:59 The second specifies the format of the file.
06:02 Press the Enter key twice to get the output.
06:07 The output shows the identifier line, followed by the sequence present in the file, and also the length of the sequence for all the records in the file.
06:21 Notice that the FASTA format does not specify the alphabet.
06:26 So the output does not identify it as a DNA sequence.
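The on-screen code is not part of this script, so the following is only a sketch of the parse loop the narration describes, written for Python 3 and a recent Biopython rather than the Python 2.7.8 and Biopython 1.64 used in the recording. The two FASTA records are made up and held in memory so the example is self-contained; `SeqIO.parse` accepts an open handle as well as a file name.

```python
from io import StringIO

from Bio import SeqIO

# Made-up FASTA text standing in for sequence.fasta (hypothetical data).
fasta_text = """>demo1 hypothetical record one
ATGGCCCTGTGGATGCGC
>demo2 hypothetical record two
CTCCTGCCCCTG
"""

summaries = []
# parse() takes the input (file name or handle) and the format name,
# and yields one SeqRecord object per record.
for record in SeqIO.parse(StringIO(fasta_text), "fasta"):
    print(record.id)        # identifier taken from the ">" line
    print(repr(record.seq))  # the sequence as a Seq object
    print(len(record))      # length of the sequence
    summaries.append((record.id, str(record.seq), len(record)))
```

With the downloaded file, the first argument would be the file name, for example `SeqIO.parse("sequence.fasta", "fasta")`. Biopython releases from 1.78 onward no longer attach an alphabet to sequences at all, so the printed Seq object looks the same for every format.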
06:31 The same steps can be repeated to parse the GenBank file.
06:36 For demonstration, we will use the GenBank file that we downloaded earlier from the database.
06:43 Press the up-arrow key to get the lines of code we used earlier.
06:49 Change the file name to sequence.gb.
06:53 Change the file format to genbank.
06:56 The rest of the code remains the same.
06:58 Press the Enter key twice to get the output.
07:03 Here again, the output shows the record id, the sequence and the length of the sequence for all the records in the file.
07:12 Notice that the GenBank format specifies the sequence as a DNA sequence.
07:19 Similarly, Swiss-Prot and EMBL files can be parsed using the same code as above.
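As a rough illustration of the same loop applied to GenBank data, the sketch below first writes a made-up record to an in-memory buffer in GenBank format and then parses it back. The record, its id and the buffer are hypothetical, and the code assumes Python 3 with Biopython 1.78 or later, where the molecule type is stored in `record.annotations["molecule_type"]` instead of an alphabet.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Hypothetical record, used only to produce GenBank-formatted text in memory.
demo = SeqRecord(
    Seq("ATGGCCCTGTGGATGCGC"),
    id="DEMO0001.1",
    name="DEMO0001",
    description="hypothetical demo record",
    annotations={"molecule_type": "DNA"},
)
buffer = StringIO()
SeqIO.write(demo, buffer, "genbank")
buffer.seek(0)  # rewind so the buffer can be read from the start

gb_summaries = []
# Only the input and the format name differ from the FASTA loop.
for record in SeqIO.parse(buffer, "genbank"):
    print(record.id)
    print(repr(record.seq))
    print(len(record))
    molecule_type = record.annotations.get("molecule_type")
    print(molecule_type)
    gb_summaries.append((len(record), molecule_type))
```

With the downloaded file, the loop would read `SeqIO.parse("sequence.gb", "genbank")` and the write step would not be needed.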
07:27 If your file contains a single record, type the following lines to parse it.
07:34 Here, as an example, we will use the FASTA file saved earlier that has a single record, that is, insulin.fasta.
07:43 Notice that we have used the read function in place of the parse function. Press Enter.
07:50 The output shows the contents of the file insulin.fasta.
07:55 It shows the sequence as a sequence record object.
07:59 And other attributes such as GI, accession number and description.
08:06 We can also view the individual attributes of this record as follows.
08:11 At the prompt, type: record dot seq. Press Enter.
08:18 The output shows the sequence present in the file.
08:22 To view the identifiers of this record, type: record dot id. Press Enter.
08:29 The output shows the GI number, the accession number and so on.
08:34 You can use the functions described above to parse the data files of your choice.
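A minimal sketch of the single-record case, assuming Python 3 and a recent Biopython. The FASTA text is made up and held in memory in place of insulin.fasta, so the identifier here is a placeholder and not a real GI or accession number.

```python
from io import StringIO

from Bio import SeqIO

# Made-up single-record FASTA text standing in for insulin.fasta.
fasta_text = ">demo1 hypothetical single record\nATGGCCCTGTGG\n"

# read() returns exactly one SeqRecord instead of an iterator.
record = SeqIO.read(StringIO(fasta_text), "fasta")
print(record)              # summary of the whole record
print(record.seq)          # the sequence only
print(record.id)           # the identifier only
print(record.description)  # the full identifier line without ">"

# read() raises ValueError when the input holds more than one record,
# which is why parse() is used for multi-record files.
two_records = fasta_text + ">demo2 hypothetical second record\nCTCCTG\n"
try:
    SeqIO.read(StringIO(two_records), "fasta")
    multi_error = None
except ValueError as err:
    multi_error = str(err)
print(multi_error)
```

With the downloaded file, the call would be `SeqIO.read("insulin.fasta", "fasta")`.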
08:40 Now let us summarize.
08:42 In this tutorial, we have learnt to: download FASTA and GenBank files from the NCBI database website, and use the parse and read functions from the SeqIO module
08:55 to extract data such as record ids, descriptions and sequences from FASTA and GenBank files.
09:03 Now, for the assignment:
09:06 Download FASTA files for a nucleotide sequence of your choice from the NCBI database.
09:13 Convert the sequences in the file into their reverse complements.
09:17 Your completed assignment should have the following lines of code.
09:22 Use the parse function to load the nucleotide sequences from the FASTA file.
09:28 Next, print the reverse complements using the Sequence object's built-in reverse complement method.
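One possible shape for the assignment is sketched below; it is an illustration under stated assumptions, not the tutorial's own solution, which is not included in this script. It assumes Python 3 and a recent Biopython, and uses made-up in-memory FASTA text in place of a downloaded file.

```python
from io import StringIO

from Bio import SeqIO

# Made-up nucleotide records standing in for a downloaded FASTA file.
fasta_text = ">demo1 hypothetical\nATGGCC\n>demo2 hypothetical\nAACCGGTT\n"

reverse_complements = {}
for record in SeqIO.parse(StringIO(fasta_text), "fasta"):
    # Seq objects provide reverse_complement() as a built-in method.
    rc = record.seq.reverse_complement()
    print(record.id, rc)
    reverse_complements[record.id] = str(rc)
```

For a real file, the in-memory handle would be replaced by the file name.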
09:37 The video at the link below summarizes the Spoken Tutorial project.
09:42 Please download and watch it.
09:44 The Spoken Tutorial Project team conducts workshops and gives certificates to those who pass the online test.
09:51 For more details, please write to us.
09:55 The Spoken Tutorial Project is funded by NMEICT, MHRD, Government of India.
10:01 More information on this mission is available at the link below.
10:06 This script was translated into Khasi by Yuwanki Kharlukhi from Shillong, and this is Hezekiah Lyngdoh signing off. Thank you for joining.

Contributors and Content Editors

Hezekiah2016