Difference between revisions of "Python for Biologists/C2/Introduction-to-Python-for-Biologists/English"

From Script | Spoken-Tutorial
 
(4 intermediate revisions by one other user not shown)
Line 54: Line 54:
 
You can also refer to '''Spoken Tutorials''' on '''Python''' for better understanding of this tutorial.  
 
You can also refer to '''Spoken Tutorials''' on '''Python''' for better understanding of this tutorial.  
  
These are available at '''www.spoken-tutorial.org'''
+
These are available at the given link.
  
 
|-
 
|-
Line 66: Line 66:
 
* It has built-in libraries for common tasks.
 
* It has built-in libraries for common tasks.
 
* We can manipulate DNA and protein sequences easily.
 
* We can manipulate DNA and protein sequences easily.
 
  
  
Line 74: Line 73:
 
'''Why Python for biologists?'''
 
'''Why Python for biologists?'''
 
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"|  
 
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"|  
* It has a large user base as it is commonly used in bioinformatics.  
+
* It has a large user base as it is commonly used in '''bioinformatics'''.  
* Listed here are examples of few bioinformatic tools in Python:  
+
* Listed here are a few examples of '''bioinformatics''' tools in '''Python''':
  
 
'''Biopython, Modeller, chemopy, BLASTorage, Pymol'''  
 
'''Biopython, Modeller, chemopy, BLASTorage, Pymol'''  
  
For more information, refer the given website :  
+
For more information, refer to the given website:
  
 
[http://pythonforbiologists.com/ http://pythonforbiologists.com]  
 
[http://pythonforbiologists.com/ http://pythonforbiologists.com]  
Line 88: Line 87:
 
'''Installation'''  
 
'''Installation'''  
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"|  
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"|  
* '''Python''' comes installed, by default on Ubuntu.  
+
* '''Python''' comes installed by default on '''Ubuntu'''.
* IPython is an interactive terminal for Python  
+
* '''IPython''' is an '''interactive terminal''' for '''Python'''
* To install '''Python '''on '''Windows, Mac OS '''and '''Android '''devices, visit the given link<br/> '''www.python.org'''
+
* To install '''Python '''on '''Windows, Mac OS '''and '''Android '''devices, visit the given link
 
+
  
  
 
|-
 
|-
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Open terminal by pressing '''Ctrl+Alt+T''' at the same time.  
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Open terminal by pressing '''Ctrl+Alt+T''' at the same time.  
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Open the terminal by pressing '''Ctrl+Alt+T''' simultaneously.  
+
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Open the '''terminal''' by pressing '''Ctrl+Alt+T''' simultaneously.  
 
+
Python comes installed, by default on Ubuntu.
+
 
+
  
 +
'''Python''' comes installed by default on '''Ubuntu'''.
  
  
Line 109: Line 105:
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| In case you don't, then manually install the latest version of '''IPython''', by typing  
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| In case you don't, then manually install the latest version of '''IPython''', by typing  
  
'''sudo apt-get install ipython3'''  
+
'''sudo apt-get install ipython3''' and press '''Enter.'''  
 
+
and press '''Enter.'''  
+
  
Give root password if asked.  
+
Give '''root password''' if asked.  
  
 
|-
 
|-
Line 119: Line 113:
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Wait for a few minutes for the installation to complete.  
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Wait for a few minutes for the installation to complete.  
  
Note : '''Python3''' does not overwrite the default Python on the system  
+
Note: '''Python3''' does not overwrite the default '''Python''' on the system.
  
 
|-
 
|-
Line 127: Line 121:
 
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| To check whether '''ipython3''' is installed successfully on your system,  
 
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| To check whether '''ipython3''' is installed successfully on your system,  
  
Type '''ipython3 '''and press '''Enter.'''  
+
type '''ipython3 '''and press '''Enter.'''  
  
 
|-
 
|-
Line 133: Line 127:
  
 
Highlight the prompt
 
Highlight the prompt
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| You will see few lines of information on '''Python '''like, the version number etc.  
+
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| You will see a few lines of information on '''Python''', such as the version number.
  
You will also see the '''Ipython '''prompt on the''' '''terminal'''.'''
+
You will also see the '''IPython prompt''' on the '''terminal'''.
  
Prompt indicates that '''Ipython '''is installed successfully.
+
The '''prompt''' indicates that '''IPython''' is installed successfully.
  
 
|-
 
|-
Line 145: Line 139:
 
|-
 
|-
 
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Cursor on terminal  
 
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Cursor on terminal  
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| To begin with, we will store data, i.e '''DNA sequence '''in a variable called '''my_DNA.'''  
+
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| To begin with, we will store data, i.e., a '''DNA sequence''', in a variable called '''my_DNA.'''
  
 
|-
 
|-
Line 154: Line 148:
  
 
A '''string''' is data in the form of text.
 
A '''string''' is data in the form of text.
 +
 +
Let us go back to the '''terminal'''.
 +
 +
Let us first clear the '''terminal''' by typing '''clear''' and pressing '''Enter'''.
  
 
|-
 
|-
Line 163: Line 161:
  
 
Press '''Enter'''  
 
Press '''Enter'''  
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Let us go back to the terminal.
+
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"|Type,
  
Type,
+
'''my_DNA is equal to within double quotes ATGCGCAT.'''
  
'''my_DNA is equal to within double quotes ATGCGCAT.'''Press '''Enter'''.  
+
Press '''Enter'''.  
  
 
|-
 
|-
Line 175: Line 173:
 
|-
 
|-
 
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Highlight '''my_DNA'''  
 
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Highlight '''my_DNA'''  
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| For writing a code, we can use the variable name instead of the string itself.  
+
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| When writing code, we can use the variable name instead of the '''string''' itself.
  
 
|-
 
|-
Line 183: Line 181:
 
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| To print the DNA sequence,  
 
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| To print the DNA sequence,  
  
we will use '''print '''function.
+
we will use the '''print''' function.
  
 
For that, type,
 
For that, type,
  
'''print(my_DNA) '''and press '''Enter.'''  
+
'''print inside brackets my underscore DNA''' and press '''Enter.'''  
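
The two steps narrated so far, storing the sequence in a variable and printing it, can be sketched as follows. This is a minimal illustration that uses the variable name and sequence from the narration; it can be typed line by line at the '''IPython''' prompt.

```python
# Store the DNA sequence as a string in a variable.
my_DNA = "ATGCGCAT"

# Print the value held by the variable.
print(my_DNA)
```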
  
 
|-
 
|-
Line 202: Line 200:
 
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Press '''up '''arrow  
 
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Press '''up '''arrow  
  
Add '''\n '''and '''DNA '''after '''ATGCGCAT''' '''.'''
+
Add '''\n '''and '''DNA '''after '''ATGCGCAT'''.
  
 
'''my_DNA = "ATGCGCAT\nDNA"'''  
 
'''my_DNA = "ATGCGCAT\nDNA"'''  
Line 213: Line 211:
 
Let's edit this line.
 
Let's edit this line.
  
Type '''\n '''and '''DNA '''after the sequence''' '''within double quotes.  
+
Type '''\n DNA '''after the sequence within double quotes.  
  
 
Press '''Enter.'''  
 
Press '''Enter.'''  
Line 223: Line 221:
 
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Type,  
 
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Type,  
  
'''print(my_DNA) '''and press '''Enter.'''  
+
'''print inside brackets my underscore DNA''' and press '''Enter.'''  
 
+
 
+
  
  
Line 234: Line 230:
  
 
'''DNA'''  
 
'''DNA'''  
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| The output prints the sequence on two separate lines ,
+
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| The output prints the sequence on two separate lines.
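
As a small sketch of why the output spans two lines: the characters '''\n''' inside a string stand for a newline, so '''print''' starts a new line at that point. The example reuses the edited assignment shown in the visual cue.

```python
# "\n" is the newline character; everything after it starts on a new line.
my_DNA = "ATGCGCAT\nDNA"

# Prints ATGCGCAT on the first line and DNA on the second.
print(my_DNA)
```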
 
+
 
+
  
  
Line 246: Line 240:
  
 
* Using the example of a short protein sequence given
 
* Using the example of a short protein sequence given
 
+
* Print the sequence on a single line,  
* Print the sequence on a single line, and print the sequence on two separate lines.  
+
* And print the sequence on two separate lines.  
 
+
  
  
Line 256: Line 249:
  
 
|-
 
|-
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Slide 10
+
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| '''Slide 10'''
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Another useful built-in tool in '''Python''' is the '''len''' '''function'''.  
+
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Another useful built-in tool in '''Python''' is the '''len function'''.  
  
 
It is used to calculate the length of a string.  
 
It is used to calculate the length of a string.  
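
A minimal sketch of the '''len''' function, assuming the original eight-letter sequence without the added newline:

```python
# The DNA sequence used earlier in the tutorial.
my_DNA = "ATGCGCAT"

# len() returns the number of characters in the string.
print(len(my_DNA))  # prints 8
```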
  
 
|-
 
|-
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Let us go back to the terminal.
+
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Let us go back to the '''terminal'''.
  
 
Press up arrow key  
 
Press up arrow key  
Line 269: Line 262:
  
 
Press '''Enter'''  
 
Press '''Enter'''  
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Let us go back to the terminal.
+
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Let us go back to the '''terminal'''.
  
 
Press '''up '''arrow on the keyboard till we get this command on the '''terminal.'''
 
Press '''up '''arrow on the keyboard till we get this command on the '''terminal.'''
Line 299: Line 292:
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Another assignment for you  
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Another assignment for you  
  
* Calculate the length of the given '''DNA '''sequence `ATGGCATGCGC'
+
* Calculate the length of the given '''DNA '''sequence ''''ATGGCATGCGC''''
 
+
* And store the output in a variable.
* and Store the output in a variable.  
+
 
+
  
  
 
|-
 
|-
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Slide 12
+
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| '''Slide 12'''
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Many times in biochemistry, sequences are represented either in &nbsp;lowercase or uppercase &nbsp;alphabets.  
+
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Many times in '''biochemistry''', sequences are represented in either lowercase or uppercase letters.
 
+
 
+
 
+
  
 
|-
 
|-
Line 322: Line 310:
  
 
Press '''Enter.'''  
 
Press '''Enter.'''  
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| To convert the uppercase alphabets in a string to &nbsp;lowercase:
+
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| To convert the uppercase letters in a string to lowercase, we make use of the '''lower()''' method.
  
We make use of '''lower()''' method.
+
Let us go back to the '''terminal.'''
 
+
Let us go back to the terminal.
+
  
 
Type,  
 
Type,  
  
'''my_DNA=”ATGCGCAT”. &nbsp;'''Press '''Enter'''  
+
'''my_DNA="ATGCGCAT"''' and press '''Enter'''
 +
 
  
Then type, '''my_DNA'''.'''lower().'''
+
Then type, '''my_DNA dot lower open and close brackets''' .
  
 
In a method, we write,  
 
In a method, we write,  
Line 338: Line 325:
 
* The name of the variable first,  
 
* The name of the variable first,  
 
* followed by a period(.),  
 
* followed by a period(.),  
* then the name of the method  
+
* then the name of the method,
 
* then we open and close parentheses.  
 
* then we open and close parentheses.  
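
The pattern described above (variable name, period, method name, parentheses) can be sketched with the '''lower()''' call from the narration. Note that '''lower()''' returns a new string; the variable itself is left unchanged.

```python
# The DNA sequence in uppercase.
my_DNA = "ATGCGCAT"

# variable name, period, method name, then open and close parentheses
lowercase_DNA = my_DNA.lower()

print(lowercase_DNA)  # prints atgcgcat
print(my_DNA)         # the original string is unchanged: ATGCGCAT
```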
  
Line 345: Line 332:
 
|-
 
|-
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Highlight &nbsp;''''atgcgcat''''  
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Highlight &nbsp;''''atgcgcat''''  
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| The output shows the string in lowercase.  
+
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| The output shows the '''string''' in lowercase.  
  
 
|-
 
|-
Line 353: Line 340:
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| As an assignment,  
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| As an assignment,  
  
Using example of a short protein  
+
*Using example of a short protein sequence given
 
+
*Convert the sequence to uppercase  
sequence given
+
* Hint: Use '''upper() method.'''  
 
+
Convert the sequence to uppercase'''.'''
+
 
+
'''Hint: Use upper() method.'''  
+
  
 
|-
 
|-
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"|  
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"|  
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Let us go back to terminal again  
+
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Let us go back to '''terminal''' again.
  
 
|-
 
|-
Line 369: Line 352:
  
 
my_protein = <tt>"alspadkanl"</tt>  
 
my_protein = <tt>"alspadkanl"</tt>  
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Lets take an example of an amino acid sequence.
+
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Let's take an example of an '''amino acid''' sequence.
 
+
Store it in a variable called '''my_protein'''  
+
 
+
my_protein = <tt>"alspadkanl"</tt>
+
 
+
  
 +
Store it in a variable called '''my_protein'''.
  
 +
So type '''my_protein = <tt>"alspadkanl"</tt>''' and press '''Enter'''.
  
 
|-
 
|-
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Slide 14
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Slide 14
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| To find out the number of times an amino acid or a sequence of amino acids occurs in a string.
+
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| To find out the number of times an '''amino acid''' or a sequence of '''amino acids''' occurs in a '''string''', we make use of '''count function. '''
 
+
We make use of '''count '''function
+
 
+
 
+
  
  
Line 393: Line 369:
  
 
Press '''Enter'''
 
Press '''Enter'''
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Let us go back to the terminal.
+
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Let us go back to the '''terminal'''.
  
For example to know the number of times amino acid '''Alanine''' occurs in the string  
+
For example, to know the number of times '''amino acid Alanine''' occurs in the '''string''', type
  
Type
+
my_protein.count('a') [[my underscore protein dot count open and close brackets within single quotes a]]
 
+
my_protein.count ('a')  
+
  
 
Press '''Enter'''  
 
Press '''Enter'''  
Line 405: Line 379:
 
|-
 
|-
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Highlight 3
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Highlight 3
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| <tt>Output shows number 3.</tt>
+
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Output shows number 3.  
 
+
<tt>There are 3 '''Alanines''' in the string. </tt>
+
 
+
  
 +
There are 3 '''Alanines''' in the '''string.'''
  
  
Line 418: Line 390:
  
 
Press '''Enter'''
 
Press '''Enter'''
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Similarly to find number of''' Leucines''' in the string  
+
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Similarly, to find the number of '''Leucines''' in the '''string''', type
  
Type
+
my_protein.count('l') [[my underscore protein dot count open and close brackets within single quotes l]]
 
+
my_protein.count('l')  
+
  
 
Press '''Enter'''
 
Press '''Enter'''
 
 
 
  
 
|-
 
|-
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Highlight 2
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Highlight 2
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| We get an output as '''2''', there are '''2''' '''Leucines''' in the string.  
+
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| We get an output as '''2'''.
 +
 
 +
There are '''2 Leucines''' in the '''string'''.  
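
The two counts narrated above can be reproduced with the following short sketch, which uses the same protein sequence and the one-letter codes '''a''' (Alanine) and '''l''' (Leucine):

```python
# The amino acid sequence from the tutorial, in one-letter lowercase codes.
my_protein = "alspadkanl"

# count() returns how many times the given text occurs in the string.
print(my_protein.count('a'))  # prints 3 (Alanines)
print(my_protein.count('l'))  # prints 2 (Leucines)
```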
  
 
|-
 
|-
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Cursor on the terminal  
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Cursor on the terminal  
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Similarly we can use DNA or an RNA sequence as string to count the ocurrences of basepairs .  
+
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Similarly, we can use a '''DNA''' or an '''RNA''' sequence as a '''string''' to count the occurrences of '''basepairs'''.
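
As an illustration of the same idea on a nucleotide sequence, here is a small sketch that reuses the DNA sequence from earlier in the tutorial. The choice of what to count is only an example; '''count''' also accepts a short run of letters, in which case it counts non-overlapping occurrences.

```python
# The DNA sequence stored earlier in the tutorial.
my_DNA = "ATGCGCAT"

# Count a single letter.
print(my_DNA.count('A'))   # prints 2

# Count a two-letter run; occurrences are counted without overlap.
print(my_DNA.count('GC'))  # prints 2
```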
  
 
|-
 
|-
Line 444: Line 413:
  
  
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Let us summarize,
+
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Let us summarize.
  
 
In this tutorial we learnt:  
 
In this tutorial we learnt:  
  
 
* Installation of '''IPython Interpreter'''  
 
* Installation of '''IPython Interpreter'''  
 
+
* Storing data in variables using examples of '''DNA''' and '''Protein''' sequences.  
* Storing data in variables using  
+
 
+
examples of '''DNA''' and '''Protein'''  
+
 
+
sequences.  
+
 
+
 
+
 
* Printing a sequence on a single line and on two separate lines
 
* Printing a sequence on a single line and on two separate lines
 
 
  
 
|-
 
|-
Line 466: Line 426:
 
'''Summary'''  
 
'''Summary'''  
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"|  
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"|  
* Find the length of the string  
+
* Find the length of the '''string'''
 
+
* Change case of the '''string'''
* Change case of the string  
+
* Count the number of times a character appears in a '''string'''
 
+
* Count the number of times a character appears in a string  
+
 
+
 
+
  
 
|-
 
|-
Line 478: Line 434:
  
 
'''Assignment'''  
 
'''Assignment'''  
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Here is an assignment,  
+
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| As an assignment,  
 
+
Calculate GC content in the given DNA sequence.
+
 
+
'ATGGCATGCGC'
+
 
+
  
 +
*Calculate '''GC content''' in the given DNA sequence.
 +
*'ATGGCATGCGC'
  
  
Line 514: Line 467:
  
  
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Spoken Tutorial Project is supported by the NMEICT, MHRD, Government of India.  
+
| style="background-color:#ffffff;border:0.75pt solid #000001;padding-top:0.049cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| Spoken Tutorial Project is supported by NMEICT, MHRD, Government of India.  
  
 
More information on this Mission is available at this link.  
 
More information on this Mission is available at this link.  
Line 522: Line 475:
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| This script is contributed by Snehalatha and Trupti Kini.  
 
| style="background-color:#ffffff;border-top:none;border-bottom:0.75pt solid #000001;border-left:0.75pt solid #000001;border-right:0.75pt solid #000001;padding-top:0cm;padding-bottom:0.049cm;padding-left:0.191cm;padding-right:0.191cm;"| This script is contributed by Snehalatha and Trupti Kini.  
  
And this is Trupti Kini from '''IIT Bombay''' signing off.  
+
And this is Trupti Kini from IIT Bombay signing off.  
  
 
Thanks for joining.  
 
Thanks for joining.  
  
 
|}
 
|}

Latest revision as of 14:51, 13 August 2014

Title of script: Introduction to Python for Biologists

Author: Trupti Rajesh Kini & Snehalatha

Keywords: video tutorial, Python, DNA sequences, Protein sequences, Biologists


Visual Cue Narration
Slide 1 Welcome to the spoken-tutorial on Introduction to Python for Biologists.
Slide 2

Learning Objectives

In this tutorial we will learn,
  • Installation of Python/IPython interpreter.
  • Simple Python programs using examples of DNA and Protein sequences.


Slide 3

System Requirements


To record this tutorial, I am using
  • Ubuntu OS version 12.04
  • Python 3.2.3
  • IPython 0.12.1


Slide 4

Prerequisites


To practice this tutorial you should be familiar with,
  • Basic biochemistry

You can also refer to Spoken Tutorials on Python for better understanding of this tutorial.

These are available at the given link.

Slide 5

Why Python for biologists?

Some of the features of Python useful for biologists are as follows:
  • Python has many tools to write small programs that are useful in biology.
  • It has a consistent syntax.
  • It has built-in libraries for common tasks.
  • We can manipulate DNA and protein sequences easily.


Slide 6

Why Python for biologists?

  • It has a large user base as it is commonly used in bioinformatics.
  • Listed here are examples of a few bioinformatics tools in Python:

Biopython, Modeller, chemopy, BLASTorage, Pymol

For more information, refer to the given website:

http://pythonforbiologists.com

Slide 7

Installation

  • Python comes installed by default on Ubuntu.
  • IPython is an interactive terminal for Python.
  • To install Python on Windows, Mac OS and Android devices, visit the given link.


Open terminal by pressing Ctrl+Alt+T at the same time. Open the terminal by pressing Ctrl+Alt+T simultaneously.

Python comes installed by default on Ubuntu.


Type sudo apt-get install ipython3

and press Enter.

If IPython is not already installed, install the latest version manually by typing

sudo apt-get install ipython3 and press Enter.

Give root password if asked.

Cursor on the terminal. Wait for a few minutes for the installation to complete.

Note: Python 3 does not overwrite the default Python on the system.

Open the terminal

Type ipython3 and press Enter.

To check whether ipython3 is installed successfully on your system,

type ipython3 and press Enter.

Cursor on the terminal

Highlight the prompt

You will see a few lines of information about Python, such as the version number.

You will also see the IPython prompt on the terminal.

The prompt indicates that IPython is installed successfully.

Cursor on terminal Let's type a few simple Python commands with an example of a DNA sequence.
Cursor on terminal To begin with, we will store data, i.e. a DNA sequence, in a variable called my_DNA.
Slide 8

What is a string?

In Python, data such as protein and DNA sequences are stored as strings.

A string is data in the form of text.

Let us go back to the terminal.

Let us first clear the terminal by typing clear and pressing Enter.

Type in the terminal,

my_DNA = "ATGCGCAT"

Highlight my_DNA

Press Enter

Type,

my_DNA is equal to within double quotes ATGCGCAT.

Press Enter.

Highlight my_DNA This is called assigning a variable.
Highlight my_DNA When writing code, we can use the variable name instead of the string itself.
Type,

print(my_DNA) and press Enter

To print the DNA sequence,

we will use the print function.

For that, type,

print inside brackets my underscore DNA and press Enter.

Highlight the output,

ATGCGCAT

We get the sequence as output.
Cursor on the terminal. Now let us print the sequence on two separate lines.
Press up arrow

Add \n and DNA after ATGCGCAT.

my_DNA = "ATGCGCAT\nDNA"

Press Enter

Press the up arrow on the keyboard till we get this command on the terminal.

my_DNA = "ATGCGCAT"

Let's edit this line.

Type \nDNA after the sequence, inside the double quotes.

Press Enter.

Type,

print(my_DNA) and press Enter

Type,

print inside brackets my underscore DNA and press Enter.


Highlight the output

ATGCGCAT

DNA

The output prints the sequence on two separate lines.
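
The effect of \n can be sketched as follows. This is only an illustration of the step above; the variable name lines and the use of splitlines() are additions for this sketch and are not part of the recorded tutorial.

```python
# Store the DNA sequence, a newline escape (\n) and the label DNA in one string.
my_DNA = "ATGCGCAT\nDNA"

# print() turns \n into a line break, so the sequence and the label
# appear on two separate lines.
print(my_DNA)

# splitlines() (used here only for illustration) splits the string at the
# line break and shows the two parts as a list.
lines = my_DNA.splitlines()
print(lines)
```

Without \n in the string, print shows everything on a single line.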


Slide 9

Assignment

As an assignment,
  • Using the given example of a short protein sequence
  • Print the sequence on a single line,
  • And print the sequence on two separate lines.


Cursor on the terminal Let us now learn a few more functions and methods.
Slide 10 Another useful built-in tool in Python is the len function.

It is used to calculate the length of a string.

Let us go back to the terminal.

Press up arrow key

my_DNA = "ATGCGCAT"

Press Enter

Let us go back to the terminal.

Press the up arrow on the keyboard till we get this command on the terminal.

my_DNA = "ATGCGCAT"

Press Enter

Type:

len(my_DNA)

To find the length of the DNA sequence in a variable, type,

len within brackets my_DNA

Press Enter.

Cursor on the terminal The output on the screen shows the number 8.

This is the length of the DNA sequence stored in the variable my_DNA.
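
A minimal sketch of the same step, assuming the sequence from this tutorial. The variable name dna_length is hypothetical and chosen only for this example.

```python
# The DNA sequence used in this tutorial.
my_DNA = "ATGCGCAT"

# len() returns the number of characters in the string.
# dna_length is a hypothetical variable name used to keep the result.
dna_length = len(my_DNA)

print(dna_length)  # prints 8
```

Keeping the result in a variable lets us reuse the length later without calling len again.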

Slide 11

Assignment

Another assignment for you
  • Calculate the length of the given DNA sequence 'ATGGCATGCGC'
  • And store the output in a variable.


Slide 12 In biochemistry, sequences are often written in either lowercase or uppercase letters.
Type,

my_DNA = "ATGCGCAT"

Press Enter

Type my_DNA.lower()

Press Enter.

To convert the uppercase letters in a string to lowercase, we make use of the lower() method.

Let us go back to the terminal.

Type,

my_DNA = "ATGCGCAT" and press Enter


Then type, my_DNA dot lower open and close brackets.

To call a method, we write,

  • The name of the variable first,
  • followed by a period (.),
  • then the name of the method,
  • then we open and close parentheses.

Press Enter.

Highlight  'atgcgcat' The output shows the string in lowercase.
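
The method call can be sketched as below. The variable name lower_DNA is hypothetical and is used only to show that lower() returns a new string and leaves the original unchanged.

```python
# The DNA sequence in uppercase letters.
my_DNA = "ATGCGCAT"

# variable name, period, method name, parentheses.
# lower_DNA is a hypothetical name for the converted copy.
lower_DNA = my_DNA.lower()

print(lower_DNA)  # prints atgcgcat
print(my_DNA)     # prints ATGCGCAT, the original string is not modified
```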
Slide 13

Assignment

As an assignment,
  • Using the given example of a short protein sequence
  • Convert the sequence to uppercase
  • Hint: Use upper() method.
Let us go back to terminal again.
Type

my_protein = "alspadkanl"

Let's take an example of an amino acid sequence.

Store it in a variable called my_protein.

So type my_protein = "alspadkanl" and press Enter.

Slide 14 To find out the number of times an amino acid or a sequence of amino acids occurs in a string, we make use of the count method.


Type

my_protein.count('a')

Press Enter

Let us go back to the terminal.

For example, to know the number of times the amino acid alanine occurs in the string, type

my_protein.count('a') my underscore protein dot count, open bracket, within single quotes a, close bracket

Press Enter

Highlight 3 The output shows the number 3.

There are 3 Alanines in the string.


Type

my_protein.count('l')

Press Enter

Similarly, to find the number of leucines in the string, type

my_protein.count('l') my underscore protein dot count, open bracket, within single quotes l, close bracket

Press Enter

Highlight 2 We get 2 as the output.

There are 2 Leucines in the string.
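
The two counts above can be reproduced with the short sketch below. The variable names alanine_count, leucine_count and al_count are hypothetical. The last line is an added illustration of counting a sequence of two amino acids, which is not shown in the recording.

```python
# The amino acid sequence used in this tutorial.
my_protein = "alspadkanl"

# count() returns how many times the given text occurs in the string.
alanine_count = my_protein.count('a')
leucine_count = my_protein.count('l')

print(alanine_count)  # prints 3
print(leucine_count)  # prints 2

# count() also accepts more than one character, for example
# alanine followed by leucine.
al_count = my_protein.count('al')
print(al_count)  # prints 1
```

Note that count is case-sensitive, so 'a' and 'A' are counted separately.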

Cursor on the terminal Similarly, we can use a DNA or an RNA sequence as a string to count the occurrences of bases.
Slide 15

Summary


Let us summarize.

In this tutorial we learnt:

  • Installation of IPython Interpreter
  • Storing data in variables using examples of DNA and Protein sequences.
  • Printing a sequence on a single line and on two separate lines
Slide 16

Summary

  • Find the length of the string
  • Change case of the string
  • Count the number of times a character appears in a string
Slide 17

Assignment

As an assignment,
  • Calculate GC content in the given DNA sequence.
  • 'ATGGCATGCGC'
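
As a hint, one possible approach combines the count method and the len function. The sketch below uses the earlier sequence ATGCGCAT rather than the assignment sequence, and assumes that GC content means the percentage of G and C bases in the sequence. The variable names g_count, c_count and gc_content are hypothetical.

```python
# The DNA sequence used earlier in this tutorial (not the assignment sequence).
my_DNA = "ATGCGCAT"

# Count the G and C bases separately.
g_count = my_DNA.count('G')
c_count = my_DNA.count('C')

# Assumed definition: GC content = (G + C) / total length, as a percentage.
gc_content = (g_count + c_count) / len(my_DNA) * 100

print(gc_content)  # prints 50.0
```

The same steps can be applied to the assignment sequence by changing the string stored in my_DNA.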


Slide 18

About Spoken Tutorial Project


The video available at the following link summarizes the Spoken Tutorial project. Please watch it.
Slide 19

About Spoken Tutorial workshops


The Spoken Tutorial Project Team conducts workshops and gives certificates to those who pass an online test.

For more details, please write to us.

Slide 20

Acknowledgement


Spoken Tutorial Project is supported by NMEICT, MHRD, Government of India.

More information on this Mission is available at this link.

This script is contributed by Snehalatha and Trupti Kini.

And this is Trupti Kini from IIT Bombay signing off.

Thanks for joining.

Contributors and Content Editors

Nancyvarkey, Snehalathak, Trupti