Python for Biologists/C2/Introduction-to-Python-for-Biologists/English


Title of script: Introduction to Python for Biologists

Author: Trupti Rajesh Kini & Snehalatha

Keywords: video tutorial, Python, DNA sequences, Protein sequences, Biologists


Visual Cue Narration
Slide 1 Welcome to the Spoken Tutorial on Introduction to Python for Biologists.
Slide 2

Learning Objectives

In this tutorial we will learn,
  • Installation of Python/IPython interpreter.
  • Simple Python programs using examples of DNA and Protein sequences.


Slide 3

System Requirements


To record this tutorial, I am using
  • Ubuntu OS version 12.04
  • Python 3.2.3
  • IPython 0.12.1


Slide 4

Prerequisites


To practice this tutorial you should be familiar with,
  • Basic biochemistry

You can also refer to Spoken Tutorials on Python for better understanding of this tutorial.

These are available at the given link.

Slide 5

Why Python for biologists?

Some of the features of Python useful for biologists are as follows:
  • Python has many tools to write small programs that are useful in biology.
  • It has a consistent syntax.
  • It has built-in libraries for common tasks.
  • We can manipulate DNA and protein sequences easily.


Slide 6

Why Python for biologists?

  • It has a large user base as it is commonly used in bioinformatics.
  • Listed here are examples of a few bioinformatics tools in Python:

Biopython, Modeller, chemopy, BLASTorage, PyMOL

For more information, refer to the given website:

http://pythonforbiologists.com

Slide 7

Installation

  • Python comes installed by default on Ubuntu.
  • IPython is an interactive terminal for Python.
  • To install Python on Windows, Mac OS and Android devices, visit the given link.


Open terminal by pressing Ctrl+Alt+T at the same time. Open the terminal by pressing Ctrl+Alt+T simultaneously.

Python comes installed by default on Ubuntu.


Type sudo apt-get install ipython3

and press Enter.

If IPython is not already installed, install it manually by typing

sudo apt-get install ipython3 and press Enter.

Enter your password if prompted.

Cursor on the terminal. Wait for a few minutes for the installation to complete.

Note: Python 3 does not overwrite the default Python on the system.

Open the terminal

Type ipython3 and press Enter.

To check whether ipython3 is installed successfully on your system,

type ipython3 and press Enter.

Cursor on the terminal

Highlight the prompt

You will see a few lines of information about Python, such as the version number.

You will also see the IPython prompt on the terminal.

The prompt indicates that IPython is installed successfully.

Cursor on terminal Let's type a few simple Python commands with an example of a DNA sequence.
Cursor on terminal To begin with, we will store data, that is, a DNA sequence, in a variable called my_DNA.
Slide 8

What is a string?

In Python, data such as protein and DNA sequences are called strings.

A string is data in the form of text.

Let us go back to the terminal.

Let us first clear the terminal by typing clear and pressing Enter.

Type in the terminal,

my_DNA = "ATGCGCAT"

Highlight my_DNA

Press Enter

Type,

my_DNA is equal to within double quotes ATGCGCAT.

Press Enter.

Highlight my_DNA This is called assigning a variable.
Highlight my_DNA When writing code, we can use the variable name instead of the string itself.
Type,

print(my_DNA) and press Enter

To print the DNA sequence, we will use the print function.

For that type,

print inside brackets my underscore DNA and press Enter.

Highlight the output,

ATGCGCAT

We get the sequence as output.
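
The two steps above, assigning a string to a variable and printing it, can be collected into one short sketch. The call to type() is not part of the original tutorial; it is added here only to show that the stored value is a string.

```python
# Assign the DNA sequence (a string) to the variable my_DNA.
my_DNA = "ATGCGCAT"

# The variable name can now be used in place of the string itself.
print(my_DNA)        # prints ATGCGCAT

# type() is an extra check (not in the tutorial) that the value is a string.
print(type(my_DNA))  # prints <class 'str'>
```

The quotes are needed only when writing the string in code; they are not part of the sequence and do not appear in the printed output.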
Cursor on the terminal. Now let us print the sequence on two separate lines.
Press up arrow

Add \n and DNA after ATGCGCAT.

my_DNA = "ATGCGCAT\nDNA"

Press Enter

Press the up arrow on the keyboard till we get this command on the terminal.

my_DNA = "ATGCGCAT"

Let's edit this line.

Type \n DNA after the sequence within double quotes.

Press Enter.

Type,

print(my_DNA) and press Enter

Type,

print inside brackets my underscore DNA and press Enter.


Highlight the output

ATGCGCAT

DNA

The output prints the sequence on two separate lines.
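
As a small illustration of the step above, the sketch below shows that \n is stored inside the string as a single newline character, and that print turns it into a line break. The use of repr() is an addition for illustration and is not part of the original tutorial.

```python
# \n inside the quotes stands for a newline character.
my_DNA = "ATGCGCAT\nDNA"

# print() shows the text on two separate lines.
print(my_DNA)

# repr() (an extra, not in the tutorial) shows the string as it is stored,
# with the newline still written as \n.
print(repr(my_DNA))  # prints 'ATGCGCAT\nDNA'
```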


Slide 9

Assignment

As an assignment,
  • Using the example of a short protein sequence given
  • Print the sequence on a single line,
  • And print the sequence on two separate lines.


Cursor on the terminal Let us now learn a few more functions and methods.
Slide 10 Another useful built-in tool in Python is the len function.

It is used to calculate the length of a string.

Let us go back to the terminal.

Press up arrow key

my_DNA = "ATGCGCAT"

Press Enter

Let us go back to the terminal.

Press the up arrow on the keyboard till we get this command on the terminal.

my_DNA = "ATGCGCAT"

Press Enter

Type:

len(my_DNA)

To find the length of the DNA sequence in a variable, type,

len within brackets my_DNA

Press Enter.

Cursor on the terminal The output on the screen shows the number 8.

This is the length of the DNA sequence stored in the variable my_DNA.
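
A minimal sketch of the len step is given below. The second variable, two_line_DNA, is a hypothetical name used only for illustration; it shows a point the tutorial does not cover explicitly, namely that len counts every character in the string, including a newline.

```python
my_DNA = "ATGCGCAT"

# len() returns the number of characters in the string.
print(len(my_DNA))  # prints 8

# two_line_DNA is a hypothetical name used only in this example.
# The newline character \n counts as one character: 8 + 1 + 3.
two_line_DNA = "ATGCGCAT\nDNA"
print(len(two_line_DNA))  # prints 12
```

So when a sequence has been read with line breaks in it, the value returned by len may be larger than the number of bases.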

Slide 11

Assignment

Another assignment for you
  • Calculate the length of the given DNA sequence 'ATGGCATGCGC'
  • And store the output in a variable.


Slide 12 Many times in biochemistry, sequences are represented in either lowercase or uppercase letters.
Type,

my_DNA = "ATGCGCAT"

Press Enter

Type my_DNA.lower()

Press Enter.

To convert the uppercase letters in a string to lowercase, we make use of the lower() method.

Let us go back to the terminal.

Type,

my_DNA = "ATGCGCAT" and press Enter


Then type, my_DNA dot lower open and close brackets.

In a method, we write,

  • The name of the variable first,
  • followed by a period(.),
  • then the name of the method,
  • then we open and close parentheses.

Press Enter.

Highlight  'atgcgcat' The output shows the string in lowercase.
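
The method call described above can be sketched as follows. The variable lower_DNA is a hypothetical name introduced here for illustration. The sketch also shows that lower() returns a new string and leaves the original variable unchanged.

```python
my_DNA = "ATGCGCAT"

# variable name, then a period, then the method name, then parentheses
lower_DNA = my_DNA.lower()  # lower_DNA is a hypothetical name

print(lower_DNA)  # prints atgcgcat
print(my_DNA)     # prints ATGCGCAT, the original string is not modified
```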
Slide 13

Assignment

As an assignment,
  • Using the example of a short protein sequence given
  • Convert the sequence to uppercase
  • Hint: Use upper() method.
Let us go back to terminal again.
Type

my_protein = "alspadkanl"

Let's take an example of an amino acid sequence.

Store it in a variable called my_protein.

So type my_protein = "alspadkanl" and press Enter.

Slide 14 To find out the number of times an amino acid or a sequence of amino acids occurs in a string, we make use of the count method.


Type

my_protein.count('a')

Press Enter

Let us go back to the terminal.

For example, to know the number of times amino acid Alanine occurs in the string, type

my_protein.count('a') my underscore protein dot count open and close brackets within single quotes a

Press Enter

Highlight 3 Output shows number 3.

There are 3 Alanines in the string.


Type

my_protein.count('l')

Press Enter

Similarly, to find the number of Leucines in the string, type

my_protein.count('l') my underscore protein dot count open and close brackets within single quotes l

Press Enter

Highlight 2 We get 2 as the output.

There are 2 Leucines in the string.

Cursor on the terminal Similarly, we can use a DNA or an RNA sequence as a string to count the occurrences of bases.
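
The counting steps above can be sketched in one place. The counts for 'a' and 'l' are the ones shown in the tutorial; the counts for the two-letter stretch 'an', for uppercase 'A', and for the DNA sequence are additional illustrations that are not part of the original script.

```python
my_protein = "alspadkanl"

# Count single amino acids (one-letter codes).
print(my_protein.count('a'))   # prints 3 (Alanine)
print(my_protein.count('l'))   # prints 2 (Leucine)

# count also works for a short stretch of amino acids.
print(my_protein.count('an'))  # prints 1

# Counting is case-sensitive: there is no uppercase A in this string.
print(my_protein.count('A'))   # prints 0

# The same method works on a DNA sequence.
my_DNA = "ATGCGCAT"
print(my_DNA.count('G'))       # prints 2
```

Because counting is case-sensitive, it may help to convert a sequence to a single case before counting.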
Slide 15

Summary


Let us summarize.

In this tutorial we learnt:

  • Installation of IPython Interpreter
  • Storing data in variables using examples of DNA and Protein sequences.
  • Printing a sequence on a single line and on two separate lines
Slide 16

Summary

  • Find the length of the string
  • Change case of the string
  • Count the number of times a character appears in a string
Slide 17

Assignment

As an assignment,
  • Calculate GC content in the given DNA sequence.
  • 'ATGGCATGCGC'


Slide 18

About Spoken Tutorial Project


The video available at the following link summarizes the Spoken Tutorial project. Please watch it.
Slide 19

About Spoken Tutorial workshops


The Spoken Tutorial Project Team conducts workshops and gives certificates to those who pass an online test.

For more details, please write to us.

Slide 20

Acknowledgement


Spoken Tutorial Project is supported by NMEICT, MHRD, Government of India.

More information on this Mission is available at this link.

This script is contributed by Snehalatha and Trupti Kini.

And this is Trupti Kini from IIT Bombay signing off.

Thanks for joining.

Contributors and Content Editors

Nancyvarkey, Snehalathak, Trupti