Difference between revisions of "Gromacs/C2/PDB-File-Format-and-Website/English"

From Script | Spoken-Tutorial

Latest revision as of 18:48, 21 October 2021

Visual Cue Narration
Slide Number 1

Title Slide

Welcome to the tutorial on PDB File Format and Website.
Slide Number 2

Learning Objectives

In this tutorial, we will learn,
  • About PDB file format for molecular structure
  • Protein Data Bank (PDB) website
  • Download a pdb file from PDB website
  • About FASTA file format
Slide Number 3

System and Software Requirement

To record this tutorial, I am using
  • Ubuntu Linux v20.04 OS
  • Firefox web browser v92
  • gedit v3.36.2
  • A working internet connection to access the PDB website
Slide Number 4

Pre-requisites

https://spoken-tutorial.org

To follow this tutorial,
  • Learner must be familiar with basic computer skills.
  • For pre-requisite tutorials, please visit this site.
Slide Number 5

File from downloaded tarball

Files used in this tutorial can be found in the
  • Downloaded and extracted tarball directory used for Gromacs installation
Open the File manager. Let’s open the File manager.

Go to the downloaded and extracted tarball directory for Gromacs installation.

Cursor on the files of the extracted Gromacs directory.

Go to src directory.

A few test PDB files are provided for testing purposes with the Gromacs installation.

They can be found in the src directory of the extracted tarball of Gromacs.

Navigate to testutils directory and then to simulationdatabase. Now, navigate to testutils directory and then to simulationdatabase.
Scroll down and cursor on the lysozyme.pdb file. Scroll down and notice the lysozyme.pdb file.
Open lysozyme.pdb in a text editor. Let’s open this file in a text editor.

This file is in the PDB file format and contains the structure of a truncated lysozyme.

Cursor on the file. The file gives the coordinates in xyz axes of each atom in a molecule.

It is in the PDB format, with atoms named according to the IUPAC convention.
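For learners who prefer a script to a text editor, the first few lines of the file can also be printed from Python. This is only a minimal sketch: the `gromacs_root` location is a hypothetical example, and `first_lines` is a helper written for this illustration, not part of Gromacs.

```python
from itertools import islice
from pathlib import Path


def first_lines(lines, n=5):
    """Return the first n lines from an iterable of lines, without line endings."""
    return [line.rstrip("\n") for line in islice(lines, n)]


# Hypothetical location of the extracted tarball; change it to match your system.
gromacs_root = Path.home() / "gromacs"
pdb_path = gromacs_root / "src" / "testutils" / "simulationdatabase" / "lysozyme.pdb"

# Only read the file if it is actually present at the assumed location.
if pdb_path.exists():
    with open(pdb_path) as handle:
        for line in first_lines(handle):
            print(line)
```

The directory names after `src` follow the navigation steps given above; only the root directory is assumed.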

Slide Number 6

Protein Data Bank

https://www.rcsb.org/

  • Protein Data Bank is a repository of 3D structural data of biomolecules
  • The website link is shown here.
  • The structures are determined by X-ray diffraction, NMR, and electron microscopy
Slide Number 7

Protein Data Bank

Atomic resolution structures for
  • Proteins, DNA and RNA
  • Carbohydrates and lipids
  • Several ligand molecules

are available here

Cursor on the first 3 lines. The first few lines on the top are comments, giving details about the file.
Cursor on MODEL. The MODEL record indicates that the first set of coordinates is listed below it.

This happens when more than a single coordinate set is selected to represent the molecule.

Cursor on page. Learners can refer to structures determined by the NMR method to know more.

In NMR, an ensemble of structures is selected to represent the molecule.
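To check how many coordinate sets a file holds, the MODEL records can be counted. The sketch below assumes the file has already been read into a list of lines; `count_models` is a name chosen for this illustration, and the example lines are made up with the ATOM records left out.

```python
def count_models(lines):
    """Count MODEL records in a PDB file given as an iterable of lines."""
    return sum(1 for line in lines if line.startswith("MODEL"))


# Made-up skeleton of a two-model file; the ATOM records are omitted here.
example = [
    "MODEL        1",
    "ENDMDL",
    "MODEL        2",
    "ENDMDL",
]

print(count_models(example))
```

A file that has no MODEL record at all gives 0 here, which should be read as a single, unlabelled coordinate set.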

Cursor on the ATOM lines. Notice the several columns below MODEL.

Each row gives details of each atom in the biomolecule.

Cursor on the first and second column. The first column shows the record type, which here is ATOM.

The second column incrementally counts each atom in the molecule.

Cursor on the third and fourth column. Third column gives the atom name.

The next column shows the amino acid residue.

Cursor on 5th and 6th column. Fifth column is the chain (subunit) identifier.

Sixth column is the residue position number of the atom in the biomolecule.

Cursor on 7, 8, and 9 columns. The next three columns are the x, y, and z coordinates.

They are the position of the atom in the biomolecule in 3 dimensional space.

Cursor on the 10th column. The 10th column is the occupancy of the atom at that position.
Cursor on 11th and 12th column. Then notice the temperature factor, also known as the B-factor.

The last column lists the element that occupies the position.
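The columns described above sit at fixed character positions, so a script is usually safer slicing by position than splitting on spaces. Here is a minimal sketch, assuming the standard fixed-width ATOM layout; `parse_atom_line` and `ATOM_TEMPLATE` are names chosen for this illustration, and the numbers in the sample record are made up.

```python
def parse_atom_line(line):
    """Split one ATOM/HETATM record into named fields using fixed column positions."""
    return {
        "record": line[0:6].strip(),      # columns 1-6: record type, e.g. ATOM
        "serial": int(line[6:11]),        # columns 7-11: atom serial number
        "name": line[12:16].strip(),      # columns 13-16: atom name
        "res_name": line[17:20].strip(),  # columns 18-20: residue name
        "chain": line[21],                # column 22: chain identifier
        "res_seq": int(line[22:26]),      # columns 23-26: residue number
        "x": float(line[30:38]),          # columns 31-38: x coordinate
        "y": float(line[38:46]),          # columns 39-46: y coordinate
        "z": float(line[46:54]),          # columns 47-54: z coordinate
        "occupancy": float(line[54:60]),  # columns 55-60: occupancy
        "b_factor": float(line[60:66]),   # columns 61-66: temperature factor
        "element": line[76:78].strip(),   # columns 77-78: element symbol
    }


# Template that writes a record in the same fixed-width layout.
ATOM_TEMPLATE = (
    "{:<6s}{:>5d} {:<4s} {:>3s} {:1s}{:>4d}" + " " * 4
    + "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}" + " " * 10 + "{:>2s}"
)

# Illustrative record only; the coordinates and B-factor are made up.
sample = ATOM_TEMPLATE.format(
    "ATOM", 1, " N", "LYS", "A", 1, 1.0, 2.0, -3.5, 1.0, 20.0, "N"
)

atom = parse_atom_line(sample)
print(atom["name"], atom["res_name"], atom["chain"], atom["res_seq"])
```

Slicing by position keeps working when a field is blank or when two numbers touch without a space between them, which a whitespace split would misread.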

Scroll down and show TER line in the file. Scroll to the end of the file and notice the TER written here.

This denotes the end of a chain, which here is also the end of the protein molecule.

Cursor on residue number 10. Notice that the data is truncated at the 10th residue in this document.

This file is for demonstration purposes only.

Cursor on the interface and close the text editor. Not all residues of 1AKI, which is the lysozyme entry, are present here.

The full-length deposition of lysozyme, 1AKI, has 129 amino acid residues.
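The number of residues in a file can be checked by counting the distinct residue positions among the ATOM records. This is a rough sketch that assumes the fixed-width layout of the ATOM lines; `count_residues` is a name chosen for this illustration.

```python
def count_residues(lines):
    """Count distinct residues among ATOM records.

    A residue is identified by its chain (column 22), residue number
    (columns 23-26) and insertion code (column 27).
    """
    seen = set()
    for line in lines:
        if line.startswith("ATOM"):
            seen.add((line[21:22], line[22:26], line[26:27]))
    return len(seen)
```

Passing the open test file to it, for example `count_residues(open(path))` with `path` pointing to lysozyme.pdb, should report the 10 residues seen in the truncated file. The full 1AKI deposition should give 129.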

Let’s consult the PDB site to know more.

Let’s close the text editor.

Open web browser and go to https://www.rcsb.org/ . Next, open a web browser.

Go to the PDB website as seen here.

Let’s learn to search and download coordinates from the PDB website.

Enter lysozyme in the search form on top and press Enter. In the search form on the top, let’s enter lysozyme.
Scroll down. Scroll down the page and notice the several search results for lysozyme.

Each molecular structure that is deposited in the PDB has a unique ID.

Scroll up and enter 1AKI in the search form. Now, go to the top of the page.

Click on Search, Basic Search.

Let’s search for the structure deposition ID 1AKI.

Show details of 1AKI. Now the structure deposition details of this protein ID appear.

The page opens in the Structure Summary tab.

Cursor on the page and scroll down. Scroll down and notice the detailed information on this structure.

It is that of egg white lysozyme.

Cursor on X-ray diffraction. Notice that it is determined by the X-ray diffraction method.
Click on 3D View tab.

Scroll to the top of the page.

Scroll to the top of the page and click on the 3D View tab.

Scroll down the page.

An interactive window is seen on the left with the protein structure.

Click, hold and move mouse, show protein in rotated view. Click, hold and move the mouse to rotate the protein in the window.
Cursor on primary sequence and click on 2-3 residues.

Move cursor to the protein to show the highlighted position.

On the top of the screen, notice the protein primary sequence.

Click on any of the residues and notice it highlighted in the structure.

Cursor on the protein structure. Currently, the secondary structure of the protein is also visible.
Cursor on red dot. All the red dots are water molecules.

Users may pause this video and explore all the details given on this PDB ID.

Scroll up the page. Let’s scroll up the page and download the molecule structure coordinate file.
Cursor on Display files and Download files. Notice the Display files and Download files pulldown towards the top right.
Click on the Display files. Click on the Display files and notice the FASTA Sequence option.
Click on FASTA. Click on it.
Cursor on the FASTA sequence tab. A new tab opens with the primary sequence of the protein in single letter code.

The first line is the header giving the ID and details about the sequence.
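The layout just described, a header line starting with `>` followed by sequence lines, is simple to read in a script. The sketch below is one possible way to do it; `parse_fasta` is a name chosen for this illustration, and the example record is made up rather than copied from the 1AKI entry.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up example record; the header text and sequence are for illustration only.
example = ">example_1 made-up header\nKVFGRCELAA\nMKRHGLDNYR\n"

for header, sequence in parse_fasta(example):
    print(header, len(sequence))
```

Sequence lines are joined together, so a sequence wrapped over several lines comes back as one string.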

Close the FASTA tab. I will close this tab.
Click on the PDB format option. Next click on the PDB format option in the Display files pulldown.
Cursor on the new tab with the PDB file. Again, a new tab opens with the PDB file.
Cursor on the header lines. The first several lines are headers which give details about the deposition.
Cursor on details on top. Notice the organism, name, authors, publication details and resolution.
Scroll down the page. Scroll down the page and notice more details.

Learner may pause this video and explore the PDB file in detail.
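While exploring the file, it can help to see which record types it contains and how many lines each one has. The sketch below assumes the record name occupies the first six characters of every line; `record_counts` is a name chosen for this illustration, and the example lines are shortened and made up.

```python
from collections import Counter


def record_counts(lines):
    """Count lines per record type, taken from the first six characters of each line."""
    return Counter(line[:6].strip() for line in lines if line.strip())


# Shortened, made-up lines that only show the shape of the input.
example = [
    "HEADER    EXAMPLE",
    "REMARK   2",
    "REMARK   3",
    "ATOM      1",
    "END",
]

for name, count in record_counts(example).most_common():
    print(name, count)
```

On a full deposition this gives a quick overview, for example how many REMARK, ATOM and HETATM lines the file has, before reading the header in detail.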

Cursor on SSBOND. Any disulfide bond present in the protein is also listed here.
Often small molecule ligand or more chains are present in the system of study.

If so, their details will also be visible in the header.

Cursor on the atomic coordinates. The coordinates for each atom in the unit structure are listed below.
Cursor on the atomic coordinates. Notice that there are no hydrogens listed here.

This is because the X-ray diffraction method does not detect hydrogens.

Cursor on 1AKI file opened in text editor. The 1AKI file, which we opened earlier, had hydrogen atoms added to the file.

The file also had a trimmed header.

Hydrogen atoms are added to the molecule afterwards as needed by the users.
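Whether a given file contains hydrogens can be checked from the element column. This is a minimal sketch, assuming the element symbol is present in columns 77-78 of each coordinate line; `count_hydrogens` is a name chosen for this illustration.

```python
def count_hydrogens(lines):
    """Count ATOM/HETATM records whose element column (77-78) is H."""
    count = 0
    for line in lines:
        is_coordinate = line.startswith(("ATOM", "HETATM"))
        if is_coordinate and line[76:78].strip().upper() == "H":
            count += 1
    return count
```

A result of 0 suggests that hydrogens still have to be added before simulation. Files written without the element column would also give 0, so the atom names should be checked in that case.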

Scroll down.

Cursor at TER.

Now, scroll almost to the end of the page.
Cursor on HETATM.

Point to the HOH atoms in the file.

Below the TER tag, there are many hetero atoms listed.

They are water molecules that are present in the protein crystal.

Show O element for HETATM. Notice that, here too, hydrogens will not be observed.
Show O element for HETATM. Often, these water molecules are removed from the PDB file, before simulation.
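Removing the crystal waters can be done with a short filter over the lines of the file. The sketch below assumes the waters carry the residue name HOH in columns 18-20, as seen in this file; `remove_waters` is a name chosen for this illustration, and other water names are not handled.

```python
def remove_waters(lines):
    """Drop ATOM/HETATM records whose residue name (columns 18-20) is HOH."""
    kept = []
    for line in lines:
        is_coordinate = line.startswith(("ATOM", "HETATM"))
        if is_coordinate and line[17:20] == "HOH":
            continue  # skip this water atom
        kept.append(line)
    return kept
```

All other lines are kept unchanged, so any bookkeeping records in the file are not recalculated by this sketch.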
Close the PDB details tab. Let’s close the PDB details tab.
Click on the Download Files pulldown. Now, click on the Download Files pulldown.
Choose the PDB format. Choose the PDB format.

A dialog box may or may not open prompting to save the file.

Choose Save File and click on Ok. If so, choose to save the file on the computer and click on Ok.

Allow the file download to complete.
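The same file can also be fetched from a script instead of the browser. This is only a sketch: it assumes that the RCSB file server offers entries at `https://files.rcsb.org/download/<ID>.pdb`, which is not shown in this tutorial, and `pdb_download_url` and `download_pdb` are names chosen for this illustration.

```python
import urllib.request
from pathlib import Path


def pdb_download_url(pdb_id):
    """Build the download address for a PDB ID (assumed URL pattern, see above)."""
    return "https://files.rcsb.org/download/{}.pdb".format(pdb_id.upper())


def download_pdb(pdb_id, dest_dir="."):
    """Download the PDB-format file for pdb_id into dest_dir and return its path."""
    target = Path(dest_dir) / "{}.pdb".format(pdb_id.upper())
    urllib.request.urlretrieve(pdb_download_url(pdb_id), str(target))
    return target
```

With a working internet connection, calling `download_pdb("1AKI")` should save `1AKI.pdb` in the current directory.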

Cursor on the PDB web page. Learner may pause this video and open the downloaded file in a text editor.

Notice that the file we viewed and the file we downloaded are the same.

Cursor on the PDB web page. Open the PDB file in VMD or another molecular viewer.

Notice the structure displayed.

Slide Number 8

Summary

Now let’s summarize. In this tutorial, we learned about,
  • The PDB file format
  • The PDB website
  • FASTA file format
  • Header details in the PDB file
  • How to download a pdb file
Slide Number 9

Assignment

For assignment activity, please do the following.
  • Go through the PDB website
  • Familiarize yourself with more details given for the PDB deposition.
  • Go through the header details of the file that is downloaded.
Slide Number 10

Assignment

Go through the given IDs from the PDB website for the enzyme DHFR and its complexes.
  • 2L28 for apo enzyme and multiple conformers
  • 1DIS for binary complex
  • 3DAT for ternary complex


Slide Number 11

Assignment

  • Notice the presence and absence of hydrogen atoms in the structure files
  • Familiarize yourself with
    • Conformational changes in the enzyme
    • Hydrogen bonding with ligands
Slide Number 12

Spoken Tutorial Project

This video summarises the Spoken Tutorial Project.

Please download and watch it.

Slide Number 13

Spoken Tutorial workshops

We conduct workshops using spoken tutorials and give certificates.

Please write to us.

Slide Number 14

Forum for questions

Post your timed queries in this forum.
Slide Number 15

Acknowledgment

Spoken Tutorial Project is funded by MoE, Government of India.
This is Rani from IIT Bombay. Thank you for joining.

Contributors and Content Editors

Ranipv076, Snehalathak