Difference between revisions of "Gromacs/C3/Production-Run-for-Lysozyme/English"

Latest revision as of 13:42, 6 June 2022

Visual Cue Narration
Slide Number 1

Title Slide

Welcome to the spoken tutorial on Production Run for Lysozyme.
Slide Number 2

Learning Objectives

In this tutorial, we will,
  • Complete a 1 nanosecond production run for lysozyme
  • Learn about timescales of motions
  • Note the time taken for simulation
  • Load the xtc file for trajectory
  • Calculate RMSD from trajectory
Slide Number 3

Learning Objectives

  • Save data to various output files
  • Align a frame in the trajectory with a reference molecule
Slide Number 4

System and Software Requirement

To record this tutorial, I am using
  • Ubuntu Linux v20.04 OS
  • Gromacs v2021.2
  • VMD 1.9.3
Slide Number 5

Pre-requisites

https://spoken-tutorial.org

To follow this tutorial,
  • The learner must be familiar with the basics of Gromacs and VMD.
  • For pre-requisite tutorials, please visit this site.
Slide Number 6

Code Files

  • Files used in this tutorial are provided in the code files link.
  • Please download and extract the files.
  • Make a copy and then use them while practising.
Show input files.

Show copied files in firstmd directory.

Several files are provided with this tutorial.


First, copy the mdp file to the working directory.

If needed, copy the required files from the previous step.

Open Ubuntu 20.04 LTS app (Windows) or terminal (Linux).

Type cd Documents/firstmd and press Enter.

Open a terminal and go to the working directory.
Type ls and press Enter. I will also list the files.
Cursor on npt.gro and npt.tpr. The structure and configuration files for MD simulation are npt.gro & npt.tpr.

At this point, the following has happened to the starting PDB file.

Show all the files in the folder. The protein is solvated, charge neutralized with ions and energy minimized.

Then, in the initial MD step, temperature and pressure were equilibrated.

This prepared the system for the Production run MD.

I have all the files from all the steps in this folder.

Cursor on md.mdp and open in a text editor. Open the mdp file provided for MD simulation in a text editor.
Cursor on the time. Here, the timescale of simulation is 1 ns.
This allows us to follow changes in the conformation of the protein.

In energy minimization and equilibration steps, the time duration was shorter.

Back to the text editor and close the mdp file. Often, step size and length of simulation are adjusted.

This file specifies half a million steps, which correspond to 1 ns.

Frequency with which structures are written to the trajectory is also adjustable.

Let’s close the mdp file.
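The relation between the number of steps, the time step and the simulated time can be checked with a little arithmetic. The Python sketch below uses only the two values quoted in the narration; the variable names are illustrative and are not mdp keywords.

```python
# Values taken from the narration: half a million steps cover 1 ns.
nsteps = 500_000
total_time_ps = 1000.0  # 1 ns expressed in picoseconds

# Time step implied by these two values.
dt_ps = total_time_ps / nsteps
dt_fs = dt_ps * 1000.0  # picoseconds to femtoseconds

print(f"time step: {dt_ps} ps ({dt_fs} fs)")
```

Doubling the number of steps at the same time step would double the simulated time.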

Slide Number 7

Timescale of Protein Motions


[Image: timescale of molecular motions]

This image shows the timescale of molecular motions we usually encounter.

The ns-ps time encompasses the lower end of protein motions for classical MD.

Slide Number 8

Timescale of Protein Motions

  • In microsecond or millisecond timescale, ligand binding or folding is studied.
  • These are computationally intensive processes
  • Use a high performance computation facility or a GPU for longer calculations
Slide Number 9

Timescale of Protein Motions

  • In microsecond or millisecond timescale, ligand binding or folding is studied.
  • These are computationally intensive processes
  • Use a high performance computation facility or a GPU for longer calculations
Slide Number 10

Timescale of Protein Motions

  • Use a high performance computation facility or a GPU for longer calculations
Go to the terminal. Go to the terminal.
Type, gmx grompp -f md.mdp -c npt.gro -t npt.cpt -p topol.top -o md-1.tpr and press Enter. Enter the command as seen here to assemble the configuration for the md run.

npt.gro is the starting structure for the production md run.

Cursor on md-1.tpr. The tpr file is the output file from this step.

The -r flag is used if we need to restrain the protein conformation.

This is often done when studying protein-ligand binding.

Here we will not use it.
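As a reading aid, the grompp command typed above can be broken into its flag and file pairs. The Python sketch below only assembles and prints the command string; it does not run Gromacs. The flag descriptions in the comments are brief summaries, and the variable names are illustrative.

```python
# Flag/file pairs from the grompp command used in this step.
grompp_args = [
    ("-f", "md.mdp"),     # run parameters
    ("-c", "npt.gro"),    # starting structure
    ("-t", "npt.cpt"),    # checkpoint from the NPT equilibration
    ("-p", "topol.top"),  # topology
    ("-o", "md-1.tpr"),   # output run input file
]

command = "gmx grompp " + " ".join(f"{flag} {name}" for flag, name in grompp_args)
print(command)
```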

Cursor on the message. While creating the file, a few messages are seen on the terminal.
Type ls and press Enter. Enter ls on the terminal to list the files.

Notice the output file md-1.tpr that is created.

This prepared the system for the production md.

Press Ctrl+L to clear the screen. As we did for the energy minimization step, we will enter the next command.
Type, gmx mdrun -v -deffnm md-1 and press Enter. Enter the mdrun command as seen to start the production run.

Usually, molecular dynamics refers to equilibration followed by production run.

Show the output on the terminal. This time, the verbose flag shows only a few details.

The number of steps and time needed to complete are seen on the screen.

Show the time taken for the process. The process will stop when it reaches 500 thousand steps, that is, 1 ns.

This process may take a day or so on many personal computers.

Press Ctrl+C. You may run this process later on.

Use the provided data for further analysis.

To abort the process, press Control and C keys together once.

In a few minutes Gromacs stops the process.

Show files in the file manager. Several files with md prefix are generated in the working directory.
Highlight and delete the files to be deleted. If you aborted the process, delete the incomplete files that were created.
If you have completed the process, do not delete the files.

A gro file will get created at the completion of the process.

Show files to be copied from provided file.

Copy them to the working directory.

Copy the provided files with the md-1 prefix to the working directory.
Type ls -ltr and press Enter. This step creates additional file types with cpt and xtc extensions.
Cursor on md-1.xtc file. The xtc file is a compressed trajectory file.


This file is useful in data analysis during long simulations.

Cursor on md-1.trr, md-1.log and md-1.edr files. The trr, log and edr files are also created in this process.
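The -deffnm option gives every output file of the run the same base name. The Python sketch below lists the file names this implies for the md-1 prefix, using only the extensions mentioned in this tutorial; it does not check whether the files exist.

```python
prefix = "md-1"  # value passed to -deffnm in the mdrun command

# Extensions named in this tutorial for the production run outputs.
extensions = ["log", "edr", "trr", "xtc", "cpt", "gro"]

expected_files = [f"{prefix}.{ext}" for ext in extensions]
for name in expected_files:
    print(name)
```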
Show the output files. In case of errors, the learner may refer to the log file for troubleshooting.
We have generated several files for further analysis.

Data analysis path varies depending on our aim or objective of simulation.

Slide Number 11

Data Analysis

A few examples of data analysis are,
  • Validation, visualization
  • Analysis of energetics, H-bonding
  • Secondary structure changes
  • Ligand binding, surface analysis
  • Analysis of subset of atoms in the system
Open a web browser.

Go to, https://manual.gromacs.org/documentation/2018/user-guide/cmdline.html .

Open a web browser.

Let’s go to the Gromacs manual site as seen here.

Cursor on the left panel. On the left frame notice the various categories of topics for analysis.

Atomic interactions, kinetics and PCA are also options.

These are useful to analyze flexible structures or molten globules.

Learner is encouraged to read further on this to know more.
Go to the terminal.

Press Ctrl+L.

Go to the terminal.

I will clear the terminal for clarity in the video.

Type, vmd md-1.gro md-1.xtc and press Enter. Let’s open VMD to view the trajectory from the 1ns long simulation.

This time we will use the xtc file instead of the trr file to load the trajectory.

Xtc is a compressed format of the simulation trajectory.

Windows users must first open VMD.

Then open the gro file, followed by the xtc trajectory file.

Click on Graphics, Representations.


Type, protein for Selected atoms and press Enter.

Hiding the water molecules from graphics display will be useful for clarity.
You may adjust the graphic representation of the protein as you desire.

I will change it to show as ribbons.

In the VMD terminal enter the command pbc box. Display the box, using the command as seen here in the vmd terminal.
Play the trajectory. Pause the video and play the trajectory.

Notice the xtc file has 1002 frames.

Later on, you may load the trr file and notice that it has only 101 frames.

Xtc format has precision of 3 decimal places compared to 6 decimals for trr.

Hence, even though xtc has 10 times more frames, the file sizes are comparable.

Stop playing the trajectory.

Close the Window.

Visualization alone is not sufficient to know the features of the data.

Hence, let’s do some data analysis to see the embedded features.

Click on Extensions, Analysis and choose RMSD Trajectory Tool. Click on Extensions, Analysis and choose RMSD Trajectory Tool.

An RMSD Trajectory Tool window opens.

Let’s calculate RMSD of the backbone atoms in the trajectory.
In Selection modifiers, check the box for Backbone. In Selection modifiers, check the box for Backbone.

Notice that, Trace or NH can also be selected for plotting.

Cursor over trajectory section. From the Trajectory section, we can also select a range of frames for analysis.

I will retain the defaults and check the boxes for Plot and Save.

Cursor on trajrmsd.dat. Notice that, the data will also get saved with the given filename.

You may change it if you desire.

I will leave it as is.

Click on md-1.gro in mol section. In the molecule section, click on md-1.gro file to select it.
Click on the RMSD.

Show the graph.

Click on the RMSD button and the graph is seen.

X axis is the frame number and y axis is the RMSD in the graph.
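For reference, the RMSD plotted here is the root of the mean squared distance between matching atoms in a frame and the reference. The Python sketch below applies that definition to two made-up sets of coordinates; it does not reproduce VMD's own implementation and skips the alignment step.

```python
import math

def rmsd(reference, frame):
    """Root mean square deviation between two equally sized lists of (x, y, z)."""
    if len(reference) != len(frame):
        raise ValueError("both structures need the same number of atoms")
    squared = [
        (xr - xf) ** 2 + (yr - yf) ** 2 + (zr - zf) ** 2
        for (xr, yr, zr), (xf, yf, zf) in zip(reference, frame)
    ]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical coordinates: every atom is shifted by 1 unit along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]

print(rmsd(reference, frame))
```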

Click on File in graphics window. We can also export the data in many different formats as seen here.
Screenshot of the file manager and output file. The data can also be written to a text file in two column format.

You may use a plotting program of your choice to plot it.
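A two column text file of this kind is easy to read in a script before plotting. The Python sketch below assumes whitespace separated columns of frame number and RMSD and skips any line that does not parse as two numbers; the sample text is made up and the exact layout of the saved file may differ.

```python
import io

def read_two_columns(handle):
    """Return (frames, values) from whitespace separated two column text."""
    frames, values = [], []
    for line in handle:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or differently shaped lines
        try:
            frame, value = float(parts[0]), float(parts[1])
        except ValueError:
            continue  # skip header lines
        frames.append(frame)
        values.append(value)
    return frames, values

# Made-up sample standing in for the saved data file.
sample = io.StringIO("frame rmsd\n0 0.00\n1 0.85\n2 1.10\n")
frames, values = read_two_columns(sample)
print(len(frames), max(values))
```

To read a real file, pass an opened file object, for example open("trajrmsd.dat"), in place of the sample.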

Close the RMSD Trajectory Tool window. Let’s close this window.
Cursor on the graphics window. Now, let’s display frame 50 and align it to md-1.gro.
Close the RMSD graphical window.

Go to representations and click on Create Rep.

Go to representations window and create a new representation.
Go to the Trajectory tab. Go to the Trajectory tab.
Cursor on Draw multiple frames form, type, 50 and press Enter. In the Draw multiple frames form, type 50 and press Enter.

This shows the 50th frame also.

Open the RMSD Trajectory Tool. To align them, go back to the RMSD Trajectory Tool.
For Reference mol, choose Selected. For Reference mol, choose Selected.
Type 50 for Reference frame. Type 50 for Reference frame.
Uncheck the box for Plot. I will uncheck the box for Plot.
Click on md-1.gro in mol section. In the molecule section, click on md-1.gro file to select it.
Click on ALIGN.

Show alignment in the graphical window.

Now, click on ALIGN and notice the two aligned molecules.
Click on File, Save Visualization State.

In the filename form, type filename analysis.vmd and click on Ok.

Click on File, Save Visualization State to save the state in VMD.

Enter file name analysis.vmd and retain the working directory to save the file.

Show the Options menu. Pause the video and explore the Options menu.

Various file formats and options are available for user convenience.

Screenshot of energy command options to export to xvg file. You may further extract and plot more parameters.
Slide Number 12

Summary

Now let’s summarize. In this tutorial, we,
  • Completed a 1 ns production run for lysozyme
  • Learned about timescales of motions
  • Noted the time taken for simulation
  • Learned about output data and various types of data analysis
Slide Number 13

Summary

  • Loaded the xtc trajectory file
  • Learned about the RMSD trajectory tool and analysis
  • Aligned a frame in the trajectory with a reference molecule
Slide Number 14

Assignment

For assignment activity, please do the following.
  • Open the log file and go through the details
  • Plot RMSD of trace (C-alpha) atoms
  • Create Ramachandran plot of the lowest and highest energy structure
Slide Number 15

Assignment

Using the energy command,
  • Plot solvent accessible surface area (SASA)
  • Explore gmxcheck and ngmx commands
Slide Number 16

Spoken Tutorial Project

This video summarises the Spoken Tutorial Project.

Please download and watch it.

Slide Number 17

Spoken Tutorial workshops

We conduct workshops using spoken tutorials and give certificates.

Please write to us.

Slide Number 18

Forum for questions

Post your timed queries in this forum.
Slide Number 19

Acknowledgment

Spoken Tutorial Project is funded by MoE, Government of India.
This is Rani from IIT Bombay. Thank you for joining.

Contributors and Content Editors

Ranipv076, Snehalathak