Introduction To Computing

From MDWiki
Revision as of 06:35, 8 May 2009 by Matt (talk | contribs)

Introduction

Hi there! This page is designed to give you a general introduction to computing at the MD lab. There are a number of topics.

General Policy

The computers are there to share. Anyone should be able to log in to any computer, so please log out when you are not using it (within reason). Please don't switch off the computers. If there is a problem with a computer, such as the screen freezing, it is probably a problem with the network. If this occurs, please talk to a post-doc rather than just rebooting the computer. If a computer does need to be shut down or rebooted, whether there is a problem or not, first check whether anyone is running a job on it using the top command, and whether anyone is logged in using the who command.
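
The who check can be narrowed down to "is anyone other than me logged in?". The following is a minimal sketch rather than lab policy: the helper name other_users is made up for this example, and it assumes the usual who output format in which the login name is the first column.

```shell
# other_users ME: hypothetical helper that reads `who` output on stdin
# and prints every login name except ME, one per line, without repeats.
other_users() {
    awk -v me="$1" '$1 != me { print $1 }' | sort -u
}

# Demonstration with canned `who` output (names and times are made up):
printf '%s\n' \
    'matt  pts/0  2009-05-08 15:54' \
    'anna  pts/1  2009-05-08 16:02' \
    'anna  pts/2  2009-05-08 16:10' | other_users matt
```

On a workstation you would run who | other_users "$USER" instead of the canned input, and top -b -n 1 | head -n 15 to print a single snapshot of the busiest processes.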

Using Linux

Linux may be quite new to you, so a short introduction is provided here. Linux, like Microsoft Windows, is an operating system, which means it is a privileged program that manages the hardware and software on your computer. You will probably log in to a graphical desktop environment such as GNOME, which should look similar to other desktop environments you have used in the past. The most important part of the Linux desktop is in fact a much older technology called the terminal. From the Applications->Accessories (or Applications->System Tools) menu you can click on the "Terminal" link, which brings up a terminal emulator program that runs a shell inside it (your shell will probably be the program /bin/bash). It looks like this:

Terminal.png

Commands

Try running a command. ls will list the files in the current working directory. A good list of commands may be found here.

Shells

There are different families of shells, such as /bin/tcsh and /bin/bash. The main difference is the syntax used for shell scripts. Type echo $SHELL to find out which shell you have been given. A system administrator can change it for you.

Let's assume you are using bash. Whenever you log in to a computer via SSH, the file called .bash_profile in your home directory is read and the commands within it are executed by the new shell (type echo $HOME to see the location of your home directory). However, when you open a new Terminal window, the file .bashrc in your home directory is read and executed instead. It is therefore recommended that you include a command within your .bash_profile that reads in the commands in .bashrc, so that your shells are always initialised consistently. You can use nedit to edit .bash_profile and add the following lines, which cause .bashrc to be executed as well:

if [ -f "$HOME/.bashrc" ]; then
  . "$HOME/.bashrc"
fi

One more thing. A shell script is a file containing commands, and running it is basically equivalent to typing those commands directly into the terminal. A shell script looks like this:

#!/bin/bash

echo "This is a shell script!"

The first line tells the system to use /bin/bash to run the script. The file itself (let's call it "test.sh") has to be executable before you can run it from the command line, like so:

[matt@uqmd13 ~] echo "This is a shell script!"
This is a shell script!
[matt@uqmd13 ~] ./test.sh
This is a shell script!
[matt@uqmd13 ~] ls -l test.sh
-rwx------ 1 matt MDGroup 44 2009-05-08 15:54 test.sh
[matt@uqmd13 ~]

A newly created file is normally not executable (-rw-------), but it can be made executable using chmod +x FILENAME.
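
The permission change can be tried from start to finish. This sketch writes the example script into a scratch directory (created with mktemp, an assumption made here so that nothing in your home directory is touched), shows that the file is not executable at first, then makes it executable and runs it.

```shell
# Work in a scratch directory so that no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create test.sh with the same contents as the example script above.
printf '%s\n' '#!/bin/bash' '' 'echo "This is a shell script!"' > test.sh

# A newly created file has no execute permission.
[ -x test.sh ] || echo "test.sh is not executable yet"

chmod +x test.sh    # add execute permission
ls -l test.sh       # the permission string now contains x
./test.sh           # run the script
```

Note that the sketch leaves your shell inside the scratch directory; type cd to return to your home directory.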

PATH Environment Variable

PATH is an environment variable (i.e. a special variable passed to a program by its parent, or caller) that holds a colon-separated list of directories in which the shell looks for an executable file when you type a command. If I type echo $PATH in my shell, I get the following output:

/home/matt/bin:/usr/local/bin:/home/matt/tmp/local/bin:/home/matt/bin:/usr/local/bin:/home/matt/tmp/local/bin:/usr/lib64/qt-3.3/bin:/bin:/usr/bin

This means that if I type, for example, ls, the shell will search through the above list of directories in order until it finds an executable file named "ls", at which point it stops and executes it (it finds it in /bin/). If I ever have a situation like:

[matt@uqmd13 ~] mdrun
bash: mdrun: command not found

This means that the shell can't find a program called mdrun in any of the directories specified in the PATH variable. To add a directory to the path, run:

[matt@uqmd13 ~] export PATH=$PATH:/marksw/gromacs/3.3.3/bin
[matt@uqmd13 ~] mdrun
                         :-)  G  R  O  M  A  C  S  (-:
<...truncated...>

I recommend putting

. /marksw/BASHRC
module gromacs/3.3.3 gcc/4.3.2 fftw/3.1.2-singleprecision lammpi/7.1.4

into your .bashrc file. The line . /marksw/BASHRC adds a function to your shell called, appropriately, module, which adds the directory /marksw/$i/bin to your PATH for each argument $i given to it. The module command below it adds a few useful directories, including the one containing GROMACS 3.3.3 compiled in single precision.
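
To make that description concrete, here is a rough sketch of how such a function could be written. It is not the contents of /marksw/BASHRC, which is not reproduced on this page: the name module_sketch, the MARKSW_ROOT variable and the choice to append rather than prepend are all assumptions made for illustration, and the sketch handles nothing beyond adding directories.

```shell
# module_sketch NAME/VERSION...: for each argument $i, append
# $MARKSW_ROOT/$i/bin to PATH. MARKSW_ROOT is a made-up variable that
# defaults to /marksw. Illustration only, not the real module function.
module_sketch() {
    for i in "$@"; do
        PATH="$PATH:${MARKSW_ROOT:-/marksw}/$i/bin"
    done
    export PATH
}

module_sketch gromacs/3.3.3 gcc/4.3.2

# Show the last two PATH entries, one per line.
echo "$PATH" | tr ':' '\n' | tail -n 2
```

Because the function is defined in the current shell rather than run as a separate program, its changes to PATH persist after it returns, which is why /marksw/BASHRC has to be read with the . command.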

Other programs are provided for you within the /marksw directory. To list the available programs type module avail. Inside each software subdirectory is information about how the software was built.
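
One caveat about PATH: the output of echo $PATH shown earlier lists /home/matt/bin, /usr/local/bin and /home/matt/tmp/local/bin twice each, which typically happens when the same export line is executed more than once. The duplicates are harmless, but they can be avoided. A minimal sketch follows; path_append is a hypothetical helper name, not something provided on the lab machines.

```shell
# path_append DIR: append DIR to PATH unless it is already listed.
# (Hypothetical helper for illustration.)
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present: do nothing
        *) PATH="$PATH:$1" ;;     # otherwise add it at the end
    esac
    export PATH
}

path_append /marksw/gromacs/3.3.3/bin
path_append /marksw/gromacs/3.3.3/bin   # second call changes nothing
```

Afterwards, command -v mdrun prints the full path of the mdrun that the shell would execute, or nothing if it still cannot be found.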

How to Submit a Job onto a Cluster

A discussion on how to use GROMACS is given later. To submit an mdrun job to the cluster (e.g. merlot), a shell script is recommended:

# This is a comment!
# This file is called "job.sh"
# These are special commands to the queuing system. 1) Which shell to use; 2) job name; 3) nodes & processors per node requested; 4) cluster/queue name
#PBS -S /bin/bash
#PBS -N JOBNAME!!!
#PBS -l nodes=1:ppn=2
#PBS -q merlot 

. /marksw/BASHRC
module gromacs/3.3.3 gcc/4.3.2 fftw/3.1.2-singleprecision lammpi/7.1.4
# This environment variable is needed by grompp
export GMXLIB=/marksw/gromacs/3.3.3/share/gromacs/top

# Start MPI system (LAM-MPI is used), giving LAM a name of a temporary file (in the variable PBS_NODEFILE) containing a list of nodes to execute on
lamboot -d $PBS_NODEFILE

cd /my/working/directory
# Run grompp, specifying a run on 2 processors using MPI
grompp -f md.mdp -c INPUT_STRUCTURE.gro -p INPUT_TOPOLOGY_DESC.top -o OUTPUT_TOPOLOGY.tpr -np 2
mpirun -np 2 mdrun_mpi -np 2 -s OUTPUT_TOPOLOGY.tpr -o -x -c -e -g

# Stop MPI system
lamwipe $PBS_NODEFILE

To submit this script to the batch system, run qsub job.sh. When it is your turn to run, the script will be executed automatically on one of the nodes and may run for up to 48 hours. LAM organises the parallel jobs.
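
The job script states the processor count in three places (ppn=2, grompp -np 2 and mpirun -np 2), and they must agree. One way to keep them in step is to count the lines of the node file. This sketch assumes that the batch system writes one line per allocated processor to $PBS_NODEFILE, which is the usual PBS/Torque behaviour but has not been checked on merlot. count_slots is a hypothetical helper name.

```shell
# count_slots FILE: print the number of lines in FILE
# (assumed to be one line per allocated processor).
count_slots() {
    wc -l < "$1" | tr -d ' '
}

# Demonstration with a stand-in node file; inside a real job you would
# pass "$PBS_NODEFILE" instead. The host name is made up.
nodefile=$(mktemp)
printf '%s\n' node01 node01 > "$nodefile"
NP=$(count_slots "$nodefile")
echo "would run: mpirun -np $NP mdrun_mpi -np $NP -s OUTPUT_TOPOLOGY.tpr"
rm -f "$nodefile"
```

Inside job.sh the same value would also be passed to grompp -np, so that changing the #PBS -l line is the only edit needed when asking for a different number of processors.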

Jobs on Workstations

The workstations are configured like the cluster, which means that you can run jobs on them. To run a job on a workstation you don't need to use qsub; just execute the script directly in your shell. Please make sure that one core is free at all times (otherwise the person sitting in front of the computer will get poor performance). You can check what is running on the computer using the top command.
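
To work out how many processors a job may use while leaving one core free, the core count can be read with nproc. This is a minimal sketch: nproc is part of GNU coreutils and is assumed to be installed on the workstations, and job_cores is a made-up helper name.

```shell
# job_cores: print the number of cores a job may use, i.e. all but one.
job_cores() {
    echo $(( $(nproc) - 1 ))
}

echo "This machine has $(nproc) cores; use at most $(job_cores) for jobs."
```

The result only accounts for the hardware. If top shows that someone else's job is already running, the number of cores actually available is lower.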


Using GROMACS

Online GROMACS manual

You can find the online GROMACS manual at http://wiki.gromacs.org/

GROMACS tutorials

http://compbio.chemistry.uq.edu.au/education/mdcourse/index.html

More tutorials are available at http://www.gromacs.org/content/view/137/182/

GROMOS tutorial on the local machines

You can find the GROMOS tutorial in /home/student/exercise/. There are four exercises and the explanation is saved in /home/student/exercise/Tutorial.

To perform the exercises, you are advised to copy the exercise folder to your local directory and run the jobs locally.
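
The copy can be done with cp -r. The sketch below wraps it in a small guard so that an existing copy, which may contain your earlier work, is never overwritten. copy_exercises is a hypothetical helper name, and the destination $HOME/exercise is only a suggestion.

```shell
# copy_exercises SRC DEST: copy the folder SRC to DEST, but refuse to
# do anything if DEST already exists. (Hypothetical helper.)
copy_exercises() {
    if [ -e "$2" ]; then
        echo "$2 already exists, not copying" >&2
        return 1
    fi
    cp -r "$1" "$2"
}

# On the lab machines you might then run:
#   copy_exercises /home/student/exercise "$HOME/exercise"
#   cd "$HOME/exercise"
```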