Introduction To Computing: Difference between revisions

Revision as of 05:48, 12 June 2009

Introduction

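Hi there! This page is designed to give you a general introduction to computing at the MD lab. It covers general policy, Linux basics, submitting jobs to the cluster and the workstations, compilers, copying files, and GROMACS.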

General Policy

The computers are there to share. Anyone should be able to log in to any computer, so please log out when you are not using the computer (within reason); in any case, don't lock your screen. Please don't switch off the computers. If there is a problem with the computer, such as the screen freezing, it is probably a problem with the network. If this occurs, please talk to a post-doc rather than just rebooting the computer. If the computer needs to be shut down or rebooted, whether there is a problem or not, first check whether anyone is running a job on it using the top command, and whether anyone is logged in using the who command.

Using Linux

Linux may be quite new to you, so a short introduction is provided here. Linux, like Microsoft Windows, is an Operating System, which means it is a privileged program that manages the hardware and software on your computer. You will probably log in to a Graphical Desktop Environment such as GNOME, which should look similar to other Desktop Environments you have used in the past. The most important part of the Linux desktop is in fact a much older technology called the terminal. From the Applications->Accessories (or Applications->System Tools) menu, click on the "Terminal" link to bring up a Terminal emulator program, which runs a shell inside it (your shell will probably be the program /bin/bash). It looks like this:

Terminal.png

Commands

Try running a command. ls will list the files in the current working directory. A good list of commands may be found here.

Shells

There are different families of shells, such as /bin/tcsh and /bin/bash. The main difference is the syntax used for shell scripts. Type echo $SHELL to find out which shell you have been given. A system administrator can change it for you.

Let's assume you are using bash. Whenever you log in via SSH to a computer, the file called .bash_profile in your home directory is read and the commands within it are executed by the new shell (type echo $HOME to see the location of your home directory). However, when you open a new Terminal window, the file .bashrc in your home directory is read and executed by the new shell instead. It is recommended, therefore, that you include a command within your .bash_profile to read in the commands in the .bashrc file, so that your shells are always initialised consistently. You can use nedit to edit .bash_profile and add the following lines, which cause .bashrc to be executed as well:

if [ -f "$HOME/.bashrc" ]; then
  . "$HOME/.bashrc"
fi

One more thing. A shell script is a file of commands; running it is basically equivalent to typing those commands directly into the terminal. A shell script looks like this:

#!/bin/bash

echo "This is a shell script!"

The first line tells your running shell to use /bin/bash to run the script. The file itself (let's call it "test.sh") has to be executable before you can run it from the command line, like so:

[matt@uqmd13 ~] echo "This is a shell script!"
This is a shell script!
[matt@uqmd13 ~] ./test.sh
This is a shell script!
[matt@uqmd13 ~] ls -l test.sh
-rwx------ 1 matt MDGroup 44 2009-05-08 15:54 test.sh
[matt@uqmd13 ~]

A file is normally created non-executable (-rw-------) but can be made executable using chmod +x FILENAME.
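The steps above can be tried out safely with a short sketch like the one below. It assumes nothing about your home directory: the script is written to a temporary directory created by mktemp, and the name test.sh is only used for illustration.

```shell
#!/bin/bash
# Work in a throwaway directory so nothing in your home directory is touched.
workdir=$(mktemp -d)
script="$workdir/test.sh"

# Write a two-line shell script.
printf '%s\n' '#!/bin/bash' 'echo "This is a shell script!"' > "$script"

# A freshly created file normally has no execute permission.
if [ -x "$script" ]; then
    echo "already executable"
else
    echo "not executable yet"
fi

# Add the execute permission, then run the script and keep its output.
chmod +x "$script"
output=$("$script")
echo "$output"

# Clean up the temporary directory.
rm -rf "$workdir"
```

Running ls -l on the file before and after the chmod line shows the permissions changing, as in the listing above.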

PATH Environment Variable

The PATH is an environment variable (i.e. a special variable passed to a program by its parent, or caller) that contains a colon-separated list of directories which the shell searches to find an executable file. If I type echo $PATH in my shell, I get the following output:

/home/matt/bin:/usr/local/bin:/home/matt/tmp/local/bin:/home/matt/bin:/usr/local/bin:/home/matt/tmp/local/bin:/usr/lib64/qt-3.3/bin:/bin:/usr/bin

This means that if I type, for example, ls, the shell will search through the above list of directories, in order, until it finds an executable file named "ls", at which point it stops and executes it (it finds it in /bin/). If I ever have a situation like:

[matt@uqmd13 ~] mdrun
bash: mdrun: command not found

This means that the shell can't find a program called mdrun in any of the directories specified in the PATH variable. To add a directory to the path for the current shell session, use export:

[matt@uqmd13 ~] export PATH=$PATH:/marksw/gromacs/3.3.3/bin
[matt@uqmd13 ~] mdrun
                         :-)  G  R  O  M  A  C  S  (-:
<...truncated...>
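The directory search described above can be imitated in a few lines of shell. This is only a sketch of the idea, not how bash is implemented internally, and find_in_path is a hypothetical helper name invented for this example.

```shell
#!/bin/bash
# find_in_path NAME: print the first executable file called NAME found by
# walking the colon-separated directories in $PATH, in order.
# (find_in_path is a hypothetical helper written for this illustration.)
find_in_path() {
    local name=$1 dir
    local IFS=:
    for dir in $PATH; do
        # An empty entry in PATH means the current directory.
        [ -z "$dir" ] && dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    echo "$name: command not found" >&2
    return 1
}

# Show where ls would be found.
find_in_path ls
```

In everyday use you do not need a helper like this: the shell built-ins type ls and command -v ls report which file the shell would run.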

I recommend putting

. /marksw/BASHRC
module gromacs/3.3.3 gcc/4.3.2 fftw/3.1.2-singleprecision lammpi/7.1.4

into your .bashrc file. The line . /marksw/BASHRC adds a function to your shell called, appropriately, module, which basically adds the directory /marksw/$i/bin to your PATH for each argument $i given to it. The module command below it adds a few useful directories, including the one containing GROMACS 3.3.3 compiled in single precision.
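To make the behaviour described above concrete, here is a minimal sketch of a function that does the same kind of thing. It is not the code in /marksw/BASHRC, which may well differ; the names add_to_path and SW_ROOT are assumptions made up for this example, and the duplicate check is an addition that the real function may not have.

```shell
#!/bin/bash
# Hypothetical stand-in for the lab's module function; the real one in
# /marksw/BASHRC may differ. SW_ROOT is an assumed variable name.
SW_ROOT=${SW_ROOT:-/marksw}

# add_to_path NAME...: append $SW_ROOT/NAME/bin to PATH for each argument.
add_to_path() {
    local i dir
    for i in "$@"; do
        dir="$SW_ROOT/$i/bin"
        # Skip directories that are already listed, so PATH does not
        # accumulate duplicates each time .bashrc is read.
        case ":$PATH:" in
            *":$dir:"*) ;;
            *) PATH="$PATH:$dir" ;;
        esac
    done
    export PATH
}

add_to_path gromacs/3.3.3 gcc/4.3.2
echo "$PATH"
```

The duplicate check is a design choice worth copying if you write your own PATH additions: without it, every new shell started from an existing one appends the same directories again.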

Other programs are provided for you within the /marksw directory. To list the available programs type module avail. Inside each software subdirectory is information about how the software was built.

How to Submit a Job onto a Cluster

A discussion on how to use GROMACS is given later. To submit an mdrun job to the cluster (e.g. merlot), a shell script is recommended:

# This is a comment!
# This file is called "job.sh"
# These are special commands to the queuing system. 1) Which shell to use; 2) job name; 3) nodes & processors per node requested; 4) cluster/queue name
#PBS -S /bin/bash
#PBS -N JOBNAME!!!
#PBS -l nodes=1:ppn=2
#PBS -q merlot 

. /marksw/BASHRC
module gromacs/3.3.3 gcc/4.3.2 fftw/3.1.2-singleprecision lammpi/7.1.4
# This environment variable is needed by grompp
export GMXLIB=/marksw/gromacs/3.3.3/share/gromacs/top

# Start MPI system (LAM-MPI is used), giving LAM a name of a temporary file (in the variable PBS_NODEFILE) containing a list of nodes to execute on
lamboot -d $PBS_NODEFILE

cd /my/working/directory
# Run grompp, specifying a run on 2 processors using MPI
grompp -f md.mdp -c INPUT_STRUCTURE.gro -p INPUT_TOPOLOGY_DESC.top -o OUTPUT_TOPOLOGY.tpr -np 2
mpirun -np 2 mdrun_mpi -np 2 -s OUTPUT_TOPOLOGY.tpr -o -x -c -e -g

# Stop MPI system
lamwipe $PBS_NODEFILE

To submit this script to the batch system, run qsub job.sh. When your turn comes, the script is executed automatically on one of the nodes, where it can run for up to 48 hours. Because lamboot and mpirun are called within the script, LAM organises the parallel execution of your mdrun process.

How to Submit Jobs to the Workstations

The workstations are configured like the cluster, which means that you can run jobs on them. To run a job on a workstation you can't use qsub; instead, execute mdrun directly in your shell. Please make sure that at least one core is free at all times, otherwise the person sitting in front of the computer will get poor performance. You can check the load on the computer using the top command.
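As a rough way to apply the "leave one core free" rule before starting a job, the sketch below compares the 1-minute load average with the number of cores. It assumes a Linux machine with /proc/loadavg and the nproc command; has_free_core is a hypothetical helper name, and the load average is only an approximation of how many cores are busy.

```shell
#!/bin/bash
# has_free_core LOAD CORES [WANT]: succeed if starting a job that uses WANT
# cores (default 1) on top of the current LOAD would still leave at least
# one of the CORES cores idle. (Hypothetical helper for illustration.)
has_free_core() {
    local load=$1 cores=$2 want=${3:-1}
    awk -v load="$load" -v cores="$cores" -v want="$want" \
        'BEGIN { exit !(load + want <= cores - 1) }'
}

# Read the current values on Linux; do nothing on systems without them.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r load _ < /proc/loadavg
    cores=$(nproc)
    if has_free_core "$load" "$cores"; then
        echo "load $load on $cores cores: room for a 1-core job"
    else
        echo "load $load on $cores cores: machine is busy"
    fi
fi
```

For example, has_free_core 0.5 4 2 succeeds (a 2-core job still leaves a core free on a lightly loaded 4-core machine), while has_free_core 3.2 4 fails.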

How to Check the Jobs in the Queue

The basic commands for checking the jobs in the queues are:

qstat

showq
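Both commands list every job by default. The sketch below restricts the qstat listing to your own jobs with the -u option; show_my_jobs is a hypothetical helper name, and the check with command -v is only there so that the snippet does something sensible on a machine where the queuing tools are not installed.

```shell
#!/bin/bash
# show_my_jobs: list only your own jobs, if the queue tools are installed.
# (show_my_jobs is a hypothetical helper written for this illustration.)
show_my_jobs() {
    local user=${USER:-$(id -un)}
    if command -v qstat >/dev/null 2>&1; then
        # -u restricts the listing to the given user.
        qstat -u "$user"
    else
        echo "qstat not found: is the queuing system available on this machine?"
    fi
}

show_my_jobs
```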

Compilers

Most software these days is written in high-level languages such as C, C++ or Python, which makes it easier for programmers to write and debug their software. A compiler is a program that translates a high-level language into a lower-level language, which can then be easily translated into binary code that runs directly on the CPU. If you build software from source code, i.e. convert it from a high-level language into binary code, you will use a compiler. Most source code packages contain instructions or automated scripts to compile the code, and often a file named INSTALL in the base directory of the package will give you more information (it probably contains all you need to know to compile the package). Some more detailed sources of information about compiling and compilers may be found at the sites below:

Note: When you are developing or modifying software, make sure that the binary you are executing was built the way you expect. In particular, if you have modified the source code, check that you are not still running an old binary that was built before your changes.
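One simple way to catch a stale binary is to compare file modification times, which is also what build tools such as make do. The sketch below is an illustration only: is_stale is a hypothetical helper, and the files prog and prog.c are dummies created in a temporary directory with made-up timestamps.

```shell
#!/bin/bash
# is_stale BINARY SOURCE...: succeed if BINARY is missing or older than
# any of the listed source files. (Hypothetical helper for illustration.)
is_stale() {
    local binary=$1 src
    shift
    [ -e "$binary" ] || return 0
    for src in "$@"; do
        # -nt is true when the left-hand file was modified more recently.
        [ "$src" -nt "$binary" ] && return 0
    done
    return 1
}

# Demonstration with dummy files and explicit timestamps.
workdir=$(mktemp -d)
touch -t 200906120500 "$workdir/prog"      # pretend the binary was built at 05:00
touch -t 200906120548 "$workdir/prog.c"    # pretend the source was edited at 05:48
if is_stale "$workdir/prog" "$workdir/prog.c"; then
    echo "rebuild needed"
else
    echo "binary is up to date"
fi
```

A timestamp check cannot tell you which compiler options were used, so it complements, rather than replaces, reading the build instructions.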

Moving/Copying Files

We recommend using rsync to copy files, with the --bwlimit=1024 option, which limits the transfer rate to roughly 1 MB/s. For example, to copy files from /home/USER/dir to /melon1/USER/dir,

rsync -avP --bwlimit=1024 /home/USER/dir/ /melon1/USER/dir/

It is safer to copy and then delete than to move (using mv), because if mv dies part-way through, files that were being transferred at that moment can be lost.
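The copy-then-delete approach can be wrapped in a small function so that the source is removed only when rsync reports success. This is a sketch under that one assumption (that a zero exit status from rsync is good enough for you); copy_then_delete is a hypothetical helper name.

```shell
#!/bin/bash
# copy_then_delete SRC DEST: copy directory SRC to DEST with rsync and
# remove SRC only if rsync reports success. (Hypothetical helper.)
copy_then_delete() {
    local src=${1%/} dest=${2%/}
    # Refuse to continue unless SRC is an existing directory.
    [ -d "$src" ] || { echo "$1 is not a directory" >&2; return 1; }
    if rsync -a --bwlimit=1024 "$src/" "$dest/"; then
        rm -rf -- "$src"
    else
        echo "rsync failed; $src left untouched" >&2
        return 1
    fi
}

# Example (not run here):
#   copy_then_delete /home/USER/dir /melon1/USER/dir
```

If the copy is interrupted, the source is still intact and the same command can simply be run again; rsync then transfers only what is missing.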

Using GROMACS

Online GROMACS manual

You can find the online GROMACS manual at http://wiki.gromacs.org/

GROMACS tutorials

http://compbio.chemistry.uq.edu.au/education/mdcourse/index.html

More tutorials are available at http://www.gromacs.org/content/view/137/182/

GROMOS tutorial on the local machines

You can find the GROMOS tutorial in /home/student/exercise/. There are four exercises and the explanation is saved in /home/student/exercise/Tutorial.

To perform the exercises, it is advised to copy the exercise folder to your local directory and run the jobs locally. You can use cp -r SOURCE DESTINATION to copy a directory.