Sun Grid Engine (SGE) is an open source batch queueing system. SGE enables both the users and the computational cluster to get the most out of available resources. Users can treat the collection of compute nodes like a single system, and submit jobs from any of the nodes. SGE will automatically run jobs on less loaded nodes, regardless of which node the jobs were submitted on; SGE will also queue jobs for later execution to avoid overloading available resources and causing the entire system to operate poorly for everyone.
From the user perspective, there are two major components to a batch queue: the node hardware and the job execution queues. The hardware determines the available computational capacity. The queues define the system job-execution policy, e.g., how many jobs can run simultaneously and how much CPU time a job can consume. Typically there are multiple queues, each defining a different policy, and users can choose which queue to use for a particular job.
For the CGL socrates cluster, we have created a single queue for running jobs. There are no time or memory limits imposed by SGE for jobs in this queue. There are also no user-specific limits on the number of jobs, either active or submitted. However, we expect users to use discretion when using SGE. Running a few active jobs at once when usage is light is probably fine; submitting 50 jobs at once and locking out other users will result in queue reconfiguration, such as defining per-user limits.
To set up your SGE environment, first find out which shell you are using:

    > echo $SHELL

If the output ends in "sh", "ksh" or "bash", you are using a Bourne-shell compatible shell; if it ends in "csh" or "tcsh", you are using a C-shell compatible shell. (For the remainder of the document, "Bourne shell" refers to any of the Bourne-shell compatible shells, and "C shell" refers to any of the C-shell compatible shells.)
If you are using Bourne shell, give the command:

    > . /usr/local/sge/CGL/common/settings.sh

If you are using C shell, give the command:

    > source /usr/local/sge/CGL/common/settings.csh

Neither command generates any output. All they do is make the SGE commands available without requiring you to prefix them with "/usr/local/sge/bin" each time. They also set up default SGE job submission parameters, which may be overridden when you submit jobs. If you use SGE frequently, you can put the command in the ".profile" or ".cshrc" file (for Bourne shell and C shell respectively) in your home directory, and it will be executed automatically when you log in.
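The shell check and the two setup commands above can be folded into one small helper. This is only a sketch: the function name "sge_setup_command" is made up for illustration, and it assumes the settings files live at the paths given above. It prints the command to run rather than running it, because the settings file has to be read by your login shell itself.

```shell
#!/bin/sh
# Hypothetical helper: given a login shell path (such as the value of $SHELL),
# print the SGE setup command described above for that shell family.
sge_setup_command() {
    case "$1" in
        *csh)   # csh and tcsh; tested first because "csh" also ends in "sh"
            echo "source /usr/local/sge/CGL/common/settings.csh" ;;
        *sh)    # sh, ksh, bash
            echo ". /usr/local/sge/CGL/common/settings.sh" ;;
        *)
            echo "unrecognized shell: $1" >&2
            return 1 ;;
    esac
}

# Example: a tcsh user
sge_setup_command /bin/tcsh
```

For a tcsh user this prints the "source" command; for "/bin/bash" it prints the "." command.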
Once you've set up your SGE environment, you need to create a shell script file that contains the commands to run your job. Here is an example job script file that simply echoes a string to the standard output and then prints the current working directory:

    #!/bin/sh
    ## The next line is an instruction to SGE: it tells SGE to email
    ## you when your job "b"egins, "a"borts, and "e"nds.
    #$ -m bae
    echo "Hello world"
    pwd
    ## end of batch script

You can use either Bourne or C shell syntax, but you must select the shell to use on the first line; in this case, we are using Bourne (sh) shell syntax. The shell script may be named anything you wish. For this example, let's call the file "script.sge". To submit this script for execution in SGE, run the command (the "> " represents the shell prompt; you do not need to type it):

    > qsub script.sge

If you create the file "script.sge" with the contents above, and "qsub" it, you will almost immediately get an email message saying that the job has been started, and almost immediately after, that the job has completed. If a lot of other people have jobs running, your job might not start immediately. You can check on the status of your job by using the "qstat" command:

    > qstat
    job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
    -----------------------------------------------------------------------------------------------------------------
        240 0.25000 script.sge conrad       r     01/30/2006 10:12:49 all.q@adenine.cgl.ucsf.edu         1

In this case, only one job is running: its id is "240"; the command script is "script.sge"; and it's being run for user "conrad" (this, of course, will be your login name instead of mine when you try this). The "state" of the job is "r", which is short for "running"; it started running on January 30 around 10AM; and it is running in queue "all.q" on node "adenine.cgl.ucsf.edu". If the job has not yet started running, the state will be "qw", short for "queued and waiting".
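If you only want the job id and state from a "qstat" listing, a short filter will do. The following is a sketch, not part of SGE: the function name "job_states" is made up, and it assumes the column layout shown in the listing above (two header lines, job id in the first column, state in the fifth, no spaces in job names).

```shell
#!/bin/sh
# Hypothetical helper: read "qstat" output on standard input and print
# "job-ID state" for each job, skipping the header and the dashed separator.
job_states() {
    awk 'NR > 2 && NF >= 5 { print $1, $5 }'
}

# Sample input copied from the qstat listing above; on the cluster you
# would run "qstat | job_states" instead.
job_states <<'EOF'
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
    240 0.25000 script.sge conrad       r     01/30/2006 10:12:49 all.q@adenine.cgl.ucsf.edu         1
EOF
```

For the sample listing this prints "240 r". A job that is still waiting would show "qw" in the second column.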
By default, SGE runs jobs in your home directory. It is, however, possible to override this behavior on the socrates cluster because all directories are shared across all nodes. So, instead of running jobs in your home directory, you can ask SGE to run the job in the directory where you submit your job. To do this, you can add the following line to the beginning of the script:

    #$ -cwd

and our example script will look like:

    #!/bin/sh
    #$ -cwd
    ## The next line is an instruction to SGE: it tells SGE to email
    ## you when your job "b"egins, "a"borts, and "e"nds.
    #$ -m bae
    echo "Hello world"
    pwd
    ## end of batch script

This will cause the script to execute in the directory from which the job was submitted. The output and error files will also be deposited in the job-submission directory.
To use SACS tools from a job script, the script needs these commands:

    source /usr/local/lib/seq/seqpaths
    source /usr/local/lib/seq/seqenvirons

You also need to make sure that you are using the SACS version of C shell, which sets up access permissions to SACS tools and databases. This is done by making the first line of your script:

    #!/sacs/shells/csh

The remainder of your script should then be able to use SACS commands the same way as if you were logged in. (We've only given the C shell solution for using SACS tools because all SACS users use C shell by default. If you are a SACS user using Bourne shell and are having difficulties, please contact a SACS staff member for help.)
For more details on job submission, read the manual page:

    man submit

It provides information on what environment variables to use, what options are available, how input and output streams are handled, and much more.
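As a small illustration of the environment variables the manual page describes, the sketch below prints three that SGE sets for a running job: JOB_ID, JOB_NAME, and SGE_O_WORKDIR (the directory the job was submitted from). The function name "describe_job" is made up for this example, and the fallback text is an assumption so that the script also runs outside SGE, where these variables are normally unset.

```shell
#!/bin/sh
#$ -cwd
# Hypothetical helper: report a few variables that SGE sets for a running job.
# Outside SGE they are normally unset, so each falls back to "(not set)".
describe_job() {
    echo "job id:     ${JOB_ID:-(not set)}"
    echo "job name:   ${JOB_NAME:-(not set)}"
    echo "submit dir: ${SGE_O_WORKDIR:-(not set)}"
}

describe_job
```

Submitted with "qsub", the output file should contain the job's id, name, and submission directory; run directly from the command line, each line shows "(not set)" unless the variable happens to be defined in your shell.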
For even more SGE documentation, visit the RBVI Sun Grid Engine page.