
How to submit simple jobs onto the Grid

This page gives a brief introduction to simple job submission on the Grid. Further introductory material can be found on the Grid homepage of RMKI.

Log onto a User Interface (UI) machine

After you have logged onto a UI machine, you are able to submit commands to the Grid.

Your Grid user key files, usercert.pem and userkey.pem, should be in the ~/.globus directory. If they are stored somewhere else, point the Grid applications to them by setting the following environment variables:

export X509_USER_CERT=/some_directory/globus/usercert.pem

export X509_USER_KEY=/some_directory/globus/userkey.pem (for bash), or

setenv X509_USER_CERT /some_directory/globus/usercert.pem

setenv X509_USER_KEY /some_directory/globus/userkey.pem (for tcsh).
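The bash settings above can be wrapped in a small helper that refuses to export anything unless both files are actually readable. This is only a sketch: use_grid_credentials is a hypothetical function name, not part of any Grid toolkit, and it assumes the variable names X509_USER_CERT and X509_USER_KEY as read by the Globus tools.

```shell
# Hypothetical helper (not part of the Grid middleware): export the
# credential variables for a non-default directory, but only if both
# key files can be read there.
use_grid_credentials() {
    cred_dir=$1
    if [ ! -r "$cred_dir/usercert.pem" ] || [ ! -r "$cred_dir/userkey.pem" ]; then
        echo "usercert.pem or userkey.pem is not readable in $cred_dir" >&2
        return 1
    fi
    X509_USER_CERT=$cred_dir/usercert.pem
    X509_USER_KEY=$cred_dir/userkey.pem
    export X509_USER_CERT X509_USER_KEY
}
```

Usage: use_grid_credentials /some_directory/globus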

By default, a directory called .glite should be present in your home directory. If you wish to relocate this directory, tell the Grid system its new location with the following environment variable:

export GLITE_USER_HOME=/some_directory/glite (for bash), or

setenv GLITE_USER_HOME /some_directory/glite (for tcsh).

By default, on every User Interface machine, the Grid certificates of the trusted sites are located in the /etc/grid-security/certificates directory. If they are stored somewhere else, point the Grid applications to them by setting the following environment variable:

export X509_CERT_DIR=/some_directory/certificates (for bash), or

setenv X509_CERT_DIR /some_directory/certificates (for tcsh).

To customize the Grid behavior, the environment variable GLITE_WMSUI_CONFIG_VAR can be set to point to a configuration file:

export GLITE_WMSUI_CONFIG_VAR=/some_directory/glite_wms.conf (for bash), or

setenv GLITE_WMSUI_CONFIG_VAR /some_directory/glite_wms.conf (for tcsh).

The default version of the file glite_wms.conf can be copied from /opt/glite/etc/glite_wms.conf on any UI machine. (The prefix /opt may differ between platforms if gLite is installed in a non-standard directory.)

Log onto the Grid (get authenticated on the Grid)

This means getting a so-called user proxy. The commands are:

> grid-proxy-init Here, you will be prompted for your grid password. Or:

> grid-proxy-init -valid 04:00 This is the same, but the authentication will expire in 4 hours. The default (and maximum) lifetime is 12 hours.

If you are a member of more than one VO, you can choose between them by logging in with the glite-voms-proxy-init command instead of grid-proxy-init. E.g.:

> glite-voms-proxy-init -voms hungrid Or:

> glite-voms-proxy-init -voms hungrid -valid 04:00

To get information on your user proxy, you can use the commands grid-proxy-info or glite-voms-proxy-info -all. You can destroy your user proxy by grid-proxy-destroy or glite-voms-proxy-destroy.
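Before starting a longer interactive session it can be handy to check whether the user proxy still has enough lifetime left. The sketch below assumes that grid-proxy-info -timeleft prints the remaining lifetime in seconds; hm_to_seconds, proxy_needs_renewal and check_proxy are hypothetical helper names introduced only for this illustration.

```shell
# Convert a lifetime in the H:M form used by -valid (e.g. 04:00) to seconds.
hm_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 }'
}

# Succeed if the remaining lifetime ($1, in seconds) is shorter than the
# required lifetime ($2, in H:M form).
proxy_needs_renewal() {
    [ "$1" -lt "$(hm_to_seconds "$2")" ]
}

# Warn if the current user proxy is valid for less than $1 (default 01:00).
# Assumption: grid-proxy-info -timeleft prints the seconds left.
check_proxy() {
    left=$(grid-proxy-info -timeleft 2>/dev/null) || left=0
    if proxy_needs_renewal "${left:-0}" "${1:-01:00}"; then
        echo "User proxy is about to expire; run grid-proxy-init again." >&2
        return 1
    fi
}
```

Usage: check_proxy 02:00 warns if less than two hours of proxy lifetime remain.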

Get your jobs authenticated on the Grid

This means getting a so-called job proxy. The command is:

> myproxy-init -d -n Here, you will be prompted for your grid password.

Running myproxy-init is necessary when you are running long-term jobs. Having a job proxy ensures that your jobs remain authenticated on the Grid even if your user proxy (used to perform interactive Grid manipulations) has expired. You can get information on your job proxy with myproxy-info -d, and you can destroy it with myproxy-destroy -d. The default (and maximum) lifetime of a job proxy is 168 hours.

Note: If you don't get a job proxy, you may not be able to retrieve your job outputs for long-term jobs!

Prepare and submit your job

The program that you want to run on the Grid is called a job. A job consists of one or more executables and input files, which are submitted to the Grid system. It produces output and error files, which can be retrieved after the job has finished.

Jobs are described to the Grid system in the so-called Job Description Language (JDL). You should prepare a JDL file for each of your jobs. A typical simple JDL file may look like this:

[

    JobType = "Normal";

    Executable = "testjob.sh";

    StdOutput = "testjob.stdoutanderror";

    StdError = "testjob.stdoutanderror";

    InputSandbox = {"testjob.sh", "inputfile.dat"};

    OutputSandbox = {"testjob.stdoutanderror", "outputfile.dat"};

    Requirements = (
                     Member("AFS", other.GlueHostApplicationSoftwareRunTimeEnvironment) && 
                     other.GlueCEPolicyMaxWallClockTime>=2160 &&
                     other.GlueHostMainMemoryRAMSize>=512
                   );

]

The meanings of the above variables are:

JobType
This optional variable describes whether your job is a normal job ("Normal") or an interactive one ("Interactive"). If your job is interactive, its StdIn, StdOut and StdError are connected to the terminal from which you submitted the job, so you can communicate with the job while it is running. If unspecified, it defaults to "Normal".
Executable
This variable specifies the executable file of your job.
StdOutput
The StdOut of your program shall be written into this file.
StdError
The StdError of your program shall be written into this file.
InputSandbox
This is a list of files which are sent to the system as the components of your job: typically the executable of your program and some supplementary files. Files sent via the InputSandbox should be small (<10 megabytes). Large files (such as large input data files) should be made available to the job in other ways, e.g. via AFS, NFS, or http (using e.g. wget), or via the Grid Storage System.
OutputSandbox
This is a list of files which are retrieved after the job has finished: typically the file containing the StdOut / StdError and some output files. Files retrieved via the OutputSandbox should be small (<100 megabytes). Large files (such as large output data files) should be transferred by other means, e.g. via the Grid Storage System.
Requirements
This optional variable is a logical expression specifying requirements on the site or the node where the job is going to be executed. Member("Some_software", other.GlueHostApplicationSoftwareRunTimeEnvironment) requires that the software Some_software is available on the target node. other.GlueCEPolicyMaxWallClockTime>=running_time selects queues whose job execution time limit is at least the specified running_time (in minutes). other.GlueHostMainMemoryRAMSize>=memory selects execution nodes which have at least the specified amount of memory (in megabytes). A requirement of the form other.GlueCEUniqueID=="grid109.kfki.hu:2119/jobmanager-lcgpbs-hungrid" means that the job should be sent to the computing element (queue) grid109.kfki.hu:2119/jobmanager-lcgpbs-hungrid.
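When several similar jobs are needed, the JDL file can be generated from a template instead of being edited by hand. The following is a minimal sketch, not an official tool: write_jdl is a hypothetical helper, and for brevity it puts only the executable into the InputSandbox and only the combined StdOut / StdError file into the OutputSandbox.

```shell
# Hypothetical helper: print a JDL description to standard output.
#   $1  executable name, e.g. testjob.sh
#   $2  required wall clock time limit of the queue, in minutes
#   $3  required memory of the execution node, in megabytes
write_jdl() {
    exe=$1
    walltime=$2
    memory=$3
    base=${exe%.*}    # testjob.sh -> testjob
    cat <<EOF
[
    JobType = "Normal";
    Executable = "$exe";
    StdOutput = "$base.stdoutanderror";
    StdError = "$base.stdoutanderror";
    InputSandbox = {"$exe"};
    OutputSandbox = {"$base.stdoutanderror"};
    Requirements = (other.GlueCEPolicyMaxWallClockTime>=$walltime && other.GlueHostMainMemoryRAMSize>=$memory);
]
EOF
}
```

Usage: write_jdl testjob.sh 2160 512 > testjob.jdl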

Once you have prepared the JDL file, you can look for the available queues which are capable of running your job with the command:

> glite-wms-job-list-match  --vo your_vo  testjob.jdl
This will return a list of Grid queues (computing elements), which are capable of executing your job.

The job can be submitted by the command:

> glite-wms-job-submit  --vo your_vo  -a  testjob.jdl
This will return an sURL address which uniquely identifies your job. It is denoted by jobID in the following.
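In scripts it is convenient to capture the jobID in a shell variable. The sketch below assumes that the identifier appears in the command output as a whitespace-delimited word starting with https://; extract_job_id is a hypothetical helper name, and the host name used in the test of this idea (wms.example.org) would be invented, not a real service.

```shell
# Hypothetical helper: read command output on standard input and print the
# first word that starts with https:// (assumed to be the jobID).
extract_job_id() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^https:\/\//) { print $i; exit }
        }
    }'
}
```

Usage: jobID=$(glite-wms-job-submit --vo your_vo -a testjob.jdl | extract_job_id)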

The status of the job can be viewed by:

> glite-wms-job-status  jobID
This will return the current status of your job.
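For long-running jobs the status query can be repeated until the job reaches a final state. The following sketch rests on two assumptions that should be checked against your installation: that the status output contains a line beginning with "Current Status:", and that Done, Aborted, Cancelled and Cleared are the final states. The function names are hypothetical.

```shell
# Print the text following "Current Status:" from status output on stdin.
current_status() {
    sed -n 's/^Current Status:[[:space:]]*//p' | head -n 1
}

# Succeed if the given status text denotes a state the job will not leave.
is_final_status() {
    case $1 in
        Done*|Aborted*|Cancelled*|Cleared*) return 0 ;;
        *) return 1 ;;
    esac
}

# Poll the status of job $1 every $2 seconds (default 300) until it is final.
wait_for_job() {
    while :; do
        status=$(glite-wms-job-status "$1" | current_status)
        if [ -z "$status" ]; then
            echo "Could not determine the status of $1" >&2
            return 1
        fi
        echo "$(date): $status"
        if is_final_status "$status"; then
            return 0
        fi
        sleep "${2:-300}"
    done
}
```

Usage: wait_for_job jobID 600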

If the Grid system has failed to run your job, the logging information can be retrieved with:

> glite-wms-job-logging-info  jobID
This will return the logging info on your job. A convenient way to find out failure reasons is:
> glite-wms-job-logging-info  -v 3  jobID  |  grep  "reason"  | uniq
This requests the most detailed logging information available on your job, prints only the lines containing the string "reason", and suppresses consecutive identical lines.
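Note that uniq only collapses identical lines when they are adjacent. If the same reason is reported several times with other lines in between, sorting first groups the repeats, and uniq -c then shows how often each reason occurred. A minimal sketch, with summarize_reasons as a hypothetical helper name:

```shell
# Read logging info on standard input and print each distinct line that
# mentions "reason", prefixed by its number of occurrences, most frequent
# first.
summarize_reasons() {
    grep -i "reason" | sed 's/^[[:space:]]*//' | sort | uniq -c | sort -rn
}
```

Usage: glite-wms-job-logging-info -v 3 jobID | summarize_reasons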

If your job has properly finished, you can retrieve the outputs by the command:

> glite-wms-job-output  jobID
This will retrieve the content of the OutputSandbox into the directory /tmp/jobOutput/yourusername_jobID. It is also possible to specify some other directory name by the --dir switch.

For further information, see the man pages of the above commands and of the other glite-wms- commands. For further references on simple job submission, see https://edms.cern.ch/file/722398//gLite-3-UserGuide.pdf, which also contains a complete description of the JDL language.

-- AndrasLaszlo - 10 Jul 2009

Topic revision: r11 - 2009-07-10 - AndrasLaszlo
 