LoadLeveler on JUBL
IBM Tivoli Workload Scheduler LoadLeveler (Version 3.4.0) is used as the batch system on BlueGene/L.
Using LoadLeveler
Jobs are submitted to LoadLeveler using a job command file. The job command file is a shell script containing keywords embedded in comments beginning with # @. These keywords tell LoadLeveler which resources the job requires, which program to execute, where to write output files, and how to set up the job environment. Two sample job scripts are provided (Sample 1 and Sample 2).
Most of the keywords are the same as those used for LoadLeveler scripts on JUMP, but some points are BlueGene-specific:
- You have to mark your job as a BlueGene job with # @ job_type = bluegene. Otherwise the job is executed as a serial job on the login node without allocating a BlueGene partition.
- The size of a job has to be specified by using # @ bg_size OR # @ bg_shape.
- The bg_size keyword specifies the number of compute nodes the job should use. BlueGene/L only allows partitions of 32, 128, or a multiple of 512 compute nodes, so the requested size is rounded up to the next allowed value. Thus a bg_size of 1 specifies a partition of size 32 and a bg_size of 129 specifies a partition of size 512.
- The bg_shape keyword specifies the shape of the partition at the base partition (midplane) level, not at the compute node level. A bg_shape value of 1x2x1 means 1 base partition in the x direction, 2 in the y direction and 1 in the z direction, which gives two midplanes = 1024 compute nodes. bg_shape defines the logical dimensions of your partition. For efficient scheduling, LoadLeveler may physically allocate any one of the three permutations (1x2x1, 2x1x1, 1x1x2) and ensures the correct mapping of the MPI tasks.
If, and only if, you are using your own mapfile (-mapfile option in the mpirun command) or your application relies on a specific physical shape of the partition, you have to use the bg_rotate = FALSE keyword together with bg_shape. This tells LoadLeveler that only the requested shape satisfies the job requirement.
- The topology of the partition can be specified with the bg_connection keyword, which takes one of three values: MESH (default), TORUS, or PREFER_TORUS.
This choice can have a big influence on the performance of your application. If in doubt, always add # @ bg_connection = TORUS to your job script.
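The size rules above can be sketched as two small shell functions. This is only an illustration of the arithmetic stated in the text (partitions of 32, 128, or a multiple of 512 compute nodes; one midplane = 512 compute nodes). The function names bg_partition_size and bg_shape_nodes are hypothetical helpers and are not part of LoadLeveler.

```shell
#!/bin/sh
# Hypothetical helper: round a requested bg_size up to the partition size
# that would be allocated (32, 128, or a multiple of 512 compute nodes).
bg_partition_size() {
    requested=$1
    if [ "$requested" -le 32 ]; then
        echo 32
    elif [ "$requested" -le 128 ]; then
        echo 128
    else
        # Round up to the next multiple of 512 (one midplane).
        echo $(( (requested + 511) / 512 * 512 ))
    fi
}

# Hypothetical helper: number of compute nodes covered by a bg_shape
# such as 1x2x1, where each dimension is counted in midplanes.
bg_shape_nodes() {
    shape=$1
    x=${shape%%x*}      # part before the first "x"
    rest=${shape#*x}    # part after the first "x"
    y=${rest%%x*}
    z=${rest#*x}
    echo $(( x * y * z * 512 ))
}

bg_partition_size 1      # prints 32
bg_partition_size 129    # prints 512
bg_shape_nodes 1x2x1     # prints 1024
```

The printed values match the examples given in the text: bg_size 1 gives 32, bg_size 129 gives 512, and bg_shape 1x2x1 gives 1024 compute nodes.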
A detailed description of the BlueGene-specific keywords and a table of the core general keywords is given here: Job File Keywords
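As a rough illustration of how the BlueGene-specific keywords combine, the header of a job command file that requests two midplanes in a fixed orientation might look like the sketch below. The values are examples only, and bg_rotate = FALSE should be used only in the situations described above (own mapfile, or an application that depends on the physical shape).

```shell
# Mark the job as a BlueGene job; otherwise it runs serially on the login node.
# @ job_type      = bluegene
# Request 1 x 2 x 1 midplanes (1024 compute nodes) instead of using bg_size.
# @ bg_shape      = 1x2x1
# Accept only the requested shape, not one of its permutations.
# @ bg_rotate     = FALSE
# Request a torus topology instead of the default MESH.
# @ bg_connection = TORUS
```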
On a BlueGene/L system, the program to execute is always the mpirun command. In other words, preparing a job for submission requires you to create a job command file that passes the appropriate arguments to the mpirun command. There are two ways to specify the application LoadLeveler should execute:
- Adding the mpirun call after the # @ queue statement in the job file (see Sample 1).
- Using the executable and arguments keywords: # @ executable has to point to mpirun and # @ arguments holds all the arguments for the mpirun call. This case is shown in Sample 2.
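The second variant could look roughly like the fragment below. It is a sketch, not a copy of Sample 2: the paths /path/to/mpirun and /path/to/myprog are hypothetical placeholders, and -exe is assumed here to be the mpirun option that names the program binary, so check the mpirun documentation on your system before relying on it.

```shell
# Hypothetical paths; replace them with the real locations on your system.
# @ job_type   = bluegene
# @ bg_size    = 32
# LoadLeveler starts mpirun itself and hands it the argument list below.
# @ executable = /path/to/mpirun
# @ arguments  = -np 32 -exe /path/to/myprog
# @ queue
```

With this form nothing needs to follow the # @ queue statement, because LoadLeveler builds the mpirun call from the two keywords.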
Since LoadLeveler automatically selects the appropriate partition to run the job on, the -partition option should not be specified in the mpirun command.
The number of MPI tasks can still be controlled with the -np option. The execution mode (coprocessor mode or virtual node mode) is specified with -mode CO or -mode VN inside the argument list. A detailed description of the relation between bg_size/bg_shape, the number of allocated compute nodes and the number of MPI tasks can be found in the FAQs.
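Putting these pieces together, a job command file in the style of the first variant (mpirun call after # @ queue) might look like the following sketch. It is not one of the official samples: the job name, the time limit, and the program path are hypothetical, the output and error file names are just one possible choice, and -exe is assumed to be the mpirun option naming the program binary.

```shell
#!/bin/sh
# Hypothetical example values; adjust names, limits and paths for your job.
# @ job_name         = bgl_example
# @ job_type         = bluegene
# @ bg_size          = 32
# @ bg_connection    = TORUS
# @ output           = $(job_name).$(jobid).out
# @ error            = $(job_name).$(jobid).err
# @ wall_clock_limit = 00:30:00
# @ queue

# No -partition option: LoadLeveler selects the partition itself.
# 32 MPI tasks in coprocessor mode on the 32 requested compute nodes.
mpirun -np 32 -mode CO -exe /path/to/myprog
```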
Submitting a LoadLeveler Job
Jobs are submitted with:
llsubmit <jobfile name>
Some useful LoadLeveler commands are listed here:
| Command | Short Description |
| llsubmit | Submits a job to LoadLeveler. |
| llq | Shows queued and running jobs. |
| llq -b | Shows BlueGene jobs. |
| llcancel <job_id> | Deletes a queued or running job. |
| llstatus | Displays system information. |
| llclass | Shows information about defined classes. |
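A typical sequence using the commands from the table might look like this. The file name job.cmd and the JOB_ID variable are hypothetical; set JOB_ID to the job ID reported by llsubmit or shown by llq.

```shell
# Submit the job command file (job.cmd is a hypothetical file name).
llsubmit job.cmd

# List all queued and running jobs, then only the BlueGene jobs.
llq
llq -b

# Remove a queued or running job; JOB_ID must hold the ID reported by
# llsubmit or shown in the llq output.
llcancel "$JOB_ID"
```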
Interactive Parallel Applications
To start an interactive parallel application, use the FZJ-specific procedure:
llrun [llrun_options] <mpirun_options>
For more information on llrun see llrun.
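As a hedged illustration of the llrun syntax, an interactive run that passes only mpirun options could look like the line below. The program path is hypothetical, -exe is assumed to be the mpirun option naming the program binary, and no llrun-specific options are shown because they are not described here.

```shell
# Hypothetical program path; -np and -mode are the mpirun options
# for the number of MPI tasks and the execution mode.
llrun -np 32 -mode CO -exe /path/to/myprog
```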
Documentation
See also: LoadLeveler Documentation
Last change: 13.08.2007
