Using the LLview client

After starting LLview with the command

> llview

the main window of LLview comes up and LLview tries to get data from the defined data source. If this is the first call of LLview and the system default configuration file contains no site-specific information, LLview has to be configured locally. At least the data source option llqxml (executing LLview on the LoadLeveler machine) or WWW (accessing the data from a web server) has to be selected. In addition, either the path to the llqxml command or the web address and authorization information has to be specified.

[Screenshot: Main window of LLview]

The components of the main window

The main window is divided into several areas. At the top of the window is a combined menu bar and status bar. The lower part of the window is built like a 'notebook'. Here you can select between the Node display window, the Running job list, the Waiting job list, and the History window (if enabled).

The menu bar contains the File and Option menus on the left side and the Help menu on the right side. The Change entry of the Option menu opens a new window in which all elements of LLview can be configured (see the page "Managing Configuration Options").

The status bar contains an entry for defining the step time between two updates. The automatic update of the information displayed by LLview can be disabled with the active option. An immediate update of the information can be forced with the reload button. The next entry allows you to search for userids (by regexp) in the job list and in the lists of running and waiting jobs. The last three entries of the status bar show the time of the last update, the time to the next update, and the currently selected data source. The time of the last update is the time at which the XML data was recorded on the LoadLeveler machine. If the WWW data source is selected, an update event only requests the data from the web server again; this is independent of when the data on the web server itself is updated.
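The userid search can be pictured as a regular expression filter over the job owners. The following is a minimal Python sketch, not LLview's own code: the job records, the field names "id" and "owner", and the function name filter_jobs_by_user are hypothetical, and the regular expression dialect used by LLview may differ from Python's re module.

```python
import re

def filter_jobs_by_user(jobs, pattern):
    """Return the jobs whose owner matches the regular expression.

    `jobs` is a list of dicts with a hypothetical "owner" field.
    """
    rx = re.compile(pattern)
    return [job for job in jobs if rx.search(job["owner"])]

# Hypothetical job list for illustration only.
jobs = [
    {"id": "job.1", "owner": "zam001"},
    {"id": "job.2", "owner": "hpc042"},
    {"id": "job.3", "owner": "zam117"},
]

# Select all jobs of users whose userid starts with "zam".
matches = filter_jobs_by_user(jobs, r"^zam")
```

Because the pattern is searched rather than fully matched, a plain substring such as "042" also selects the corresponding user in this sketch.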

Node Display

The node display is the main element of LLview. It shows a graphical representation of the cluster nodes and the usage of their processors by the jobs running under the control of LoadLeveler. This window contains the following elements: Usage bar, Nodes, Job list, Info box and Statistics. Most of these elements are sensitive to the mouse pointer: moving the pointer over a processor box, a job list entry or a colored rectangle of the usage bar also highlights the corresponding information in all other display elements. Moving the mouse over the machine picture in the node display shows the usage of this node in the Info box.

For each node of the IBM cluster, the Nodes element of LLview displays the node name, the memory and CPU usage, the node state and, for each processor, a colored box corresponding to the job running on this processor. The information about CPU and memory usage is derived from the LoadLeveler data access entries "LL_MachineLoadAverage", "LL_MachineFreeRealMemory64" and "ConsumableMemory". The dark blue part of the memory bar shows the real memory usage of this node and the light blue part shows the requested memory (ConsumableMemory). The Nodes element displays processor boxes only for active processors. All information about the available nodes is taken from the XML file generated by llqxml. Nodes which are temporarily not available are therefore not visible in the node display.

The Job List on the left side of the node display contains the list of running jobs. The job information consists of the number of requested processors, the job owner, the consumed wall clock time of the job, the requested wall clock time, a flag indicating that the job is running under UNICORE control, the job class, the job specifiers and the estimated end time of the job. This information implies that job scheduling is done on a wall clock time basis and that the nodes are not used in a time-shared mode. The job specifiers give a more exact description of the number of started processes. They contain the number of nodes (n), the number of processes on each node (p) and the number of threads per process.
The sorting order of this list can be changed by clicking on the header keyword of the corresponding column. Clicking again reverses the sorting direction. The direction is indicated by a small triangle below the header.

The Usage bar at the top of the window shows the utilization of the whole machine. The jobs are drawn as small rectangles and are sorted by job size. This element gives you a quick overview of the fragmentation of the machine. The white part of the usage bar indicates the free nodes; the grey part indicates the number of processors which are currently not available. The numbers behind the usage bar give the usage in percent, the number of free processors and the number of completely free nodes. The notsh entry gives the number of processors which are wasted by jobs running in the not_shared node usage mode.
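The numbers shown behind the usage bar can be derived from per-node data. The sketch below is an illustration under stated assumptions, not LLview's implementation: the Node record and its fields are hypothetical, usage is assumed to be the share of used processors among the processors of available nodes, and the idle processors of a not_shared node are assumed to count as wasted (notsh) rather than free.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    cpus: int                 # processors on the node
    used: int                 # processors occupied by running jobs
    available: bool = True    # False: node is currently not available
    not_shared: bool = False  # True: a job holds the node exclusively

def usage_summary(nodes):
    """Compute the usage bar numbers from a list of Node records."""
    up = [n for n in nodes if n.available]
    total = sum(n.cpus for n in up)
    used = sum(n.used for n in up)
    # Assumption: idle processors on exclusively held nodes are wasted.
    notsh = sum(n.cpus - n.used for n in up if n.not_shared)
    return {
        "usage_percent": 100.0 * used / total if total else 0.0,
        "free_cpus": total - used - notsh,
        "free_nodes": sum(1 for n in up if n.used == 0),
        "notsh": notsh,
    }

# Hypothetical four-node machine for illustration only.
nodes = [
    Node("n01", cpus=8, used=8),
    Node("n02", cpus=8, used=2, not_shared=True),
    Node("n03", cpus=8, used=0),
    Node("n04", cpus=8, used=0, available=False),
]
summary = usage_summary(nodes)
```

In this example only n03 counts as a completely free node, because n04 is not available and therefore belongs to the grey part of the bar.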

The Info Box is an ASCII-based display element which provides additional information about the object which is currently under the mouse pointer. When the pointer is moved, the information in the box is updated automatically. The three boxes on the left side of the info box are mouse sensitive. Activating one of them shows in the info box the list of newly started jobs (IN), the list of just finished jobs (OUT) or the top ten waiting jobs (WAIT) sorted by their system priority.

There are two statistics display elements:
The first one shows different histograms of the current state of the scheduling system. The diagrams are fully configurable by selecting values for the x- and y-axis from a list of collected statistical data such as job size, waiting time, job wall clock time or queue name. The x-axis is automatically divided into value ranges. The x- and y-axis can have linear or logarithmic (log2, log10) scaling. The screenshot above shows the statistics window configured for 5 different diagrams, which can be selected with the small rectangles on the left side of the diagram.
The second statistics (history) window shows the usage of the system for the last three days. The diagram shows the history of two values: the number of processors used by small jobs and the number of processors used by large jobs. Moving the mouse over this window shows the corresponding values in the info window.
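The division of a logarithmic x-axis into value ranges, as used by the histogram diagrams, can be sketched as follows. This is only one plausible scheme, not necessarily the one LLview uses: it assumes integer values of at least 1 and ranges of the form [base^k, base^(k+1)). The function names log_bin and histogram are hypothetical.

```python
from collections import Counter

def log_bin(value, base=2):
    """Index k of the range [base**k, base**(k+1)) containing value (value >= 1)."""
    if value < 1:
        raise ValueError("value must be at least 1")
    k = 0
    upper = base
    while value >= upper:
        upper *= base
        k += 1
    return k

def histogram(values, base=2):
    """Count integer values per logarithmic range.

    Returns a dict mapping (lowest, highest) value of each range to its count.
    """
    counts = Counter(log_bin(v, base) for v in values)
    return {(base**k, base**(k + 1) - 1): counts[k] for k in sorted(counts)}

# Hypothetical job sizes (number of processors) for illustration only.
job_sizes = [1, 2, 3, 4, 8, 16, 16, 32, 100]
by_log2 = histogram(job_sizes, base=2)
by_log10 = histogram(job_sizes, base=10)
```

The ranges are found by repeated integer multiplication instead of a floating point logarithm, which avoids rounding problems at exact powers of the base.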

The two additional notebook windows, Running and Waiting, are ASCII-based lists of the currently running and waiting jobs in the LoadLeveler queues. The search pattern in the status bar is also applied to these lists.

The notebook window History shows the history of the machine usage in a graphical display. For this purpose, the information of the usage bar is displayed vertically and appended to the previous information.

[Screenshot: History window of LLview]

There is a new window (Prediction), which is currently in an experimental state:

[Screenshot: Job scheduling prediction window]

The job scheduling prediction window shows the result of a simulation of the scheduler. This simulation is based on the information stored in the XML data delivered by the server part of LLview. For this simulation LLview uses the priority value of each job, the current usage state of each node of the system and some global information about the max starters in each job class and per user. Furthermore, LLview can simulate a scheduler which is working in backfilling mode. This means that the scheduler selects one of the waiting jobs as the Top dog, which will be scheduled on the machine as soon as possible. All other jobs can only be scheduled if they do not interfere with this Top dog.

The window shows a diagram with the time line in the x-direction and the number of nodes in the y-direction. The colored jobs stacked on the left side of the diagram are the jobs currently running. The blue vertical line shows the current position on the time line. The blue jobs plotted on the right side of the diagram are jobs from the waiting queue. Their position shows the predicted starting time. The height of a job box corresponds to the number of processors requested by this job. The length of the box is defined by the requested run time of the job (wall clock time limit).
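The idea of backfilling around a Top dog can be illustrated with a small simulation. The following Python sketch is a simplified stand-in and not LLview's prediction code: it treats all nodes as interchangeable, ignores the per-class and per-user limits mentioned above, assumes that every job runs for its full requested wall clock time, and uses the hypothetical names Job and predict_starts. A waiting job may start ahead of the Top dog only if it ends before the Top dog's earliest possible start (called shadow here) or fits into the nodes the Top dog will not need (spare).

```python
from collections import namedtuple

# Hypothetical job record: name, requested nodes, requested wall clock time.
Job = namedtuple("Job", "name nodes walltime")

def predict_starts(total_nodes, running, waiting, now=0):
    """Predict start times of waiting jobs with single-reservation backfilling.

    running: list of (end_time, nodes) for jobs already on the machine.
    waiting: list of Job, highest priority first.
    """
    if any(job.nodes > total_nodes for job in waiting):
        raise ValueError("a waiting job requests more nodes than exist")
    active = list(running)
    queue = list(waiting)
    starts = {}
    t = now
    while queue:
        free = total_nodes - sum(n for _, n in active)
        # Start jobs in priority order as long as the first one fits.
        while queue and queue[0].nodes <= free:
            job = queue.pop(0)
            starts[job.name] = t
            active.append((t + job.walltime, job.nodes))
            free -= job.nodes
        if not queue:
            break
        # The first job that does not fit becomes the Top dog.
        top = queue[0]
        avail = free
        for end, n in sorted(active):
            avail += n
            if avail >= top.nodes:
                shadow, spare = end, avail - top.nodes
                break
        # Backfill: start lower-priority jobs that cannot delay the Top dog.
        for job in queue[1:]:
            ends_in_time = t + job.walltime <= shadow
            if job.nodes <= free and (ends_in_time or job.nodes <= spare):
                starts[job.name] = t
                active.append((t + job.walltime, job.nodes))
                queue.remove(job)
                free -= job.nodes
                if not ends_in_time:
                    spare -= job.nodes
        # Advance the simulated time to the next job completion.
        t = min(end for end, _ in active)
        active = [(end, n) for end, n in active if end > t]
    return starts

# Hypothetical 8-node machine: one running job holds 6 nodes until t=10.
waiting = [Job("A", 8, 5), Job("B", 2, 5), Job("C", 2, 20)]
starts = predict_starts(8, running=[(10, 6)], waiting=waiting)
```

In this example job A is the Top dog and has to wait for the running job to finish. Job B is short enough to be backfilled immediately, while job C would run past the Top dog's start and therefore has to wait until A has finished.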

Command Line Options

The following options are available when starting llview from the command line:

llview [-source www|locdata|exec] -rcfile [-hist] -mc

-source source    defines the data source from which the data should be requested
                  www: from a web server
                  locdata: from a local tar file or local flat files
                  exec: execute llqxml directly on the same host
-rcfile inifile   use this inifile for loading and saving the local configuration options
-hist             enable the history sub panel of the main window
-mc               enable the multi cluster mode of LLview. The configuration file (default: .llview_mc.rc) defines a list of machines and the corresponding configuration files for these machines.

Key Bindings

The following key bindings are available in the LLview window:

Mouse-3, Control-u   Data update/reload
Control-q            Exit
Control-o            Open/Close Option Panel
Control-p            Print node display to the PostScript file ./llview.ps

last change 04.05.2005