Distributed Computation of Climate and Weather Models

Partners: 
  • AWI - Alfred-Wegener Institute for Polar and Marine Research
  • GMD - National Research Center for Information Technology Sankt Augustin GmbH,
    Institute for Algorithms and Scientific Computing
  • DKRZ - Deutsches Klimarechenzentrum

Contact Persons:  Dr. W. Hiller, Dr. W. Joppich
     
    Index:

    Goals
    The Ocean model
    The Atmosphere model
    The Coupler
    MetaMPI Implementation
    Work in Progress
    Future Application
    Computational Requirements


    *  Goals
    by Wolfgang Hiller (AWI)

    In the context of climate system models (CSM), the coupling of the different physical models representing the components of the climate system has to be organised, since a full climate model needs an ocean model as well as models for the atmosphere, the sea/land ice and the biosphere. Those models exist separately, but combining them into one big program (coupling them internally) leads to an overall system that is demanding in both computation and memory. Therefore different approaches have been worked out in the past (NCAR's CSM, CERFACS's OASIS, GMD's CISPAR and GRISSLi, just to mention some). Coupling the different computers by a fast (gigabit) network is a prerequisite for these solutions.

    In this project we want to gain experience in the distributed computation of coupled climate models. In our case we will couple the atmospheric model IFS (Integrated Forecasting System, ECMWF, Reading) and the ocean model MOM-2 (Modular Ocean Model, GFDL, Princeton) by external couplers. Both IFS and MOM-2 serve only as technical prototypes for atmosphere and ocean model components available within the scientific community.

    Other models could have served as prototypes as well; however, optimized parallel versions of IFS and MOM-2 were available at DKRZ, GMD and AWI. These models have therefore been used to study the technical aspects of coupling subsystem models.

    The coupling will be realized as follows: the IFS program will run on the SP2 of the GMD and the MOM program will run on the Cray T3E of the research center in Jülich. The exchange of data will be realized by explicit message passing with MetaMPI via the gigabit testbed.

    * The Ocean model
    by Bernadette Fritzsch (AWI)

    The code of the ocean model is based on the Modular Ocean Model (see Pacanowski, R.C., MOM 2 Documentation, User's Guide and Reference Manual. Tech. Rep. 3, 1995, GFDL Ocean Group). It was developed at GFDL in Princeton and is used by a large community of ocean modellers. The ice model, which is based on Hibler (see Hibler, W.D., A Dynamic Thermodynamic Sea Ice Model, Journ. Phys. Ocean. 9, p. 815, 1979), is coupled internally.
     

    Figures: temperature distribution computed by MOM; vector field on the globe
     

    Configuration
    A global model configuration is used which works on a T63 grid with 29 layers (194 x 92 x 29 grid points). The topography was derived from ETOPO5 and was modified in some small areas around the Greenland-Scotland Ridge, the western end of the Mediterranean Sea and the Arctic region. To avoid problems with the singularity at the North Pole, a single land point is set there.

    Scalability and performance
    The model was parallelised using a data decomposition technique, which leads to a data-parallel programming model. The communication between processing elements (PEs) is done via the SHMEM library of SGI/Cray.
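
    As a minimal sketch of what such a data decomposition can look like, the following Python fragment splits the 92 grid rows of the configuration above into contiguous blocks, one per PE. The one-dimensional split by rows and the function name block_range are assumptions made for illustration; the actual decomposition used in the parallel MOM code is not described here.

```python
def block_range(n_points: int, n_pes: int, pe: int) -> range:
    """Return the contiguous index range owned by processing element `pe`.

    The first (n_points % n_pes) PEs receive one extra point, so block
    sizes differ by at most one.
    """
    base, extra = divmod(n_points, n_pes)
    start = pe * base + min(pe, extra)
    size = base + (1 if pe < extra else 0)
    return range(start, start + size)


# Illustration only: 92 rows distributed over 16 PEs (hypothetical 1-D layout).
N_ROWS, N_PES = 92, 16
for pe in range(N_PES):
    rows = block_range(N_ROWS, N_PES, pe)
    print(f"PE {pe:2d}: rows {rows.start:2d}..{rows.stop - 1:2d} ({len(rows)} rows)")
```

    In this layout twelve PEs own six rows and four PEs own five rows, so the load per PE differs by at most one row.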

    In the model we find the following routines which are time critical:

    The percentage of CPU time each routine consumes depends on the configuration of the model. The most time consuming  parts are in the ocean model. The performance data are measured for 16 PEs on a T3E-600 at AWI.
     
    Routine   Percentage of CPU time   Performance total (Gflop/s)   Performance per PE (Mflop/s)
    TRACER    35%                      1.05                          69.0
    CLINIC    2%                       0.71                          44.3
    TROPIC    11%                      0.53                          35.2
    DRIFT     16%                      0.83                          51.6

    Because the ice-covered region is significantly smaller than the ocean, the model's scalability is limited here.
     

    Scaling of the ocean/ice model with number of PEs
     
     

    * The Atmosphere model
    by Johannes Quaas (GMD)

    There exist several global atmospheric models which could have been selected for the project. The Integrated Forecasting System (IFS) of the European Centre for Medium-Range Weather Forecasts (ECMWF) in Reading was chosen because it is already parallelized using the Message Passing Interface (MPI) and because GMD-SCAI has experience in running this program on different platforms, especially on the IBM SP2 at GMD-SCAI.
     

    Configuration
    The IFS is a spectral model. The partial differential equations are solved using spectral transform techniques. Different parts of the algorithm use different data spaces: grid point space, Fourier space, and spectral space. As the data dependences change between these spaces, the data has to be exchanged among the parallel processes (transposition technique).
    There is a limited number of standard resolutions used for climate simulation and weather forecasting, namely T21, T42, T63, T106, and T213. The T denotes the way of truncating the Fourier sums; the number (21, 42, ...) stands for the wavenumber of the highest resolved spherical harmonic on a given Gaussian grid and therefore also represents a spatial resolution. The number of vertical levels L can be chosen appropriately; typical values are 19 or 31. To match the scale of the ocean model, the standard resolutions T63L19 or T63L31 will be used within the project.
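
    To give a feeling for how the truncation number relates to problem size, the following sketch counts the spectral coefficients of one field on one level. It assumes that T stands for triangular truncation (zonal wavenumber m and total wavenumber n with 0 <= m <= n <= N), which the text above does not state explicitly; the function name is hypothetical and not part of the IFS.

```python
def triangular_coefficient_count(truncation: int) -> int:
    """Count the (m, n) pairs with 0 <= m <= n <= truncation.

    Assumption: triangular truncation, one complex coefficient per pair.
    """
    return (truncation + 1) * (truncation + 2) // 2


# Standard resolutions named in the text.
for n in (21, 42, 63, 106, 213):
    count = triangular_coefficient_count(n)
    print(f"T{n}: {count} complex coefficients per field and level")
```

    Under this assumption the count grows roughly with the square of the truncation number, so T213 carries about eleven times as many coefficients per field and level as T63.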

    The way of coupling
    For long-term calculations the IFS uses climatological mean values from files. The quantities needed typically describe the surface (whether there is soil, ocean, ice or snow), the surface temperature, the evaporation and the heat fluxes. Instead of these mean values from files the data derived from the other models, for instance from the ocean model, will be used.

    IFS on the SP2
    The IFS is implemented on the SP2 at the GMD. Most of the above mentioned resolutions have been used here for different forecast periods.
     

    * The Coupler
    by Wolfgang Joppich (GMD)

    The external couplers CSM flux coupler of NCAR, OASIS of CERFACS, the coupling library COCOLIB of the GMD project CISPAR, and the GRISSLi coupling interface of the GMD project GRISSLi were investigated. The very general approach of both the GRISSLi coupling interface and COCOLIB is highly attractive. Nevertheless, the inherent parallelism of these interfaces cannot be fully exploited, owing to the configuration of the gigabit connection and the software architecture of the parallel applications. The sequential nature of both OASIS and CSM, which in principle is a drawback, is not critical in this application; it helps to keep the software structure of the atmospheric model and the ocean model as unchanged as possible. Furthermore, specialised coupling interfaces like OASIS and CSM have been developed specifically for the given problem. Finally, given the special hardware configuration and the software used within the gigabit connection, the CSM Flux Coupler by NCAR (Boulder) appeared most promising. It was originally designed to couple an atmospheric model to models for ice, ocean and land. Version 4 uses MPI for communication and is therefore portable and easy to use.

    Scheme of the flux coupler's structure

    Each symbol in this diagram represents a process. The processes can be located on different computing systems. Adapting the CSM flux coupler to this project consists of removing the left and right processes for the land and ice models and replacing the atmosphere and ocean processes by the parallel IFS and MOM.
     

    * The MPI Implementation
    by Stephan Frickenhaus (AWI) and  Olaf Heudecker (AWI)

    The standard MPI implementation on the MPP Cray T3E supports neither communication between different job groups (distinct programs) nor communication between different computers. For this reason another implementation of MPI has to be used.

    MetaMPI
    Just in time for this project, Pallas GmbH developed MetaMPI, which allows multi-process and multi-computer MPI communication for the Cray T3E and the IBM SP2. Performance tests of MetaMPI were carried out across the Gigabit Testbed West, yielding a transfer rate of up to 93 MBit/s.

    Performance of MetaMPI across GTBW
    During these tests a concept was developed for splitting the basic communicator into communicators for the parallelisation of the models and a communicator for the coupling. This concept is implemented in an F90 module, which is shared by the models and the coupler. MetaMPI is now successfully applied for message passing in MOM, IFS and the CSM coupler.
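
    The idea of splitting one basic communicator can be illustrated without any MPI library. The sketch below emulates the grouping rule of the standard routine MPI_Comm_split in plain Python: ranks passing the same colour end up in the same new communicator, ordered by key and then by their old rank. The process layout (four atmosphere ranks, four ocean ranks, one coupler rank) and the choice of the first rank of each component as a member of the coupling communicator are assumptions for illustration only; the project's F90 module is not reproduced here.

```python
from collections import defaultdict

UNDEFINED = None  # plays the role of MPI_UNDEFINED: rank joins no new communicator


def comm_split(colors, keys=None):
    """Emulate the grouping of MPI_Comm_split.

    colors[r] is the colour passed by world rank r. Returns a dict mapping
    each colour to the list of world ranks in new-rank order.
    """
    if keys is None:
        keys = [0] * len(colors)
    groups = defaultdict(list)
    for world_rank, color in enumerate(colors):
        if color is not UNDEFINED:
            groups[color].append((keys[world_rank], world_rank))
    # Sort by key first; ties are broken by the old (world) rank.
    return {color: [rank for _, rank in sorted(members)]
            for color, members in groups.items()}


# Hypothetical layout of the basic communicator.
layout = ["atm"] * 4 + ["ocn"] * 4 + ["cpl"]

# One communicator per component for its internal parallelisation.
model_comms = comm_split(layout)

# Assumed coupling communicator: the first rank of every component.
roots = {members[0] for members in model_comms.values()}
coupling_colors = ["coupling" if r in roots else UNDEFINED
                   for r in range(len(layout))]
coupling_comm = comm_split(coupling_colors)

print(model_comms)
print(coupling_comm)
```

    In a real MPI program every process would call the split routine itself with its own colour; the emulation only shows which ranks would end up together.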

    * Work in Progress
    Having fixed the surface boundary data that must be exchanged between the models, we are currently adding the communication calls to the atmosphere model IFS, the ocean/ice model MOM2 and the CSM coupler.

    *  Future Application
    Simulation by René Redler (AWI / University Kiel), visualisation by Udo Göbel (AWI)

    To give an example of the high variability of ocean currents as simulated by high-resolution ocean models, the calculated sea surface height in the Atlantic is shown as an animation below. The model for this simulation is based on a parallel version of the ocean model similar to the one used within the framework of this project. The model data shown for this simulation have been calculated using climatological data (monthly means) for the atmospheric forcing at the ocean surface. With the general availability of the model coupling techniques developed in the scope of this project, coupled runs with an atmospheric model component could be done operationally in the future, addressing the effect of higher temporal variability in the ocean due to atmospheric forcing, e.g. on biological processes in the upper water column.

    The model encompasses the Atlantic between 70° S and 70° N. The horizontal resolution is 1/3° in latitude and 1/3° times cos(latitude) in longitude; the vertical is discretized in 45 levels, with a minimum layer thickness of 10 m in the uppermost levels increasing to 200 m below a depth of 1000 m.

    The model configuration shown in the animation used a restoration toward climatological values for temperature and salinity along the artificially closed northern and southern boundaries. Likewise the impact of Mediterranean water through the Strait of Gibraltar is handled via restoration toward climatological values. In- and outflow across the western boundary (Drake Passage) and eastern boundary (30° E) is realized using open boundary conditions which allow a flux across these boundaries.

    A more physical interpretation of what is shown in the animation can be summarized as follows:

    If the ocean were at rest the sea surface elevation would coincide with the geoid. (The geoid is defined as the equipotential surface of the earth's gravitation field.) On such a surface no work against the gravitation force is required to move a particle from point A to B.

    Deviations of the sea surface from the geoid are induced by ocean currents. These deviations reach maximum values of more than 1 m in strong western boundary currents like the Gulf Stream. In the ocean interior, however, the sea surface elevations induced by dynamical processes are in the order of 10 to 50 cm.

    Assuming the ocean circulation is in geostrophic and hydrostatic balance, these surface elevations can be related to geostrophic surface current velocities Us and Vs via:

    Sketch of the sea surface elevation plus the basic formulas
    where f represents the Coriolis parameter, g the gravitational acceleration and zeta the sea surface elevation. For simplicity Cartesian coordinates are used, with x positive westward and y positive northward. In this example the velocity is directed into the plane in the northern hemisphere.
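
    As a rough numerical illustration of the geostrophic relation, the following sketch estimates the magnitude of the surface current for a uniform sea surface slope, using |u| = (g / |f|) * |delta zeta| / distance with f = 2 * Omega * sin(latitude). Only the magnitude is computed, so the result does not depend on the sign convention of the axes. The numerical constants, the function names and the example slope (1 m over 100 km at 40° N) are assumptions chosen for illustration and are not taken from the model.

```python
import math

OMEGA = 7.2921e-5  # Earth's rotation rate in rad/s (standard value)
G = 9.81           # gravitational acceleration in m/s^2 (standard value)


def coriolis_parameter(lat_deg: float) -> float:
    """Coriolis parameter f = 2 * Omega * sin(latitude) in 1/s."""
    return 2.0 * OMEGA * math.sin(math.radians(lat_deg))


def geostrophic_speed(delta_zeta_m: float, distance_m: float, lat_deg: float) -> float:
    """Magnitude of the geostrophic surface current for a uniform slope.

    Not valid at the equator, where f vanishes and geostrophy breaks down.
    """
    f = coriolis_parameter(lat_deg)
    if f == 0.0:
        raise ValueError("geostrophic balance is undefined at the equator")
    return G / abs(f) * abs(delta_zeta_m) / distance_m


# Hypothetical example: 1 m elevation difference across 100 km at 40 degrees N.
speed = geostrophic_speed(1.0, 100e3, 40.0)
print(f"geostrophic surface speed: {speed:.2f} m/s")
```

    A slope of this size gives a speed of the order of 1 m/s, which is the magnitude expected for a strong western boundary current.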

    To make the variability of surface currents visible the sea surface height anomaly is shown in the animation; the temporal mean has been subtracted at each grid point.

            Anomalies of sea surface height

    Clickable map, mpegs for different zones (1..6)

    A time series of 3-day mean SSH fields covering a period of 3 years has been used for the animation. Colors indicate the deviation at each model grid point from its 3-year mean value. Selected areas show the activity in the Falklands-Malvinas confluence zone (2), the Agulhas Retroflection (3), and eddy activity along the Subantarctic Polar Front (1). Time series showing the area of the Caribbean Sea and the North Brazil Current (4+5) and the Subpolar North Atlantic have been calculated from the 1/3° FLAME North Atlantic Model.
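
    The anomaly computation described above, subtracting the temporal mean at each grid point, can be sketched in a few lines. The nested-list data layout fields[time][j][i] and the function name are assumptions for illustration; the tiny input below is made-up test data, not model output.

```python
def ssh_anomalies(fields):
    """Subtract the temporal mean at every grid point.

    fields: list over time of 2-D lists, indexed as fields[t][j][i].
    Returns a list of the same shape holding the anomalies.
    """
    n_times = len(fields)
    n_rows = len(fields[0])
    n_cols = len(fields[0][0])
    # Temporal mean for each grid point (j, i).
    mean = [[sum(f[j][i] for f in fields) / n_times for i in range(n_cols)]
            for j in range(n_rows)]
    return [[[f[j][i] - mean[j][i] for i in range(n_cols)]
             for j in range(n_rows)]
            for f in fields]


# Made-up example: two time levels on a 1 x 2 grid.
example = [[[1.0, 2.0]], [[3.0, 6.0]]]
print(ssh_anomalies(example))
```

    By construction the anomalies at each grid point sum to zero over the averaging period.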

    Related Projects

         WOCE - Numerical modelling of the Agulhas Region
         MAST II DYNAMO - Dynamics of North Atlantic Models
     

    *  Computational Requirements
    Typical computational costs and performance values for an ocean application (high-resolution Atlantic) are given for a run on 120 processing elements of a T3E-600, using the AWI parallel MOM2 version with essential optimizations by Rainer Johanni and Klaus Ketelsen from SGI/Cray:
     

    Region_name   Av. Time   Total      Average    Max        Min        Number
                  [sec]      Gflop/s    Mflop/s    Mflop/s    Mflop/s    of calls
    DRIVER:       5972.63    6.027      50.22      51.47      40.75      1
    -CLINIC:      1458.59    8.319      69.33      69.83      68.32      18300
    -TRACER:      1537.44    7.493      62.44      64.52      60.65      18300
    -CONGRAD:     1256.56    3.885      32.37      33.58      26.96      4575
    -ADV_VEL:     283.28     8.851      73.76      74.76      72.49      18300
    -LOADMW:      383.21     1.439      11.99      12.12      11.51      18300
    -CONVCT2:     109.15     16.218     135.15     202.88     85.98      18300
    -VMIXC:       371.02     8.105      67.54      75.70      49.52      18300

    Shown here are the results obtained with the performance analysis tool (PAT) which provides a low-overhead method for estimating the amount of time spent in functions of an application. The average values are computed from data of each of the 120 PEs.
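
    The Total and average columns of the table are linked by the number of PEs: the total rate divided by 120 gives the average rate per PE. The short check below, with values copied from the table, illustrates this relation; the function name and the tolerance are choices made for this sketch.

```python
N_PES = 120  # number of processing elements in the measured run

# (region, total rate in Gflop/s, average rate per PE in Mflop/s), from the table
ROWS = [
    ("DRIVER", 6.027, 50.22),
    ("CLINIC", 8.319, 69.33),
    ("TRACER", 7.493, 62.44),
    ("CONGRAD", 3.885, 32.37),
    ("ADV_VEL", 8.851, 73.76),
    ("LOADMW", 1.439, 11.99),
    ("CONVCT2", 16.218, 135.15),
    ("VMIXC", 8.105, 67.54),
]


def per_pe_mflops(total_gflops: float, n_pes: int = N_PES) -> float:
    """Convert a total rate in Gflop/s into an average per-PE rate in Mflop/s."""
    return total_gflops * 1000.0 / n_pes


for region, total, average in ROWS:
    derived = per_pe_mflops(total)
    # The table values are rounded, so allow a small tolerance.
    assert abs(derived - average) < 0.01, region
    print(f"{region:8s} {total:7.3f} Gflop/s -> {derived:7.2f} Mflop/s per PE")
```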
    The respective routines are:

    Computational Cost

    For 360 days of integration time on a T3E-600 (a T3E-900 needs approximately 15% less CPU time), the compute time is 35800 s of CPU time per PE, or 1.65E-07 s per grid point.
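
    The cost of longer integrations can be extrapolated from this figure. The sketch below assumes that the cost scales linearly with the number of 360-day model years and that the CPU time per PE is a fair proxy for elapsed time; both are simplifications, and the names are hypothetical.

```python
SECONDS_PER_MODEL_YEAR = 35_800  # CPU s per PE for 360 model days on a T3E-600 (from the text)
T3E900_FACTOR = 0.85             # "approximately 15% less CPU time" on a T3E-900
SECONDS_PER_DAY = 86_400


def run_time_days(model_years: float, machine_factor: float = 1.0) -> float:
    """Estimated run time in days, assuming linear scaling with model years."""
    return model_years * SECONDS_PER_MODEL_YEAR * machine_factor / SECONDS_PER_DAY


for years in (1, 10, 14):
    t600 = run_time_days(years)
    t900 = run_time_days(years, T3E900_FACTOR)
    print(f"{years:2d} model years: {t600:5.2f} days (T3E-600), {t900:5.2f} days (T3E-900)")
```

    Under these assumptions, 14 model years come to about 5.8 days on the T3E-600, which is the order of magnitude of the elapsed time quoted in the summary.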

    Model I/O rates

    Context           Time span   Amount of data
    model output      6 h         18 * 171207192 Bytes = 2.87 GB
    max. throughput   1 sec       70 MB/s

    The routine writing the prognostic variables to disk during the integration is highly parallel. If there are, for example, 2 PEs in the i-direction (2-dimensional data segmentation), every second PE collects the data for its j-lines and writes it to temporary files. With 120 PEs this means that 60 PEs are writing at the same time. In a post-processing job the files are merged. The total transfer rate can be up to 70 MB/s.
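
    The numbers in the I/O table can be reproduced with a little arithmetic. The sketch below assumes binary prefixes (1 GB = 2**30 bytes, 1 MB = 2**20 bytes), an assumption suggested by the fact that it reproduces the quoted 2.87 GB. What the factor 18 stands for is not explained in the text, so it is kept as a plain constant; all names are hypothetical.

```python
FACTOR = 18               # multiplier from the table (meaning not stated in the text)
BYTES_EACH = 171_207_192  # bytes per unit, from the table
GB = 2**30                # assumption: binary gigabyte
MB = 2**20                # assumption: binary megabyte


def output_volume_bytes() -> int:
    """Volume written per 6 h output interval."""
    return FACTOR * BYTES_EACH


def write_time_seconds(volume_bytes: int, throughput_mb_per_s: float = 70.0) -> float:
    """Time to write the volume at the quoted maximum throughput."""
    return volume_bytes / (throughput_mb_per_s * MB)


def writer_count(n_pes: int, pes_in_i: int) -> int:
    """Number of PEs that write when one PE per group of pes_in_i collects the data."""
    return n_pes // pes_in_i


volume = output_volume_bytes()
print(f"volume per output: {volume} bytes = {volume / GB:.2f} GB")
print(f"time at 70 MB/s:   {write_time_seconds(volume):.0f} s")
print(f"writing PEs:       {writer_count(120, 2)}")
```

    At the maximum throughput one output step would take well under a minute, which is small compared with the 6 h output interval of model time.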

    Summary
    As typical model integration periods span decades (summing up to some 500000 CPU sec of elapsed time, about 6 days), it is evident that without metacomputing coupling techniques in a gigabit network environment, more refined model runs (in terms of spatial and temporal resolution) would be difficult to achieve in the future.


    Forschungszentrum Jülich, ZAM, Th.Eickermann@fz-juelich.de
    20-Aug-1999

    URL: <http://www.fz-juelich.de/gigabit/gtbw_distributedclimate.html>