
Met Office Unified Model I/O Server

Paul Selwood

Crown copyright Met Office


I/O Server motivation



Some History

I/O has always been a problem for NWP, more recently for climate
~2003: application-level output buffering
~2008: very simple, single-threaded I/O servers added for benchmarking
Intercepted low-level open/write/close
Single threaded
Some benefit, but limited
Did not address scaling issues (message numbers)



Old UM I/O - Restart Files



Old UM I/O - Diagnostics



Why I/O Server approach?

Full parallel I/O difficult with our packing
Free CPUs available
Spare memory available
Chance to re-work old infrastructure
Our file format is neither GRIB nor netCDF.



Diagnostic flexibility
Variables (primary and derived)
Output times
Temporal processing (e.g. accumulations, extrema, means)
Spatial processing (sub-domains, spatial means)
Variable to unit mapping
Basic output resolution is a 2D field



Key design decisions
Parallelism over output streams
Output streams distributed over servers
Server is threaded
Listener receives data & puts in queue
Writer processes queue including packing
Ensures asynchronous behaviour
Shared FIFO queue
Preserves instruction order
Metadata/Data split
Data initially stored on compute processes
Data of same type combined into large messages
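
The listener/writer split above can be sketched with two threads sharing a FIFO queue. This is a minimal illustration only, not the Unified Model code: the names `listener`, `writer`, `run_server` and `SHUTDOWN` are hypothetical, an in-memory list stands in for messages arriving from compute processes, and `bytes(...)` stands in for the real packing step.

```python
import queue
import threading

SHUTDOWN = object()  # hypothetical sentinel marking the end of the run


def listener(incoming, work_queue):
    """Receive messages from compute processes and enqueue them in arrival order."""
    for message in incoming:  # stands in for receiving messages over MPI
        work_queue.put(message)
    work_queue.put(SHUTDOWN)


def writer(work_queue, written):
    """Take messages off the queue in FIFO order, pack them and 'write' them."""
    while True:
        message = work_queue.get()
        if message is SHUTDOWN:
            break
        stream, field = message
        packed = bytes(field)  # placeholder for the real packing step
        written.append((stream, packed))


def run_server(incoming):
    """Run one listener and one writer thread and return what was written."""
    work_queue = queue.Queue()  # shared FIFO queue between the two threads
    written = []
    threads = [
        threading.Thread(target=listener, args=(incoming, work_queue)),
        threading.Thread(target=writer, args=(work_queue, written)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return written
```

Because the listener only enqueues, it can return to receiving immediately while the writer does the slower packing and output, and the single FIFO queue keeps operations in the order they were issued.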

Parallelism in I/O Servers
Multiple I/O streams in typical job
I/O servers spread among nodes
Can utilise more memory
Will improve bandwidth to disk
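
One way to picture output streams being distributed over servers is a simple round-robin mapping. The slides do not say how streams are actually assigned, so the policy, the function name `assign_streams` and the stream names in the usage note are all assumptions made for illustration.

```python
def assign_streams(stream_names, n_servers):
    """Round-robin assignment: stream i goes to server i % n_servers."""
    if n_servers < 1:
        raise ValueError("need at least one I/O server")
    assignment = {server: [] for server in range(n_servers)}
    for index, name in enumerate(stream_names):
        assignment[index % n_servers].append(name)
    return assignment
```

For example, `assign_streams(["dump", "pp1", "pp2"], 2)` places `dump` and `pp2` on server 0 and `pp1` on server 1, so each stream is written by exactly one server and different streams can progress in parallel.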



Automatic post-processing

Model can trigger automatic post-processing


Requests dealt with by I/O Server
FIFO queue ensures integrity of data
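
The integrity point can be illustrated with a small sketch: if a post-processing request travels through the same FIFO queue as the writes, every write issued before it has been applied by the time it is handled. The request tuples, the action names and the function `drain` are hypothetical and chosen only for this example.

```python
from collections import deque


def drain(requests):
    """Process requests strictly in arrival order; return a log of post-processing calls."""
    pending = deque(requests)
    files = {}  # file name -> list of fields written so far
    log = []
    while pending:
        action, name, payload = pending.popleft()
        if action == "write":
            files.setdefault(name, []).append(payload)
        elif action == "postprocess":
            # Every write queued before this request has already been applied.
            log.append((name, len(files.get(name, []))))
        else:
            raise ValueError(f"unknown action: {action}")
    return log
```

Each log entry records how many fields the file held when post-processing was triggered, which makes it easy to check that no earlier write was skipped.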



How data gets output
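[Figure: data path from the Compute processes to the I/O server, which runs a Listener thread and a Writer thread (labelled Thread 0 and Thread 1).]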



I/O Server development

Initial version: synchronous data transmission
Asynchronous diagnostic data
Asynchronous restart data
Amalgamated data
Asynchronous metadata
Load balancing
Priority messages with I/O Server



Lots of diagnostic output

Which processes are I/O servers


Stall messages
Memory log
Timing log
Full log of metadata / queue

All really useful for tuning!



Lots of tuneable parameters

Number and spacing of I/O servers


Memory for I/O servers
Number of local data copies
Number of fields to amalgamate
Load balancing options
Timing tunings

+ standard I/O tunings (write block size, etc.)
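
As an illustration of the "number of fields to amalgamate" parameter, the sketch below buffers fields by type on the sending side and emits one combined message once a configurable count is reached. The class `Amalgamator`, its `limit` argument and the `send` callback are hypothetical names for this example; the real parameter names and message layout are not given in the slides.

```python
class Amalgamator:
    """Buffer fields per type and emit one combined message per `limit` fields."""

    def __init__(self, limit, send):
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self.limit = limit
        self.send = send  # callable taking (field_type, list_of_fields)
        self.buffers = {}

    def add(self, field_type, field):
        """Queue a field; send the batch once `limit` fields of this type are held."""
        buffer = self.buffers.setdefault(field_type, [])
        buffer.append(field)
        if len(buffer) >= self.limit:
            self.flush(field_type)

    def flush(self, field_type):
        """Send whatever is buffered for one type, if anything."""
        buffer = self.buffers.pop(field_type, [])
        if buffer:
            self.send(field_type, buffer)

    def flush_all(self):
        """Send all partially filled buffers, e.g. at the end of a timestep."""
        for field_type in list(self.buffers):
            self.flush(field_type)
```

A larger limit means fewer, bigger messages at the cost of more memory held on the compute side, which is the kind of trade-off such a tunable would expose.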


Overloaded servers



I/O Servers keeping up!



MPI considerations

Differing levels of MPI threading support


Best with MPI_THREAD_MULTIPLE
OK with MPI_THREAD_FUNNELED

MPI tuning
Want metadata to go as quickly as possible
Want data transfer to be truly asynchronous
Don't want to interfere with model comms (e.g. halo exchange)
Currently use 19 environment variables!

Deployment

July 2011: operational global forecasts
January 2012: operational LAM forecasts
February 2012: high resolution climate work

Not currently used in:
Operational ensembles
Low resolution climate work
Most research work



Global Forecast Improvement

        QG 00/12    QG 06/18    QU
Time    777s        559s        257s
%age    19%         28%         27%

Total saving: over 21 node-hours per day


Impact on High Resolution Climate
N512 resolution AMIP
59 GB restart dumps
Modest diagnostics
Cray XE6 with up to 9K cores

All in-run output hidden
Waits for final restart dump
Most data buffered on client side

Current and Future Developments
MPI Parallel I/O servers
Multiple I/O servers per stream
Gives more memory per stream on server
Reduced messaging rate per node
Parallel packing
Potential for parallel I/O

Read ahead
Potential for boundary conditions / forcings
Some possibilities for initial condition

Parallel I/O server improvement
[Figure: two panels, "Before" and "After".]



Questions and answers

